Wk04 Lect
Outline
• Random variables
• Distribution functions
• Gaussian / Normal distribution
• Properties
• Standard normal distribution
• Discussion of engineering applications examples
• Linear combinations of random variables
• Probability Plots
• Discrete Random Variables
• Probability Mass function
• Binomial Distribution
• Normal Approximation to binomial
• Engineering applications
Review of Tutorial Case Studies
2. Concrete strength: The design specification requires a minimum strength of
concrete of 60 MPa. Two suppliers A and B claim that they can reliably deliver to
this specification and quoted the mean and standard deviation of their
processes:
Supplier A: $\mu_A = 64.5$ MPa, $\sigma_A = 2.7$ MPa; Supplier B: $\mu_B = 62.5$ MPa, $\sigma_B = 1$ MPa.
Initially many engineers were tempted to select Supplier A, given its higher mean concrete strength.
However, Supplier B's manufacturing process has lower variability and can be expected to deliver far fewer below-spec units. It should therefore be preferred to Supplier A.
[Figure: strength distributions of Suppliers A and B relative to the specification limit (Spec)]
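The comparison can be checked numerically. The sketch below assumes that both processes follow a Normal model with the quoted means and standard deviations; the variable names are illustrative only.

```python
from statistics import NormalDist

SPEC_MIN = 60.0  # minimum strength in MPa, from the design specification

# Process parameters quoted by each supplier: (mean, standard deviation) in MPa.
suppliers = {"A": (64.5, 2.7), "B": (62.5, 1.0)}

# P(strength < 60 MPa) for each supplier, assuming a Normal model.
below_spec = {
    name: NormalDist(mu, sigma).cdf(SPEC_MIN)
    for name, (mu, sigma) in suppliers.items()
}

for name, prob in below_spec.items():
    print(f"Supplier {name}: P(strength < {SPEC_MIN} MPa) = {prob:.2%}")
```

Under these assumptions Supplier A delivers roughly 4.8% of units below spec and Supplier B roughly 0.6%, which supports preferring Supplier B.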
Review of Tutorial Case Studies
3. Manufacturing of a plastic pin (for the FFD): The design specification for the diameter of the pin used in the Transit Van fuel filler door (FFD) design is Ø5.5 ± 0.1 mm (i.e. the lower specification limit is LSL = 5.4 mm and the upper specification limit is USL = 5.6 mm).
Injection moulding process parameters: $\mu = 5.543$ mm, $\sigma = 0.0908$ mm.
[Figure: pin diameter distribution against the specification limits; 26.5% of pins are above USL]
[Figure: pin and stopper assembly, labelled LSL-R, USL-R, Stopper, Gap and Interference]
What is the distribution of the interference?
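The 26.5% figure can be reproduced from the process parameters. This is a minimal sketch assuming the pin diameter is Normally distributed; the fraction below LSL and the in-spec fraction are computed here for illustration and are not quoted on the slide.

```python
from statistics import NormalDist

LSL, USL = 5.4, 5.6  # specification limits in mm
pin = NormalDist(mu=5.543, sigma=0.0908)  # injection moulding process

p_above_usl = 1.0 - pin.cdf(USL)  # oversized pins
p_below_lsl = pin.cdf(LSL)        # undersized pins
p_in_spec = 1.0 - p_above_usl - p_below_lsl

print(f"P(diameter > USL) = {p_above_usl:.1%}")
print(f"P(diameter < LSL) = {p_below_lsl:.1%}")
print(f"P(in spec)        = {p_in_spec:.1%}")
```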
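Linear Combinations of Independent, Normally Distributed Random Variables
$\mu_i = \mu_P - \mu_S$ (i: interference, P: pin, S: stopper)
$\sigma_i^2 = \sigma_P^2 + \sigma_S^2$ (for independent variables the variances add, even for a difference)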
Project Management Problem
Task A: $\mu_A$, $\sigma_A$
Task B: $\mu_B$, $\sigma_B$
Project duration: $\mu_P$, $\sigma_P$
$\mu_P = \mu_A + \mu_B$
$\sigma_P^2 = \sigma_A^2 + \sigma_B^2$
On this basis we can calculate the risk of not completing the project in time, estimate the associated financial penalties, etc.
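A minimal sketch of the risk calculation follows. The task durations and the deadline are hypothetical numbers chosen for illustration (they are not from the lecture), and the two tasks are assumed to be independent, Normally distributed and carried out one after the other.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical task durations in days (mean, standard deviation); not lecture data.
mu_a, sigma_a = 20.0, 3.0
mu_b, sigma_b = 30.0, 4.0
deadline = 55.0  # hypothetical project deadline in days

# Independent tasks in sequence: the means add and the variances add.
mu_p = mu_a + mu_b
sigma_p = sqrt(sigma_a**2 + sigma_b**2)

# Risk of not completing the project in time.
p_late = 1.0 - NormalDist(mu_p, sigma_p).cdf(deadline)

print(f"Project duration: mean = {mu_p} days, sd = {sigma_p} days")
print(f"P(duration > {deadline} days) = {p_late:.1%}")
```

The same two lines for the mean and the standard deviation apply to a difference such as the interference, with the means subtracted instead of added.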
Modelling Random Behaviour
• The statistical task:
Data → Choose a good model → Our model → Predict → Prediction of future variation
• Two questions:
1. What values should we use for $\mu$ and $\sigma$?
2. Is the Gaussian/Normal model a reasonable one?
Normal Probability Plot
Q 2. Is the Normal model a reasonable one?
• To answer Question 2 we can look for evidence of symmetry and a lack of skewness and excess kurtosis;
• A more powerful graphical test is to construct a Normal Probability Plot;
• The normal probability plot compares the distribution of the actual data sample with a “perfect Normal sample” of the same size;
• For the DC Alternator data the “perfect Normal sample” uses the standardized “normal scores” (z-values) for n = 25 (25 measurements).
[Figure: standard normal density split into slices of 4% probability each, with 2% below the first score; the first three normal scores are -2.05, -1.55 and -1.28]
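The normal scores can be generated directly. The sketch below assumes the plotting positions (i - 0.5)/n, a convention that is not stated on the slide but is consistent with the 2%, 4%, 4% slices and the first three scores shown for n = 25; `normal_scores` is an illustrative helper name.

```python
from statistics import NormalDist

def normal_scores(n):
    """Return the z-values at cumulative probabilities (i - 0.5)/n, i = 1..n."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

scores = normal_scores(25)
print([round(z, 2) for z in scores[:3]])  # prints [-2.05, -1.55, -1.28]
```

Plotting the sorted measurements against these scores gives the normal probability plot.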
Normal Probability Plot
The DC alternator data as a run chart and dot plot
[Figure: run chart and dot plot; x-axis: Measurement no. (0 to 25), y-axis: 13 to 17]
Normal plot for the DC Alternator Data
If our model is right we should see an approximate straight line.
[Figure: normal plot; x-axis: Perfect Normal sample (-2.5 to 2.5), y-axis: Actual sample (13 to 17)]
[Figure: normal plot containing an outlier; x-axis: Normal score (-2.5 to 2.5), y-axis: Actual sample (13 to 17)]
We should remove the outlier from the data set and redo the normal plot with n-1 values.
Root causes for outliers should always be investigated!
Interpretation of Normal Plot
[Figure: normal plot of the untransformed response; x-axis: Normal score (-2.5 to 2.5), y-axis: 0 to 400]
We can try a response transformation. For example, we can take the natural logarithm of x.
Interpretation of Normal Plot
Normal plot after taking logs
[Figure: normal plot; x-axis: Normal score, y-axis: Actual sample]
A logarithmic transformation of the response often improves normality.
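One way to put a number on "how straight" a normal plot is, before and after taking logs, is the correlation between the sorted data and the normal scores. This measure is an addition for illustration and is not used in the lecture. The sketch uses a synthetic, idealised skewed sample (not lecture data) and assumes plotting positions (i - 0.5)/n.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean

def normal_scores(n):
    """z-values at cumulative probabilities (i - 0.5)/n, i = 1..n."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

n = 50
z = normal_scores(n)

# Synthetic right-skewed sample (hypothetical): x = exp(4 + z), sorted ascending.
x_sorted = sorted(exp(4.0 + zi) for zi in z)

r_raw = pearson_r(z, x_sorted)                    # curved plot, r clearly below 1
r_log = pearson_r(z, [log(x) for x in x_sorted])  # straight plot, r close to 1

print(f"r before logs = {r_raw:.3f}, r after logs = {r_log:.3f}")
```

A value of r close to 1 corresponds to an approximately straight normal plot; the visual check of the plot remains the primary tool.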
Example
• Normal Plot for the impurity in the Digozo Blue pigment data (see Computer Tutorial Week 4 – Additional Exercise – in Blackboard)

Rank   Impurity   Normal score
3      42         -1.64
4      56         -1.48
5      66         -1.34
6      71         -1.23
7      76         -1.13
8      83         -1.04
9      83         -0.95
10     103        -0.88
(50 data points in total)
[Figure: normal plot of impurity; x-axis: normal score (-3.00 to 3.00), y-axis: impurity (0 to 1000)]
Example
• Normal Plot for natural logarithm of impurity in the Digozo Blue pigment data
[Figure: normal plot of ln(impurity); x-axis: normal score (-3.00 to 3.00)]
Normal Probability Plots Using Software
• Note that some software (and textbooks) plot the normal
scores on the Y axis, and the actual sample data on the X
axis – this should not affect the analysis.
[Figure: Minitab normal plot for the DC Alternator Voltage data]
Characteristics of Non-Gaussian Distributions
Skewed distributions: e.g. life data / durability
Excess kurtosis: e.g. stock markets
• Positive excess kurtosis (heavy tails): many small changes and more extreme changes than expected, e.g. currency exchange
• Negative excess kurtosis (light tails): many medium changes and few extreme changes, e.g. high market volatility
Discrete Random Variables
• In the Fuel Filler Door example we asked customers to rate the opening effort of various prototypes on a scale from 1 to 10.
[Figure: histogram of relative frequencies of customer ratings for Prototype P1* (attribute / discrete data); x-axis: rating 1 to 10, y-axis: relative frequency 0% to 35%]
*Data available from Tutorial Sheet 1 – Additional Exercises
Discrete Random Variables
[Figure: cumulative relative frequencies plot of customer ratings (FFD customer rating data); x-axis: rating 1 to 10, y-axis: 0% to 100%]
Discrete Random Variables - Example
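FFD customer rating data – Prototype P1

Score x_i (bins)   Frequency f_i   Relative frequency f(x_i)
1                  1               0.02
2                  2               0.04
3                  7               0.14
4                  15              0.30
5                  8               0.16
6                  3               0.06
7                  5               0.10
8                  6               0.12
9                  2               0.04
10                 1               0.02
[Figure: histogram of relative frequencies by score (1 to 10), 0% to 35%]

$\bar{X} = \sum_{i=1}^{10} x_i f(x_i) = 1 \times 0.02 + 2 \times 0.04 + 3 \times 0.14 + \dots + 10 \times 0.02 = 5.1$
Note that this is equivalent to calculating the arithmetic mean in the normal way: $\bar{X} = \frac{\sum_{i=1}^{10} x_i f_i}{n}$.
$s^2 = \sum_{i=1}^{10} x_i^2 f(x_i) - \bar{X}^2 = 4.29$
$s = 2.07$
Note: evaluated as written, the variance formula gives 30.22 - 26.01 = 4.21 (s = 2.05); the values 4.29 and 2.07 correspond to the sample variance with divisor n - 1 = 49.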
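The mean and variance can be recomputed from the frequency table. This is a minimal sketch using the tabulated frequencies; it evaluates the relative-frequency formula and also the n - 1 form for comparison. Variable names are illustrative.

```python
from math import sqrt

scores = list(range(1, 11))             # x_i = 1..10
freqs = [1, 2, 7, 15, 8, 3, 5, 6, 2, 1]  # f_i from the table

n = sum(freqs)                           # total number of ratings
rel = [f / n for f in freqs]             # relative frequencies f(x_i)

# Mean as a weighted sum of scores.
x_bar = sum(x * fx for x, fx in zip(scores, rel))

# Variance from the relative-frequency formula (divisor n).
var_n = sum(x**2 * fx for x, fx in zip(scores, rel)) - x_bar**2

# Sample variance with divisor n - 1.
var_n_minus_1 = var_n * n / (n - 1)

print(f"n = {n}, mean = {x_bar:.2f}")
print(f"divisor n:     s^2 = {var_n:.2f}, s = {sqrt(var_n):.2f}")
print(f"divisor n - 1: s^2 = {var_n_minus_1:.2f}, s = {sqrt(var_n_minus_1):.2f}")
```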
Binomial Distribution
• Assume that the requirement specification for a DC Alternator is 14.3 V; manufactured alternators with voltage below this value are rejected.
[Figure: run chart of Voltage [V] against Sample No, with the 14.3 V limit separating Accepted from Rejected units]
What is the probability that in the next batch of 25 DC alternators …
Binomial Experiment
Example
Based on the last 25 samples, our estimate of the probability of a reject is p = 5/25 = 0.2.
What is the probability that in the next batch of 25 DC alternators there will be just 1 reject?
$f(1) = \binom{25}{1} \, 0.2^{1} \, (1 - 0.2)^{25-1} = 0.0236$ (2.36%)
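The calculation of f(1) can be sketched in a few lines. The function name `binomial_pmf` is illustrative; the only inputs are n = 25 and the estimate p = 5/25 from the slide.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x rejects in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 25, 5 / 25  # batch size and estimated reject probability

f1 = binomial_pmf(1, n, p)
print(f"f(1) = {f1:.4f}")  # prints f(1) = 0.0236
```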
Binomial Distribution Example
• Assuming p = 0.2 (i.e. 1 in 5 is below spec – 20%)
What is the probability that the next batch of 25 DC alternators …
$f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}$

x    f(x)
0    0.38%
1    2.36%
2    7.08%
3    13.58%
4    18.67%
…
9    2.94%
10   1.18%
11   0.40%
12   0.12%
13   0.03%
14   0.01%
15   0.00%
[Figure: bar chart of f(x) for x = 0 to 15, y-axis 0% to 25%]
Binomial Distribution Example
• Assuming p = 0.2 (i.e. 1 in 5 is below spec – 20%)

x    f(x)     Cumulative
0    0.38%    0.38%
1    2.36%    2.7%
2    7.08%    9.8%
…
15   0.00%    100.0%
[Figure: Binomial Cumulative Probability Plot for Alternator Data]
What is the probability that in the next batch of 25 DC alternators there will be more than 5 rejects?
$P(X > 5 \mid p = 0.2) = 1 - P(X \le 5 \mid p = 0.2) = 38.3\%$
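The cumulative column and the 38.3% result can be reproduced by summing the probability mass function. This is a minimal sketch with illustrative function names, using n = 25 and p = 0.2.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x rejects in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    """Probability of at most x rejects: sum of the pmf from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))

n, p = 25, 0.2

# First rows of the cumulative table.
for x in range(3):
    print(f"x = {x}: f(x) = {binomial_pmf(x, n, p):.2%}, "
          f"cumulative = {binomial_cdf(x, n, p):.1%}")

# More than 5 rejects is the complement of at most 5 rejects.
p_more_than_5 = 1 - binomial_cdf(5, n, p)
print(f"P(X > 5) = {p_more_than_5:.1%}")
```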
Binomial Distribution Example
• If we reduce the probability of failure to 10% or 1%, the chance (or risk) of getting more than 2 defects in a batch of 25 decreases dramatically.
[Figure: binomial cumulative probability plots for p = 0.2, p = 0.1 and p = 0.01 (DC Alternator Data); x-axis: x = 0 to 15. Annotated P(X > 2): 90.2% for p = 0.2, 46.3% for p = 0.1, 0.2% for p = 0.01]
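The effect of the failure probability on the risk of more than 2 defects can be tabulated as follows. This is a sketch for a batch of n = 25 and the three values of p discussed on the slide; the dictionary name `risk` is illustrative.

```python
from math import comb

n = 25          # batch size
max_ok = 2      # we want P(X > 2)

# Risk of more than 2 defects for each failure probability p.
risk = {}
for p in (0.2, 0.1, 0.01):
    p_at_most = sum(comb(n, k) * p**k * (1 - p) ** (n - k)
                    for k in range(max_ok + 1))
    risk[p] = 1 - p_at_most

for p, r in risk.items():
    print(f"p = {p}: P(X > {max_ok}) = {r:.1%}")
```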
Binomial Distribution Parameters
Example
• For the DC alternators, if p = 0.2:
$\mu = n p = 25 \times 0.2 = 5$
$\sigma^2 = n p (1 - p) = 25 \times 0.2 \times 0.8 = 4$
[Figure: bar chart of f(x) against x = 0 to 15, y-axis 0% to 25%]
Normal Approximation to Binomial
This approximation works if np > 5 and n(1-p) > 5.
Example $P(X \le 2)$:
$Z = \frac{2 - 25 \times 0.2}{\sqrt{25 \times 0.2 \times (1 - 0.2)}} = -1.5$
Using the Z tables: F(Z) = 0.0668 (6.68%)
Using the binomial cdf: $P(X \le 2) = 9.8\%$
[Figure: bar chart of f(x) against x = 0 to 15, DC Alternator Data]
Normal Approximation to Binomial
In cases like this a correction can be applied to reflect the fact that the normal distribution is defined over the whole domain, and not only at discrete values like 2 and 3.
For example, we should calculate $P(X \le 2.5)$ rather than $P(X \le 2)$:
$Z = \frac{2.5 - 25 \times 0.2}{\sqrt{25 \times 0.2 \times (1 - 0.2)}} = -1.25$
Using the Z tables: F(-1.25) = 0.1056 (10.6%)
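The three estimates of the probability of at most 2 rejects can be compared side by side. This is a minimal sketch for n = 25 and p = 0.2; the normal cdf from Python's standard library stands in for the Z tables, and the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 25, 0.2, 2

# Binomial parameters used for the approximating normal distribution.
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma)

# Exact binomial cdf, for reference.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

approx_plain = approx.cdf(x)            # evaluated at 2, no correction
approx_corrected = approx.cdf(x + 0.5)  # evaluated at 2.5, with the correction

print(f"exact binomial:       {exact:.1%}")
print(f"normal, uncorrected:  {approx_plain:.2%}")
print(f"normal, corrected:    {approx_corrected:.2%}")
```

With the correction the normal approximation moves from about 6.7% to about 10.6%, much closer to the exact 9.8%.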
Summary of Session – learning objectives