Week 5 Lecture Note-1
Week 5
17/03/2023
We need to conduct a planned experiment to predict Y1, Y2, Y3, Y4, and Y5 from X1, X2, and X3.
We deviate somewhat from what we did up to now with R on two counts:
We collect data via a planned experiment (so controlled conditions).
We have a limited number of observations.
[Figure: workflow diagram. Labels: Problem (Why?), Factors, Data Collection (What? How?), Regression Model, Check, Predict, Confirm, Conclude. Step 6: Make actionable conclusions.]
Example of a problem: “What settings of X1, X2, and X3 get us the desired sensory experience for the consumers?” (see slide 6)
Best-fitting model :
[Figure: general model of a process.]
Input (labour, material/components, machinery, knowledge, ...) -> PROCESS -> Output (product)
y: typically, a quality characteristic(s) of a product.
z1, z2, ..., zq: uncontrollable input factors. These cannot be controlled in the experiment (sometimes they can, with difficulty) or in the actual operation.
Moen, R. D., Nolan, T. W., & Provost, L. P. (1999). Quality improvement through planned experimentation (2nd ed.). New York: McGraw-Hill.
With 11 factors, there are 55 two-way interactions to estimate, in addition to the 11 main effects and the intercept. That means at least 67 trials: a lot of money for little gain because, in reality, only a few terms in the model would be significant.
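The trial count above follows from simple counting: one intercept, one main effect per factor, and one two-way interaction per pair of factors. A minimal sketch of that arithmetic is given here; the function name `second_order_term_count` is made up for illustration and is not part of the lecture material.

```python
from math import comb


def second_order_term_count(n_factors: int) -> dict:
    """Count the terms in a model with an intercept, all main effects,
    and all two-way interactions (illustrative helper, not from the notes)."""
    two_way = comb(n_factors, 2)  # number of unordered pairs of factors
    return {
        "intercept": 1,
        "main_effects": n_factors,
        "two_way_interactions": two_way,
        "total_terms": 1 + n_factors + two_way,
    }


counts = second_order_term_count(11)
print(counts)
```

Each term needs at least one trial to be estimated, which is where the lower bound on the number of trials comes from.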
3    94.18     4    90.95
5    92.18     6    90.46
7    95.39     8    93.21
9    91.79    10    97.19
11   89.07    12    97.04
13   94.72    14    91.07
15   89.21    16    92.75
(a) Why do values within each column differ, even though we are not changing anything?
(b) How do we test whether or not both catalysts produce the same mean yield?
(c) What would be your H0 and H1? Think about the objective.
Minitab 21 Results
[Figure: Individual Value Plot of Catalyst A, Catalyst B. Vertical axis: Data; groups: Catalyst A, Catalyst B.]
Graphical plots are also very informative in DOE.
μ̂A = 92.26; μ̂B = 92.73
Estimate for difference (i.e. μ̂A - μ̂B) = -0.48. Makes sense?
H0: μA - μB = 0 (null hypothesis)
H1: μA - μB ≠ 0 (alternative hypothesis; can also be one-sided, e.g. μA - μB < 0)
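One possible way to test these hypotheses by hand is a pooled two-sample t statistic. The sketch below rests on several assumptions that are not stated in the notes: the left-hand values in the data table belong to Catalyst A and the right-hand values to Catalyst B, the two populations have equal variances, and the critical value 2.179 is the two-sided 5% point for 12 degrees of freedom from a standard t table. Observations 1 and 2 do not appear in the notes, so only 7 values per catalyst are used and the numbers will not match the Minitab output exactly.

```python
from math import sqrt
from statistics import mean, variance

# Assumption: left-hand column = Catalyst A, right-hand column = Catalyst B.
# Observations 1 and 2 are not available in the notes and are left out.
catalyst_a = [94.18, 92.18, 95.39, 91.79, 89.07, 94.72, 89.21]
catalyst_b = [90.95, 90.46, 93.21, 97.19, 97.04, 91.07, 92.75]


def pooled_t_statistic(x, y):
    """Return (t statistic, degrees of freedom) assuming equal variances."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / std_err, df


diff = mean(catalyst_a) - mean(catalyst_b)
t_stat, df = pooled_t_statistic(catalyst_a, catalyst_b)

T_CRIT = 2.179  # two-sided 5% critical value for 12 df (standard t table)
reject_h0 = abs(t_stat) > T_CRIT

print(f"difference in sample means = {diff:.3f}")
print(f"t = {t_stat:.3f} on {df} df, reject H0 at 5%: {reject_h0}")
```

With these 14 values the statistic falls well inside the critical value, so this partial data set gives no evidence against H0.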
[Figure: plot of Strength against Concentration (5%, 10%, 15%, 20%). Group means shown: 10.0000, 15.6667, 17.0000, 21.1667; overall mean µ = 15.9583.]
So what do you think? Your expectation (H1), “at least one mean is different”, might be supported by the data?
Keep in mind that H0 and H1 refer to population means and not anything else.
• There is variation in the 6 observations within each group (factor level) around the
group averages. This is known as within-group variation or Error Variation (SSE).
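From statistical first principles we can prove that SST = SSTR + SSE.
SST corresponds to (24 - 1) degrees of freedom (df).
SSTR corresponds to (4 - 1) df.
SSE corresponds to (24 - 4) df.
More generally, with a treatment levels and N observations in total, SST has (N - 1) df, SSTR has (a - 1) df, and SSE has (N - a) df.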
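The identity SST = SSTR + SSE can be checked numerically. The sketch below uses a small hypothetical data set (3 treatment levels, 4 observations each) that is not taken from the notes; the function name `sums_of_squares` is also only illustrative.

```python
from statistics import mean

# Hypothetical data, NOT from the lecture: 3 treatment levels, 4 observations each.
groups = {
    "level_1": [7.0, 8.0, 6.0, 7.0],
    "level_2": [9.0, 10.0, 11.0, 10.0],
    "level_3": [12.0, 13.0, 12.0, 11.0],
}


def sums_of_squares(groups):
    """Return {name: (sum of squares, degrees of freedom)} for a one-way layout."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    n_total, n_levels = len(all_values), len(groups)
    # Total variation of every observation around the grand mean.
    sst = sum((y - grand_mean) ** 2 for y in all_values)
    # Between-group variation: group means around the grand mean.
    sstr = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-group (error) variation: observations around their own group mean.
    sse = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    return {
        "SST": (sst, n_total - 1),
        "SSTR": (sstr, n_levels - 1),
        "SSE": (sse, n_total - n_levels),
    }


ss = sums_of_squares(groups)
for name, (value, df) in ss.items():
    print(f"{name}: {value:.4f} on {df} df")
```

Both the sums of squares and the degrees of freedom add up: the treatment and error rows together account for the total.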
Source           SS    DF    MS       F    p
Treatment (TR)   √     √     SS/DF
Total (T)        √     √
SS = Sum of Squares
DF = Degrees of Freedom
MS = Mean Square
F = F statistic (test statistic)
p = p-value (significance of the test statistic)
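As a rough illustration of how the table entries relate, each mean square is a sum of squares divided by its degrees of freedom, and in a one-way ANOVA the F statistic is the treatment mean square divided by the error mean square. The sums of squares below are hypothetical values chosen for easy arithmetic; only the layout (4 levels, 24 observations) mirrors the example in the notes, and the function name is made up.

```python
def mean_squares_and_f(ss_treatment, ss_error, n_levels, n_total):
    """Fill in DF, MS and F for a one-way ANOVA table (illustrative helper)."""
    df_treatment = n_levels - 1
    df_error = n_total - n_levels
    ms_treatment = ss_treatment / df_treatment  # MS = SS / DF
    ms_error = ss_error / df_error
    return {
        "df_treatment": df_treatment,
        "df_error": df_error,
        "ms_treatment": ms_treatment,
        "ms_error": ms_error,
        "F": ms_treatment / ms_error,
    }


# Hypothetical sums of squares, NOT results from the lecture.
table = mean_squares_and_f(ss_treatment=300.0, ss_error=100.0, n_levels=4, n_total=24)
print(table)
```

The p-value would then come from the F distribution with the two degrees of freedom shown, which statistical software such as Minitab reports directly.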
Analysis of Variance
yij = µ + τi + εij
Where:
yij = jth value of the ith treatment level
µ = overall (grand) mean
τi = effect of ith treatment level
εij = random error of the jth value of the ith treatment level
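To make the model concrete, the treatment effects can be estimated as each group mean minus the overall mean. The sketch below uses the four group means printed on the Strength against Concentration plot in these notes; it does not assume which mean belongs to which concentration. Averaging the group means gives the overall mean only because every group has the same number of observations (6 each here).

```python
from statistics import mean

# Group means as printed on the plot in the notes (order is not tied to a concentration).
group_means = [10.0000, 15.6667, 17.0000, 21.1667]

# Balanced design (6 observations per group), so the overall mean is the mean of group means.
grand_mean = mean(group_means)

# Estimated effect of each treatment level: tau_hat_i = group mean - overall mean.
effects = [m - grand_mean for m in group_means]

print(f"estimated overall mean = {grand_mean:.4f}")
for m, tau in zip(group_means, effects):
    print(f"group mean {m:8.4f} -> estimated effect {tau:+.4f}")
```

The estimated effects sum to zero, and the predicted value for any observation in a group is simply the overall mean plus that group's effect, i.e. the group mean.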
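[Figure: plot against Concentration (5%, 10%, 15%, 20%) marking the overall mean µ, the treatment means µ̂1 and µ̂2, and the estimated treatment effects τ̂1 and τ̂2.]
Predicted value for each treatment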
[Figure: residual plots. Panels: Percent against Residual; Residual against Fitted Value; histogram of Residual; Residual against Observation Order.]
A histogram of residuals εij to assess normality.
A plot of residuals εij against the order in which the observations were obtained to test independence of observations.
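The quantities behind these plots are easy to compute: in a one-way layout the fitted value is the mean of the observation's treatment level, and the residual is the observation minus that fitted value. The following sketch uses hypothetical runs (run order, level, response) that are not from the lecture; plotting is left out so that the example needs only the standard library.

```python
from statistics import mean

# Hypothetical observations in run order, NOT from the lecture: (run, level, response).
runs = [
    (1, "A", 10.2),
    (2, "B", 12.1),
    (3, "A", 9.8),
    (4, "C", 14.9),
    (5, "B", 11.9),
    (6, "C", 15.1),
]


def fitted_and_residuals(runs):
    """Return (run, fitted value, residual) for each observation, in run order."""
    by_level = {}
    for _, level, y in runs:
        by_level.setdefault(level, []).append(y)
    level_means = {level: mean(ys) for level, ys in by_level.items()}
    return [(run, level_means[level], y - level_means[level]) for run, level, y in runs]


rows = fitted_and_residuals(runs)
for run, fitted, resid in rows:
    print(f"run {run}: fitted = {fitted:.2f}, residual = {resid:+.2f}")
```

Plotting the residual column against the fitted column, or against the run number, gives the fitted-value and observation-order panels respectively.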
The engineer suspects that the four plots of land that she selected for her experiment could potentially influence the results. The engineer needs to control the effect of plot of land (a background variable) statistically, within her ANOVA.
She has four plots of land: Land 1, Land 2, Land 3 and Land 4.
Factor Information
Factor Levels Values
Land (Blocks) 4 1, 2, 3, 4
Formulation 4 A, B, C, D
yij = µ + τi + βj + εij
Where:
yij = the value in the jth block for the ith treatment level
µ = overall (grand) mean
τi = effect of treatment i
βj = effect of block j
εij = random error component
i = 1,2,…,a
j = 1,2,…,b
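A minimal sketch of how this model splits the variation is shown here, assuming one observation per treatment-block combination. The responses are hypothetical (they are not the engineer's data); rows stand for the four formulations and columns for the four plots of land. In this layout the total sum of squares separates into treatment, block, and error parts.

```python
from statistics import mean

# Hypothetical responses, NOT from the lecture.
# Rows = treatments (formulations A, B, C, D); columns = blocks (Land 1..4).
y = [
    [10.0, 12.0, 11.0, 13.0],
    [12.0, 15.0, 13.0, 16.0],
    [9.0, 11.0, 10.0, 12.0],
    [14.0, 16.0, 15.0, 17.0],
]


def rcbd_decomposition(y):
    """Estimate effects and sums of squares for y_ij = mu + tau_i + beta_j + eps_ij."""
    a, b = len(y), len(y[0])
    grand = mean(v for row in y for v in row)
    tau = [mean(row) - grand for row in y]  # treatment effects
    beta = [mean(y[i][j] for i in range(a)) - grand for j in range(b)]  # block effects
    sst = sum((v - grand) ** 2 for row in y for v in row)
    sstr = b * sum(t ** 2 for t in tau)
    ssb = a * sum(bj ** 2 for bj in beta)
    sse = sum(
        (y[i][j] - grand - tau[i] - beta[j]) ** 2 for i in range(a) for j in range(b)
    )
    df = {"treatment": a - 1, "block": b - 1, "error": (a - 1) * (b - 1), "total": a * b - 1}
    return {"tau": tau, "beta": beta, "SST": sst, "SSTR": sstr, "SSB": ssb, "SSE": sse, "df": df}


result = rcbd_decomposition(y)
print({k: round(v, 4) for k, v in result.items() if k.startswith("SS")})
print(result["df"])
```

Removing the block sum of squares from the error term is what lets the analysis control for plot of land statistically.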