ASSIGNMENT 3
AZ Evulukwu
Assignment 3 - Similitude, Modelling, and Data Analysis
[email protected]
Memorial
University
Newfoundland
709-691-3790
Dr Leonardo Lye
10/25/2011
TABLE OF CONTENTS
1)
a.
b.
c.
2)
a.
b.
c.
d.
3)
a.
b.
c.
d.
4)
a.
i.
ii.
iii.
Summary of Results
Q1)
We ran only half of the runs of the full factorial and still
obtained similar results. Every significant effect estimated
from the half factorial differs from its full factorial value by
less than 5% (the largest difference is 4.15%, for AB). In
practical terms, we have considerably reduced our cost
without compromising the results of the experiment.
Q2)
We halved our runs again and still obtained broadly similar
results. The quality of the results has taken a hit, however,
as Effect E differs by 45% from its half factorial value. This
is too large a difference to dismiss as an anomaly. Overall we
have further reduced our cost, but the accuracy of the results
has declined in the process.
Q3)
The blocked design has forced us to lose some effects,
but we ensured these were insignificant effects. The %
difference between the blocked effects and the full factorial
effects is very small (under 3%), which means the two designs
give very similar results. In fact, we could argue that the
blocked design is more suitable, as its effects are judged
against a smaller set of terms, i.e. without the lost effects.
Q4)
Results are still fairly similar to the full factorial. The
assumption of normally distributed residuals no longer holds,
even after transforming the model. This calls into question
whether a combination of half factorial and blocked design is
suitable for analysing this type of data.
1)
Consider Question 3 of Assignment 2. Assume that ce=600. Suppose that only ½ of the
32 runs could be made due to budget constraints
a.
Choose the half you think should be run.
We have factors A, B, C, D, E
To decide which runs to make, we need to choose a defining contrast
We want this contrast to be an effect that is most likely to be
zero, as we lose it anyway
I choose I = ABCDE as the defining contrast
Runs with an even number of letters (0, 2 or 4) in common with
ABCDE form the principal block; runs with an odd number (1, 3 or 5)
form the other half, which corresponds to the generator E = ABCD
The half with an odd number of letters in common is the one that
is run, and is given in TAB 1 below
TAB 1
1st Block (runs made)
a b
c abc
d abd
acd bcd
e abe
ace bce
ade bde
cde abcde
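The list of runs can be cross-checked with a short script. This is a minimal Python sketch and not part of the Design-Expert workflow; the function name `half_fraction` is hypothetical. It keeps the treatment combinations whose coded ABCDE contrast equals +1, which is the fraction produced by the generator E = ABCD.

```python
from itertools import product

FACTORS = "abcde"

def half_fraction(factors=FACTORS, sign=+1):
    """Return the runs of a 2^(k-1) fraction where the full-order
    interaction contrast (product of all coded levels) equals `sign`."""
    runs = []
    for levels in product((-1, 1), repeat=len(factors)):
        contrast = 1
        for level in levels:
            contrast *= level
        if contrast == sign:
            # A run is labelled by the letters of the factors at high level
            label = "".join(f for f, lv in zip(factors, levels) if lv == 1)
            runs.append(label or "(1)")
    return runs

runs = half_fraction()
print(len(runs), sorted(runs, key=lambda r: (len(r), r)))
```

Passing `sign=-1` would return the complementary half, which contains the run (1).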
b.
What are the alias relationships for your design?
We get our alias relationships by multiplying each effect by our
defining contrast (I = ABCDE).
The multiplication is done modulo 2, so any squared letter drops
out, e.g. A x ABCDE = A²BCDE = BCDE, giving A = BCDE
As said earlier, we automatically lose our defining contrast
as an effect. Continuing the multiplication for the remaining
30 effects gives the alias relationships in TAB 2 below
TAB 2
Alias Relationships
A=BCDE AB=CDE CD=ABE
B=ACDE AC=BDE AE=BCD
C=ABDE BC=ADE BE=ACD
D=ABCE AD=BCE CE=ABD
E=ABCD BD=ACE DE=ABC
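The alias multiplication can also be sketched in code. In the minimal Python example below (the helper name `multiply` is hypothetical), an effect is treated as a set of letters, and multiplying two effects modulo 2 is the symmetric difference of the two sets.

```python
def multiply(word1, word2):
    """Multiply two effect words modulo 2: letters that appear in
    both words are squared and therefore cancel."""
    letters = set(word1) ^ set(word2)
    return "".join(sorted(letters)) or "I"

DEFINING_WORD = "ABCDE"
EFFECTS = ["A", "B", "C", "D", "E",
           "AB", "AC", "BC", "AD", "BD",
           "CD", "AE", "BE", "CE", "DE"]

for effect in EFFECTS:
    print(f"{effect} = {multiply(effect, DEFINING_WORD)}")
```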
c.
Analyze the results and provide a practical interpretation of the results and compare
them to your answers for a full factorial design.
Now we use Design-Expert for our analysis
We go to the 2-level factorial design
We select 2^(5-1), 1 replicate and 1 block
We also want to show the generators so that we can alter them if
necessary
Given our defining contrast is I = ABCDE, we pick E = ABCD as
our factor generator
We now label our factors
We put in the yield values for the treatment combinations (tc's)
we want to run. Our table is shown in TAB 3
TAB 3
Next we go to the effects list and set any low-value effects to
error. These are insignificant effects
We now run an ANOVA and look at our Prob > F values.
Any effect with a value greater than 0.05, or even close to it,
is insignificant
We then go back to our effects list and set these effects to
error
We keep doing this until all the effects in the ANOVA are
significant
Our effects list is shown below in FIG 1
FIG 1
Our normal plot of the standardized effects is shown below in FIG 2
FIG 2
The significant effects are clearly A (1436.75), B (3688.25),
D (486.75), E (413.25) and AB (1236.75).
These are the effects that fall off the straight line
The half-normal plot of the standardized effects shows a similar
trend in FIG 3
FIG 3
Our ANOVA table is shown below in TAB 4
TAB 4
The model overall is significant, as its F value is 211.27
Now we put our model together
A full model would look like below. We get our coefficients from
our ANOVA analysis (each coefficient is the effect divided by 2)
But we only use the significant effects, which are A, B, AB, D and E
The model now takes the form
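As a rough illustration of the "effect divided by 2" rule, the Python sketch below turns the significant effects quoted earlier into coded-unit coefficients. The function name `predict` is hypothetical, and the intercept is left as an argument because its value is not stated in the text.

```python
# Significant effects quoted in the text for the half factorial
effects = {"A": 1436.75, "B": 3688.25, "D": 486.75, "E": 413.25, "AB": 1236.75}

# In coded units (-1 / +1) each coefficient is half the effect
coefficients = {term: value / 2 for term, value in effects.items()}

def predict(intercept, x):
    """Evaluate the reduced model at coded levels.

    `x` maps a factor letter to -1 or +1. `intercept` must be supplied
    by the caller (assumption: taken from the ANOVA output)."""
    total = intercept
    for term, coef in coefficients.items():
        level = 1
        for letter in term:  # an interaction term is a product of levels
            level *= x[letter]
        total += coef * level
    return total

for term, coef in coefficients.items():
    print(f"{term}: {coef}")
```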
Now checking assumptions, we look at the normality of the
residuals
We make a normal plot of the residuals, shown in FIG 4
FIG 4
The residuals are approximately normally distributed, as the
points all lie fairly close to the straight line. As a result
our statistical tests are valid
Next we check whether the residuals have a constant variance
with FIG 5 below
FIG 5
The plot shows a random scatter without a funnel shape, which
means the variance is fairly constant
We now check the independence of the results with the
Residuals vs. Run plot in FIG 6
FIG 6
The residuals are scattered randomly against run order, showing
that our runs are independent of each other.
FIG 7 shows the goodness of fit between predicted and actual values
The diagram shows a very good fit
FIG 7
We use our model graphs for further analysis
From previous work our significant effects are A, B, AB, D & E
Looking at our effects list, there are no significant
interactions between D or E and any of the other effects
Looking at our graphs below in FIG 8 & 9, compressive strength
does not change much with a change in D or E level
We therefore ignore the D and E diagrams in the analysis
Our analysis of compressive strength will be based on A, B &
AB
FIG 8
FIG 9
The AB interaction is shown in FIG 10
FIG 10
At low level A, the combination with low level B gives over 1000 units
of compressive strength
The combination with high level B gives close to 3600 units of
compressive strength
At high level A, the combination with low level B gives about
1000 units of compressive strength
The combination with high level B gives over 5200 units of
compressive strength
The largest gain is obtained by switching B from its low to its high level
Therefore, to get high compressive strength, we should increase
Time (effect B) as much as possible and keep the mix (effect A) as
high as possible
Given the strong interaction, we now look at conditional effects
B+ has the highest value
This means effect B has its highest impact when A is high
as well
A- has the lowest value
This means effect A has its lowest impact when B is low
as well
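The conditional effects can be sketched numerically from the effects quoted earlier. The Python snippet below assumes the standard two-level relationship (conditional effect = main effect plus or minus the interaction effect); the function name `conditional_effects` is hypothetical and the values are computed here, not read from Design-Expert.

```python
def conditional_effects(main_a, main_b, interaction_ab):
    """Conditional (simple) effects in a two-level design:
    the effect of one factor with the other held at one level."""
    return {
        "A at B high": main_a + interaction_ab,
        "A at B low": main_a - interaction_ab,
        "B at A high": main_b + interaction_ab,
        "B at A low": main_b - interaction_ab,
    }

# Half factorial effects quoted in the text: A, B and AB
ce = conditional_effects(1436.75, 3688.25, 1236.75)
for name, value in ce.items():
    print(f"{name}: {value}")
```

Under this assumption, the largest conditional effect is B with A high and the smallest is A with B low, which agrees with the ranking described above.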
Below is a table (TAB 5) that compares the values from the full
factorial and half factorial models
The principal values we look at are the effects
TAB 5
            Full factorial   1/2 factorial   % Difference
Effect A    1474.25          1436.75          2.54
Effect B    3550             3688.25         -3.89
Effect E    413.25           413.25           0
Effect D    486.75           486.75           0
Effect AB   1187.5           1236.75         -4.15
The table shows that there is not a large difference between the
effect values for the full factorial and the half factorial
The largest % differences (in magnitude) are for Effect B (time) and
Effect AB, at 3.89% and 4.15% respectively
As a result, the various plots for the two models are
similar.
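The % difference column can be reproduced with a few lines of Python. This sketch assumes the convention (full - half) / full x 100, which matches the signs in TAB 5; the function name `percent_difference` is hypothetical.

```python
def percent_difference(reference, other):
    """Signed % difference of `other` relative to `reference`."""
    return (reference - other) / reference * 100

full = {"A": 1474.25, "B": 3550.0, "E": 413.25, "D": 486.75, "AB": 1187.5}
half = {"A": 1436.75, "B": 3688.25, "E": 413.25, "D": 486.75, "AB": 1236.75}

for name in full:
    diff = percent_difference(full[name], half[name])
    print(f"Effect {name}: {diff:.2f}%")
```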
The significance of this is that we needed 32 runs to fully
analyse our data with the full factorial
We needed only half of those runs to get similar results with the
half factorial
This would mean a big saving in cost in any practical
situation
2)
a.
Repeat Problem 1 if only ¼ of the 32 runs could be run
We have factors A, B, C, D, E
To decide which runs to make, we need to choose defining contrasts
We want these contrasts to be effects that are most likely to be
zero, as we lose them anyway
Given we are only doing 8 runs out of a possible 32, we choose 2
defining contrasts and the 3rd one picks itself automatically
(it is the product of the first two: ABCD x ACE = BDE)
I choose I = ABCD = ACE = BDE as the defining contrasts
Design-Expert decides which runs are made with these
defining contrasts
This is the block that is run, given in TAB 6 below
TAB 6
1st Block (runs made)
e cd
ad ace
bde bc
ab abcde
We get our alias relationships by multiplying each effect by every
word of our defining relation (I = ABCD = ACE = BDE).
The multiplication is done modulo 2, so any squared letter drops
out, e.g. A x ABCD = BCD, A x ACE = CE and A x BDE = ABDE
The resulting alias relationship would then be A = BCD = CE = ABDE
As said earlier, we automatically lose our defining contrasts
as effects. Continuing the multiplication for the remaining
28 effects gives the alias relationships in TAB 7 below
TAB 7
Alias Relationships
A = BCD = CE = ABDE
B = ACD = ABCE = DE
C = ABD = AE = BCDE
D = ABC = ACDE = BE
E = ABCDE = AC = BD
AB = CD = BCE = ADE
BC = AD = ABE = CDE
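The quarter fraction aliases can be checked with a short script. The Python sketch below is only an illustration (the names `multiply` and `alias_group` are hypothetical): it builds the complete defining relation from the two chosen words ABCD and ACE, reads off the resolution as the length of the shortest word, and multiplies an effect by every word to get its aliases.

```python
from itertools import combinations

def multiply(*words):
    """Multiply effect words modulo 2 (squared letters cancel)."""
    letters = set()
    for word in words:
        letters ^= set(word)
    return "".join(sorted(letters)) or "I"

generators = ["ABCD", "ACE"]

# The complete defining relation holds every product of the chosen words
defining_words = sorted(
    {multiply(*combo) for r in (1, 2) for combo in combinations(generators, r)},
    key=lambda w: (len(w), w),
)
resolution = min(len(word) for word in defining_words)

def alias_group(effect):
    """Return the effect followed by its aliases."""
    return [effect] + [multiply(effect, word) for word in defining_words]

print("I =", " = ".join(defining_words), "| resolution", resolution)
for effect in ["A", "B", "C", "D", "E", "AB", "BC"]:
    print(" = ".join(alias_group(effect)))
```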
b.
Construct the design and analyze the data that are obtained by selecting only the
response for the eight runs in your design
We put in the yield values for the treatment combinations (tc's)
we want to run. Our table is shown in TAB 8
TAB 8
Next we go to the effects list and set any low-value effects to
error. These are insignificant effects
We now run an ANOVA and look at our Prob > F values.
Any effect with a value greater than 0.05, or even close to it,
is insignificant
We then go back to our effects list and set these effects to
error
We keep doing this until all the effects in the ANOVA are
significant
Our effects list is shown below in FIG 11
FIG 11
Our normal and half-normal plots of the standardized effects
are shown below in FIG 12 & FIG 13
The significant effects are clearly A (1500), B (3850), D (550),
E (600) and AB (1150).
These are the effects that fall off the straight line
FIG 12
FIG 13
Our ANOVA table is shown below in TAB 9
TAB 9
The model overall is significant, as its F value is 609.84
Now we put our model together
A full model would look like below. We get our coefficients from
our ANOVA analysis (each coefficient is the effect divided by 2)
But we only use the significant effects, which are A, B, AB, D and E
The model now takes the form
Now checking assumptions, we look at the normality of the
residuals
We make a normal plot of the residuals, shown in FIG 14
FIG 14
The residuals are approximately normally distributed, as the
points all lie fairly close to the straight line. As a result
our statistical tests are valid
Next we check whether the residuals have a constant variance
with FIG 15 below
FIG 15
The plot shows a random scatter without a funnel shape, which
means the variance is fairly constant
We now check the independence of the results with the
Residuals vs. Run plot in FIG 16
FIG 16
The residuals are scattered randomly against run order, showing
that our runs are independent of each other.
FIG 17 shows the goodness of fit between predicted and actual values
The diagram shows a very good fit
FIG 17
From previous work our significant effects are A, B, AB, D & E
Looking at our effects list, there are no significant
interactions between D or E and any of the other effects
Looking at our graphs below in FIG 18 & 19, compressive
strength does not change much with a change in D or E level
FIG 18
FIG 19
We therefore ignore the D and E diagrams in the analysis
Our analysis of compressive strength will be based on A, B &
AB
The AB interaction is shown in FIG 20
FIG 20
At low level A, the combination with low level B gives over 500 units
of compressive strength
The combination with high level B gives close to 3650 units of
compressive strength
At high level A, the combination with low level B gives about
1000 units of compressive strength
The combination with high level B gives over 5225 units of
compressive strength
The largest gain is obtained by switching B from its low to its high level
Therefore, to get high compressive strength, we should increase
Time (effect B) as much as possible and keep the mix (effect A) as
high as possible
Given the strong interaction, we now look at conditional effects
B+ has the highest value
This means effect B has its highest impact when A is high
as well
A- has the lowest value
This means effect A has its lowest impact when B is low
as well
c.
Compare the answers obtained with that of the ½ factorial above
The table below (TAB 10) places the half factorial and quarter
factorial effects side by side
TAB 10
            1/2 factorial   1/4 factorial   % Difference
Effect A    1436.75         1500             -4.4
Effect B    3688.25         3850             -4.39
Effect D    486.75          550              -13
Effect E    413.25          600              -45.19
Effect AB   1236.75         1150              7
The table shows that there is one large difference between the
effect values for the half factorial and the quarter factorial
The largest % differences (in magnitude) are for Effect D
(temperature) and Effect E (drying time), at 13% and 45.19%
respectively
The % difference for Effect E is very large, so the quarter
factorial estimate of E cannot be fully trusted
Our half factorial is a resolution V design (the only word in
I = ABCDE has five letters), while our quarter factorial is only
resolution III (its shortest words, ACE and BDE, have three
letters), so we have to say our half factorial model is more
reliable
Still, the plots for the two models are similar.
The significance of this is that we needed 16 runs to fully
analyse our data with the half factorial model
We needed only half of those runs to get similar results with the
quarter factorial model
This did come at a cost, though, as the half factorial results seem
a lot more reliable than the quarter factorial results
Overall we achieved similar results at a lower cost but with less
precision
d.
Comment on this design.
The design is good enough for a routine analysis, as it gives
broadly similar values to the half and full factorial designs
But it is still a lot less precise than those designs
3)
Based on the problem of Question 3 of Assignment 2, if the 32 runs can only be
completed in 4 days,
a.
Set up the blocking scheme for the four days.
We have 32 runs divided over 4 days. We are therefore going to
divide the runs into 4 blocks of 8 runs, one block per day
To do this, we go to the 2-level factorial in Design-Expert and
choose 2^5, 1 replicate, 4 blocks
We choose ABC and ADE as our defining contrasts, i.e. the effects
confounded with blocks. An additional effect is also confounded,
but this is shown later
In our principal block, every run must have an even number of
letters (2 or 0) in common with each of our defining contrasts
After this, we multiply our principal block by a run that has not
been used yet to get the contents of each remaining block
Our blocking scheme is shown in TAB 11
TAB 11
Principal Block   Block 2   Block 3   Block 4
(Day 1)           (Day 2)   (Day 3)   (Day 4)
1                 b         ab        a
bc                c         ac        abc
abd               ad        d         bd
acd               abcd      bcd       cd
abe               ae        e         be
ace               abce      bce       ce
de                bde       abde      ade
bcde              cde       acde      abcde
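The blocking scheme can be reproduced with a short script. In this minimal Python sketch (the function name `block_key` is hypothetical), each of the 32 runs is assigned to a block according to whether it shares an even or odd number of letters with ABC and with ADE; the principal block is the one where both counts are even.

```python
from itertools import product

FACTORS = "abcde"
BLOCK_GENERATORS = ("abc", "ade")  # effects confounded with days

def block_key(run):
    """Parity (0 = even, 1 = odd) of the letters a run shares
    with each confounded effect."""
    return tuple(len(set(run) & set(gen)) % 2 for gen in BLOCK_GENERATORS)

blocks = {}
for levels in product((0, 1), repeat=len(FACTORS)):
    run = "".join(f for f, lv in zip(FACTORS, levels) if lv)
    blocks.setdefault(block_key(run), []).append(run or "(1)")

for key in sorted(blocks):
    print(key, blocks[key])
```

The key `(0, 0)` is the principal block; the other three keys correspond to the remaining days, in no particular order.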
b.
What additional effect is confounded with days?
The extra confounded effect is given by the product of the
2 defining effects above: ABC x ADE = A²BCDE = BCDE
So BCDE is also confounded with days
c.
Analyze the data and determine which factors are significant.
Now we input our yield values for each of the
blocks
Next we go to the effects list and set any low-value effects to
error. These are insignificant effects
We now run an ANOVA and look at our Prob > F values.
Any effect with a value greater than 0.05, or even close to it,
is insignificant
We then go back to our effects list and set these effects to
error
We keep doing this until all the effects in the ANOVA are
significant
Our effects list is shown below in FIG 21
FIG 21
Our normal and half-normal plots of the standardized effects
are shown below in FIG 22 & FIG 23
The significant effects are clearly A (1462.13), B (3537.88),
D (474.63), E (425.38) and AB (1199.63).
These are the effects that fall off the straight line
FIG 22
FIG 23
d.
Is there a difference in results between the full factorial and blocked designs? Give an
explanation.
We use the table below (TAB 12) to compare side by side the effects
from the full factorial and blocked designs
TAB 12
            Full factorial   Blocked design   % Difference
Effect A    1474.25          1462.13           0.82
Effect B    3550             3537.88           0.34
Effect D    486.75           474.63            2.49
Effect E    413.25           425.38           -2.94
Effect AB   1187.5           1199.63          -1.02
We get our largest % differences (in magnitude) for effects D and E,
at 2.49% and 2.94% respectively
This shows a high level of similarity between our full factorial
and blocked designs
We lose effects ABC, ADE and BCDE in our blocked design, but
these effects are insignificant anyway
In fact the blocked design is probably a better model, because
its main effects are judged against a smaller set of terms,
i.e. without the lost effects
4)
a.
Based on the problem of Question 2, Assignment 2, if there is budget for only ½ the runs,
and only 4 runs can be completed per day
i.
Set up the blocking scheme
We have 8 runs to be done, and 4 runs can be done per day, so we
need 2 blocks
We choose a 2^(4-1) factorial (resolution IV), 1 replicate and 2 blocks
We choose I = ABCD as our defining contrast (factor generator
D = ABC) and AB as our block generator
The factor generator halves our runs and the block generator
splits the remaining 8 runs into the 2 blocks needed
Note we lose effect ABCD, but it is assumed insignificant
Runs with 2 or 0 letters in common with AB go into our
principal block
We multiply our principal block runs by ‘ad’ to get the
runs for block 2
As said earlier, ABCD is the defining contrast, so the runs in both
blocks must have an even number of letters in common with it.
The blocks are shown below in TAB 13
TAB 13
Principal Block Block 2
(DAY 1) (DAY 2)
1 ad
ab bd
cd ac
abcd bc
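The two blocks can be reproduced with a short script. This minimal Python sketch (the helper name `shared` is hypothetical) first keeps the runs with an even number of letters in common with ABCD, then puts the runs with an even number of letters in common with AB into the principal block.

```python
from itertools import product

FACTORS = "abcd"

def shared(run, word):
    """Number of letters a run has in common with an effect word."""
    return len(set(run) & set(word.lower()))

all_runs = ["".join(f for f, lv in zip(FACTORS, levels) if lv)
            for levels in product((0, 1), repeat=len(FACTORS))]

# Half fraction: even number of letters in common with ABCD
fraction = [run for run in all_runs if shared(run, "ABCD") % 2 == 0]

# Blocking: even number of letters in common with AB -> principal block
principal = [run or "(1)" for run in fraction if shared(run, "AB") % 2 == 0]
block_2 = [run for run in fraction if shared(run, "AB") % 2 == 1]

print("Principal block (Day 1):", principal)
print("Block 2 (Day 2):", block_2)
```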
ii.
Analyze the resulting data
We now input our yield values for the various blocks and the
resultant table is shown below in TAB 14
TAB 14
Next we go to the effects list and set any low-value effects to
error. These are insignificant effects
We now run an ANOVA and look at our Prob > F values.
Any effect with a value greater than 0.05, or even close to it,
is insignificant
We then go back to our effects list and set these effects to
error
We keep doing this until all the effects in the ANOVA are
significant
Our effects list is shown below in FIG 24
FIG 24
Our normal and half-normal plots of the standardized effects
are shown below in FIG 25 & FIG 26
The significant effects are A, C, D, AC and AD (their values are
given in the effects list, FIG 24).
These are the effects that fall off the straight line
FIG 25
FIG 26
Our ANOVA table is shown below in TAB 15
TAB 15
This table reveals a problem: the Prob > F value for effect C is
0.0577
This is larger than 0.05, indicating that C is not a significant
effect
Normally we would go back and change C to error in
our effects list
Unfortunately this would also mean changing AC to error, due
to hierarchy
This gives a bad model in which all the effects turn out to be
insignificant (Prob > F higher than 0.05), as shown in TAB 16
TAB 16
We therefore have to assume C is significant despite the
ANOVA analysis saying otherwise
This problem has been caused by the combination of the half
factorial and the blocking, and has led to slightly lower
accuracy of the results
Proceeding on the assumption that C is significant, we put our
model together
A full model would look like below. We get our coefficients from
our ANOVA analysis (each coefficient is the effect divided by 2)
But we only use the significant effects, which are A, C, AC, D and AD
The model now takes the form
Now checking assumptions, we look at the normality of the
residuals
We make a normal plot of the residuals, shown in FIG 27
FIG 27
The plot shows that the residuals do not follow a normal
distribution, as the points do not fall on the straight line.
To try to solve this, we apply a power transform to the
response.
We get our optimum lambda value from the Box-Cox plot
From this, lambda is 1.99
We apply the transform, which changes our ANOVA
table (TAB 17) and the overall model, as below
TAB 17
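For illustration, a power transform of the kind applied here can be sketched in Python as below. This assumes the simple form y' = y^lambda (with the natural log used when lambda is 0). The response values are hypothetical placeholders, not the assignment data, and the function name `power_transform` is also hypothetical.

```python
import math

def power_transform(values, lam):
    """Apply y' = y**lam to each response (natural log when lam == 0).
    Assumes all responses are positive."""
    if lam == 0:
        return [math.log(v) for v in values]
    return [v ** lam for v in values]

# Hypothetical responses, used only to show the mechanics
responses = [140.0, 257.0, 375.0, 492.5]
LAMBDA = 1.99  # optimum lambda read from the Box-Cox plot in the text

for original, transformed in zip(responses, power_transform(responses, LAMBDA)):
    print(f"{original} -> {transformed:.1f}")
```

A lambda of 1.99 is close to 2, so the transform is nearly a square of the response.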
We make a second normal plot of the residuals in FIG 28
FIG 28
Once again the points do not fall on a straight line. This
ultimately means the residuals of the model do not follow a
normal distribution, with or without the transformation
Next we check whether the residuals have a constant variance
with FIG 29 below
FIG 29
The plot shows a random scatter without a funnel shape, which
means the variance is fairly constant
We now check the independence of the results with the
Residuals vs. Run plot in FIG 30
FIG 30
The residuals are scattered randomly against run order, showing
that our runs are independent of each other.
FIG 31 shows the goodness of fit between predicted and actual values
FIG 31
The diagram shows a very good fit
Our significant effects are A, C, D, AC & AD
Interaction AC is shown in FIG 32
FIG 32
At low level A, the combination with low level C gives over 140 units
The combination with high level C gives close to 375 units
At high level A, the combination with low level C gives over 375 units
The combination with high level C gives about 375 units
The largest gain is obtained by switching A from its low to its high level
Therefore, to get high results, we should increase Time to as
high as possible
Given the strong interaction, we now look at conditional effects
A- has the highest value
This means effect A has its highest impact when C is low
as well
A+ has the lowest value
This means effect A has its lowest impact when C is high
as well
Interaction AD is shown in FIG 33
FIG 33
At low level A, the combination with low level D gives over 257 units
The combination with high level D gives close to 257 units as
well
At high level A, the combination with low level D gives below 257
units
The combination with high level D gives over 492.5 units
The largest gain is obtained by switching D from its low to its high level
Therefore, to get high results, we should increase Temperature
to as high as possible
Given the strong interaction, we now look at conditional effects
D+ has the highest value
This means effect D has its highest impact when A is high
as well
D- has the lowest value
This means effect D has its lowest impact when A is low
as well
iii.
Comment on your results
The results are still fairly similar to those of the full factorial
experiment.
But the combination of the half factorial and the blocking
makes the results a lot less accurate than before.
The assumption of normally distributed residuals no longer holds,
even after transforming the model
This calls into question whether a combination of a half
factorial and a blocked design is suitable for analysing the data