Solutions - Lab 4 - Assumptions & Multiple Comparisons: Learning Outcomes
An analysis of variance was undertaken to determine if the density of planting influenced the total dry weight
of maize for the plot. The results are shown below.
Calculate the standard error for the difference between any two treatment means;
$$SED = \sqrt{\text{Residual MS} \times \frac{2}{r}} = 1.412, \qquad t_{crit} = t^{0.025}_{12} = 2.179$$
$$LSD = t_{crit} \times SED = 2.179 \times 1.412 = 3.077 \text{ kg}$$
Determine which pairs of means are significantly different from each other.
Comparisons:
• P20 vs P30: |17.58 – 27.18| = 9.60, significant (abs mean diff > LSD);
• P20 vs P40: |17.58 – 27.14| = 9.56, significant (abs mean diff > LSD);
• P30 vs P40: |27.18 – 27.14| = 0.04, not significant (abs mean diff < LSD).
So P20 has a significantly lower mean yield than both P30 and P40, but there is no significant difference
between P30 and P40 (P > 0.05).
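The comparisons above can also be checked in R. This is a minimal sketch that takes the SED (1.412), the residual degrees of freedom (12) and the three treatment means from the worked solution; the object names are illustrative only.

```r
# Values taken from the worked solution above
sed   <- 1.412                 # standard error of a difference between two means
tcrit <- qt(0.975, df = 12)    # two-sided 5% critical value on 12 residual df
lsd   <- tcrit * sed           # least significant difference

means <- c(P20 = 17.58, P30 = 27.18, P40 = 27.14)

# Absolute difference between every pair of treatment means
abs_diff <- abs(outer(means, means, "-"))

# TRUE where a pair of means differs by more than the LSD
significant <- abs_diff > lsd

print(round(lsd, 3))
print(significant)
```

Using `qt` rather than a tabulated value avoids rounding the critical value by hand.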
In this Topic you are encouraged to test the assumptions using the residuals. This exercise illustrates why
it is not ideal to test the normality assumption using all of the observations irrespective of the treatments.
First we create 2 synthetic datasets, each a sample of 50 observations (n=50) from a normally distributed
population. Both underlying populations have the same variation (sd=3) but different means (mean=10,
mean=40). We then plot histograms of each group individually, of both groups combined, and of the combined
residuals (observation minus group mean).
par(mfrow=c(2,2))
group1<-rnorm(n=50,mean=10,sd=3)
group2<-rnorm(n=50,mean=40,sd=3)
hist(group1,main="A: Group1",xlab="")
hist(group2,main="B: Group2",xlab="")
hist(c(group1,group2),main="C: Group1&2",xlab="")
hist(c(group1-mean(group1),group2-mean(group2)),main="D: Residuals Group1&2",xlab="")
[Figure: 2 x 2 panel of histograms (y-axis: Frequency). A: Group1; B: Group2; C: Group1&2; D: Residuals Group1&2.]
We can see that the histogram of each group looks normally distributed (A, B); however, when we combine the
data we have 2 distinct groupings centred on the mean of each group (C). Therefore, if we look at the raw data
irrespective of the groups we would not see a normally distributed dataset. This is because the effect
of each treatment (or group) is different, so each observation is shifted according to the treatment
it receives or the group it is in. If we examine the residuals (D), the treatment (or group) effects have been
removed and we can then assess whether the data are normal and have constant variance. This requires fitting
a model to the data, in this case a 1-way ANOVA model. This is why we test the assumptions on the residuals.
You could look at the distribution of each group separately, but for some experiments the replication is small,
so it is hard to assess normality. Using residuals allows all of the observations to be pooled together.
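To make the link between "observation minus group mean" and the fitted model concrete, the sketch below fits a 1-way ANOVA to simulated data of the same form and checks that the model residuals match the manual calculation. The seed, the data frame `sim` and the other object names are assumptions made for illustration.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Two simulated groups, as in the exercise above
group1 <- rnorm(n = 50, mean = 10, sd = 3)
group2 <- rnorm(n = 50, mean = 40, sd = 3)

sim <- data.frame(
  y     = c(group1, group2),
  group = factor(rep(c("group1", "group2"), each = 50))
)

# Fit the 1-way ANOVA model and extract its residuals
sim.aov     <- aov(y ~ group, data = sim)
model_resid <- unname(residuals(sim.aov))

# Manual residuals: observation minus its own group mean
manual_resid <- sim$y - ave(sim$y, sim$group)

# The two sets of residuals agree up to floating-point error
same <- isTRUE(all.equal(model_resid, manual_resid))
print(same)
```

Calling `hist(model_resid)` on these residuals gives a plot of the same kind as panel D.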
This exercise is from Exercise 1 in Practical 3. Here we will test the assumptions using residual diagnostics
and find significant differences using LSDs. The data is found in the Diatoms worksheet.
library(readxl)
Testing assumptions
plots for outliers. Based on the normal distribution, about 95% of observations fall within ±2 SDs of the mean
or, in the case of standardised residuals, within ±2.
The figure below presents the histogram of the standardised residuals. The majority of the observations follow
a bell-shaped (normal) distribution. The exceptions are 2 observations less than -2. Given there are 34
observations, this is about 6% of the dataset, which is acceptable given we expect about 95% of observations
to be in the interval of ~[-2, 2].
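The "about 6%" figure is simple arithmetic, and it can be compared with what a standard normal distribution predicts. A short sketch, using only the counts quoted above (object names are illustrative):

```r
n_obs     <- 34   # number of observations (standardised residuals)
n_outside <- 2    # residuals falling outside [-2, 2]

# Observed proportion outside the interval
prop_outside <- n_outside / n_obs

# Proportion expected outside [-2, 2] for a standard normal variable
expected_outside <- 2 * pnorm(-2)

print(round(c(observed = prop_outside, expected = expected_outside), 3))
```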
hist(rstandard(anova.diatoms))
[Figure: Histogram of rstandard(anova.diatoms); x-axis: rstandard(anova.diatoms), y-axis: Frequency.]
The QQ plot below shows that the observed quantiles match the theoretical quantiles (assuming normality),
since the observations reasonably follow the 1:1 line. We can assume the residuals are normally distributed.
qqnorm(rstandard(anova.diatoms))
abline(0,1)
[Figure: Normal Q-Q Plot of rstandard(anova.diatoms); x-axis: Theoretical Quantiles, y-axis: Sample Quantiles.]
The plot below shows the standardised residuals plotted against the fitted values (the group means in this
case). To assess the assumption of constant variance we want to see the same spread of residuals as the
fitted values increase. This is the case here. We don’t want to see fanning, where the spread of
residuals increases or decreases as the fitted values increase.
plot(fitted(anova.diatoms),rstandard(anova.diatoms))
[Figure: rstandard(anova.diatoms) plotted against fitted(anova.diatoms).]
Statistics is made up of different tribes and some tribes use hypothesis testing to see if a dataset meets the
assumptions of normality and constant variance. One option is Bartlett’s test for constant variances.
The mechanics are not important but the function and syntax are shown below. The hypotheses are:
• H0 : $\sigma^2_{BACK} = \sigma^2_{LOW} = \sigma^2_{MED} = \sigma^2_{HIGH}$
• H1 : not all of the variances are equal
##
## Bartlett test of homogeneity of variances
##
## data: Diversity by Zinc
## Bartlett's K-squared = 0.25294, df = 3, p-value = 0.9686
Based on the P-value being > 0.05 we retain the null hypothesis; there is no evidence that the variances
differ.
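The general form of the call is sketched below on simulated data, since the name of the data frame used in the lab is not shown here. The output header "Diversity by Zinc" suggests a formula of the form `Diversity ~ Zinc`; the data frame `sim`, its columns and the seed are assumptions for illustration only.

```r
set.seed(42)  # assumption: fixed seed for a reproducible sketch

# Simulated stand-in: 4 groups of 10 observations with a common SD
sim <- data.frame(
  response = rnorm(40, mean = 2, sd = 0.5),
  group    = factor(rep(c("g1", "g2", "g3", "g4"), each = 10))
)

# Bartlett's test of homogeneity of variances using the formula interface
bt <- bartlett.test(response ~ group, data = sim)
print(bt)

# The test statistic has (number of groups - 1) degrees of freedom
print(unname(bt$parameter))
```

With 4 groups the test has 3 degrees of freedom, which matches the `df = 3` in the output above.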
Identify significant differences
In Topic 3 we used the lsmeans package to extract means for each group and their associated 95% CI. The
lsmeans package is also useful for producing a plot showing the mean and 95% CI, which is a nice way to
present the results.
#install.packages("lsmeans",repos="http://cran.csiro.au/")
#library(lsmeans)
#lsmeans(anova.diatoms, "Zinc")
#plot(lsmeans(anova.diatoms, "Zinc"))
Based on the non-overlapping confidence intervals the only pair of groups that is significantly different
is HIGH and LOW. However, more correctly we are testing whether the difference in means = 0, which is a
slightly different question to seeing if the 95% CIs around the means overlap. Looking at the 95% CIs around
the means is a conservative test in that it will under-estimate the number of times a significant difference
occurs. Therefore, the better approach is to use a least significant difference test, which we can obtain
using the agricolae package.
library(agricolae)
LSD.test(anova.diatoms,"Zinc",console=TRUE)
##
## Study: anova.diatoms ~ "Zinc"
##
## LSD t Test for Diversity
##
## Mean Square Error: 0.2172137
##
## Zinc, means and individual ( 95 %) CI
##
## Diversity std r LCL UCL Min Max
## BACK 1.797500 0.4852613 8 1.4609789 2.134021 0.76 2.27
## HIGH 1.277778 0.4268717 9 0.9605026 1.595053 0.63 1.90
## LOW 2.032500 0.4449960 8 1.6959789 2.369021 1.40 2.83
## MED 1.717778 0.5030104 9 1.4005026 2.035053 0.80 2.19
##
## Alpha: 0.05 ; DF Error: 30
## Critical Value of t: 2.042272
##
## Groups according to probability of means differences and alpha level( 0.05 )
##
## Treatments with the same letter are not significantly different.
##
## Diversity groups
## LOW 2.032500 a
## BACK 1.797500 a
## MED 1.717778 ab
## HIGH 1.277778 b
The LSD.test function also gives the 95% CI around each mean, but the last part of the output identifies which
pairs of means are significantly different. You will note the following pairs are different:
• ‘LOW’ and ‘HIGH’;
• ‘BACK’ and ‘HIGH’.
We have one more significant pair compared to looking at the 95% CIs. Make sure you can
interpret the letter notation for identifying pairs of groups with significantly different means.
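The letter groupings can be checked by hand from the numbers in the output. Because the replication is unequal (8 or 9 per group), each pair has its own LSD. The sketch below uses only the mean square error, error df, means and replicates printed above; the object names are illustrative.

```r
# Values copied from the LSD.test output above
mse   <- 0.2172137
tcrit <- qt(0.975, df = 30)   # approx. 2.042
means <- c(BACK = 1.797500, HIGH = 1.277778, LOW = 2.032500, MED = 1.717778)
r     <- c(BACK = 8, HIGH = 9, LOW = 8, MED = 9)

# All pairs of groups
pairs <- t(combn(names(means), 2))
g1 <- pairs[, 1]
g2 <- pairs[, 2]

res <- data.frame(
  pair = paste(g1, g2, sep = " vs "),
  diff = abs(unname(means[g1] - means[g2])),
  # pair-specific LSD: t x sqrt(MSE x (1/n1 + 1/n2))
  lsd  = unname(tcrit * sqrt(mse * (1 / r[g1] + 1 / r[g2])))
)
res$significant <- res$diff > res$lsd
print(res)
```

Only the pairs involving HIGH with BACK and with LOW exceed their LSD, which agrees with the letters: groups sharing a letter are not significantly different.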
Exercise 4 - Mean comparisons, residual diagnostics and back-transformations
In this exercise we will add a layer of complexity by considering a transformation. If our data do not meet
the assumptions we need to transform the data; possible transformations are the square root (weak) and the
log (strong). When we transform the data we need to be careful about how we interpret the results.
Concentration of prolactin (units g/L) in the pituitary glands of nine-spined stickleback fish was assessed.
The fish were kept in either saltwater or freshwater prior to assay and different batches were examined
on three successive occasions. Cysts tend to develop in fish when kept in saltwater and sometimes develop in
freshwater populations. The four different groups of fish were used in a preliminary experiment to examine
the effects of cysts, whether induced by saltwater or normally present, on the prolactin production of the
pituitary gland.
The four groups of fish were coded as follows, with 10 fish per group:
• A = saltwater, cysts, day 1;
• B = freshwater, no cysts, day 1;
• C = freshwater, no cysts, day 2;
• D = freshwater, cysts, day 3.
The data is found in the Prolactin worksheet.
(i) Import the data into R, perform some exploratory data analysis to make tentative suggestions about
differences between means and the likelihood of the data meeting the assumptions.
library(readxl)
fish<-read_excel("Data4.xlsx",sheet="Prolactin")
fish$Treatment <- as.factor(fish$Treatment)
str(fish)
[Figure: Boxplots of prolactin concentration for treatments A, B, C and D.]
Next we generate summary statistics. The variances are very different (ratio of largest:smallest > 4:1),
therefore the assumption of constant variance is unlikely to be met. The mean and median are similar for
each treatment so the normality assumption could be met.
aggregate(Prolactin ~ Treatment, mean, data = fish)
## Treatment Prolactin
## 1 A 16.89
## 2 B 28.22
## 3 C 52.27
## 4 D 60.73
aggregate(Prolactin ~ Treatment, median, data = fish)
## Treatment Prolactin
## 1 A 14.20
## 2 B 27.35
## 3 C 49.60
## 4 D 56.20
aggregate(Prolactin ~ Treatment, sd, data = fish)
## Treatment Prolactin
## 1 A 8.629723
## 2 B 10.786700
## 3 C 26.165628
## 4 D 23.211302
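The output above reports standard deviations, while the 4:1 rule of thumb refers to variances. A minimal sketch of the conversion, using the SDs printed above (object names are illustrative):

```r
# Standard deviations copied from the aggregate() output above
sds <- c(A = 8.629723, B = 10.786700, C = 26.165628, D = 23.211302)

# Variance is the square of the standard deviation
vars <- sds^2

# Ratio of the largest to the smallest variance
var_ratio <- max(vars) / min(vars)
print(round(var_ratio, 2))
```

A variance ratio above 4 corresponds to an SD ratio above 2, so the two rules of thumb are equivalent.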
(ii) Fit an ANOVA model and test the assumption of normality using a QQ plot and a histogram - both
based on standardised residuals.
The QQ plot and the histogram indicate the residuals are approximately normally distributed.
pro.aov <- aov(Prolactin ~ Treatment, data = fish)
qqnorm(rstandard(pro.aov))
abline(0,1)
[Figure: Normal Q-Q plot of rstandard(pro.aov); x-axis: Theoretical Quantiles.]
hist(rstandard(pro.aov))
[Figure: Histogram of rstandard(pro.aov); x-axis: rstandard(pro.aov), y-axis: Frequency.]
[Figure: rstandard(pro.aov) plotted against fitted(pro.aov).]
##
## Bartlett test of homogeneity of variances
##
## data: Prolactin by Treatment
## Bartlett's K-squared = 13.651, df = 3, p-value = 0.003421
• calculating the ratio of the largest SD:smallest SD to see if it is below 2:1;
The ratio is 3.03, which is further evidence of the variances being unequal.
out<-tapply(fish$Prolactin,fish$Treatment,sd)
out
## A B C D
## 8.629723 10.786700 26.165628 23.211302
out[3]/out[1]
## C
## 3.032036
(iv) The data does not meet the assumptions so log transform (‘log’ function) the response and repeat (ii)
and (iii) to test the assumptions;
In R you can transform data in the model formula (see below), or you could create a new column in your data
frame, for example fish$logProlactin<-log(fish$Prolactin). The log transformation has not changed
the distribution dramatically; the residuals are still approximately normally distributed.
pro.aov <- aov(log(Prolactin) ~ Treatment, data = fish)
qqnorm(rstandard(pro.aov))
abline(0,1)
[Figure: Normal Q-Q plot of rstandard(pro.aov) after log transformation; x-axis: Theoretical Quantiles.]
hist(rstandard(pro.aov))
[Figure: Histogram of rstandard(pro.aov) after log transformation; x-axis: rstandard(pro.aov), y-axis: Frequency.]
All of the ways to assess the constant variance assumption indicate the variances are equal after the log
transformation.
plot(fitted(pro.aov),rstandard(pro.aov))
[Figure: rstandard(pro.aov) plotted against fitted(pro.aov) after log transformation.]
##
## Bartlett test of homogeneity of variances
##
## data: log(Prolactin) by Treatment
## Bartlett's K-squared = 1.5541, df = 3, p-value = 0.6698
out<-tapply(log(fish$Prolactin),fish$Treatment,sd)
out
## A B C D
## 0.5097462 0.3994729 0.5382265 0.3785816
out[3]/out[4]
## C
## 1.421692
(v) If the assumptions are met and there is a significant F-test, perform LSD tests and identify which pairs
are significantly different.
The ANOVA table indicates we reject the null hypothesis.
summary(pro.aov)
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The results of the LSD test are shown below.
library(agricolae)
LSD.test(pro.aov,"Treatment",console=TRUE)
##
## Study: pro.aov ~ "Treatment"
##
## LSD t Test for log(Prolactin)
##
## Mean Square Error: 0.2131079
##
## Treatment, means and individual ( 95 %) CI
##
## log.Prolactin. std r LCL UCL Min Max
## A 2.713137 0.5097462 10 2.417071 3.009202 1.722767 3.575151
## B 3.270747 0.3994729 10 2.974681 3.566812 2.653242 3.795489
## C 3.833124 0.5382265 10 3.537058 4.129190 3.054001 4.593098
## D 4.042180 0.3785816 10 3.746114 4.338245 3.446808 4.680278
##
## Alpha: 0.05 ; DF Error: 36
## Critical Value of t: 2.028094
##
## least Significant Difference: 0.4186999
##
## Treatments with the same letter are not significantly different.
##
## log(Prolactin) groups
## D 4.042180 a
## C 3.833124 a
## B 3.270747 b
## A 2.713137 c
(vi) One issue is that we have performed our hypothesis testing on the log scale. This means there are some
steps to be made if we wish to interpret the data on the original scale; e.g. provide a 95% CI on the original
scale. We will step through these.
Suppose the biologist was primarily interested in comparing the prolactin concentrations for A (saltwater
cysts, day 1) vs B (freshwater, no cysts, day 1).
• Calculate the difference in the means;
out<-tapply(log(fish$Prolactin),fish$Treatment,mean)
out
## A B C D
## 2.713137 3.270747 3.833124 4.042180
diffm<-out[2]-out[1]
diffm
## B
## 0.5576099
• Use the R output to calculate the Least Significant Difference (LSD):
$$LSD = t^{0.025}_{\text{resid df}} \times SED = t_{crit} \times \sqrt{\text{Resid MS} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}.$$
This is given in the output of the LSD.test function; LSD = 0.4187.
• Calculate the lower and upper end-points of the 95% CI around the difference in means, based on
the 95% CI being:
$$95\%\ CI = (\bar{y}_1 - \bar{y}_2) \pm t_{crit} \times \sqrt{\text{Resid MS} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}.$$
## B
## 0.13891
u95
## B
## 0.9763098
• The CI and mean are on the log scale, so backtransform (‘exp’ function) the difference in the means and
the lower and upper end-points of the 95% CI. Note that the upper and lower tails are not of equal length on
the original scale.
exp(diffm)
## B
## 1.746493
exp(l95)
## B
## 1.149021
exp(u95)
## B
## 2.654642
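The steps in (vi) can be reproduced from the numbers in the LSD.test output alone. This is a sketch under that assumption: it uses the printed log-scale means, mean square error, error df and group sizes rather than the original data, and the object names follow those used above where possible.

```r
# Values copied from the LSD.test output for log(Prolactin)
mean_logA <- 2.713137
mean_logB <- 3.270747
mse       <- 0.2131079
n1 <- 10
n2 <- 10

# Difference in means on the log scale (B minus A)
diffm <- mean_logB - mean_logA

# Least significant difference on the log scale
tcrit <- qt(0.975, df = 36)
lsd   <- tcrit * sqrt(mse * (1 / n1 + 1 / n2))

# 95% CI for the difference on the log scale
l95 <- diffm - lsd
u95 <- diffm + lsd

# Back-transform the estimate and both end-points to the original scale
ratio <- exp(c(estimate = diffm, lower = l95, upper = u95))
print(round(c(lsd = lsd, l95 = l95, u95 = u95), 4))
print(round(ratio, 3))
```

The interval is symmetric about the estimate on the log scale but not after back-transformation, which is why the two tails differ in length on the original scale.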
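(vii) We now have an estimate of the difference in the means that has been backtransformed to the original
scale. It actually corresponds to a ratio on the original scale. The reason is based on log laws: we can write
the difference between 2 logged numbers (A and B) as the log of their ratio (A/B);
$$\log(A) - \log(B) = \log\left(\frac{A}{B}\right).$$
If we backtransform the log of their ratio we get the ratio on the original scale;
$$e^{\log\left(\frac{A}{B}\right)} = \frac{A}{B}.$$
Here the backtransformed difference is 1.75, so the prolactin concentration in group B is estimated to be
about 1.75 times that in group A (95% CI 1.15 to 2.65).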
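The log law can be verified numerically. The values of `A` and `B` below are arbitrary and chosen only for illustration.

```r
# Arbitrary positive numbers for illustration
A <- 20
B <- 8

# Difference of logs versus log of the ratio
lhs <- log(A) - log(B)
rhs <- log(A / B)
print(isTRUE(all.equal(lhs, rhs)))

# Back-transforming the difference of logs recovers the ratio A/B
back <- exp(lhs)
print(back)
```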
Exercise 5 - Broiler Chickens
This exercise is an analysis of a set of growth data. It is an open question for you to gain more practice.
The effect of weight gain in dressed broiler chickens was determined after five generations of selection. Group
A was bred by using only the heaviest 10% in each generation; groups B and C were bred using respectively
the heaviest 30% and 50%; group D was obtained by crossing groups A and C of the previous generation.
The dressed weights (kg) of 25 birds from each group have been recorded.
The data is found in the Broilers worksheet.
(i) Write down the null and alternate hypothesis. What is the treatment factor, and how many levels does
it have? What are the sample sizes for each group (n_i)?
H0 : µA = µB = µC = µD
H1 : not all µi are equal
where µi (i = A, B, C, D) is the population mean weight gain for broilers in selection group i.
The treatment factor is the selection group, with t = 4 levels in this factor. There are r = 25 chicks in each
selection group (equal replication).
(ii) Import the data into R, and then obtain some numerical and graphical summaries of the data, by each
group. How would you interpret these data? From these summaries, is the assumption of homogeneity of
variances met? What about normality? Try a formal Bartlett’s test using the bartlett.test function. Use
residual diagnostics to assess the assumptions.
The summary statistics by group indicate the data is likely to be normally distributed (mean ~ median) and
the variances are equal. This is confirmed by boxplots for each group.
library(readxl)
broilers<-read_excel("Data4.xlsx",sheet="Broilers")
str(broilers)
[Figure: Boxplots of weight gain (kg) for groups A, B, C and D.]
## Group WtGain
## 1 A 0.09046546
## 2 B 0.10984838
## 3 C 0.09673848
## 4 D 0.09925892
Bartlett’s test indicates the variances are equal.
bartlett.test(WtGain ~ Group, data = broilers)
##
## Bartlett test of homogeneity of variances
##
## data: WtGain by Group
## Bartlett's K-squared = 0.93192, df = 3, p-value = 0.8177
The residual diagnostics indicate the data is normally distributed and variances are equal.
broilers.aov <- aov(WtGain ~ Group, data = broilers)
qqnorm(rstandard(broilers.aov))
abline(0,1)
[Figure: Normal Q-Q Plot of rstandard(broilers.aov); x-axis: Theoretical Quantiles, y-axis: Sample Quantiles.]
hist(rstandard(broilers.aov))
[Figure: Histogram of rstandard(broilers.aov); x-axis: rstandard(broilers.aov), y-axis: Frequency.]
plot(fitted(broilers.aov),rstandard(broilers.aov))
[Figure: rstandard(broilers.aov) plotted against fitted(broilers.aov).]
(iii) Note that the results of the analysis can only be used when the assumptions of the analysis have been
met. If you believe that the assumptions are met, then what would your conclusions of the analysis of
variance be? You should use the ‘summary’ function applied to your ‘aov’ object to obtain the ANOVA
table.
The ANOVA table indicates we reject the null hypothesis.
summary(broilers.aov)
##
## Study: broilers.aov ~ "Group"
##
## LSD t Test for WtGain
##
## Mean Square Error: 0.009865333
##
## Group, means and individual ( 95 %) CI
##
## WtGain std r LCL UCL Min Max
## A 1.5644 0.09046546 25 1.524969 1.603831 1.40 1.70
## B 1.5160 0.10984838 25 1.476569 1.555431 1.32 1.81
## C 1.4860 0.09673848 25 1.446569 1.525431 1.30 1.69
## D 1.5424 0.09925892 25 1.502969 1.581831 1.35 1.76
##
## Alpha: 0.05 ; DF Error: 96
## Critical Value of t: 1.984984
##
## least Significant Difference: 0.05576452
##
## Treatments with the same letter are not significantly different.
##
## WtGain groups
## A 1.5644 a
## D 1.5424 a
## B 1.5160 ab
## C 1.4860 b
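As a check on the output above, the LSD can be recomputed from the printed mean square error, error df and replication, and the significant pairs listed. This sketch uses only those printed values; the object names are illustrative.

```r
# Values copied from the LSD.test output for WtGain
mse   <- 0.009865333
r     <- 25                      # equal replication in every group
tcrit <- qt(0.975, df = 96)      # approx. 1.985
lsd   <- tcrit * sqrt(mse * 2 / r)

means <- c(A = 1.5644, B = 1.5160, C = 1.4860, D = 1.5424)

# Flag every pair whose absolute difference exceeds the LSD
abs_diff <- abs(outer(means, means, "-"))
sig      <- abs_diff > lsd

# List each significant pair once (upper triangle only)
idx <- which(sig & upper.tri(sig), arr.ind = TRUE)
sig_pairs <- data.frame(
  g1   = names(means)[idx[, "row"]],
  g2   = names(means)[idx[, "col"]],
  diff = abs_diff[idx]
)
print(round(lsd, 5))
print(sig_pairs)
```

Group C differs significantly from A and from D, while B does not differ significantly from any other group. This matches the letters in the output, where B carries both "a" and "b".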