Tutorials
13 JUN 2021
# analysis of variance
anova <- aov(weight ~ feed, data = chickwts)
summary(anova)
##             Df Sum Sq Mean Sq F value   Pr(>F)
## feed         5 231129   46226   15.37 5.94e-10 ***
## Residuals   65 195556    3009
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Tukey’s test
Tukey’s test for comparing means can be run on the object returned by
the analysis of variance (anova). The result (below) is an extensive
table with every pairwise comparison and its adjusted p-value. This
table can be tricky to interpret, so it is common to use letters to
indicate significant differences among the means.
# Tukey's test
tukey <- TukeyHSD(anova)
print(tukey)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = weight ~ feed, data = chickwts)
##
## $feed
##                            diff         lwr       upr     p adj
## horsebean-casein    -163.383333 -232.346876 -94.41979 0.0000000
## linseed-casein      -104.833333 -170.587491 -39.07918 0.0002100
## meatmeal-casein      -46.674242 -113.906207  20.55772 0.3324584
## soybean-casein       -77.154762 -140.517054 -13.79247 0.0083653
## sunflower-casein       5.333333  -60.420825  71.08749 0.9998902
## linseed-horsebean     58.550000  -10.413543 127.51354 0.1413329
## meatmeal-horsebean   116.709091   46.335105 187.08308 0.0001062
## soybean-horsebean     86.228571   19.541684 152.91546 0.0042167
## sunflower-horsebean  168.716667   99.753124 237.68021 0.0000000
## meatmeal-linseed      58.159091   -9.072873 125.39106 0.1276965
## soybean-linseed       27.678571  -35.683721  91.04086 0.7932853
## sunflower-linseed    110.166667   44.412509 175.92082 0.0000884
## soybean-meatmeal     -30.480519  -95.375109  34.41407 0.7391356
## sunflower-meatmeal    52.007576  -15.224388 119.23954 0.2206962
## sunflower-soybean     82.488095   19.125803 145.85039 0.0038845
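One way to make the table above easier to read is to keep only the comparisons that are significant. The sketch below assumes a 5% significance level (the threshold is my choice, not something fixed by the test) and uses the name `significant` purely for illustration.

```r
# Recreate the objects from the text so the block runs on its own
anova <- aov(weight ~ feed, data = chickwts)
tukey <- TukeyHSD(anova)

# TukeyHSD() stores one matrix per factor, with the columns
# diff, lwr, upr and "p adj"
pairs <- tukey$feed

# Keep only the comparisons with an adjusted p-value below 0.05
# (assumed threshold), sorted from smallest to largest p-value
significant <- pairs[pairs[, "p adj"] < 0.05, , drop = FALSE]
significant <- significant[order(significant[, "p adj"]), , drop = FALSE]
print(round(significant, 4))
```

Even this shorter table takes some effort to read, which is why a letter-based summary is often preferred.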
Compact letter display to indicate significant differences
The use of letters to indicate significant differences in pairwise
comparisons is called compact letter display, and can simplify the
visualisation and discussion of significant differences among means.
We are going to use the multcompLetters4 function from
the multcompView package. The arguments are the object from
an aov function and the object from the TukeyHSD function.
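The table printed next (Tk) holds, for each feed, the mean, the third quartile and the letters. The code that builds it is not shown here, so the following is a sketch of one way such a table could be produced, assuming the dplyr package for the summary step. Matching the letters to the rows by feed name is my own choice, made so that the result does not depend on row order.

```r
library(multcompView)  # multcompLetters4()
library(dplyr)         # group_by(), summarise(), arrange() (assumed helper package)

# Recreate the objects from the text so the block runs on its own
anova <- aov(weight ~ feed, data = chickwts)
tukey <- TukeyHSD(anova)

# Compact letter display: one set of letters per factor in the model
cld <- multcompLetters4(anova, tukey)

# Mean and third quartile of weight for each feed, sorted by decreasing mean
Tk <- chickwts %>%
  group_by(feed) %>%
  summarise(mean = mean(weight),
            quant = quantile(weight, probs = 0.75, names = FALSE)) %>%
  arrange(desc(mean))

# Attach the letters by feed name, so the row order does not matter
Tk$cld <- unname(cld$feed$Letters[as.character(Tk$feed)])
```

The quant column is kept because it is a convenient height at which to place the letters on a boxplot later on.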
print(Tk)
## # A tibble: 6 x 4
##   feed       mean quant cld
##   <fct>     <dbl> <dbl> <chr>
## 1 sunflower  329.  340. a
## 2 casein     324.  371. a
## 3 meatmeal   277.  320  ab
## 4 soybean    246.  270  b
## 5 linseed    219.  258. bc
## 6 horsebean  160.  176. c
Basic boxplot
We are going to use the function ggplot to build the boxplots. The first
argument is the data frame, chickwts, and the second argument is the
aesthetic mapping, aes, where we define the x and y variables, feed and
weight. However, if we run only this code, we will get a blank plot. We
also need to define the geom, in this case geom_boxplot() for the
boxplot.
# boxplot
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot()
Customising x and y titles
Let’s now customise the x and y titles using the function labs.
# boxplot with custom axis titles
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot() +
  labs(x = "Feed Type", y = "Weight (g)")
The plot could be used as it is, but there is still some space for
improvement.
# boxplot with a black-and-white theme, no grid lines and the Tukey letters
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot() +
  labs(x = "Feed Type", y = "Weight (g)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  geom_text(data = Tk, aes(x = feed, y = quant, label = cld))
As we can see, the Tukey letters are centred on the third quartile of
each box. To move them up and to the right of this point, we are going
to set the arguments hjust and vjust. I am also going to decrease the
size of the letters.
# boxplot with smaller letters moved up and to the right
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot() +
  labs(x = "Feed Type", y = "Weight (g)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  geom_text(data = Tk, aes(x = feed, y = quant, label = cld),
            size = 3, vjust = -1, hjust = -1)
# boxplot with a light blue fill, dark blue outlines and dark blue letters
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot(fill = "lightblue", color = "darkblue") +
  labs(x = "Feed Type", y = "Weight (g)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  geom_text(data = Tk, aes(x = feed, y = quant, label = cld),
            size = 3, vjust = -1, hjust = -1, color = "darkblue")
# boxplot with a different fill colour for each feed (this adds a legend)
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot(aes(fill = feed)) +
  labs(x = "Feed Type", y = "Weight (g)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  geom_text(data = Tk, aes(x = feed, y = quant, label = cld),
            size = 3, vjust = -1, hjust = -1)
As the legend is not necessary in this case, we can hide it using the
code show.legend = FALSE in the geom_boxplot() arguments, and change
the colours to a more interesting palette using
the scale_fill_brewer() function. I have chosen the qualitative palette
“Pastel1” for this example.
# boxplot without legend, using the "Pastel1" palette
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot(aes(fill = feed), show.legend = FALSE) +
  labs(x = "Feed Type", y = "Weight (g)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  geom_text(data = Tk, aes(x = feed, y = quant, label = cld),
            size = 3, vjust = -1, hjust = -1) +
  scale_fill_brewer(palette = "Pastel1")
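Before settling on a palette, it can help to check that it has enough colours for the number of groups. The sketch below assumes the RColorBrewer package is installed (the Brewer scales in ggplot2 draw on the same palettes) and looks up "Pastel1" in its palette table.

```r
library(RColorBrewer)  # assumed to be installed; provides brewer.pal.info

# brewer.pal.info lists, for each palette, the maximum number of
# colours and the category (qualitative, sequential or diverging)
info <- brewer.pal.info["Pastel1", ]
print(info)

# The palette needs at least as many colours as there are feed types
enough_colours <- nlevels(chickwts$feed) <= info$maxcolors
print(enough_colours)
```

Running display.brewer.all() from the same package shows every palette side by side, which is a quick way to compare alternatives.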
# boxplot filled according to the median of each box, using the "Blues" palette
# after_stat(middle) replaces the deprecated ..middle.. notation (ggplot2 >= 3.3.0)
ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot(aes(fill = factor(after_stat(middle))), show.legend = FALSE) +
  labs(x = "Feed Type", y = "Weight (g)") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank()) +
  geom_text(data = Tk, aes(x = feed, y = quant, label = cld),
            size = 3, vjust = -1, hjust = -1) +
  scale_fill_brewer(palette = "Blues")
In the resulting plot, the higher the median, the darker the colour hue in
the boxplot.
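The fill in the last plot is driven by a statistic that geom_boxplot() computes for each box, called middle. As a rough check, the sketch below pulls the computed statistics out of a plain boxplot with layer_data() and lists the medians. The names `box_stats` and `medians` are only illustrative.

```r
library(ggplot2)

p <- ggplot(chickwts, aes(feed, weight)) +
  geom_boxplot()

# layer_data() returns the statistics computed for the first layer;
# for a boxplot, the "middle" column holds the median of each box
box_stats <- layer_data(p)

# Label each median with its feed, using the x position of the box
feeds <- levels(chickwts$feed)[as.integer(box_stats$x)]
medians <- setNames(box_stats$middle, feeds)
print(sort(medians))
```

Sorting the medians gives the order in which the shades of the sequential palette are assigned, from lightest to darkest.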