Chapter 6 ANOVA (Analysis of Variance)

Uploaded by

sayihmehari74

Types of t-test:

– One-sample t-test: used to compare a single mean to a fixed number or "gold standard".
– Paired t-test: used to compare two means based on samples that are paired in some way.
– Two-sample t-test: used to compare two population means based on independent samples from the two populations or groups.
– A t-distribution can be used for testing
hypotheses about differences of means for
independent samples if both populations are
normal and have the same variances.
– However, the usual two-sample t-test cannot be
applied when more complex sets of data
comprising more than two groups are considered.
In this regard, one-way analysis of variance
(ANOVA) is used to compare the means of several
groups.
 Comparison of several means - Analysis of variance
– One-way analysis of variance is used when there is a single way of classifying individuals, that is, when the subgroups to be compared are defined by just one factor.
For example, suppose you are interested in comparing the blood pressure levels of three groups of patients who take three different treatments. Only one grouping (the type of treatment administered) is used to define the groups.
– When there are two factors classifying the observations we need two-way analysis of variance, and so on.
– One-way analysis of variance is based on assessing
how much of the overall variation in the data is
attributable to differences between the group
means, and comparing this with the amount
attributable to differences between individuals in
the same group.

– The calculations for one-way ANOVA are expressed in relation to the sum of the observations in each sample.
– Suppose we have k samples of observations, with $n_i$ observations in the ith sample. Writing $x_{ij}$ for the jth observation in the ith sample, we calculate:
– $\bar{x}_i$ = mean of the observations in the ith group
– $T$ = sum of all observations = $\sum_{i=1}^{k} n_i \bar{x}_i$
– $S$ = sum of squares of all observations = $\sum_{i=1}^{k} \sum_{j=1}^{n_i} x_{ij}^2$
– $N$ = total number of observations = $\sum_{i=1}^{k} n_i$
 One way ANOVA partitions the total sum of squares
(SST) into two distinct components.
– The sum of squares due to differences between the group means (SSB).
– The sum of squares due to differences between the observations within each group (SSW). This is also called the residual or unexplained sum of squares.
SST = SSB + SSW
– SST = Total sum of squared deviations of each
observation about grand mean
– SSB = Total sum of squared deviations of group
means about grand mean
– SSW = Total sum of squared deviations of each
observation about group mean
• The sums of squares for one-way ANOVA are given as follows ($\bar{x}_i$ is the mean of the ith group; T, S and N are as defined above):
Source of variation              Sum of squares
Between groups (explained)       SSB = $\sum_{i=1}^{k} n_i \bar{x}_i^2 - T^2/N$
Within groups (unexplained)      SSW = $S - \sum_{i=1}^{k} n_i \bar{x}_i^2$
Total                            SST = $S - T^2/N$ (= SSB + SSW)
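As a minimal sketch (not part of the original slides), the shortcut formulas based on T, S and N can be checked in Python. The function name sums_of_squares and the toy data are assumptions made for illustration only. The code takes SSB as the sum of n_i times the squared group mean minus T²/N, SSW as S minus that same sum, and SST as S minus T²/N.

```python
def sums_of_squares(groups):
    """Return (SSB, SSW, SST) for a list of groups, using T, S and N."""
    N = sum(len(g) for g in groups)              # total number of observations
    T = sum(sum(g) for g in groups)              # sum of all observations
    S = sum(x * x for g in groups for x in g)    # sum of squares of all observations
    # sum over groups of n_i * mean_i^2, written as (group total)^2 / n_i
    group_term = sum(sum(g) ** 2 / len(g) for g in groups)
    ssb = group_term - T ** 2 / N
    ssw = S - group_term
    sst = S - T ** 2 / N
    return ssb, ssw, sst


# Hypothetical toy data: three groups of three observations each
toy_groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 6.0, 7.0]]
ssb, ssw, sst = sums_of_squares(toy_groups)
print(ssb, ssw, sst)  # 24.0 12.0 36.0
```

For the toy data the group means are 2, 4 and 6 and the grand mean is 4, so the direct definitions give the same values and SST = SSB + SSW holds.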
– The significance test for differences between the
groups is based on a comparison of the between
groups and within groups mean squares.

– If the observed differences between the means of the groups are simply due to chance variation, the variation between these group means will be about the same as the variation between individuals in the same group.
• If there are real differences, the between-groups variation will be larger. The mean squares are compared using the F-test. This test is sometimes known as the variance-ratio test.
F = (Between-groups mean square) / (Within-groups mean square)

Df Between-groups = k-1
Df within-groups = N-k
where:
N is the total number of observations and
k is the number of groups.
A one-way ANOVA table looks like the following:
Source of variation   DF     SS     Mean square    F                            P
Between groups        k-1    SSB    SSB/(k-1)      [SSB/(k-1)] / [SSW/(N-k)]
Within groups         N-k    SSW    SSW/(N-k)
Total                 N-1    SST
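The layout of the table can be sketched as a small Python helper that fills in the degrees of freedom, mean squares and F ratio from the two sums of squares. The function name anova_table and the input values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def anova_table(ssb, ssw, k, n_total):
    """Build the rows of a one-way ANOVA table (illustrative helper)."""
    df_between = k - 1
    df_within = n_total - k
    ms_between = ssb / df_between    # between-groups mean square
    ms_within = ssw / df_within      # within-groups mean square
    return {
        "between": {"df": df_between, "ss": ssb, "ms": ms_between},
        "within": {"df": df_within, "ss": ssw, "ms": ms_within},
        "total": {"df": n_total - 1, "ss": ssb + ssw},
        "F": ms_between / ms_within,
    }


# Hypothetical inputs: SSB = 24 and SSW = 12 from k = 3 groups, N = 9 observations
table = anova_table(24.0, 12.0, k=3, n_total=9)
print(table["F"])  # 6.0
```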


• Assumptions
– The data in each group come from a Normally distributed population.
– The population standard deviation between individuals is the same for each group (equal variance).
– Moderate departures from Normality and moderately unequal standard deviations may be safely ignored. If the departures are more serious, transforming the data may be useful.
• Example 1
Twenty-two patients undergoing cardiac bypass surgery were
randomized to one of three ventilation groups:
Group I: Patients received a 50% nitrous oxide and 50% oxygen mixture continuously for 24 hours;
Group II: Patients received a 50% nitrous oxide and 50% oxygen mixture only during the operation;
Group III: Patients received no nitrous oxide but received 35-50% oxygen for 24 hours.
– The table below shows red cell folate levels for the
three groups after 24 hours' ventilation. We wish to
compare the three groups, and test the null
hypothesis that the three groups have the same red
cell folate levels.

– Examination of the data does not reveal any obvious outliers, and the data in each group look like plausible samples from a Normal distribution. The standard deviation in group I is rather higher than those in the other groups, but moderate variability of this kind is not a problem.
– Levene's test is useful for assessing the null hypothesis that several samples come from populations with the same variance. Some computer programs incorporate this test.
Example 1: Red cell folate levels (μg/l) in three groups of cardiac bypass patients given different levels of nitrous oxide ventilation (Amess et al., 1978)
        Group I    Group II   Group III
        (n=8)      (n=9)      (n=5)
        243        206        241
        251        210        258
        275        226        270
        291        249        293
        347        255        328
        354        273
        380        285
        392        295
                   309
Mean    316.6      256.4      278.0
SD      58.7       37.1       33.8
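The summary rows of the table can be reproduced with Python's standard statistics module. This is only a sketch: the dictionary name folate and the group labels are assumptions, while the values are the red cell folate levels listed in the table.

```python
from statistics import mean, stdev

# Red cell folate levels (μg/l) for the three ventilation groups
folate = {
    "I": [243, 251, 275, 291, 347, 354, 380, 392],
    "II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "III": [241, 258, 270, 293, 328],
}

# n, mean and sample standard deviation (n - 1 denominator) per group
summary = {name: (len(v), mean(v), stdev(v)) for name, v in folate.items()}

for name, (n, m, sd) in summary.items():
    print(f"Group {name}: n={n}, mean={m:.1f}, SD={sd:.1f}")
```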
• Hypotheses
– H0: μ1 = μ2 = μ3, i.e. the population means of the three groups are equal.
– HA: Differences exist between at least some of the group means.
• ANOVA table
ANOVA: Red cell folate levels (µg/l)
                                            Sum of Squares   df   Mean Square   F       Sig.
Between Groups (explained by the model)     15515.766        2    7757.883      3.711   .044
Within Groups (unexplained variation)       39716.097        19   2090.321
Total                                       55231.864        21
– Since the P value (0.044) is less than 0.05, the null hypothesis is rejected. This is a global test: it shows that the group means are not all equal, but not which groups differ.
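The entries of this ANOVA table can be recomputed from the raw data of Example 1. The sketch below uses only the Python standard library. One assumption to flag loudly: the P value is obtained from a closed form for the upper tail of the F distribution that is valid only when the numerator has 2 degrees of freedom (three groups), so this is not a general-purpose routine. Variable names are illustrative.

```python
folate = {
    "I": [243, 251, 275, 291, 347, 354, 380, 392],
    "II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "III": [241, 258, 270, 293, 328],
}

groups = list(folate.values())
k = len(groups)                                   # number of groups
n_total = sum(len(g) for g in groups)             # N
grand_mean = sum(sum(g) for g in groups) / n_total

# Between-groups and within-groups sums of squares
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

df_between, df_within = k - 1, n_total - k
msb, msw = ssb / df_between, ssw / df_within
f_stat = msb / msw

# Upper-tail probability of F(2, df_within); closed form valid ONLY for df_between == 2
assert df_between == 2
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(f"SSB={ssb:.3f} SSW={ssw:.3f} F={f_stat:.3f} P={p_value:.3f}")
```

For other numbers of groups, a statistics package would normally supply the F distribution instead of the closed form used here.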
• Pair-wise comparisons of group means
– One-way ANOVA is an extension of the two-sample t test. When there are only two groups, the F value, which has (1, N-2) degrees of freedom, is the square of the corresponding t value. Remember that the two-sample t test has N-2 degrees of freedom.
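The relationship F = t² can be illustrated numerically. As a sketch, the code below takes groups I and II from Example 1, computes the pooled two-sample t statistic and the one-way ANOVA F statistic for the same two groups, and compares them. The helper names are assumptions for illustration.

```python
import math

group_1 = [243, 251, 275, 291, 347, 354, 380, 392]        # Group I
group_2 = [206, 210, 226, 249, 255, 273, 285, 295, 309]   # Group II


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations about the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


n1, n2 = len(group_1), len(group_2)
df_within = n1 + n2 - 2                                    # N - 2
pooled_var = (sum_sq_dev(group_1) + sum_sq_dev(group_2)) / df_within

# Two-sample t statistic with pooled variance
t_stat = (mean(group_1) - mean(group_2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA F statistic for the same two groups (k = 2, so df between = 1)
grand_mean = mean(group_1 + group_2)
ssb = n1 * (mean(group_1) - grand_mean) ** 2 + n2 * (mean(group_2) - grand_mean) ** 2
f_stat = (ssb / 1) / pooled_var

print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```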
– With two groups the interpretation of a significant
difference is reasonably straightforward, but how
do we interpret significant variation among the
means of three or more groups?
– Further analysis is required to find out how the
means differ, for example, whether one group
differs from all the others.
– It should be noted that pair-wise comparisons are carried out only when the overall comparison of groups in the analysis of variance is significant. This is called Post Hoc multiple comparison.
– With k groups, there are ½k(k-1) possible pair-wise comparisons of group means.
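The count ½k(k-1) can be checked by listing the pairs explicitly. A minimal sketch using itertools.combinations from the standard library follows; the function name pairwise_comparisons is hypothetical.

```python
from itertools import combinations


def pairwise_comparisons(labels):
    """Return every unordered pair of group labels."""
    return list(combinations(labels, 2))


pairs = pairwise_comparisons(["I", "II", "III"])
print(pairs)  # [('I', 'II'), ('I', 'III'), ('II', 'III')]
```

With the three groups of Example 1 this gives the three comparisons reported in the post hoc tables.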
The Post Hoc tests are divided into two sets:
– The first set assumes groups with equal variances.
– The second set does not assume that the variances are equal.
Do post hoc test for example 1 above using:
– Bonferroni method
– Scheffe method
Post hoc test result (Bonferroni)
Multiple Comparisons
Dependent Variable: Red cell folate levels (µg/l)

(I) group   (J) group   Mean Difference (I-J)   Std. Error   Sig.    95% CI Lower Bound   95% CI Upper Bound
1.00        2.00         60.18056*              22.21594     .042       1.8614             118.4998
1.00        3.00         38.62500               26.06443     .464     -29.7969             107.0469
2.00        1.00        -60.18056*              22.21594     .042    -118.4998              -1.8614
2.00        3.00        -21.55556               25.50141     1.000    -88.4995              45.3884
3.00        1.00        -38.62500               26.06443     .464    -107.0469              29.7969
3.00        2.00         21.55556               25.50141     1.000    -45.3884              88.4995
*. The mean difference is significant at the .05 level.
A. Groups I and II
– P value = 0.042
– 95% CI = (1.86, 118.50)
B. Groups I and III
– P value = 0.464
– 95% CI = (-29.80, 107.05)
C. Groups II and III
– P value = 1.000
– 95% CI = (-88.50, 45.39)
Only the comparison of groups I and II is significant at the 5% level, so the difference identified in the analysis of variance is mainly explained by the difference between groups I and II.
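The Bonferroni figures can be approximated from the raw data of Example 1. The sketch below rests on assumptions that the slides do not state: each pair is compared with a t statistic based on the within-groups mean square (19 degrees of freedom), the two-sided P value is multiplied by the number of comparisons and capped at 1, and the interval uses the t critical value for 0.05 divided by the number of comparisons. The t distribution is evaluated numerically (Simpson's rule plus bisection) to stay within the standard library; all function names are illustrative.

```python
import math
from itertools import combinations

folate = {
    "I": [243, 251, 275, 291, 347, 354, 380, 392],
    "II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "III": [241, 258, 270, 293, 328],
}


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P value of Student's t, by Simpson's rule on the density."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps                       # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1 - 2 * area * h / 3)   # 1 - P(-|t| < T < |t|)


def t_critical(alpha, df):
    """Value c whose two-sided P value equals alpha, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_two_sided_p(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


means = {g: sum(v) / len(v) for g, v in folate.items()}
n_total = sum(len(v) for v in folate.values())
df_within = n_total - len(folate)
ms_within = sum((x - means[g]) ** 2 for g, v in folate.items() for x in v) / df_within

pairs = list(combinations(folate, 2))
crit = t_critical(0.05 / len(pairs), df_within)   # Bonferroni-adjusted critical value

bonferroni = {}
for a, b in pairs:
    diff = means[a] - means[b]
    se = math.sqrt(ms_within * (1 / len(folate[a]) + 1 / len(folate[b])))
    p_adj = min(1.0, len(pairs) * t_two_sided_p(diff / se, df_within))
    bonferroni[(a, b)] = (diff, se, p_adj, diff - crit * se, diff + crit * se)
    print(f"{a} vs {b}: diff={diff:.3f} SE={se:.3f} P={p_adj:.3f} "
          f"CI=({diff - crit * se:.2f}, {diff + crit * se:.2f})")
```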
Post hoc test result (Scheffe)
Multiple Comparisons
Dependent Variable: Red cell folate levels (µg/l)

(I) group   (J) group   Mean Difference (I-J)   Std. Error   Sig.    95% CI Lower Bound   95% CI Upper Bound
1.00        2.00         60.18056*              22.21594     .045       1.2192             119.1420
1.00        3.00         38.62500               26.06443     .354     -30.5503             107.8003
2.00        1.00        -60.18056*              22.21594     .045    -119.1420              -1.2192
2.00        3.00        -21.55556               25.50141     .704     -89.2366              46.1255
3.00        1.00        -38.62500               26.06443     .354    -107.8003              30.5503
3.00        2.00         21.55556               25.50141     .704     -46.1255              89.2366
*. The mean difference is significant at the .05 level.
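The Scheffe figures can be approximated in a similar way. This sketch assumes the usual form of the Scheffe procedure for pair-wise contrasts: the squared t statistic divided by k-1 is referred to an F distribution with (k-1, N-k) degrees of freedom, and the interval half-width is the standard error times the square root of (k-1) times the 5% critical value of F. Because Example 1 has three groups, k-1 = 2, and the code relies on closed forms for the F distribution that hold only for 2 numerator degrees of freedom. Function names are illustrative.

```python
import math
from itertools import combinations

folate = {
    "I": [243, 251, 275, 291, 347, 354, 380, 392],
    "II": [206, 210, 226, 249, 255, 273, 285, 295, 309],
    "III": [241, 258, 270, 293, 328],
}


def f_upper_tail_df1_2(f, df2):
    """P(F > f) for F with (2, df2) degrees of freedom (closed form, df1 = 2 only)."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


def f_critical_df1_2(alpha, df2):
    """Value f with P(F > f) = alpha for F with (2, df2) degrees of freedom."""
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)


k = len(folate)
assert k - 1 == 2   # the closed forms above require exactly three groups

means = {g: sum(v) / len(v) for g, v in folate.items()}
n_total = sum(len(v) for v in folate.values())
df_within = n_total - k
ms_within = sum((x - means[g]) ** 2 for g, v in folate.items() for x in v) / df_within

# Scheffe multiplier for 95% simultaneous intervals
multiplier = math.sqrt((k - 1) * f_critical_df1_2(0.05, df_within))

scheffe = {}
for a, b in combinations(folate, 2):
    diff = means[a] - means[b]
    se = math.sqrt(ms_within * (1 / len(folate[a]) + 1 / len(folate[b])))
    p = f_upper_tail_df1_2((diff / se) ** 2 / (k - 1), df_within)
    scheffe[(a, b)] = (diff, se, p, diff - multiplier * se, diff + multiplier * se)
    print(f"{a} vs {b}: diff={diff:.3f} SE={se:.3f} P={p:.3f} "
          f"CI=({diff - multiplier * se:.2f}, {diff + multiplier * se:.2f})")
```

The Scheffe intervals come out slightly wider than the Bonferroni ones for these data, which agrees with the two sets of bounds in the SPSS output.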
