AG213 UNIT 4 QM Revision PDF
Learning Outcomes
This unit will focus on comparing two independent and dependent/related groups.
At the end of this unit, you should be able to:
Explain how comparing two groups at the sample level can be applied to
determine the difference between the same groups at the population level.
Define the Null Hypothesis (Ho) and Alternative Hypothesis (HA) in general
terms then formulate a hypothesis about difference between two groups for a
specific example.
Test a hypothesis about difference between two independent groups using
t-test.
Calculate acceptance region and confidence limits for difference between
independent group means.
Carry out 2-sample t-test for appropriate data.
Test a hypothesis about difference between two related groups using t-test.
Calculate acceptance region and confidence limits for difference between
related group means.
Carry out paired t-test for appropriate data.
Differentiate between one tailed and two tailed tests.
Explain the terms presented in this unit.
4.1 Introduction
In this unit we will extend our study of variation between individuals and groups to a
study of methods for comparing groups. To do this we will look at how differences
between group means can vary. Then we will draw conclusions about the likelihood
that two groups have been sampled from one population, or, alternatively, come from
different populations.
relatively longer tails and a lower peak. Another important point is that the t-
distribution changes shape with changes in the number of individual items included
in the sample.
You also saw in Unit 3 that one could estimate the expected standard deviation of
sample means (called the Standard Error of the Mean) from the standard deviation of
individual items in one sample and the number of items measured. With this, and
tables of the t-distribution, confidence intervals could be calculated within which we
would expect to find the population mean. These procedures give us useful
information about a sample and the population it came from. However, we will often
want to do more than that – we will want to compare different groups and draw
conclusions about possible differences between the populations they came from.
Such comparisons are the basis of agricultural experiments and observations that we
often use to help us make decisions.
In experimental situations we might want to decide, for example, whether:
An insecticide will reduce blemishes on fruit.
Fertiliser will increase plant growth.
An amino acid supplement will make up for a deficiency in pig diets.
Local materials can replace imported feeds in chicken rations.
For each of these examples we would impose different treatments on groups of
subjects that were initially the same and then compare the groups after the
experimental treatment.
In an observational situation we might want to decide whether:
A new hybrid variety of coconuts produces more than a local variety.
The meat from female animals contains more fat than meat from males.
Taro Niue tastes better than Alafua Sunrise.
Brahman cattle are better adapted to tropical conditions than Herefords.
Here we start with subjects with different characteristics and compare the two groups
for other features that interest us. No experiments or treatments are involved.
In all of the above cases we will still be making comparisons between samples and
drawing conclusions about possible differences between the populations that they
came from or the populations that had been given different treatments. In this unit we
will study the methods for making comparisons by considering the simplest situation
of examining two groups. For example, in an experiment there might be one treated
group and one group not treated (called a control group). For observations, we may
compare one group of females with another group of males.
Figure 4.2: Illustration of two samples taken from the same population where the samples differ by
chance.
As we only have the samples and not the populations, we do not really know which
of these two possible situations is true. We therefore have to make a hypothesis (a
theory) and we start by assuming that situation (B) is true, that is, there is no
difference between the populations. This is called the null hypothesis (Ho). Situation
(A) is then called our alternative hypothesis (HA), that there is a real difference
between the populations.
In statistical analyses we always start with the null hypothesis (assume no difference
between populations) and we then calculate the probability that our observed
difference between samples could happen just by chance. If this probability is too
small to be reasonable, we reject the null hypothesis and accept the alternative
hypothesis. In other words, we say that it is not likely the two samples could have
come from the same population and we assume that the populations they came from
are really different.
Before proceeding, it is important to note that the formulae for the standard deviation
and the variance of a sample you learnt in Unit 3 are known in statistics as definition
formulae. In other words, the definition formulae for the standard deviation and
variance of a sample can be used for calculations as well as for describing in words
what they mean. However, when you are handling a large set of data from a sample,
their applications to calculations can be quite laborious, even with a scientific
calculator. Alternative formulae for the standard deviation and variance of a sample
exist and they are normally referred to in statistics as calculation formulae. The
calculations involved are undemanding especially when a scientific calculator is
used.
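The calculation formula for the standard deviation of a sample is:
\[ s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} \]
and that for the variance of a sample is:
\[ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{SS}{df} \]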
The Σx refers to x₁ + x₂ + x₃ + … + xₙ, and Σx² refers to x₁² + x₂² + x₃² + … + xₙ². In
this unit, we will use these calculation formulae to calculate these two sample
statistics.
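As a rough illustration of why the two kinds of formulae are interchangeable, the following Python sketch computes the variance of a small sample both ways. The data values and the function names are made up for this illustration and do not come from the unit.

```python
import math


def variance_definition(values):
    """Definition formula: sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return sum_of_squares / (n - 1)


def variance_calculation(values):
    """Calculation formula: (sum of x squared - (sum of x) squared / n), divided by n - 1."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    sum_of_squares = sum_x2 - sum_x ** 2 / n
    return sum_of_squares / (n - 1)


# Hypothetical sample, chosen only to keep the arithmetic easy to follow.
sample = [4, 7, 9, 10, 15]

var_def = variance_definition(sample)
var_calc = variance_calculation(sample)
sd_calc = math.sqrt(var_calc)

print(f"variance (definition formula):  {var_def:.3f}")
print(f"variance (calculation formula): {var_calc:.3f}")
print(f"standard deviation:             {sd_calc:.3f}")
```

Both functions should return the same variance. The calculation formula only needs the running totals Σx and Σx², which is why it is convenient on a calculator.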
Activity 4.1
It is important that you have a clear understanding of the terms we have just been
introducing you to. Read through this first section of Unit 4 again and then write
down in your own words definitions for the following terms:
Hypothesis
Null hypothesis
Alternative hypothesis
Definition formulae
Calculation formulae
Variety 1 Variety 2
890 957
522 909
849 1081
885 1134
990 1133
1018 933
632 974
697 849
Before we deal with the methods for calculating the t-test we need to extend slightly
our mathematical notation so that we can represent statistics for the different groups.
We do this by writing a numerical subscript after the symbol for the statistic, as
shown on the next page.
Using the methods you learnt in Unit 3 and the calculation formulae introduced
earlier in this unit, you should now be able to calculate these sample statistics for the
two groups and verify the following:
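For variety 1:
n1 = 8, df1 = 7
\[ \bar{x}_1 = \frac{\sum x}{n_1} = \frac{6483}{8} = 810.38\ \text{g} \]
\[ s_1 = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n_1}}{n_1 - 1}} = \sqrt{\frac{5470267 - \frac{(6483)^2}{8}}{7}} = 175.908\ \text{g} \]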
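For variety 2:
n2 = 8, df2 = 7
\[ \bar{x}_2 = \frac{\sum x}{n_2} = \frac{7970}{8} = 996.25\ \text{g} \]
\[ s_2 = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n_2}}{n_2 - 1}} = \sqrt{\frac{8020302 - \frac{(7970)^2}{8}}{7}} = 107.031\ \text{g} \]
Now with these basic statistics for each group we can carry out the t-test using the
procedure given on the next page.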
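The basic statistics for the two varieties can be checked with a short Python sketch that applies the calculation formula to the two columns of corm weights listed earlier. The function name `basic_statistics` and the dictionary keys are illustrative choices only.

```python
import math

# Corm weights (g) for the two taro varieties, as listed in the data table above.
variety_1 = [890, 522, 849, 885, 990, 1018, 632, 697]
variety_2 = [957, 909, 1081, 1134, 1133, 933, 974, 849]


def basic_statistics(values):
    """Return n, df, the sums, the mean, SS and the standard deviation of a sample."""
    n = len(values)
    df = n - 1
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    ss = sum_x2 - sum_x ** 2 / n  # sum of squares from the calculation formula
    return {
        "n": n,
        "df": df,
        "sum_x": sum_x,
        "sum_x2": sum_x2,
        "mean": sum_x / n,
        "ss": ss,
        "sd": math.sqrt(ss / df),
    }


stats_1 = basic_statistics(variety_1)
stats_2 = basic_statistics(variety_2)

for label, stats in (("Variety 1", stats_1), ("Variety 2", stats_2)):
    print(f"{label}: n = {stats['n']}, df = {stats['df']}, "
          f"mean = {stats['mean']:.2f} g, sd = {stats['sd']:.3f} g")
```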
Step 1 Hypotheses
The null hypothesis is that there is no real difference in corm weight between the two
taro varieties that the sample groups were taken from, or simply:
HO : µ1 = µ2
The alternative hypothesis is that there is indeed a real difference in corm weight
between the varieties that the sample groups were taken from, or simply:
HA : µ1 ≠ µ2
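Step 2 Pooled variance
Calculate a weighted average of the two variances. This is called the pooled
variance, with the symbol s²pooled.
\[ s^2_{pooled} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{SS_1 + SS_2}{df_1 + df_2} \]
\[ = \frac{(7)(175.908)^2 + (7)(107.031)^2}{8 + 8 - 2} = 21199.88\ \text{g}^2 \]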
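Step 3 Standard error of the difference
Calculate the standard error for the difference between the group means.
\[ sed = s_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{2 s^2_{pooled}}{n}} \]
\[ sed = \sqrt{\frac{2(21199.88)}{8}} = 72.80\ \text{g} \]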
Step 4 t-value
Look up the t-table (Table 3.4) in the 95% column for the degrees of freedom of the
difference which is simply df1 + df2 = 7 + 7 = 14.
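\[ t_{0.05,14} = 2.145 \]
Step 5 Acceptance region
If the null hypothesis is true, the difference between the sample means is expected to lie
within zero plus or minus the tabulated t-value multiplied by the standard error of the difference.
\[ 0 \pm (t_{0.05,14})(sed) = \pm (2.145)(72.80) = \pm 156.16\ \text{g} \]
Step 6 Observed difference
\[ (\bar{x}_1 - \bar{x}_2) = (810.38 - 996.25) = -185.87\ \text{g} \]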
Step 7 Conclusion
Compare the observed difference in step 6 with the acceptance region calculated in
step 5.
In this case the observed difference (-186 g) is outside the acceptance region (±156
g), so we reject the null hypothesis and instead accept the alternative hypothesis
(there is a real difference between the populations – the varieties, i.e., µ1 ≠ µ2). This
is illustrated in Figure 4.3. Notice that the difference and acceptance limits displayed
in the figure have been rounded off to the nearest gram. At least two decimal places
were maintained in the calculations to avoid rounding off errors (see Unit 2), but it
would be unrealistic to imply such an accuracy of measurement when presenting the
results of an investigation in a report.
Figure 4.3: t-distribution of differences between two means (df = 14) showing limits of the
acceptance region for the example data in Table 4.1.
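Steps 2 to 7 above can be sketched in Python as follows. This is a minimal sketch that assumes equal group sizes, as in the example. The tabulated t-value is typed in by hand from Step 4 rather than computed, and the helper names are illustrative. Recomputing from the raw data gives a pooled variance within about 0.3 g² of the hand-calculated figure, and the standard error agrees to two decimal places.

```python
import math

# Corm weights (g) for the two taro varieties from the worked example.
variety_1 = [890, 522, 849, 885, 990, 1018, 632, 697]
variety_2 = [957, 909, 1081, 1134, 1133, 933, 974, 849]

T_TABLE = 2.145  # two-tailed 5% t-value for 14 df, copied from Step 4


def mean(values):
    return sum(values) / len(values)


def sum_of_squares(values):
    """SS from the calculation formula: sum of x squared - (sum of x) squared / n."""
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


n1, n2 = len(variety_1), len(variety_2)
df1, df2 = n1 - 1, n2 - 1

# Step 2: pooled variance (weighted average of the two variances).
pooled_variance = (sum_of_squares(variety_1) + sum_of_squares(variety_2)) / (df1 + df2)

# Step 3: standard error of the difference (equal group sizes, n1 == n2).
sed = math.sqrt(2 * pooled_variance / n1)

# Step 5: half-width of the acceptance region around zero.
acceptance_limit = T_TABLE * sed

# Step 6: observed difference between the sample means.
observed_difference = mean(variety_1) - mean(variety_2)

# Step 7: reject the null hypothesis if the difference is outside the region.
reject_null = abs(observed_difference) > acceptance_limit

print(f"pooled variance     = {pooled_variance:.2f} g^2")
print(f"sed                 = {sed:.2f} g")
print(f"acceptance region   = +/- {acceptance_limit:.2f} g")
print(f"observed difference = {observed_difference:.2f} g")
print("reject null hypothesis" if reject_null else "retain null hypothesis")
```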
\[ \text{Confidence limits for difference} = (\bar{x}_1 - \bar{x}_2) \pm (t_{0.05,14})(s_{(\bar{x}_1 - \bar{x}_2)}) \]
\[ = -185.87 \pm (2.145)(72.80) \]
\[ = -185.87 \pm 156.16\ \text{g} \]
We can add to our conclusion, therefore, and say that variety 1 has lower corm
weights than variety 2, with a 95% probability that the true difference between the
populations that the samples came from lies between the following limits:
-342.03 g and -29.71 g
Note that this method for calculating the t-test is slightly different from that used in
most textbooks. We have found that this one makes it easier for students to
understand why a null hypothesis is accepted or rejected. The more conventional
method used in most textbooks is outlined below.
Figure 4.4: Acceptance limits and the observed difference between means when the data illustrated
in Figure 4.3 are expressed in t units (i.e., divided by the standard error of the difference).
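Expressing the difference in t units means dividing the observed difference by its standard error, then comparing the resulting t-value directly with the tabulated t-value. A minimal sketch of that comparison, using the rounded figures from the worked example (the variable names are illustrative):

```python
# Rounded figures taken from the worked taro example.
observed_difference = -185.87  # g, difference between the two sample means
sed = 72.80                    # g, standard error of the difference
t_table = 2.145                # two-tailed 5% t-value for 14 df

# Observed difference expressed in t units.
t_observed = observed_difference / sed

# In t units the acceptance region is simply -t_table to +t_table.
reject_null = abs(t_observed) > t_table

# 95% confidence limits for the difference, back in grams.
lower_limit = observed_difference - t_table * sed
upper_limit = observed_difference + t_table * sed

print(f"observed t = {t_observed:.3f}, acceptance region = +/- {t_table}")
print("reject null hypothesis" if reject_null else "retain null hypothesis")
print(f"95% confidence limits: {lower_limit:.2f} g to {upper_limit:.2f} g")
```

Both approaches lead to the same decision, because dividing the difference and the acceptance limits by the same standard error does not change which side of the limit the difference falls on.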
Now, to practise the t-test method, please revise the steps and then complete Activity
4.2.
Activity 4.2
Observations were made of the annual nut production from 10 trees each of two
cultivars of dwarf coconuts. The numbers of nuts counted are given in Table 4.2.
Compare the production of the two varieties using a 2 sample t-test after stating an
appropriate null hypothesis. Make a conclusion from the test about the productivity
of the populations that the sample trees were taken from, and calculate the 95%
confidence limits of the difference between the populations.
Table 4.2: Annual nut production of two varieties of dwarf coconuts
Cultivar 1 Cultivar 2
162 134
115 58
176 67
200 108
217 89
181 52
191 190
168 93
149 127
149 130
In your answer:
State the null and the alternative hypotheses;
Show all step by step working/calculations; and,
Draw conclusions about the two populations (cultivars) that the samples were
taken from. That is, comment about the differences in the productivity of the
two dwarf cultivars of coconuts.
\[ sed = s_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s^2_{pooled}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
All other steps are calculated as outlined above or with the conventional method.
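The sketch below illustrates how the standard error formula for unequal group sizes behaves. It reuses the pooled variance from the taro example purely as a convenient number. The group sizes of 6 and 10 are hypothetical, and the function name is an illustrative choice.

```python
import math


def standard_error_of_difference(s2_pooled, n1, n2):
    """Standard error of the difference between two means for any group sizes."""
    return math.sqrt(s2_pooled * (1 / n1 + 1 / n2))


s2_pooled = 21199.88  # g^2, pooled variance from the taro example

# With equal group sizes the general formula reduces to sqrt(2 * s2_pooled / n).
sed_equal = standard_error_of_difference(s2_pooled, 8, 8)
sed_shortcut = math.sqrt(2 * s2_pooled / 8)

# Hypothetical unequal split of the same 16 observations.
sed_unequal = standard_error_of_difference(s2_pooled, 6, 10)

print(f"general formula, n1 = n2 = 8: {sed_equal:.2f} g")
print(f"equal-size shortcut:          {sed_shortcut:.2f} g")
print(f"n1 = 6, n2 = 10:              {sed_unequal:.2f} g")
```

For the same total number of observations and the same pooled variance, the unequal split gives a slightly larger standard error than the equal split.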
It is important that you can recognise when to use this modified method for
comparing groups that have unequal sizes. For more practice in doing this and the
calculations involved, stop now and complete Activity 4.3.
Activity 4.3
In a survey of sweet potato yields in village gardens, available soil Phosphorus (P)
was measured. The gardens were classed as below or above average P. The yields of
sweet potato are listed in Table 4.3. State a null hypothesis and do a 2 sample t-test.
What is your conclusion?
Table 4.3: Yields (t/ha) of sweet potato
Below average P Above average P
12.0 14.3
12.3 15.2
11.4 11.2
9.0 15.5
9.5 13.8
8.9 15.4
10.0 12.2
10.3
11.5
In your answer:
treatment is applied to one member of each pair while a second treatment is applied
to the other member of the pair.
Examples of meaningfully ‘paired’ individuals are pairs of female piglets from the
same litter. Each pair will be full sisters so they will be genetically similar and share
a common maternal environment. Such a pair is as close to being exactly alike as
possible, so any significant difference found between its members is most likely
due to the treatment applied. The pairs would be taken from
a number of litters. In another case, pots in a shade house can be arranged in pairs so
that variation in light, temperature or other position effects will be similar for each
pair. When we use pairs that are almost perfectly the same we are said to be
controlling all the other variables that could upset the experiment.
A special use of this technique involves self-pairing. Each individual is measured
before and after a treatment is applied, or two treatments are applied to different
sides of the same individual. These are particularly useful when there is wide
variation between individuals. We will illustrate the analysis method with this type
of paired experiment, but it is the same for any type of pairing.
Table 4.4 gives data from a paired experiment. They are the number of lesions on
tobacco leaves caused by two different preparations of a virus extract. One
preparation was applied to half of a leaf while the second preparation was applied to
the other half. Thus, the two halves of the same leaf were the pair and this was
repeated on a number of plants. The table also shows the calculations needed for the
analysis.
Table 4.4: Number of lesions on half tobacco leaves
(Source: Snedecor, G.W. and Cochran, W.G. (1980). Statistical Methods 7th ed.
Iowa State University Press, Ames. pp. 86-87.)
Step 1 Hypotheses
Define the null hypothesis and the alternative hypothesis.
Null hypothesis
First we have to set up a null hypothesis. This is: There is no difference between the
virus extracts in the number of lesions they cause on tobacco leaves. If this is true,
then we expect the mean difference to be 0, or simply:
HO: µD = 0
Alternative hypothesis
There is indeed a real difference between the virus extracts in the number of lesions
they cause on tobacco leaves.
HA: µD ≠ 0
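Step 2 Mean difference
Calculate the mean of the differences between the members of each pair.
\[ \bar{d} = \frac{\sum d}{n} = \frac{32}{8} = 4 \]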
Where n, in this case, is the number of pairs (which is eight) and NOT the number of
individual values from the two virus extract preparations, which is 16.
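Step 3 Standard deviation of the differences
\[ s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{258 - \frac{(32)^2}{8}}{8 - 1}} = 4.309 \]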
Step 4 Standard error of mean difference
The standard error of the mean difference is:
\[ s_{\bar{d}} = \frac{s_d}{\sqrt{n}} = \frac{4.309}{\sqrt{8}} = 1.52 \]
Step 6 Conclusion
As the mean difference of 4 is outside this acceptance region we reject the null
hypothesis and conclude that there is a real difference between the virus extracts in
the number of lesions they produce, i.e., µD ≠ 0.
Step 7 Confidence limits
Following this conclusion, we will be interested in estimating confidence limits for
the mean difference from our analysis. We saw earlier that these are simply the mean
difference plus the acceptance limits. Thus:
Conclusion
As the observed t-value is outside the acceptance region, we reject the null
hypothesis as before.
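The paired analysis can be sketched in Python from the summary figures quoted in the steps above (Σd = 32, Σd² = 258, n = 8 pairs). The tabulated two-tailed 5% t-value for 7 degrees of freedom, 2.365, is the standard table value. It is typed in here as an assumption because it is not printed in the text above, and the variable names are illustrative.

```python
import math

# Summary figures for the differences between the paired half-leaves.
n_pairs = 8
sum_d = 32     # sum of the differences
sum_d2 = 258   # sum of the squared differences

T_TABLE = 2.365  # assumed: two-tailed 5% t-value for n_pairs - 1 = 7 df

# Mean difference.
mean_d = sum_d / n_pairs

# Standard deviation of the differences (calculation formula).
ss_d = sum_d2 - sum_d ** 2 / n_pairs
sd_d = math.sqrt(ss_d / (n_pairs - 1))

# Standard error of the mean difference.
se_d = sd_d / math.sqrt(n_pairs)

# Acceptance region around zero, and the decision.
acceptance_limit = T_TABLE * se_d
reject_null = abs(mean_d) > acceptance_limit

# Conventional method: express the mean difference in t units.
t_observed = mean_d / se_d

# 95% confidence limits for the mean difference.
lower_limit = mean_d - acceptance_limit
upper_limit = mean_d + acceptance_limit

print(f"mean difference = {mean_d:.2f}, sd = {sd_d:.3f}, se = {se_d:.2f}")
print(f"acceptance region = +/- {acceptance_limit:.2f}, observed t = {t_observed:.2f}")
print("reject null hypothesis" if reject_null else "retain null hypothesis")
print(f"95% confidence limits: {lower_limit:.2f} to {upper_limit:.2f}")
```

Note that the degrees of freedom are based on the number of pairs, not on the 16 individual counts.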
Activity 4.4
In a study of change in blood cholesterol after heart attack, cholesterol levels were
measured for the same patients after 2 and 4 days (Table 4.5). State an appropriate
null hypothesis for the change in cholesterol level and carry out a paired sample
t-test. Make a conclusion about the change and calculate the 95% confidence limits.
(Source of data: Ryan, B.F., Joiner, B.L. and Ryan, T.A. (1985) MINITAB Handbook
2nd ed. PWS-Kent Publishing Company, Boston. p.92)
In your answer:
Type I error
When the null hypothesis is true and you reject it, you make a type I error. The
probability of making a type I error is α, which is the level of significance you set for
your hypothesis test. An α of 0.05 indicates that you are willing to accept a 5%
chance that you are wrong when you reject the null hypothesis. To lower this risk,
you must use a lower value for α. However, using a lower value for alpha means that
you will be less likely to detect a true difference if one really exists.
Type II error
When the null hypothesis is false and you fail to reject it, you make a type II error.
The probability of making a type II error is β, which depends on the power of the
test. You can decrease your risk of committing a type II error by ensuring your test
has enough power. You can do this by ensuring your sample size is large enough to
detect a practical difference when one truly exists.
The probability of rejecting the null hypothesis when it is false is equal to 1–β. This
value is the power of the test.
Decision based on sample   H0 is true                      H0 is false
Retain H0                  Correct Decision                Type II Error
                           (probability = 1 - α)           Accepting H0 when it is false
                                                           (probability = β)
Reject H0                  Type I Error                    Correct Decision
                           Rejecting H0 when it is true    (probability = 1 - β)
                           (probability = α)
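The relationship between α, β, power and sample size can be explored with a rough Python sketch. It uses a normal approximation instead of the t-distribution, which is a simplifying assumption made here because Python's standard library does not provide t tables. The true difference of 150 g and the group sizes are hypothetical, and the pooled variance is borrowed from the taro example only to give realistic numbers.

```python
import math
from statistics import NormalDist


def approximate_power(true_difference, standard_error, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided test, normal approximation."""
    z = NormalDist()
    z_critical = z.inv_cdf(1 - alpha / 2)
    shift = abs(true_difference) / standard_error
    # Probability that the test statistic falls in either rejection tail.
    return (1 - z.cdf(z_critical - shift)) + z.cdf(-z_critical - shift)


s2_pooled = 21199.88     # g^2, pooled variance from the taro example
true_difference = 150.0  # g, hypothetical real difference between populations

powers = {}
for n in (4, 8, 16, 32):  # hypothetical number of plants per group
    sed = math.sqrt(2 * s2_pooled / n)
    powers[n] = approximate_power(true_difference, sed)
    print(f"n = {n:2d} per group: power = {powers[n]:.2f}, beta = {1 - powers[n]:.2f}")

# With no real difference, the chance of rejecting H0 is just alpha (type I error).
type_1_rate = approximate_power(0.0, 1.0, alpha=0.05)

# A stricter alpha lowers the type I risk but also lowers the power.
power_alpha_05 = approximate_power(2.0, 1.0, alpha=0.05)
power_alpha_01 = approximate_power(2.0, 1.0, alpha=0.01)
```

Under these assumptions, power rises as the sample size increases and falls when α is made stricter, which matches the trade-off described above.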
Two-sided
Use a two-sided alternative hypothesis (also known as a non-directional hypothesis)
to determine whether the population parameter is either greater than or less than the
hypothesized value. A two-sided test can detect when the population parameter
differs in either direction, but has less power than a one-sided test.
One-sided
Use a one-sided alternative hypothesis (also known as a directional hypothesis) to
determine whether the population parameter differs from the hypothesized value in a
specific direction. You can specify the direction to be either greater than or less than
the hypothesized value. A one-sided test has greater power than a two-sided test, but
it cannot detect whether the population parameter differs in the opposite direction.
Two-sided
A researcher has results for a sample of students who took a national exam at a high
school. The researcher wants to know if the scores at that school differ from the
national average of 850. A two-sided alternative hypothesis (also known as a non-
directional hypothesis) is appropriate because the researcher is interested in
determining whether the scores are either less than or greater than the national
average. (H0: μ = 850 vs. H1: μ≠ 850)
One-sided
A researcher has exam results for a sample of students who took a training course for
a national exam. The researcher wants to know if trained students score above the
national average of 850. A one-sided alternative hypothesis (also known as a
directional hypothesis) can be used because the researcher is specifically
hypothesizing that scores for trained students are greater than the national average.
(H0: μ = 850 vs. H1: μ > 850)
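The difference between the two kinds of test can be illustrated with a small Python sketch based on the exam example. The sample mean, standard deviation and sample size below are hypothetical, since the example gives no data, and a large-sample normal approximation is assumed in place of the t-distribution.

```python
import math
from statistics import NormalDist

alpha = 0.05
national_average = 850

# Hypothetical sample results for the trained students.
sample_mean = 868.0
sample_sd = 60.0
n = 36

standard_error = sample_sd / math.sqrt(n)
z_observed = (sample_mean - national_average) / standard_error

z = NormalDist()
# Two-sided test: alpha is split between the two tails.
two_sided_critical = z.inv_cdf(1 - alpha / 2)
# One-sided test (H1: mean > 850): all of alpha is in the upper tail.
one_sided_critical = z.inv_cdf(1 - alpha)

reject_two_sided = abs(z_observed) > two_sided_critical
reject_one_sided = z_observed > one_sided_critical

print(f"observed z = {z_observed:.2f}")
print(f"two-sided critical value = {two_sided_critical:.3f}, reject H0: {reject_two_sided}")
print(f"one-sided critical value = {one_sided_critical:.3f}, reject H0: {reject_one_sided}")
```

With these made-up figures the one-sided test rejects the null hypothesis while the two-sided test does not, because the one-sided critical value is smaller. The price is that a one-sided test of this kind could never detect scores below the national average.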
Confidence level
The confidence level, denoted c, is the probability that a specified range of values
contains the parameter. The confidence level is connected with the level of
significance by the relationship c = 1 − α.
The common level of significance and the corresponding confidence level are given
below:
Rejection region:
The rejection region is the set of values of the test statistic for which the null
hypothesis is rejected.
Revision of terms
When you have finished the analyses in Activities 4.2, 4.3 and 4.4, revise the unit and
answer the short questions below:
Activity 4.5
1. What distribution do the differences between sample means follow?
2. What is a null hypothesis?
3. What is the standard error of the difference between means?
4. What is an acceptance region?
5. When is a difference between means significant?
6. What conclusion is drawn when a difference between means is significant?
7. What is a paired sample t-test?
8. When is a paired experiment useful?