ICS Week 4 - Handouts

This document discusses the use of the t-distribution for statistical inference when the population standard deviation is unknown, highlighting its bell-shaped and symmetrical nature. It explains how to calculate confidence intervals and perform hypothesis testing using the one-sample and two-sample t-tests, including examples and critical values based on degrees of freedom. Additionally, it addresses the importance of dependent and independent samples in statistical analysis.
Unit 2 – Statistical Inference 75

4 STUDENT’S T-DISTRIBUTION
In previous sections, inferences about the population mean were made on the
assumption that the population standard deviation was known. The hypothesis test
based on the z-statistic is called the z-test. When the population standard deviation
($\sigma$) is unknown, estimation and hypothesis testing based on the standard Normal
distribution cannot be used, because both calculations require the value of $\sigma$.
Inferences can instead be based on the t-distribution, with the sample standard
deviation used in place of $\sigma$. Like the Normal distribution, the t-distribution is
bell-shaped and symmetrical. It has only one parameter, the degrees of freedom, which
defines its shape. A t-distribution with $k$ degrees of freedom is denoted $t_k$. As the
degrees of freedom increase, the t-distribution approaches the standard Normal
distribution (Figure 8).

FIGURE 8. PROBABILITY DENSITY PLOTS OF SOME T-DISTRIBUTIONS TOGETHER WITH THE STANDARD
NORMAL DISTRIBUTION.

In statistics, the degrees of freedom equal the sample size minus the number of
parameters to be estimated. For a dataset of sample size $n$, when inferences about
the population mean are based on a t-distribution, the degrees of freedom are
$n - 1$, as only one parameter (i.e. the mean) is estimated. Therefore, if the sample
size is large, the degrees of freedom are also large.

Inferences based on the t-distribution are made in the same way as described for
the Normal distribution. The only difference is in the critical values, which depend
on the degrees of freedom (and hence the sample size) and the significance level
($\alpha$). For example, at the 5% significance level ($\alpha = 0.05$), the two-sided
critical value of the standard Normal distribution is 1.96, while for a t-distribution
it is 2.57 (when $df = 5$) and 1.98 (when $df = 100$) (see Table 2).

TABLE 2. CRITICAL VALUES OF THE T-DISTRIBUTION.

                                Significance level
Degrees of freedom              $\alpha = 0.10$    $\alpha = 0.05$    $\alpha = 0.01$
5                               2.02               2.57               4.03
10                              1.81               2.23               3.17
20                              1.73               2.09               2.85
30                              1.70               2.04               2.75
50                              1.68               2.01               2.68
100                             1.66               1.98               2.63
150                             1.66               1.98               2.61
Standard Normal Distribution    1.65               1.96               2.58
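
The two-sided critical values in Table 2 can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function names (`t_pdf`, `t_cdf`, `t_critical`) and the numerical method (Simpson's rule plus bisection) are illustrative choices, not part of the handout. In practice a statistics package would normally be used.

```python
import math


def t_pdf(x, df):
    """Probability density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: Simpson's rule on [0, x] plus symmetry about 0."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(df, alpha):
    """Two-sided critical value c such that P(|T| > c) = alpha (bisection)."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Compare with the 5% column of Table 2 for df = 5 and df = 100.
for df in (5, 100):
    print(df, round(t_critical(df, 0.05), 2))
```

As the degrees of freedom grow, the values returned by `t_critical` move towards the standard Normal critical values in the last row of the table.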

4.1 ESTIMATION BASED ON T-DISTRIBUTION


Given a sample of size $n$, if the population standard deviation $\sigma$ is unknown,
the 90%, 95% and 99% confidence intervals for the population mean are:

$$90\%\ CI = \bar{X} \pm t_{n-1,\,0.10} \frac{s}{\sqrt{n}},$$

$$95\%\ CI = \bar{X} \pm t_{n-1,\,0.05} \frac{s}{\sqrt{n}},$$

$$99\%\ CI = \bar{X} \pm t_{n-1,\,0.01} \frac{s}{\sqrt{n}},$$

respectively, where:

• $\bar{X}$ is the sample mean;

• $s$ is the sample standard deviation;

• $n$ is the sample size;

• $t_{n-1,\,\alpha}$ denotes the critical value of the t-distribution with $n - 1$ degrees of
freedom at a significance level of $\alpha$.

Worked Example 6: Confidence interval for a mean

Twenty patients attended an osteoporosis clinic for a bone mass density (BMD) scan.
The BMD in this sample is plausibly normally distributed with a mean of 0.8 g/cm²
and a standard deviation of 0.1 g/cm². Given that the 95% critical values for a
t-distribution with 19 degrees of freedom are ±2.093, find the 95% confidence
interval for the mean BMD.

Solution 6

The data are plausibly normally distributed, but the population standard deviation is
unknown. Therefore, inferences about the population mean cannot be based on the
Normal distribution, and the t-distribution is used instead.

• The degrees of freedom = sample size − 1 = 20 − 1 = 19;

• $\alpha = 1 - 95\% = 0.05$;

• The critical value $t_{19,\,0.05} = 2.093$;

• The 95% CI = sample mean $\pm\ t_{19,\,0.05} \times \frac{s}{\sqrt{n}} = 0.8 \pm 2.093 \times \frac{0.1}{\sqrt{20}}$ = 0.75 to 0.85 g/cm².

Notice that rounding to two decimal places is done at the last stage of the
calculation. Rounding in earlier stages may introduce errors.
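
The calculation in Solution 6 can be checked with a few lines of Python. This is only a sketch: the function name `t_confidence_interval` is hypothetical, and the critical value 2.093 is taken from the example rather than computed.

```python
import math


def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Worked Example 6: n = 20, mean 0.8 g/cm^2, sd 0.1 g/cm^2, t(19, 0.05) = 2.093
lower, upper = t_confidence_interval(mean=0.8, sd=0.1, n=20, t_crit=2.093)

# Rounding happens only here, at the final step.
print(f"95% CI: {lower:.2f} to {upper:.2f} g/cm^2")
```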

4.2 ONE-SAMPLE T-TEST


The procedure for testing a hypothesis about a population mean with unknown
population standard deviation is the same as when the population standard
deviation is known. However, instead of using the z-statistic with critical values 1.96
(for 95% confidence level), 2.58 (for 99% confidence level), etc., the t-statistic is
used, and critical values are determined from the t-distribution. Such a hypothesis
test is referred to as a one-sample t-test.

Worked Example 7: One-sample t-test

A standard of no more than 300 parts per million of pollution is set for the air in a
city. In a sample of measurements taken at 15 random locations, the readings were
plausibly normally distributed with a mean of 303 parts per million and a standard
deviation of 3.6 parts per million. Can the city authorities claim they are achieving
the standard?

Solution 7

1. The data are plausibly normally distributed, but the population standard
deviation is unknown. Therefore, inferences about the population mean cannot
be based on the Normal distribution, and the t-distribution is used instead.

2. The null and alternative hypotheses are, respectively:

• $H_0$: The population mean pollution reading = 300 parts per million;

• $H_1$: The population mean pollution reading ≠ 300 parts per million.

3. The significance level is taken to be 5%, i.e. $\alpha = 0.05$.

4. The test statistic is:

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{303 - 300}{3.6/\sqrt{15}} = 3.23.$$

5. At the 5% significance level, the t-distribution with 14 (= 15 − 1) degrees of
freedom has the critical value $t_{14,\,0.05} = 2.145$. The decision rule is:

• Reject the null hypothesis if $|t| > 2.145$;

• Do not reject the null hypothesis if $|t| \le 2.145$.

6. Because the value of the test statistic is in the critical region ($|t| = 3.23 > 2.145$),
the null hypothesis is rejected at the 5% level of significance. Therefore, the city
authorities cannot claim they are achieving the standard of 300 parts per million
of pollution for the air in the city.
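
The steps of the one-sample t-test can be sketched in Python as follows. The function name `one_sample_t` is hypothetical, and the critical value 2.145 is the one quoted in the solution, not a computed quantity.

```python
import math


def one_sample_t(sample_mean, mu0, sd, n):
    """t statistic for H0: population mean = mu0, with sd estimated from the sample."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))


# Worked Example 7: n = 15, mean 303 ppm, sd 3.6 ppm, hypothesised mean 300 ppm
t_stat = one_sample_t(sample_mean=303, mu0=300, sd=3.6, n=15)

T_CRIT = 2.145  # t(14, 0.05), taken from the text
reject_h0 = abs(t_stat) > T_CRIT

print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```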

Note that statistical software reports a p-value when carrying out a one-sample
t-test. The decision rule for the ‘p-value approach’ is always the same. For
example, if the significance level is taken to be 0.05, then

• Reject the null hypothesis if p < 0.05;

• Do not reject the null hypothesis if p ≥ 0.05.
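
To illustrate where such a p-value comes from, the sketch below computes a two-sided p-value as the area in both tails of the t-distribution beyond the observed statistic. It uses only the Python standard library and Simpson's rule for the integration; the function names are hypothetical, and statistical software will use more refined numerical methods.

```python
import math


def t_pdf(x, df):
    """Probability density of the t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|), i.e. 1 minus the central area between -|t| and |t|."""
    x = abs(t_stat)
    if x == 0:
        return 1.0
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -x and x, by symmetry
    return max(0.0, 1.0 - central_area)


# Worked Example 7: t = 3.23 with 14 degrees of freedom
p = two_sided_p_value(3.23, 14)
print("reject H0" if p < 0.05 else "do not reject H0")
```

A statistic exactly equal to the critical value 2.145 gives a p-value of approximately 0.05, which is why the critical-value and p-value approaches lead to the same decision.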

SAQ 3:

In an operating theatre, a surgeon found that it takes an average of 8.5 hours to
perform a particular procedure. A random sample of 12 recent operations shows
that the procedures have taken 8.2 hours on average, with a standard deviation of
0.2 hours. Assume that the operation time is plausibly normally distributed. Has the
mean operation time changed?

4.3 TWO-SAMPLE T-TEST


In the hypothesis test above, a sample value was compared to a hypothesised value.
Most commonly, however, the objective is to compare two samples and answer
research questions such as: Is Treatment A more effective than Treatment B?

Essentially, the procedure is the same as before. However, there is one further and
important consideration: the relationship between the two samples. Consider the
following scenarios.

Scenario 1: A new treatment for eczema has been developed and a trial is set up to
determine if it is more effective than the current preparation of choice. Each patient
is given both treatments simultaneously, applying the new preparation to one side
of the body and the existing formula to the opposite side.

Scenario 2: Two types of cast are used for supporting fractures of the radial bone:
full and half. A trial is set up to determine whether the casts are equally effective in
preventing displacement of the fracture. Patients with the injury are selected at
random and given either the full or the half cast.

Superficially these two scenarios are similar but there is one essential difference. In
the first case, the outcomes of the treatments will be dependent because they are
applied to the same individual, albeit on different sides of the body. These are
paired observations. In the second case, however, the casts will be applied to
different individuals and the outcomes will be independent.

Generally, studies in which repeated observations are made on individuals are
extremely efficient. Between-subject differences are often the greatest sources of
variation and, by eliminating them, other effects, such as treatments, become easier
to identify. Repeated measures on patients are often used to assess the results of
an intervention and monitor progress over time (e.g. blood pressure pre- and
post-treatment).

The tests described here can be extended to more than two groups. The techniques
are broadly known as analysis of variance (ANOVA). Details are beyond the scope of
this unit but are described in Altman (1997).

Worked Example 8: Reflection

A study is designed to determine if a new treatment for asthma is more effective than
the existing preparation. Subjects are matched by age, severity of illness and previous
treatment history, which are the factors known to affect asthma. Are the outcomes
dependent or independent? Why?

Solution 8

The principal source of variation of response to a treatment arises from
between-subject differences. These are due to the natural variation of measurements
made on living organisms. Some can be identified (e.g. lifestyle factors, patient
history, other disease) and some are purely random. Between-subject differences can be
eliminated entirely by applying both treatments to the same individual but, when
this is not feasible, they can be reduced by matching for factors known to affect the
outcome. In trials where matching has taken place, the outcomes will be dependent,
because much of the variation will have been controlled. These and other issues in
study design are discussed in Section 8 of this unit.

SAQ 4:

State whether samples in the following studies are paired or independent:

a) Levels of lead are measured in samples of blood taken from two different groups
of children. The first group live near a lead smelter and the second group live in
an unpolluted area.

b) Levels of lead are measured in samples of blood taken from two different groups
of children. The first group live near a lead smelter and the second group live in
an unpolluted area. Each child in the smelter group is matched for age and
gender with a child in the unpolluted area.

c) Two types of bronchodilator used to treat asthma are to be compared. Ten
asthmatics use each of the two bronchodilators in turn over two four-week periods.
The order is randomised and there is a one-week gap between the two periods.

d) Two analgesics are compared in two groups of patients with chronic pain.

e) Smoking rates have been measured in the same twenty patients, selected at
random from a general practice, before and six months after the introduction
of a health education campaign.

4.3.1 Dependent samples


In a comparison of dependent continuous data distributions (e.g. pre- and
post-treatment blood pressure measurements), the difference between each pair of
measurements is determined and inferences are made about the mean difference.
In such cases, the paired t-test would be an appropriate choice. The paired t-test is
derived from the one-sample t-test (Section 4.2), under the following assumptions:

1. The distribution of the paired differences (not the original data) is plausibly
Normal.

2. The differences are independent of each other (i.e. the difference from the
1st pair is independent of the differences from the 2nd, 3rd, …, nth pairs).
Note that this does not mean the original data are independent. Within each
pair, the data are dependent (e.g. the pre-treatment and post-treatment SBP
of the same patient are dependent).

When the two assumptions are satisfied, inferences can be made based on the
t-distribution. The null and alternative hypotheses are:

$H_0$: There is no difference between the two samples, i.e. the mean of the paired
differences $\mu_d = 0$.

$H_1$: There is a difference between the two samples, i.e. the mean of the paired
differences $\mu_d \ne 0$.

As in the one-sample t-test, the test statistic is:

$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{\bar{d}}{s_d/\sqrt{n}},$$

where $\bar{d}$ is the mean difference observed from the two samples (which is used as
an estimate of $\mu_d$), $s_d$ is the standard deviation of the paired differences (not the
original data), and $n$ is the sample size (i.e. the number of pairs).

The value of $t$, calculated using the above formula, is then compared with a critical
value $t_{n-1,\,\alpha}$ (degrees of freedom = $n - 1$, significance level = $\alpha$). The decision
rule is the same as for the one-sample t-test, that is,

• reject $H_0$ if $|t| > t_{n-1,\,\alpha}$;

• do not reject $H_0$ if $|t| \le t_{n-1,\,\alpha}$.

The sample mean difference $\bar{d}$ is calculated from the paired differences
$\{d_1, d_2, \dots, d_n\}$ as:

$$\bar{d} = \frac{d_1 + d_2 + \cdots + d_n}{n}.$$
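
The paired calculation can be sketched from raw data as follows. The measurements below are made-up values used purely for illustration, and the function name `paired_t` is hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t-test statistic computed from two lists of matched measurements.

    Returns (mean difference, sd of differences, t statistic, degrees of freedom).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]  # d_i for each pair
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample sd of the differences, not the raw data
    t_stat = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t_stat, n - 1


# Hypothetical measurements on four subjects under two conditions.
condition_a = [5, 7, 6, 9]
condition_b = [4, 5, 6, 6]
d_bar, s_d, t_stat, df = paired_t(condition_a, condition_b)
print(d_bar, round(s_d, 3), round(t_stat, 3), df)
```

Note that only the differences enter the test; the spread of the original measurements in each condition plays no direct role.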

Worked Example 9: Paired t-test

Ten patients with eczema were given two treatments: Preparation A and
Preparation B. Preparation A was applied to one side of the body and Preparation B
to the other. The severity of each patient’s symptoms was measured before and after
treatment, and the reductions in severity were recorded. The mean difference in the
reduction in severity was 1.3 units and the standard deviation of the differences was
4.55 units. The differences in reduction (i.e. reduction under A − reduction under B)
are plausibly normally distributed. Does this study support the notion that both
preparations are equally effective? (In practice, data are usually given in the form
of the following table.)

Patient       1         2         3         4   5   6   7   8   9   10
Treatment A   $d_{1A}$  $d_{2A}$  $d_{3A}$  …   …   …   …   …   …   $d_{10A}$
Treatment B   $d_{1B}$  $d_{2B}$  $d_{3B}$  …   …   …   …   …   …   $d_{10B}$

Solution 9

It is important to understand what the values given in the question mean.

Patient             1         2         3         4   5   6   7   8   9   10
Treatment A         $d_{1A}$  $d_{2A}$  $d_{3A}$  …   …   …   …   …   …   $d_{10A}$
Treatment B         $d_{1B}$  $d_{2B}$  $d_{3B}$  …   …   …   …   …   …   $d_{10B}$
Difference (A − B)  $d_1$     $d_2$     $d_3$     …   …   …   …   …   …   $d_{10}$

Here:

• $d_{1A}$ denotes the reduction in severity in patient 1 under treatment A; $d_{1B}$
denotes the reduction in severity in the same patient (patient 1) under
treatment B. Therefore, $d_{1A}$ and $d_{1B}$ are dependent. The same holds for the
values of the other 9 patients. Hence we have paired data.

• $d_1 = d_{1A} - d_{1B}$, $d_2 = d_{2A} - d_{2B}$, …, $d_{10} = d_{10A} - d_{10B}$. They are the
differences between the two treatments (A and B) within the same patient. The ten
differences themselves are independent, as they come from different patients.

• The mean of the differences $\{d_1, d_2, \dots, d_{10}\}$ is 1.3 units and the standard
deviation of $\{d_1, d_2, \dots, d_{10}\}$ is 4.55 units, i.e. $\bar{d} = 1.3$ and $s_d = 4.55$.

The procedure for carrying out a paired t-test is as follows:

1. The differences are independent and plausibly normally distributed with
unknown population standard deviation. Therefore, the paired t-test (t-test for
paired data) is appropriate. (Note: the first step is always to check the
assumptions. This is necessary.)

2. The null and alternative hypotheses are, respectively:

• $H_0$: The two treatments are equally effective, i.e. $\mu_d = 0$.

• $H_1$: The two treatments are not equally effective, i.e. $\mu_d \ne 0$.

3. The significance level is taken to be 5%, i.e. $\alpha = 0.05$.

4. The test statistic is:

$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{1.3}{4.55/\sqrt{10}} = 0.90.$$

5. At the 5% significance level, the t-distribution with 9 (= 10 − 1) degrees of
freedom has the critical value $t_{9,\,0.05} = 2.262$. The decision rule is:

• Reject the null hypothesis if $|t| > 2.262$;

• Do not reject the null hypothesis if $|t| \le 2.262$.

6. Because the value of the test statistic is not in the critical region ($|t| = 0.90 \le
2.262$), we do not reject the null hypothesis at the 5% level of significance. There
is not enough evidence to suggest that the two treatments are not equally effective.
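
When only the summary statistics of the differences are available, as in Worked Example 9, the paired t-test reduces to a one-line calculation. The sketch below uses a hypothetical function name and takes the critical value 2.262 from the solution above.

```python
import math


def paired_t_from_summary(mean_diff, sd_diff, n_pairs):
    """Paired t statistic from the mean and sd of the paired differences."""
    return mean_diff / (sd_diff / math.sqrt(n_pairs))


# Worked Example 9: mean difference 1.3, sd of differences 4.55, 10 patients
t_stat = paired_t_from_summary(mean_diff=1.3, sd_diff=4.55, n_pairs=10)

T_CRIT = 2.262  # t(9, 0.05), taken from the text
decision = "reject H0" if abs(t_stat) > T_CRIT else "do not reject H0"
print(f"t = {t_stat:.2f}: {decision}")
```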

4.3.2 Independent samples


Sometimes the two samples of data are not related, i.e. they are two independent
samples. For example, we want to compare two oral antidiabetic drugs in terms of
their real-world effectiveness in reducing HbA1c. The two samples of data are the
diabetic patients in an electronic healthcare records (EHR) database who received
either drug A or drug B. The two treatments have been given to different patients.
Therefore, this is a case of two independent samples.

Generally, testing the equality of means from two independent samples is to test if
the two independent samples are from the same (plausibly Normal) population.
Necessary assumptions include:

1. The data in each group are plausibly normally distributed. (In the above
example, this means that the HbA1c reduction in both groups, drug A and
drug B, should be normally distributed.)

2. The population variances (or standard deviations) of the two groups are
equal.

When the two assumptions are satisfied, we can test the difference in the population
means using the unpaired t-test (also known as the t-test for two independent
samples).

Suppose the two samples are of size $n_1$ and $n_2$. The sample means are $\bar{X}_1$ and
$\bar{X}_2$, and the sample standard deviations are $s_1$ and $s_2$, respectively. The null
and alternative hypotheses are:

$H_0$: The two population means are equal;

$H_1$: The two population means are not equal.

We need to calculate a pooled standard deviation of the two samples, given by:

$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}.$$

Then the test statistic is given by:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{s \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{s \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},$$

which follows the t-distribution with $(n_1 + n_2 - 2)$ degrees of freedom. The
decision rule is then:

• reject $H_0$ if $|t| > t_{n_1+n_2-2,\,\alpha}$;

• do not reject $H_0$ if $|t| \le t_{n_1+n_2-2,\,\alpha}$.
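
The pooled standard deviation and test statistic can be computed from raw data as in the sketch below. The two samples are made-up numbers for illustration only, and the function name `pooled_t_test` is hypothetical.

```python
import math
import statistics


def pooled_t_test(sample1, sample2):
    """Unpaired t-test with a pooled standard deviation (equal variances assumed).

    Returns (t statistic, pooled standard deviation, degrees of freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)  # sample variance s1^2
    var2 = statistics.variance(sample2)  # sample variance s2^2
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)
    std_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)
    t_stat = (statistics.mean(sample1) - statistics.mean(sample2)) / std_error
    return t_stat, pooled_sd, df


# Hypothetical measurements from two independent groups.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, pooled_sd, df = pooled_t_test(group_a, group_b)
print(round(t_stat, 3), pooled_sd, df)
```

The resulting statistic would then be compared with the critical value for `df` degrees of freedom, exactly as in the decision rule above.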

The unpaired t-test described above requires two conditions to be satisfied: (i)
Normality; and (ii) equal variances, i.e. the two population variances are equal.
The unpaired t-test is fairly robust to slight departures from Normality, but it is
less robust to unequal variances. When the variances are not equal, caution should be
exercised in comparing the means, because clearly the samples are not drawn from the
same population. In such cases, there are four options:

1. Use Welch’s test (also known as Welch’s t-test).

2. Use a non-parametric test to compare the two groups (introduced in Study Unit 3).

3. Try a data transformation to see if equal variances can be achieved. In some
cases, the variability increases with magnitude. It may be possible to achieve
equality of variances by applying a logarithmic transformation to the data
(Unit 1, Section 6.1.4).

4. Do not proceed with the test of the means.

Welch’s test modifies the unpaired t-test and is suitable when the samples are known
to have arisen from Normal distributions with unequal variances. The Welch’s test
statistic is given by:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}},$$

which approximately follows the t-distribution with the following degrees of freedom:

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{s_1^4}{n_1^2 (n_1 - 1)} + \frac{s_2^4}{n_2^2 (n_2 - 1)}}.$$

You do not need to remember these formulas, as the essential idea of Welch’s test is
the same as that of the unpaired t-test, apart from the modifications shown above. It
can be implemented by software.
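
As a rough illustration of what such software does, the sketch below evaluates Welch's statistic and the Welch–Satterthwaite degrees of freedom from summary statistics. The function name `welch_t` and the input values are hypothetical, chosen only to show that the degrees of freedom are generally not a whole number.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom.

    Does not assume that the two population variances are equal.
    """
    v1 = sd1 ** 2 / n1  # s1^2 / n1
    v2 = sd2 ** 2 / n2  # s2^2 / n2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Hypothetical summary statistics with clearly unequal standard deviations.
t_stat, df = welch_t(mean1=10.0, sd1=2.0, n1=10, mean2=8.0, sd2=4.0, n2=10)
print(round(t_stat, 3), round(df, 2))
```

With these inputs the degrees of freedom come out at roughly 13.2, well below the 18 that the pooled test would use, which makes the critical value slightly larger.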

Worked Example 10: Test of independent sample means

A sample of nineteen severely underweight patients was divided into two groups.
Twelve were given high protein diets and seven followed a low protein regime. The
gain in weight (kg) in an 84-day period was measured for each person. The results
were as follows.

• Mean weight gain of patients on the high protein diet = 12.0 kg.

• Mean weight gain of patients on the low protein diet = 10.1 kg.

• The standard deviations of the two samples are comparable ($s_1 = 2.14$ kg,
$s_2 = 2.06$ kg).

Given that the weight gains are plausibly normally distributed,

a) Conduct an appropriate hypothesis test to compare the effectiveness of the
two diets (high vs low protein).

b) Determine the 95% confidence interval for the difference between the mean
weights gained with the two diets.

Solution 10

a) An unpaired t-test is appropriate as: (i) the two samples are independent; (ii)
weight gain is plausibly Normal; and (iii) the population standard deviation is
unknown. The steps for carrying out an unpaired t-test are given below:

1. The null and alternative hypotheses are:

$H_0$: The mean weight gain is the same on both diets;

$H_1$: The mean weight gain is not the same on both diets.

2. The significance level is taken to be 5%.

3. The two sample standard deviations are comparable, so the pooled
standard deviation can be calculated as:

$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(12 - 1) \times 2.14^2 + (7 - 1) \times 2.06^2}{12 + 7 - 2}} = 2.11 \text{ kg}.$$

4. The test statistic is:

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{s \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{12.0 - 10.1}{2.11 \times \sqrt{\frac{1}{12} + \frac{1}{7}}} = 1.89.$$

5. The test statistic follows a t-distribution with 17 (= 12 + 7 − 2) degrees of
freedom. The critical value at the 5% significance level is $t_{17,\,0.05} = 2.11$.

6. The decision rule is:

• Reject $H_0$ if $|t| > t_{17,\,0.05}$;

• Do not reject $H_0$ if $|t| \le t_{17,\,0.05}$.

7. The null hypothesis is not rejected as $|t| = 1.89 \le t_{17,\,0.05} = 2.11$.

8. There is not enough evidence to suggest that the mean weight gains on the two
diets are significantly different at the 5% significance level.

b) The 95% confidence interval is given by:

$$(\bar{X}_1 - \bar{X}_2) \pm t_{n_1+n_2-2,\,0.05} \times s \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = (12.0 - 10.1) \pm \left(2.11 \times 2.11 \times \sqrt{\frac{1}{12} + \frac{1}{7}}\right) = [-0.22, 4.02].$$

The 95% confidence interval for the difference in mean weight gain is −0.22 kg to
4.02 kg.
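
Both parts of Worked Example 10 can be reproduced from the summary statistics. The sketch below uses hypothetical variable names and takes the critical value 2.11 from the solution rather than computing it.

```python
import math

# Summary statistics from Worked Example 10 (weight gain in kg).
n1, mean1, sd1 = 12, 12.0, 2.14  # high protein diet
n2, mean2, sd2 = 7, 10.1, 2.06   # low protein diet
T_CRIT = 2.11                    # t(17, 0.05), taken from the text

df = n1 + n2 - 2
pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
std_error = pooled_sd * math.sqrt(1 / n1 + 1 / n2)

diff = mean1 - mean2
t_stat = diff / std_error              # part a): test statistic
margin = T_CRIT * std_error
lower, upper = diff - margin, diff + margin  # part b): 95% CI

print(f"pooled sd = {pooled_sd:.2f} kg, t = {t_stat:.2f}, df = {df}")
print(f"95% CI: {lower:.2f} kg to {upper:.2f} kg")
```

The same standard error appears in both the test statistic and the confidence interval, which is why the interval contains zero exactly when the test does not reject at the same level.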

Worked Example 11: Reflection

1. What do you notice about the 95% confidence interval for the difference
between the mean weight gained with the two diets?

2. What can you conclude?

Solution 11

1. The 95% confidence interval [−0.22, 4.02] includes zero.

2. It is plausible that the difference between the mean weight gained with each diet
is equal to zero. This is consistent with the outcome of the hypothesis test, which
found no significant difference between the mean weight gained with the two
diets at the 5% significance level. Note that this does not show that the two diets
lead to the same weight gain, only that the data do not rule out a difference of
zero.
