MTPDF9 Statistical Inference of Two Samples
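and

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\sigma_{\bar{x}_1}^2 + \sigma_{\bar{x}_2}^2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

[Figure: sampling distribution for $\bar{x}_1 - \bar{x}_2$, centered at μ1 – μ2 with standard error $\sigma_{\bar{x}_1 - \bar{x}_2}$]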
Two-Sample z-Test for the Difference Between Means
A two-sample z-test can be used to test the difference between two population
means μ1 and μ2 when a large sample (at least 30) is randomly selected from each
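population and the samples are independent. The test statistic is $\bar{x}_1 - \bar{x}_2$, and the
standardized test statistic is

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}, \quad \text{where } \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$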
When the samples are large, you can use s1 and s2 in place of σ1 and σ2. If the
samples are not large, you can still use a two-sample z-test, provided the
populations are normally distributed and the population standard deviations are
known.
In Words | In Symbols
1. State the claim mathematically. Identify the null and alternative hypotheses. | State H0 and Ha.
2. Specify the level of significance. | Identify α.
3. Sketch the sampling distribution.
4. Determine the critical value(s). | Use z-table.
5. Determine the rejection region(s).
6. Find the standardized test statistic. | $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}$
H0: μ1 ≤ μ2
Ha: μ1 > μ2 (Claim)
α = 0.10
Critical value: z0 = 1.28
[Figure: standard normal curve with the rejection region to the right of z0 = 1.28]
There is enough evidence at the 10% level to support the teacher’s claim that her
students score better on the ACT.
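The critical value z0 = 1.28 in this example can be reproduced without a z-table. The following is a minimal sketch, assuming a right-tailed test at the 0.10 level; the helper name `in_rejection_region` is hypothetical and only illustrates the decision rule.

```python
from statistics import NormalDist

alpha = 0.10  # level of significance from the example

# Right-tailed test: z0 leaves an area of alpha in the right tail
# of the standard normal distribution.
z0 = NormalDist().inv_cdf(1 - alpha)
print(round(z0, 2))  # 1.28


def in_rejection_region(z, critical_value):
    """Hypothetical helper: True when z falls in the right-tail rejection region."""
    return z > critical_value
```

A standardized test statistic larger than z0 leads to rejecting H0; anything smaller leads to failing to reject H0.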
• The production manager of Risen Manufacturing would like to decide
which of the two plants should be given the responsibility of producing
the soft drink bottle cups. This decision is to be based on productivity
levels. A sample of 50 days at the Golden Star Plant produced the mean
of 104.6 thousand cups a day with s = 13.4 thousand. The Blue Moon
Plant produced an average of 98.7 thousand per day with s=15.2
thousand over 60 days. Do these plants differ significantly in production
level? Use a 0.05 level of significance.
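One possible way to work this exercise is sketched below. It assumes the hypotheses H0: μ1 = μ2 and Ha: μ1 ≠ μ2 (a two-tailed test, since the question asks only whether the plants differ), and it substitutes s1 and s2 for σ1 and σ2 because both samples contain at least 30 days. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample data from the exercise (thousands of cups per day)
n1, xbar1, s1 = 50, 104.6, 13.4   # Golden Star Plant
n2, xbar2, s2 = 60, 98.7, 15.2    # Blue Moon Plant
alpha = 0.05

# Standard error of x̄1 - x̄2, using s1 and s2 in place of σ1 and σ2
se = sqrt(s1**2 / n1 + s2**2 / n2)

# Standardized test statistic under H0: μ1 - μ2 = 0
z = ((xbar1 - xbar2) - 0) / se

# Two-tailed critical value: area alpha/2 in each tail
z0 = NormalDist().inv_cdf(1 - alpha / 2)

reject = abs(z) > z0
print(f"z = {z:.2f}, critical values = ±{z0:.2f}, reject H0: {reject}")
# z = 2.16, critical values = ±1.96, reject H0: True
```

Under these assumptions the statistic falls in the rejection region, which would indicate a significant difference in production level at the 0.05 level.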
Three conditions are necessary to use a t-test for small independent samples:
1. The samples must be randomly selected.
2. The samples must be independent. Two samples are independent if the sample
selected from one population is not related to the sample selected from the
second population.
3. Each population must have a normal distribution.
A two-sample t-test is used to test the difference between two population means μ1 and
μ2 when a sample is randomly selected from each population. Performing this test
requires each population to be normally distributed, and the samples should be
independent. The standardized test statistic is

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}.$$
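If the population variances are equal, then information from the two samples is combined
to calculate a pooled estimate of the standard deviation, $\hat{\sigma}$:

$$\hat{\sigma} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

The standard error for the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \quad \text{(variances equal)}$$

and d.f. = n1 + n2 – 2.

If the population variances are not equal, then the standard error is

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \quad \text{(variances not equal)}$$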
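The two standard-error formulas above can be sketched as small functions. The function names and the sample values below are hypothetical, chosen only to show how the formulas are evaluated.

```python
from math import sqrt


def pooled_standard_error(n1, s1, n2, s2):
    """Standard error of x̄1 - x̄2 when the population variances are assumed equal."""
    # Pooled estimate of the common standard deviation
    sigma_hat = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return sigma_hat * sqrt(1 / n1 + 1 / n2)


def unpooled_standard_error(n1, s1, n2, s2):
    """Standard error of x̄1 - x̄2 when the population variances are not assumed equal."""
    return sqrt(s1**2 / n1 + s2**2 / n2)


# Hypothetical sample sizes and standard deviations, for illustration only
print(pooled_standard_error(10, 4.0, 12, 5.0))
print(unpooled_standard_error(10, 4.0, 12, 5.0))
```

When the two sample sizes are equal, both formulas return the same value; they diverge as the sample sizes and sample standard deviations become more unbalanced.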
7. Make a decision to reject or fail to reject the null hypothesis. | If t is in the rejection region, reject H0. Otherwise, fail to reject H0.
8. Interpret the decision in the context of the original claim.
A random sample of 17 police officers in Brownsville has a mean annual income of
$35,800 and a standard deviation of $7,800. In Greensville, a random sample of 18
police officers has a mean annual income of $35,100 and a standard deviation of
$7,375. Test the claim at α = 0.01 that the mean annual incomes in the two cities
are not the same. Assume the population variances are equal.
H0: μ1 = μ2
Ha: μ1 ≠ μ2 (Claim)
d.f. = n1 + n2 – 2 = 17 + 18 – 2 = 33
Critical values: –t0 = –2.733 and t0 = 2.733
[Figure: t-distribution with rejection regions of area α/2 = 0.005 in each tail, beyond –t0 = –2.733 and t0 = 2.733]
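$$\begin{aligned}
\sigma_{\bar{x}_1 - \bar{x}_2} &= \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \\
&= \sqrt{\frac{(17 - 1)7800^2 + (18 - 1)7375^2}{17 + 18 - 2}} \cdot \sqrt{\frac{1}{17} + \frac{1}{18}} \\
&\approx 7584.0355(0.3382) \\
&\approx 2564.92
\end{aligned}$$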
[Figure: t-distribution with critical values –t0 = –2.733 and t0 = 2.733]

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(35800 - 35100) - 0}{2564.92} \approx 0.273$$
Fail to reject H0.
There is not enough evidence at the 1% level to support the claim that the mean
annual incomes differ.
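The arithmetic in this example can be checked with a short script. This is a sketch only: the critical value 2.733 is copied from the example rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Sample data from the police income example
n1, xbar1, s1 = 17, 35800, 7800   # Brownsville
n2, xbar2, s2 = 18, 35100, 7375   # Greensville
t0 = 2.733                        # critical value taken from the example

df = n1 + n2 - 2

# Pooled estimate of the standard deviation (population variances assumed equal)
pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)

# Standard error of x̄1 - x̄2
se = pooled_sd * sqrt(1 / n1 + 1 / n2)

# Standardized test statistic under H0: μ1 - μ2 = 0
t = ((xbar1 - xbar2) - 0) / se

reject = abs(t) > t0
print(df, round(pooled_sd, 2), round(se, 2), round(t, 3), reject)
# 33 7584.04 2564.92 0.273 False
```

Because |t| is well below t0, the script reaches the same decision as the example: fail to reject H0.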
Example
These samples are independent because it is not possible to pair the new trucks
with the used sedans. The data represents prices for different vehicles.
To perform a two-sample hypothesis test with dependent samples, the difference
between each data pair is first found:
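d = x1 – x2
[Figure: sampling distribution of the mean difference, centered at μd, with critical values –t0 and t0]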
The following symbols are used for the t-test for μd .
Symbol | Description
n | The number of pairs of data
d | The difference between entries for a data pair, d = x1 – x2
μd | The hypothesized mean of the differences of paired data in the population
$\bar{d}$ | The mean of the differences between the paired data entries in the dependent samples, $\bar{d} = \frac{\sum d}{n}$
$s_d$ | The standard deviation of the differences between the paired data entries in the dependent samples, $s_d = \sqrt{\frac{n(\sum d^2) - (\sum d)^2}{n(n - 1)}}$
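The formulas for the mean and standard deviation of the differences can be sketched as follows. The function name `paired_summary` and the list of differences are hypothetical; the comparison with Python's `statistics` module is included only to show that the shortcut formula agrees with the usual sample standard deviation.

```python
from math import sqrt, isclose
from statistics import mean, stdev


def paired_summary(differences):
    """Return (d_bar, s_d) for a list of paired differences d = x1 - x2."""
    n = len(differences)
    sum_d = sum(differences)
    sum_d2 = sum(d * d for d in differences)
    d_bar = sum_d / n
    # Shortcut formula: s_d = sqrt((n * Σd² - (Σd)²) / (n * (n - 1)))
    s_d = sqrt((n * sum_d2 - sum_d**2) / (n * (n - 1)))
    return d_bar, s_d


# Hypothetical differences, for illustration only
diffs = [2, -1, 4, 3, 0]
d_bar, s_d = paired_summary(diffs)

# The shortcut formula matches the library's sample mean and standard deviation
print(isclose(d_bar, mean(diffs)), isclose(s_d, stdev(diffs)))
```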
A t-test can be used to test the difference of two population means when a sample is
randomly selected from each population. The requirements for performing the test are
that each population must be normal and each member of the first sample must be paired
with a member of the second sample.
The test statistic is

$$\bar{d} = \frac{\sum d}{n}$$

and the standardized test statistic is

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}.$$
In Words | In Symbols
5. Determine the rejection region(s).
6. Calculate $\bar{d}$ and $s_d$. Use a table. | $\bar{d} = \frac{\sum d}{n}$, $s_d = \sqrt{\frac{n(\sum d^2) - (\sum d)^2}{n(n - 1)}}$
7. Find the standardized test statistic. | $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
8. Make a decision to reject or fail to reject the null hypothesis. | If t is in the rejection region, reject H0. Otherwise, fail to reject H0.
9. Interpret the decision in the context of the original claim.
A reading center claims that students will perform better on a standardized reading test
after going through the reading course offered by their center. The table shows the
reading scores of 6 students before and after the course. At α = 0.05, is there enough
evidence to conclude that the students’ scores after the course are better than the scores
before the course?
Student 1 2 3 4 5 6
Score (before) 85 96 70 76 81 78
Score (after) 88 85 89 86 92 89
H0: μd ≤ 0
Ha: μd > 0 (Claim)
α = 0.05
d.f. = 6 – 1 = 5
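The standardized test statistic for the reading scores can be computed with the paired-sample formulas. This sketch assumes each difference is taken as d = (score after) – (score before), which is the direction consistent with the claim that the mean difference is greater than 0. The critical value is not computed here; it would be read from a t-table with 5 degrees of freedom.

```python
from math import sqrt

# Reading scores from the table
before = [85, 96, 70, 76, 81, 78]
after = [88, 85, 89, 86, 92, 89]

# Assumption: d = after - before, so a positive mean difference supports the claim
d = [a - b for a, b in zip(after, before)]
n = len(d)

d_bar = sum(d) / n
s_d = sqrt((n * sum(x * x for x in d) - sum(d) ** 2) / (n * (n - 1)))

mu_d = 0  # hypothesized mean difference under H0
t = (d_bar - mu_d) / (s_d / sqrt(n))

print(d)                     # [3, -11, 19, 10, 11, 11]
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t:.3f}")
```

The value of t is then compared with the right-tail critical value to decide whether to reject H0.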