BUSS7902 Chapter 5 Lecture (Notes)

Lecturer source

Uploaded by

Bey Alivand
Copyright
© All Rights Reserved

BUSS7902

Chapter 5A
Statistical Inference III (CI)
Unit Coordinator: A/Prof Boris Choy

BUSINESS SCHOOL

Discipline of
Business Analytics
2024 S1
Lecture Outline
1. Estimation of the difference of two population means μ1 and μ2
with independent samples – σ1 and σ2 are known
2. Estimation of the difference of two population means μ1 and μ2
with independent samples – σ1 and σ2 are unknown but equal
3. Estimation of the difference of two population means μ1 and μ2
with independent samples – σ1 and σ2 are unknown and unequal
4. Estimation of the difference of two population means μ1 and μ2
with dependent samples
5. Estimation of the difference of two population proportions p1
and p2 with independent samples
References: (SSK-7/8) 11.1 – 11.4
Excel Analysis ToolPak & BUSS7902 Analysis ToolPak
✓ Expected lecture time: 1 hour
USYD > BUSS7902 > Dr Boris Choy Copyright: The University of Sydney, Australia 2
Concepts of Estimation
(Two populations)

Two Normal Populations

Q: How can we make inference about the unknown μ1 and μ2?


Two Binomial Populations

• Q: How can we make inference about the unknown p1 and p2?
Estimating μ1 – μ2
when the Population
Variances are Known
(SSK 11.1)

Two Populations – Known Variances

• Assumptions:
1. Populations are normal: N(μ1, σ1²) and N(μ2, σ2²).
2. The two random samples are independent.
3. Population variances σ1² and σ2² are known.
• If the populations are non-normal or their distributions are unknown,
the central limit theorem (CLT) can be used provided both sample sizes
are large (> 30).
Two Populations – Known Variances
• Objective: Estimate the difference of the two population means,
\[ \mu_{\bar{X}_1 - \bar{X}_2} = E[\bar{X}_1 - \bar{X}_2] = \mu_1 - \mu_2 \]
• Point estimator: $\hat{\mu}_{\bar{X}_1 - \bar{X}_2} = \bar{X}_1 - \bar{X}_2$, which is unbiased for μ1 – μ2.
• Variance of the point estimator:
\[ \sigma^2_{\bar{X}_1 - \bar{X}_2} = V[\bar{X}_1 - \bar{X}_2] = V[\bar{X}_1] + V[\bar{X}_2] \quad \text{(by independence)} \]
• Sampling distribution of $\bar{X}_1 - \bar{X}_2$:
\[ \bar{X}_1 - \bar{X}_2 \sim N\left(\mu_{\bar{X}_1 - \bar{X}_2},\ \sigma^2_{\bar{X}_1 - \bar{X}_2}\right) \]
Note: The CLT can be applied here for non-normal distributions.
Two Populations – Known Variances
• Sampling distribution of $\bar{X}_1 - \bar{X}_2$:

• Interval estimator: The 100(1 – α)% CI for $\mu_{\bar{X}_1 - \bar{X}_2}$ is
\[ (\text{LCL}, \text{UCL}) = \bar{X}_1 - \bar{X}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
Example
• Example: (Commission, SSK-7 11.1, modified) A business
analyst at a car dealer wants to compare the commission
received by male and female car salespeople. She took a
random sample of 60 female salespeople and 100 male
salespeople and recorded their commission income for the
preceding year. The mean commission income for the
female salespeople was $33,000 and for the male
salespeople it was $30,250. From past records, it is known
that the commission incomes are normally distributed with
standard deviation $7,000 for both male and female
salespeople. Find a 95% confidence interval for the
difference of mean commission incomes between female
and male car salespeople.
Example
Solution: Let µ1 and µ2 be the (population) mean commission incomes
earned by female and male car salespeople, respectively.
Given n1 = 60, n2 = 100, $\bar{x}_1$ = 33000, $\bar{x}_2$ = 30250,
σ1 = 7000, σ2 = 7000.
Note that the difference in the sample means is normally
distributed. A 95% CI for $\mu_{\bar{X}_1 - \bar{X}_2}$ (= µ1 – µ2) is
\[ \bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
 = 33000 - 30250 \pm 1.96 \times \sqrt{\frac{7000^2}{60} + \frac{7000^2}{100}}
 = 2750 \pm 2240.5 = (509.5,\ 4990.5) \]
so LCL = 509.5 and UCL = 4990.5.
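The interval in the commission example can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_interval_two_means` and its parameter names are illustrative assumptions, not part of the course ToolPak.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_two_means(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """CI for mu1 - mu2 when both population standard deviations are known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # critical value z_{alpha/2}
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se

# Commission example: sample 1 = female (n1 = 60), sample 2 = male (n2 = 100)
lcl, ucl = z_interval_two_means(33000, 30250, 7000, 7000, 60, 100)
print(round(lcl, 1), round(ucl, 1))
```

The result agrees with (509.5, 4990.5) up to the rounding of z to 1.96 in the hand calculation.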
Example
Using BUSS7902 Analysis ToolPak:

Example
• Example: (Long Service Leave) Recall the long service leave
example in Chapter 6. The company offers two options for
benefiting from long service leave: (1) claim the days and
take paid leave, or (2) cash in the long service leave for the
salary equivalent. Understanding the difference between
the two is critical for allocating work-hours and estimating
cash budgets. Over the past 36 quarters, since the program
started, the average number of days of paid leave taken as
long service leave is 73 per quarter. The average number of
long service leave days claimed as salary equivalent is 62
per quarter. For now, assume σ1 = σ2 = 6 days of leave per
quarter. Find a 99% confidence interval (CI) for the
difference in the population means of the two options.
Solution: First, we assume that the numbers of days taken
as leave or cashed in are normally distributed, or
approximately normal (via the CLT), and that the two random
samples are independent.
Let µ1 (take leave) and µ2 (cashed in) be the two unknown
population means.
Example
Using BUSS7902 Analysis ToolPak:

Sample Size Determination
For reference only
• To determine the sample size required to estimate $\mu_{\bar{X}_1 - \bar{X}_2}$ within
the error bound B, we set
\[ B = z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
We cannot solve this single equation for two unknowns, so we set n1 = n2. Hence,
\[ n_1 = n_2 = \left(\frac{z_{\alpha/2}}{B}\right)^2 \left(\sigma_1^2 + \sigma_2^2\right) \]
• Note: If the variances are not known, they can be approximated
(conservatively) from the ranges of the two populations, provided the
ranges are known.
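The sample size formula above can be sketched as a small helper. The function name and the inputs in the usage line (σ1 = σ2 = 7000 as in the commission example, and an error bound B = 2000) are illustrative assumptions, not values from the lecture. The result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_means(sigma1, sigma2, bound, confidence=0.95):
    """Common sample size n1 = n2 needed to estimate mu1 - mu2 within +/- bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return ceil((z / bound) ** 2 * (sigma1**2 + sigma2**2))

# Hypothetical planning scenario: sigma1 = sigma2 = 7000, error bound B = 2000
n = sample_size_two_means(7000, 7000, 2000)
print(n)
```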
Estimating μ1 – μ2
When the Population
Variances are Unknown
(SSK 11.2)

Two Populations – Unknown Variances

• Assumptions:
1. Populations are normal or approximately normal
(CLT for large n), N(μ1, σ1²) and N(μ2, σ2²).
2. The two random samples are independent.
3. Population variances σ1² and σ2² are unknown.
Two Populations – Unknown Variances
• Case 1: σ1² and σ2² are unknown and σ1² = σ2² = σ², say
Note that the sample variance s1² is an unbiased estimate
of σ1² and the sample variance s2² is an unbiased estimate
of σ2². Since σ1² = σ2² = σ², s1² and s2² can be combined to
provide a better estimate of σ².
• Definition: If σ1² = σ2² = σ², a pooled estimate of σ² is given by
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]
• A 100(1 – α)% CI for $\mu_{\bar{X}_1 - \bar{X}_2}$ is
\[ (\text{LCL}, \text{UCL}) = \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \]
Two Populations – Unknown Variances
• Example: (SSK-7 Exercise 11.12) Independent random
samples of 15 observations from each of two normal
distributions were taken, with the summary statistics

$\bar{x}_1$ = 1.48, $\bar{x}_2$ = 1.23, s1 = 0.18, s2 = 0.14

Assume that the two populations have the same variance.
Find a 90% CI for the difference in the population means.
Solution: The pooled estimate of the variance σ² is
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{14 \times 0.18^2 + 14 \times 0.14^2}{28} = 0.026 \]
Two Populations – Unknown Variances
A 90% CI for $\mu_{\bar{X}_1 - \bar{X}_2}$ is
\[ \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}
 = 1.48 - 1.23 \pm 1.701 \times \sqrt{0.026\left(\frac{1}{15} + \frac{1}{15}\right)}
 = 0.25 \pm 0.10 = (0.15,\ 0.35) \]
Note: $t_{0.05,\,28}$ = 1.701
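The pooled-variance interval for Exercise 11.12 can be reproduced with a short sketch. The function names are illustrative assumptions, and the critical value t = 1.701 is passed in from the t table quoted in the slides rather than computed, so that only the standard library is needed.

```python
from math import sqrt

def pooled_variance(s1, s2, n1, n2):
    """Pooled estimate of the common variance from two sample standard deviations."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

def pooled_t_interval(xbar1, xbar2, s1, s2, n1, n2, t_crit):
    """CI for mu1 - mu2 assuming equal variances; t_crit = t_{alpha/2, n1+n2-2}."""
    sp2 = pooled_variance(s1, s2, n1, n2)
    margin = t_crit * sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

sp2 = pooled_variance(0.18, 0.14, 15, 15)
lcl, ucl = pooled_t_interval(1.48, 1.23, 0.18, 0.14, 15, 15, t_crit=1.701)
print(round(sp2, 3), round(lcl, 2), round(ucl, 2))
```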
Two Populations – Unknown Variances
Using BUSS7902 Analysis ToolPak:

Two Populations – Unknown Variances
• Case 2: σ1² and σ2² are unknown and σ1² ≠ σ2²
If the population variances are unknown and unequal, then the 100(1 –
α)% CI for $\mu_{\bar{X}_1 - \bar{X}_2}$ is given by
\[ (\text{LCL}, \text{UCL}) = \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,df}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
where a suggested value for df is
\[ df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} \]
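The suggested degrees of freedom are tedious to compute by hand, so a small sketch may help. The function name `welch_df` is an illustrative assumption; the inputs are the sample variances and sample sizes from the breakfast cereal example that appears later in these notes (s1² = 20376.18, n1 = 10, s2² = 214004, n2 = 20).

```python
def welch_df(s1_sq, s2_sq, n1, n2):
    """Approximate degrees of freedom for the unequal-variances t procedure."""
    a = s1_sq / n1                     # estimated variance of the first sample mean
    b = s2_sq / n2                     # estimated variance of the second sample mean
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

df = welch_df(20376.18, 214004, 10, 20)
print(round(df, 2))
```

The value is generally not an integer. It is close to 25 here, which is consistent with the critical value 1.7081 used in that example.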
Two Populations – Unknown Variances
Using BUSS7902 Analysis ToolPak:

Known Variances vs Unknown Variances

• Q: How do I know whether or not σ1² = σ2²?

A: Using an F-test in Excel Analysis ToolPak
Test H0: σ1² = σ2² vs HA: σ1² ≠ σ2²

This hypothesis test will be discussed later.
Matched Pairs Experiment
(SSK 11.3)

Two Dependent Samples
• In the previous sections, we derived point and interval
estimates for the difference of two population means when
the two random samples are independent. In some
situations, the two samples are taken from the same
subjects (or from pairs of subjects with similar
characteristics) and are therefore not independent. We call
this a matched pairs experiment, with n1 = n2 = n, say.
- Population means: μ1 and μ2
- Objective: Estimate μD = μ1 – μ2
- Let XD = X1 – X2 for each pair of observations
- Let σD² = V[XD], and let SD be the sample standard deviation of XD
- Assumption: XD is normally distributed
Two Dependent Samples
• Distribution of XD:
\[ X_D \sim N\left(\mu_D,\ \sigma_D^2\right) \quad\text{or}\quad X_D \sim N\left(\mu_1 - \mu_2,\ \sigma_D^2\right) \]
so the sample mean of the differences satisfies $\bar{X}_D \sim N(\mu_D,\ \sigma_D^2/n)$.
• Point estimate of μ1 – μ2: $\bar{x}_D$
• Interval estimate of μ1 – μ2:
\[ (\text{LCL}, \text{UCL}) = \bar{x}_D \pm t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}} \]
• Note: A new random sample of n observations, xD1, xD2, …,
xDn, is obtained where xDi = x1i – x2i, and the sample mean
and sample standard deviation are $\bar{x}_D$ and sD, respectively.
• Note: We don’t care about the variances of X1 and X2.
Example
• Example: (SSK-7 Exercise 11.24) Construct a 95% CI for μ1 –
μ2 of a matched pairs experiment. The data come from
normal distributions and are given below.

Solution: n = 5, $\bar{x}_D$ = 3.8, sD = 3.11. A 95% CI for μ1 – μ2 is
\[ \bar{x}_D \pm t_{\alpha/2,\,n-1}\frac{s_D}{\sqrt{n}} = 3.8 \pm 2.776 \times \frac{3.11}{\sqrt{5}} = (-0.06,\ 7.66) \]
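As a check on the matched pairs interval, here is a minimal sketch working from the summary statistics of the differences. The function name is an illustrative assumption, and the critical value 2.776 (t with 4 degrees of freedom, α/2 = 0.025) is taken from the worked solution rather than computed.

```python
from math import sqrt

def paired_t_interval(dbar, s_d, n, t_crit):
    """CI for mu_D from the mean and standard deviation of the n paired differences."""
    margin = t_crit * s_d / sqrt(n)
    return dbar - margin, dbar + margin

lcl, ucl = paired_t_interval(3.8, 3.11, 5, t_crit=2.776)
print(round(lcl, 2), round(ucl, 2))
```

Because the interval contains 0, the data do not rule out μ1 = μ2 at the 95% confidence level.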
Example
• Using BUSS7902 Analysis ToolPak:

• A 95% CI for μ1 – μ2 is (-0.0616, 7.6616)
Estimating the Difference of
Two Population Proportions
p1 – p2
(SSK 11.4)

Two Population Proportions

• Assumptions:
1. Populations are binomial, Bin(n1, p1) and Bin(n2, p2)
2. The two random samples are independent
• Population proportions: p1 and p2
• Objective: Estimate p1 – p2
Two Population Proportions
• Sample proportions: $\hat{p}_1 = \dfrac{X_1}{n_1}$ and $\hat{p}_2 = \dfrac{X_2}{n_2}$
• Point estimate of p1 – p2: $\hat{p}_1 - \hat{p}_2$
• Sampling distribution of $\hat{p}_1 - \hat{p}_2$ (approximately):
\[ \hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2,\ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right) \]
• Requirements:
n1 ≥ 30, n2 ≥ 30, n1p1 ≥ 5, n2p2 ≥ 5, n1(1 – p1) ≥ 5, n2(1 – p2) ≥ 5.
• Interval estimate of p1 – p2: A 100(1 – α)% CI for p1 – p2 is
\[ \hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]
Examples
• Example: (SSK-7 Exercise 11.33) A random sample of n1 =
200 from population 1 produced X1 = 50 successes, and
another independent random sample of n2 = 100 from
population 2 produced X2 = 35 successes. Construct a 95%
CI for the difference between the population proportions.
Solution:
\[ \hat{p}_1 = \frac{x_1}{n_1} = \frac{50}{200} = 0.25, \qquad \hat{p}_2 = \frac{x_2}{n_2} = \frac{35}{100} = 0.35 \]
A 95% CI for p1 – p2 is
\[ 0.25 - 0.35 \pm 1.96 \times \sqrt{\frac{0.25 \times 0.75}{200} + \frac{0.35 \times 0.65}{100}} = (-0.2111,\ 0.0111) \]
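The interval for Exercise 11.33 can be verified with a short standard-library sketch. The function name `two_proportion_interval` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample CI for p1 - p2 from success counts x1, x2 and sample sizes n1, n2."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)     # z_{alpha/2}
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)     # unpooled standard error
    diff = p1 - p2
    return diff - z * se, diff + z * se

lcl, ucl = two_proportion_interval(50, 200, 35, 100)
print(round(lcl, 4), round(ucl, 4))
```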
Examples
• Using BUSS7902 Analysis ToolPak:
Sample Size Determination
For Reference Only
• To determine the sample size required to estimate p1 – p2
within the error bound B, we set
\[ B = z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]
We cannot solve this single equation for two unknowns, so we set
n1 = n2. Hence, we choose
\[ n_1 = n_2 = \left(\frac{z_{\alpha/2}}{B}\right)^2 \left(\hat{p}_1(1 - \hat{p}_1) + \hat{p}_2(1 - \hat{p}_2)\right) \]
or, more conservatively,
\[ n_1 = n_2 = \frac{1}{2}\left(\frac{z_{\alpha/2}}{B}\right)^2 \]
i.e. take $\hat{p}_1 = \hat{p}_2 = 0.5$.
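Both versions of the sample size rule can be sketched in one helper, since the conservative rule is the general one with both proportions set to 0.5. The function name and the error bound B = 0.05 are illustrative assumptions; the planning values 0.25 and 0.35 are borrowed from Exercise 11.33 purely for comparison.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(bound, confidence=0.95, p1=0.5, p2=0.5):
    """Common sample size n1 = n2 needed to estimate p1 - p2 within +/- bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # z_{alpha/2}
    return ceil((z / bound) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)))

n_conservative = sample_size_two_proportions(0.05)                 # p1 = p2 = 0.5
n_planned = sample_size_two_proportions(0.05, p1=0.25, p2=0.35)    # prior guesses
print(n_conservative, n_planned)
```

The conservative choice never gives a smaller n, because p(1 – p) is largest at p = 0.5.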
Chapter Summary (Textbook)

BUSS7902
Chapter 5B
Statistical Inference III (HT)
Unit Coordinator: A/Prof Boris Choy

BUSINESS SCHOOL

Discipline of
Business Analytics
2024 S1
Lecture Outline
6. Hypothesis test for two population means μ1 and μ2 with
independent samples – σ1 and σ2 are known
7. Hypothesis test for two population means μ1 and μ2 with
independent samples – σ1 and σ2 are unknown but equal
8. Hypothesis test for two population means μ1 and μ2 with
independent samples – σ1 and σ2 are unknown and unequal
9. Hypothesis test for two population means μ1 and μ2 with
dependent samples
10. Hypothesis test for two population proportions p1 and p2
with independent samples
References: (SSK-7/8) 13.1 – 13.3
Excel Analysis ToolPak & BUSS7902 Analysis ToolPak
✓ Expected lecture time: 1 hour
Testing μ1 – μ2
with Independent Samples
(SSK 13.1)

Distribution Theory
• Recall: Let X1 ~ N(μ1, σ1²) and X2 ~ N(μ2, σ2²), where X1 and X2 are
independent. Two independent samples of sizes n1 and n2,
respectively, are taken (say X11, X12, …, X1n1 and X21, X22, …, X2n2)
and the sample means are
\[ \bar{X}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} X_{1i} \quad\text{and}\quad \bar{X}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} X_{2i} \]
It is known that
\[ \bar{X}_1 \sim N\left(\mu_1,\ \frac{\sigma_1^2}{n_1}\right) \quad\text{and}\quad \bar{X}_2 \sim N\left(\mu_2,\ \frac{\sigma_2^2}{n_2}\right) \]
Note: If the distributions are not normal but the sample sizes
are greater than 30, then the CLT is used and the sampling
distribution of each sample mean can be approximated by the
normal distribution.
Distribution Theory
• Since X1 and X2 are independent, we have
\[ \bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right) \]
- Case 1: If σ1 and σ2 are known, then
\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1) \]
- Case 2: If σ1 and σ2 are unknown but equal, then
\[ T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t_{n_1+n_2-2}(0,1) \]
Distribution Theory
where the pooled estimator of the variance is
\[ S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} \]
- Case 3: If σ1 and σ2 are unknown and unequal, then SSK suggests
\[ T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim t_{df}(0,1) \]
where
\[ df = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{S_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{S_2^2}{n_2}\right)^2} \]
Hypothesis Test on µ1 - µ2
• Consider the hypotheses H0: μ1 – μ2 = d vs HA: μ1 – μ2 ≠ d (or
μ1 – μ2 > d or μ1 – μ2 < d). The test statistic of the test is
determined by the assumption on the population
variances.

- Case 1: σ1 and σ2 are known

Under H0, the test statistic is
\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1) \]
At the α significance level, H0 is rejected if
\[ |z_0| = \left|\frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}\right| > z_{\alpha/2} \]
Hypothesis Test on µ1 - µ2
• For the alternative hypothesis HA: μ1 – μ2 > d, the decision rule is
to reject H0 at significance level α if
\[ z_0 = \frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} > z_{\alpha} \]
• For the alternative hypothesis HA: μ1 – μ2 < d, the decision
rule is to reject H0 at significance level α if
\[ z_0 = \frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} < -z_{\alpha} \]
Hypothesis Test on µ1 - µ2
• Case 2: σ1 and σ2 are unknown but equal
Under H0, the test statistic is
\[ T = \frac{\bar{X}_1 - \bar{X}_2 - d}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t_{n_1+n_2-2}(0,1) \]
At the α significance level, H0 is rejected if
\[ |t_0| = \left|\frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}\right| > t_{n_1+n_2-2,\,\alpha/2} \]
Hypothesis Test on µ1 - µ2
• For the alternative hypothesis HA: μ1 – μ2 > d, the decision
rule is to reject H0 at significance level α if
\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} > t_{n_1+n_2-2,\,\alpha} \]
• For the alternative hypothesis HA: μ1 – μ2 < d, the decision
rule is to reject H0 at significance level α if
\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} < -t_{n_1+n_2-2,\,\alpha} \]
Hypothesis Test on µ1 - µ2
• Case 3: σ1 and σ2 are unknown and unequal
Under H0, the test statistic is
\[ T = \frac{\bar{X}_1 - \bar{X}_2 - d}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim t_{df}(0,1) \]
At the α significance level, H0 is rejected if
\[ |t_0| = \left|\frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}\right| > t_{df,\,\alpha/2} \]
Hypothesis Test on µ1 - µ2
• If the alternative hypothesis is HA: μ1 – μ2 > d, then the
decision rule is to reject H0 at significance level α if
\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} > t_{df,\,\alpha} \]
• If the alternative hypothesis is HA: μ1 – μ2 < d, then the
decision rule is to reject H0 at significance level α if
\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2 - d}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} < -t_{df,\,\alpha} \]
Hypothesis Test on µ1 – µ2
• Recall: Procedure of Hypothesis Testing
✓ Specify H0 and HA
✓ Determine the test statistic and its distribution
✓ Specify the significance level α
✓ Define the decision rule
✓ Calculate the value of the test statistic
✓ Make a decision and answer the question
Example
• Example: (Location of New Department Store, SSK-7 13.1)
Solution: (Case 1: σ1 and σ2 are known)
Using Excel:
Command: Data > Data Analysis > z-Test: Two Sample for Means
Example
Using BUSS7902 Analysis ToolPak:

Test H0: μ1 – μ2 = 0 vs HA: μ1 – μ2 > 0. Since p-value ≈ 0 <
0.05, H0 is rejected at α = 0.05 and we conclude that the
mean household income in Logan City exceeds that of
Ipswich.
Example
• Example: (Designs of Desk, SSK-7 13.3) The plant manager of a
company that manufactures office equipment believes that
worker productivity is a function of the design of the job. Two
designs are being considered for the production of a new type
of ergonomic computer desk. Twenty-five workers are randomly
selected to assemble the desk using design A, and another
twenty-five workers are randomly selected to assemble the desk
using design B. Assembly times in minutes are recorded.
Design A: 6.8 5.0 7.9 5.2 7.6 5.0 5.9 5.2 6.5 7.4 6.1 6.2
7.1 4.6 6.0 7.1 6.1 5.0 6.3 7.0 6.4 6.1 6.6 7.7 6.4
Design B: 5.2 6.7 5.7 6.6 8.5 6.5 5.9 6.7 6.6 4.2 4.2 4.5
5.3 7.9 7.0 5.9 7.1 5.8 7.0 5.7 5.9 4.9 5.3 4.2 7.1
Example
The manager would like to know whether or not the assembly
times of the two designs are different. Assume that the two
assembly times are normally distributed and have the same
variance. Test the hypotheses at the 5% level of significance.

Solution: (Case 2: σ1² = σ2²) Let μ1 and μ2 be the mean assembly
times of design A and design B, respectively. Test H0: μ1 – μ2 = 0
vs HA: μ1 – μ2 ≠ 0. The sample statistics are

$\bar{x}_1$ = 6.288, $\bar{x}_2$ = 6.016, s1 = 0.921, s2 = 1.14

The pooled estimate of the common variance is
\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{24 \times 0.921^2 + 24 \times 1.14^2}{48} = 1.074 \]
Example
The decision rule is to reject H0 at the 5% level of
significance if
\[ |t_0| = \left|\frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}\right| > t_{48,\,0.025} \approx 2.009 \]
Now,
\[ t_0 = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{6.288 - 6.016}{\sqrt{1.074\left(\dfrac{1}{25} + \dfrac{1}{25}\right)}} = 0.93 \]
Example
Since 0.93 < 2.009, H0 is not rejected at the 5% level of
significance: there is insufficient evidence to conclude that the
mean assembly times of the two designs are different.
Using Excel Command:
Data > Data Analysis > t-Test: Two-Sample Assuming Equal Variances
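The desk-design test can be reproduced from the summary statistics with a short sketch. The function name is an illustrative assumption, and the critical value 2.009 is the table value quoted in the solution rather than something computed here.

```python
from math import sqrt

def pooled_t_statistic(xbar1, xbar2, s1, s2, n1, n2, d=0.0):
    """Equal-variances t statistic for H0: mu1 - mu2 = d."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)   # pooled variance
    return (xbar1 - xbar2 - d) / sqrt(sp2 * (1 / n1 + 1 / n2))

t0 = pooled_t_statistic(6.288, 6.016, 0.921, 1.14, 25, 25)
reject = abs(t0) > 2.009          # two-sided rule at the 5% level
print(round(t0, 2), reject)
```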
Example
Using BUSS7902 Analysis ToolPak:

Example
• Example: (Breakfast Cereal Part II, SSK-7 13.2) σ1² ≠ σ2²
A scientist claims that people who eat high-fibre cereal for
breakfast will consume, on average, fewer kilojoules for lunch
than people who do not eat high-fibre cereal for breakfast. To
test this claim preliminarily, 30 people were randomly selected
and asked what they regularly eat for breakfast and lunch. The
data are as follows.
Consumers of high-fibre cereal – kilojoules for lunch X1
2560 2420 2116 2364 2384 2256 2460 2240 2540 2492
Non-consumers of high-fibre cereal – kilojoules for lunch X2
2008 2812 2940 2828 2092 2136 3072 2504 2480 2356
2944 2260 2744 2116 2528 3804 2976 2528 2372 3388
Example
Test, at the 5% level of significance, whether the scientist’s claim
is correct. Assume that the two populations are normal.
Solution: (Case 3: σ1² ≠ σ2²) Let μ1 and μ2 be the mean kilojoule intakes
for lunch of consumers and of non-consumers, respectively.
Note that the population variances are unknown and we assume
that they are unequal. The sample means and sample variances
are
$\bar{x}_1$ = 2383.2, $\bar{x}_2$ = 2644.4, s1² = 20376.18, s2² = 214004
Test H0: μ1 – μ2 = 0 vs HA: μ1 – μ2 < 0
Under H0, the test statistic is
\[ T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim t_{df}(0,1) \]
Example
Using Excel Command:
Data > Data Analysis > t-Test: Two-Sample Assuming Unequal Variances

Since t0 = -2.314 < -1.7081 (or p-value = 0.0146 < 0.05), H0 is
rejected at the 5% level of significance, and we conclude that the
data support the scientist’s claim.
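The test statistic reported by Excel can be reproduced from the summary statistics of the cereal example (n1 = 10 consumers, n2 = 20 non-consumers). This is a minimal sketch; the function name is an illustrative assumption, and the critical value 1.7081 is taken from the Excel output quoted above.

```python
from math import sqrt

def unequal_variance_t_statistic(xbar1, xbar2, s1_sq, s2_sq, n1, n2, d=0.0):
    """Unequal-variances t statistic for H0: mu1 - mu2 = d (sample variances as inputs)."""
    return (xbar1 - xbar2 - d) / sqrt(s1_sq / n1 + s2_sq / n2)

t0 = unequal_variance_t_statistic(2383.2, 2644.4, 20376.18, 214004, 10, 20)
reject = t0 < -1.7081             # lower-tailed rule at the 5% level
print(round(t0, 3), reject)
```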
Example
Using BUSS7902 Analysis ToolPak:

Example
• Example: (Compare Tyre Design I, SSK-7 13.4) A tyre
manufacturer would like to know whether the new steel-belted
tyre lasts longer than the existing-design tyre. Two new-design tyres
were installed on the rear wheels of 20 randomly selected cars
and two existing-design tyres were installed on the rear wheels
of another 20 randomly selected cars. Drivers of the cars were
told to drive in their usual way until the tyres wore out. The
distances driven by each driver are recorded below.
New-design tyre (in thousands of kilometres)
70 83 78 46 74 56 74 52 99 57 77 84 72 98 81 63 88 69 54 97
Existing-design tyre (in thousands of kilometres)
47 65 59 61 75 65 73 85 97 84 72 39 72 91 64 63 79 74 76 43
Example
Can the company infer that the new-design tyre will last longer
on average than the existing design? Use α = 0.05 and assume
that the two populations of tyre lifetimes are normal with equal
variance.
Solution: Let µ1 and µ2 be the mean lifetimes (in thousands of km) of
the new-design and existing-design tyres, respectively.
• Test H0: µ1 = µ2 vs HA: µ1 > µ2
Using Excel and the p-value approach:
Since p-value = 0.1849 > 0.05,
H0 is not rejected at α = 0.05: there is
insufficient evidence to conclude that the new-design tyre
lasts longer on average.
Example
Descriptive statistics:

Solution: (Case 2: σ12 = σ22)

Testing μ1 – μ2
with Dependent Samples
(SSK 13.2)

Matched Pairs Experiment
• Recall the Matched Pairs Experiment.
First sample: X11, X12, …, X1n from a population with mean μ1
Second sample: X21, X22, …, X2n from a population with mean μ2
Since the samples are dependent, we define
XDi = X1i – X2i for i = 1, …, n
Assumption: XDi ~ N(μD = μ1 – μ2, σD²) where σD² is
unknown. If this is not true, use the CLT if n > 30.
Test H0: μD = d vs HA: μD ≠ d (or μD > d or μD < d)
Under H0, the test statistic is
\[ T = \frac{\bar{X}_D - d}{S_D/\sqrt{n}} \sim t_{n-1}(0,1) \]
Matched Pairs Experiment
• For HA: μD ≠ d, the decision rule is to reject H0 at the α level of significance if
\[ |t_0| = \left|\frac{\bar{x}_D - d}{s_D/\sqrt{n}}\right| > t_{n-1,\,\alpha/2} \]
• For HA: μD > d, H0 is rejected at the α level of significance if
\[ t_0 = \frac{\bar{x}_D - d}{s_D/\sqrt{n}} > t_{n-1,\,\alpha} \]
• For HA: μD < d, H0 is rejected at the α level of significance if
\[ t_0 = \frac{\bar{x}_D - d}{s_D/\sqrt{n}} < -t_{n-1,\,\alpha} \]
Example
• Example: (Compare Tyre Design II, SSK-7 13.5) Consider a
different experimental design. On 20 randomly selected cars,
one tyre of each type was installed on the rear wheels and the
cars were driven until the tyres wore out. The distances driven
until the tyres wore out are recorded below.
New-design tyre (in thousands of kilometres)
65 72 110 70 90 95 69 70 82 70 108 98 91 92 94 70 75 48 79 86
Existing-design tyre (in thousands of kilometres)
56 58 97 64 87 83 58 57 78 74 106 94 86 98 106 66 66 49 69 91
Does the new-design tyre last longer on average than the
existing-design tyre?
Example
Solution: Note that each pair of observations is collected from the same car
and hence the two samples are dependent. This is a matched pairs experiment.
Let XD be the difference between new-design and existing-design
distances. We assume that XD is normally distributed. The observed
values of XD are
9 14 13 6 3 12 11 13 4 -4 2 4 5 -6 -12 4 9 -1 10 -5
and the sample mean and sample standard deviation are 4.55 and
7.22, respectively.
Test H0: µ1 = µ2 (or μD = 0) vs HA: µ1 > µ2 (or μD > 0)
H0 is rejected at the 5% level of significance if the test statistic satisfies
\[ t_0 = \frac{\bar{x}_D}{s_D/\sqrt{20}} > t_{19,\,0.05} = 1.729 \]
Example
Now,
\[ t_0 = \frac{4.55}{1.61} = 2.83 > 1.729 \]
Hence, H0 is rejected at the 5% level of significance, and we
conclude that the new-design tyre lasts longer on average than the
existing-design tyre.
Using Excel Command:

Data > Data Analysis > t-Test: Paired Two Sample for Means
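The matched pairs calculation can also be carried out directly from the raw tyre data. The sketch below uses only the standard library; the variable names are illustrative, and the critical value 1.729 is the table value quoted in the solution.

```python
from math import sqrt
from statistics import mean, stdev

new = [65, 72, 110, 70, 90, 95, 69, 70, 82, 70,
       108, 98, 91, 92, 94, 70, 75, 48, 79, 86]
existing = [56, 58, 97, 64, 87, 83, 58, 57, 78, 74,
            106, 94, 86, 98, 106, 66, 66, 49, 69, 91]

diffs = [a - b for a, b in zip(new, existing)]   # x_Di = x_1i - x_2i for each car
d_bar = mean(diffs)                              # sample mean of the differences
s_d = stdev(diffs)                               # sample standard deviation (n - 1 divisor)
t0 = d_bar / (s_d / sqrt(len(diffs)))            # test statistic for H0: mu_D = 0
reject = t0 > 1.729                              # upper-tailed rule, alpha = 0.05, df = 19
print(round(d_bar, 2), round(s_d, 2), round(t0, 2), reject)
```

Without intermediate rounding the statistic is about 2.82; the hand calculation gives 2.83 because the standard error is rounded to 1.61 first. The decision is the same either way.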
Example

• p-value = 0.0055. Since p-value < 0.05, H0 is rejected at α =
0.05 (also at α = 0.01). The new-design tyre lasts longer
on average than the existing-design tyre.
Example
Using BUSS7902 Analysis ToolPak:
Summary statistics of the difference

d=0

Example
d=1

d=2

What conclusions can you draw at α = 0.05?
Testing p1 – p2
(SSK 13.3)

Distribution Theory
• Recall: Let X1 ~ Bin(n1, p1) and X2 ~ Bin(n2, p2), where X1 and X2 are
independent. It is known that the two sample proportions satisfy
\[ \hat{p}_1 = \frac{X_1}{n_1} \sim N\left(p_1,\ \frac{p_1(1 - p_1)}{n_1}\right) \quad\text{and}\quad \hat{p}_2 = \frac{X_2}{n_2} \sim N\left(p_2,\ \frac{p_2(1 - p_2)}{n_2}\right) \]
approximately. Since the samples are independent, we have
\[ \hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2,\ \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}\right) \]
approximately. Required conditions:

n1 ≥ 30, n2 ≥ 30, n1p1 ≥ 5, n2p2 ≥ 5, n1(1 – p1) ≥ 5, n2(1 – p2) ≥ 5
Hypothesis Test on p1 - p2
• Case 1: Consider the hypotheses H0: p1 – p2 = d vs
HA: p1 – p2 ≠ d (or p1 – p2 > d or p1 – p2 < d) where d ≠ 0.
Under H0, the test statistic is
\[ Z = \frac{\hat{p}_1 - \hat{p}_2 - d}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}} \sim N(0,1) \]
At the α significance level, H0 is rejected if
\[ |z_0| = \left|\frac{\hat{p}_1 - \hat{p}_2 - d}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}\right| > z_{\alpha/2} \]
Hypothesis Test on p1 - p2
• If the alternative hypothesis is HA: p1 – p2 > d, then the
decision rule is to reject H0 if
\[ z_0 = \frac{\hat{p}_1 - \hat{p}_2 - d}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} > z_{\alpha} \]
• If the alternative hypothesis is HA: p1 – p2 < d, then the
decision rule is to reject H0 if
\[ z_0 = \frac{\hat{p}_1 - \hat{p}_2 - d}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} < -z_{\alpha} \]
Hypothesis Test on p1 - p2
• Case 2 (d = 0): Consider the hypotheses H0: p1 – p2 = 0 vs
HA: p1 – p2 ≠ 0 (or p1 – p2 > 0 or p1 – p2 < 0).
Under H0, p1 = p2 = p, say, and a pooled estimate of p is
given by
\[ \hat{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{\text{Total number of successes}}{\text{Total number of trials}} \]
and the test statistic is
\[ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0,1) \]
Hypothesis Test on p1 - p2
At the α significance level, H0 is rejected if
\[ |z_0| = \left|\frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}\right| > z_{\alpha/2} \]
Hypothesis Test on p1 - p2
• If the alternative hypothesis is HA: p1 – p2 > 0, then the decision
rule is to reject H0 if
\[ z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} > z_{\alpha} \]
• If the alternative hypothesis is HA: p1 – p2 < 0, then the
decision rule is to reject H0 if
\[ z_0 = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} < -z_{\alpha} \]
Example
• Example: (SSK-7 Exercise 13.56) Random samples from
two independent binomial populations yield the following
statistics
$\hat{p}_1$ = 0.45, $\hat{p}_2$ = 0.40, n1 = 100, n2 = 100
(a) Use the p-value approach to test whether the two
population proportions are different at the 5% level of
significance.
(b) Repeat (a) when n1 = n2 = 400. Briefly describe the
effect of increasing the sample size on the p-value.

Solution: (a) Test H0: p1 = p2 vs HA: p1 ≠ p2
Example
The pooled estimate of the population proportion is
\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{100 \times 0.45 + 100 \times 0.40}{100 + 100} = 0.425 \]
Under H0, the observed value of the test statistic is
\[ |z_0| = \frac{0.45 - 0.40}{\sqrt{0.425(1 - 0.425)\left(\dfrac{1}{100} + \dfrac{1}{100}\right)}} = 0.7153 \]
p-value ≈ 2 × P(Z > 0.72) = 2 × 0.2358 = 0.47. Since p-value
> 0.05, H0 is not rejected at α = 0.05: there is insufficient
evidence to conclude that the two population proportions are different.
Example
(b) When n1 = n2 = 400, the pooled estimate of the population
proportion is
\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{400 \times 0.45 + 400 \times 0.40}{400 + 400} = 0.425 \]
Under H0, the observed value of the test statistic is
\[ |z_0| = \frac{0.45 - 0.40}{\sqrt{0.425(1 - 0.425)\left(\dfrac{1}{400} + \dfrac{1}{400}\right)}} = 1.43 \]
p-value ≈ 2 × P(Z > 1.43) = 2 × 0.0764 = 0.1528. With the same
sample proportions, increasing the sample sizes reduces the p-value.
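Parts (a) and (b) differ only in the sample sizes, so a small helper makes the effect on the p-value easy to see. This is a minimal standard-library sketch; the function name `pooled_z_test` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

def pooled_z_test(p1_hat, p2_hat, n1, n2):
    """Two-sided z test of H0: p1 = p2 using the pooled proportion."""
    p_pool = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    z0 = (p1_hat - p2_hat) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))   # two-tailed p-value
    return z0, p_value

z_a, p_a = pooled_z_test(0.45, 0.40, 100, 100)   # part (a)
z_b, p_b = pooled_z_test(0.45, 0.40, 400, 400)   # part (b)
print(round(z_a, 4), round(p_a, 4))
print(round(z_b, 4), round(p_b, 4))
```

The p-values differ slightly from the table-based values because the table look-up rounds z to two decimal places.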
Example
Using BUSS7902 Analysis ToolPak:

Example
• Example: (SSK-7 Exercise 13.60) In a random sample of 500 TV
sets from a large production line, there were 80 defective sets.
In a random sample of 200 TV sets from a second production
line, there were 10 defective sets. Test, at the 5% level of
significance, whether the proportion of defective sets from the
first production line exceeds the proportion of defective sets
from the second production line by more than 3%.
Solution: Let p1 and p2 be the population proportions of
defective sets in the first (i.e. large) and second production lines.
Given x1 = 80, x2 = 10, n1 = 500 and n2 = 200, the
sample proportions are
\[ \hat{p}_1 = \frac{80}{500} = 0.16 \quad\text{and}\quad \hat{p}_2 = \frac{10}{200} = 0.05 \]
Example
Test H0: p1 – p2 = 0.03 vs HA: p1 – p2 > 0.03
Under H0, the test statistic is
\[ Z = \frac{\hat{p}_1 - \hat{p}_2 - 0.03}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}} \sim N(0,1) \]
At the 5% level of significance, H0 is rejected if
\[ z_0 = \frac{\hat{p}_1 - \hat{p}_2 - 0.03}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} > z_{0.05} = 1.645 \]
Example
Now,
\[ z_0 = \frac{0.16 - 0.05 - 0.03}{\sqrt{\dfrac{0.16 \times 0.84}{500} + \dfrac{0.05 \times 0.95}{200}}} = 3.555 > 1.645 \]
Therefore, H0 is rejected at the 5% level of significance, and
we conclude that the proportion of defective sets from the
first production line exceeds the proportion of defective
sets from the second production line by more than 3%.
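For a nonzero hypothesised difference d the standard error is not pooled. The TV-set calculation can be sketched as follows; the function name is an illustrative assumption and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion_difference(x1, n1, x2, n2, d):
    """Upper-tailed z test of H0: p1 - p2 = d against HA: p1 - p2 > d (d != 0)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # unpooled standard error
    z0 = (p1 - p2 - d) / se
    p_value = 1 - NormalDist().cdf(z0)                   # upper-tail area
    return z0, p_value

z0, p_value = z_test_proportion_difference(80, 500, 10, 200, d=0.03)
print(round(z0, 3), round(p_value, 4))
```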
Example
• Using BUSS7902 Analysis ToolPak:

Since p-value = 0.0002 < 0.05, H0 is rejected at the 5% level of
significance.
Note: H0 can also be rejected at α = 0.01 and 0.001.
Chapter Summary (Textbook)
