AES Lecture5 Testing

The document outlines a lecture on statistical inference and hypothesis testing. It discusses the concepts of hypothesis testing, including one-sample and two-sample tests. It details the steps of hypothesis testing including formulating hypotheses, selecting a significance level, calculating test statistics, finding critical values, and making a decision about whether to reject the null hypothesis.

Uploaded by

Fariha Khan
Copyright
© © All Rights Reserved

Applied Economics and Statistics

Lecture 5: Statistical Inference – Hypothesis Testing

Dr. Mary Dawood


Lecturer in Economics

1 / 58
Outline

Introduction to Hypothesis Testing

One-sample Hypothesis Testing

Two-sample Hypothesis Testing – Paired

Two-sample Hypothesis Testing - Unpaired

2 / 58
Introduction to Hypothesis Testing

Reference: Lind et al., Chapters 10 & 11

3 / 58
What is Hypothesis Testing

With confidence intervals we estimate the population parameters using the sample statistics

A further step would be to conduct tests about a claimed value of a parameter, or to compare different populations

Hypothesis: a statement about the value of a population parameter subject to verification

Hypothesis Testing: a procedure based on sample evidence and probability theory to determine whether a hypothesis is statistically valid

Confidence intervals and hypothesis testing are complementary procedures in inferential statistics

4 / 58
What is Hypothesis Testing

Types of Hypothesis Tests


1. one-sample test: tests whether a particular claimed value of a population parameter is true
   - Ex: The grades of economics students are significantly higher than the average of the entire BBS

2. paired two-sample test: tests the difference between two sample groups drawn from the same population
   - Ex: The grades of economics students have improved after introducing the new teaching platform

3. unpaired two-sample test: tests the difference between samples drawn from different populations
   - Ex: The grades of economics students in the UK campus are significantly higher than those in the Dubai campus

5 / 58
What is Hypothesis Testing

One-sample: one sample tested against its population (Sample → Population)

Two-sample – Paired: two samples from the same population (Sample 1, Sample 2 → Population)

Two-sample – Unpaired: two samples from different populations (Sample 1 → Population 1, Sample 2 → Population 2)

6 / 58
One-sample Hypothesis Testing

Reference: Lind et al., Chapters 10 & 11

7 / 58
Preparation for the Test

Although the population mean is unknown, there may be a claim that it assumes a hypothesized value µ0

To test this hypothesis we take a sample and use the evidence it provides (x̄) to decide whether or not to reject the claim

Testing requirements:
1. random sampling is employed
2. level of measurement is at least interval
3. sampling distribution is normal (Central Limit Theorem)

8 / 58
Procedure of the Test

Hypothesis testing involves five steps:

1. formulating the statistical hypothesis

2. selecting the level of significance

3. computing the test statistic

4. finding critical value

5. taking the decision

9 / 58
1. Formulate the Statistical Hypotheses

Null Hypothesis (H0): a statement about the value of a population parameter
- equality is always part of H0 (e.g. =, ≥, ≤)

Alternative Hypothesis (H1): a counter statement that is accepted if the null hypothesis is rejected
- inequality is always part of H1 (e.g. ≠, >, <)

H0 and H1 are mutually exclusive and collectively exhaustive

H0 claims that there is no difference/change (hence “null”), while H1 carries the burden of proof

10 / 58
1. Formulate the Statistical Hypotheses

rejecting H0 suggests that H1 is true at a given confidence level

if H0 is not rejected, it does not necessarily mean that it is true, rather that there is not sufficient evidence to reject H0

Example, H0 : swans are only white
- observing a black swan proves that black swans exist ⇒ reject H0 based on evidence
- observing only white swans does not prove the non-existence of black swans ⇒ cannot accept H0

11 / 58
1. Formulate the Statistical Hypotheses

Two-tailed test: H0 states that the population mean is not statistically different from µ0
H0 : µ = µ0 vs. H1 : µ ≠ µ0

Right-tailed test: H0 states that the population mean is not statistically larger than µ0
H0 : µ ≤ µ0 vs. H1 : µ > µ0

Left-tailed test: H0 states that the population mean is not statistically smaller than µ0
H0 : µ ≥ µ0 vs. H1 : µ < µ0

12 / 58
1. Formulate the Statistical Hypotheses

a certain training program is claimed to reduce the usual 35 min average production time
H0 : µ ≥ 35 vs. H1 : µ < 35

a filling machine is adjusted to fill boxes with 500 g of a certain drug, but a complaint was filed that it is not working properly
H0 : µ = 500 vs. H1 : µ ≠ 500

a new additive to gasoline should increase the mileage of four-cylinder 2000 cc cars to more than 28 miles/gallon
H0 : µ ≤ 28 vs. H1 : µ > 28

13 / 58
2. Select the Level of Significance

Type I Error: rejecting H0 when it is actually true
- significance level (α): probability of rejecting H0 when it is true
- confidence level (1 − α): probability of not rejecting H0 when it is true

Type II Error: not rejecting H0 when it is actually false
- β is the probability of not rejecting H0 when it is false
- power of the test (1 − β): probability of rejecting H0 when it is false

Trade-off: for a fixed sample size, reducing the probability of making a Type I error generally means increasing the probability of making a Type II error

It is not possible to reduce both except by increasing the sample size
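The trade-off between α and β can be illustrated numerically. The sketch below is a minimal illustration, assuming a right-tailed one-sample Z-test with known σ; the function name and the numbers (µ0 = 65, a true mean of 70, σ = 20) are hypothetical and chosen only for demonstration.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_error_rates(mu0, mu_true, sigma, n, alpha):
    """Return (beta, power) for a right-tailed one-sample Z-test.

    Hypothetical helper: assumes sigma is known and the sampling
    distribution of the sample mean is normal.
    """
    se = sigma / sqrt(n)  # standard error of the sample mean
    # H0 is rejected when the sample mean exceeds this cut-off.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta: chance the sample mean stays below the cut-off although
    # the true mean is mu_true (a Type II error).
    beta = NormalDist(mu_true, se).cdf(cutoff)
    return beta, 1 - beta


# Illustrative numbers only: H0: mu <= 65, true mean 70, sigma 20.
for n in (25, 100):
    for alpha in (0.10, 0.05, 0.01):
        beta, power = right_tailed_error_rates(65, 70, 20, n, alpha)
        print(f"n={n:3d}  alpha={alpha:.2f}  beta={beta:.3f}  power={power:.3f}")
```

For a fixed n, lowering α raises β; raising n lowers β at every α, which is the only way to reduce both error probabilities together.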

14 / 58
3. Calculate the Test Statistic

since x̄ is an unbiased estimate of µ, we can use it to test the statistical distance between µ and the claimed value µ0

test statistic: a value obtained by transforming the data (here x̄) into standardised units

if population variance is known: use the Z-statistic

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$

15 / 58
3. Calculate the Test Statistic

if population variance is unknown but sample size is large: use ‘s’ as estimate for σ ⇒ still use the Z-statistic

$$Z^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

if population variance is unknown and sample size is small: use t-statistic with (n − 1) d.f.

$$t^* = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

16 / 58
4. Find the Critical Value

critical value is the cut-off Z-score or t-score which divides the sampling distribution into the region where H0 is rejected and the region where it is not rejected

critical/rejection region consists of those areas under the sampling distribution that include unlikely outcomes if H0 is true

test statistics (standardised values of x̄) that lie in the rejection region indicate that µ0 is unlikely to be equal to (is significantly different from) the true population mean

the critical value is obtained from the relevant Z-table or t-table at the chosen confidence level

17 / 58
4. Find the Critical Value

Finding the relevant critical value depends on the type of hypothesis test being conducted

- under two-tailed test: the critical value is the score that puts α/2 on either tail of the distribution and (1 − α) in the middle

Critical values: ±Z(0.5−α/2) or ±t(α/2,n−1)

[Figure: standard normal curve for α = 0.05; rejection regions of area .025 beyond the critical values −1.96 and 1.96, "do not reject H0" region of area .95 in the middle]

18 / 58
4. Find the Critical Value

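- under right-tailed test: the critical value is the score that puts α on the right tail of the distribution and (1 − α) to the left

Critical value: Z(0.5−α) or t(α,n−1)

[Figure: standard normal curve for α = 0.05; rejection region of area .05 to the right of the critical value 1.65, "do not reject H0" region of probability .95 to its left]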

19 / 58
4. Find the Critical Value

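- under left-tailed test: the critical value is the score that puts α on the left tail of the distribution and (1 − α) to the right

Critical value: −Z(0.5−α) or −t(α,n−1)

[Figure: standard normal curve for α = 0.05; rejection region to the left of the critical value −1.65, "do not reject H0" region to its right]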
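The tabulated Z critical values for the three tail cases can also be reproduced in code. The sketch below is one possible way to do this with Python's standard library; the function name `z_critical` and its `tail` labels are hypothetical. The standard library has no Student-t quantile function, so t critical values would still be read from the t-table.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Critical Z-score for a given significance level and tail type.

    Hypothetical helper; `tail` is "two", "right" or "left".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha/2 in each tail; return the positive cut-off (use +/-).
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        # alpha in the right tail.
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # alpha in the left tail; the cut-off is negative.
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right' or 'left'")


for tail in ("two", "right", "left"):
    print(tail, round(z_critical(0.05, tail), 3))
```

The one-tailed value at α = 0.05 is approximately 1.645, which printed tables commonly round to 1.65 (or 1.64).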

20 / 58
5. Take the Decision

involves comparing the calculated test statistic (Z* or t*) with the tabulated critical value to determine whether the former lies within or outside of the rejection region

if the test statistic is larger in absolute magnitude than the tabulated critical value (i.e. lies within the critical region)
⇒ reject H0

if the test statistic is lower in absolute magnitude than the tabulated critical value (i.e. lies outside the critical region)
⇒ do not reject H0

in a one-tailed test the sign matters as well: the test statistic must also fall on the side of the distribution stated in H1
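The decision rule can be written as a small function that covers the three tail cases. This is a sketch rather than a standard API: the name `reject_null` and the `tail` labels are hypothetical, and the critical value is assumed to have been looked up in a Z-table or t-table beforehand.

```python
def reject_null(test_stat, critical_value, tail):
    """Return True if H0 is rejected.

    Hypothetical helper; `critical_value` is the tabulated cut-off
    (its sign is ignored) and `tail` is "two", "right" or "left".
    """
    cutoff = abs(critical_value)
    if tail == "two":
        # Rejection region lies in both tails.
        return abs(test_stat) > cutoff
    if tail == "right":
        # Rejection region lies in the right tail only.
        return test_stat > cutoff
    if tail == "left":
        # Rejection region lies in the left tail only.
        return test_stat < -cutoff
    raise ValueError("tail must be 'two', 'right' or 'left'")


print(reject_null(4.5, 1.96, "two"))       # statistic beyond the cut-off
print(reject_null(-0.662, 1.833, "two"))   # statistic inside the cut-offs
```

For one-tailed tests the comparison keeps the sign of the test statistic, so a large statistic in the wrong tail does not lead to rejection.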

21 / 58
Numerical Example

Research question: Are the grades of economics students significantly different from the overall average of the entire BBS?

Information gathered:
- the mean and std. dev. of the student grades in BBS
- the mean of a sample of economics students' grades

BBS grades     Economics grades
µ0 = 65        x̄ = 74
σ = 20         n = 100

22 / 58
Numerical Example

1. Formulate the statistical hypothesis:

H0 : µ = 65 vs. H1 : µ ≠ 65

2. Selecting level of significance:

α = 0.05 ⇒ confidence level = 95%

3. Computing test statistic: σ is known and n is large

$$Z^* = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{74 - 65}{20 / \sqrt{100}} = 4.5$$

4. Finding critical value: two-tailed test

Z(0.5−α/2) = Z0.475 = 1.96

23 / 58
Numerical Example

[Figure: standard normal curve with rejection regions of area .025 beyond the critical values −1.96 and 1.96; the test statistic Z* = 4.5 lies in the right-hand rejection region]

24 / 58
Numerical Example

5. Taking the decision:

|Z*| = 4.5 > Z = 1.96

since the test statistic is larger in magnitude than the critical value (i.e. lies in the rejection region), we reject H0

we conclude that at 95% confidence level the grades of economics students are significantly different from (higher than) the overall average of the entire BBS
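The five steps of this example can be retraced in a few lines of code. The sketch below uses only the figures given in the example (µ0 = 65, σ = 20, x̄ = 74, n = 100, α = 0.05); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example: BBS population and the economics sample.
mu0, sigma = 65, 20
xbar, n = 74, 100
alpha = 0.05  # step 2: significance level

# Step 3: Z-statistic (sigma known, n large).
z_star = (xbar - mu0) / (sigma / sqrt(n))

# Step 4: two-tailed critical value, alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: reject H0 if the statistic lies in either rejection region.
reject = abs(z_star) > z_crit

print(f"Z* = {z_star:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```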

25 / 58
Two-sample Hypothesis Testing – Paired

Reference: Lind et al., Chapters 10 & 11

26 / 58
Preparation for the Test

When comparing two population means, we distinguish between two main cases:

1. paired tests: if samples come from dependent populations
   (a) share the same element units before and after a certain event
   (b) have similar element units that were matched in pairs

2. unpaired tests: if samples come from independent populations
   - do not share any of their element units
   - we distinguish cases where σ's are known or unknown (more on that next lecture)

27 / 58
Preparation for the Test

Paired hypothesis testing involves testing the difference between two samples drawn from the same/dependent population

If we find a large enough difference, we can argue that it did not occur by simple random chance, but rather represents a real difference between the two samples/populations

The test follows similar statistical principles (and procedures) as in the one-sample hypothesis testing

Additional testing requirement: samples must be of equal sizes

28 / 58
Procedure of Paired Test

1. Formulate the Statistical Hypotheses

in two-sample testing, the null and alternative hypotheses compare the two population means

equality is always part of H0, and inequality part of H1

we can have two-tailed, right-tailed, or left-tailed tests

two-tailed: H0 : µ1 = µ2 vs. H1 : µ1 ≠ µ2
right-tailed: H0 : µ1 ≤ µ2 vs. H1 : µ1 > µ2
left-tailed: H0 : µ1 ≥ µ2 vs. H1 : µ1 < µ2

2. Select the Level of Significance

the test is subject to both Type I and Type II errors

the most commonly used levels are: 0.01, 0.05, and 0.1

29 / 58
Procedure of Paired Test

3. Compute the Test Statistic

a) calculate the difference between each pair of observations:

$$d_i = x_{1i} - x_{2i}$$

b) compute mean and standard deviation of the differences

$$\bar{d} = \frac{\sum d_i}{n}, \qquad s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$$

c) standardise the mean difference using the t-statistic

$$t^* = \frac{\bar{d}}{s_d / \sqrt{n}}$$

30 / 58
Procedure of Paired Test

4. Find the Critical Value

involves finding the t-score which defines the critical/rejection region under the sampling distribution of the differences in sample means

under two-tailed test: the critical value puts α/2 on either tail of the distribution

t(α/2,n−1)

under one-tailed tests: the critical value puts α on the corresponding tail of the distribution

t(α,n−1)

31 / 58
Procedure of Paired Test

5. Take the Decision

if the standardised difference in sample means falls within the critical region, i.e. |t*| > t ⇒ reject H0

[Figure: sampling distribution of µ1 − µ2 with a critical (rejection) region in each tail]

32 / 58
Numerical Example

An investor wishes to compare two firms that appraise house values. Using a sample of 10 houses, the results are reported in £000 below. At 5% significance level, can we conclude that firm 1 has higher appraisal of houses than firm 2?

House   Firm 1   Firm 2
1       235      228
2       210      205
3       231      219
4       242      240
5       205      198
6       230      223
7       231      227
8       210      215
9       225      222
10      249      245

33 / 58
Numerical Example

1. Formulate the statistical hypothesis:

H0 : µ1 ≤ µ2 vs. H1 : µ1 > µ2

2. Selecting level of significance:

α = 0.05 ⇒ confidence level = 95%

3. Finding critical value: right-tailed test

t(α,n−1) = t(0.05,9) = 1.833

34 / 58
Numerical Example
House   Firm 1   Firm 2   di    (di − d̄)   (di − d̄)²
1       235      228       7      2.4        5.76
2       210      205       5      0.4        0.16
3       231      219      12      7.4       54.76
4       242      240       2     -2.6        6.76
5       205      198       7      2.4        5.76
6       230      223       7      2.4        5.76
7       231      227       4     -0.6        0.36
8       210      215      -5     -9.6       92.16
9       225      222       3     -1.6        2.56
10      249      245       4     -0.6        0.36
Total                     46      0        174.4

$$\bar{d} = \frac{46}{10} = 4.6, \qquad s_d = \sqrt{\frac{174.4}{10 - 1}} = 4.402$$
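The columns of this table can be checked with a short script. The sketch below recomputes the differences, their mean, the sum of squared deviations and the standard deviation from the appraisal data; the variable names are illustrative.

```python
from statistics import mean, stdev

# Appraisals in £000 for the same 10 houses (paired observations).
firm1 = [235, 210, 231, 242, 205, 230, 231, 210, 225, 249]
firm2 = [228, 205, 219, 240, 198, 223, 227, 215, 222, 245]

# Difference for each house: d_i = x_1i - x_2i.
d = [a - b for a, b in zip(firm1, firm2)]

d_bar = mean(d)                                # mean difference
sum_sq = sum((di - d_bar) ** 2 for di in d)    # sum of squared deviations
s_d = stdev(d)                                 # sample std. dev., divisor n - 1

print("differences:", d)
print(f"sum = {sum(d)}, d_bar = {d_bar:.1f}, "
      f"sum of squares = {sum_sq:.1f}, s_d = {s_d:.3f}")
```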

35 / 58
Numerical Example

4. Compute test statistic:

$$t^* = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{4.6}{4.402 / \sqrt{10}} = 3.305$$

5. Taking the decision:

t* = 3.305 > t = 1.833

since the test statistic is larger than the critical value (i.e. lies in the right-tail rejection region), we reject H0

we conclude that at 95% confidence level firm 1 has higher appraisal of houses than firm 2
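The last two steps of the paired example can be reproduced from the summary figures. This sketch assumes the values d̄ = 4.6, s_d = 4.402 and n = 10 from the example, and takes the critical value t(0.05, 9) = 1.833 from the t-table as a constant, since Python's standard library does not provide t-distribution quantiles.

```python
from math import sqrt

# Summary figures from the example.
d_bar, s_d, n = 4.6, 4.402, 10
t_crit = 1.833  # tabulated t(0.05, 9), right-tailed test

# Standardise the mean difference.
t_star = d_bar / (s_d / sqrt(n))

# Right-tailed test: reject H0 only if t* exceeds the critical value.
reject = t_star > t_crit

print(f"t* = {t_star:.3f}, critical value = {t_crit}, reject H0: {reject}")
```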

36 / 58
Numerical Example

One-sample or two-sample hypothesis testing?

1. Test that youngsters (15-29 years) use Twitter messages more frequently per day than adults (30-55) do.

2. Test of a battery manufacturer's claim that its batteries last for 5000 hours.

3. Test of the difference between the average net income of retail stores in Birmingham and Manchester.

4. Test of the claim that an average child spends 15 hours per week watching TV.

37 / 58
Two-sample Hypothesis Testing - Unpaired

Reference: Lind et al., Chapters 10 & 11

38 / 58
Preparation for the Test

Unpaired hypothesis testing involves testing if there is a significant difference between two samples drawn from independent populations

We distinguish between four main cases:

1. population variances are known

2. population variances are unknown but sample sizes are large

3. population variances are unknown and sample sizes are small, but variances could be assumed to be equal

4. population variances are unknown, sample sizes are small, and variances are unequal
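The four cases amount to a short decision sequence, sketched below. The function name, its boolean arguments and the returned labels are hypothetical; what counts as a "large" sample is left to the caller because the slides do not state a threshold here.

```python
def choose_unpaired_test(variances_known, large_samples, equal_variances):
    """Pick the test statistic for an unpaired two-sample test.

    Hypothetical helper mirroring the four cases; all arguments are
    booleans supplied by the analyst.
    """
    if variances_known:
        return "Z-statistic using the population variances"
    if large_samples:
        return "Z-statistic using s as an estimate for sigma"
    if equal_variances:
        return "t-statistic with pooled variance, d.f. = n1 + n2 - 2"
    return "t-statistic with separate variances and adjusted d.f."


print(choose_unpaired_test(False, True, False))
print(choose_unpaired_test(False, False, True))
```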

39 / 58
Procedure of the Test

1. Formulate the Statistical Hypotheses

in two-sample testing, the null and alternative hypotheses compare the two population means

equality is always part of H0, and inequality part of H1

we can have two-tailed, right-tailed, or left-tailed tests

two-tailed: H0 : µ1 = µ2 vs. H1 : µ1 ≠ µ2
right-tailed: H0 : µ1 ≤ µ2 vs. H1 : µ1 > µ2
left-tailed: H0 : µ1 ≥ µ2 vs. H1 : µ1 < µ2

2. Select the Level of Significance

the test is subject to both Type I and Type II errors

the most commonly used levels are: 0.01, 0.05, and 0.1

40 / 58
Procedure of the Test

3. Compute the Test Statistic: by standardising the difference between the two sample means

a) If population variances are known: use the Z-statistic

$$Z^* = \frac{\bar{x}_1 - \bar{x}_2}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

where $\sigma_{\bar{x}_1 - \bar{x}_2}$ is the standard deviation of the mean-difference sampling distribution

41 / 58
Procedure of the Test

3. Compute the Test Statistic (cont.)

b) If σ² are unknown but sample sizes are large: use s as an estimate for σ ⇒ still use the Z-statistic

$$Z^* = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where $s_{\bar{x}_1 - \bar{x}_2}$ is the estimated standard deviation of the mean-difference sampling distribution

42 / 58
Procedure of the Test

3. Compute the Test Statistic (cont.)

c) If σ² are unknown and sample sizes are small:

both populations must follow normal distribution and have equal variances

equal variances can be tested using F-test (next lecture)

use t-statistic with (n1 + n2 − 2) d.f.

since we cannot use two estimators for one parameter (σ²), we calculate their weighted average (adjusted for both d.f.) and use it as a pooled variance

43 / 58
Procedure of the Test

3. Compute the Test Statistic (cont.)

c) If σ² are unknown and sample sizes are small: (cont.)

$$t^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$

where $s_p^2$ is the estimated pooled variance, calculated as:

$$s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}$$
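The pooled variance is a weighted average of the two sample variances, with the degrees of freedom as weights. The sketch below shows the two formulas as functions; the function names are hypothetical and the inputs in the usage lines are made-up numbers for illustration only.

```python
from math import sqrt


def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Weighted average of two sample variances (weights: n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


def pooled_t(xbar1, xbar2, s1_sq, n1, s2_sq, n2):
    """t-statistic for two small samples assumed to share one variance."""
    sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
    return (xbar1 - xbar2) / sqrt(sp_sq / n1 + sp_sq / n2)


# Illustrative inputs (not from the lecture): the larger sample
# pulls the pooled variance towards its own variance.
print(round(pooled_variance(4.0, 10, 9.0, 20), 3))
print(round(pooled_t(12.0, 10.0, 4.0, 10, 9.0, 20), 3))
```

When both sample variances are equal, the pooled variance equals that common value regardless of the sample sizes.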

44 / 58
Procedure of the Test

3. Compute the Test Statistic (cont.)

d) If σ² are unknown and unequal with small n:

use s as an estimate for σ

use t-statistic with d.f. adjusted downward by a rather complex formula (will be given to you)

$$t^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

$$\text{d.f.} = \frac{\left[(s_1^2/n_1) + (s_2^2/n_2)\right]^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
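The adjusted degrees of freedom are tedious by hand but short in code. The sketch below implements the formula as written above; the function name `adjusted_df` is hypothetical and the inputs in the usage lines are made-up numbers.

```python
def adjusted_df(s1, n1, s2, n2):
    """Adjusted d.f. for the unequal-variance t-test.

    s1 and s2 are sample standard deviations; the result is usually
    not a whole number and is rounded before using the t-table.
    """
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Equal spreads and equal sizes give back n1 + n2 - 2.
print(adjusted_df(3.0, 10, 3.0, 10))
# Unequal spreads push the d.f. downward.
print(round(adjusted_df(3.0, 10, 1.0, 10), 2))
```

The adjustment never exceeds n1 + n2 − 2, which is why the slide describes the d.f. as adjusted downward.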

45 / 58
Procedure of the Test

4. Find the Critical Value

involves finding the Z-score or t-score which defines the critical/rejection region under the sampling distribution of the differences in sample means

under two-tailed test: the critical value puts α/2 on either tail of the distribution

Z(0.5−α/2) or t(α/2,n1+n2−2)

under one-tailed tests: the critical value puts α on the corresponding tail of the distribution

Z(0.5−α) or t(α,n1+n2−2)

46 / 58
Procedure of the Test

5. Take the Decision

if the difference in sample means falls within the critical region, i.e. test statistic is larger in magnitude than the critical value
⇒ reject H0

[Figure: sampling distribution of µ1 − µ2 with a critical (rejection) region in each tail]

47 / 58
Numerical Example 1

A complaint was filed across the UK universities that women are on average paid less than men

Gathering a sample of the wages from both populations, test this claim at 1% significance level

Sample 1 (Men)    Sample 2 (Women)
x̄1 = 12.89        x̄2 = 10.73
s1 = 2.32         s2 = 2.64
n1 = 325          n2 = 316

48 / 58
Numerical Example 1

1. Formulate the statistical hypothesis:

H0 : µ1 ≤ µ2 vs. H1 : µ1 > µ2

2. Selecting level of significance:

α = 0.01 ⇒ confidence level = 99%

3. Computing test statistic: σ unknown but n is large

$$Z^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{12.89 - 10.73}{\sqrt{\dfrac{(2.32)^2}{325} + \dfrac{(2.64)^2}{316}}} = 10.99$$

since sample 1 is men and sample 2 is women, a positive z-score indicates that men have a higher wage

49 / 58
Numerical Example 1

4. Finding critical value: right-tailed test

Z(0.5−α) = Z(0.49) = 2.33

5. Taking the decision:

Z* = 10.99 > Z = 2.33

since the test statistic is larger than the critical value (i.e. lies in the rejection region), we reject H0

we conclude that at 99% confidence level men's wages in UK universities are significantly higher than those of women
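This example can be verified with a few lines of code. The sketch below uses the sample figures given in the example and computes the right-tailed critical value from the standard normal distribution; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample 1 (men) and sample 2 (women) from the example.
xbar1, s1, n1 = 12.89, 2.32, 325
xbar2, s2, n2 = 10.73, 2.64, 316
alpha = 0.01

# Estimated standard deviation of the difference in sample means.
se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
z_star = (xbar1 - xbar2) / se

# Right-tailed critical value: alpha in the right tail.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject = z_star > z_crit

print(f"Z* = {z_star:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```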

50 / 58
Numerical Example 2

A firm assembling lawnmowers uses two different assembly lines

To evaluate both methods, the firm decided to conduct a time and motion study, which assumed the lines had equal variances

At 10% significance level test if there is a significant difference in the mean time of both lines using the results of the study below

Method 1 (minutes): 2, 4, 9, 3, 2

Method 2 (minutes): 3, 7, 5, 8, 4, 3

51 / 58
Numerical Example 2

1. Formulate the statistical hypothesis:

H0 : µ1 = µ2 vs. H1 : µ1 ≠ µ2

2. Selecting level of significance:

α = 0.1 ⇒ confidence level = 90%

3. Finding critical value: two-tailed t-test

t(α/2,n1+n2−2) = t(0.05,9) = 1.833

52 / 58
Numerical Example 2

        x1    x2    x1²   x2²
         2     3      4     9
         4     7     16    49
         9     5     81    25
         3     8      9    64
         2     4      4    16
               3            9
Total   20    30    114   172

$$\bar{x}_1 = \frac{20}{5} = 4, \qquad \bar{x}_2 = \frac{30}{6} = 5$$

$$s_1^2 = \frac{114 - 5(4)^2}{4} = 8.5, \qquad s_2^2 = \frac{172 - 6(5)^2}{5} = 4.4$$

53 / 58
Numerical Example 2

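4. Computing test statistic: σ unknown but equal, with small n ⇒ use pooled variance

$$s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2} = \frac{(5 - 1)\, 8.5 + (6 - 1)\, 4.4}{5 + 6 - 2} = 6.222$$

$$t^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{4 - 5}{\sqrt{\dfrac{6.222}{5} + \dfrac{6.222}{6}}} = -0.662$$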

54 / 58
Numerical Example 2

5. Taking the decision:

|t*| = 0.662 < t = 1.833 ⇒ do not reject H0

[Figure: t* = −0.662 lies in the "do not reject H0" region between the critical values −1.833 and 1.833; rejection regions of area .05 in each tail]

we conclude that at 90% confidence level there is no significant difference in the mean times of the two assembly lines
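The whole lawnmower example can be recomputed from the raw assembly times. The sketch below uses the study data and takes the critical value t(0.05, 9) = 1.833 from the t-table as a constant, since Python's standard library does not provide t-distribution quantiles; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Assembly times in minutes from the time and motion study.
method1 = [2, 4, 9, 3, 2]
method2 = [3, 7, 5, 8, 4, 3]
n1, n2 = len(method1), len(method2)

# Sample means and sample variances (divisor n - 1).
xbar1, xbar2 = mean(method1), mean(method2)
s1_sq, s2_sq = variance(method1), variance(method2)

# Pooled variance and t-statistic with n1 + n2 - 2 d.f.
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
t_star = (xbar1 - xbar2) / sqrt(sp_sq / n1 + sp_sq / n2)

t_crit = 1.833  # tabulated t(0.05, 9), two-tailed test at alpha = 0.1
reject = abs(t_star) > t_crit

print(f"s1^2 = {s1_sq}, s2^2 = {s2_sq}, sp^2 = {sp_sq:.3f}")
print(f"t* = {t_star:.3f}, reject H0: {reject}")
```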

55 / 58
Numerical Example 3

A consumer testing laboratory is evaluating the performance of two brands of batteries (measured in years of life), brand A and B

Test if brand B performs better using the sample results below, given that the samples failed the F-test for equal variances (use d.f. = 11)

               Brand A   Brand B
sample size    9         12
mean           6.44      9.42
st. dev.       3.32      1.62

56 / 58
Numerical Example 3

1. Formulate the statistical hypothesis (1 = brand A, 2 = brand B):

H0 : µ1 ≥ µ2 vs. H1 : µ1 < µ2

2. Selecting level of significance:

α = 0.05 ⇒ confidence level = 95%

3. Computing test statistic: σ unknown and unequal with small n

$$t^* = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{6.44 - 9.42}{\sqrt{\dfrac{(3.32)^2}{9} + \dfrac{(1.62)^2}{12}}} = -2.48$$

57 / 58
Numerical Example 3

4. Finding critical value: left-tailed t-test

−t(α,d.f.) = −t(0.05,11) = −1.796

5. Taking the decision:

t* = −2.48 < −1.796 ⇒ reject H0

since the test statistic lies beyond the critical value in the left tail (i.e. lies inside the rejection region), we reject H0

we conclude that at 95% confidence level battery brand B performs better (lives longer) on average compared to brand A
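The battery example, including the suggested d.f. of 11, can be checked as follows. The sketch below uses the summary statistics from the example and takes the critical value t(0.05, 11) = 1.796 from the t-table as a constant; the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the example (A is sample 1, B is sample 2).
xbar_a, s_a, n_a = 6.44, 3.32, 9
xbar_b, s_b, n_b = 9.42, 1.62, 12

v_a = s_a ** 2 / n_a  # s1^2 / n1
v_b = s_b ** 2 / n_b  # s2^2 / n2

# t-statistic with separate (unpooled) variances.
t_star = (xbar_a - xbar_b) / sqrt(v_a + v_b)

# Adjusted degrees of freedom for unequal variances.
df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))

t_crit = 1.796  # tabulated t(0.05, 11)
# Left-tailed test: reject H0 only if t* falls below -t_crit.
reject = t_star < -t_crit

print(f"t* = {t_star:.2f}, adjusted d.f. = {df:.2f}, reject H0: {reject}")
```

The adjusted d.f. comes out a little below 11, which rounds to the value suggested in the exercise.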

58 / 58
