URBDP 520 Lecture 7 Page 1 of 20

Urban Design and Planning


URBDP 520 Quantitative Methods in Urban Design and Planning

Lecture Notes 7: Hypothesis Testing

Introduction

The whole idea here is that we're going to determine whether we can reject some
assumption about a population given information about some sample. For
example...
You take a sample of 100 light bulbs and determine their mean lifetime. Based on
information from the sample, can you reject the hypothesis that the mean lifetime for
the entire population of bulbs is 2500 hours?
500 students take an exam. You look at 50 exams and get a sample mean. Based on
information from the sample, can you reject the hypothesis that the population mean
is greater than 40?

The way this is all going to work is that we're going to form two hypotheses.
1. The null hypothesis. If the sample value is close to the value stated in the null
hypothesis, the data won't cause us to reject the null hypothesis. We won't actually
accept it, we'll just say that we can't reject it.

2. The alternative hypothesis. If the sample value is far away from the value stated in the
null hypothesis, then the data allow us to say, with some degree of certainty, that the null
hypothesis isn't true. We thus reject the null hypothesis in favor of the alternative.

Looking at the light bulb example from above, if we get a sample mean lifetime that is far
away from 2500 hours, we would probably reject the null hypothesis that the population
mean lifetime is equal to 2500 in favor of the alternative that the lifetime isn't equal to
2500 hours.

Looking at the test example from above, let's say that the null hypothesis is that the
population mean exam score is greater than or equal to 40 and that the alternative
hypothesis is that the population mean is less than 40. Very large sample means wouldn't
allow us to reject the null hypothesis while sample means much less than 40 would allow
us to reject the null hypothesis.

There are three possible forms of the null/alternative hypothesis combination...

H0 : µ ≥ µ 0 , H A : µ < µ 0
H0 : µ ≤ µ 0 , H A : µ > µ 0
H0 : µ = µ 0 , H A : µ ≠ µ 0

Type I and Type II Errors


If you accept a true hypothesis or reject an untrue hypothesis, then you're doing the right
thing. There are two ways in which testing a hypothesis can go wrong...

Type I Error: Rejecting a true null hypothesis.

Type II Error: Failing to reject a false null hypothesis.

The probability of a type I error (that is, the probability that you incorrectly reject a true
null hypothesis) is the level of significance of the hypothesis test. Common levels of
significance are 5% and 1%, which roughly correspond to 95% and 99% confidence
intervals.

Note that as the probability of making one type of error falls, the probability of making
the other type rises. As you struggle to avoid rejecting true hypotheses, it becomes more
likely that you will fail to reject false hypotheses. What you do depends on how critical
each of the two types of error is.
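The trade-off between the two error types can be sketched numerically. The following is a minimal illustration, assuming a large-sample left-tailed test of H0: µ ≥ k; the values k = 10, s = 2, n = 50 and the "true" mean of 9.5 are hypothetical numbers chosen only for illustration, and the function name is not from the notes.

```python
from statistics import NormalDist

def type_ii_error(alpha, k, true_mean, s, n):
    """Probability of failing to reject H0: mu >= k when the true mean is below k."""
    se = s / n ** 0.5
    # We reject H0 when the sample mean falls below this cutoff.
    cutoff = k + NormalDist().inv_cdf(alpha) * se
    # Type II error: the sample mean lands above the cutoff even though H0 is false.
    return 1 - NormalDist(true_mean, se).cdf(cutoff)

# Hypothetical setting: as alpha (Type I error) falls, beta (Type II error) rises.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, k=10, true_mean=9.5, s=2, n=50)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

Under these assumed numbers, lowering the level of significance makes the printed Type II error probability larger, which is the pattern described above.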

For example, donated blood is tested for various diseases. If the null hypothesis is that a
blood sample is not infected versus the alternative that it is, the consequences
surrounding a Type I error are not so bad (good blood is rejected) while the consequences
surrounding a Type II error are severe (bad blood is given to a patient). In this case, it
would be best to use a test with a high level of significance (that is, to accept a high
chance of a Type I error) in order to have a low probability of a Type II error.

One-Sided Tests For A Population Mean


In this case, we're looking at testing the hypotheses...

H0 : µ ≥ k
HA : µ < k

or

H0 : µ ≤ k
HA : µ > k

The key is going to be to determine how far from the proposed value k the sample value
has to be for us to reject the null hypothesis. If it is at least that far away in the
direction of the alternative, we will reject the null.

In order to do this, we will come up with a distribution about the proposed value. We'll
then want to find what value will give us the desired area under a tail of the distribution
to one end.

In the first case, we'll reject the null hypothesis if the sample mean is sufficiently small.

In the second case, we'll reject the null hypothesis if the sample mean is sufficiently
large.

The trick is to figure out how large or small the sample mean needs to be to allow us to
reject the null hypothesis in favor of the alternative.

The answer to this comes from the standard normal distribution (at least in the case of a
large sample). We want to be sure that we reject the null hypothesis incorrectly with a
probability of α. For the first statement of the null and alternative hypotheses,

H0 : µ ≥ k
HA : µ < k

this means that we're looking for a value that leaves an area of α in the left tail of the
distribution centered on k.

For the second statement of the null and alternative hypotheses,

H0 : µ ≤ k
HA : µ > k

this means that we're looking for a value that leaves an area of α in the right tail of the
distribution centered on k.

To determine whether or not the sample means are sufficiently large or small, we will
first need to convert them to a test statistic, which means we will convert them to a z or a
t value according to the following formula:

$$z = \frac{\bar{x} - k}{s/\sqrt{n}}$$
The critical value is the value which will just allow us to reject the null hypothesis in
favor of the alternative. These critical values will come right out of the standard normal
table in the case of a large sample or out of the t-distribution table in the case of a small
sample.

For a one-sided hypothesis test with a large sample, the critical values are:

Level of Significance (α) Critical Value


0.10 1.282
0.05 1.645
0.025 1.960
0.01 2.326
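The table values can be reproduced with the inverse of the standard normal cumulative distribution. This is a small sketch using Python's standard library; the function name is an illustrative choice, not something from the notes.

```python
from statistics import NormalDist

def one_sided_critical_value(alpha):
    """z value that leaves an area of alpha in the upper tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha)

# For a left-tailed test, use the negative of each value.
for alpha in (0.10, 0.05, 0.025, 0.01):
    print(f"{alpha:<6} {one_sided_critical_value(alpha):.3f}")
```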

Once again, we'll be subtracting a value and then dividing by the sample standard error.
This is exactly what we did in forming an interval for a sample mean.

EX: Consider the following hypothesis test...


H0 : µ ≥ 10
H A : µ < 10

A sample with n=50 provides a sample mean of 9.46 and a sample standard deviation of
2.
Is this sufficiently smaller than 10 to allow us to reject the null hypothesis?

A. At α=0.05 (a level of significance of 0.05 or 5%), what is the critical value for z?
A level of significance of 0.05 means that the area in the left tail of the distribution is
0.05.

Going back to the standard normal random variable table, we see that this indicates a z
value of -1.645.

So, if the value of the test statistic (the z value) is less than -1.645, we will reject the
null hypothesis in favor of the alternative hypothesis.

We'll take the sample mean minus the value in the hypothesis and divide by the standard
error (the sample standard deviation divided by the square root of n)...

B. The test statistic is

$$z = \frac{\bar{x} - k}{s/\sqrt{n}} = \frac{9.46 - 10}{2/\sqrt{50}} = \frac{-0.54}{2/7.07} = -1.91$$

OK, so we were looking at a null hypothesis that the population mean was at least
10. Our sample mean was less than 10, but was it enough less? The answer is yes,
because the z-value we calculated (-1.91) is bigger (in absolute terms) than the critical
value of -1.645. So, at the 5% level of significance, we reject the null hypothesis in
favor of the alternative that the population mean is less than 10.

C. Calculate the p-value.

The p-value is the area under the distribution curve beyond the value of the test
statistic. This is the level of significance at which the data would just allow you to
reject the null hypothesis.

If the p-value is less than the level of significance originally asked for, reject the null.

If the p-value is greater than the original level of significance, don't reject the null.

In this case, we had a test statistic of -1.91. The p-value is the area under the standard
normal distribution curve to the left of -1.91.

Because the area under the curve to the left of -1.91 is 0.0281, the p-value in this case
is 0.0281. This is less than the 5% level of significance we were originally asked to
use in testing the hypothesis, so we rejected the null hypothesis.

In fact, we would reject the null hypothesis at any level of significance greater than
0.0281 and not reject the null at any level of significance less than 0.0281.
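The three steps of this example (critical value, test statistic, p-value) can be sketched in a few lines of Python. This is only an illustration under the large-sample assumption; the function name and return layout are choices made here, and the computed values may differ slightly from hand-rounded table values.

```python
from math import sqrt
from statistics import NormalDist

def lower_tail_z_test(x_bar, k, s, n, alpha):
    """Large-sample test of H0: mu >= k against HA: mu < k."""
    z = (x_bar - k) / (s / sqrt(n))          # test statistic
    critical = NormalDist().inv_cdf(alpha)   # negative critical value for the left tail
    p_value = NormalDist().cdf(z)            # area to the left of z
    return z, critical, p_value, z < critical

# Numbers from the example: n = 50, sample mean 9.46, sample standard deviation 2.
z, critical, p_value, reject = lower_tail_z_test(9.46, 10, 2, 50, alpha=0.05)
print(f"z = {z:.3f}, critical value = {critical:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
```

Comparing z with the critical value and comparing the p-value with α always lead to the same decision, so either check alone would do.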

EX: n=40, x-bar=16.5, s=7. Test the following hypothesis against its alternative at the
2% level of significance.
H0: µ<=15
Ha: µ>15

A. At α=0.02, what is the critical value for z and what is the rejection rule?

2.055, reject if z>2.055

B. Compute the value of the test statistic z

$$z = \frac{\bar{x} - k}{s/\sqrt{n}} = \frac{16.5 - 15}{7/\sqrt{40}} = \frac{1.5}{7/6.32} = 1.35$$

So don't reject the null hypothesis.

C. What is the p-value?


Looking at the standard normal table, we see that
P(z > 1.35) = 0.5000 - 0.4115 = 0.0885

Again, because the p-value is greater than α, we don't reject the null hypothesis.

EX: Demographers will tell you that for a population to replace itself, fertility rates (the
number of children the average woman has) need to be at least 2.1. The Government of a
country which shall remain nameless is concerned about this trend. They commission a
survey in which they first ask women whether they plan to have any children in the
future. Those who answer no are then asked how many children they have already had.
Among 217 women who have completed their planned fertility, the mean number of
children was 1.98 with a standard deviation of 1.1. The Government is considering
taking some drastic measures to increase fertility, but will only do this if there is
sufficient evidence.
Ignoring problems with this research technique, do the data provide sufficient evidence
that the fertility rate is below 2.1?

H0 : µ ≥ 2.1
H A : µ < 2.1

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.98 - 2.1}{1.1/\sqrt{217}} = \frac{-0.12}{1.1/14.73} = -1.607$$

Because this is a one-sided test, the p-value is 0.0537, which means that the null
hypothesis would not be rejected at the 5% level. Thus, there is not significant
evidence that the fertility rate is below 2.1, but just barely.

Two-sided Tests About a Population Mean

The point here is to see how far from the hypothesized value the sample mean actually is.
This, again, is done by generating a z-statistic and then seeing if it is sufficiently large or
small to reject the null hypothesis.

In contrast to the one-tailed test, we will reject the null hypothesis if the value of the test
statistic is sufficiently large or sufficiently small:

The critical values in this case will be different from the critical values for the one-sided
hypothesis test:

Level of Significance (α) Critical Value


0.10 1.645
0.05 1.960
0.02 2.326
0.01 2.575

EX: n = 36, x-bar = 11, s = 2.5. Test the following hypothesis against its alternative at
a 5% level of significance:

H0 : µ = 10
H A : µ ≠ 10

A. For the two-tailed test, we need the area in each tail to be equal to the significance
level divided by two. Here, α=0.05 (significance level of 5%), so each tail has an area of
0.025 and the rejection rule is to reject if the sample mean is far enough away from 10 in
either direction. In this case, reject if the z-value is less than -1.96 or greater than 1.96.

B. Calculate the value of the test statistic:

$$z = \frac{\bar{x} - k}{s/\sqrt{n}} = \frac{11 - 10}{2.5/\sqrt{36}} = \frac{1}{2.5/6} = 2.40$$

So, because the test statistic is greater than 1.96, you can reject the null hypothesis in
favor of the alternative.

C. Calculate the p-value for this hypothesis test:

With a two-sided hypothesis test, the p-value is the area to the right of the positive value
of the test statistic plus the area to the left of the negative value of the test statistic. In
this case, the value of the test statistic was 2.40. The area under the standard normal
distribution to the right of 2.40 is 0.0082, so the p-value is 2 x 0.0082 = 0.0164.
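A two-sided test doubles the tail area beyond the absolute value of the test statistic. The sketch below reruns the example with Python's standard library; the function name is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(x_bar, k, s, n, alpha):
    """Large-sample test of H0: mu = k against HA: mu != k."""
    z = (x_bar - k) / (s / sqrt(n))
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # both tails beyond |z|
    return z, critical, p_value, abs(z) > critical

# Numbers from the example: n = 36, sample mean 11, s = 2.5, hypothesized mean 10.
z, critical, p_value, reject = two_sided_z_test(11, 10, 2.5, 36, alpha=0.05)
print(f"z = {z:.2f}, critical values = +/-{critical:.2f}, p-value = {p_value:.4f}")
```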

EX: x-bar = 16.32, s = 0.8, n = 30. Test the following hypothesis at a significance level
of 5%.

H0 : µ = 16
H A : µ ≠ 16

A. The decision rule at a significance level of 0.05 is reject the null hypothesis if the test
statistic is less than -1.96 or greater than 1.96. (These will always be the critical values
for a two-tailed test with significance of 5%).

B.

$$z = \frac{\bar{x} - k}{s/\sqrt{n}} = \frac{16.32 - 16}{0.8/\sqrt{30}} = 2.19$$

Because this is greater than the upper critical value of 1.96, reject H0 in favor of the
alternative hypothesis that the mean is not equal to 16.

The p-value in this case is 2 x 0.0143 = 0.0286

C. If x-bar = 15.82

$$z = \frac{\bar{x} - k}{s/\sqrt{n}} = \frac{15.82 - 16}{0.8/\sqrt{30}} = -1.23$$

Because this lies between the critical values of -1.96 and 1.96, do not reject H0.

p-value = 2(0.5000-0.3907) = 0.2186.



Hypothesis Tests with Small Samples


The only difference when doing hypothesis testing with small samples is that you get
critical values from the t-distribution table rather than the standard normal random
variable table. The degrees of freedom are n-1.

p-values are difficult to calculate when you are using a small sample because the t-
distribution tables are not set up for this purpose. However, when doing hypothesis
testing in SPSS or other software packages, the p-value will be automatically reported.
Packages always do t-tests with the appropriate number of degrees of freedom.

EX: Consider a sample with x-bar = 8.0, s = 2, n = 20. Do the following hypothesis tests.

A. α=0.05
H0 : µ ≥ 7.8
HA : µ < 7.8
Because this is a one-sided (left-tailed) test, the critical value is -t0.05,19 = -1.729

$$t = \frac{8.0 - 7.8}{2/\sqrt{20}} = \frac{0.2}{2/4.47} = 0.447$$

Because this is not less than the critical value of -1.729, do not reject the null hypothesis.

B. α=0.05
H0 : µ = 7.5
HA : µ ≠ 7.5
Because this is a two-sided test, the critical value is t0.025,19=2.093

$$t = \frac{8.0 - 7.5}{2/\sqrt{20}} = \frac{0.5}{2/4.47} = 1.1175$$

Because this is less than the critical value of 2.093, do not reject the null hypothesis.

C. α=0.10
H0 : µ ≤ 7.0
HA : µ > 7.0
Because this is a one-sided test, the critical value is t0.10,19=1.328

$$t = \frac{8.0 - 7.0}{2/\sqrt{20}} = \frac{1}{2/4.47} = 2.235$$

Because this is greater than the critical value of 1.328, reject the null hypothesis in favor
of the alternative.

D. α=0.05
H0 : µ ≤ 7.2
HA : µ > 7.2
Because this is a one-sided test, the critical value is t0.05,19=1.729

$$t = \frac{8.0 - 7.2}{2/\sqrt{20}} = \frac{0.8}{2/4.47} = 1.788$$

Because this is greater than the critical value of 1.729, reject the null hypothesis in favor
of the alternative.

E. α=0.05
H0 : µ = 7.2
HA : µ ≠ 7.2
Because this is a two-sided test, the critical value is t0.025,19=2.093

$$t = \frac{8.0 - 7.2}{2/\sqrt{20}} = \frac{0.8}{2/4.47} = 1.788$$

Because this is less than the critical value of 2.093, do not reject the null hypothesis.

The important point illustrated by parts D and E is that it may be possible for a sample to
suggest that a population is significantly greater than or less than some number without
actually being significantly different from it.
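The contrast between parts D and E can be checked with a short sketch. Python's standard library has no t-distribution, so the critical values below are simply copied from the t-table values quoted in the example (19 degrees of freedom); the function and constant names are illustrative.

```python
from math import sqrt

def t_statistic(x_bar, mu_0, s, n):
    """Small-sample test statistic with n - 1 degrees of freedom."""
    return (x_bar - mu_0) / (s / sqrt(n))

# Critical values taken from the t-table values quoted in the example (df = 19).
T_ONE_SIDED_05 = 1.729   # t(0.05, 19), used in part D
T_TWO_SIDED_05 = 2.093   # t(0.025, 19), used in part E

t = t_statistic(8.0, 7.2, 2, 20)
reject_one_sided = t > T_ONE_SIDED_05        # part D: HA is mu > 7.2
reject_two_sided = abs(t) > T_TWO_SIDED_05   # part E: HA is mu != 7.2

print(f"t = {t:.3f}")
print("one-sided test:", "reject H0" if reject_one_sided else "do not reject H0")
print("two-sided test:", "reject H0" if reject_two_sided else "do not reject H0")
```

The same test statistic clears the one-sided threshold but not the two-sided one, because the two-sided test splits α across both tails.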

Hypothesis Tests About A Population Proportion


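This is basically the same as testing a mean. The hypotheses take one of three forms

H0 : p = p0
HA : p ≠ p0

H0 : p ≥ p0
HA : p < p0

H0 : p ≤ p0
HA : p > p0

and the test statistic, where p-bar is the sample proportion, is

$$z = \frac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$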

EX: n = 200, p-bar = 160/200 = 0.80, p0 = 0.91. Test the following hypothesis at the 5% level.

H0 : p ≥ 0.91
H A : p < 0.91

Because this is a one-sided test, at a 5% level of significance, reject the null hypothesis if
the test statistic is less than -1.645.

$$z = \frac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} = \frac{0.80 - 0.91}{\sqrt{0.91 \cdot 0.09/200}} = -5.44$$

Reject the null hypothesis in favor of the alternative.
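As a rough check of the proportion example, the test statistic and its left-tail p-value can be computed as below. The function name is an assumption made for illustration; note that the standard error uses the hypothesized proportion p0, not the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_bar, p_0, n):
    """Test statistic for a hypothesis about a population proportion."""
    return (p_bar - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Numbers from the example: 160 successes out of 200, hypothesized proportion 0.91.
z = proportion_z(160 / 200, 0.91, 200)
p_value = NormalDist().cdf(z)   # left tail, because HA is p < 0.91
print(f"z = {z:.2f}, p-value = {p_value:.2e}")
print("reject H0" if z < -1.645 else "do not reject H0")
```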



EX: According to an article from the Seattle Post-Intelligencer (Wednesday, February
14, 2001, page E1) a new federal study of how vehicles interact in crashes suggests that
Ford Explorers are more likely to kill the drivers of cars they impact than are other
similarly-sized SUVs. The rate of death for car drivers who impacted Explorers was
0.010 while the rate of death for other SUVs was 0.006. If this study was based on
analysis of 1000 Explorer crashes, is the Explorer rate significantly higher than the rate
for other SUVs?

To answer this, let's generate the test statistic first, then get a p-value and see what
the answer implies.

H0 : p ≤ 0.006
HA : p > 0.006

$$z = \frac{0.010 - 0.006}{\sqrt{0.006 \cdot 0.994/1000}} = \frac{0.004}{\sqrt{0.005964/1000}} = \frac{0.004}{0.002442} = 1.638$$

Because this is a one-sided test, the p-value is about 0.0507.


The data do not suggest that Explorers kill drivers of other cars at a significantly
higher rate than do other similarly-sized SUVs at a 5% level of significance, but the
results are very close.

Hypothesis Tests About A Population Variance


In addition to testing a hypothesis about the population mean or proportion based on what
you know about the sample mean or proportion, you can also do the same thing with the
population variance. The critical assumption here is that the population is normal, or at
least approximately normal.

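In all three cases the test statistic, with n − 1 degrees of freedom, is

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$

The rejection region depends on the form of the hypotheses. Here $\chi^2_a$ is the table
value with an area of a to its right.

H0 : σ² ≥ σ0², HA : σ² < σ0²     Reject if $\chi^2 < \chi^2_{1-\alpha}$
H0 : σ² ≤ σ0², HA : σ² > σ0²     Reject if $\chi^2 > \chi^2_{\alpha}$
H0 : σ² = σ0², HA : σ² ≠ σ0²     Reject if $\chi^2 < \chi^2_{1-\alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$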

To be honest, I don't really know why you'd want to do this (outside of maybe a quality
assurance environment) but here it is in black and white.

EX: For reasons too complicated to explain here, you wind up as quality control
manager at an ammunition plant. You're monitoring the amount of powder going into
some bullets. You take a sample of 81 bullets and find that the amount of powder in each
one averages 0.403 grams with a sample variance of 0.025. The production line
must be shut down and calibrated if the variance is greater than 0.016, but this is a
costly procedure, so it is only done if you are 95% certain that the variance is
above the acceptable limit. Should the line be shut down?

You will only shut down the line if the sample variance is sufficiently large.
The hypotheses are:

H0 : σ 2 ≤ 0.016
H A : σ 2 > 0.016

The critical value is $\chi^2_{0.05,80} = 101.879$

The value of the test statistic is

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2} = \frac{(81 - 1) \cdot 0.025}{0.016} = 125$$

Because this is greater than the critical value, we reject the null hypothesis in favor of the
alternative and shut down the assembly line for readjustment.
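The arithmetic of the variance test is easy to script. In this sketch the critical value is copied from the chi-square table value quoted in the example, since Python's standard library does not provide chi-square quantiles; the names are illustrative.

```python
def variance_chi_square(n, sample_variance, sigma0_squared):
    """Chi-square test statistic for a hypothesis about a population variance."""
    return (n - 1) * sample_variance / sigma0_squared

# Critical value quoted in the example for alpha = 0.05 and 80 degrees of freedom.
CHI2_CRITICAL = 101.879

chi2 = variance_chi_square(n=81, sample_variance=0.025, sigma0_squared=0.016)
shut_down = chi2 > CHI2_CRITICAL   # right-tailed test: HA is variance > 0.016
print(f"chi-square = {chi2:.1f}")
print("shut down the line" if shut_down else "keep the line running")
```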

The Useful Stuff, Chapter 9


Inferences Based on Two Samples
There are some really nasty equations in this chapter, but none of them are important
because machines will be doing all of this for you anyway.

The deal here is that you are trying to see if the means of two different populations are
significantly different by looking at the means of samples drawn from each population.

So, what you're looking at is

$(\bar{x}_1 - \bar{x}_2)$ as an indicator of $(\mu_1 - \mu_2)$.

The sampling distribution of $(\bar{x}_1 - \bar{x}_2)$ is approximately normal for large
samples with mean $(\mu_1 - \mu_2)$ and standard deviation

$$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

where

$\bar{x}_1$ sample mean from first population
$\bar{x}_2$ sample mean from second population
$\mu_1$ population mean for the first population
$\mu_2$ population mean for the second population
$n_1$ sample size for the first sample
$n_2$ sample size for the second sample
$\sigma_1^2$ population variance for the first population
$\sigma_2^2$ population variance for the second population

What you will always be interested in asking is whether the evidence suggests that one
population has a mean which is greater than or different than another.

The test statistic is the same for all three forms of the hypotheses:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

The rejection region depends on the form of the hypotheses:

H0 : µ1 − µ2 ≥ 0, HA : µ1 − µ2 < 0     Reject if $z < -z_{\alpha}$
H0 : µ1 − µ2 ≤ 0, HA : µ1 − µ2 > 0     Reject if $z > z_{\alpha}$
H0 : µ1 − µ2 = 0, HA : µ1 − µ2 ≠ 0     Reject if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

The equations for the small sample test statistics are even nastier.

The whole point, though, is to compare the populations based on the provided samples.

The result of the hypothesis test will be a t-value which will be basically impossible to
interpret, and a p-value, which is usually reported as a value with the label Prob>|T|. If
this is small (usually less than 0.05) then the two sample means are significantly
different from each other.

This depends, in part, on whether the two populations have the same variances or different
variances. It will make a difference in the resulting p-values. Happily, most software
packages will also tell you whether or not the variances appear to be equal. There will be
a number labeled with Prob>F. If this is a small number (again, usually less than 0.05),
then the variances are probably different and you should use the p-value associated with
the variances being unequal.

Here is an example from Excel, where you can find hypothesis tests under the
"Tools/DataAnalysis" menu:

Sample1     Sample2
0.010141    0.898268
0.990306    0.987607
0.305954    0.570143
0.385001    0.603906
0.065999    0.509418
0.915778    0.923419
0.613044    0.292365
0.194624    0.763637
0.185926    0.774173
0.863759    0.148169
0.853136    0.451536
0.532089    0.664071
0.598750    0.285516
0.532470    0.365205
0.793333    0.717621
0.895895    0.114969
0.250961    0.307759
0.796461    1.019315
0.089256    0.822870
0.951113    0.987905
0.049663    0.893837
0.467451    0.930484

t-Test: Two-Sample Assuming Equal Variances

                                Sample1         Sample2
Mean                            0.51550493      0.637826874
Variance                        0.110927604     0.083977939
Observations                    22              22
Pooled Variance                 0.097452772
Hypothesized Mean Difference    0
df                              42
t Stat                          -1.299581795
P(T<=t) one-tail                0.100416238
t Critical one-tail             1.681951289
P(T<=t) two-tail                0.200832476
t Critical two-tail             2.018082341

The t-stat for µ1-µ2 is -1.299, the resulting p-value for a one-tailed test is 0.1004, and
the p-value for a two-tailed test is 0.2008, meaning that at the 5% level of significance the
two population means are not significantly different.

In case you're curious, the population mean for the second sample was 0.03 greater than
the population mean for the first sample.
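The pooled-variance t statistic in the Excel output can be recomputed from the reported means, variances, and sample sizes. The sketch below assumes the standard equal-variance formula (which is what the "Assuming Equal Variances" label implies) and copies the two-tail critical value from the output rather than computing it; the function name is illustrative.

```python
from math import sqrt

def pooled_t_statistic(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t statistic assuming equal population variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, pooled_var, df

# Summary statistics copied from the Excel output above.
t, pooled_var, df = pooled_t_statistic(
    0.51550493, 0.110927604, 22,
    0.637826874, 0.083977939, 22,
)

T_CRITICAL_TWO_TAIL = 2.018082341   # copied from the output, df = 42, alpha = 0.05
reject = abs(t) > T_CRITICAL_TWO_TAIL

print(f"pooled variance = {pooled_var:.9f}, df = {df}, t = {t:.6f}")
print("reject H0" if reject else "do not reject H0")
```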

EX: n=40, x-bar=16.5, s=7


H0: µ<=15
Ha: µ>15

A. At α=0.02 (2% level of significance) the critical value is given by the z value (from
the standard normal table) for which the area to the right of the number is 0.02.

This means that the value in the table would be equal to 0.5000 - 0.0200 = 0.4800, which
occurs at a value of about 2.055. So, we will calculate a test statistic using
$(\bar{x} - \mu_0)/(s/\sqrt{n})$, and if that test statistic is larger than 2.055, we will
reject the null hypothesis.

B. The test statistic is

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{16.5 - 15}{7/\sqrt{40}} = 1.36$$

so you should not reject the null hypothesis.

C. The p-value is the area to the right of 1.36 which is (0.500-0.4131)=0.0869 which is
greater than the level of significance. When the p-value is greater than the level of
significance, you do not reject the null hypothesis.

D. Don't reject the null hypothesis.

EX: n=40, x-bar=7, s=3.2


H0: µ>=8
Ha: µ<8

A. With a level of significance of α=0.05 (5%) the rejection rule is that you reject the
null hypothesis if the test statistic is less than -1.645.

B.

$$z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{7 - 8}{3.2/\sqrt{40}} = -1.98$$

Because z = -1.98 is less than -1.645, reject the null hypothesis.
The weight loss claim is "incorrect."

C. The p-value is 0.5000-0.4761=0.0239.

The Chi-square (χ2) Test


The Chi-square test is just like any other hypothesis test. The null hypothesis in this case
is that the population proportions for a large number of groups (more than two) are all
equal (or equal particular values) against the alternative hypothesis that at least one of
them is not equal to the others.

So, if you have a population that can be divided into multiple groups, you can use a Chi-
square test to determine if they fall into those groups in the proportions or numbers you
expect or if they are divided significantly differently.

The important thing is that this is just another sort of hypothesis test. Doing the test will
generate a p-value that has the same interpretation as the t-test p-value. The null
hypothesis is that things are equal. If you get a small p-value, you reject the null in favor
of the alternative hypothesis that things are not equal.

EX: You are looking at educational achievement for men and women in their 40s. You
know that among men, 5% have not graduated from high school, 40% have graduated
from high school but have no college degree, 45% have a bachelor's degree but no higher
college degree and 10% have a higher college degree. You would like to see if
educational outcomes for women are significantly different from those for men, so you
would use a Chi-square test to see if the women's percentages are significantly different
from the men's percentages.
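To show the mechanics of such a test, here is a minimal sketch. The notes give no data for women, so the observed counts below are entirely hypothetical and exist only to make the calculation concrete. The statistic compares observed and expected counts, and the critical value of 7.815 is the standard chi-square table value for α = 0.05 with 4 − 1 = 3 degrees of freedom.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Sum of (observed - expected)^2 / expected over all groups."""
    total = sum(observed)
    expected = [p * total for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Men's proportions from the example: no high school diploma, high school only,
# bachelor's degree, higher degree.
men_proportions = [0.05, 0.40, 0.45, 0.10]

# HYPOTHETICAL counts for a sample of 200 women (not from the notes).
women_counts = [14, 90, 80, 16]

CHI2_CRITICAL = 7.815   # chi-square table value, alpha = 0.05, df = 3

chi2 = chi_square_goodness_of_fit(women_counts, men_proportions)
print(f"chi-square = {chi2:.3f}")
print("reject H0" if chi2 > CHI2_CRITICAL else "do not reject H0")
```

With these made-up counts the statistic falls below the critical value, so the null hypothesis of equal proportions would not be rejected; real data could of course give a different answer.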
