ABHyp Test
Introduction
The whole idea here is that we're going to determine whether we can reject some
assumption about a population given information about some sample. For
example...
You take a sample of 100 light bulbs and determine their mean lifetime. Based on
information from the sample, can you reject the hypothesis that the mean lifetime for
the entire population of bulbs is 2500 hours?
500 students take an exam. You look at 50 exams and get a sample mean. Based on
information from the sample, can you reject the hypothesis that the population mean
is greater than 40?
The way this is all going to work is that we're going to form two hypotheses.
1. The null hypothesis. If the sample value is close to the value stated in the null
hypothesis, the data won't cause us to reject the null hypothesis. We won't actually
accept it, we'll just say that we can't reject it.
2. The alternative hypothesis. If the sample value is far away from the value stated in the
null hypothesis, then the data allow us to say, with some degree of certainty, that the null
hypothesis isn't true. We thus reject the null hypothesis in favor of the alternative.
Looking at the light bulb example from above, if we get a sample mean lifetime that is far
away from 2500 hours, we would probably reject the null hypothesis that the population
mean lifetime is equal to 2500 in favor of the alternative that the lifetime isn't equal to
2500 hours.
Looking at the test example from above, let's say that the null hypothesis is that the
population mean exam score is greater than or equal to 40 and that the alternative
hypothesis is that the population mean is less than 40. Very large sample means wouldn't
allow us to reject the null hypothesis while sample means much less than 40 would allow
us to reject the null hypothesis.
H0 : µ ≥ µ0 , HA : µ < µ0
H0 : µ ≤ µ0 , HA : µ > µ0
H0 : µ = µ0 , HA : µ ≠ µ0
URBDP 520 Lecture 7 Page 2 of 20
The probability of a type I error (that is, the probability that you incorrectly reject a true
null hypothesis) is the level of significance of the hypothesis test. Common levels of
significance are 5% and 1%, which roughly correspond to 95% and 99% confidence
intervals.
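One way to see what the level of significance means is to simulate it. The sketch below
(Python, standard library only) repeatedly draws samples from a population in which the
null hypothesis really is true and counts how often a two-sided test at α = 0.05 rejects it
anyway. The population standard deviation of 100 hours, the sample size of 50, the number
of trials, and the function name are all assumptions made for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def type_i_error_rate(mu0=2500.0, sigma=100.0, n=50, alpha=0.05,
                      trials=2000, seed=0):
    """Fraction of samples that reject a TRUE null hypothesis mu = mu0.

    sigma, n, trials and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Two-sided critical value: area alpha / 2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null is true here: every sample comes from a mean-mu0 population.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = type_i_error_rate()
print(f"share of true nulls rejected: {rate:.3f}")
```

The printed share should land near 0.05. It tends to sit slightly above that because the
sample standard deviation stands in for the unknown population value.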
Note that as the probability of making one type of error falls, the probability of making
the other type rises. As you struggle to avoid rejecting true hypotheses, it becomes more
likely that you will fail to reject false hypotheses. What you do depends on how critical
each of the two types of error is.
For example, donated blood is tested for various diseases. If the null hypothesis is that a
blood sample is not infected versus the alternative that it is, the consequences
surrounding a Type I error are not so bad (good blood is rejected) while the consequences
surrounding a Type II error are severe (bad blood is given to a patient). In this case, it
would be best to use a high level of significance, accepting a high chance of a Type I
error in order to have a low probability of a Type II error.
H0 : µ ≥ k
HA : µ < k
or
H0 : µ ≤ k
HA : µ > k
The key is going to be to determine how far from the proposed value k the sample value
has to be for us to reject the null hypothesis. If it is further away, we will reject the null.
In order to do this, we will come up with a distribution about the proposed value. We'll
then want to find what value will give us the desired area under a tail of the distribution
to one end.
In the first case, we'll reject the null hypothesis if the sample mean is sufficiently small.
In the second case, we'll reject the null hypothesis if the sample mean is sufficiently
large.
The trick is to figure out how large or small the sample mean needs to be to allow us to
reject the null hypothesis in favor of the alternative.
The answer to this comes from the standard normal distribution (at least in the case of a
large sample). We want the probability of incorrectly rejecting a true null hypothesis to
be α. For the first statement of the null and alternative hypotheses,
H0 : µ ≥ k
HA : µ < k
this means that we're looking for a value that leaves an area of α in the left tail of the
distribution. For the second statement of the null and alternative hypotheses,
H0 : µ ≤ k
HA : µ > k
this means that we're looking for a value that leaves an area of α in the right tail of the
distribution.
To determine whether or not the sample means are sufficiently large or small, we will
first need to convert them to a test statistic, which means we will convert them to a z or a
t value according to the following formula:
$$z = \frac{\bar{x} - k}{s / \sqrt{n}}$$
The critical value is the value which will just allow us to reject the null hypothesis in
favor of the alternative. These critical values will come right out of the standard normal
table in the case of a large sample or out of the t-distribution table in the case of a small
sample.
For a one-sided hypothesis test with a large sample, the critical values are:
Once again, we'll be subtracting a value and then dividing by the sample standard error.
This is exactly what we did in forming an interval for a sample mean.
A sample with n=50 provides a sample mean of 9.46 and a sample standard deviation of
2.
Is this sufficiently smaller than 10 to allow us to reject the null hypothesis?
A. At α=0.05 (a level of significance of 0.05 or 5%), what is the critical value for z?
A level of significance of 0.05 means that the area in the left tail of the distribution is
0.05.
Going back to the standard normal random variable table, we see that this indicates a z
value of -1.645.
So, if the value of the test statistic (the z value) is less than -1.645, we will reject the
null hypothesis in favor of the alternative hypothesis.
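We'll take the sample mean minus the value in the hypothesis and divide by the standard
error of the sample mean (the sample standard deviation divided by the square root of n)...
$$z = \frac{\bar{x} - k}{s / \sqrt{n}} = \frac{9.46 - 10}{2 / \sqrt{50}} = \frac{-0.54}{2 / 7.07} \approx \frac{-0.54}{0.28} \approx -1.93$$
(Here the standard error 2/7.07 ≈ 0.283 has been rounded to 0.28.)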
OK, so we were looking at a null hypothesis that the population mean was at least 10.
Our sample mean was less than 10, but was it enough less? The answer is yes,
because the z-value we calculated (-1.93) is bigger (in absolute terms) than the critical
value of -1.645. So, at the 5% level of significance, we reject the null hypothesis and
conclude that the population mean is less than 10.
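The same decision can be checked in a few lines of Python. This is a minimal sketch using
only the standard library; the function name z_statistic is made up for illustration, and
the critical value comes from statistics.NormalDist rather than from the printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, k, s, n):
    """Test statistic z = (x-bar - k) / (s / sqrt(n))."""
    return (sample_mean - k) / (s / sqrt(n))


# Values from the example: n = 50, x-bar = 9.46, s = 2, hypothesized value k = 10.
z = z_statistic(9.46, 10, 2, 50)

# Left-tail critical value: the z with an area of alpha = 0.05 to its left.
z_crit = NormalDist().inv_cdf(0.05)

# Reject the null when the statistic falls below the critical value.
reject_null = z < z_crit
print(round(z, 2), round(z_crit, 3), reject_null)  # -1.91 -1.645 True
```

Carried at full precision the statistic is about -1.91 rather than -1.93, because the
hand calculation rounds the standard error to 0.28. The decision is the same either way.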
The p-value is the area under the distribution curve beyond the value of the test
statistic. This is the level of significance at which the data would just allow you to
reject the null hypothesis.
If the p-value is less than the level of significance originally asked for, reject the null.
If the p-value is greater than the original level of significance, don't reject the null.
In this case, we had a test statistic of -1.93. The p-value is the area under the standard
normal distribution curve to the left of -1.93.
Because the area under the curve to the left of -1.93 is 0.0268, the p-value in this case
is 0.0268. This is less than the 5% level of significance we were originally asked to
use in testing the hypothesis, so we rejected the null hypothesis.
In fact, we would reject the null hypothesis at any level of significance greater than
0.0268 and not reject the null at any level of significance less than 0.0268.
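The table lookup and the "reject at any level above the p-value" rule can be reproduced
with the standard library's statistics.NormalDist. This is only a sketch; the three levels
of significance tried below are chosen for illustration.

```python
from statistics import NormalDist

# Area under the standard normal curve to the left of the test statistic.
p_value = NormalDist().cdf(-1.93)

# Reject the null whenever the p-value is below the chosen level of significance.
decisions = {alpha: p_value < alpha for alpha in (0.10, 0.05, 0.01)}
print(round(p_value, 4), decisions)
# 0.0268 {0.1: True, 0.05: True, 0.01: False}
```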
EX: n=40, x-bar=16.5, s=7. Test the following hypothesis against its alternative at the
2% level of significance.
H0 : µ ≤ 15
HA : µ > 15
A. At α=0.02, what is the critical value for z and what is the rejection rule?
$$z = \frac{\bar{x} - k}{s / \sqrt{n}} = \frac{16.5 - 15}{7 / \sqrt{40}} = \frac{1.5}{7 / 6.32} \approx 1.36$$
Again, because the p-value is greater than α, we don't reject the null hypothesis.
EX: Demographers will tell you that for a population to replace itself, fertility rates (the
number of children the average woman has) need to be at least 2.1. The Government of a
country which shall remain nameless is concerned about this trend. They commission a
survey in which they first ask women whether they plan to have any children in the
future. Those who answer no are then asked how many children they have already had.
Among 217 women who have completed their planned fertility, the mean number of
children was 1.98 with a standard deviation of 1.1. The Government is considering
taking some drastic measures to increase fertility, but will only do this if there is
sufficient evidence.
Ignoring problems with this research technique, do the data provide sufficient evidence
that the fertility rate is below 2.1?
H0 : µ ≥ 2.1
HA : µ < 2.1
$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{1.98 - 2.1}{1.1 / \sqrt{217}} = \frac{-0.12}{1.1 / 14.73} = -1.607$$
Because this is a one-sided test, the p-value is 0.0537, which means that the null
hypothesis would not be rejected at the 5% level. Thus, there is not significant
evidence that the fertility rate is below 2.1, but just barely.
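To see how close the call is, the sketch below wraps the one-sided large-sample test in a
small helper and tries two levels of significance. The helper name and its "less" /
"greater" argument are invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(sample_mean, k, s, n, alternative):
    """Return (z, p_value) for a large-sample one-sided test of a mean.

    alternative is "less" for HA: mu < k or "greater" for HA: mu > k.
    """
    z = (sample_mean - k) / (s / sqrt(n))
    if alternative == "less":
        p_value = NormalDist().cdf(z)       # area to the left of z
    elif alternative == "greater":
        p_value = 1 - NormalDist().cdf(z)   # area to the right of z
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return z, p_value


# Fertility example: n = 217, x-bar = 1.98, s = 1.1, HA: mu < 2.1.
z, p_value = one_sided_z_test(1.98, 2.1, 1.1, 217, "less")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("reject at 5%: ", p_value < 0.05)
print("reject at 10%:", p_value < 0.10)
```

The computed p-value comes out near 0.054, a shade above the 0.0537 read from the table
at z = -1.61, so the null survives at the 5% level but would be rejected at 10%.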
The point here is to see how far from the hypothesized value the sample mean actually is.
This, again, is done by generating a z-statistic and then seeing if it is sufficiently large or
small to reject the null hypothesis.
In contrast to the one-tailed test, we will reject the null hypothesis if the value of the test
statistic is sufficiently large or sufficiently small:
The critical values in this case will be different from the critical values for the one-sided
hypothesis test:
EX: n = 36, x̄ = 11, s = 2.5. Test the following hypothesis against its alternative at a
5% level of significance:
H0 : µ = 10
HA : µ ≠ 10
A. For the two-tailed test, we need the area in each tail to be equal to the significance
level divided by two. Here, α=0.05 (significance level of 5%), so each tail has an area of
0.025 and the rejection rule is to reject if the sample mean is far enough away from 10 in
either direction. In this case, reject if the z-value is less than -1.96 or greater than 1.96.
$$z = \frac{\bar{x} - k}{s / \sqrt{n}} = \frac{11 - 10}{2.5 / \sqrt{36}} = \frac{1}{2.5 / 6} = 2.40$$
So, because the test statistic is greater than 1.96, you can reject the null hypothesis in
favor of the alternative.
With a two-sided hypothesis test, the p-value is the area to the right of the positive value
of the test statistic plus the area to the left of the negative value of the test statistic. In
this case, the value of the test statistic was 2.40. The area under the standard normal
distribution to the right of 2.40 is 0.0082, so the p-value is 2 x 0.0082 = 0.0164.
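The two-tailed version differs from the one-tailed test in two places: the critical value
uses α/2, and the tail area is doubled to get the p-value. A minimal sketch (the function
name is an assumption, not something from the lecture):

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(sample_mean, k, s, n, alpha=0.05):
    """Return (z, critical value, p-value, reject?) for H0: mu = k."""
    z = (sample_mean - k) / (s / sqrt(n))
    # Area alpha / 2 in each tail, so look up 1 - alpha / 2.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Area beyond |z| in one tail, doubled to cover both tails.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit


# Example values: n = 36, x-bar = 11, s = 2.5, H0: mu = 10.
z, z_crit, p_value, reject = two_sided_z_test(11, 10, 2.5, 36)
print(f"z = {z:.2f}, critical = +/-{z_crit:.2f}, p = {p_value:.4f}, reject = {reject}")
```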
H0 : µ = 16
HA : µ ≠ 16
A. The decision rule at a significance level of 0.05 is reject the null hypothesis if the test
statistic is less than -1.96 or greater than 1.96. (These will always be the critical values
for a two-tailed test with significance of 5%).
B.
$$z = \frac{\bar{x} - k}{s / \sqrt{n}} = \frac{16.32 - 16}{0.8 / \sqrt{30}} = 2.19$$
Because this is greater than the upper critical value of 1.96, reject H0 in favor of the
alternative hypothesis that the mean is not equal to 16.
C. If x̄ = 15.82
$$z = \frac{\bar{x} - k}{s / \sqrt{n}} = \frac{15.82 - 16}{0.8 / \sqrt{30}} = -1.23$$
Because this lies between the critical values of -1.96 and 1.96, do not reject H0.
p-values are difficult to calculate when you are using a small sample because the t-
distribution tables are not set up for this purpose. However, when doing hypothesis
testing in SPSS or other software packages, the p-value will be automatically reported.
Packages always do t-tests with the appropriate number of degrees of freedom.
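A. α=0.05
H0 : µ ≥ 7.8
HA : µ < 7.8
Because this is a one-sided (left-tailed) test, the critical value is -t0.05,19 = -1.729,
and we reject the null hypothesis only if the test statistic falls below it.
$$t = \frac{8.0 - 7.8}{2 / \sqrt{20}} = \frac{0.2}{2 / 4.47} = 0.447$$
Because this is not less than the critical value of -1.729, do not reject the null
hypothesis.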
B. α=0.05
H0 : µ = 7.5
HA : µ ≠ 7.5
Because this is a two-sided test, the critical value is t0.025,19=2.093
$$t = \frac{8.0 - 7.5}{2 / \sqrt{20}} = \frac{0.5}{2 / 4.47} = 1.1175$$
Because this is less than the critical value of 2.093, do not reject the null hypothesis.
C. α=0.10
H0 : µ ≤ 7.0
HA : µ > 7.0
Because this is a one-sided test, the critical value is t0.10,19=1.328
$$t = \frac{8.0 - 7.0}{2 / \sqrt{20}} = \frac{1}{2 / 4.47} = 2.235$$
Because this is greater than the critical value of 1.328, reject the null hypothesis in favor
of the alternative.
D. α=0.05
H0 : µ ≤ 7.2
HA : µ > 7.2
Because this is a one-sided test, the critical value is t0.05,19=1.729
E. α=0.05
H0 : µ = 7.2
HA : µ ≠ 7.2
Because this is a two-sided test, the critical value is t0.025,19=2.093
The important point illustrated by parts D and E is that it may be possible for a sample to
suggest that a population mean is significantly greater than or less than some number
without actually being significantly different from it.
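A quick numerical check of this point, assuming parts D and E use the same sample as
parts A through C (x̄ = 8.0, s = 2, n = 20). That carry-over is an assumption, since the
sample values are not restated for D and E. The critical values are the ones quoted above
for 19 degrees of freedom, and the helper name is invented for illustration.

```python
from math import sqrt


def t_statistic(sample_mean, k, s, n):
    """Test statistic t = (x-bar - k) / (s / sqrt(n))."""
    return (sample_mean - k) / (s / sqrt(n))


# ASSUMED sample, carried over from parts A-C: x-bar = 8.0, s = 2, n = 20.
t = t_statistic(8.0, 7.2, 2, 20)

# Critical values quoted in the text for 19 degrees of freedom.
T_ONE_SIDED = 1.729   # t(0.05, 19)
T_TWO_SIDED = 2.093   # t(0.025, 19)

reject_d = t > T_ONE_SIDED        # part D: HA is mu > 7.2
reject_e = abs(t) > T_TWO_SIDED   # part E: HA is mu != 7.2
print(f"t = {t:.3f}, reject in D: {reject_d}, reject in E: {reject_e}")
```

Under that assumption the same statistic clears the one-sided cutoff but not the
two-sided one, because the two-sided test splits α across both tails.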
H0 : p = p0
HA : p ≠ p0
H0 : p ≥ p0
HA : p < p0
H0 : p ≤ p0
HA : p > p0
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$$
EX: n = 200, p̂ = 160/200 = 0.80, p0 = 0.91. Test the following hypothesis at the 5% level.
H0 : p ≥ 0.91
H A : p < 0.91
Because this is a one-sided test, at a 5% level of significance, reject the null hypothesis if
the test statistic is less than -1.645.
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}} = \frac{0.80 - 0.91}{\sqrt{\frac{0.91 \cdot 0.09}{200}}} = -5.44$$
To answer this, let's generate the test statistic first, then get a p-value and see what
the answer implies.
H0 : p ≤ 0.006
HA : p > 0.006
$$z = \frac{0.010 - 0.006}{\sqrt{\frac{0.006 \cdot 0.994}{1000}}} = \frac{0.004}{\sqrt{\frac{0.005964}{1000}}} = \frac{0.004}{0.002442} = 1.638$$
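Both proportion examples can be run through one small helper. This is a sketch using only
the standard library; the function name and its "less" / "greater" argument are
assumptions, and the p-values are computed here rather than quoted from the notes.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(p_hat, p0, n, alternative):
    """Return (z, p_value) for a large-sample one-sided test of a proportion.

    alternative is "less" for HA: p < p0 or "greater" for HA: p > p0.
    """
    # The standard error uses the hypothesized proportion p0, not p_hat.
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    area_left = NormalDist().cdf(z)
    p_value = area_left if alternative == "less" else 1 - area_left
    return z, p_value


# First example: n = 200, p-hat = 160/200, H0: p >= 0.91.
z1, p1 = proportion_z_test(160 / 200, 0.91, 200, "less")
# Second example: n = 1000, p-hat = 0.010, H0: p <= 0.006.
z2, p2 = proportion_z_test(0.010, 0.006, 1000, "greater")

print(f"example 1: z = {z1:.2f}, reject at 5%: {p1 < 0.05}")
print(f"example 2: z = {z2:.3f}, p-value = {p2:.4f}, reject at 5%: {p2 < 0.05}")
```

In the second example z = 1.638 falls just short of the 1.645 cutoff, so the p-value sits
just above 0.05 and the null hypothesis is narrowly not rejected at the 5% level.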
Hypothesis        H0 : σ² ≥ σ0²           H0 : σ² ≤ σ0²           H0 : σ² = σ0²
                  HA : σ² < σ0²           HA : σ² > σ0²           HA : σ² ≠ σ0²
Test Statistic    χ² = (n − 1)s² / σ0²    χ² = (n − 1)s² / σ0²    χ² = (n − 1)s² / σ0²
Rejection Region  χ² < χ²_(1−α)           χ² > χ²_(α)             χ² < χ²_(1−α/2) or χ² > χ²_(α/2)
(Here χ²_(a) is the table value with area a in the upper tail, with n − 1 degrees of freedom.)
To be honest, I don't really know why you'd want to do this (outside of maybe a quality
assurance environment) but here it is in black and white.
EX: For reasons too complicated to explain here, you wind up as quality control
manager at an ammunition plant. You're monitoring the amount of powder going into
some bullets. You take a sample of 81 bullets and find that the amount of powder in each
one averages 0.403 grams with a sample variance of 0.025 (in squared grams). The
production line must be shut down and calibrated if the variance is greater than 0.016,
but this is a costly procedure, so it is only done if you are 95% certain that the variance
is above the acceptable limit. Should the line be shut down?
You will only shut down the line if the sample variance is sufficiently large.
The hypotheses are:
H0 : σ 2 ≤ 0.016
H A : σ 2 > 0.016
$$\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2} = \frac{(81 - 1) \cdot 0.025}{0.016} = 125$$
Because this is greater than the critical value, we reject the null hypothesis in favor of the
alternative and shut down the assembly line for readjustment.
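The comparison can be sketched in Python. The standard library has no chi-square table,
so the critical value below comes from the Wilson-Hilferty normal approximation. That is
a substitute for the table lookup, not the method used in these notes, and the function
names are made up for illustration.

```python
from statistics import NormalDist


def chi_square_variance_stat(n, sample_var, sigma0_sq):
    """Test statistic chi^2 = (n - 1) * s^2 / sigma0^2."""
    return (n - 1) * sample_var / sigma0_sq


def chi_square_upper_critical(alpha, df):
    """Approximate chi-square value with upper-tail area alpha.

    Uses the Wilson-Hilferty approximation (an assumption made so the
    example needs no table); it is accurate for large df such as 80.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    c = 2 / (9 * df)
    return df * (1 - c + z * c ** 0.5) ** 3


# Example values: n = 81 bullets, sample variance 0.025, limit 0.016.
stat = chi_square_variance_stat(81, 0.025, 0.016)
crit = chi_square_upper_critical(0.05, 80)
shut_down = stat > crit
print(f"chi-square = {stat:.1f}, critical value is about {crit:.1f}, shut down: {shut_down}")
```

With 80 degrees of freedom the approximate 5% critical value is a little under 102, well
below the test statistic of 125.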
The deal here is that you are trying to see if the means of two different populations are
significantly different by looking at the means of samples drawn from each population.
We use (x̄1 − x̄2) as an indicator of (µ1 − µ2).
The sampling distribution of (x̄1 − x̄2) is approximately normal for large samples with
mean (µ1 − µ2) and standard deviation
$$\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
What you will always be interested in asking is whether the evidence suggests that one
population has a mean which is greater than or different than another.
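Hypothesis        H0 : µ1 − µ2 ≥ 0        H0 : µ1 − µ2 ≤ 0        H0 : µ1 − µ2 = 0
                  HA : µ1 − µ2 < 0        HA : µ1 − µ2 > 0        HA : µ1 − µ2 ≠ 0
Test Statistic    In all three cases, the difference in sample means divided by its
                  standard deviation as given above:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$$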
The equations for the small sample test statistics are even nastier.
The whole point, though, is to compare the populations based on the provided samples.
The result of the hypothesis test will be a t-value which will be basically impossible to
interpret, and a p-value, which is usually reported as a value with the label Prob>|T|. If
this is small (usually less than 0.05) then the two sample means are significantly
different.
This depends, in part, on whether the two populations have the same variances or different
variances. It will make a difference in the resulting p-values. Happily, most software
packages will also tell you whether or not the variances appear to be equal. There will be
a number labeled with Prob>F. If this is a small number (again, usually less than 0.05),
then the variances are probably different and you should use the p-value associated with
the variances being unequal.
Here is an example from Excel, where you can find hypothesis tests under the
"Tools/DataAnalysis" menu:
Sample1     Sample2
0.010141    0.898268
0.990306    0.987607
0.305954    0.570143
0.385001    0.603906
0.065999    0.509418
0.915778    0.923419
0.613044    0.292365
0.194624    0.763637
0.185926    0.774173
0.863759    0.148169
0.853136    0.451536
0.532089    0.664071
0.598750    0.285516
0.532470    0.365205
0.793333    0.717621
0.895895    0.114969
0.250961    0.307759
0.796461    1.019315
0.089256    0.822870
0.951113    0.987905
0.049663    0.893837
0.467451    0.930484

t-Test: Two-Sample Assuming Equal Variances
                               Sample1        Sample2
Mean                           0.51550493     0.637826874
Variance                       0.110927604    0.083977939
Observations                   22             22
Pooled Variance                0.097452772
Hypothesized Mean Difference   0
df                             42
t Stat                         -1.299581795
P(T<=t) one-tail               0.100416238
t Critical one-tail            1.681951289
P(T<=t) two-tail               0.200832476
t Critical two-tail            2.018082341
The t-stat for µ1-µ2 is -1.300 and the resulting p-value for a one-tailed test is 0.1004 and
the p-value for a two-tailed test is 0.2008, meaning that at the 5% level of significance we
cannot reject the null hypothesis that the two population means are equal.
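The Pooled Variance, df, and t Stat rows of the Excel output can be reproduced from the
Mean, Variance, and Observations rows alone. The sketch below uses the usual pooled
(equal-variance) two-sample formulas, which these notes do not spell out; the function
name is an assumption.

```python
from math import sqrt


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (pooled variance, t statistic, df) assuming equal variances."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference in sample means.
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, (mean1 - mean2) / std_err, df


# Summary rows copied from the Excel output.
pooled_var, t_stat, df = pooled_t(0.51550493, 0.110927604, 22,
                                  0.637826874, 0.083977939, 22)

T_CRITICAL_TWO_TAIL = 2.018082341  # "t Critical two-tail" row of the output
reject = abs(t_stat) > T_CRITICAL_TWO_TAIL
print(f"pooled variance = {pooled_var:.6f}, df = {df}, t = {t_stat:.4f}, reject = {reject}")
```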
In case you're curious, the population mean for the second sample was 0.03 greater than
the population mean for the first sample.
A. At α=0.02 (2% level of significance) the critical value is given by the z value (from
the standard normal table) for which the area to the right of the number is 0.02.
This means that the value in the table would be equal to 0.5000 - 0.0200 = 0.4800 which
occurs at a value of about 2.055. So, we will calculate a test statistic (using
$(\bar{x} - \mu_0)/(s/\sqrt{n})$) and if that test statistic is larger than 2.055, we will
reject the null hypothesis.
B.
$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{16.5 - 15}{7 / \sqrt{40}} = 1.36$$
C. The p-value is the area to the right of 1.36 which is (0.500-0.4131)=0.0869 which is
greater than the level of significance. When the p-value is greater than the level of
significance, you do not reject the null hypothesis.
A. With a level of significance of a=0.05 (5%) the rejection rule is that you reject the
null hypothesis if the test statistic is less than -1.645.
B.
$$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{7 - 8}{3.2 / \sqrt{40}} = -1.98$$
Because z = -1.98 is less than -1.645, reject the null hypothesis.
The weight loss claim is "incorrect."
So, if you have a population that can be divided into multiple groups, you can use a Chi-
square test to determine if they fall into those groups in the proportions or numbers you
expect or if they are divided significantly differently.
The important thing is that this is just another sort of hypothesis test. Doing the test will
generate a p-value that has the same interpretation as the t-test p-value. The null
hypothesis is that things are equal. If you get a small p-value, you reject the null in favor
of the alternative hypothesis that things are not equal.
EX: You are looking at educational achievement for men and women in their 40's. You
know that among men, 5% have not graduated from high school, 40% have graduated
from high school but have no college degree, 45% have a bachelor's degree but no higher
college degree and 10% have a higher college degree. You would like to see if
educational outcomes for women are significantly different from those for men, so you
would use a Chi-square test to see if the women's percentages are significantly different
from the men's percentages.
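Here is a sketch of how that test could be set up. The men's percentages come from the
example, but the counts for women are HYPOTHETICAL numbers invented purely to make the
arithmetic concrete. The 5% critical value for 3 degrees of freedom (four categories
minus one) is 7.815 from a standard chi-square table.

```python
# Men's proportions from the example, used as the expected shares.
expected_share = {
    "no high school diploma": 0.05,
    "high school, no college degree": 0.40,
    "bachelor's degree": 0.45,
    "higher degree": 0.10,
}

# HYPOTHETICAL counts for a sample of 200 women (illustration only).
observed = {
    "no high school diploma": 14,
    "high school, no college degree": 90,
    "bachelor's degree": 80,
    "higher degree": 16,
}


def chi_square_statistic(observed, expected_share):
    """Sum of (observed - expected)^2 / expected over all groups."""
    total = sum(observed.values())
    return sum(
        (observed[group] - total * share) ** 2 / (total * share)
        for group, share in expected_share.items()
    )


stat = chi_square_statistic(observed, expected_share)
CRITICAL_5PCT_DF3 = 7.815  # chi-square table value, alpha = 0.05, 3 df
reject_null = stat > CRITICAL_5PCT_DF3
print(f"chi-square = {stat:.3f}, reject at 5%: {reject_null}")
```

With these made-up counts the statistic stays below the critical value, so the women's
distribution would not be judged significantly different from the men's.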