Unit 10
Structure
10.1 Introduction
Objectives
10.2 Procedure of Testing of Hypothesis for Large Samples
10.3 Testing of Hypothesis for Population Mean Using Z-Test
10.4 Testing of Hypothesis for Difference of Two Population Means Using
Z-Test
10.5 Testing of Hypothesis for Population Proportion Using Z-Test
10.6 Testing of Hypothesis for Difference of Two Population Proportions
Using Z-Test
10.7 Testing of Hypothesis for Population Variance Using Z-Test
10.8 Testing of Hypothesis for Two Population Variances Using Z-Test
10.9 Summary
10.10 Solutions /Answers
10.1 INTRODUCTION
In the previous unit, we defined the basic terms used in testing of hypothesis.
Having provided the material required for any test, we can now discuss
particular tests one by one. Before doing that, let us describe the strategy
we are adopting here.
First, we categorise the tests under two heads:
Large sample tests
Small sample tests
After that, the tests are distributed unit-wise. In this unit, we will discuss large
sample tests, whereas in Units 11 and 12 we will discuss small sample tests.
The tests described in these units are known as “parametric tests”.
In studies in fields such as economics, psychology and medicine, we often take
a sample of objects / units / participants / patients of size such as 70, 500,
1000 or 10,000. This situation comes under the category of large samples.
As a rule of thumb, a sample of size n is treated as a large sample only if it
contains more than 30 units (or observations), that is, n > 30. For a large
sample (n > 30), the sampling distributions of almost all statistics are
closely approximated by the normal distribution. Therefore, the test statistic,
which is a function of sample observations based on n > 30, can be assumed to
follow the normal distribution approximately (or exactly).
But the story does not end here. There are some other issues which need to be
taken care of. Some of these issues are highlighted by considering different
cases in each test, as you will see when you go through Sections 10.3 to 10.8
of this unit.
This unit is divided into ten sections. Section 10.1 is introductory in nature.
The general procedure of testing of hypothesis for large samples is described in
Section 10.2. In Section 10.3, testing of hypothesis for population mean is
discussed, whereas in Section 10.4, testing of hypothesis for difference of two
population means is described with examples. Similarly, in Sections 10.5 and
10.6, testing of hypothesis for population proportion and difference of two
population proportions are explained respectively. Testing of hypothesis for
population variance and two population variances are described in Sections
10.7 and 10.8 respectively. The unit ends with a summary of what we have
discussed in Section 10.9 and solutions of exercises in Section 10.10.
Objectives
After studying this unit, you should be able to:
judge for a given situation whether we should go for a large sample test or
not;
apply the Z-test for testing the hypothesis about the population mean
and difference of two population means;
apply the Z-test for testing the hypothesis about the population
proportion and difference of two population proportions; and
apply the Z-test for testing the hypothesis about the population variance
and two population variances.
Step II: After setting the null and alternative hypotheses, we have to choose
the level of significance. Generally, it is taken as 5% or 1% (α = 0.05 or
0.01). The rejection and non-rejection regions are decided accordingly.
Step III: The third step is to determine an appropriate test statistic, say, Z in case
of large samples. Suppose Tn is the sample statistic (such as sample
mean, sample proportion, sample variance, etc.) for the parameter θ;
then for testing the null hypothesis, the test statistic is given by
$$Z = \frac{T_n - E(T_n)}{SE(T_n)} = \frac{T_n - E(T_n)}{\sqrt{Var(T_n)}}$$
since we know that the SE of a statistic is the SD of the sampling distribution
of that statistic, that is,
$$SE(T_n) = SD(T_n) = \sqrt{Var(T_n)}$$
In this case, the rejection (critical) region falls under the right tail of
the probability curve of the sampling distribution of test statistic Z.
Fig. 10.1
Suppose $z_\alpha$ is the critical value at α level of significance, so the entire
region greater than or equal to $z_\alpha$ is the rejection region and less than
$z_\alpha$ is the non-rejection region as shown in Fig. 10.1.
If z (calculated value) ≥ $z_\alpha$ (tabulated value), that means the
calculated value of test statistic Z lies in the rejection region, then we
reject the null hypothesis H0 at α level of significance. Therefore, we
conclude that the sample data provide sufficient evidence against the
null hypothesis and there is a significant difference between the
hypothesized or specified value and the observed value of the parameter.
If z < $z_\alpha$, that means the calculated value of test statistic Z lies in the
non-rejection region, then we do not reject the null hypothesis H0 at α
level of significance. Therefore, we conclude that the sample data
fail to provide sufficient evidence against the null hypothesis and
the difference between the hypothesized value and the observed value of the
parameter is due to fluctuation of sample,
so the population parameter θ may be θ0.
Case II: When $H_0 : \theta \ge \theta_0$ and $H_1 : \theta < \theta_0$ (left-tailed test)
In this case, the rejection (critical) region falls under the left tail of the
probability curve of the sampling distribution of test statistic Z.
Suppose $-z_\alpha$ is the critical value at α level of significance, then the entire
region less than or equal to $-z_\alpha$ is the rejection region and greater
than $-z_\alpha$ is the non-rejection region as shown in Fig. 10.2.
Fig. 10.2
If z ≤ $-z_\alpha$, that means the calculated value of test statistic Z lies in the
rejection region, then we reject the null hypothesis H0 at α level of
significance.
If z > $-z_\alpha$, that means the calculated value of test statistic Z lies in the
non-rejection region, then we do not reject the null hypothesis H0 at α
level of significance.
In case of two-tailed test: When $H_0 : \theta = \theta_0$ and $H_1 : \theta \ne \theta_0$
In this case, the rejection region falls under both tails of the
probability curve of the sampling distribution of the test statistic Z. Half
of the area α, i.e. α/2, lies under the left tail and the other half under the
right tail. Suppose $-z_{\alpha/2}$ and $z_{\alpha/2}$ are the two critical values at the
left tail and right tail respectively. Therefore, the entire region less
than or equal to $-z_{\alpha/2}$ and greater than or equal to $z_{\alpha/2}$ are the
rejection regions and between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is the non-rejection
region as shown in Fig. 10.3.
Fig. 10.3
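The three decision rules above (right-tailed, left-tailed and two-tailed) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `critical_region` and `reject_null` and the value z = 1.80 are assumptions made for the illustration, and the critical values are taken from the standard normal distribution in Python's `statistics` module rather than from Table 10.1.

```python
from statistics import NormalDist

def critical_region(alpha, tail):
    """Return (lower, upper) critical values of a Z-test.

    None means there is no rejection region on that side.
    """
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    if tail == "right":
        return None, std_normal.inv_cdf(1 - alpha)       # z_alpha
    if tail == "left":
        return std_normal.inv_cdf(alpha), None           # -z_alpha
    if tail == "two":
        z_half = std_normal.inv_cdf(1 - alpha / 2)       # z_{alpha/2}
        return -z_half, z_half
    raise ValueError("tail must be 'right', 'left' or 'two'")

def reject_null(z, alpha, tail):
    """True when the calculated z lies in the rejection region."""
    lower, upper = critical_region(alpha, tail)
    in_left_tail = lower is not None and z <= lower
    in_right_tail = upper is not None and z >= upper
    return in_left_tail or in_right_tail

# Hypothetical calculated value z = 1.80 at the 5% level
print(reject_null(1.80, 0.05, "right"))  # True: 1.80 >= 1.645
print(reject_null(1.80, 0.05, "two"))    # False: -1.96 < 1.80 < 1.96
```

The same calculated value can lead to different decisions for one-tailed and two-tailed tests, which is why the form of the alternative hypothesis must be fixed before looking at the data.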
These p-values for the Z-test can be obtained with the help of Table-I (Z-table)
given in the Appendix at the end of Block 1 of this course (which gives the
probability P[0 ≤ Z ≤ z] for different values of z) as discussed in Unit 14 of
MST-003.
For example, if the test is right-tailed and the calculated value of test statistic Z is 1.23
then
$$\text{p-value} = P[Z \ge z] = P[Z \ge 1.23] = 0.5 - P[0 \le Z \le 1.23]$$
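As a cross-check on the table lookup, the p-value can also be computed directly from the standard normal distribution function. The sketch below is a minimal illustration; the helper name `z_p_value` is an assumption introduced here and is not part of the course material.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """p-value of a Z-test for a calculated value z."""
    phi = NormalDist().cdf  # P[Z <= z] for the standard normal
    if tail == "right":
        return 1 - phi(z)             # P[Z >= z]
    if tail == "left":
        return phi(z)                 # P[Z <= z]
    if tail == "two":
        return 2 * (1 - phi(abs(z)))  # 2 P[Z >= |z|]
    raise ValueError("tail must be 'right', 'left' or 'two'")

# Right-tailed test with calculated z = 1.23, as in the text
print(round(z_p_value(1.23, "right"), 4))
```

For z = 1.23 this gives a value of about 0.109, which agrees with 0.5 − P[0 ≤ Z ≤ 1.23] read from the Z-table.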
$$SE(\bar{X}) = \sqrt{Var(\bar{X})} = \frac{\sigma}{\sqrt{n}} \qquad \ldots (2)$$
Now, follow the same procedure as discussed in the previous section, that
is, first of all we have to set up the null and alternative hypotheses. Since here we
want to test the hypothesis about the population mean, we can take the null
and alternative hypotheses as
$H_0 : \mu = \mu_0$ and $H_1 : \mu \ne \mu_0$   for two-tailed test
$H_0 : \mu \le \mu_0$ and $H_1 : \mu > \mu_0$
or                                              for one-tailed test
$H_0 : \mu \ge \mu_0$ and $H_1 : \mu < \mu_0$
(Here, θ = µ and θ0 = µ0 if we compare it with the general procedure.)
For testing the null hypothesis, the test statistic Z is given by
$$Z = \frac{\bar{X} - E(\bar{X})}{SE(\bar{X})}$$
or
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$
using equations (1) and (2), since under H0 we assume that µ = µ0.
(When we assume that the null hypothesis is true then we are actually
assuming that the population parameter is equal to the value in the
null hypothesis. For example, we assume that µ = 60 whether the null
hypothesis is µ = 60 or µ ≤ 60 or µ ≥ 60.)
The sampling distribution of the test statistic depends upon whether σ² is known
or unknown. Therefore, two cases arise:
Case I: When σ² is known
In this case, the test statistic follows the normal distribution with
mean 0 and variance unity when the sample size is large, whether the
population under study is normal or non-normal. If the sample size is
small then the test statistic Z follows the normal distribution only when
the population under study is normal. Thus,
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$$
Case II: When σ² is unknown
In this case, we estimate σ² by the value of sample variance (S²)
where,
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$
Then the test statistic $\frac{\bar{X} - \mu_0}{S/\sqrt{n}}$ follows the t-distribution with
(n−1) df, whether the sample size is large or small, provided the population
under study is normal, as discussed in Unit 2 of this
course. But when the population under study is not normal and the sample
size is large, this test statistic approximately follows the normal
distribution with mean 0 and variance unity, that is,
$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \sim N(0, 1)$$
After that, we calculate the value of the test statistic as the case may be (σ² is
known or unknown) and compare it with the critical value given in Table 10.1
at the prefixed level of significance α. We then take the decision about the null
hypothesis as described in the previous section.
From the above discussion of testing of hypothesis about population mean, we note
the following points:
(i) When σ² is known, we apply the Z-test for a large sample whether the
population under study is normal or non-normal. But when the sample size is
small, we apply the Z-test only when the population under study is
normal.
(ii) When σ² is unknown, we apply the t-test only when the population
under study is normal, whether the sample size is large or small. But when the
assumption of normality is not fulfilled and the sample size is large, we
can apply the Z-test.
(iii) When the sample is small, σ² is known or unknown, and the form of the
population is not known, we apply a non-parametric test, as will
be discussed in Block 4 of this course.
The following examples will help you to understand the procedure more clearly.
Example 1: A light bulb company claims that the 100-watt light bulb it sells
has an average life of 1200 hours with a standard deviation of 100 hours. For
testing the claim 50 new bulbs were selected randomly and allowed to burn
out. The average lifetime of these bulbs was found to be 1180 hours. Is the
company’s claim true at 5% level of significance?
Solution: Here, we are given that
Specified value of population mean = µ0 = 1200 hours,
Population standard deviation = σ = 100 hours,
Sample size = n = 50,
Sample mean = $\bar{X}$ = 1180 hours.
In this example, the population parameter being tested is the population mean, i.e.
the average life of a bulb (µ), and we want to test the company’s claim that the average
life of a bulb is 1200 hours. So our claim is µ = 1200 and its complement is
µ ≠ 1200. Since the claim contains the equality sign, we can take the claim as the
null hypothesis and the complement as the alternative hypothesis. So
$H_0 : \mu = \mu_0 = 1200$ (average life of a bulb is 1200 hours)
$H_1 : \mu \ne 1200$ (average life of a bulb is not 1200 hours)
Also, the alternative hypothesis is two-tailed, so the test is a two-tailed test.
Here, we want to test the hypothesis regarding the mean when the population SD
(variance) is known and the sample size n = 50 (> 30) is large. So we will go for the
Z-test.
Thus, for testing the null hypothesis the test statistic is given by
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{1180 - 1200}{100/\sqrt{50}} = \frac{-20}{14.14} = -1.41$$
The critical (tabulated) values for a two-tailed test at 5% level of significance are
± zα/2 = ± z0.025 = ± 1.96.
Fig. 10.4
Since the calculated value of test statistic Z (= −1.41) is greater than the critical value
(= −1.96) and less than the critical value (= 1.96), it lies in the non-rejection
region as shown in Fig. 10.4, so we do not reject the null hypothesis.
Since the null hypothesis is the claim, we support the claim at 5% level of
significance.
Decision according to p-value:
The test is two-tailed, therefore,
$$\text{p-value} = 2P[Z \ge |z|] = 2P[Z \ge 1.41] = 2\left[0.5 - P[0 \le Z \le 1.41]\right] = 2(0.5 - 0.4207) = 0.1586$$
Since p-value (= 0.1586) is greater than α (= 0.05), we do not reject the null
hypothesis at 5% level of significance.
Decision according to confidence interval:
Here, the test is two-tailed, therefore, we construct a two-sided confidence interval for
the population mean.
Since the population standard deviation is known, we can use the
(1−α) 100% confidence interval for the population mean when the population
variance is known, which is given by
$$\left[\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right] = \left[1180 - 1.96 \times \frac{100}{\sqrt{50}},\ 1180 + 1.96 \times \frac{100}{\sqrt{50}}\right]$$
or [1180 − 27.71, 1180 + 27.71]
or [1152.29, 1207.71]
Since the 95% confidence interval for the average life of a bulb contains the value of
the parameter specified by the null hypothesis, that is, µ = µ0 = 1200, we do
not reject the null hypothesis.
Thus, we conclude that the sample does not provide sufficient evidence against
the claim, so we may assume that the company’s claim that the average life of a
bulb is 1200 hours is true.
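The three approaches used in Example 1 can be reproduced numerically. The following is a minimal sketch, assuming Python's `statistics.NormalDist` in place of the Z-table; the helper names `one_sample_z` and `mean_confidence_interval` are introduced here only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(x_bar, mu0, sd, n):
    """Z = (x_bar - mu0) / (sd / sqrt(n)); sd is sigma, or S for a large sample."""
    return (x_bar - mu0) / (sd / sqrt(n))

def mean_confidence_interval(x_bar, sd, n, alpha):
    """Two-sided (1 - alpha) 100% confidence interval for the population mean."""
    z_half = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z_half * sd / sqrt(n)
    return x_bar - margin, x_bar + margin

# Data of Example 1: mu0 = 1200, sigma = 100, n = 50, sample mean = 1180
z = one_sample_z(1180, 1200, 100, 50)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-tailed p-value
low, high = mean_confidence_interval(1180, 100, 50, 0.05)

print(f"z = {z:.2f}")
print(f"p-value = {p_value:.4f}")
print(f"95% CI = [{low:.2f}, {high:.2f}]")
```

Because the code carries more decimal places than the Z-table, the p-value and interval limits may differ from the hand calculation in the third or fourth decimal place; the decision (do not reject H0) is the same under all three approaches.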
Note 2: Here, we note that the decisions about the null hypothesis based on the three
approaches (critical value or classical, p-value and confidence interval) are the
same. Learners are advised to make the decision about the claim or
statement by using only one of the three approaches in the examination. Here,
we used all these approaches only to give you an idea of how they can be used in
a given problem. Learners who opt for the biostatistics specialisation will
see the importance of the confidence interval approach in Unit 16 of
MSTE-004.
Example 2: A manufacturer of ball point pens claims that a certain pen
manufactured by him has a mean writing-life of at least 460 A-4 size pages. A
purchasing agent selects a sample of 100 pens and puts them to the test. The
mean writing-life of the sample is found to be 453 A-4 size pages with a standard
deviation of 25 A-4 size pages. Should the purchasing agent reject the
manufacturer’s claim at 1% level of significance?
Solution: Here, we are given that
Specified value of population mean = µ0 = 460,
Sample size = n = 100,
Sample mean = $\bar{X}$ = 453,
Sample standard deviation = S = 25
Here, we want to test the manufacturer’s claim that the mean writing-life (µ) of
the pen is at least 460 A-4 size pages. So our claim is µ ≥ 460 and its complement
is µ < 460. Since the claim contains the equality sign, we can take the claim as
the null hypothesis and the complement as the alternative hypothesis. So
$H_0 : \mu \ge \mu_0 = 460$ and $H_1 : \mu < 460$
Since p-value (= 0.0026) is less than α (= 0.01), we reject the null hypothesis
at 1% level of significance.
Therefore, we conclude that the sample provides sufficient evidence against
the claim, so the purchasing agent rejects the manufacturer’s claim at 1% level
of significance.
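The figures of Example 2 can be checked with a short calculation. This is a sketch under the assumptions of the example: the test is left-tailed, the sample is large, and the sample standard deviation S is used in place of the unknown σ. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Data of Example 2
n, x_bar, s, mu0, alpha = 100, 453, 25, 460, 0.01

z = (x_bar - mu0) / (s / sqrt(n))        # test statistic with S in place of sigma
critical = NormalDist().inv_cdf(alpha)   # left-tail critical value, -z_alpha
p_value = NormalDist().cdf(z)            # left-tailed p-value, P[Z <= z]
reject = z <= critical                   # True when z lies in the rejection region

print(f"z = {z:.2f}, critical value = {critical:.2f}, "
      f"p-value = {p_value:.4f}, reject H0: {reject}")
```

The calculated value z = −2.80 lies below the critical value of about −2.33, and the p-value of about 0.0026 is below α = 0.01, so both approaches lead to rejecting the null hypothesis.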
Now, you can try the following exercises.
E4) A sample of 900 bolts has a mean length of 3.4 cm. Can the sample be regarded
as taken from a large population of bolts with mean length 3.25 cm
and standard deviation 2.61 cm at 5% level of significance?
E5) A big company uses thousands of CFL lights every year. The brand that
the company has been using in the past has an average life of 1200 hours. A
new brand is offered to the company at a price lower than they are paying
for the old brand. Consequently, a sample of 100 CFL lights of the new brand
is tested, which yields an average life of 1220 hours with a standard
deviation of 90 hours. Should the company accept the new brand at 5% level
of significance?
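and
$$Var(\bar{X} - \bar{Y}) = Var(\bar{X}) + Var(\bar{Y}) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
But we know that standard error = $\sqrt{\text{Variance}}$
$$SE(\bar{X} - \bar{Y}) = \sqrt{Var(\bar{X} - \bar{Y})} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \qquad \ldots (4)$$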
Now, follow the same procedure as discussed in Section 10.2, that is,
first of all we have to set up the null and alternative hypotheses. Here, we want to
test the hypothesis about the difference of two population means, so we can take
the null hypothesis as
$H_0 : \mu_1 = \mu_2$ (no difference in means)
or $H_0 : \mu_1 - \mu_2 = 0$ (difference in two means is 0)
and the alternative hypothesis as
$H_1 : \mu_1 \ne \mu_2$   for two-tailed test
$H_0 : \mu_1 \le \mu_2$ and $H_1 : \mu_1 > \mu_2$
or                                              for one-tailed test
$H_0 : \mu_1 \ge \mu_2$ and $H_1 : \mu_1 < \mu_2$
(Here, θ1 = µ1 and θ2 = µ2 if we compare it with the general procedure.)
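For testing the null hypothesis, the test statistic Z is given by
$$Z = \frac{(\bar{X} - \bar{Y}) - E(\bar{X} - \bar{Y})}{SE(\bar{X} - \bar{Y})}$$
or
$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
using equations (3) and (4).
Since under the null hypothesis we assume that µ1 = µ2, we have
$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$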
Now, the sampling distribution of the test statistic depends upon whether $\sigma_1^2$ and $\sigma_2^2$
are both known or unknown. Therefore, four cases arise:
Case I: When $\sigma_1^2$ & $\sigma_2^2$ are known and $\sigma_1^2 = \sigma_2^2 = \sigma^2$
In this case, the test statistic follows the normal distribution with mean
0 and variance unity when the sample sizes are large, whether the
populations under study are normal or non-normal. But when the
sample sizes are small, the test statistic Z follows the normal
distribution only when the populations under study are normal, that is,
$$Z = \frac{\bar{X} - \bar{Y}}{\sigma\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim N(0, 1)$$
Case II: When $\sigma_1^2$ & $\sigma_2^2$ are known and $\sigma_1^2 \ne \sigma_2^2$
In this case, the test statistic also follows the normal distribution as
described in Case I, that is,
$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
Case III: When $\sigma_1^2$ & $\sigma_2^2$ are unknown and $\sigma_1^2 = \sigma_2^2 = \sigma^2$
In this case, $\sigma_1^2$ & $\sigma_2^2$ are estimated by the values of the sample
variances $S_1^2$ & $S_2^2$ respectively and the exact distribution of the test
statistic is difficult to derive. But when the sample sizes n1 and n2 are
large (> 30) then, by the central limit theorem, the test statistic is
approximately normally distributed with mean 0 and variance unity,
that is,
$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim N(0, 1)$$
After that, we calculate the value of the test statistic and compare it with the
critical value given in Table 10.1 at the prefixed level of significance α. We then take the
decision about the null hypothesis as described in Section 10.2.
From the above discussion of testing of hypothesis about the difference of two
population means, we note the following points:
(i) When $\sigma_1^2$ & $\sigma_2^2$ are known, we apply the Z-test for large samples whether the
populations under study are normal or non-normal. But when the
sample sizes are small, we apply the Z-test only when the populations
under study are normal.
(ii) When $\sigma_1^2$ & $\sigma_2^2$ are unknown, we apply the t-test only when the
populations under study are normal, whether the sample sizes are large or small.
But when the assumption of normality is not fulfilled and the sample sizes
are large, we can apply the Z-test.
(iii) When the samples are small, $\sigma_1^2$ & $\sigma_2^2$ are known or unknown, and the
form of the populations is not known, we apply a non-parametric
test, as will be discussed in Block 4 of this course.
Let us do some examples based on above test.
Example 3: In two samples of women from Punjab and Tamilnadu, the mean
heights of 1000 and 2000 women are 67.6 and 68.0 inches respectively. If the
population standard deviations of Punjab and Tamilnadu are the same and equal to
5.5 inches, can the mean heights of Punjab and Tamilnadu women be
regarded as the same at 1% level of significance?
Solution: We are given
n1 = 1000, n2 = 2000, $\bar{X}$ = 67.6, $\bar{Y}$ = 68.0 and σ1 = σ2 = σ = 5.5
Here, we wish to test that the mean height of Punjab and Tamilnadu women is the
same. If µ1 and µ2 denote the mean heights of Punjab and Tamilnadu women
respectively then our claim is µ1 = µ2 and its complement is µ1 ≠ µ2. Since the
claim contains the equality sign, we can take the claim as the null hypothesis
and the complement as the alternative hypothesis. Thus,
$H_0 : \mu_1 = \mu_2$ and $H_1 : \mu_1 \ne \mu_2$
Since the alternative hypothesis is two-tailed, the test is a two-tailed test.
Here, we want to test the hypothesis regarding two population means. The
standard deviations of both populations are known and the sample sizes are large,
so we should go for the Z-test.
So, for testing the null hypothesis, the test statistic Z is given by
$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{67.6 - 68.0}{\sqrt{\dfrac{(5.5)^2}{1000} + \dfrac{(5.5)^2}{2000}}} = \frac{-0.4}{5.5\sqrt{\dfrac{1}{1000} + \dfrac{1}{2000}}}$$
$$= \frac{-0.4}{5.5 \times 0.0387} = -1.88$$
The critical (tabulated) values for a two-tailed test at 1% level of significance are
± zα/2 = ± z0.005 = ± 2.58.
Fig. 10.6
Since the calculated value of Z (= −1.88) is greater than the critical value
(= −2.58) and less than the critical value (= 2.58), it lies in the non-rejection
region as shown in Fig. 10.6, so we do not reject the null hypothesis,
i.e. we fail to reject the claim.
Decision according to p-value:
The test is two-tailed, therefore,
$$\text{p-value} = 2P[Z \ge |z|] = 2P[Z \ge 1.88] = 2(0.5 - 0.4699) = 0.0602$$
Since p-value (= 0.0602) is greater than α (= 0.01), we do not reject the null
hypothesis at 1% level of significance.
Thus, we conclude that the samples do not provide sufficient evidence
against the claim, so we may assume that the average height of women of
Punjab and Tamilnadu is the same.
Example 4: A university conducts both face-to-face and distance-mode classes
for a particular course, intended to be identical. A sample of 50 students of
face-to-face mode yields examination results with mean and SD respectively as:
$\bar{X}$ = 80.4, S1 = 12.8
and another sample of 100 distance-mode students yields mean and SD of their
examination results in the same course respectively as:
$\bar{Y}$ = 74.3, S2 = 20.5
Are both educational methods statistically equal at 5% level?
Solution: Here, we are given that
n1 = 50, $\bar{X}$ = 80.4, S1 = 12.8;
$$\text{p-value} = 2P[Z \ge |z|] = 2P[Z \ge 2.23] = 2(0.5 - 0.4871) = 0.0258$$
Since p-value (= 0.0258) is less than α (= 0.05), we reject the null hypothesis
at 5% level of significance.
Thus, we conclude that the samples provide sufficient evidence against the
claim, so both methods of education, i.e. face-to-face and distance-mode, are
not statistically equal.
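Examples 3 and 4 use the same form of test statistic, with known population variances in the first case and large-sample estimates in the second. The sketch below recomputes both; the function names `two_mean_z` and `two_tailed_p` are assumptions made for this illustration, and the standard normal distribution from Python's `statistics` module stands in for the Z-table.

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z(x_bar, y_bar, var1, var2, n1, n2):
    """Z for H0: mu1 = mu2.

    var1 and var2 are the population variances, or the sample variances
    S1^2 and S2^2 when the variances are unknown and the samples are large.
    """
    return (x_bar - y_bar) / sqrt(var1 / n1 + var2 / n2)

def two_tailed_p(z):
    """Two-tailed p-value, 2 P[Z >= |z|]."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Example 3: sigma1 = sigma2 = 5.5 (known), alpha = 0.01
z3 = two_mean_z(67.6, 68.0, 5.5**2, 5.5**2, 1000, 2000)
# Example 4: S1 = 12.8, S2 = 20.5 (sample SDs), alpha = 0.05
z4 = two_mean_z(80.4, 74.3, 12.8**2, 20.5**2, 50, 100)

print(f"Example 3: z = {z3:.2f}, p-value = {two_tailed_p(z3):.4f}")
print(f"Example 4: z = {z4:.2f}, p-value = {two_tailed_p(z4):.4f}")
```

The p-values printed here may differ slightly in the last decimal places from those obtained with the Z-table, because the table is entered with z rounded to two decimals. The decisions are unchanged: the null hypothesis is not rejected in Example 3 at the 1% level and is rejected in Example 4 at the 5% level.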
Now, you can try the following exercises.
E6) Two brands of electric bulbs are quoted at the same price. A buyer
tested a random sample of 200 bulbs of each brand and found the
following information:
           Mean Life (hrs.)    SD (hrs.)
Brand A    1300                41
Brand B    1280                46
Is there a significant difference in the mean lives of the two
brands of electric bulbs at 1% level of significance?
E7) Two research laboratories have identically produced drugs that provide
relief to BP patients. The first drug was tested on a group of 50 BP
patients and produced an average of 8.3 hours of relief with a standard
deviation of 1.2 hours. The second drug was tested on 100 patients,
producing an average of 8.0 hours of relief with a standard deviation of
1.5 hours. Does the first drug provide a significantly longer period of relief
at a significance level of 5%?
Case I: When the sample size is not sufficiently large, i.e. either of the conditions
np > 5 or nq > 5 is not met, we use the exact binomial test. But the exact
binomial test is beyond the scope of this course.
Case II: When the sample size is sufficiently large, such that np > 5 and nq > 5,
then by the central limit theorem, the sampling distribution of the sample proportion p
is approximately normal with mean and variance
$$E(p) = P \quad \text{and} \quad Var(p) = \frac{PQ}{n} \qquad \ldots (5)$$
But we know that standard error = $\sqrt{\text{Variance}}$
$$SE(p) = \sqrt{\frac{PQ}{n}} \qquad \ldots (6)$$
Now, follow the same procedure as discussed in Section 10.2: first of
all we set up the null and alternative hypotheses. Since here we want to test the
hypothesis about the specified value P0 of the population proportion, we can take
the null and alternative hypotheses as
$H_0 : P = P_0$ and $H_1 : P \ne P_0$   for two-tailed test
$H_0 : P \le P_0$ and $H_1 : P > P_0$
or                                      for one-tailed test
$H_0 : P \ge P_0$ and $H_1 : P < P_0$
(Here, θ = P and θ0 = P0 if we compare it with the general procedure.)
For testing the null hypothesis, the test statistic Z is given by
$$Z = \frac{p - E(p)}{SE(p)}$$
$$Z = \frac{p - P_0}{\sqrt{\dfrac{P_0 Q_0}{n}}} \sim N(0, 1) \quad \text{under } H_0$$
using equations (5) and (6).
After that, we calculate the value of test statistic and compare it with the
critical value(s) given in Table 10.1 at prefixed level of significance α. Take
the decision about the null hypothesis as described in Section 10.2.
Let us do some examples of testing of hypothesis about population proportion.
Example 5: A machine produces a large number of items out of which 25%
are found to be defective. To check this, the company manager takes a random
sample of 100 items and finds 35 items defective. Is there evidence of further
deterioration of quality at 5% level of significance?
Solution: The company manager wants to check that his machine produces
25% defective items. Here, the attribute under study is defectiveness, and we
define our success and failure as getting a defective or non-defective item.
Let P = Population proportion of defective items = 0.25 (= P0)
p = Observed proportion of defective items in the sample = 35/100 = 0.35
Here, we want to test that the machine produces more defective items, that is, the
proportion of defective items (P) is greater than 0.25. So our claim is P > 0.25
and its complement is P ≤ 0.25. Since the complement contains the equality sign,
we can take the complement as the null hypothesis and the claim as the
alternative hypothesis. So
$H_0 : P \le P_0 = 0.25$ and $H_1 : P > 0.25$
$$\text{p-value} = P[Z \ge z] = P[Z \ge 2.31] = 0.5 - P[0 \le Z \le 2.31] = 0.5 - 0.4896 = 0.0104$$
Since p-value (= 0.0104) is less than α (= 0.05), we reject the null
hypothesis at 5% level of significance.
Thus, we conclude that the sample fails to provide sufficient evidence
against the claim, so we may assume that deterioration in quality exists at 5%
level of significance.
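The calculation behind Example 5 can be sketched as follows. The function name `one_proportion_z` is an assumption introduced for illustration, and the large-sample check uses the observed sample proportion, which is one reading of the condition np > 5 and nq > 5 stated earlier in this section.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """Z for H0 about a specified proportion p0, given x successes in n trials."""
    p = x / n                      # observed sample proportion
    q = 1 - p
    if n * p <= 5 or n * q <= 5:   # large-sample condition np > 5 and nq > 5
        raise ValueError("sample too small for the normal approximation")
    q0 = 1 - p0
    return (p - p0) / sqrt(p0 * q0 / n)

# Data of Example 5: 35 defectives in 100 items, P0 = 0.25, right-tailed test
z = one_proportion_z(35, 100, 0.25)
p_value = 1 - NormalDist().cdf(z)  # P[Z >= z]

print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The calculated value z = 2.31 exceeds the 5% right-tail critical value of 1.645, and the p-value is close to the tabulated 0.0104, so the null hypothesis is rejected.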
Example 6: A die is thrown 9000 times and a throw of 2 or 5 is observed 3100
times. Can we regard the die as unbiased at 5% level of significance?
Solution: Let getting a 2 or 5 be our success, and getting a number other than 2
or 5 be a failure; then in usual notations, we have
n = 9000, X = number of successes = 3100, p = 3100/9000 = 0.3444
Here, we want to test that the die is unbiased and we know that if the die is
unbiased then the proportion or probability of getting 2 or 5 is
P = Probability of getting a 2 or 5
= Probability of getting 2 + Probability of getting 5
$$= \frac{1}{6} + \frac{1}{6} = \frac{1}{3} = 0.3333$$
So our claim is P = 0.3333 and its complement is P ≠ 0.3333. Since the claim
contains the equality sign, we can take the claim as the null hypothesis and the
complement as the alternative hypothesis. Thus,
$H_0 : P = P_0 = 0.3333$ and $H_1 : P \ne 0.3333$
$$p_1 \sim N\left(P_1, \frac{P_1 Q_1}{n_1}\right) \quad \text{and} \quad p_2 \sim N\left(P_2, \frac{P_2 Q_2}{n_2}\right)$$
That is,
$$p_1 - p_2 \sim N\left(P_1 - P_2,\ \frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}\right)$$
Now, follow the same procedure as discussed in Section 10.2: first of
all we have to set up the null and alternative hypotheses. Here, we want to test the
hypothesis about the difference of two population proportions, so we can take
the null hypothesis as
$H_0 : P_1 = P_2$ (no difference in proportions)
(Here, θ1 = P1 and θ2 = P2 if we compare it with the general procedure.)
$H_0 : P_1 \le P_2$ and $H_1 : P_1 > P_2$
or                                          for one-tailed test
$H_0 : P_1 \ge P_2$ and $H_1 : P_1 < P_2$
or
$$Z = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}}$$
using equations (7) and (8).
Since under the null hypothesis we assume that P1 = P2 = P, we have
$$Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where Q = 1 − P.
Generally, P is unknown, so it is estimated by the value of the pooled proportion
$\hat{P}$, where
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2} \quad \text{and} \quad \hat{Q} = 1 - \hat{P}$$
After that, we calculate the value of test statistic and compare it with the
critical value(s) given in Table 10.1 at prefixed level of significance α. Take
the decision about the null hypothesis as described in Section 10.2.
Now, it is time for doing some examples for testing of hypothesis about the
difference of two population proportions.
Example 7: In a random sample of 100 persons from town A, 60 are found to
be high consumers of wheat. In another sample of 80 persons from town B, 40
are found to be high consumers of wheat. Do these data reveal a significant
difference between the proportions of high wheat consumers in town A and
town B ( at α = 0.05 )?
Solution: Here, the attribute under study is high consumption of wheat, and we
define our success and failure as getting a person who is a high consumer of wheat
and one who is not a high consumer of wheat respectively.
We are given that
n1 = total number of persons in the sample of town A = 100
n2 = total number of persons in the sample of town B = 80
X1 = number of high consumers of wheat in town A = 60
X2 = number of high consumers of wheat in town B = 40
The sample proportion of high wheat consumers in town A is
$$p_1 = \frac{X_1}{n_1} = \frac{60}{100} = 0.60$$
and the sample proportion of high wheat consumers in town B is
$$p_2 = \frac{X_2}{n_2} = \frac{40}{80} = 0.50$$
Here, we want to test that the proportions of high consumers of wheat in the two
towns, say, P1 and P2, are not the same. So our claim is P1 ≠ P2 and its complement
is P1 = P2. Since the complement contains the equality sign, we can take the
complement as the null hypothesis and the claim as the alternative hypothesis.
Thus,
$H_0 : P_1 = P_2 = P$ and $H_1 : P_1 \ne P_2$
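n1p1 = 100 × 0.60 = 60 > 5, n1q1 = 100 × 0.40 = 40 > 5
n2p2 = 80 × 0.50 = 40 > 5, n2q2 = 80 × 0.50 = 40 > 5
We see that the condition of normality is met, so we can go for the Z-test.
The estimate of the combined proportion (P) of high wheat consumers in the two
towns is given by
$$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{60 + 40}{100 + 80} = \frac{5}{9}$$
$$\hat{Q} = 1 - \hat{P} = 1 - \frac{5}{9} = \frac{4}{9}$$
For testing the null hypothesis, the test statistic Z is given by
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.60 - 0.50}{\sqrt{\dfrac{5}{9} \times \dfrac{4}{9}\left(\dfrac{1}{100} + \dfrac{1}{80}\right)}} = \frac{0.10}{0.0745} = 1.34$$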
The critical values for a two-tailed test at 5% level of significance are ± zα/2
= ± z0.025 = ± 1.96.
Since the calculated value of Z (= 1.34) is less than the critical value (= 1.96) and
greater than the critical value (= −1.96), the calculated value of Z lies in the
non-rejection region, so we do not reject the null hypothesis and reject the
alternative hypothesis, i.e. we reject the claim.
Decision according to p-value:
Since the test is two-tailed, therefore
$$\text{p-value} = 2P[Z \ge |z|] = 2P[Z \ge 1.34] = 2(0.5 - 0.4099) = 0.1802$$
Since p-value (= 0.1802) is greater than α (= 0.05), we do not reject the null
hypothesis at 5% level of significance.
Thus, we conclude that the samples do not provide sufficient evidence in support of
the claim, so we may assume that the proportion of high consumers of wheat in the
two towns A and B is the same.
Example 8: A machine produced 60 defective articles in a batch of 400. After
overhauling it produced 30 defective articles in a batch of 300. Has the machine
improved due to overhauling? (Take α = 0.01).
Solution: Here, the machine produces articles and the attribute under study is
defectiveness. We define our success and failure as getting a defective or
non-defective article. Therefore, we are given that
X1 = number of defective articles produced by the machine before overhauling
= 60
X2 = number of defective articles produced by the machine after overhauling
= 30
and n1 = 400, n2 = 300.
Let p1 = Observed proportion of defective articles in the sample before the
overhauling
$$= \frac{X_1}{n_1} = \frac{60}{400} = 0.15$$
and p2 = Observed proportion of defective articles in the sample after the
overhauling
$$= \frac{X_2}{n_2} = \frac{30}{300} = 0.10$$
Here, we want to test that the machine improved due to overhauling, which means the
proportion of defective articles is less after overhauling. If P1 and P2 denote the
proportions of defectives before and after overhauling the machine, then our claim
is P1 > P2 and its complement is P1 ≤ P2. Since the complement contains the
equality sign, we can take the complement as the null hypothesis and the claim
as the alternative hypothesis. Thus,
$H_0 : P_1 \le P_2$ and $H_1 : P_1 > P_2$
Since the alternative hypothesis is right-tailed, the test is a right-tailed test.
Since P is unknown, the pooled estimate of proportion is given by
$$\hat{P} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{60 + 30}{400 + 300} = \frac{90}{700} = \frac{9}{70} \quad \text{and} \quad \hat{Q} = 1 - \hat{P} = 1 - \frac{9}{70} = \frac{61}{70}$$
Before proceeding further, first we have to check whether the condition of
normality is met or not.
n1p1 = 400 × 0.15 = 60 > 5, n1q1 = 400 × 0.85 = 340 > 5
n2p2 = 300 × 0.10 = 30 > 5, n2q2 = 300 × 0.90 = 270 > 5
We see that the condition of normality is met, so we can go for the Z-test.
For testing the null hypothesis, the test statistic is given by
$$Z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.15 - 0.10}{\sqrt{\dfrac{9}{70} \times \dfrac{61}{70}\left(\dfrac{1}{400} + \dfrac{1}{300}\right)}} = \frac{0.05}{0.0256} = 1.95$$
The critical value for a right-tailed test at 1% level of significance is
zα = z0.01 = 2.33.
Since the calculated value of Z (= 1.95) is less than the critical value (= 2.33), the
calculated value of Z lies in the non-rejection region, so we do not reject the
null hypothesis and reject the alternative hypothesis, i.e. we reject the claim at
1% level of significance.
Decision according to p-value:
Since the test is right-tailed, therefore,
p-value = P Z z P Z 1.95
48
0.5 P 0 Z 1.95 0.5 0.4744 0.0256 Large Sample Tests
Since p-value (= 0.0256) is greater than ( 0.01) so we do not reject the null
hypothesis at1% level of significance.
Thus, we conclude that the samples provide us sufficient evidence against the
claim so the machine has not been improved after overhauling.
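The whole calculation of this example can be sketched in Python. The function name `two_proportion_z` is an illustrative assumption; the formula is the pooled-proportion Z statistic used above, evaluated at the boundary case $P_1 = P_2$ of the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for comparing two proportions with a pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)        # pooled estimate of P
    q_pool = 1 - p_pool
    se = sqrt(p_pool * q_pool * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# 60 defectives out of 400 before, 30 out of 300 after overhauling
z = two_proportion_z(60, 400, 30, 300)
p_value = 1 - NormalDist().cdf(z)         # right-tailed test: P[Z >= z]
print(round(z, 2), round(p_value, 4))
```

The computed values are close to the hand-computed Z = 1.95 and p-value = 0.0256; small differences arise because the hand calculation rounds intermediate values. Either way the p-value exceeds 0.01.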
Now, you can try the following exercises.
E10) The proportions of literates between groups of people of two districts A
and B are tested. Out of the 100 persons selected at random from each
district, 50 from district A and 40 from district B are found literates. Test
whether the proportion of literate persons in two districts A and B is
same at 1% level of significance?
E11) In a large population, 30% of a random sample of 1200 persons had blue eyes, and in another population 20% of a random sample of 900 persons had blue eyes. Test whether the proportion of blue-eyed persons is the same in the two populations at 5% level of significance.
$$SE(S^2) = \sqrt{Var(S^2)} = \sqrt{\frac{2}{n}}\,\sigma^2 \quad \ldots (10)$$
The general procedure of this test is explained in the next page.
As in all the tests so far, the first step in a hypothesis testing problem is to set up the null and alternative hypotheses. Here, we want to test the hypothesis about a specified value $\sigma_0^2$ of the population variance $\sigma^2$, so we can take our null and alternative hypotheses as
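and
$$Var(S_1^2 - S_2^2) = Var(S_1^2) + Var(S_2^2) = \frac{2\sigma_1^4}{n_1} + \frac{2\sigma_2^4}{n_2}$$
But we know that standard error = $\sqrt{\text{Variance}}$, so
$$SE(S_1^2 - S_2^2) = \sqrt{Var(S_1^2 - S_2^2)} = \sqrt{\frac{2\sigma_1^4}{n_1} + \frac{2\sigma_2^4}{n_2}} \quad \ldots (12)$$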
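Now, follow the same procedure as discussed in Section 10.2, that is, first of all we have to set up the null and alternative hypotheses. Here, we want to test the hypothesis about the two population variances, so we can take our null and alternative hypotheses as
$$H_0: \sigma_1^2 = \sigma_2^2 = \sigma^2 \quad \text{and} \quad H_1: \sigma_1^2 \ne \sigma_2^2 \quad \text{for two-tailed test}$$
$$H_0: \sigma_1^2 \le \sigma_2^2 \quad \text{and} \quad H_1: \sigma_1^2 > \sigma_2^2$$
or
$$H_0: \sigma_1^2 \ge \sigma_2^2 \quad \text{and} \quad H_1: \sigma_1^2 < \sigma_2^2 \quad \text{for one-tailed test}$$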
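or
$$Z = \frac{(S_1^2 - S_2^2) - (\sigma_1^2 - \sigma_2^2)}{\sqrt{\dfrac{2\sigma_1^4}{n_1} + \dfrac{2\sigma_2^4}{n_2}}} \quad \text{using equations (11) and (12)}$$
Since under the null hypothesis we assume that $\sigma_1^2 = \sigma_2^2 = \sigma^2$, the term $\sigma_1^2 - \sigma_2^2$ in the numerator vanishes and the denominator becomes $\sqrt{2\sigma^4/n_1 + 2\sigma^4/n_2}$. Therefore, we have
$$Z = \frac{S_1^2 - S_2^2}{\sigma^2\sqrt{2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$$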
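Generally, the population variances $\sigma_1^2$ and $\sigma_2^2$ are unknown, so we estimate them by their corresponding sample variances $S_1^2$ and $S_2^2$ as
$$\hat{\sigma}_1^2 = S_1^2 \quad \text{and} \quad \hat{\sigma}_2^2 = S_2^2$$
Thus, the test statistic Z is given by
$$Z = \frac{S_1^2 - S_2^2}{\sqrt{\dfrac{2S_1^4}{n_1} + \dfrac{2S_2^4}{n_2}}} \sim N(0, 1)$$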
After that, we calculate the value of test statistic as may be the case and
compare it with the critical value given in Table 10.1 at prefixed level of
significance α. Take the decision about the null hypothesis as described in
Section 10.2.
Note 3: When populations under study are normal then for testing the
hypothesis about equality of population variances we use F- test which will be
discussed in Unit 12 of this course. Whereas when the form of the populations
under study is not known and sample sizes are large then we apply Z-test as
discussed above.
Now, it is time to do an example based on above test.
Example 10: A comparative study of variation in weights (in pounds) of Army-soldiers and Navy-sailors was made. The sample variance of the weight of 120 soldiers was 60 pound² and the sample variance of the weight of 160 sailors was 70 pound². Test whether the soldiers and sailors have equal variation in their weights. Use 5% level of significance.
We want to test whether the Army-soldiers and Navy-sailors have equal variation in their weights. If $\sigma_1^2$ and $\sigma_2^2$ denote the variances in the weights of Army-soldiers and Navy-sailors respectively, then our claim is $\sigma_1^2 = \sigma_2^2$ and its complement is $\sigma_1^2 \ne \sigma_2^2$. Since the claim contains the equality sign, we take the claim as the null hypothesis and the complement as the alternative hypothesis. Thus,
$$H_0: \sigma_1^2 = \sigma_2^2 \quad \text{and} \quad H_1: \sigma_1^2 \ne \sigma_2^2$$
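Since the population variances are unknown, for testing the null hypothesis the test statistic Z is given by
$$Z = \frac{S_1^2 - S_2^2}{\sqrt{\dfrac{2S_1^4}{n_1} + \dfrac{2S_2^4}{n_2}}} = \frac{60 - 70}{\sqrt{\dfrac{2(60)^2}{120} + \dfrac{2(70)^2}{160}}} = \frac{-10}{\sqrt{60.0 + 61.25}} = \frac{-10}{11.01} = -0.91$$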
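The arithmetic of this test statistic can be checked with a short Python sketch. The function name `two_variance_z` is an illustrative assumption; it implements the large-sample statistic with the population variances replaced by the sample variances, as in the formula above.

```python
from math import sqrt

def two_variance_z(s1_sq, n1, s2_sq, n2):
    """Large-sample Z statistic for H0: sigma1^2 = sigma2^2.

    s1_sq and s2_sq are the sample variances (not standard deviations).
    """
    se = sqrt(2 * s1_sq**2 / n1 + 2 * s2_sq**2 / n2)
    return (s1_sq - s2_sq) / se

# Example 10: 120 soldiers with variance 60, 160 sailors with variance 70
z = two_variance_z(60, 120, 70, 160)
print(round(z, 2))  # prints -0.91
```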
E13) Two sources of raw materials of bulbs are under consideration by a bulb manufacturing company. Both sources seem to have similar characteristics but the company is not sure about their respective uniformity. A sample of 52 lots from source A yields a variance of 25 and a sample of 40 lots from source B yields a variance of 12. Test whether the variance of source A differs significantly from the variance of source B at α = 0.05.
We now end this unit by giving a summary of what we have covered in it.
10.9 SUMMARY
In this unit we have covered the following points:
1. How to judge whether a given situation calls for a large sample test or not.
2. Applying the Z-test for testing the hypothesis about the population mean
and difference of two population means.
3. Applying the Z-test for testing the hypothesis about the population
proportion and difference of two population proportions.
4. Applying the Z-test for testing the hypothesis about the population variance
and two population variances.
$$\text{p-value} = 2P[Z \ge |z|] = 2P[Z \ge 2.42]$$
$$n_2 = 200, \quad \bar{Y} = 1280, \quad S_2 = 46$$
Here, we want to test whether there is a significant difference in the mean duration of lives of the two brands of electric bulbs. If $\mu_1$ and $\mu_2$ denote the mean lives of the two brands of electric bulbs respectively, then our claim is $\mu_1 \ne \mu_2$ and its complement is $\mu_1 = \mu_2$. Since the complement contains the equality sign, we take the complement as the null hypothesis and the claim as the alternative hypothesis. Thus,
$$H_0: \mu_1 = \mu_2 \quad \text{and} \quad H_1: \mu_1 \ne \mu_2$$
Since the alternative hypothesis is two-tailed so the test is two-tailed
test.
We want to test the null hypothesis regarding equality of two population means. The standard deviations of both populations are unknown, so we would go for the t-test if the populations were known to be normal. But that is not the case here. Since the sample sizes are large ($n_1$ and $n_2 > 30$), we go for the Z-test.
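So for testing the null hypothesis, the test statistic Z is given by
$$Z = \frac{\bar{X} - \bar{Y}}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{1300 - 1280}{\sqrt{\dfrac{(41)^2}{200} + \dfrac{(46)^2}{200}}} = \frac{20}{\sqrt{8.41 + 10.58}} = \frac{20}{4.36} = 4.59$$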
The critical (tabulated) values for two-tailed test at 1% level of
significance are ± zα/2 = ± z0.005 = ± 2.58.
Since the calculated value of the test statistic Z (= 4.59) is greater than the critical values (= ± 2.58), it lies in the rejection region, so we reject the null hypothesis and support the alternative hypothesis, i.e. we support the claim at 1% level of significance.
Thus, we conclude that samples do not provide us sufficient evidence
against the claim so there is significant difference in the mean duration
of their lives of two brands of electric bulbs.
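The calculation in this solution can be reproduced with a short Python sketch. The function name `two_mean_z` is an illustrative assumption; the inputs are the sample means, sample standard deviations and sample sizes appearing in the computation above.

```python
from math import sqrt
from statistics import NormalDist

def two_mean_z(xbar, s1, n1, ybar, s2, n2):
    """Large-sample Z statistic for H0: mu1 = mu2.

    s1 and s2 are sample standard deviations used in place of the
    unknown population standard deviations.
    """
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar - ybar) / se

z = two_mean_z(1300, 41, 200, 1280, 46, 200)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed p-value
print(round(z, 2))  # prints 4.59
```

The two-tailed p-value obtained this way is far below 0.01, which agrees with the decision reached by comparing Z with ± 2.58.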
E7) Given that
$$n_1 = 50, \quad \bar{X} = 8.3, \quad S_1 = 1.2;$$
Since the alternative hypothesis is right-tailed, the test is a right-tailed test.
Before proceeding further, we first have to check whether the condition of normality is met.
$$np = 200 \times 0.9 = 180 > 5$$
$$Z = \frac{(7)^2 - (6)^2}{(6)^2\sqrt{\dfrac{2}{120}}} = \frac{13}{36 \times 0.129} = \frac{13}{4.64} = 2.80$$
The critical values for two-tailed test at 5% level of significance are
± zα/2 = ± z0.025 = ±1.96.
Since calculated value of Z (= 2.8) is greater than critical values
(= ±1.96), that means it lies in rejection region, so we reject the null
hypothesis i.e. we reject our claim at 5% level of significance.
Thus, we conclude that the sample provides us sufficient evidence against the claim so the standard deviation of the life of bulbs of the lot is not 6.0 hours.
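The test statistic of this solution can be checked with the following Python sketch. The function name `one_variance_z` is an illustrative assumption, and the inputs (sample standard deviation 7, hypothesised standard deviation 6, sample size 120) are read off the arithmetic of the solution rather than from the statement of the exercise.

```python
from math import sqrt

def one_variance_z(s_sq, sigma0_sq, n):
    """Large-sample Z statistic for H0: sigma^2 = sigma0^2.

    Uses SE(S^2) = sigma0^2 * sqrt(2 / n), evaluated under H0.
    """
    se = sigma0_sq * sqrt(2 / n)
    return (s_sq - sigma0_sq) / se

# Values assumed from the computation in the solution: S = 7, sigma0 = 6, n = 120
z = one_variance_z(7**2, 6**2, 120)
print(round(z, 2))  # prints 2.8
```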
E13) Here, we are given that
$$n_1 = 52, \quad S_1^2 = 25$$
$$n_2 = 40, \quad S_2^2 = 12$$
Here, we want to test whether the variance of source A differs significantly from the variance of source B. If $\sigma_1^2$ and $\sigma_2^2$ denote the variances in the raw materials of sources A and B respectively, then our claim is $\sigma_1^2 \ne \sigma_2^2$ and its complement is $\sigma_1^2 = \sigma_2^2$. Since the complement contains the equality sign, we take the complement as the null hypothesis and the claim as the alternative hypothesis. Thus,
$$H_0: \sigma_1^2 = \sigma_2^2 \quad \text{and} \quad H_1: \sigma_1^2 \ne \sigma_2^2$$
Since the alternative hypothesis is two-tailed so the test is two-tailed
test.
Here, the distributions of the populations under study are not known and the sample sizes are large ($n_1 = 52 > 30$, $n_2 = 40 > 30$), so we can go for the Z-test.
Since the population variances are unknown, for testing the null hypothesis the test statistic Z is given by
$$Z = \frac{S_1^2 - S_2^2}{\sqrt{\dfrac{2S_1^4}{n_1} + \dfrac{2S_2^4}{n_2}}} = \frac{25 - 12}{\sqrt{\dfrac{2(25)^2}{52} + \dfrac{2(12)^2}{40}}} = \frac{13}{\sqrt{24.04 + 7.20}} = \frac{13}{5.59} = 2.33$$
The critical values for two-tailed test at 5% level of significance are
± zα/2 = ± z0.025 = ±1.96.
Since the calculated value of Z (= 2.33) is greater than the critical values (= ±1.96), it lies in the rejection region, so we reject the null hypothesis and support the alternative hypothesis, i.e. we support our claim at 5% level of significance.
Thus, we conclude that the samples fail to provide us sufficient evidence against the claim so the variance of source A differs significantly from the variance of source B.