
Testing of Hypothesis

This document discusses hypothesis testing and sampling distributions. It defines two types of hypotheses: the null hypothesis and alternative hypothesis. It also discusses type I and type II errors. Several statistical tests are covered, including tests of single proportions, differences of proportions, single means, differences of means, and differences of standard deviations. Large sample tests are presented for these cases using the z-distribution.

To reach decisions about populations on the basis of sample information, we make
certain assumptions about the populations involved. Such assumptions, which may or
may not be true, are called statistical hypotheses.

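There are two types of hypotheses. They are

Null Hypothesis : H0 : There is no significant difference between the population parameter and the sample statistic
Alternate Hypothesis : H1 : There is a significant difference between the population parameter and the sample statistic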

Note : Unless stated otherwise, 'hypothesis' here means the null hypothesis.

In accepting or rejecting a hypothesis, we may commit one of two types of error.
Type I error : Rejecting a hypothesis when it should be accepted (that is, when it is true).
Type II error : Accepting a hypothesis when it should be rejected (that is, when it is false).

Note : For any test of hypothesis or rule of decision to be good, it must be
designed so as to minimize the errors of decision.

Level of significance :
In testing a hypothesis, the maximum probability with which we are willing
to risk a type I error is called the level of significance of the test.

For example, if a 5% level of significance is chosen in designing a test of hypothesis,
then there are about 5 chances in 100 that we would reject the hypothesis when it
should be accepted; that is, we are about 95% confident that we have made the right
decision.
Chi square test for goodness of fit :
We want to test whether the deviations of the observed data from the expected data are
significant or not.
The test is also used to check how well a set of observations fits a given distribution;
that is, it provides a test of goodness of fit and may be used to examine the validity
of a hypothesis about the observed data.
As a test of goodness of fit, it can be used to study the correspondence between
theoretical data and observed data.

Procedure to test goodness of fit :

(i) Set up a ‘null hypothesis’ and calculate
    $\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$
where Oi’s are observed values and Ei’s are expected or theoretical
values and $\sum O_i = \sum E_i$.
(ii) Find the number of degrees of freedom (d.f.) and take the corresponding
value of χ2 at a prescribed level of significance from the χ2 test for
goodness of fit table.
(iii) If calculated value is less than the table value, then accept the hypothesis
otherwise reject the hypothesis.

To decide the number of d.f. :

If any theoretical (or expected) frequency is less than 5, we use the technique
of ‘pooling’: each frequency less than 5 is added to the preceding or succeeding
frequency so that the resulting theoretical (or expected) frequency is greater
than or equal to 5. The corresponding observed frequencies are pooled in the same way.
Let ‘n’ be the number of theoretical (or expected) frequencies after pooling.
Now, the number of d.f. = n – 1 – number of relations used to get theoretical (or
expected) frequencies
Ex : 15,000 random numbers were taken from a table and the following
frequencies of each digit were obtained.
Digit 0 1 2 3 4 5 6 7 8 9
Frequency 1493 1441 1461 1552 1494 1454 1613 1491 1482 1519
Test the hypothesis that each digit had an equal chance of appearance.
Solution :
Hypothesis : Each digit had an equal chance of appearance.
Here the given frequencies are observed frequencies (Oi’s).
$\sum O_i = 15000$

By hypothesis, each of the 10 digits should appear 15000/10 = 1500 times. These are the
expected frequencies (Ei’s).

Digit                        0     1     2     3     4     5     6     7     8     9
Observed frequency (Oi)   1493  1441  1461  1552  1494  1454  1613  1491  1482  1519
Expected frequency (Ei)   1500  1500  1500  1500  1500  1500  1500  1500  1500  1500

$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{23442}{1500} = 15.628$

Here n = 10, no. of d.f. = n - 1 = 9

The table value of χ2 for 9 d.f. at 5% level of significance = 16.919
Since the calculated value of χ2 is less than the table value of χ2, the hypothesis is
accepted at 5% level of significance.
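The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `chi_square_statistic` is an illustrative choice and does not come from the notes, and the table value 16.919 is the one quoted in the text.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Observed digit frequencies from the example (digits 0 to 9).
observed = [1493, 1441, 1461, 1552, 1494, 1454, 1613, 1491, 1482, 1519]
# Under the hypothesis each digit is expected 15000 / 10 = 1500 times.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
table_value = 16.919  # chi-square table value for 9 d.f. at 5% level (from the text)
accepted = chi2 < table_value
print(round(chi2, 3), accepted)
```

The same function can be reused for any goodness-of-fit problem once the expected frequencies are known.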

Ex : A theory predicts that a set of data values is distributed in four groups
A,B,C,D in the ratio 9:3:3:1. In an experiment, the data values observed in the
groups A,B,C,D are 882, 313, 287, 118. Does the experimental result support the
theory?
(The table value of χ2 for 3 d.f. at 5% level of significance = 7.815)
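A worked sketch of this exercise, following the procedure above: the total number of observations is 882 + 313 + 287 + 118 = 1600, so the ratio 9:3:3:1 gives expected frequencies 900, 300, 300, 100.

```latex
\chi^2 = \frac{(882-900)^2}{900} + \frac{(313-300)^2}{300}
       + \frac{(287-300)^2}{300} + \frac{(118-100)^2}{100}
       = 0.36 + 0.5633 + 0.5633 + 3.24 \approx 4.727
```

Here n = 4 and no. of d.f. = 4 - 1 = 3. Since 4.727 is less than the table value 7.815, the hypothesis is accepted at 5% level of significance; that is, the experimental result supports the theory.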
Ex : Fit the Poisson distribution to the following data.
x            0    1    2    3    4    5    6    7    8
Frequency   56  156  132   92   37   22    4    0    1
Is the fit good?
Solution : Here the given frequencies are observed frequencies (Oi’s).
$\sum O_i = 500$

The mean of the data is $\sum x_i O_i / \sum O_i = 986/500 = 1.972$, which is taken as the
parameter of the Poisson distribution. Fitting the Poisson distribution, we get the
following theoretical frequencies.

x                              0    1    2    3    4    5    6    7    8
Observed frequency (Oi)       56  156  132   92   37   22    4    0    1
Theoretical frequency (Ei)    70  137  135   89   44   17    6    2    0

The theoretical frequencies for x = 7 and x = 8 are less than 5, so the classes
x = 6, 7, 8 are pooled. After pooling, the data can be written as

x                              0    1    2    3    4    5  6-8
Observed frequency (Oi)       56  156  132   92   37   22    5
Theoretical frequency (Ei)    70  137  135   89   44   17    8

After pooling, n = the number of classes = 7

No. of d.f. = n – 1 – number of relations used to get theoretical frequencies
= 7 – 1 – 1 = 5
(Rest is left as an exercise)
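The remaining steps can be sketched in Python as follows. This is a hedged illustration, not the author's worked solution: the helper names `poisson_expected` and `pool_tail` are hypothetical, the Poisson parameter is assumed to be the sample mean, and pooling is done only from the upper tail, which is enough for this data set.

```python
import math


def poisson_expected(observed):
    """Fit a Poisson distribution by its mean; return rounded expected frequencies."""
    total = sum(observed)
    mean = sum(x * f for x, f in enumerate(observed)) / total
    return [round(total * math.exp(-mean) * mean ** x / math.factorial(x))
            for x in range(len(observed))]


def pool_tail(observed, expected, minimum=5):
    """Merge trailing classes into the preceding class until the last
    expected frequency is at least `minimum` (tail pooling only)."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_e, last_o = exp.pop(), obs.pop()
        exp[-1] += last_e
        obs[-1] += last_o
    return obs, exp


observed = [56, 156, 132, 92, 37, 22, 4, 0, 1]  # frequencies for x = 0..8
expected = poisson_expected(observed)
obs_pooled, exp_pooled = pool_tail(observed, expected)

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(exp_pooled) - 1 - 1  # one relation used: the mean estimated from the data
print(expected, obs_pooled, exp_pooled, df, round(chi2, 3))
```

The calculated value is then compared with the χ2 table value for 5 d.f. at the chosen level of significance (11.070 at the 5% level); a calculated value below the table value means the fit is accepted.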
Sampling distribution :
Consider a sample of size ‘n’ drawn at random from a given population.
We want to test a hypothesis with respect to a statistic of this sample.
If n ≥ 30, then the sample is considered to be a large sample, otherwise it is
considered to be a small sample.

Large samples :

Test for single proportion :


Let ‘n’ be the size of the sample.
Let ‘p’ be the population proportion assumed under the null hypothesis.
Let ‘x’ be the number of observations in the sample that have the attribute to which ‘p’ refers.
Calculate $z = \frac{x - np}{\sqrt{npq}}$, where q = 1 – p.
Here z is a standard normal variable.
If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.
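As an illustration of the test for a single proportion, here is a minimal Python sketch. The function name and the data (60 observations with the attribute out of 100, tested against p = 0.5) are hypothetical and chosen only to make the arithmetic easy to follow; 1.96 and 2.58 are the critical values used in the text.

```python
import math


def z_single_proportion(x, n, p):
    """z = (x - n*p) / sqrt(n*p*q), where q = 1 - p."""
    q = 1.0 - p
    return (x - n * p) / math.sqrt(n * p * q)


# Hypothetical data: x = 60 out of n = 100, null hypothesis p = 0.5.
z = z_single_proportion(x=60, n=100, p=0.5)

accepted_at_5_percent = abs(z) < 1.96
accepted_at_1_percent = abs(z) < 2.58
print(z, accepted_at_5_percent, accepted_at_1_percent)
```

With these numbers z = 2, so the hypothesis would be rejected at the 5% level but accepted at the 1% level.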
Test of significance for difference of proportions :
Here two samples are drawn from the same population or from different
populations.
Here null hypothesis is that there is no significant difference between the proportions of the two samples.
Let ‘n1’ and ‘n2’ be the sizes of sample 1 and sample 2 respectively.
Let ‘p1’ and ‘p2’ be the corresponding sample proportions.
Then calculate

$\hat{P} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$
$\hat{Q} = 1 - \hat{P}$
$z = \frac{p_1 - p_2}{\sqrt{\hat{P}\hat{Q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Here z is a standard normal variable.
If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.
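The pooled-proportion calculation can be sketched in Python as below. The function name and the sample figures (proportions 0.6 and 0.4 in two samples of 100) are assumptions made for illustration only.

```python
import math


def z_two_proportions(p1, n1, p2, n2):
    """z for the difference of two sample proportions using the pooled estimate."""
    p_hat = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion
    q_hat = 1.0 - p_hat
    return (p1 - p2) / math.sqrt(p_hat * q_hat * (1.0 / n1 + 1.0 / n2))


# Hypothetical samples: 60% of 100 and 40% of 100.
z = z_two_proportions(p1=0.6, n1=100, p2=0.4, n2=100)
print(round(z, 4))
```

Here the pooled proportion is 0.5, so the denominator is sqrt(0.25 × 0.02), and |z| is then compared with 1.96 and 2.58 as described above.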
Test for single mean :
Let ‘μ’ be the mean of the population.
Let ‘n’ be the size of the sample and $\bar{x}$ be the mean of the sample.
Let σ be the s.d. of the sample or population.
Here null hypothesis is
Sample has been drawn from the population
( or )
There is no significant difference between the sample and population means.

Calculate $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
Here z is a standard normal variable.
If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.

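Note :
95% confidence limits for μ are $\bar{x} \pm 1.96 \frac{\sigma}{\sqrt{n}}$
99% confidence limits for μ are $\bar{x} \pm 2.58 \frac{\sigma}{\sqrt{n}}$
Note :
If population size is N, then
95% confidence limits for μ are $\bar{x} \pm 1.96 \sqrt{\frac{N-n}{N-1}} \frac{\sigma}{\sqrt{n}}$
99% confidence limits for μ are $\bar{x} \pm 2.58 \sqrt{\frac{N-n}{N-1}} \frac{\sigma}{\sqrt{n}}$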
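A short Python sketch of the single-mean test and the confidence limits follows. The function names, the optional `population_size` argument, and the numbers (sample mean 52, population mean 50, s.d. 10, n = 100) are hypothetical and serve only as an illustration.

```python
import math


def z_single_mean(xbar, mu, sigma, n):
    """z = (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / math.sqrt(n))


def confidence_limits(xbar, sigma, n, z_value=1.96, population_size=None):
    """Return (lower, upper) limits for mu; z_value is 1.96 for 95%, 2.58 for 99%.

    If population_size (N) is given, the factor sqrt((N - n) / (N - 1)) is applied.
    """
    half_width = z_value * sigma / math.sqrt(n)
    if population_size is not None:
        half_width *= math.sqrt((population_size - n) / (population_size - 1))
    return xbar - half_width, xbar + half_width


z = z_single_mean(xbar=52.0, mu=50.0, sigma=10.0, n=100)
low, high = confidence_limits(xbar=52.0, sigma=10.0, n=100)
print(z, low, high)
```

With these values the standard error is 10/√100 = 1, so z = 2 and the 95% limits are 52 ± 1.96.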
Test for difference of means :
Here we consider two samples.
Sample 1 : Size = n1, Mean = $\bar{x}_1$, s.d. = σ1
Sample 2 : Size = n2, Mean = $\bar{x}_2$, s.d. = σ2
Here null hypothesis is
Both the samples are drawn from the same population
( or )
Samples are drawn from different populations which have the same mean
( or )
The samples are drawn from different populations which have insignificant
difference as far as the means are concerned.

Calculate $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

Here z is a standard normal variable.


If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.
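The large-sample test for the difference of means can be sketched in Python as follows; the function name and the sample values are hypothetical.

```python
import math


def z_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """z = (mean1 - mean2) / sqrt(sd1^2 / n1 + sd2^2 / n2)."""
    return (mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


# Hypothetical samples: means 52 and 50, both with s.d. 10 and size 100.
z = z_two_means(mean1=52.0, sd1=10.0, n1=100, mean2=50.0, sd2=10.0, n2=100)
accepted_at_5_percent = abs(z) < 1.96
print(round(z, 4), accepted_at_5_percent)
```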
Test of significance for difference of s.d.’s :
Let s1 and s2 be the s.d.’s of two independent random samples of sizes n1 and n2
from two populations with s.d.’s σ1 and σ2 respectively.

Here null hypothesis is


Samples are drawn from different populations which have the same variance
(if σ1 = σ2)
( or )
The samples are drawn from different populations which have insignificant
difference as far as the s.d.’s are concerned.

Calculate $z = \frac{s_1 - s_2}{\sqrt{\frac{\sigma_1^2}{2n_1} + \frac{\sigma_2^2}{2n_2}}}$

(or) $z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}}$ if s.d.’s of populations are not given.

Here z is a standard normal variable.


If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.
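A minimal Python sketch of the test for the difference of s.d.'s is given below, for the case where the population s.d.'s are not given and the sample values are used in the denominator. The function name and the figures are hypothetical.

```python
import math


def z_two_sds(s1, n1, s2, n2):
    """z = (s1 - s2) / sqrt(s1^2 / (2*n1) + s2^2 / (2*n2)), using sample s.d.'s."""
    return (s1 - s2) / math.sqrt(s1 ** 2 / (2 * n1) + s2 ** 2 / (2 * n2))


# Hypothetical samples: s.d. 12 and 10, both of size 200.
z = z_two_sds(s1=12.0, n1=200, s2=10.0, n2=200)
print(round(z, 4))
```

If the population s.d.'s σ1 and σ2 are known, they would replace s1 and s2 in the denominator only.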
Small samples :

Test for single mean :


Population : Mean = μ
Sample : Size = n, data values are x1, x2,…, xn
Here null hypothesis is
The sample has been drawn from the population
( or )
There is no significant difference between population mean and sample mean

Calculate $t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$, where $\bar{x} = \frac{\sum x_i}{n}$, $S = \sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$
Here S is the s.d. of the sample.
Take the corresponding value of ‘t’ at a prescribed level of significance for
(n-1) d.f. from the t- test table.
If calculated value of | t | < table value of t, then hypothesis is accepted otherwise it
is rejected.
Confidence limits
95% confidence limits for population mean are $\bar{x} \pm (\text{table value of } t_{0.05}) \frac{S}{\sqrt{n}}$.
99% confidence limits for population mean are $\bar{x} \pm (\text{table value of } t_{0.01}) \frac{S}{\sqrt{n}}$.
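The small-sample test for a single mean can be sketched with Python's standard `statistics` module, whose `stdev` uses the divisor n - 1 as in the formula for S. The function name and the data are hypothetical; the table value of t must still be read from a t table for the returned degrees of freedom.

```python
import math
import statistics


def t_single_mean(data, mu):
    """Return (t, degrees of freedom) for testing the sample mean against mu."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)  # sample s.d. with divisor n - 1
    return (xbar - mu) / (s / math.sqrt(n)), n - 1


# Hypothetical sample of 8 values, tested against a population mean of 4.
data = [2, 4, 4, 4, 5, 5, 7, 9]
t, df = t_single_mean(data, mu=4)

table_t = 2.365  # two-sided t table value for 7 d.f. at 5% level
accepted = abs(t) < table_t
print(round(t, 4), df, accepted)
```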
Test for difference of means :

Sample 1 : Size = n1, data values are $x_1, x_2, \ldots, x_{n_1}$

Sample 2 : Size = n2, data values are $y_1, y_2, \ldots, y_{n_2}$

Here null hypothesis is


Both the samples are drawn from the same population
( or )
No significant difference between sample means
( or )
Both the samples are drawn from different populations which have the same mean
( or )
Both the samples are drawn from different populations for which difference of
means is insignificant.

Calculate $t = \frac{\bar{x} - \bar{y}}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$

where $\bar{x} = \frac{1}{n_1} \sum_{i=1}^{n_1} x_i$, $\bar{y} = \frac{1}{n_2} \sum_{j=1}^{n_2} y_j$, $S = \sqrt{\frac{1}{n_1 + n_2 - 2} \left[ \sum_{i=1}^{n_1} (x_i - \bar{x})^2 + \sum_{j=1}^{n_2} (y_j - \bar{y})^2 \right]}$
Take the corresponding value of ‘t’ at a prescribed level of significance for
(n1+n2 - 2) d.f. from the t- test table.
If calculated value of | t | < table value of t, then hypothesis is accepted otherwise it
is rejected.
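A minimal Python sketch of the small-sample test for the difference of means is shown below. It uses the standard pooled estimate of the s.d., with divisor n1 + n2 - 2 and the same number of degrees of freedom; the function name and the two samples are hypothetical.

```python
import math
import statistics


def t_two_means(xs, ys):
    """Return (t, degrees of freedom) using the pooled s.d. of two small samples."""
    n1, n2 = len(xs), len(ys)
    xbar, ybar = statistics.mean(xs), statistics.mean(ys)
    sum_sq = sum((x - xbar) ** 2 for x in xs) + sum((y - ybar) ** 2 for y in ys)
    df = n1 + n2 - 2
    s = math.sqrt(sum_sq / df)  # pooled s.d.
    t = (xbar - ybar) / (s * math.sqrt(1.0 / n1 + 1.0 / n2))
    return t, df


# Hypothetical samples.
xs = [1, 2, 3, 4, 5]
ys = [3, 4, 5, 6, 7]
t, df = t_two_means(xs, ys)
print(round(t, 4), df)
```

For these samples the pooled variance is 20/8 = 2.5, and |t| is compared with the t table value for 8 d.f.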
χ2 test for the population variance :

Population : s.d. = σ or variance = σ2


Sample : Size = n, data values are x1, x2,…, xn

Here null hypothesis is


The sample has been drawn from the population
( or )
There is no significant difference between population variance and sample
variance.

Calculate $\chi^2 = \frac{\sum (x_i - \bar{x})^2}{\sigma^2}$
Take the corresponding value of χ2 at a prescribed level of significance for
(n-1) d.f. from the χ2 test table.
If calculated value is less than the table value of χ2, then hypothesis is accepted
otherwise it is rejected.
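The χ2 test for the population variance can be sketched in Python as follows. The function name, the sample, and the assumed population s.d. of 2 are hypothetical.

```python
import statistics


def chi_square_variance(data, sigma):
    """Return (chi-square, degrees of freedom) for testing the variance sigma^2."""
    xbar = statistics.mean(data)
    chi2 = sum((x - xbar) ** 2 for x in data) / sigma ** 2
    return chi2, len(data) - 1


# Hypothetical sample of 8 values and assumed population s.d. sigma = 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
chi2, df = chi_square_variance(data, sigma=2.0)

table_value = 14.067  # chi-square table value for 7 d.f. at 5% level
accepted = chi2 < table_value
print(chi2, df, accepted)
```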
F-test for equality of population variances :

Sample 1 : Size = n1, data values are $x_1, x_2, \ldots, x_{n_1}$

Sample 2 : Size = n2, data values are $y_1, y_2, \ldots, y_{n_2}$

Here null hypothesis is


Both the samples are drawn from the same population
( or )
No significant difference between sample variances
( or )
Both the samples are drawn from different populations which have the same
variance
( or )
Both the samples are drawn from different populations for which difference of
variances is insignificant.

Calculate $s_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_i - \bar{x})^2$, $s_2^2 = \frac{1}{n_2 - 1} \sum_{j=1}^{n_2} (y_j - \bar{y})^2$

$F = \begin{cases} s_1^2 / s_2^2, & \text{if } s_1^2 > s_2^2 \\ s_2^2 / s_1^2, & \text{if } s_2^2 > s_1^2 \end{cases}$

Here the larger variance is taken in the numerator, so that F ≥ 1.
Take the corresponding value of F at a prescribed level of significance for
(n1-1, n2-1) d.f. if $s_1^2 > s_2^2$ or (n2-1, n1-1) d.f. if $s_2^2 > s_1^2$ from the F test table.
If calculated value is less than the table value of F, then hypothesis is accepted
otherwise it is rejected.
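A short Python sketch of the F statistic follows. It puts the larger sample variance in the numerator, which is the convention that standard upper-tail F tables assume, and returns the degrees of freedom in the matching order. The function name and the samples are hypothetical; `statistics.variance` uses the divisor n - 1.

```python
import statistics


def f_statistic(xs, ys):
    """Return (F, (numerator d.f., denominator d.f.)) with the larger variance on top."""
    v1 = statistics.variance(xs)  # divisor n1 - 1
    v2 = statistics.variance(ys)  # divisor n2 - 1
    if v1 >= v2:
        return v1 / v2, (len(xs) - 1, len(ys) - 1)
    return v2 / v1, (len(ys) - 1, len(xs) - 1)


# Hypothetical samples.
xs = [1, 2, 3, 4, 5]          # variance 2.5
ys = [2, 4, 6, 8, 10, 12]     # variance 14
f_value, df = f_statistic(xs, ys)
print(f_value, df)
```

The calculated F is then compared with the F table value for the returned pair of degrees of freedom.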
Fisher’s Z-test for single correlation coefficient :

Population : correlation coefficient = r1


Sample : size = n, correlation coefficient = r2

Here null hypothesis is


The sample has been drawn from the population
( or )
There is no significant difference between population correlation coefficient and
sample correlation coefficient.

Calculate $z_1 = \frac{1}{2} \log_e \left( \frac{1 + r_1}{1 - r_1} \right)$, $z_2 = \frac{1}{2} \log_e \left( \frac{1 + r_2}{1 - r_2} \right)$

$z = \frac{z_1 - z_2}{1 / \sqrt{n - 3}}$

Here z is a standard normal variable.


If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.
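Fisher's transformation and the single-coefficient test can be sketched in Python as below. The function names and the values (population coefficient 0.5, sample coefficient 0.6, n = 28) are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's transformation: 0.5 * log_e((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_single_correlation(r_population, r_sample, n):
    """z = (z1 - z2) / (1 / sqrt(n - 3)), with z1 from the population coefficient."""
    standard_error = 1.0 / math.sqrt(n - 3)
    return (fisher_z(r_population) - fisher_z(r_sample)) / standard_error


# Hypothetical values: population r = 0.5, sample r = 0.6, sample size 28.
z = z_single_correlation(r_population=0.5, r_sample=0.6, n=28)
accepted_at_5_percent = abs(z) < 1.96
print(round(z, 4), accepted_at_5_percent)
```

With n = 28 the standard error is 1/√25 = 0.2, so the difference of the transformed values is simply multiplied by 5.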
Fisher’s Z-test for difference of correlation coefficients :

Sample 1 : Size = n1, correlation coefficient = r1


Sample 2 : Size = n2, correlation coefficient = r2

Here null hypothesis is


Both the samples are drawn from the same population
( or )
No significant difference between sample correlation coefficients
( or )
Both the samples are drawn from different populations which have the same
correlation coefficient
( or )
Both the samples are drawn from different populations for which difference of
correlation coefficients is insignificant.

Calculate $z_1 = \frac{1}{2} \log_e \left( \frac{1 + r_1}{1 - r_1} \right)$, $z_2 = \frac{1}{2} \log_e \left( \frac{1 + r_2}{1 - r_2} \right)$

$z = \frac{z_1 - z_2}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}$

Here z is a standard normal variable.


If |z| < 1.96, then the hypothesis is accepted at 5% level of significance.
If 1.96 < |z| < 2.58, then the hypothesis is rejected at 5% level but accepted at 1% level of significance.
If 2.58 < |z| < 3, then the hypothesis is rejected at 1% level of significance.
If |z| > 3, then the hypothesis is rejected.
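For two sample correlation coefficients, the same transformation is applied to each and the difference is divided by the combined standard error. The following Python sketch uses hypothetical names and values (r1 = 0.5 and r2 = 0.6, both from samples of size 28).

```python
import math


def fisher_z(r):
    """Fisher's transformation: 0.5 * log_e((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_two_correlations(r1, n1, r2, n2):
    """z = (z1 - z2) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / standard_error


# Hypothetical samples: r1 = 0.5 (n1 = 28) and r2 = 0.6 (n2 = 28).
z = z_two_correlations(r1=0.5, n1=28, r2=0.6, n2=28)
accepted_at_5_percent = abs(z) < 1.96
print(round(z, 4), accepted_at_5_percent)
```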
