

Script

Hypothesis – Large Sample Test & Chi square Test

1. Large sample Tests:


To test a given hypothesis, a random sample is drawn from the population. A sample of size 30 or more is generally regarded as a large sample. Large sample tests assume that the underlying distribution is normal. Even when it is not, the central limit theorem ensures that the sample statistic is approximately normally distributed for a large sample, so the same tests can still be applied.
For large sample tests, we follow these steps to test the given hypothesis:
Step 1: Identify the parameter and the statistic.

Step 2: Write the null and alternative hypotheses.

Step 3: Under the null hypothesis, calculate the test statistic using the formula
$z = \dfrac{\text{statistic} - \text{hypothesised value of the parameter}}{\text{SE of the statistic}}$

Step 4: Obtain the critical value for the given level of significance and compare it with the calculated value. If |calculated value| ≤ critical value, do not reject the null hypothesis; otherwise reject the null hypothesis.

Step 5: Write an appropriate conclusion.
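
Step 4 can be sketched in Python as follows. This is a minimal sketch, not part of the original script: the function names `critical_value` and `decide` are hypothetical, and for a one-tailed test it assumes the sign of z already points in the direction of the alternative hypothesis.

```python
from statistics import NormalDist

def critical_value(alpha: float, two_tailed: bool) -> float:
    """Critical value of the standard normal distribution for level alpha."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

def decide(z: float, alpha: float, two_tailed: bool) -> str:
    """Step 4: compare |z| with the critical value.

    Assumption: for a one-tailed test, z lies on the side named in H1.
    """
    if abs(z) <= critical_value(alpha, two_tailed):
        return "do not reject H0"
    return "reject H0"

print(round(critical_value(0.05, two_tailed=True), 3))   # two-tailed, 5%
print(round(critical_value(0.05, two_tailed=False), 3))  # one-tailed, 5%
print(round(critical_value(0.01, two_tailed=True), 3))   # two-tailed, 1%
```

The three printed values correspond to the table values commonly quoted as 1.96, 1.645 and about 2.58.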

2. Test for variables:


(i) Test of significance of a mean:
This test is used for testing a hypothesis about the population mean on the basis of a large sample taken from a normal population.
Let us understand this with an illustration.
Illustration 1:
It is claimed by the railway authority that a particular train has an average speed of 120 km per hour. During the last 100 trips the average speed was found to be 116 km per hour with a standard deviation of 15 km per hour. Is the claim justified?
Solution:
Step 1: Here $n = 100$, $\bar{x} = 116$, $S = 15$, $\mu_0 = 120$ and $\alpha = 0.05$.

Step 2: $H_0: \mu = 120$ vs $H_1: \mu < 120$

Step 3: Difference $= |\bar{x} - \mu_0| = 4$

$SE = \dfrac{S}{\sqrt{n}} = \dfrac{15}{\sqrt{100}} = 1.5$

Under $H_0$ the test statistic is $z = \dfrac{\text{Difference}}{SE} = \dfrac{4}{1.5} = 2.67$

Step 4: Since the given test is a one-tailed test, the critical value for the 5% level of significance, obtained from the tables of the normal distribution, is 1.645.

Since $|z| = 2.67 > 1.645$, the null hypothesis is rejected.

Step 5: At the 5% level of significance we reject the null hypothesis and conclude that the average speed of the train is less than 120 km per hour.
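
The arithmetic of Illustration 1 can be checked with a short Python sketch. The variable names are illustrative only, and the left-tailed alternative (mean below 120) is taken from the one-tailed wording and the conclusion above.

```python
import math

# Data of Illustration 1
n, x_bar, s, mu_0 = 100, 116.0, 15.0, 120.0

se = s / math.sqrt(n)      # standard error of the sample mean
z = (x_bar - mu_0) / se    # signed test statistic under H0
critical = 1.645           # one-tailed, 5% level (normal table)
reject = z < -critical     # left-tailed alternative: mu < 120

print(f"SE = {se:.2f}, z = {z:.2f}, reject H0: {reject}")
```

The signed statistic is negative here because the sample mean lies below the claimed value; its magnitude is what is compared with 1.645.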

(ii) Test of significance of the difference of two means:

This test is used for testing the hypothesis that two large samples are drawn from normal populations with the same mean (i.e. there is no significant difference between the two means).
Let us consider it with the help of an illustration.
Illustration 2: A factory produces electric motors on a large scale. A sample of 150 electric motors from the first shift shows an average life of 1400 hours with a standard deviation of 120 hours. Another sample of 200 motors from the second shift shows an average life of 1200 hours with a standard deviation of 80 hours. Is the difference between the two average lives significant?
Solution:

Step 1: Here $n_1 = 150$, $n_2 = 200$, $\bar{x}_1 = 1400$, $\bar{x}_2 = 1200$, $S_1 = 120$, $S_2 = 80$ and $\alpha = 0.05$.

Step 2: $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$

Step 3: Difference $= \bar{x}_1 - \bar{x}_2 = 200$

$SE = \sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}} = \sqrt{\dfrac{120^2}{150} + \dfrac{80^2}{200}} = 11.31$

Under $H_0$ the test statistic is $z = \dfrac{\text{Difference}}{SE} = \dfrac{200}{11.31} = 17.68$

Step 4: Since the given test is a two-tailed test, the critical value for the 5% level of significance, obtained from the tables of the normal distribution, is 1.96.

Since $|z| = 17.68 > 1.96$, the null hypothesis is rejected.

Step 5: At the 5% level of significance we reject the null hypothesis and conclude that the average lives of electric motors in the two shifts differ significantly.
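
A minimal Python sketch of the two-means calculation in Illustration 2, assuming the standard error of the difference is the square root of the sum of the two variance terms; the helper name `two_mean_z` is hypothetical.

```python
import math

def two_mean_z(x1, s1, n1, x2, s2, n2):
    """Return (z, SE) for the large-sample test of H0: mu1 = mu2."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # SE of the difference of means
    return (x1 - x2) / se, se

z, se = two_mean_z(1400, 120, 150, 1200, 80, 200)  # Illustration 2 data
critical = 1.96                                     # two-tailed, 5% level
print(f"SE = {se:.2f}, z = {z:.2f}, reject H0: {abs(z) > critical}")
```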
(iii) Test of significance of the difference of two variances or standard deviations:
This test is used for testing the hypothesis that two large samples are drawn from normal populations with the same variance (standard deviation), i.e. there is no significant difference between the two variances.
Let us consider an illustration.
Illustration 3: Information regarding the marks obtained by boys and girls in an examination is as under:

Gender   Sample size   Mean marks   Standard deviation of marks
Boy      121           83           10
Girl     81            81           12

Is the difference in the standard deviations of the two groups significant?

Solution:
Step 1: Here $n_1 = 121$, $n_2 = 81$, $\bar{x}_1 = 83$, $\bar{x}_2 = 81$, $S_1 = 10$, $S_2 = 12$ and $\alpha = 0.05$.

Step 2: $H_0: \sigma_1 = \sigma_2$ vs $H_1: \sigma_1 \neq \sigma_2$

Step 3: Difference $= |S_1 - S_2| = 2$

$SE = \sqrt{\dfrac{S_1^2}{2n_1} + \dfrac{S_2^2}{2n_2}} = \sqrt{\dfrac{10^2}{242} + \dfrac{12^2}{162}} = 1.14$

Under $H_0$ the test statistic is $z = \dfrac{\text{Difference}}{SE} = \dfrac{2}{1.14} = 1.75$

Step 4: Since the given test is a two-tailed test, the critical value for the 5% level of significance, obtained from the tables of the normal distribution, is 1.96.

Since $|z| = 1.75 < 1.96$, the null hypothesis is not rejected.

Step 5: At the 5% level of significance we do not reject the null hypothesis and conclude that the difference in variation between the marks of boys and girls is insignificant.
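
The standard-deviation test of Illustration 3 can be sketched in the same way. The helper name `two_sd_z` is hypothetical, and the standard error uses the large-sample form with 2n in each denominator, as in Step 3.

```python
import math

def two_sd_z(s1, n1, s2, n2):
    """Return (z, SE) for the large-sample test of H0: sigma1 = sigma2."""
    se = math.sqrt(s1**2 / (2 * n1) + s2**2 / (2 * n2))
    return (s1 - s2) / se, se

z, se = two_sd_z(10, 121, 12, 81)  # boys first, girls second
critical = 1.96                    # two-tailed, 5% level
print(f"SE = {se:.2f}, |z| = {abs(z):.2f}, reject H0: {abs(z) > critical}")
```

The signed z is negative because the boys' standard deviation is the smaller one; only its magnitude matters in a two-tailed test.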

3. Tests for proportion:


(i) Test of significance of proportion:
This test is used for testing the hypothesis that the population proportion has a specified value.
Let us consider an illustration.
Illustration 4: In a large consignment of commodities, 64 out of 400 are found defective. Test the hypothesis at the 1% level of significance that the proportion of defective items is 20%.
Solution:
Step 1: Here $n = 400$, $x = 64$, $p = \dfrac{x}{n} = 0.16$, $P = 0.20$, $Q = 1 - P = 0.80$ and $\alpha = 0.01$.

Step 2: $H_0: P = 0.20$ vs $H_1: P \neq 0.20$

Step 3: Difference $= |p - P| = 0.04$

$SE = \sqrt{\dfrac{PQ}{n}} = \sqrt{\dfrac{(0.20)(0.80)}{400}} = 0.02$

Under $H_0$ the test statistic is $z = \dfrac{\text{Difference}}{SE} = \dfrac{0.04}{0.02} = 2$

Step 4: Since the given test is a two-tailed test, the critical value for the 1% level of significance, obtained from the tables of the normal distribution, is 2.575.

Since $|z| = 2 < 2.575$, the null hypothesis is not rejected.

Step 5: At the 1% level of significance we do not reject the null hypothesis that the percentage of defective items in the consignment is 20%.
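
A minimal Python sketch of the single-proportion test in Illustration 4. The helper name `one_proportion_z` is hypothetical; the standard error is computed from the hypothesised proportion P, as in Step 3.

```python
import math

def one_proportion_z(x, n, p0):
    """Return (z, SE) for the large-sample test of H0: P = p0."""
    p_hat = x / n                      # sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)  # SE under the null hypothesis
    return (p_hat - p0) / se, se

z, se = one_proportion_z(64, 400, 0.20)  # Illustration 4 data
critical = 2.575                         # two-tailed, 1% level (table value used in the text)
print(f"SE = {se:.2f}, |z| = {abs(z):.2f}, reject H0: {abs(z) > critical}")
```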
(ii) Test of significance of two proportions:
This test is used for testing the hypothesis that there is no significant difference between two population proportions.

Illustration 5: In a factory, production is carried out on a machine, and in a batch of 500 articles 16 are found defective. After maintenance of the machine, 3 defective articles are found in a batch of 100 articles. Can it be concluded that the performance of the machine has improved after maintenance?
Solution:
Step 1: Here $n_1 = 500$, $n_2 = 100$, $x_1 = 16$, $x_2 = 3$, $p_1 = \dfrac{x_1}{n_1} = 0.032$, $p_2 = \dfrac{x_2}{n_2} = 0.03$ and $\alpha = 0.05$.

Step 2: $H_0: P_1 = P_2$ vs $H_1: P_1 > P_2$

Step 3: Difference $= p_1 - p_2 = 0.002$

$SE = \sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

where $P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{19}{600}$ and $Q = 1 - P = \dfrac{581}{600}$

$\therefore SE = 0.0192$

Under $H_0$ the test statistic is $z = \dfrac{\text{Difference}}{SE} = \dfrac{0.002}{0.0192} = 0.10$

Step 4: Since the given test is a one-tailed test, the critical value for the 5% level of significance, obtained from the tables of the normal distribution, is 1.645.

Since $z = 0.10 < 1.645$, the null hypothesis is not rejected.

Step 5: At the 5% level of significance we do not reject the null hypothesis and conclude that the performance of the machine has not improved significantly.
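
The pooled-proportion calculation of Illustration 5 can be sketched as below. The helper name `two_proportion_z` is hypothetical, and the right-tailed alternative (a higher defective proportion before maintenance) is an interpretation of the question asked.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, SE) for the large-sample test of H0: P1 = P2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # P = (n1*p1 + n2*p2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, se

z, se = two_proportion_z(16, 500, 3, 100)  # before, after maintenance
critical = 1.645                           # one-tailed, 5% level
reject = z > critical                      # right-tailed alternative: P1 > P2
print(f"SE = {se:.4f}, z = {z:.2f}, reject H0: {reject}")
```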

Confidence Interval:
As discussed earlier, when an estimator is used to give a single value for a parameter, that value is called a point estimate. In practice an interval is obtained which may include the value of the parameter with a certain degree of confidence. The interval developed by using the standard error of the statistic is called a confidence interval or fiducial interval.
The confidence interval for the population mean is given as
$\bar{x} \pm (\text{critical value for given } \alpha)(SE)$
The confidence interval for the population proportion is given as
$p \pm (\text{critical value for given } \alpha)(SE)$
For Illustration 1 the 95% confidence interval for the population mean is
$116 \pm (1.96)(1.5) = (113.06,\ 118.94)$
For Illustration 4 the 99% confidence interval for the population proportion is
$0.16 \pm (2.575)(0.02) = (0.1085,\ 0.2115)$
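
Both intervals follow the same pattern, estimate plus or minus critical value times SE, which the following Python sketch makes explicit. The function name `confidence_interval` is hypothetical; the critical values and standard errors are the ones used in the text.

```python
def confidence_interval(estimate, se, critical):
    """Return (lower, upper) = estimate -/+ critical * SE."""
    margin = critical * se
    return estimate - margin, estimate + margin

# 95% interval for the mean: x-bar = 116, SE = 1.5
lo_mean, hi_mean = confidence_interval(116, 1.5, 1.96)
# 99% interval for the proportion: p = 0.16, SE = 0.02
lo_prop, hi_prop = confidence_interval(0.16, 0.02, 2.575)

print(f"({lo_mean:.2f}, {hi_mean:.2f})")
print(f"({lo_prop:.4f}, {hi_prop:.4f})")
```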

Chi square Distribution ($\chi^2$):

The probability distribution of the square of a standard normal variate is called the chi square distribution with one degree of freedom (df). The number of independent terms of a statistic is called its degrees of freedom. As the number of restrictions increases, the degrees of freedom decrease. In general, the probability distribution of the sum of squares of n independent standard normal variates is called the chi square distribution with n degrees of freedom, i.e. if $x_1, x_2, \ldots, x_n$ is a random sample of size $n$ drawn from a normal population with mean $\mu$ and variance $\sigma^2$, then the distribution of the statistic

$\chi^2 = \sum_{i=1}^{n} \left( \dfrac{x_i - \mu}{\sigma} \right)^2$

is called the chi square distribution with n df. It should be noted that the chi square distribution is a function of its degrees of freedom, and tests based on it are also considered non-parametric tests.
Important Properties:
The following are some important properties of the chi square distribution:
i. It is a continuous distribution.
ii. Its mean is equal to its df and its variance is 2(df).
iii. Its skewness is always positive.
iv. For a large number of degrees of freedom it is approximately normally distributed.
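
Property ii can be illustrated numerically with a small simulation. This is only a sketch: the function name `simulate_chi_square`, the seed, the choice of 4 df and the number of draws are all arbitrary assumptions, and the printed values are close to df and 2(df) only approximately.

```python
import random
import statistics

def simulate_chi_square(df, draws, seed=0):
    """Draw chi square values as sums of df squared standard normal variates."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0, 1) ** 2 for _ in range(df)) for _ in range(draws)]

df = 4
values = simulate_chi_square(df, draws=20_000)
sample_mean = statistics.fmean(values)    # should be near df
sample_var = statistics.variance(values)  # should be near 2 * df

print(f"mean = {sample_mean:.2f} (df = {df}), variance = {sample_var:.2f} (2*df = {2 * df})")
```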
Important Application:
Following are some important applications of chi square distribution.
i. To test the goodness of fit
ii. To test the independence of attributes
iii. To test the significance of variance.
(i) To test goodness of fit:

This test is used to test the hypothesis that there is no significant difference between observed and expected frequencies, or that the observed frequencies are distributed according to a specified probability law.
Let us consider some illustrations to understand this application.
Illustration 6: The daily demand for milk bags of a particular dairy at a retail distribution center is given below. Can it be said that the demand for milk bags does not depend on the day of the week?

Day                    Mon   Tue   Wed   Thu   Fri   Sat   Sun
Demand of milk bags    14    16    8     12    11    9     14

Solution: $H_0$: the demand for milk bags does not depend on the day of the week, i.e. the probability of demand on any day is the same and equals $\dfrac{1}{7}$.

Day     Demand ($O_i$)   $P(x)$   Expected frequency $E_i = N \times P(x)$   $(O_i - E_i)^2 / E_i$
Mon     14               1/7      12                                         4/12
Tue     16               1/7      12                                         16/12
Wed     8                1/7      12                                         16/12
Thu     12               1/7      12                                         0
Fri     11               1/7      12                                         1/12
Sat     9                1/7      12                                         9/12
Sun     14               1/7      12                                         4/12
Total   84               1        84                                         50/12

The test statistic is $\chi^2 = \sum_i \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{50}{12} = 4.17$

At the 5% level of significance and with n - 1 = 7 - 1 = 6 df, the critical value from the chi square table is 12.59.
Since 4.17 < 12.59, the null hypothesis is not rejected, and hence we conclude that the demand for milk bags does not depend on the day of the week.
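
The goodness-of-fit statistic of Illustration 6 can be recomputed with a few lines of Python. The variable names are illustrative, and the critical value is the table value quoted in the text rather than something the code derives.

```python
observed = [14, 16, 8, 12, 11, 9, 14]  # Mon .. Sun
total = sum(observed)                  # N = 84

# Under H0 every day has probability 1/7, so each expected frequency is N/7.
expected = [total / len(observed)] * len(observed)

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
critical = 12.59                       # chi square table, 5% level, 6 df

print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {chi_square > critical}")
```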

Illustration 7: In the classical random experiment of tossing five coins 320 times, the distribution of the number of heads is as under:

Number of heads       0    1    2     3    4    5    Total
Observed frequency    8    42   116   90   52   12   320

Can we say that the coins are unbiased?

Solution: $H_0$: The coins are unbiased, i.e. $p = \dfrac{1}{2}$. The probability distribution of the number of heads is

$p(x) = {}^{n}C_x\, p^x q^{n-x} = {}^{5}C_x \left(\dfrac{1}{2}\right)^x \left(\dfrac{1}{2}\right)^{5-x}, \quad x = 0, 1, 2, \ldots, 5.$

Number of heads ($x$)   Observed frequency ($O_i$)   $p(x)$   Expected frequency $E_i = N \times p(x)$   $(O_i - E_i)^2 / E_i$
0                       8                            1/32     10                                         0.40
1                       42                           5/32     50                                         1.28
2                       116                          10/32    100                                        2.56
3                       90                           10/32    100                                        1.00
4                       52                           5/32     50                                         0.08
5                       12                           1/32     10                                         0.40
Total                   320                          1        320                                        5.72

The test statistic is $\chi^2 = \sum_i \dfrac{(O_i - E_i)^2}{E_i} = 5.72$

At the 5% level of significance and with n - 1 = 6 - 1 = 5 df, the critical value from the chi square table is 11.07.
Since 5.72 < 11.07, at the 5% level of significance and 5 df the null hypothesis is not rejected, and hence we conclude that the coins are unbiased.
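
The expected frequencies and the statistic of Illustration 7 can be reproduced with the binomial formula. This is a minimal sketch with illustrative variable names; the critical value is again the table value from the text.

```python
from math import comb

observed = [8, 42, 116, 90, 52, 12]  # frequencies for 0..5 heads
n_coins, p = 5, 0.5                  # H0: each coin is unbiased
total = sum(observed)                # N = 320 tosses of five coins

# Binomial probabilities p(x) = C(5, x) * p^x * (1 - p)^(5 - x)
probs = [comb(n_coins, x) * p**x * (1 - p) ** (n_coins - x) for x in range(n_coins + 1)]
expected = [total * pr for pr in probs]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical = 11.07                     # chi square table, 5% level, 5 df

print(expected)
print(f"chi-square = {chi_square:.2f}, reject H0: {chi_square > critical}")
```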
(ii) Test of independence of attributes:
Illustration 8: The results of the last examination for a sample of 100 students are as under:

Gender   First class   Second class   Pass class   Total
Boys     10            28             12           50
Girls    20            22             8            50
Total    30            50             20           100

Can we say that performance in the examination depends upon the gender of the student?
Solution: $H_0$: performance in the examination does not depend upon the gender of the student.
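
A minimal Python sketch of how the usual chi square test of independence could be carried out on this table. It assumes the standard rules expected frequency = (row total × column total)/N and df = (rows - 1)(columns - 1), which are not stated in the text above. The critical value is computed from the closed form that holds only for 2 df.

```python
import math

table = [[10, 28, 12],   # boys: first, second, pass class
         [20, 22, 8]]    # girls

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n  # assumed standard rule
        chi_square += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)
# For 2 df the upper tail area is exp(-x/2), so the 5% point is -2*ln(0.05).
critical = -2 * math.log(0.05)

print(f"chi-square = {chi_square:.2f}, df = {df}, critical = {critical:.2f}")
print("reject H0:", chi_square > critical)
```

Under these assumptions the statistic falls below the 5% critical value, so the sketch would not reject the hypothesis of independence.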

Summary:

In this talk we have discussed large sample tests for variables and for proportions. The tests for variables are (i) the test for a mean, (ii) the test for two means and (iii) the test for two standard deviations. The tests for proportions are (i) the test for a single proportion and (ii) the test for two proportions. We have seen a method of constructing confidence intervals based on the above tests. We have also discussed the chi square distribution and its properties. Its applications are testing a single variance, independence of attributes and goodness of fit.
