Testing of Hypothesis
Descriptive Statistics
Quantitative descriptions of characteristics
Inferential Statistics
Drawing conclusions about population parameters
What is a Hypothesis?
Null hypothesis
Alternative hypothesis
Test statistic
Critical or Rejection region
Level of significance
Null Hypothesis, H0
Assume the population mean age is 50 (null hypothesis).
A sample is drawn and the sample mean is 20.
Is the population mean 50? No, not likely!
So we reject the null hypothesis.
What is Hypothesis Testing
It is a well-defined procedure which
helps us to decide whether to accept
or reject the hypothesis based on the
information available from the
sample.
Example
Suppose two different concerns
manufacture drugs for inducing
sleep: drug A is manufactured by the
first concern and drug B by the
second concern.
Each company claims that its
drug is superior to that of the
other, and we wish to test
which drug is superior, A or B.
Let X = the random variable denoting the additional hours of
sleep gained by an individual when drug A is given.
Let Y = the random variable denoting the additional hours of
sleep gained by an individual when drug B is given.
Let X and Y follow probability distributions with
means $\mu_X$ and $\mu_Y$.
H0: $\mu_X = \mu_Y$
H1: $\mu_X \neq \mu_Y$, or $\mu_X > \mu_Y$, or $\mu_X < \mu_Y$
Test statistic : rule or a procedure based on
a statistic to decide either to accept or reject
the null hypothesis
Critical or Rejection region : Values of
the test statistic for which we reject the null in
favor of the alternative hypothesis
Level of significance: Determines how much
difference between the sample statistic and the
population parameter may be considered as significant
so as to reject the null hypothesis of no difference.
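Level of Significance, $\alpha$, and
the Rejection Region
H0: $\mu \geq 3$, H1: $\mu < 3$: rejection region of area $\alpha$ in the left tail
H0: $\mu \leq 3$, H1: $\mu > 3$: rejection region of area $\alpha$ in the right tail
H0: $\mu = 3$, H1: $\mu \neq 3$: rejection regions of area $\alpha/2$ in each tail
The critical value(s) mark the boundary of the rejection region(s).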
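The left-tailed, right-tailed, and two-tailed rejection regions can be located numerically. The following is a minimal sketch, assuming the test statistic is standard normal and taking $\alpha$ = 0.05 purely for illustration; the function name `z_critical_values` is hypothetical.

```python
from statistics import NormalDist


def z_critical_values(alpha):
    """Return critical z values for left-, right-, and two-tailed tests."""
    std_normal = NormalDist()  # standard normal: mean 0, s.d. 1
    return {
        "left": std_normal.inv_cdf(alpha),         # reject H0 if z < this value
        "right": std_normal.inv_cdf(1 - alpha),    # reject H0 if z > this value
        "two": std_normal.inv_cdf(1 - alpha / 2),  # reject H0 if |z| > this value
    }


# alpha = 0.05 is an assumed level of significance for this illustration.
crit = z_critical_values(0.05)
print({tail: round(value, 3) for tail, value in crit.items()})
```

At the 5% level the one-tailed cut-offs are roughly 1.645 in absolute value and the two-tailed cut-off is roughly 1.96, which is why the two-tailed test needs a larger statistic to reject.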
Errors in Making Decisions
Type I Error
Reject True Null Hypothesis
Example: in a clinical trial of a new drug, the
null hypothesis might be that the new drug is no
better, on average, than the current drug.
H0: no significant difference between the two drugs on
average.
Type I error: concluding that the two drugs produce different
effects when there is in fact no difference between them.
Known as producer's risk (rejecting a good lot).
Say, a pen manufacturer rejects a lot of high-quality
pens because the inspected sample falls outside the
allowable range.
Type II Error
Do Not Reject (or accept) False Null Hypothesis
In the above example, a Type II error occurs if it is
concluded that the two drugs produce the same effect,
i.e. there is no difference between the two drugs on
average, when in fact they are different.
Known as consumer's risk (accepting a bad lot).
Say, a house believed to be of high quality is purchased,
but within a month the plumbing fails.
This is the risk borne by the consumer.
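Hypothesis testing decisions: rejecting a true H0 is a Type I error, with
probability $\alpha$; not rejecting a false H0 is a Type II error.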
One-Tailed Test and Two-Tailed Test
Type of Test
The samples are paired when the data for the two samples relate to
the same group of respondents.
A Classification of Hypothesis Testing
Procedures for Examining Differences
Hypothesis Tests
  Parametric Tests (Metric Tests)
    Independent Samples
      * Two-Group t test
      * Z test
    Paired Samples
      * Paired t test
  Non-parametric Tests (Nonmetric Tests)
    Independent Samples
      * Chi-Square
      * Mann-Whitney
      * Median
    Paired Samples
      * Sign
      * Wilcoxon
      * Chi-Square
One – sample tests
One sample Z test for mean
One sample t test for mean
One sample Z test for proportion
One sample Z test for mean
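The Z-test is a statistical test used to determine whether the
difference between a sample mean and the
population mean is statistically significant.
The population mean and population standard deviation
must be known.
The mean and size of the sample must also be known.
The test statistic is given by:
$$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where $\bar{x}$ = sample mean, $\mu$ = population mean under the null hypothesis,
$\sigma$ = population standard deviation, and $n$ = sample size.
Under the null hypothesis, $Z \sim N(0,1)$.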
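The one-sample Z statistic can be sketched in a few lines. The numbers below are hypothetical and chosen only to illustrate the arithmetic; they do not come from the text. The function name `one_sample_z` and the 5% two-tailed level are likewise assumptions.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))


# Hypothetical inputs, for illustration only (not data from the text).
z = one_sample_z(sample_mean=52.0, pop_mean=50.0, pop_sd=8.0, n=64)

std_normal = NormalDist()
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))       # two-tailed p-value
reject_h0 = abs(z) > std_normal.inv_cdf(1 - 0.05 / 2)  # assumed 5% level

print(round(z, 3), round(p_two_tailed, 4), reject_h0)
```

With these inputs the standard error is 8/8 = 1, so Z = 2, which lies just beyond the two-tailed 5% critical value of about 1.96.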
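One sample t test for mean
Test statistic:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
where $\bar{x}$ = sample mean
$\mu$ = population mean under the null hypothesis
$s$ = sample S.D. = $\sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
$n$ = sample size
$t \sim t_{(n-1)}$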
Example
A large retailing company wants to know whether there
is a difference in the average size of customer accounts
in its Kolkata and Mumbai stores. Past experience has
shown that the standard deviations for the two are Rs.
180 and Rs. 192, respectively. A sample of 80 accounts
taken from Kolkata gave a mean value of Rs. 885 and 90
accounts from Mumbai gave a mean value of Rs. 936.
Does this provide evidence at the 5% level of
significance that the mean account sizes at the two
stores are different?
Example
A sample of 80 steel wires produced by factory A yields a
mean breaking strength of 1,240 pounds with a standard
deviation of 120 pounds. Another sample of 100 steel
wires produced by factory B, on the other hand, yields a
mean breaking strength of 1,180 pounds with a standard
deviation of 105 pounds. Can it be concluded that the
mean breaking strength of wires produced by factory A
is greater than that of factory B? Test at the 0.01 level of
significance.
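Both examples above compare two means from large samples. The following is a minimal sketch, assuming the usual two-sample Z statistic Z = (mean1 - mean2) / sqrt(sd1^2/n1 + sd2^2/n2), which is not stated in the text above. For the steel-wire example it further assumes that the sample standard deviations can stand in for the unknown population values because both samples are large. The function name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for H0: mu1 = mu2 (difference of two means)."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


std_normal = NormalDist()

# Retail accounts (Kolkata vs Mumbai): two-tailed test at the 5% level.
z_accounts = two_sample_z(885, 180, 80, 936, 192, 90)
reject_accounts = abs(z_accounts) > std_normal.inv_cdf(1 - 0.05 / 2)

# Steel wires (factory A vs B): right-tailed test, H1: mu_A > mu_B, 1% level.
z_wires = two_sample_z(1240, 120, 80, 1180, 105, 100)
reject_wires = z_wires > std_normal.inv_cdf(1 - 0.01)

print(round(z_accounts, 2), reject_accounts)
print(round(z_wires, 2), reject_wires)
```

Under these assumptions, Z is about -1.79 for the accounts, inside the two-tailed acceptance region, so the null hypothesis is not rejected. For the wires, Z is about 3.52, beyond the one-tailed 1% critical value of about 2.33, so the null hypothesis is rejected.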
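Two sample t test for difference of two
means (preferred over the Z test when the sample
size is < 30 and the population variances are unknown)
Assumption: $\sigma_X^2 = \sigma_Y^2 = \sigma^2$ (equal but unknown population variances)
H0: $\mu_X = \mu_Y$
Test statistic:
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_X - \mu_Y)}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
where
$$\bar{x} = \frac{1}{n_1} \sum_{i=1}^{n_1} x_i, \qquad \bar{y} = \frac{1}{n_2} \sum_{j=1}^{n_2} y_j$$
$$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum_{i} (x_i - \bar{x})^2 + \sum_{j} (y_j - \bar{y})^2 \right]$$
Under H0, the statistic follows the t-distribution with
$(n_1 + n_2 - 2)$ degrees of freedom.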
DECISION RULE:
For a two-tailed test, reject the null hypothesis if the
absolute value of the calculated statistic is larger than
the tabulated value of t at (n1 + n2 - 2) degrees of
freedom and the required level of
significance.
EXAMPLE
A random sample of 12 families in one
city showed an average monthly food
expenditure of Rs. 1380 with a s.d. of
Rs. 100 and a random sample of 15
families in another city showed an
average monthly food expenditure of
Rs. 1320 with a s.d. of Rs. 120. Test
whether the difference between the
two means is significant at 0.01 level
of significance.
EXAMPLE
The mean life of a sample of 10
electric light bulbs was found to be
1456 hours with s.d. of 423 hours. A
second sample of 17 bulbs chosen
from a different batch showed a mean
life of 1280 hours with s.d. of 398
hours. Is there a significant difference
between the means of the two
batches?
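The two examples above give only summary statistics, so the pooled t statistic has to be rebuilt from means, standard deviations, and sample sizes. The following is a minimal sketch, assuming the quoted standard deviations use the (n - 1) divisor, so that each sum of squared deviations equals (n - 1) times the squared s.d. The critical values are the usual two-tailed table entries for 25 degrees of freedom, and the 5% level for the bulbs is an assumption, since the example does not state one. The function name `pooled_t_from_summary` is hypothetical.

```python
from math import sqrt


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled two-sample t statistic and its degrees of freedom.

    Assumes sd1 and sd2 were computed with the (n - 1) divisor, so each
    sum of squared deviations is (n - 1) * sd**2.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Two-tailed critical values for 25 d.f., read from a standard t-table.
T_CRIT_25_DF = {0.01: 2.787, 0.05: 2.060}

# Food expenditure: 12 and 15 families, tested at the 0.01 level.
t_food, df_food = pooled_t_from_summary(1380, 100, 12, 1320, 120, 15)

# Light bulbs: 10 and 17 bulbs; the 5% level is assumed here.
t_bulbs, df_bulbs = pooled_t_from_summary(1456, 423, 10, 1280, 398, 17)

print(round(t_food, 2), df_food, abs(t_food) > T_CRIT_25_DF[0.01])
print(round(t_bulbs, 2), df_bulbs, abs(t_bulbs) > T_CRIT_25_DF[0.05])
```

Under these assumptions t is about 1.39 for the families and about 1.08 for the bulbs, both below the tabulated values, so neither difference is significant. If the quoted standard deviations instead use the n divisor, the pooled variance becomes (n1 s1^2 + n2 s2^2) / (n1 + n2 - 2) and the statistics shrink slightly.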
Example
Below are given the gain in weights (in
kgs) of pigs fed on two diets A and B.
Diet A: 25, 32, 30, 34, 24, 14, 32, 24, 30, 31, 35, 25
Diet B: 44, 34, 22, 10, 48, 31, 40, 30, 32, 35, 18, 21, 35, 29, 22
Test whether the two diets differ significantly as
regards their effect on increase in weight.
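With raw observations available, the pooled statistic can be computed directly from the sums of squared deviations. The following is a minimal sketch for the diet data; the function name `pooled_t` is hypothetical, and because the example states no level of significance, any comparison with a table value (for instance 2.060 at the 5% level with 25 d.f.) rests on an assumed level.

```python
from math import sqrt
from statistics import mean


def pooled_t(xs, ys):
    """Pooled two-sample t statistic from raw data, with n1 + n2 - 2 d.f."""
    x_bar, y_bar = mean(xs), mean(ys)
    ss_x = sum((x - x_bar) ** 2 for x in xs)  # sum of squared deviations, sample 1
    ss_y = sum((y - y_bar) ** 2 for y in ys)  # sum of squared deviations, sample 2
    df = len(xs) + len(ys) - 2
    pooled_sd = sqrt((ss_x + ss_y) / df)
    t = (x_bar - y_bar) / (pooled_sd * sqrt(1 / len(xs) + 1 / len(ys)))
    return t, df


diet_a = [25, 32, 30, 34, 24, 14, 32, 24, 30, 31, 35, 25]
diet_b = [44, 34, 22, 10, 48, 31, 40, 30, 32, 35, 18, 21, 35, 29, 22]

t_diets, df_diets = pooled_t(diet_a, diet_b)
print(round(t_diets, 3), df_diets)
```

The statistic is about -0.62 on 25 degrees of freedom, well inside the acceptance region at any conventional level, so these data give no evidence that the diets differ.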
Paired t test (for correlated or
dependent samples)
Used to test the difference of two
population means when the two samples
are correlated, i.e. there is a one-to-one
correspondence between the values of the
two samples.
Example: suppose we want to test the
effectiveness of a drug. Let xi and yi
(i = 1, 2, ..., n) be the readings in hours of sleep
before and after the drug is given.
di = xi - yi
The null hypothesis is that there is no significant
difference in the means of the two related samples.
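H0: $\mu_1 - \mu_2 = 0$
H1: $\mu_1 \neq \mu_2$
Test statistic:
$$t = \frac{\bar{d}}{s / \sqrt{n}} \sim t_{n-1}$$
where
$$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i, \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (d_i - \bar{d})^2$$
The statistic follows Student's t-distribution with
(n-1) d.f.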
Example
The Peak Expiratory Flow Rate (PEFR) of 9
asthma patients was taken before and after
a walk on an extremely cold winter day for
comparing the rates. The following data
was obtained:
Before: 312, 242, 340, 388, 296, 254, 391, 402, 290
After: 300, 201, 232, 312, 220, 256, 328, 330, 231
Test whether there is any significant difference
between the PEFR of asthma patients before and
after a walk on a cold winter day.
Example
A company is concerned about the decline in
its sales revenues. After an analysis, the
management concluded that the employee
attitudes had become negative due to
increased competition and excessive workload.
The management organized a 7 day special
motivational programme. In order to analyse
the effectiveness of the motivational
programme, the company researchers have
administered a well-designed questionnaire to
12 employees selected randomly. Take 90% as
the confidence level and examine whether the
motivational programme has changed the
attitude of the employees.
Score before the programme    Score after the programme
25 29
26 30
25 31
27 30
28 31
25 32
29 33
27 31
30 32
28 30
29 31
25 32
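Both paired examples above reduce to a one-sample t test on the differences. The following is a minimal sketch; the function name `paired_t` is hypothetical, and the 5% level used for the PEFR data is an assumption, since that example does not state one. The tabulated two-tailed values used for comparison are 2.306 (8 d.f., 5%) and 1.796 (11 d.f., 10%, matching the 90% confidence level).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic for d_i = first_i - second_i, with n - 1 d.f."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # stdev uses the (n - 1) divisor
    return t, n - 1


# PEFR of 9 asthma patients, differences taken as before minus after.
pefr_before = [312, 242, 340, 388, 296, 254, 391, 402, 290]
pefr_after = [300, 201, 232, 312, 220, 256, 328, 330, 231]
t_pefr, df_pefr = paired_t(pefr_before, pefr_after)

# Attitude scores of 12 employees, differences taken as after minus before.
score_before = [25, 26, 25, 27, 28, 25, 29, 27, 30, 28, 29, 25]
score_after = [29, 30, 31, 30, 31, 32, 33, 31, 32, 30, 31, 32]
t_score, df_score = paired_t(score_after, score_before)

print(round(t_pefr, 2), df_pefr, abs(t_pefr) > 2.306)
print(round(t_score, 2), df_score, abs(t_score) > 1.796)
```

The statistics are about 4.93 and 7.66 respectively, both well beyond the tabulated values, so under these assumptions both null hypotheses of no change are rejected.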
Two sample Z test for difference of
two proportions
Suppose we want to compare two distinct
populations with respect to certain attribute
say A, among their members.
Let X1 and X2 be the number of persons
possessing the given attribute A in random
samples of sizes n1 and n2 from the two
populations respectively. Then the sample
proportions are given by:
p1 = X1 / n1 and p2 = X2 / n2
Let P1 and P2 be the population proportions.
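Null hypothesis, H0: $P_1 = P_2$ (against the alternative
hypothesis)
Test statistic:
$$z = \frac{p_1 - p_2}{\sqrt{\hat{p}(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \sim N(0,1)$$
where $\hat{p}$ = pooled estimate of the population
proportion of success = $(X_1 + X_2) / (n_1 + n_2)$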
EXAMPLE
There has been a fundamental shift in the Indian economy after 1991. All
business sectors including the banking sector have been affected by
the liberalization and privatization measures of the government. Due to
heavy competition, Indian public sector banks have also adopted
consumer-friendly policies such as extending service time for their
customers. On one hand, changes introduced by the banks enhance the
quality of services, however, on the other hand, they are also
responsible for generating stress among employees. A researcher
wants to assess the stress levels of bank employees. The researcher
has selected two banks, A and B for this purpose.
The working hours of bank A are from 10.00 a.m. to 3.30 p.m. and the
working hours of bank B are from 8.00 a.m. to 8.00 p.m. The
researcher has randomly selected 40 employees from bank A and 10 of
them have indicated high stress levels. The researcher has also
randomly selected 50 employees from bank B and 22 of them have
indicated high stress levels. Does this indicate that the stress levels of
employees of bank B are significantly higher? Test the hypothesis at
the 1% level of significance.
Example
A footwear company has launched a 100% leather shoe for
both male and female customers. The company conducted a
survey to understand the perception of customers about a
100% leather shoe. The company has taken a random sample
of 130 male and 150 female customers. Out of 130 males, 50
responded that a 100% leather shoe matches their lifestyle.
Out of 150 females, 90 responded that a 100% leather shoe
matches their lifestyle. Does this indicate that there is a
significant difference in the proportion of male and female
customers in the population stating that a 100% leather shoe
matches their lifestyle? Test the hypothesis at the 95% confidence level.
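The pooled two-proportion Z statistic can be applied to both examples above. The following is a minimal sketch; the function name `two_proportion_z` is hypothetical. The bank example is treated as a left-tailed test (H1: P_A < P_B) at the 1% level, and the shoe example as a two-tailed test at the 5% level, matching the 95% confidence level.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: P1 = P2, using the pooled proportion estimate."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


std_normal = NormalDist()

# Banks: 10 of 40 (bank A) vs 22 of 50 (bank B); left-tailed, 1% level.
z_banks = two_proportion_z(10, 40, 22, 50)
reject_banks = z_banks < std_normal.inv_cdf(0.01)

# Shoes: 50 of 130 males vs 90 of 150 females; two-tailed, 5% level.
z_shoes = two_proportion_z(50, 130, 90, 150)
reject_shoes = abs(z_shoes) > std_normal.inv_cdf(1 - 0.05 / 2)

print(round(z_banks, 2), reject_banks)
print(round(z_shoes, 2), reject_shoes)
```

For the banks, Z is about -1.87, which does not fall below the 1% critical value of about -2.33, so the claim of higher stress in bank B is not supported at that level. For the shoes, Z is about -3.59, so the difference between male and female proportions is significant.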
Hypothesis testing for difference in
two population variances: F Distribution
Machine 1: 18 19 19 18 7 19 18 19 18 19
Machine 2: 16 17 17 17 16 18 16 16 17 17 16 16 17
How can the researcher determine whether the two samples come from
populations with equal variances or from populations with unequal
variances? Take α = 0.05 as the level of significance.
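The text gives no formula for this test. The following is a minimal sketch, assuming the usual variance-ratio statistic F = (larger sample variance) / (smaller sample variance), with (n - 1) degrees of freedom for numerator and denominator. The function name `f_ratio` is hypothetical, and the Machine 1 readings are used exactly as printed, including the value 7.

```python
from statistics import variance


def f_ratio(sample_a, sample_b):
    """Return F = larger variance / smaller variance and its (num, den) d.f."""
    var_a, var_b = variance(sample_a), variance(sample_b)  # (n - 1) divisor
    if var_a >= var_b:
        return var_a / var_b, (len(sample_a) - 1, len(sample_b) - 1)
    return var_b / var_a, (len(sample_b) - 1, len(sample_a) - 1)


machine_1 = [18, 19, 19, 18, 7, 19, 18, 19, 18, 19]  # values as printed
machine_2 = [16, 17, 17, 17, 16, 18, 16, 16, 17, 17, 16, 16, 17]

f_stat, (df_num, df_den) = f_ratio(machine_1, machine_2)
print(round(f_stat, 2), df_num, df_den)
```

The computed ratio is then compared with the tabulated F value for (9, 12) degrees of freedom at the chosen level of significance. The hypothesis of equal variances is rejected if the ratio exceeds the table value.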
EXAMPLE
Two sources of raw materials are under consideration by
a company. Both sources seem to have similar
characteristics but the company is not sure about their
respective uniformity. Obtain estimates of the variances
of the population and test whether two populations have
the same variance.