L7-Hypothesis Testing
HYPOTHESIS TESTING
Hypothesis Testing
• The formal process of hypothesis testing
provides us with a means of answering
research questions.
• A hypothesis is a testable statement that
describes the nature of the proposed
relationship between two or more
variables of interest.
• The purpose of the study is to collect data
which will allow the researcher to test the
hypothesis.
Idea of hypothesis testing
Null and Alternative hypotheses
1. Identify the null hypothesis H0 and the alternate hypothesis HA.
2. Choose α. The value should be small, usually less than 10%. It is important to consider the consequences of both types of errors.
3. Select the test statistic and determine its value from the sample data. This value is called the observed value of the test statistic. Remember that the t statistic is usually appropriate for a small number of samples; for a larger number of samples, a z statistic can work well if the data are normally distributed.
4. Compare the observed value of the statistic to the critical value obtained for the chosen α.
5. Make a decision.
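Types of tests
1. $H_0: \mu = \mu_0$ versus $H_A: \mu \neq \mu_0$
   $z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $z_{tabulated} = z_{\alpha/2}$ for a two-tailed test
   Decision: if $|z_{cal}| > z_{tab}$, reject $H_0$;
             if $|z_{cal}| \le z_{tab}$, do not reject $H_0$
2. $H_0: \mu = \mu_0$ (or $\mu \le \mu_0$) versus $H_A: \mu > \mu_0$
   $z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$, $z_{tabulated} = z_{\alpha}$ for a one-tailed test
   Decision: if $z_{cal} > z_{tab}$, reject $H_0$;
             if $z_{cal} \le z_{tab}$, do not reject $H_0$
3. $H_0: \mu = \mu_0$ (or $\mu \ge \mu_0$) versus $H_A: \mu < \mu_0$
   Decision: if $z_{cal} < -z_{tab}$, reject $H_0$;
             if $z_{cal} \ge -z_{tab}$, do not reject $H_0$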
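The decision rules for two-tailed and one-tailed z tests can be sketched in Python as follows. This is only an illustration: the function names `z_critical` and `reject_h0` and the tail labels `"two"`, `"right"` and `"left"` are assumptions made for this sketch, not part of the slides.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Tabulated z: z_{alpha/2} for a two-tailed test, z_alpha for a one-tailed test."""
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    if tail == "two":
        return std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.inv_cdf(1 - alpha)


def reject_h0(z_cal, alpha=0.05, tail="two"):
    """Return True when the calculated z falls in the rejection region."""
    z_tab = z_critical(alpha, tail)
    if tail == "two":      # HA: mu != mu0
        return abs(z_cal) > z_tab
    if tail == "right":    # HA: mu > mu0
        return z_cal > z_tab
    if tail == "left":     # HA: mu < mu0
        return z_cal < -z_tab
    raise ValueError("tail must be 'two', 'right' or 'left'")


print(round(z_critical(0.05, "two"), 2))    # 1.96
print(round(z_critical(0.05, "right"), 3))  # 1.645
```

Note that the same calculated value can lead to different decisions depending on the alternative hypothesis, which is why HA has to be fixed before looking at the data.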
Types of errors
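• There are 2 types of errors: a Type I error (rejecting H0 when it is true) and a Type II error (failing to reject H0 when it is false)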
Test Statistics
In hypothesis testing we always start with
statements for H0 and HA.
Because of random variation, even an unbiased
sample may not accurately represent the
population as a whole.
As a result, it is possible that any observed
differences or associations may have occurred by
chance.
Statistical testing of a research hypothesis allows
the researcher to quantify the risk or error
involved in making inferences about a population
based on the information obtained from a
sample.
Test Statistics…...
• $\text{Test statistic} = \dfrac{\text{Observed value} - \text{Hypothesized value}}{\text{Standard error}}$
The P-Value
• In most applications, the outcome of performing
a hypothesis test is to produce a p-value.
• The p-value is the probability of observing a
difference at least as large as the one observed,
when the null hypothesis is true.
• A small p-value implies that the probability of
the value observed occurring just by chance is
low, when the null hypothesis is true.
The P-Value…..
• But for what values of p-value should we reject the null
hypothesis?
– By convention, a p-value of 0.05 or smaller is
considered sufficient evidence for rejecting the null
hypothesis.
– By using a p-value cutoff of 0.05, we are allowing a 5%
chance of wrongly rejecting the null hypothesis
when it is in fact true.
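As a rough illustration of the 0.05 convention, the sketch below computes a two-sided p-value from a calculated z statistic and compares it with α. The helper name `two_sided_p_value` and the three z values are assumptions chosen for the example.

```python
from statistics import NormalDist


def two_sided_p_value(z_cal):
    """P(|Z| >= |z_cal|) when H0 is true and Z is standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z_cal)))


alpha = 0.05
for z in (1.0, 1.96, 3.0):  # illustrative values, not taken from the slides
    p = two_sided_p_value(z)
    print(f"z = {z:.2f}  p = {p:.4f}  reject H0: {p <= alpha}")
```

Rejecting when p ≤ α gives the same decision as comparing |z| with the tabulated value for a two-tailed test.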
Hypothesis Testing of a Single Mean
(Normally Distributed)
Known Variance
Example
• A simple random sample of 10
people from a certain population
has a mean age of 27. Can we
conclude that the mean age of the
population is not 30? The variance
is known to be 20. Let α = .05.
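The slides do not show the solution, so the following is only a sketch of how the example could be worked, assuming a two-tailed z test (H0: μ = 30 versus HA: μ ≠ 30) with the known variance of 20. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the example
n, x_bar, mu0, variance, alpha = 10, 27, 30, 20, 0.05

standard_error = sqrt(variance / n)           # sigma / sqrt(n)
z_cal = (x_bar - mu0) / standard_error        # calculated test statistic
z_tab = NormalDist().inv_cdf(1 - alpha / 2)   # tabulated value, two-tailed
p_value = 2 * (1 - NormalDist().cdf(abs(z_cal)))

print(f"z_cal = {z_cal:.2f}, z_tab = {z_tab:.2f}, p = {p_value:.3f}")
print("Reject H0" if abs(z_cal) > z_tab else "Do not reject H0")
```

Under these assumptions z_cal is about −2.12, whose absolute value exceeds 1.96, so H0 would be rejected at the 5% level.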
Unknown Variance
• In most practical applications the standard
deviation of the underlying population is not
known
• In this case, σ can be estimated by the sample
standard deviation s.
• If the underlying population is normally
distributed, then the test statistic is:
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, with $n - 1$ degrees of freedom
Example:
• A simple random sample of 14
people from a certain population
gives a sample mean body mass
index (BMI) of 30.5 and a standard deviation of
10.64. Can we conclude that the mean
BMI is not 35 at α = 5%?
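A possible way to work this example is sketched below, assuming a two-tailed t test with n − 1 = 13 degrees of freedom. The critical value 2.160 is hard-coded from a standard t table (two-tailed, α = 0.05, 13 df) because the Python standard library has no t distribution; treat it as an assumption of the sketch.

```python
from math import sqrt

# Values stated in the example
n, x_bar, s, mu0 = 14, 30.5, 10.64, 35

t_cal = (x_bar - mu0) / (s / sqrt(n))  # calculated test statistic
df = n - 1                             # degrees of freedom
t_tab = 2.160                          # t table: two-tailed, alpha = 0.05, df = 13 (assumed)

print(f"t_cal = {t_cal:.2f} with {df} df, t_tab = {t_tab}")
print("Reject H0" if abs(t_cal) > t_tab else "Do not reject H0")
```

Here t_cal is about −1.58, which is smaller in absolute value than the tabulated 2.160, so under these assumptions H0 would not be rejected.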
Exercise
• A researcher reports that the
mean IQ of a sample of 16 students is
110, while the expected value for
the whole population is 100 with a
standard deviation of 10. Test
the hypothesis that the mean IQ of the
population is different from 100.
Hypothesis testing for proportions
Example
• In a study of childhood abuse in psychiatric patients, Brown found that 166
in a sample of 947 patients reported histories of physical or sexual abuse.
a. Construct a 95% confidence interval for the population proportion.
b. Test the hypothesis that the true population proportion is 30%.
• Solution (a)
• The 95% CI for P is given by
$$p \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}
  = 0.175 \pm 1.96\sqrt{\frac{0.175 \times 0.825}{947}}
  = 0.175 \pm 1.96 \times 0.0124
  = [0.151;\ 0.2]$$
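The interval above can be checked with a short sketch. The function name `proportion_ci` is an assumption for illustration; it uses the unrounded sample proportion 166/947, so the limits may differ slightly in the third decimal from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(166, 947)
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```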
Example….
• To test the hypothesis we need to follow these steps
Step 1: State the hypothesis
Ho: P = Po = 0.3
Ha: P ≠ Po (P ≠ 0.3)
Step 2: Fix the level of significance (α = 0.05)
Step 3: Compute the calculated and tabulated value of the test statistic
$$z_{cal} = \frac{p - P_0}{\sqrt{\dfrac{P_0(1-P_0)}{n}}}
  = \frac{0.175 - 0.3}{\sqrt{\dfrac{0.3(0.7)}{947}}}
  = \frac{-0.125}{0.0149} = -8.39$$
$$z_{tab} = z_{\alpha/2} = 1.96$$
Example….
• Step 4: Comparison of the calculated and
tabulated values of the test statistic
• Since the tabulated value (1.96) is smaller than the
absolute value of the calculated test statistic (8.39), we reject the null
hypothesis.
• Step 5: Conclusion
• Hence we conclude that the proportion of
childhood abuse in psychiatric patients is different
from 0.3.
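The test in part (b) can be reproduced with the sketch below, assuming the normal approximation and the null-hypothesis proportion in the standard error. It uses the unrounded sample proportion 166/947, so the statistic comes out near −8.37 rather than the hand-calculated magnitude of 8.39 obtained with p rounded to 0.175. Variable names are illustrative.

```python
from math import sqrt

# Values stated in the example
successes, n, p0 = 166, 947, 0.30

p_hat = successes / n               # sample proportion
se0 = sqrt(p0 * (1 - p0) / n)       # standard error under H0
z_cal = (p_hat - p0) / se0          # calculated test statistic
z_tab = 1.96                        # two-tailed tabulated value, alpha = 0.05

print(f"p_hat = {p_hat:.3f}, z_cal = {z_cal:.2f}")
print("Reject H0" if abs(z_cal) > z_tab else "Do not reject H0")
```

The conclusion is unchanged by the rounding difference: |z_cal| is far larger than 1.96.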
The chi-squared test
• The chi-squared (χ²) test statistic is widely used in the analysis
of contingency tables.
• It compares the actual observed frequency in each group with
the expected frequency (the latter is based on theory,
experience or comparison groups).
2x2 Contingency table
Test Statistic: χ²-test with d.f. = (r − 1) × (c − 1)
Assumptions of the χ²-test
S. mansoni     Occupation
               Fishermen   Farmers   Traders   Craftsmen
Examined       35          43        58        29
Positive       22          21        17        15
% positive     62.9        48.8      29.3      51.7
Example ….
Hypothesis:
• Risk of infection and occupation are not associated (the risk of
infection is the same for all occupation groups). (Take the level
of significance, α = .05.)
However, before doing this, make sure that the table is arranged
in such a way that the expected frequencies can be easily
calculated.
The observed frequencies are reorganized
as follows:
S. mansoni     Occupation
               Fishermen   Farmers   Traders   Craftsmen   Total
Positive       22          21        17        15          75
Negative       13          22        41        14          90
Total          35          43        58        29          165
The corresponding expected frequencies are:
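(expected frequency = row total × column total / grand total)
S. mansoni     Occupation
               Fishermen   Farmers   Traders   Craftsmen   Total
Positive       15.9        19.5      26.4      13.2        75
Negative       19.1        23.5      31.6      15.8        90
Total          35          43        58        29          165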
Example…
• The chi-squared test statistic measures the distance
between these expected frequencies and what we
actually observed
$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$
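The formula can be applied to the observed S. mansoni frequencies with a sketch like the one below. It assumes the usual rule that each expected frequency equals row total × column total / grand total; the function names are illustrative.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed):
    """Sum of (Observed - Expected)^2 / Expected over all cells of the table."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: Positive, Negative; columns: Fishermen, Farmers, Traders, Craftsmen
observed = [
    [22, 21, 17, 15],
    [13, 22, 41, 14],
]

for row in expected_frequencies(observed):
    print([round(e, 1) for e in row])
print(round(chi_squared(observed), 2))  # about 11.03
```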
Example….
► A small p-value (p<0.05) indicates that the set of
observed frequencies is further from those
predicted by the null hypothesis (the expected
frequencies) than would be expected by chance
alone.
Example….
► In order to test the significance of the χ², the
calculated value of χ² is compared with the
tabulated value for the given df at a certain level of
significance.
Example….
The degrees of freedom (df) in a contingency table with R rows and C
columns is:
df = ( R – 1) ( C – 1)
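As a sketch of the final comparison step, the code below computes the degrees of freedom for the 2 × 4 S. mansoni table and compares a calculated χ² with the tabulated value. The small lookup `CHI2_TABLE_05` holds standard χ² table values at α = 0.05 and is hard-coded as an assumption; the value 11.03 is the approximate result of applying the χ² formula to the observed table, not a figure quoted in the slides.

```python
# Tabulated chi-squared critical values at alpha = 0.05 (standard table, hard-coded)
CHI2_TABLE_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def degrees_of_freedom(n_rows, n_cols):
    """df = (R - 1)(C - 1) for a contingency table with R rows and C columns."""
    return (n_rows - 1) * (n_cols - 1)


def reject_independence(chi2_cal, n_rows, n_cols):
    """Reject H0 (no association) when the calculated value exceeds the tabulated one."""
    df = degrees_of_freedom(n_rows, n_cols)
    return chi2_cal > CHI2_TABLE_05[df]


df = degrees_of_freedom(2, 4)             # 2 rows (positive/negative), 4 occupations
print(df, CHI2_TABLE_05[df])              # 3 7.815
print(reject_independence(11.03, 2, 4))   # True
```

With 3 degrees of freedom the tabulated value is 7.815, so a calculated χ² of about 11.03 would lead to rejecting the hypothesis that infection risk is the same in all occupation groups.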
Thank You All
TABLE 6.10
Shipbuilding   Cases      Controls    Total
Yes            11 (a)     35 (b)      46 (r1)
No             50 (c)     203 (d)     253 (r2)
Total          61 (c1)    238 (c2)    299 (n)

TABLE 6.9
Smoking     Shipbuilding   Cases   Controls
No          Yes            11      35
            No             50      203
Moderate    Yes            70      42
            No             217     220
Heavy       Yes            14      3
            No             96      50
$$\frac{ad}{n} = \frac{70 \times 220}{549} = 28.05$$
$$\frac{bc}{n} = \frac{42 \times 217}{549} = 16.60$$