L7-Hypothesis Testing

This document provides information about hypothesis testing. It defines key terms like the null hypothesis, alternative hypothesis, and type I and type II errors. It outlines the steps in hypothesis testing, including choosing a significance level, selecting a test statistic, and making a decision to reject or fail to reject the null hypothesis based on comparing the test statistic to critical values. It also discusses p-values and how they are used to determine if a result is statistically significant. Different types of hypothesis tests are described, including ones for single means, both with known and unknown variances, and ones for testing proportions. An example walks through testing a hypothesis about a population proportion.

Uploaded by

Ahmed Ebiso

NEGELE ARSI GENERAL HOSPITAL

AND MEDICAL COLLEGE

HYPOTHESIS TESTING

1
Hypothesis Testing
• The formal process of hypothesis testing
provides us with a means of answering
research questions.
• A hypothesis is a testable statement that
describes the nature of the proposed
relationship between two or more
variables of interest.
• The purpose of the study is to collect data
which will allow the researcher to test the
hypothesis.
2
Idea of hypothesis testing

3
Null and Alternative hypotheses

• The null hypothesis (represented by H0) is the statement about the
value of the population parameter. That is, the null hypothesis
postulates that ‘there is no difference between factor and
outcome’ or ‘there is no intervention effect’.
• The alternative hypothesis (represented by HA) states the ‘opposing’
view that ‘there is a difference between factor and outcome’ or
‘there is an intervention effect’.
• Hypotheses are often stated in a null form, so as to allow them to
be refuted.
E.g. ‘not all swans are white’ (H0) as opposed to ‘all swans are
white’ (HA).
It is difficult to prove the latter, as one would supposedly have to
see all the swans in the world.
But by taking a sample of swans, we can reject or fail to reject the former
(the null) statement.
4
Steps in hypothesis testing

1. Identify the null hypothesis H0 and the alternative hypothesis HA.
2. Choose α. The value should be small, usually less than 10%. It is important to consider the consequences of both types of errors.
3. Select the test statistic and determine its value from the sample data. This value is called the observed value of the test statistic. Remember that a t statistic is usually appropriate for a small number of samples; for a larger number of samples, a z statistic can work well if the data are normally distributed.
4. Compare the observed value of the statistic to the critical value obtained for the chosen α.
5. Make a decision.
5
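Types of tests

1. Two-tailed test
$H_0: \mu = \mu_0 \quad (\mu - \mu_0 = 0)$
$H_A: \mu \neq \mu_0 \quad (\mu - \mu_0 \neq 0)$
$z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
$z_{tabulated} = z_{\alpha/2}$ for a two-tailed test

Decision: if $|z_{cal}| > z_{tab}$, reject $H_0$; if $|z_{cal}| \le z_{tab}$, do not reject $H_0$.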
6
Steps in hypothesis testing…..

If the test statistic falls in the critical region:
Reject H0 in favour of HA.

If the test statistic does not fall in the critical region:
Conclude that there is not enough evidence to reject H0.
7
Types of tests

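2. Left-tailed test
$H_0: \mu = \mu_0 \quad (\mu - \mu_0 = 0)$
$H_A: \mu < \mu_0 \quad (\mu - \mu_0 < 0)$
$z_{cal} = \dfrac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$, $z_{tabulated} = z_{\alpha}$ for a one-tailed test
Decision: if $z_{cal} < -z_{tab}$, reject $H_0$; if $z_{cal} \ge -z_{tab}$, do not reject $H_0$.

3. Right-tailed test
$H_0: \mu = \mu_0 \quad (\mu - \mu_0 = 0)$
$H_A: \mu > \mu_0 \quad (\mu - \mu_0 > 0)$
Decision: if $z_{cal} > z_{tab}$, reject $H_0$; if $z_{cal} \le z_{tab}$, do not reject $H_0$.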
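
The three decision rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `z_test`, the `alternative` labels and the numbers in the usage loop are assumptions made for this example, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """Return (z_cal, z_tab, reject_h0) for a one-sample z test with known sigma."""
    z_cal = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":      # HA: mu != mu0
        z_tab = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z_cal) > z_tab
    elif alternative == "less":         # HA: mu < mu0
        z_tab = std_normal.inv_cdf(1 - alpha)
        reject = z_cal < -z_tab
    elif alternative == "greater":      # HA: mu > mu0
        z_tab = std_normal.inv_cdf(1 - alpha)
        reject = z_cal > z_tab
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z_cal, z_tab, reject


# Hypothetical numbers, for illustration only: xbar = 52, mu0 = 50, sigma = 5, n = 25
for alt in ("two-sided", "less", "greater"):
    z_cal, z_tab, reject = z_test(52, 50, 5, 25, alternative=alt)
    print(f"{alt:>9}: z_cal = {z_cal:.2f}, z_tab = {z_tab:.3f}, reject H0: {reject}")
```

With these made-up numbers z_cal is 2.0, so H0 is rejected in the two-tailed and right-tailed tests but not in the left-tailed test.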
8
Types of errors
• There are 2 types of errors

Type of decision     H0 true                     H0 false
Reject H0            Type I error (α)            Correct decision (1-β)
Do not reject H0     Correct decision (1-α)      Type II error (β)

• A Type I error is usually regarded as the more serious error; its probability, α, is the level of significance.

• Power is the probability of rejecting a false null hypothesis and it is
given by 1-β
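
As a rough illustration of how α, β and power relate, the sketch below computes the power of a two-tailed one-sample z test. The function name `z_test_power` and all numbers in the usage lines are hypothetical and chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power (1 - beta) of a two-tailed one-sample z test with known sigma."""
    std_normal = NormalDist()
    z_tab = std_normal.inv_cdf(1 - alpha / 2)
    # Under the true mean, z_cal is normal with this mean and sd 1
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # Probability that z_cal falls in the upper or the lower critical region
    return std_normal.cdf(shift - z_tab) + std_normal.cdf(-shift - z_tab)


# Hypothetical values: H0 mean 100, true mean 105, sigma 10
power_small = z_test_power(100, 105, 10, n=16)
power_large = z_test_power(100, 105, 10, n=64)
print(f"power with n=16: {power_small:.3f}, beta = {1 - power_small:.3f}")
print(f"power with n=64: {power_large:.3f}, beta = {1 - power_large:.3f}")
```

When the true mean equals the hypothesized mean, the function returns α, which is the Type I error rate; power grows as the sample size increases.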

9
Test Statistics
In hypothesis testing we always start with
statements for H0 and HA.
Because of random variation, even an unbiased
sample may not accurately represent the
population as a whole.
As a result, it is possible that any observed
differences or associations may have occurred by
chance.
Statistical testing of a research hypothesis allows
the researcher to quantify the risk or error
involved in making inferences about a population
based on the information obtained from a
sample.
10
Test Statistics…...

• A test statistic is a value that we can compare with the known
distribution of what we expect when the null hypothesis
is true.

• The general formula of the test statistic is:

$\text{Test statistic} = \dfrac{\text{Observed value} - \text{Hypothesized value}}{\text{Standard error}}$

11
The P- Value
• In most applications, the outcome of performing
a hypothesis test is to produce a p-value.
• The p-value is the probability of obtaining a difference
at least as large as the one observed if the null hypothesis is true,
that is, if only chance were operating.
• A small p-value implies that the probability of
the value observed occurring just by chance is
low when the null hypothesis is true.
• That is, a small p-value suggests that there
might be sufficient evidence for rejecting the
null hypothesis.
16
P-value …
• A p-value is the probability of getting the
observed difference, or one more extreme, in
the sample purely by chance from a population
where the true difference is zero.
• If the p-value is greater than 0.05 then, by
convention, we conclude that the observed
difference could have occurred by chance and
there is no statistically significant evidence (at
the 5% level) for a difference between the
groups in the population.
17
P-value and confidence interval
• Confidence intervals and p-values are based
upon the same theory and mathematics and will
lead to the same conclusion about whether a
population difference exists.
• Confidence intervals are preferable because they
give information about the size of any
difference in the population, and they also (very
usefully) indicate the amount of uncertainty
remaining about the size of the difference.

18
The P- Value…..
• But for what values of p-value should we reject the null
hypothesis?
– By convention, a p-value of 0.05 or smaller is
considered sufficient evidence for rejecting the null
hypothesis.
– By using a p-value threshold of 0.05, we are allowing a 5%
chance of wrongly rejecting the null hypothesis
when it is in fact true.

• When the p-value is less than 0.05, we often say that
the result is statistically significant.
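
A minimal sketch of this decision rule, assuming the test statistic is a z value from a two-tailed test. The helper name `two_sided_p_value` and the two z values are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z_cal):
    """Probability of a z value at least as extreme as z_cal, in either tail, under H0."""
    return 2 * (1 - NormalDist().cdf(abs(z_cal)))


# Hypothetical calculated z values, for illustration only
for z_cal in (1.50, 2.50):
    p = two_sided_p_value(z_cal)
    verdict = "statistically significant" if p < 0.05 else "not statistically significant"
    print(f"z_cal = {z_cal:.2f}: p = {p:.4f} ({verdict} at the 5% level)")
```

Here z_cal = 1.50 gives a p-value of roughly 0.13 (not significant), while z_cal = 2.50 gives roughly 0.012 (significant).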

19
Hypothesis Testing of a Single Mean
(Normally Distributed)
Known Variance
Example
• A simple random sample of 10
people from a certain population
has a mean age of 27.  Can we
conclude that the mean age of the
population is not 30?  The variance
is known to be 20. Let α = .05.
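
One way to work this example numerically is sketched below, using the two-tailed z test with known variance. The variable names are chosen for this sketch only.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example
n, xbar, mu0, variance, alpha = 10, 27, 30, 20, 0.05

# H0: mu = 30, HA: mu != 30 (two-tailed)
z_cal = (xbar - mu0) / sqrt(variance / n)
z_tab = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z_cal)))
reject_h0 = abs(z_cal) > z_tab

print(f"z_cal = {z_cal:.2f}, z_tab = {z_tab:.2f}, p = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Since |z_cal| is about 2.12 and exceeds 1.96, H0 is rejected at the 5% level: the data suggest the population mean age is not 30.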
 
Unknown Variance
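• In most practical applications the standard
deviation of the underlying population is not
known.
• In this case, σ can be estimated by the sample
standard deviation s.
• If the underlying population is normally
distributed, then the test statistic is:
$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$, which follows a t distribution with $n - 1$ degrees of freedom when $H_0$ is true.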
Example:
• A simple random sample of 14
people from a certain population
gives a sample mean body mass
index (BMI) of 30.5 and sd of
10.64. Can we conclude that the
BMI is not 35 at α 5%?
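
A sketch of the calculation for this example, assuming a two-tailed one-sample t test. The critical value 2.160 is the standard tabulated t value for 13 degrees of freedom at α = 0.05 (two-tailed); it is typed in by hand because the Python standard library has no t distribution.

```python
from math import sqrt

# Values taken from the example
n, xbar, s, mu0 = 14, 30.5, 10.64, 35

# H0: mu = 35, HA: mu != 35 (two-tailed), df = n - 1
df = n - 1
t_cal = (xbar - mu0) / (s / sqrt(n))
t_tab = 2.160  # tabulated t for df = 13, alpha = 0.05 two-tailed (from a t table)
reject_h0 = abs(t_cal) > t_tab

print(f"t_cal = {t_cal:.2f}, df = {df}, t_tab = {t_tab}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Here |t_cal| is about 1.58, which is below 2.160, so there is not enough evidence at the 5% level to conclude that the mean BMI differs from 35.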
Exercise
• A researcher claims that the
mean IQ of a sample of 16 students is
110, while the expected value for
the whole population is 100 with a
standard deviation of 10. Test
the hypothesis that the mean IQ of the
population is different from 100.
25
Hypothesis testing for proportions
Example
• In a study of childhood abuse in psychiatric patients, Brown found that 166
in a sample of 947 patients reported histories of physical or sexual abuse.
a. Construct a 95% confidence interval for the population proportion.
b. Test the hypothesis that the true population proportion is 30%.
• Solution (a)
• The 95% CI for P is given by

$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$
$= 0.175 \pm 1.96 \sqrt{\dfrac{0.175 \times 0.825}{947}}$
$= 0.175 \pm 1.96 \times 0.0124$
$= [0.151,\ 0.199]$

27
Example….
• To test the hypothesis we need to follow these steps
Step 1: State the hypotheses
H0: P = P0 = 0.3
HA: P ≠ 0.3
Step 2: Fix the level of significance (α = 0.05)
Step 3: Compute the calculated and tabulated values of the test statistic

$z_{cal} = \dfrac{\hat{p} - P_0}{\sqrt{P_0(1 - P_0)/n}} = \dfrac{0.175 - 0.3}{\sqrt{0.3(0.7)/947}} = \dfrac{-0.125}{0.0149} = -8.39$
$z_{tab} = 1.96$
28
Example….
• Step 4: Compare the calculated and
tabulated values of the test statistic
• Since the absolute calculated value (8.39) is larger than the
tabulated value (1.96), we reject the null
hypothesis.
• Step 5: Conclusion
• Hence we conclude that the proportion of
psychiatric patients reporting childhood abuse is different
from 0.3
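
The confidence interval and the test for this example can be reproduced with a short sketch. Variable names are chosen for illustration. Because the code keeps the unrounded sample proportion (166/947 ≈ 0.1753), z_cal comes out near -8.37 rather than the -8.39 obtained with the rounded value 0.175.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the example
x, n, p0, alpha = 166, 947, 0.30, 0.05

p_hat = x / n
z_tab = NormalDist().inv_cdf(1 - alpha / 2)

# (a) 95% confidence interval: standard error based on the sample proportion
se_ci = sqrt(p_hat * (1 - p_hat) / n)
ci_low, ci_high = p_hat - z_tab * se_ci, p_hat + z_tab * se_ci

# (b) Test of H0: P = 0.3, with the standard error based on the hypothesized proportion
se_h0 = sqrt(p0 * (1 - p0) / n)
z_cal = (p_hat - p0) / se_h0
reject_h0 = abs(z_cal) > z_tab

print(f"p_hat = {p_hat:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
print(f"z_cal = {z_cal:.2f}, z_tab = {z_tab:.2f}, reject H0: {reject_h0}")
```

The interval excludes 0.3 and the test rejects H0, so both approaches lead to the same conclusion.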
29
The chi-squared test
• The chi-squared (χ²) test statistic is widely used in the analysis
of contingency tables.
• It compares the actual observed frequency in each group with
the expected frequency (the latter is based on theory,
experience or comparison groups).

• The chi-squared test allows us to test for association between
categorical (nominal) variables.

• The null hypothesis for this test is that there is no association
between the variables. Consequently, a significant p-value
implies association.

30
2x2 Contingency table

31
Test Statistic: χ² test with d.f. = (r-1)×(c-1)

32
Assumptions of the χ² test

• Data must be categorical.
• The data must be frequency data (counts for frequency,
proportions / difference of proportions for prevalence
& incidence).
• The chi-squared test assumes adequate sample size,
that is, that the numbers in each cell are not too small.
– No expected frequency should be less than 1, and no
more than 20% of the expected frequencies should be
less than 5.
33
Assumptions of the χ² test…

– If some numbers are too small, row or column
categories can sometimes be combined to
make the expected frequencies larger, or Yates'
correction can be applied; otherwise Fisher’s exact test should be used
instead.
• It assumes that the measures are independent of each
other, i.e. the categories created are mutually
exclusive.
• The χ² test assumes that there is a
theoretical basis for the categorization of the
variables.
34
Example
The following table shows the results of a survey in a rural area in Central Africa to
compare the prevalences of infection with Schistosoma mansoni among different
occupations (Kirkwood, 1988).

                 Occupation
S. mansoni       Fishermen   Farmers   Traders   Craftsmen

Examined         35          43        58        29
Positive         22          21        17        15
% positive       62.9        48.8      29.3      51.7

35
Example ….
Hypothesis:
• Risk of infection and occupation are not associated (the risk of
infection is the same for all occupation groups). (Take the level
of significance, α = .05.)

Calculating expected frequencies:


Calculate expected frequencies as if there was no difference
(i.e. the null hypothesis is true) and compare with the observed.

However, before doing this, make sure that the table is arranged
in such a way that the expected frequencies can be easily
calculated.
36
The observed frequencies are reorganized
as follows:

                 Occupation
S. mansoni       Fishermen   Farmers   Traders   Craftsmen   Total

Positive         22          21        17        15          75
Negative         13          22        41        14          90

Total            35          43        58        29          165
 

37
The corresponding expected frequencies are:

                 Occupation
S. mansoni       Fishermen   Farmers   Traders   Craftsmen   Total

Positive         15.9        19.5      26.4      13.2        75
Negative         19.1        23.5      31.6      15.8        90

Total            35          43        58        29          165
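
The expected frequencies follow from (row total × column total) / grand total. A minimal sketch that reproduces the table from the observed counts, with variable names chosen for this example:

```python
# Observed counts from the reorganized table
occupations = ["Fishermen", "Farmers", "Traders", "Craftsmen"]
observed = {
    "Positive": [22, 21, 17, 15],
    "Negative": [13, 22, 41, 14],
}

row_totals = {status: sum(counts) for status, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count under H0 = row total * column total / grand total
expected = {
    status: [row_totals[status] * col / grand_total for col in col_totals]
    for status in observed
}

for status, values in expected.items():
    cells = ", ".join(f"{occ}: {val:.1f}" for occ, val in zip(occupations, values))
    print(f"{status}: {cells}")
```

For instance, the expected number of positive fishermen is 75 × 35 / 165 ≈ 15.9.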
 

38
Example…
• The chi-squared test statistic measures the distance
between these expected frequencies and what we
actually observed:
$\chi^2 = \sum \dfrac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$

• Under the null hypothesis, χ² has a chi-squared distribution.
• A large χ² value contradicts the null
hypothesis of no association.

39
Example….
► A small p-value (p<0.05) indicates that the set of
observed frequencies is further from those
predicted by the null hypothesis (the expected
frequencies) than would be expected by chance
alone.

► If the value of χ² is zero, then there is perfect
agreement between the observed and the
expected frequencies. The greater the discrepancy
between the observed and expected frequencies,
the larger will be the value of χ².

40
Example….
► In order to test the significance of the χ², the
calculated value of χ² is compared with the
tabulated value for the given df at a certain level of
significance.

χ² calc = (22-15.9)²/15.9 + (21-19.5)²/19.5
+ (17-26.4)²/26.4 + (15-13.2)²/13.2 + …
+ (14-15.8)²/15.8
= 11.09

41
Example….
• The degrees of freedom (df) in a contingency table with R rows and C
columns is:

df = (R – 1)(C – 1)

Hence, χ² tabulated with df = 3, at the .05 level of significance = 7.82

• Because 11.09 > 7.82, the null hypothesis of no association is rejected.
The corresponding P-value derived from the calculated χ² is less than 2%
(that is, P<.02).

• This suggests that there is an association between risk of infection and
occupation, the prevalence of S. mansoni being highest among fishermen
and lowest among traders.

• The commonly used threshold for accepting or rejecting the null
hypothesis in 2 by 2 tables is χ² = 3.84, which is where p=0.05
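
The whole calculation for the S. mansoni example can be sketched as follows, using only the Python standard library. Two caveats: the code uses unrounded expected frequencies, so χ² comes out near 11.03 rather than the 11.09 obtained from the rounded values; and the p-value line uses a closed-form expression that is valid only for df = 3, which is an implementation choice for this sketch, not part of the lecture.

```python
from math import erfc, exp, pi, sqrt

# Observed counts: rows = positive / negative, columns = occupations
observed = [
    [22, 21, 17, 15],
    [13, 22, 41, 14],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells
chi2_cal = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected_ij = row_totals[i] * col_totals[j] / n
        chi2_cal += (obs - expected_ij) ** 2 / expected_ij

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi2_tab = 7.82  # tabulated value for df = 3 at the .05 level, as quoted above
reject_h0 = chi2_cal > chi2_tab

# Upper-tail probability of the chi-squared distribution, closed form for df = 3 only
p_value = erfc(sqrt(chi2_cal / 2)) + sqrt(2 * chi2_cal / pi) * exp(-chi2_cal / 2)

print(f"chi2 = {chi2_cal:.2f}, df = {df}, reject H0: {reject_h0}, p = {p_value:.3f}")
```

The p-value is a little over 0.01, consistent with the statement that P < .02.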

42
Thank You All

43
TABLE 6.10
Shipbuilding   Cases      Controls    Total
Yes            11 (a)     35 (b)      46 (r1)
No             50 (c)     203 (d)     253 (r2)
Total          61 (c1)    238 (c2)    299 (n)

TABLE 6.9
Smoking    Shipbuilding   Cases   Controls
No         Yes            11      35
           No             50      203
Moderate   Yes            70      42
           No             217     220
Heavy      Yes            14      3
           No             96      50