C-9 Hypothesis Testing
01/23/2023 1
Learning objectives
Hypothesis Testing
• We often gather sample data in order to assess how much
evidence there is against a specific hypothesis about the
population
• We use a process known as hypothesis testing to quantify
the evidence against a particular hypothesis
• The purpose of hypothesis testing is to aid the researcher
in reaching a decision concerning a population by
examining a sample from that population
Definition:
A statement about one or more populations
A statistical hypothesis is an assumption or statement, which may
or may not be true, concerning one or more populations.
Hypothesis testing is the process of judging the statistical
significance of a study finding: the chosen test statistic is
calculated from a sample of the population and compared with the
critical value set by the preset significance level. If the
statistic is at least as extreme as the critical value, the result
is termed statistically significant.
Con…
• A hypothesis is a testable statement that describes the nature
of the proposed relationship between two or more variables
of interest
• The purpose of the study is to collect data which will allow
the researcher to test the hypothesis
• Hypothesis testing is a statistical method usually used when
a comparison has to be made, such as between two drugs or
two procedures
Examples of Research Hypotheses
Population Mean
• The average length of stay of patients admitted to the
hospital is five days
• The mean birth weight of babies delivered by mothers
with low socioeconomic status (SES) is lower than that of
babies delivered by mothers with higher SES.
Population Proportion
• The proportion of adult smokers in Debre Markos town is
p = 0.40
• The prevalence of HIV among non-married adults is
higher than that in married adults, etc
There are five ingredients to any statistical test
Null Hypothesis
Alternate Hypothesis
Test Statistic
Rejection/Critical Region
Conclusion
Types of Hypothesis
1. Null Hypothesis
2. Alternate Hypothesis
1. The null hypothesis (H0)
e.g. The mean height of College of Medicine and Health Sciences
students of Wollo University is 1.63m
H0 : μ = 1.63 m
• At present only 60% of patients with leukemia survive more than
6 years. A doctor develops a new drug. Of 40 patients, chosen at
random, on whom the new drug is tested, 26 are alive after 6
years. Is the new drug better than the former treatment?
• The null hypothesis of the above statement is written as :
H0 : p = 0.60
– Here, we are questioning whether the proportion of patients
who survive more than 6 years under the new treatment is still
0.60 (we hope that it has improved; this will be shown in our
choice of HA in the next section).
2. Alternative Hypothesis (HA)
Con…
Is a statement of what we will believe is true if our sample
data causes us to reject H0.
Is generally the hypothesis that is believed (or needs to be
supported) by the researcher
Is a statement that disagrees with (opposes) H0
(The effect of interest is not zero)
Never contains the “=”, “≤” or “≥” sign
May or may not be accepted
Con…
• In clinical research, the alternate hypothesis would state
that there is a difference in treatment outcomes between
the new drug and a placebo.
• Consider the previous example (patients with leukemia)
H0: p = 0.60
HA: p > 0.60
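As a preview of where the leukemia example leads, the sketch below computes a one-sample Z statistic for a proportion. It assumes the one-sided alternative p > 0.60 (suggested by the question "is the new drug better?"), the normal approximation to the binomial, and α = 0.05; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.60        # survival proportion under H0 (former treatment)
n = 40           # patients given the new drug
survivors = 26   # alive after 6 years

p_hat = survivors / n                  # observed proportion, 0.65
se = sqrt(p0 * (1 - p0) / n)           # standard error computed under H0
z = (p_hat - p0) / se                  # test statistic
p_value = 1 - NormalDist().cdf(z)      # upper-tail probability (assumed HA: p > 0.60)

alpha = 0.05                           # assumed significance level
print(f"z = {z:.3f}, one-sided p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

Under these assumptions the observed 65% survival is not far enough above 60% to reject H0 with only 40 patients.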
Steps involved in testing about a hypothesis
1. State the null and alternative hypotheses
Two‐tailed: H0: μ = μ0 versus HA: μ ≠ μ0
One‐tailed: H0: μ = μ0 versus HA: μ > μ0 (or HA: μ < μ0)
Research Hypotheses: The thing we are primarily interested in
“proving”
Null Hypothesis: Things are what they say they are, status quo
Con…
2. Select a sample and collect data
• Categorical, continuous
3. Decide on the appropriate test statistic for the
hypothesis (Z, t, χ2, F, etc.).
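E.g., for one population mean:
Z = (x̄ - μ0) / (σ / √n), when σ is known
or
t = (x̄ - μ0) / (s / √n), with n - 1 degrees of freedom, when σ
is unknown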
• The most frequently used values of α and the corresponding
critical values of Z are:
α (level of significance)   Two-tailed   One-tailed (less than)   One-tailed (greater than)
0.10                        ±1.645       -1.28                    1.28
0.05                        ±1.96        -1.645                   1.645
0.01                        ±2.58        -2.33                    2.33
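The critical values above can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard library; the function name `z_critical` and its `tail` argument are illustrative, not part of the lecture.

```python
from statistics import NormalDist

def z_critical(alpha, tail="two"):
    """Critical value of Z for a given significance level and tail type."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # alpha is split equally between the two tails; returns the positive value
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "greater":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "less":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'greater' or 'less'")

for alpha in (0.10, 0.05, 0.01):
    print(alpha,
          round(z_critical(alpha, "two"), 3),
          round(z_critical(alpha, "less"), 3),
          round(z_critical(alpha, "greater"), 3))
```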
6. Perform the calculation
7. Draw and state the conclusion.
– If the numerical value of the test statistic falls in the
rejection region, we reject the null hypothesis and conclude
that the alternative hypothesis is true. The hypothesis‐testing
process will lead to this conclusion incorrectly only 100α% of
the time when H0 is true.
– If the test statistic does not fall in the rejection region, we
do not reject H0.
Errors of hypothesis testing:
The null hypothesis is either true or false.
Correspondingly, H0 is either not rejected or rejected
Hypothesis Test for One Sample
• Test for single mean
• Test for single proportion
Hypothesis testing for a single mean (known variance)
• Example :
– Researchers are interested in the mean level of some
enzyme in a certain population. They are asking: can
we conclude that the mean enzyme level in this
population is different from 25?
– A sample of size 10 is drawn from a normally distributed
population with a known variance, σ² = 45. The
calculated sample mean is 22.
Step 1: H0: μ = 25
H1: μ ≠ 25
Step 7: Since -1.41 falls in the non-rejection region, we
fail to reject the null hypothesis: the data do not provide
sufficient evidence that the mean enzyme level differs from 25.
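The intermediate steps of the enzyme example can be sketched as follows. The significance level α = 0.05 is an assumption (the slides carrying steps 2 to 6 did not survive extraction), and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 25       # hypothesized population mean (H0)
sigma2 = 45    # known population variance
n = 10         # sample size
xbar = 22      # sample mean

z = (xbar - mu0) / sqrt(sigma2 / n)       # test statistic, about -1.41
alpha = 0.05                              # assumed significance level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value, about 1.96
p_value = 2 * NormalDist().cdf(-abs(z))   # two-tailed p-value

reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "do not reject H0")
```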
Example
• Serum amylase determinations were made on a sample
of 15 apparently healthy subjects. The sample yielded a
mean of 96 units/100 ml and a standard deviation of 35
units/100 ml. The population was normally distributed
but the variance was unknown. We want to know
whether we can conclude that the mean of the
population is different from 120.
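Because the population variance is unknown and the sample is small, a t statistic with n - 1 = 14 degrees of freedom is the usual choice here. The sketch below works through the serum amylase example assuming α = 0.05 (two-tailed); the critical value 2.145 is taken from a standard t table rather than computed, since Python's standard library has no t distribution.

```python
from math import sqrt

n = 15       # sample size
xbar = 96    # sample mean (units/100 ml)
s = 35       # sample standard deviation (units/100 ml)
mu0 = 120    # hypothesized population mean (H0)

t = (xbar - mu0) / (s / sqrt(n))   # test statistic with n - 1 = 14 df

# Table value t(0.975, df = 14); assumes alpha = 0.05, two-tailed
T_CRIT_14DF = 2.145

reject = abs(t) > T_CRIT_14DF
print(f"t = {t:.3f}, critical value = ±{T_CRIT_14DF}")
print("reject H0" if reject else "do not reject H0")
```

Under these assumptions the statistic falls in the rejection region, so the data suggest the population mean differs from 120.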
Hypothesis Tests for single Proportion
Hypothesis Testing about a Single Population Proportion
(Normal Approximation to Binomial Distribution)
• In the general population of 0 to 4-year-olds, the annual
incidence of asthma is 1.4%.
• If 10 cases of asthma are observed over a single year in a
sample of 500 children whose mothers smoke, can we
conclude that this is different from the underlying
probability of P = 0.014? Use α = 0.05.
H0 : P=0.014
HA: P≠0.014
Con…
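• The test statistic is given by:
Z = (p̂ - P0) / √(P0(1 - P0) / n)
where p̂ is the sample proportion, P0 the hypothesized proportion
and n the sample size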
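For the asthma example, a sketch using the normal-approximation statistic Z = (p̂ - P0) / √(P0(1 - P0) / n) is shown below. It assumes a two-tailed test at α = 0.05; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.014    # annual incidence in the general population (H0)
n = 500       # children whose mothers smoke
cases = 10    # observed asthma cases

p_hat = cases / n                        # sample proportion, 0.02
se = sqrt(p0 * (1 - p0) / n)             # standard error under H0
z = (p_hat - p0) / se                    # test statistic
p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed p-value

alpha = 0.05                             # assumed significance level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "do not reject H0")
```

With these inputs the statistic stays inside ±1.96, so the sample does not show a significant departure from 1.4%.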
Step 1: H0: P = 0.90
HA: P < 0.90
⇒ Z = (0.875 - 0.90) / 0.015 = -1.67
4. Select the level of significance for the statistical test
(α=0.05)
5 and 6: Determine the critical value and perform the
calculation
• The critical value is the value the test statistic must reach,
when H0 is true, for the result to be declared significant
• Z tab (α = 0.05) = -1.645, and
• reject H0 if Z calc < -1.645
Two sample mean and proportion
– Let x1, x2, …, xn1 be a sample from the first population and
y1, y2, …, yn2 be a sample from the second population
The steps to test a hypothesis about the difference of means
are the same as for a single mean
Example:
– In a large hospital for the treatment of people with
intellectual disabilities, a sample of 12 individuals with
Down syndrome yielded a mean serum uric acid value of
4.5 mg/100 ml
– In a general hospital, a sample of 15 normal individuals of the
same age and sex was found to have a mean value of 3.4 mg/100 ml
– If it is reasonable to assume that the two populations of values
are normally distributed with variance equal to 1, do these
data provide sufficient evidence to indicate a difference in
mean serum uric acid levels between normal individuals and
individuals with Down syndrome?
[Figure: standard normal curve with rejection regions below -1.96
and above 1.96; the computed Z = 2.82 lies in the upper rejection
region]
⇒ Z = 2.82
• Since 2.82>1.96 reject the null hypothesis, i.e., there is
an indication that the means are not equal.
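The uric acid comparison can be sketched as a two-sample Z test with known variances. The code assumes α = 0.05 (two-tailed), matching the ±1.96 limits above. Computed without intermediate rounding, the statistic comes out near 2.84; the 2.82 quoted above appears to result from rounding the standard error to 0.39 before dividing. The conclusion is the same either way.

```python
from math import sqrt
from statistics import NormalDist

n1, xbar1 = 12, 4.5      # first sample: size and mean (mg/100 ml)
n2, xbar2 = 15, 3.4      # second sample (normal individuals)
var1 = var2 = 1.0        # population variances, assumed known and equal to 1

se = sqrt(var1 / n1 + var2 / n2)   # standard error of the difference in means
z = (xbar1 - xbar2) / se           # test statistic under H0: mu1 = mu2

alpha = 0.05                       # assumed significance level
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit
print(f"standard error = {se:.4f}, z = {z:.2f}")
print("reject H0" if reject else "do not reject H0")
```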
Hypothesis testing about the difference between two
population proportions
• H0: π1 = π2
• HA: π1 ≠ π2
• Because the null hypothesis states that π1 = π2, we use a pooled
sample estimate (p) of the common hypothesized proportion; it is a
weighted average of the sample proportions, with the sample sizes
as weights
p = (n1 p1 + n2 p2) / (n1 + n2)
p can be easily computed by summing all of the successes and
dividing by the total sample size
Example
• Two hundred patients suffering from a certain disease
were randomly divided into two equal groups. Of the
first group, 78 recovered within three days. Out of the
other 100, who were treated by a new method, 90
recovered within three days. The physician wished to
know whether the data provide sufficient evidence to
indicate that the new treatment is more effective than
the standard.
Con…
• H0: π1 = π2
• H1: π1 < π2
• p1 = 78/100 = 0.78, p2 = 90/100 = 0.90
• p = (n1 p1 + n2 p2) / (n1 + n2)
• p = (100(0.78) + 100(0.90)) / (100 + 100) = 0.84
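To carry the recovery example through to a decision, the sketch below uses the pooled two-proportion statistic Z = (p1 - p2) / √(p(1 - p)(1/n1 + 1/n2)). This form and α = 0.05 are assumptions, since the slides showing the statistic and the conclusion are missing from the extracted text.

```python
from math import sqrt
from statistics import NormalDist

n1, recovered1 = 100, 78   # standard treatment
n2, recovered2 = 100, 90   # new method

p1 = recovered1 / n1
p2 = recovered2 / n2
p_pooled = (recovered1 + recovered2) / (n1 + n2)   # all successes / total sample size

se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se                  # negative when the new method does better

alpha = 0.05                        # assumed significance level
z_crit = NormalDist().inv_cdf(alpha)   # lower-tail critical value for H1: pi1 < pi2
p_value = NormalDist().cdf(z)          # one-sided p-value
reject = z < z_crit
print(f"pooled p = {p_pooled:.2f}, z = {z:.3f}, p-value = {p_value:.4f}")
print("reject H0" if reject else "do not reject H0")
```

Under these assumptions the statistic falls below the lower critical value, which would support the claim that the new treatment is more effective.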
Tests of Significance
Confidence interval or P-value
P‐Value
• Another vital concept related to significance is the P
value, commonly reported in scientific journals.
• The P value is related to a hypothesis test; it is the
probability of obtaining a result as extreme as (or more
extreme than) the one observed, if the null hypothesis is
true.
• In other words, the P‐value answers the question: assuming
that chance alone was involved in creating the outcome, i.e.,
assuming the null hypothesis is correct, what is the
probability that we would have seen an outcome at least as
extreme as the one observed?
• We can use our data to calculate how probable a finding at
least this extreme would be by chance alone, under the null
hypothesis
Con…
• A P-value is the probability of getting the observed
difference, or more extreme, in the sample purely by
chance from a population where the true difference is
zero.
• If the P‐value is small, meaning the observed outcome
would have been unlikely under the null hypothesis, we
reject the idea that chance played the only role in the
observed difference between groups and
• conclude, say in the example above, that the new
treatment does in fact have an effect on outcome
compared with the standard
• How small is the P-value?
Con…
• The P value is calculated after the statistical test has been
performed; if the P value is less than α (e.g., 0.05), the null
hypothesis is rejected (the result is statistically significant)
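For tests based on a Z statistic, the P value and the decision rule above can be sketched with a small helper. The function name `p_value_from_z` and its `tail` argument are illustrative; the example value z = -1.41 is the statistic from the enzyme example earlier in the lecture.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two"):
    """P value for an observed Z statistic under the standard normal."""
    std_normal = NormalDist()
    if tail == "two":
        return 2 * std_normal.cdf(-abs(z))   # both tails, at least as extreme as |z|
    if tail == "less":
        return std_normal.cdf(z)             # lower tail
    if tail == "greater":
        return 1 - std_normal.cdf(z)         # upper tail
    raise ValueError("tail must be 'two', 'less' or 'greater'")

alpha = 0.05
p = p_value_from_z(-1.41, tail="two")
significant = p < alpha
print(f"p-value = {p:.3f}, statistically significant: {significant}")
```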
95% confidence interval
Con…
• A confidence interval (CI) is a far more informative measure
than the P value for evaluating the role of chance
1. It provides the information that the P-value gives
– If the null value is included in a 95% CI, by definition the
corresponding P-value is > 0.05
2. It indicates the amount of variability (effect of sample size)
by the width of the CI
– This information cannot be obtained from the P-value
– A wide CI indicates greater variability and suggests inadequacy
of the sample size
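The link between a 95% CI and the P-value can be illustrated with the enzyme data used earlier (sample mean 22, known variance 45, n = 10, null value 25). This is a minimal sketch for a mean with known variance; the helper name `mean_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(xbar, sigma2, n, level=0.95):
    """Confidence interval for a mean when the population variance is known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # about 1.96 for a 95% CI
    half_width = z * sqrt(sigma2 / n)
    return xbar - half_width, xbar + half_width

low, high = mean_ci(xbar=22, sigma2=45, n=10)
null_value = 25
contains_null = low <= null_value <= high
print(f"95% CI: ({low:.2f}, {high:.2f}); contains {null_value}: {contains_null}")

# A larger sample with the same mean and variance gives a narrower interval
low_big, high_big = mean_ci(xbar=22, sigma2=45, n=100)
print(f"n = 100: ({low_big:.2f}, {high_big:.2f})")
```

The interval for n = 10 contains 25, in line with a two-tailed P-value above 0.05 for the same data, and the interval narrows as the sample size grows.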
Con…
• Chi-Square test allows us to test for association between two
categorical variables.
• Definition: A statistic which measures the
discrepancy between k observed frequencies O1, O2, …, Ok
and the corresponding expected frequencies
e1, e2, …, ek.
Chi square = χ2 = Σ (Oi - ei)² / ei
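The statistic can be computed directly from a contingency table. The sketch below makes two assumptions that are standard but not shown in the surviving text: each expected frequency is (row total × column total) / grand total, and the degrees of freedom are (rows - 1)(columns - 1). It reuses the recovery data from the two-proportion example (78 of 100 versus 90 of 100 recovered); the critical value 3.841 is the table value for 1 degree of freedom at α = 0.05.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total   # expected frequency
            statistic += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Rows: standard treatment, new method; columns: recovered, not recovered
table = [[78, 22],
         [90, 10]]
stat, df = chi_square(table)

CHI2_CRIT_1DF = 3.841   # table value for df = 1, alpha = 0.05
print(f"chi-square = {stat:.3f}, df = {df}")
print("reject H0" if stat > CHI2_CRIT_1DF else "do not reject H0")
```

For a 2x2 table this statistic equals the square of the pooled two-proportion Z statistic, so the two approaches agree on a two-sided test.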
Calculation of expected values
Characteristics
1. Every χ2 distribution extends indefinitely to the right from 0.
2. Every χ2 distribution has only one (right ) tail.
Degrees of freedom for χ2
• For a 2x2 table, use Fisher’s Exact test when the total number
of observations is less than 20, or when it is greater than 20
and the smallest of the four expected frequencies is less
than 5.
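For reference, a two-sided Fisher's exact test on a 2x2 table can be sketched from hypergeometric probabilities. This is one common definition of the two-sided P value (summing the probabilities of all tables with the same margins that are no more likely than the observed one); statistical packages may use slightly different conventions. The table used here is hypothetical and not taken from the lecture.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over tables at most as probable as the observed one
    # (small relative tolerance guards against floating-point ties)
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical small table with 8 observations in total
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"Fisher's exact two-sided p-value = {p_value:.4f}")
```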
Thank You!