Hypothesis Testing
Central Limit Theorem
As the sample size gets large enough, the sampling distribution of the sample mean becomes almost normal, regardless of the shape of the population.
Central Limit Theorem
For almost all populations, the sample mean is normally or approximately normally distributed. The mean of this distribution is equal to the mean of the population, and its standard deviation is obtained by dividing the population standard deviation by the square root of the sample size:
X̄ ~ N(μ, σ²/n), because the CLT states that the mean of X̄ is μ and the standard deviation of X̄ is σ/√n.
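As a quick illustration, here is a short R simulation sketch; the exponential population with mean 1 is just an arbitrary example of a skewed population.
# CLT in action: means of repeated samples from a skewed (exponential) population
set.seed(1)
n <- 30                                            # sample size
means <- replicate(10000, mean(rexp(n, rate = 1)))
mean(means)    # close to the population mean, 1
sd(means)      # close to sigma / sqrt(n) = 1 / sqrt(30), about 0.18
hist(means)    # roughly bell-shaped despite the skewed population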
Central Limit Theorem
⬥ If the original population is normal, the sampling distribution of the mean is normal even for a sample of only 1 case
⬥ The farther the original population is from normal, the larger the sample required to approach normality
⬥ Even for populations that are far from normal, a modest number of cases (commonly around 30) gives an approximately normal sampling distribution of the mean
When the Population is Normal
Population distribution: normal, with μ = 50 and σ = 10.
Sampling distributions of the mean: the standard error is 10/√4 = 5 for n = 4 and 10/√16 = 2.5 for n = 16, while the mean of the sampling distribution stays at 50.
When The Population is Not Normal
Population distribution: not normal (skewed), with μ = 50 and σ = 10.
Sampling distributions of the mean: the standard error is 10/√4 = 5 for n = 4 and 10/√30 ≈ 1.8 for n = 30, the mean stays at 50, and the shape becomes approximately normal as n grows.
The Normal Distribution
⬥ Along the X axis you see Z scores, i.e. standardized deviations from the mean: Z = (x − μ) / σ
• Just think of Z scores as units denominated in standard deviations. For example, with μ = 50 and σ = 10, the value x = 65 has Z = (65 − 50)/10 = 1.5, i.e. it lies 1.5 standard deviations above the mean.
Characteristics of Hypothesis
One-sided (right tailed) test
One-sided (left tailed) test
Level of Significance
⬥ Represented by α
⬥ This is the probability of committing the
Type I error
⬥ It measures the amount of risk associated with the decision
⬥ This factor has to be chosen before the sample information is collected
⬥ It is commonly set at 0.01 or 0.05
How to compute the level of
significance?
⬥ To measure the level of statistical
significance of the result, the investigator
first needs to calculate the p-value
⬥ The p-value is the probability of observing an effect at least as extreme as the one found, given that the null hypothesis is true
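A short R sketch of this calculation, using an illustrative t statistic of 3.07 with 30 degrees of freedom (the values from the energy-bar example further below):
t_stat <- 3.07                              # observed t statistic
p_value <- 2 * pt(-abs(t_stat), df = 30)    # two-tailed p-value, about 0.0045
p_value < 0.05                              # TRUE: significant at the 0.05 level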
Common Tests
T-test
⬥ The t-test is a basic test that is limited to
two groups.
⬥ For multiple groups, you would have to compare each pair of groups; for example, with three groups there would be three tests (AB, AC, BC); see the R sketch after this list
⬥ The basic principle is to test the null
hypothesis that the means of the two
groups are equal.
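A minimal R sketch of these pairwise comparisons, using small simulated scores for three hypothetical groups A, B and C (placeholder values, not data from the examples here):
# pairwise two-sample t-tests for every pair of groups (AB, AC, BC)
set.seed(1)
scores <- c(rnorm(10, 50, 5), rnorm(10, 52, 5), rnorm(10, 55, 5))   # hypothetical data
group  <- rep(c("A", "B", "C"), each = 10)
pairwise.t.test(scores, group, p.adjust.method = "bonferroni")      # adjusts for the 3 comparisons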
T-test
⬥ The t-test assumes:
⬦ A normal distribution (parametric data)
⬦ Underlying variances are equal (if not, use Welch's test)
⬥ It is used when there is random assignment and only two sets of measurements to compare.
⬥ There are two main types of t-test (see the R sketch after this list):
⬦ Independent-measures t-test: when samples are not matched.
⬦ Matched-pair t-test: when samples appear in pairs (e.g. before-and-after).
⬥ A single-sample t-test compares a sample against a known figure, for example where measures of a manufactured item are compared against the required standard.
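A minimal R sketch of the three calls, using small placeholder vectors (the numbers are made up for illustration and are not data from these examples):
x      <- c(19.8, 20.4, 21.1, 20.7, 19.5)   # hypothetical sample 1
y      <- c(22.0, 21.6, 23.1, 22.4, 21.9)   # hypothetical sample 2
before <- c(150, 160, 145, 155)             # hypothetical paired measurements
after  <- c(140, 152, 138, 149)
t.test(x, mu = 20)                     # single-sample: compare x against a known value
t.test(x, y)                           # independent-measures (Welch's test by default)
t.test(x, y, var.equal = TRUE)         # classic pooled-variance version
t.test(before, after, paired = TRUE)   # matched-pair (before-and-after)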
T-test Applications
⬥ To compare the mean of a sample with population
mean.
⬥ To compare the mean of one sample with the mean
of another independent sample.
⬥ To compare values (readings) from the same sample on two occasions.
One Sample Test
(Sample mean and population mean )
⬥ Ho: Sample mean=Population mean.
⬥ Degrees of freedom = n – 1
Sample mean and population mean
Example: The following data represents hemoglobin
values in gm/dl for 10 patients:
10.5 9 6.5 8 11 7 8.5 9.5 12 7.5
Does the mean value for these patients differ significantly from the mean value of the general population (12 gm/dl)?
Mean = 8.95, SD = 1.80, so t = (8.95 − 12) / (1.80/√10) = −5.35
Df = 9
Tabulated value for 9 df and 0.05 level of significance = 2.262
|Calculated value| > tabulated value
Reject Ho.
There is a statistically significant difference between the mean of the sample and the population mean, and this difference is unlikely to be due to chance.
T Table
Calculating the p-value from the t-value
Example in R
t.test(data$V1,mu=12)
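If the hemoglobin values are entered directly rather than read into data$V1 from a file, an equivalent self-contained call is:
hb <- c(10.5, 9, 6.5, 8, 11, 7, 8.5, 9.5, 12, 7.5)   # the 10 values above
t.test(hb, mu = 12)                                  # t ≈ -5.35, df = 9, p < 0.001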
Example
⬥ We measure the grams of protein for a
sample of energy bars. The label claims that
the bars have 20 grams of protein. We want
to know if the labels are correct or not.
Energy Bar - Grams of Protein
20.70 27.46 22.15 19.85 21.29 24.75
20.75 22.91 25.34 20.33 21.54 21.08
22.14 19.56 21.10 18.04 24.12 19.95
19.72 18.28 16.26 17.46 20.53 22.12
25.06 22.44 19.08 19.88 21.39 22.33 25.79
⬥ n=31
⬥ Ho:μ=20
Ha:μ≠20
⬥ t=Difference/Standard Error=1.40/0.456=3.07
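The same test in R, entering the 31 protein values from the table above:
protein <- c(20.70, 27.46, 22.15, 19.85, 21.29, 24.75,
             20.75, 22.91, 25.34, 20.33, 21.54, 21.08,
             22.14, 19.56, 21.10, 18.04, 24.12, 19.95,
             19.72, 18.28, 16.26, 17.46, 20.53, 22.12,
             25.06, 22.44, 19.08, 19.88, 21.39, 22.33, 25.79)
t.test(protein, mu = 20)   # t ≈ 3.07, df = 30, p < 0.01: reject Ho, the mean differs from 20 g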
Two Sample Test
(Mean of two independent samples)
⬥ Ho: Mean of sample 1 = Mean of sample 2
⬥ Degrees of freedom = n1 + n2 − 2
Two Sample Tests
(Mean of Two samples)
The following data represent weight in kg for 10 males and 12 females.
Males: 80 75 95 55 60 70 75 72 80 65
Females: 60 70 50 85 45 60 80 65 70 62 77 82
Two Sample Tests
(Mean of Two samples)
Mean1 = 72.7, Mean2 = 67.17
Variance1 = 128.46, Variance2 = 157.787
Df = n1 + n2 − 2 = 20
Pooled variance = (9 × 128.46 + 11 × 157.787) / 20 = 144.59
Standard error = √(144.59 × (1/10 + 1/12)) = 5.15
t = (72.7 − 67.17) / 5.15 = 1.074
Tabulated t (df = 20), with level of significance 0.05, two tails, = 2.086
Calculated value < tabulated value
We fail to reject Ho and conclude that there is no statistically significant difference between the mean weights of males and females.
P > 0.05.
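The same comparison in R, using the classic pooled-variance test to match the hand calculation:
males   <- c(80, 75, 95, 55, 60, 70, 75, 72, 80, 65)
females <- c(60, 70, 50, 85, 45, 60, 80, 65, 70, 62, 77, 82)
t.test(males, females, var.equal = TRUE)   # t ≈ 1.07, df = 20, p > 0.05: fail to reject Ho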
Paired Two Sample Test
One sample in two occasions
Example: blood pressure readings for 8 patients before and after treatment.
Calculated t = 9.38
Tabulated t (df = 7), with level of significance 0.05, two tails, = 2.365
We reject Ho and conclude that there is significant difference between BP readings before and after treatment.
P < 0.05.
t.test(data$V1,data$V2,paired = TRUE)
Paired t-test
data: data$V1 and data$V2
t = 9.3876, df = 7, p-value = 3.24e-05
alternative hypothesis: true difference in means is
not equal to 0
95 percent confidence interval:
43.48397 72.76603
sample estimates:
mean of the differences
58.125
Z test
Suppose we randomly sampled subjects from an honors program.
We want to determine whether their mean IQ score differs from
the general population. The general population’s IQ scores are
defined as having a mean of 100 and a standard deviation of 15.
Null (H0): µ = 100; Alternative (HA): µ ≠ 100
Sample mean (x̄): 107; Sample size (n): 25
Hypothesized population mean (µ0): 100; Population standard deviation (σ): 15
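A minimal worked sketch of this z test in R, computing the statistic directly from the values above:
xbar <- 107; mu0 <- 100; sigma <- 15; n <- 25
z <- (xbar - mu0) / (sigma / sqrt(n))   # z = 7 / 3, about 2.33
p <- 2 * pnorm(-abs(z))                 # two-tailed p-value, about 0.02
p < 0.05                                # TRUE: reject H0, the honors-program mean IQ differs from 100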
Suppose a teacher claims that his section's students will score
higher than his colleague's section. The mean score is 22.1 for
60 students belonging to his section with a standard deviation
of 4.8. For his colleague's section, the mean score is 18.8 for 40
students and the standard deviation is 8.1. Test his claim at α =
0.05.
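A worked sketch in R, treating this as a two-sample z test with the section standard deviations taken as known:
x1 <- 22.1; s1 <- 4.8; n1 <- 60   # teacher's section
x2 <- 18.8; s2 <- 8.1; n2 <- 40   # colleague's section
z <- (x1 - x2) / sqrt(s1^2/n1 + s2^2/n2)   # about 2.32
z > qnorm(0.95)                            # TRUE: 2.32 > 1.645, the right-tailed critical value at alpha = 0.05
# so we reject Ho and the data support the claim that his section scores higher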
Power of test
The Neyman-Pearson Lemma
⬥ The Neyman-Pearson Lemma is central to problems that require high-accuracy decisions or conclusions.
⬥ It offers a powerful framework for making informed
decisions based on statistical evidence.
⬥ At the heart of the Neyman-Pearson Lemma lies
the concept of statistical power. Statistical power
represents the ability of a hypothesis test to detect a true
effect or difference when it exists in the population.
⬥ The lemma emphasizes the importance of optimizing this
power while controlling the risk of both Type I and Type II
errors.
The Neyman-Pearson Lemma
⬥ The Neyman-Pearson Lemma allows us to
strike a balance between these errors by
maximizing power while setting a
predetermined significance level (the
probability of Type I error).
⬥ It states that the likelihood ratio test is the
most powerful test for a given significance
level in binary hypothesis testing.
The Neyman-Pearson Lemma
⬥ The likelihood ratio test compares the likelihoods of
the observed data under the null and alternative
hypotheses and accepts the alternative hypothesis if
the likelihood ratio exceeds a certain threshold.
Mathematically, the likelihood ratio test is given by:
⬥ Reject H0 if L(x) = f1(x) / f0(x) > k
⬥ where k is a threshold determined based on the
desired significance level α. The threshold k is
chosen such that the probability of Type I error (false
positive) is equal to α.
NP Lemma Example
⬥ Null hypothesis (H0): The patient does not have the
disease.
⬥ Alternative hypothesis (H1): The patient has the disease.
⬥ We want to design a test that can accurately determine
whether a patient has a specific disease or not. We need to
balance the risks of two types of errors:
⬥ Type I error (false positive): Rejecting the null hypothesis
(saying the patient has the disease) when the patient is
actually healthy.
⬥ Type II error (false negative): Failing to reject the null hypothesis (saying the patient is healthy) when the patient actually has the disease.
⬥ H0: The biomarker levels follow a normal distribution with parameters μ0 (mean under H0) and σ0 (standard deviation under H0).
⬥ H1: The biomarker levels follow a normal distribution with parameters μ1 (mean under H1) and σ1 (standard deviation under H1).
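A minimal R sketch of this likelihood ratio test, using placeholder parameter values (μ0 = 50, σ0 = 5 under H0 and μ1 = 60, σ1 = 5 under H1; these numbers are assumed for illustration only):
mu0 <- 50; sd0 <- 5   # assumed biomarker distribution for healthy patients (H0)
mu1 <- 60; sd1 <- 5   # assumed biomarker distribution for diseased patients (H1)
L <- function(x) dnorm(x, mu1, sd1) / dnorm(x, mu0, sd0)   # likelihood ratio f1(x) / f0(x)
# Choose k so that P(L(X) > k | H0) = alpha. With equal standard deviations L(x) increases with x,
# so "L(x) > k" is equivalent to "x > crit", where crit is the upper alpha quantile under H0.
alpha <- 0.05
crit <- qnorm(1 - alpha, mu0, sd0)   # critical biomarker level, about 58.2
k <- L(crit)                         # corresponding likelihood-ratio threshold
x <- 59                              # a hypothetical patient's biomarker reading
L(x) > k                             # TRUE: reject H0 and classify the patient as diseased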