Lecture 4
Hypothesis Testing
17/01/2024 1
Today’s outline
Hypothesis testing
• The basic idea: is a coin biased or not?
• Significance level
• Hypothesis testing for
– Proportions
– Means
– Differences between two means (Video)
Statistical claims
• 75% of parents think that the IQ of their children is higher
than the average IQ.
Hypothesis testing
Confidence intervals are used to estimate population parameters
such as population means and proportions. Hypothesis tests are
used to assess claims about such parameters.
Examples are:
– Is the advertising campaign a success?
– Has a new procedure improved the quality of the product or the
efficiency of a process?
– Are two groups (e.g., males vs females) different with regard to a
certain characteristic?
An illustration: The biased coin test
Alex thinks a coin may be biased. She throws it 10 times and
gets tails 9 times, i.e., 90%. On the basis of this evidence she
claims that the coin is biased. After all, she says, it is unlikely
that one would observe 9 tails in 10 trials if the coin was
unbiased.
An illustration: The biased coin test
If the coin was unbiased, i.e., pr(tail) = 0.5, then this
would be the distribution of the number of tails:
Picture of the distribution of the number of tails (Alex observes 9 out of 10)
• Upper tail: here you reject the hypothesis of an unbiased coin. If the
coin is indeed unbiased these outcomes (9 or more) occur with
probability 0.0108.
• Lower tail: here you also reject the hypothesis of an unbiased coin. If
the coin is indeed unbiased these outcomes (1 or less) occur with
probability 0.0108.
• Together: it is 0.0108 + 0.0108 = 0.0216.
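The tail probabilities on this slide can be reproduced from the binomial distribution. The following is a minimal sketch using only the Python standard library; the function name `tail_count_pmf` is chosen here for illustration and is not part of the lecture.

```python
from math import comb

def tail_count_pmf(k, n=10, p_tail=0.5):
    """Probability of exactly k tails in n tosses when P(tail) = p_tail."""
    return comb(n, k) * p_tail**k * (1 - p_tail)**(n - k)

# Distribution of the number of tails for an unbiased coin tossed 10 times.
distribution = [tail_count_pmf(k) for k in range(11)]

upper_tail = sum(distribution[9:])   # 9 or more tails
lower_tail = sum(distribution[:2])   # 1 or fewer tails
two_sided = upper_tail + lower_tail  # both rejection regions together

print(f"P(9 or more tails)  = {upper_tail:.4f}")
print(f"P(1 or fewer tails) = {lower_tail:.4f}")
print(f"Both tails together = {two_sided:.4f}")
```

The exact value of each tail is 11/1024, about 0.0107. The 0.0108 on the slide presumably comes from adding the rounded values 0.0098 (exactly 9 tails) and 0.0010 (exactly 10 tails).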
How sure do you want to be?
Suppose you rejected the hypothesis that a coin was
unbiased whenever you observed at least 8 tails or at
most 2 tails in 10 trials. What is the probability you would
mistakenly claim that an unbiased coin is biased?
Picture of the number of tails when an
unbiased coin is tossed 10 times
• Upper tail: here you reject the hypothesis of an unbiased coin. If the
coin is indeed unbiased these outcomes (8 or more) occur with
probability 0.0547.
• Lower tail: here you also reject the hypothesis of an unbiased coin. If
the coin is indeed unbiased these outcomes (2 or less) occur with
probability 0.0547.
• Together: it is 0.0547 + 0.0547 = 0.1094.
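How the probability of a false alarm depends on the rejection rule can be checked numerically. This is a sketch under the slide's setup (10 tosses, symmetric rejection regions); the helper `false_alarm_probability` and the list of cutoffs are illustrative choices, not taken from the lecture.

```python
from math import comb

def false_alarm_probability(cutoff, n=10):
    """P(tails >= cutoff or tails <= n - cutoff) for an unbiased coin.

    Assumes cutoff > n / 2, so the two rejection regions do not overlap.
    """
    upper = sum(comb(n, k) for k in range(cutoff, n + 1)) / 2**n
    return 2 * upper  # the distribution is symmetric when P(tail) = 0.5

# Stricter rules (higher cutoff) mistakenly reject an unbiased coin less often.
for cutoff in (10, 9, 8, 7):
    print(f"reject if tails >= {cutoff} or <= {10 - cutoff}: "
          f"{false_alarm_probability(cutoff):.4f}")
```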
Trade-offs between different errors
                  You say it is biased    You do not say it is biased
Coin is biased    Correct                 Error
Suppose I only say it is biased when I have extreme
data: the number of tails is 10 or zero!
Picture of the number of tails when an
unbiased coin is tossed 10 times
• Upper tail: here you reject the hypothesis of an unbiased coin. If the
coin is indeed unbiased this outcome (10) occurs with probability 0.001.
• Lower tail: here you also reject the hypothesis of an unbiased coin. If
the coin is indeed unbiased this outcome (zero) occurs with
probability 0.001.
Suppose I only say it is biased when I have extreme
data: the number of tails is 10 or zero!
                    You say it is biased             You do not say it is biased
Coin is unbiased    This error is highly unlikely    Correct
                    (Prob = 0.002)
Please provide evidence
Drug approval
                       Approve the drug as safe    Do not approve the drug as safe
Drug is safe
Drug may be harmful
Snake or stick?
Picture of the number of tails when an
unbiased coin is tossed 10 times
• Upper tail: here you reject the hypothesis of an unbiased coin. If the
coin is indeed unbiased these outcomes (7 or more) occur with
probability 0.1719.
• Lower tail: here you also reject the hypothesis of an unbiased coin. If
the coin is indeed unbiased these outcomes (3 or less) occur with
probability 0.1719.
Suppose I say the coin is biased whenever I have some
evidence: the number of tails is 7 or more, or 3 or less!
Picture of an object that
may be a snake
                  You say it is biased    You do not say it is biased
Coin is biased    Correct                 BUT: this error is unlikely:
                                          I often say the coin is biased
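The trade-off between the two kinds of error can be made concrete by fixing one particular biased coin. The sketch below assumes, purely for illustration, a biased coin with pr(tail) = 0.8; that value and the function name `prob_reject` are not from the lecture.

```python
from math import comb

def prob_reject(cutoff, p_tail, n=10):
    """P(tails >= cutoff or tails <= n - cutoff) when P(tail) = p_tail."""
    pmf = [comb(n, k) * p_tail**k * (1 - p_tail)**(n - k) for k in range(n + 1)]
    return sum(pmf[cutoff:]) + sum(pmf[: n - cutoff + 1])

ASSUMED_BIAS = 0.8  # hypothetical P(tail) of a biased coin (an assumption)

for cutoff in (10, 8, 7):
    false_alarm = prob_reject(cutoff, 0.5)               # unbiased coin called biased
    missed_bias = 1 - prob_reject(cutoff, ASSUMED_BIAS)  # biased coin not detected
    print(f"cutoff {cutoff}: P(false alarm) = {false_alarm:.4f}, "
          f"P(missed bias) = {missed_bias:.4f}")
```

Loosening the rule from "10 or zero" to "7 or more, or 3 or less" raises the false-alarm probability and lowers the probability of missing the bias, which is the trade-off the tables describe.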
Important Slide: Logic of Hypothesis testing
• I have some data with a result (e.g., proportion of tails observed = 9
out of 10)
• I reject the null hypothesis if the p-value is low enough.
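For the coin example, this logic can be sketched as an exact two-sided test. The function names `two_sided_p_value` and `reject_null` are illustrative, and "at least as extreme" is taken here to mean at least as far from 5 tails as the observed count.

```python
from math import comb

def two_sided_p_value(tails, tosses):
    """P(count at least as far from tosses/2 as observed | coin is unbiased)."""
    distance = abs(tails - tosses / 2)
    extreme = [k for k in range(tosses + 1) if abs(k - tosses / 2) >= distance]
    return sum(comb(tosses, k) for k in extreme) / 2**tosses

def reject_null(p_value, alpha):
    """Reject the null hypothesis when the p-value is below the significance level."""
    return p_value < alpha

p_value = two_sided_p_value(9, 10)  # Alex's data: 9 tails in 10 tosses
print(f"p-value = {p_value:.4f}")
print("reject at 5%:", reject_null(p_value, 0.05))
print("reject at 1%:", reject_null(p_value, 0.01))
```

With Alex's data the null hypothesis of an unbiased coin is rejected at the 5% level but not at the 1% level, so the conclusion depends on how sure one wants to be.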
The training manager’s problem
Elizabeth Field is the training manager of a light engineering firm
which employs a considerable number of skilled machine
operators. The firm is constantly making efforts to improve the
quality of its product, and so recently Elizabeth has introduced a
new training course for workers who have been on the same
machines for a long time. The first group has now completed the
course and returned to normal work, and Elizabeth would like to
assess the effect, if any, that the retraining has had upon the
standard of its work so that she can decide whether to make such
courses a regular event.
Hypothesis testing: Make hypotheses
• Let p and q be Adam’s defect rate in the past (4%) and now,
respectively. Make a null hypothesis NH and an alternative
hypothesis AH, which cover all possibilities, but are mutually
exclusive.
What is the probability we would observe
a change of this magnitude (or more)?
• Null-hypothesis: the error rate is unchanged,
still = 4%
Picture of a normal distribution centred at 0 (null hypothesis:
new error rate – old error rate = 0), with the observed value -0.51
marked between -2 and 0
Hypothesis testing: The Significance level
Picture of a normal distribution with two tails marked "Reject the
null hypothesis": below -2 and above 2
Hypothesis testing: The Significance level
So: we observed t = -0.51.
Picture of a normal distribution with -2 and 2 marked and the
observed value -0.51 between them
Critical values for two-tailed tests
5% significance level 1% significance level
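The critical values for these two significance levels can be computed from the standard normal distribution. This is a sketch that assumes the test statistic is approximately standard normal; the function name `two_tailed_critical_value` is illustrative.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha):
    """z such that P(|Z| > z) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    z = two_tailed_critical_value(alpha)
    print(f"{alpha:.0%} significance level: reject NH if |t| > {z:.2f}")
```

The value at the 5% level is about 1.96, which the pictures in this lecture round to 2.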
Picture of the number of tails when an
unbiased coin is tossed 10 times
• Upper tail: here you reject the hypothesis of an unbiased coin. If the
coin is indeed unbiased these outcomes (9 or more) occur with
probability 0.0108.
• Lower tail: here you also reject the hypothesis of an unbiased coin. If
the coin is indeed unbiased these outcomes (1 or less) occur with
probability 0.0108.
• Together: it is 0.0108 + 0.0108 = 0.0216.
In practice
• The statistical software will produce a p-value and one
compares the p-value with a chosen significance level
α, which is often 0.01 or 0.05.
• If the one-tailed p-value is smaller than α/2 in a two-tailed
test (equivalently, if the two-tailed p-value is smaller than α),
one "rejects the null hypothesis" and the result is said to
be 'statistically significant'.
• The p-value is the probability of obtaining a test statistic at
least as extreme as the one that was actually observed,
assuming that the null hypothesis is true, that is,
p-value = p (at least as extreme as the data | NH is true)
Hypothesis testing: Calculate test statistic
and p-value
The t-statistic was calculated based on the following formula, where
STEP is the standard error of the proportion under NH:
t = (q-p)/STEP
= (0.035-0.04)/sqrt(0.04*(1-0.04)/400)
= -0.005/0.0098
= -0.51.
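The arithmetic above can be checked with a few lines of Python. This sketch uses the numbers from the slide; the variable names are chosen here for illustration.

```python
from math import sqrt

p = 0.04   # past defect rate (the value under the null hypothesis)
q = 0.035  # defect rate observed now
n = 400    # sample size

step = sqrt(p * (1 - p) / n)  # standard error of the proportion under NH
t = (q - p) / step            # test statistic

print(f"STEP = {step:.4f}, t = {t:.2f}")
```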
P(Z < -0.51) = P(Z > 0.51) = 0.305
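This tail probability and the resulting decision can be reproduced with the standard normal distribution. The sketch assumes a two-tailed test at the 5% significance level, so the one-tailed probability is compared with α/2 = 0.025.

```python
from statistics import NormalDist

t = -0.51
alpha = 0.05

p_lower = NormalDist().cdf(t)       # P(Z < -0.51)
p_upper = 1 - NormalDist().cdf(-t)  # P(Z > 0.51), equal by symmetry
reject = p_lower < alpha / 2        # two-tailed test at the 5% level

print(f"P(Z < {t}) = {p_lower:.3f}, reject NH: {reject}")
```

Since 0.305 is far above 0.025, the null hypothesis of an unchanged error rate is not rejected.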
How does the sample size affect conclusions?
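One way to explore this question is to repeat the proportion test with the same observed rates but larger samples. In the sketch below only n = 400 comes from the lecture; the other sample sizes are hypothetical, and `proportion_test` is an illustrative helper.

```python
from math import sqrt
from statistics import NormalDist

def proportion_test(q, p, n):
    """Return (t, one-tailed p-value) for observed rate q against NH rate p."""
    step = sqrt(p * (1 - p) / n)  # standard error of the proportion under NH
    t = (q - p) / step
    return t, NormalDist().cdf(-abs(t))

# Same rates as in the example (3.5% now versus 4% before), growing samples.
for n in (400, 1600, 6400, 25600):
    t, p_value = proportion_test(0.035, 0.04, n)
    verdict = "reject NH" if p_value < 0.025 else "do not reject NH"
    print(f"n = {n:>5}: t = {t:6.2f}, one-tailed p = {p_value:.4f} -> {verdict}")
```

Because STEP shrinks with the square root of n, the same half-point difference that is insignificant with 400 observations becomes significant once the sample is large enough.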
Hypothesis testing: An alternative
approach
• Construct a 95% confidence interval around the sample proportion
q = 3.5%:
[q – 2*STEP, q + 2*STEP] = [1.54%, 5.46%].
• Compare the 95% confidence interval with the proportion p
= 4%
• We reach the same conclusion: At the 5% significance level,
we do not reject NH since p=4% is included in the 95%
confidence interval.
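The interval on this slide can be reproduced as follows. The sketch assumes, as the slide's numbers imply, that STEP is the same value used in the test statistic (computed from p = 4%) and that the multiplier 2 approximates 1.96.

```python
from math import sqrt

p, q, n = 0.04, 0.035, 400

step = sqrt(p * (1 - p) / n)  # same STEP as in the test statistic
lower, upper = q - 2 * step, q + 2 * step
contains_p = lower <= p <= upper  # True means NH is not rejected at the 5% level

print(f"95% confidence interval: [{lower:.2%}, {upper:.2%}]")
print("p = 4% inside the interval:", contains_p)
```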
Picture of a normal distribution centred on the sample proportion
3.5%, with the 95% confidence interval from 1.54% to 5.46%
Important Slide: Framework for
conducting hypothesis testing: Means
Example Calculations
• A firm adopted a new web design for 81 days and
experienced an average sales increase of 10 (thousand) pounds per
day with a standard deviation of 27.
• Can you reject the null hypothesis that there is no sales
increase, using a significance level of 5%?
Answer
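One possible calculation is sketched below. It assumes the test for means mirrors the proportion test, with the standard error of the mean equal to the standard deviation divided by the square root of the number of days, a two-tailed test, and a normal approximation with critical value 2; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_increase = 10  # average daily sales increase, in thousand pounds
std_dev = 27        # standard deviation of the daily increase
n_days = 81         # number of days observed

standard_error = std_dev / sqrt(n_days)  # 27 / 9 = 3
t = (mean_increase - 0) / standard_error # NH: the mean increase is 0
two_tailed_p = 2 * NormalDist().cdf(-abs(t))
reject = abs(t) > 2                      # 5% level, critical value rounded to 2

print(f"t = {t:.2f}, two-tailed p-value = {two_tailed_p:.4f}, reject NH: {reject}")
```

Under these assumptions t is about 3.33, which lies beyond the critical value, so the null hypothesis of no sales increase would be rejected at the 5% level.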
Next lecture
• About ‘analysis of variance’, which is closely
related to today’s test statistics.