02 Significance Level and Type I and II Errors
Whenever we’re using hypothesis testing, we always run the risk that the
sample we chose isn’t representative of the population. Even if the sample
was random, it might not be representative.
For instance, if we’ve been told that 15% of American females have blue
eyes, and we’ve set up null and alternative hypotheses to test this claim,
then when we take a sample to investigate our null hypothesis, we still run
the risk of committing two types of errors.
Suppose the null hypothesis is actually true, so that 15% of American
females really do have blue eyes, but our sample happens to produce a
sample proportion of 40%.
If, based on the large difference between the sample proportion and the
hypothesized proportion (40% versus 15%), we reject the null hypothesis,
we’ve just made a Type I error. In other words, we make a Type I error
when we mistakenly reject a null hypothesis that’s actually true. The
probability of making a Type I error is alpha, α, also called the level of
significance.
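To see why α is the probability of a Type I error, here is a minimal simulation sketch. It assumes a two-sided one-proportion z-test, a sample size of 400, and α = 0.05; none of these choices come from the text, and the function names are hypothetical. Every sample is drawn from a population where the null hypothesis (p = 15%) really is true, so every rejection is a Type I error.

```python
import math
import random
from statistics import NormalDist


def rejects_null(successes, n, p0, alpha):
    """Two-sided one-proportion z-test: True if H0: p = p0 is rejected."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


def type_i_error_rate(p0=0.15, n=400, alpha=0.05, trials=5_000, seed=0):
    """Fraction of samples, drawn with H0 TRUE, in which the test rejects H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw n people from a population where the true proportion is p0.
        successes = sum(rng.random() < p0 for _ in range(n))
        rejections += rejects_null(successes, n, p0, alpha)
    return rejections / trials


rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land near the chosen α, though not exactly on it, because the count of blue-eyed people is discrete and the z-test is a normal approximation.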
Now let’s consider the opposite situation and assume that the null
hypothesis is false, such that the percentage of American females with
blue eyes is not 15%. But imagine that we take a sample of 100 women and
find a sample proportion of p̂ = 15%.
If, based on the equality of the sample proportion and the hypothesized
proportion, we accept the null hypothesis, we’ve just made a Type II error. In
other words, we make a Type II error when we mistakenly accept the null
hypothesis when it’s actually false. The probability of making a Type II
error is beta, β.
            H0 is true                          H0 is false
Reject H0   Type I error, P(Type I error) = α   CORRECT
Accept H0   CORRECT                             Type II error, P(Type II error) = β
There are lots of other ways to describe Type I and Type II errors,
including
Thinking about Type I and Type II errors can get people a little twisted
around sometimes, so if we find that there’s one description of them that
makes more sense to us than the others, we can stick with that one.
Because α is literally the probability of making a Type I error, and β is
literally the probability of making a Type II error, we can say that the alpha
level is
Example
Lynnie is testing the hypothesis that people in her town spend more
money on coffee on Monday than they do on Tuesday. She doesn’t know
it, but her hypothesis is false: people don’t spend more money on coffee
on Monday. She picks a random sample of people in her town and asks
them how much money they spent on coffee each day. Say whether
Lynnie will make a Type I or Type II error.
Let C_M and C_T be the amounts spent on coffee on Monday and Tuesday.
H0: C_M ≤ C_T
Ha: C_M > C_T
            H0 is true                          H0 is false
Reject H0   Type I error, P(Type I error) = α   CORRECT
Accept H0   CORRECT                             Type II error, P(Type II error) = β
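Because Lynnie’s hypothesis (the alternative) is false, the null hypothesis
is true, so the only mistake she can make is to reject it.
From the table we looked at earlier, the intersection of “reject the null”
and “the null is true” is a Type I error. Lynnie is in danger of committing a
Type I error.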
Power
Sometimes we say that the power of a hypothesis test is the probability
that we’ll reject the null hypothesis when it’s false, which is a correct
decision. Rejecting the null hypothesis when it’s false is exactly what we
want to do.
So, the higher the power of our test, the better off we are. Power is also
equal to 1 − β.
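The relationship between β and power can be sketched numerically. The example below assumes a two-sided one-proportion z-test with the normal approximation, a sample of 100, and a hypothetical true proportion of 25%; the text does not specify any of these, and `beta_and_power` is an illustrative name.

```python
import math
from statistics import NormalDist


def beta_and_power(p0, p_true, n, alpha=0.05):
    """Approximate beta and power when H0: p = p0 is false and p = p_true."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    # Non-rejection region for the sample proportion, built under H0.
    se_null = math.sqrt(p0 * (1 - p0) / n)
    lower = p0 - z_crit * se_null
    upper = p0 + z_crit * se_null

    # Sampling distribution of the sample proportion under the TRUE proportion.
    se_true = math.sqrt(p_true * (1 - p_true) / n)
    sampling = NormalDist(mu=p_true, sigma=se_true)

    # beta: chance the sample proportion falls where we fail to reject H0.
    beta = sampling.cdf(upper) - sampling.cdf(lower)
    return beta, 1 - beta


beta, power = beta_and_power(p0=0.15, p_true=0.25, n=100)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Moving `p_true` farther from `p0`, or raising `n`, shrinks β and raises the power in this sketch.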
Confidence levels and the α value
This α value, or level of significance, is the same α value we talked about
when we looked at confidence levels and confidence intervals.
So, in the same way that we said we normally pick a confidence level of
90%, 95%, or 99%, we could equivalently say that we normally pick an α
value of 10%, 5%, or 1%.
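The correspondence between confidence level and α can be written out directly. This small sketch also prints the matching two-sided critical z-value for each pair; the two-sided choice and the helper names are assumptions made for illustration.

```python
from statistics import NormalDist


def alpha_from_confidence(confidence_level):
    """alpha is whatever the confidence level leaves over: alpha = 1 - CL."""
    return 1 - confidence_level


def two_sided_critical_z(alpha):
    """z-value that leaves alpha/2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for confidence in (0.90, 0.95, 0.99):
    alpha = alpha_from_confidence(confidence)
    z = two_sided_critical_z(alpha)
    print(f"confidence {confidence:.0%} -> alpha {alpha:.0%}, z* = {z:.3f}")
```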
Similarly, since α is the probability of making a Type I error, and β is the
probability of making a Type II error, we’d obviously like to minimize α and
β as much as possible, because of course we always want to minimize the
possibility that we’ll make an error.
Unfortunately, for a fixed sample size, α and β work against each other:
reducing the α value increases the β value, and vice versa.
The only way to reduce them both simultaneously is to increase the
sample size. If we could increase the sample size until it’s as big as the
population, the values of α and β would be 0.
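The trade-off can be tabulated with a short sketch. It assumes a two-sided one-proportion z-test under the normal approximation, with a hypothesized proportion of 15% and a hypothetical true proportion of 25%; the true proportion and the function name are illustrative, not taken from the text.

```python
import math
from statistics import NormalDist


def type_ii_probability(alpha, n, p0=0.15, p_true=0.25):
    """Approximate beta for a two-sided one-proportion z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * math.sqrt(p0 * (1 - p0) / n)  # half-width of the non-rejection region
    sampling = NormalDist(p_true, math.sqrt(p_true * (1 - p_true) / n))
    return sampling.cdf(p0 + margin) - sampling.cdf(p0 - margin)


for n in (50, 100, 400):
    row = ", ".join(
        f"alpha={alpha:.2f}: beta={type_ii_probability(alpha, n):.3f}"
        for alpha in (0.10, 0.05, 0.01)
    )
    print(f"n={n:>3}  {row}")
```

Reading across a row, a smaller α comes with a larger β. Reading down a column, a larger sample lowers β while α stays where we set it.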
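For instance, the owners of a factory that makes car parts might prefer a
lower α value, because then
they have to reject fewer parts as defective, which saves them money. But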
this lower α value might mean that more defective parts make it into cars,
which could lead to cars that are less safe for consumers.
On the other hand, if we’re a consumer who purchases a car made with
these parts, we might prefer that the factory uses a higher α and rejects more
defective car parts, thereby making sure our car is as safe as possible.
However, if the factory uses a higher α value to keep the car safer, we may
have to pay more for the car to account for the increased number of
wasted defective parts.
So increasing the α level will decrease the Type II error risk for the
consumer, but increase the Type I error risk for the producer. In other
words, there are competing interests that are affected by changing the α
value, and we have to decide exactly what α value gives us the balance we
want.