5 Hypothesis Tests
For any hypothesis which claims that a population parameter has changed, there is an opposing
hypothesis which claims that no change has occurred. These are called the alternative
hypothesis and null hypothesis respectively.
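• Null hypothesis: (denoted as H0 ) Hypothesis which states that a population parameter
has not changed, i.e. is equal to a value which would be expected.
• Alternative hypothesis: (denoted as H1 ) Hypothesis which states that a population
parameter has changed, i.e. is less than, greater than, or different from that value.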
To test the null hypothesis against the alternative hypothesis, a hypothesis test is used. It is
a procedure which uses sample data as evidence to decide whether to reject or not reject the
null hypothesis.
A hypothesis test can be classified in two ways based on the statement of the alternative
hypothesis.
• One-tail test: Alternative hypothesis states that a population parameter is less than
(lower-tail) or greater than (upper-tail) the value in the null hypothesis.
• Two-tail test: Alternative hypothesis states that a population parameter is different from
the value in the null hypothesis.
Like confidence intervals, a hypothesis test has a significance level associated with it,
expressed as a percentage. It is usually a small value such as 5% or 1%.
To perform a hypothesis test, first state the null and alternative hypotheses. A random sample
of the population is then taken. A value calculated from the sample data (corresponding to
the parameter) is called the test statistic.
Assuming the null hypothesis, the probability (for samples of the same size) of obtaining a
value which is equal to or more extreme than (relative to the alternative hypothesis) the test
statistic is calculated.
• If the probability is less than the significance level (for one-tail) or half the significance
level (for two-tail), the null hypothesis is rejected.
• Otherwise, the null hypothesis is not rejected.
Remark: The null hypothesis is rejected because the sample gave data which supports the
alternative hypothesis even though, assuming the null hypothesis, such data was very unlikely
to occur, i.e. the probability of a result at least as extreme is less than the significance
level, which is the threshold for rejection.
After rejecting or not rejecting the null hypothesis, a conclusion in the context of the situation
should be made.
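The decision rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name reject_null and the example tail probability 0.03 are hypothetical, and the significance level is passed as a fraction (0.05 for 5%).

```python
def reject_null(tail_probability: float, alpha: float, two_tail: bool = False) -> bool:
    """Return True if the null hypothesis is rejected.

    tail_probability: probability, assuming H0, of a value at least as
        extreme as the test statistic (one tail only).
    alpha: significance level as a fraction, e.g. 0.05 for 5%.
    two_tail: if True, compare against half the significance level.
    """
    threshold = alpha / 2 if two_tail else alpha
    return tail_probability < threshold


# Hypothetical tail probability of 0.03, tested at the 5% significance level.
print(reject_null(0.03, 0.05))                 # one-tail: 0.03 is compared with 0.05
print(reject_null(0.03, 0.05, two_tail=True))  # two-tail: 0.03 is compared with 0.025
```

The same tail probability can therefore lead to rejection in a one-tail test but not in a two-tail test at the same significance level.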
Instead of calculating probabilities, another way of deciding whether to reject or not reject the
null hypothesis is by finding the critical region.
The critical region (or rejection region) of a hypothesis test is the set of values such that the
null hypothesis is rejected if the test statistic lies in it.
Rejecting or not rejecting the null hypothesis does not mean that the statement is actually
false or true. It is only a decision made from data of one random sample. A different random
sample may lead to a different conclusion.
When one mistakenly rejects or does not reject the null hypothesis, it is called an error. There are
two types of errors which can be made.
• Type I error: When the null hypothesis is true but it is rejected.
• Type II error: When the null hypothesis is false but it is not rejected.
A claim is made that the value of the population mean has changed. This forms the two
hypotheses:
$H_0 : \mu = \mu_1$
$H_1 : \mu < \mu_1$ (lower-tail) or $H_1 : \mu > \mu_1$ (upper-tail) or $H_1 : \mu \neq \mu_1$ (two-tail)
If X is normally distributed, or if the sample size n is large (by the Central Limit theorem), the
sample mean $\bar{X}$ is also (at least approximately) normally distributed,
$\bar{X} \sim N(\mu, \frac{\sigma^2}{n})$.
Remark: It is also assumed that the population standard deviation is the same when the test
is carried out.
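Assuming the null hypothesis, i.e. $\bar{X} \sim N(\mu_1, \frac{\sigma^2}{n})$, the probability of
getting a mean equal to or more extreme than $\bar{x}$ in random samples of size n is calculated, i.e.
• $P(\bar{X} \le \bar{x})$, for lower-tail
• $P(\bar{X} \ge \bar{x})$, for upper-tail
• $P(\bar{X} \le \bar{x})$ or $P(\bar{X} \ge \bar{x})$, for two-tail
If the probability is less than α% for a one-tail test, or less than $\frac{1}{2}\alpha\%$ for a
two-tail test, the null hypothesis is rejected. Otherwise it is not rejected.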
To calculate the probability, the mean of the sample $\bar{x}$ must be standardized before using the
normal distribution table.
$$z = \frac{\bar{x} - \mu_1}{\sqrt{\frac{\sigma^2}{n}}}$$
The test statistic usually refers to z, the standardized mean of the sample.
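As a rough illustration of the standardization step, the sketch below computes z and the corresponding tail probability. It assumes Python's statistics.NormalDist in place of the normal distribution table; the function names and the numbers (µ1 = 50, σ = 10, n = 25, sample mean 54) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean: float, mu1: float, sigma: float, n: int) -> float:
    """Standardize the sample mean, assuming the null hypothesis mean mu1."""
    return (sample_mean - mu1) / sqrt(sigma**2 / n)


def tail_probability(z: float, tail: str) -> float:
    """Probability, under H0, of a value at least as extreme as z.

    tail is 'lower' or 'upper'. For a two-tail test, use whichever tail
    gives the smaller probability and compare it with half the
    significance level.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "lower":
        return standard_normal.cdf(z)
    if tail == "upper":
        return 1 - standard_normal.cdf(z)
    raise ValueError("tail must be 'lower' or 'upper'")


# Hypothetical upper-tail test: H0: mu = 50 against H1: mu > 50.
z = z_statistic(sample_mean=54, mu1=50, sigma=10, n=25)
p = tail_probability(z, "upper")
print(z, p)  # p is then compared with the significance level
```

With these hypothetical values the probability is about 2.3%, so the null hypothesis would be rejected at the 5% level but not at the 1% level.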
The critical region usually refers to the set of values for z which leads to rejection of the null
hypothesis.
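Let c be such that P(Z < c) = α% for lower-tail tests, P(Z > c) = α% for upper-tail tests,
and P(Z > c) (or P(Z < −c)) = $\frac{1}{2}\alpha\%$ for two-tail tests. Then the critical region is
• z < c, for lower-tail (area α% in the lower tail)
• z > c, for upper-tail (area α% in the upper tail)
• z < −c or z > c, for two-tail (area $\frac{1}{2}\alpha\%$ in each tail)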
The critical region can also refer to the set of values of $\bar{x}$ which leads to rejection of the
null hypothesis. In this case, it can be found by reversing the standardization of the critical
region above, i.e. $\bar{x} = \mu_1 + z\sqrt{\frac{\sigma^2}{n}}$.
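A minimal sketch of finding the critical value c, and the corresponding boundary for the sample mean, is given below. It assumes statistics.NormalDist.inv_cdf as a substitute for reading the normal table in reverse; the function names and the numbers (µ1 = 50, σ = 10, n = 25, 5% level) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_z(alpha: float, tail: str) -> float:
    """Critical value c for the standardized test statistic z.

    lower-tail: reject H0 if z < c, where P(Z < c) = alpha
    upper-tail: reject H0 if z > c, where P(Z > c) = alpha
    two-tail:   reject H0 if z < -c or z > c, where P(Z > c) = alpha / 2
    """
    standard_normal = NormalDist()
    if tail == "lower":
        return standard_normal.inv_cdf(alpha)
    if tail == "upper":
        return standard_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return standard_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'lower', 'upper' or 'two'")


def unstandardize(z: float, mu1: float, sigma: float, n: int) -> float:
    """Convert a value of z back into a value of the sample mean."""
    return mu1 + z * sqrt(sigma**2 / n)


# Hypothetical upper-tail test at the 5% level with mu1 = 50, sigma = 10, n = 25.
c = critical_z(0.05, "upper")
boundary = unstandardize(c, mu1=50, sigma=10, n=25)
print(c, boundary)  # reject H0 if z > c, or equivalently if the sample mean > boundary
```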
A Type I error occurs if the test statistic lies in the critical region when the null hypothesis is
true. Since the calculation of the critical region assumed the null hypothesis,
• The probability of a Type I error is just equal to the significance level of the test.
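A Type II error occurs if the test statistic does not lie in the critical region when the null
hypothesis is false. Let µ2 be the true value of the population mean.
• The probability of a Type II error is P($\bar{X}$ not in critical region) where
$\bar{X} \sim N(\mu_2, \frac{\sigma^2}{n})$.
Remark: To find the probability of a Type II error, the critical region in terms of $\bar{X}$ must
first be found. The value of µ2 will be stated in the question.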
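The two steps in the remark (find the critical region for the sample mean under the null hypothesis, then evaluate the probability of missing it under the true mean) might be sketched as follows for an upper-tail test. The function name and all numbers (µ1 = 50, µ2 = 55, σ = 10, n = 25, 5% level) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_2_probability_upper_tail(
    mu1: float, mu2: float, sigma: float, n: int, alpha: float
) -> float:
    """P(Type II error) for an upper-tail test of H0: mu = mu1.

    mu2 is the true population mean; alpha is the significance level
    as a fraction.
    """
    standard_error = sqrt(sigma**2 / n)
    # Step 1: boundary of the critical region for the sample mean, found under H0.
    boundary = NormalDist(mu1, standard_error).inv_cdf(1 - alpha)
    # Step 2: H0 is not rejected when the sample mean is at or below the boundary.
    # This probability is evaluated using the true mean mu2.
    return NormalDist(mu2, standard_error).cdf(boundary)


beta = type_2_probability_upper_tail(mu1=50, mu2=55, sigma=10, n=25, alpha=0.05)
print(beta)
```

For a lower-tail test the same idea applies with the inequalities reversed.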
Exercise 5.1: The heights of a certain type of plant have a normal distribution. When the
plants are grown without fertilizer, the population mean and standard deviation are 24.0 cm
and 4.8 cm respectively. A gardener wishes to test, at the 2% significance level, whether Hiergro
fertilizer will increase the mean height. He treats 150 randomly chosen plants with Hiergro
and finds that their mean height is 25.0 cm. Assuming that the standard deviation of the
heights of plants treated with Hiergro is still 4.8 cm, carry out the test.
Exercise 5.2: Past experience has shown that the heights of a certain variety of plant have
mean 64.0 cm and standard deviation 3.8 cm. During a particularly hot summer, it was expected
that the heights of plants of this variety would be less than usual. In order to test
whether this was the case, a botanist recorded the heights of a random sample of 100 plants
and found that the value of the sample mean was 63.3 cm. Stating a necessary assumption,
carry out the test at the 2.5% significance level.
Exercise 5.3: The mean mass of packets of sugar is supposed to be 505 g. The mean mass
of packets produced by a certain machine was found to be less than 505 g, so the machine was
adjusted. Following the adjustment, the masses of a random sample of 150 packets from the
machine were measured and the total mass was found to be 75660 g.
i) Given that the population standard deviation is 3.6 g, test at the 2% significance level
whether the machine is still producing packets with mean mass less than 505 g.
ii) Explain why the use of the normal distribution is justified in carrying out the test in part
i).
Exercise 5.4: The times taken by students to complete a task are normally distributed with
standard deviation 2.4 minutes. A lecturer claims that the mean time is 17.0 minutes. The
times taken by a random sample of 5 students were 17.8, 22.4, 16.3, 23.1 and 11.4 minutes.
Carry out a hypothesis test at the 5% significance level to determine whether the lecturer’s
claim should be accepted.
Exercise 5.5: A researcher wishes to investigate whether the mean height of a certain type
of plant in one region is different from the mean height of this type of plant everywhere else.
He takes a large random sample of plants from the region and finds the sample mean. He
calculates the value of the test statistic, z, and finds that z = 1.91.
i) Explain briefly why the researcher should use a two-tail test.
ii) Carry out the test at the 4% significance level.
Exercise 5.6: The heights of a certain variety of plant have been found to be normally
distributed with mean 75.2 cm and standard deviation 5.7 cm. A biologist suspects that
pollution in a certain region is causing the plants to be shorter than usual. He takes a random
sample of n plants of this variety from this region and finds that their mean height is 73.1 cm.
He then carries out an appropriate hypothesis test.
i) He finds that the value of the test statistic z is −1.563, correct to 3 decimal places.
Calculate the value of n. State an assumption necessary for your calculation.
ii) Use this value of the test statistic to carry out the hypothesis test at the 6% significance
level.
Exercise 5.7: The amount of money, in dollars, spent by a customer on one visit to a certain
shop is modelled by the distribution N(µ, 1.94). In the past, the value of µ has been found to
be 20.00, but following a rearrangement in the shop, the manager suspects that the value of µ
has changed. He takes a random sample of 6 customers and notes how much they each spend,
in dollars. The results are as follows.
The manager carries out a hypothesis test using a significance level of α%. The test does not
support his suspicion. Find the largest possible value of α.
Exercise 5.8: Last year Samir found that the time for his journey to work had mean 45.7
minutes and standard deviation 3.2 minutes. Samir wishes to test whether his journey times
have increased this year. He notes the times, in minutes, for a random sample of 8 journeys
this year with the following results.
It may be assumed that the population of this year’s journey times is normally distributed
with standard deviation 3.2 minutes.
i) State, with a reason, whether Samir should use a one-tail or a two-tail test.
ii) Show that there is no evidence at the 5% significance level that Samir’s mean journey time
has increased.
iii) State, with a reason, which one of the errors, Type I or Type II, might have been made
in carrying out the test in part ii).
Exercise 5.9: Records show that the distance driven by a bus driver in a week is normally
distributed with mean 1150 km and standard deviation 105 km. New driving regulations are
introduced and in the next 20 weeks he drives a total of 21800 km.
i) Stating any assumption(s), test, at the 1% significance level, whether his mean weekly
driving distance has decreased.
ii) A similar test at the 1% significance level was carried out using the data from another 20
weeks. State the probability of a Type I error and describe what is meant by a Type I
error in this context.
Exercise 5.10: The time taken for a particular train journey is normally distributed. In
the past, the time had mean 2.4 hours and standard deviation 0.3 hours. A new timetable is
introduced and on 30 randomly chosen occasions the time for this journey is measured. The
mean time for these 30 occasions is found to be 2.3 hours.
i) Stating any assumption(s), test, at the 5% significance level, whether the mean time for
this journey has changed.
ii) A similar test at the 5% significance level was carried out using the times from another
randomly chosen 30 occasions.
Exercise 5.11: A fair six-sided die has faces numbered 1, 2, 3, 4, 5, 6. The score on one throw
is denoted by X.
i) Write down the value of E(X) and show that Var(X) = $\frac{35}{12}$.
Fayez has a six-sided die with faces numbered 1, 2, 3, 4, 5, 6. He suspects that it is biased so
that when it is thrown it is more likely to show a low number than a high number. In order to
test his suspicion, he plans to throw the die 50 times. If the mean score is less than 3 he will
conclude that the die is biased.
ii) Find the probability of a Type I error.
iii) With reference to this context, describe circumstances in which Fayez would make a Type
II error.
Exercise 5.12: The management of a factory thinks that the mean time required to complete
a particular task is 22 minutes. The times, in minutes, taken by employees to complete this
task have a normal distribution with mean µ and standard deviation 3.5. An employee claims
that 22 minutes is not long enough for the task. In order to investigate this claim, the times
for a random sample of 12 employees are used to test the null hypothesis µ = 22 against the
alternative hypothesis µ > 22 at the 5% significance level.
i) Show that the null hypothesis is rejected in favour of the alternative hypothesis if x > 23.7
(correct to 3 significant figures), where x is the sample mean.
ii) Find the probability of a Type II error given that the actual mean time is 25.8 minutes.
Exercise 5.13: In the past the time, in minutes, taken for a particular rail journey has been
found to have mean 20.5 and standard deviation 1.2. Some new railway signals are installed.
In order to test whether the mean time has decreased, a random sample of 100 times for this
journey are noted. The sample mean is found to be 20.3 minutes. You should assume that the
standard deviation is unchanged.
i) Carry out a significance test, at the 4% level, of whether the population mean time has
decreased.
Later another significance test of the same hypotheses, using another random sample of size
100, is carried out at the 4% level.
ii) Given that the population mean is now 20.1, find the probability of a Type II error.
iii) State what is meant by a Type II error in this context.
Remark: If the variance of the population is not known, it can be estimated using an unbiased
estimate from a sample.
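When the sample is summarised by n, Σx and Σx², the unbiased estimates can be computed from the standard formulas $\bar{x} = \frac{\sum x}{n}$ and $s^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)$. The sketch below is illustrative only; the function name and the summary values (n = 50, Σx = 200, Σx² = 898) are hypothetical.

```python
def unbiased_estimates(n: int, sum_x: float, sum_x2: float) -> tuple[float, float]:
    """Return unbiased estimates (mean, variance) from summary statistics.

    sum_x is the sum of the observations, sum_x2 the sum of their squares.
    """
    mean = sum_x / n
    variance = (sum_x2 - sum_x**2 / n) / (n - 1)
    return mean, variance


# Hypothetical summary data: n = 50, sum of x = 200, sum of x squared = 898.
mean, variance = unbiased_estimates(50, 200, 898)
print(mean, variance)
```

The estimated variance then takes the place of σ² when the sample mean is standardized.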
Exercise 5.14: The masses, m kg, of packets of flour are normally distributed. The mean
mass is supposed to be 1.01 kg. A quality control officer measures the masses of a random
sample of 100 packets. The results are summarised below.
n = 100    $\sum m = 98.2$    $\sum m^2 = 104.52$
i) Test at the 5% significance level whether the population mean mass is less than 1.01 kg.
ii) Explain whether it was necessary to use the Central Limit theorem in part i).
Exercise 5.15: Household incomes, in thousands of dollars, in a certain country are represented
by the random variable X with mean µ and standard deviation σ. The incomes of a
random sample of 400 households are found and the results are summarised below.
n = 400    $\sum x = 923$    $\sum x^2 = 3170$
Exercise 5.16: Following a change in flight schedules, an airline pilot wished to test whether
the mean distance that he flies in a week has changed. He noted the distances, x km, that he
flew in 50 randomly chosen weeks and summarised the results as follows.
n = 50    $\sum x = 143300$    $\sum x^2 = 410900000$
Exercise 5.17: The numbers of basketball courts in a random sample of 70 schools in South
Mowland are summarised in the table.
i) Calculate unbiased estimates for the population mean and variance of the number of
basketball courts per school in South Mowland.
The mean number of basketball courts per school in North Mowland is 1.9.
ii) Test at the 5% significance level whether the mean number of basketball courts per school
in South Mowland is less than the mean for North Mowland.
iii) State, with a reason, which of the errors, Type I or Type II, might have been made in
the test in part ii).
Exercise 5.18: In order to test the effect of a drug, a researcher monitors the concentration,
X, of a certain protein in the blood stream of patients. For patients who are not taking the
drug the mean value of X is 0.185. A random sample of 150 patients taking the drug was
selected and the values of X were found. The results are summarised below.
n = 150    $\sum x = 27.0$    $\sum x^2 = 5.01$
The researcher wishes to test at the 1% significance level whether the mean concentration of
the protein in the blood stream of patients taking the drug is less than 0.185.
i) Carry out the test.
ii) Given that, in fact, the mean concentration for patients taking the drug is 0.175, find the
probability of a Type II error occurring in the test.
5.2 Testing the mean of a Poisson distribution
Suppose the number of occurrences of an event in an interval follows a Poisson distribution.
A claim is made that the value of the mean number of occurrences has changed. To test the
claim at an α% significance level, the number of occurrences in some interval is observed.
Let X be the random variable for the number of occurrences in the interval being observed,
and let λ denote the mean of X. Then X ∼ Po(λ).
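$H_0 : \lambda = \lambda_1$
$H_1 : \lambda < \lambda_1$ (lower-tail) or $H_1 : \lambda > \lambda_1$ (upper-tail) or $H_1 : \lambda \neq \lambda_1$ (two-tail)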
Let x be the number of occurrences which appears in the observation. Then x is the test
statistic.
Assuming the null hypothesis, i.e. X ∼ Po(λ1 ), the probability that the number of occurrences
is equal to or more extreme than x is calculated, i.e.
• P(X ≤ x), for lower-tail
• P(X ≥ x), for upper-tail
• P(X ≤ x) or P(X ≥ x), for two-tail
Remark: For a two-tail test, it is sufficient to calculate the smaller probability, usually
P(X ≤ x) if x < λ1 and P(X ≥ x) if x > λ1 .
If the probability is less than α% for a one-tail test, or less than $\frac{1}{2}\alpha\%$ for a
two-tail test, the null hypothesis is rejected. Otherwise it is not rejected.
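The tail probabilities for a Poisson test can be computed directly from the probability formula $P(X = k) = e^{-\lambda}\frac{\lambda^k}{k!}$. The sketch below is illustrative; the function names and the numbers (λ1 = 4, observed x = 1) are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); returns 0 when k is negative."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def poisson_tail(x: int, lam1: float, tail: str) -> float:
    """Probability, under H0: lambda = lam1, of a count at least as extreme as x."""
    if tail == "lower":
        return poisson_cdf(x, lam1)          # P(X <= x)
    if tail == "upper":
        return 1 - poisson_cdf(x - 1, lam1)  # P(X >= x) = 1 - P(X <= x - 1)
    raise ValueError("tail must be 'lower' or 'upper'")


# Hypothetical lower-tail test: H0: lambda = 4, observed count x = 1.
p = poisson_tail(1, 4, "lower")
print(p)  # compared with the significance level
```

With these hypothetical values the probability is about 9.2%, so the null hypothesis would be rejected at the 10% level but not at the 5% level.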
Remark: Unlike in the previous section, the test statistic takes discrete values, i.e. non-negative
integers. Therefore, the critical region is also in terms of integers.
The critical region is the set of integers for x which leads to the rejection of the null hypothesis.
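• Lower-tail: X ≤ c, where c is the largest integer such that P(X ≤ c) < α%
• Upper-tail: X ≥ c, where c is the smallest integer such that P(X ≥ c) < α%
• Two-tail: X ≤ c1 , X ≥ c2 , where c1 is the largest integer such that
P(X ≤ c1 ) < $\frac{1}{2}\alpha\%$ and c2 is the smallest integer such that P(X ≥ c2 ) < $\frac{1}{2}\alpha\%$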
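Finding these integers amounts to a simple search over the cumulative probabilities. The sketch below is one possible way to do it; the function names and the values (λ1 = 4, 5% level) are hypothetical, and None is returned when no lower-tail critical region exists.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); returns 0 when k is negative."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def lower_critical_value(lam1: float, alpha: float):
    """Largest integer c with P(X <= c) < alpha, or None if there is none."""
    c = None
    k = 0
    while poisson_cdf(k, lam1) < alpha:
        c = k
        k += 1
    return c


def upper_critical_value(lam1: float, alpha: float) -> int:
    """Smallest integer c with P(X >= c) < alpha."""
    c = 0
    while 1 - poisson_cdf(c - 1, lam1) >= alpha:
        c += 1
    return c


# Hypothetical test of H0: lambda = 4 at the 5% significance level.
print(lower_critical_value(4, 0.05))  # critical region X <= c
print(upper_critical_value(4, 0.05))  # critical region X >= c
```

For a two-tail test the same functions can be called with half the significance level to obtain c1 and c2.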
A Type I error occurs when the null hypothesis is true but is rejected.
For example, for a lower-tail test the probability of a Type I error is P(X ≤ c) where
X ∼ Po(λ1 ) and X ≤ c is the critical region.
A Type II error occurs when the null hypothesis is false but is not rejected. Let λ2 be the true
value of the mean.
• The probability of a Type II error is P(X not in critical region) where X ∼ Po(λ2 ).
For example, for an upper-tail test the probability of a Type II error is P(X < c) where X ≥ c
is the critical region.
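Both error probabilities use the same critical region but different values of the mean. The sketch below illustrates this for an upper-tail test; the function names and the numbers (critical region X ≥ 9, λ1 = 4, true mean λ2 = 8) are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); returns 0 when k is negative."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def type_1_upper_tail(c: int, lam1: float) -> float:
    """P(X >= c) when H0 is true, i.e. X ~ Po(lam1)."""
    return 1 - poisson_cdf(c - 1, lam1)


def type_2_upper_tail(c: int, lam2: float) -> float:
    """P(X < c) when the true mean is lam2, i.e. X ~ Po(lam2)."""
    return poisson_cdf(c - 1, lam2)


# Hypothetical upper-tail test with critical region X >= 9.
print(type_1_upper_tail(9, lam1=4))  # H0 true (lambda = 4) but rejected
print(type_2_upper_tail(9, lam2=8))  # H0 false (lambda = 8) but not rejected
```

Note that, because X is discrete, the probability of a Type I error computed this way is generally smaller than the stated significance level.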
Exercise 5.19: A store sells two types of computer, laptops and tablets. The number of
laptops sold per hour is modelled by a random variable with distribution Po(0.9). The number
of tablets sold per hour is modelled by an independent random variable with distribution
Po(1.5).
i) Find the probability that, during a randomly chosen hour, the total number of laptops
and tablets sold in the store is less than 4.
ii) The manager claims that on sunny Saturdays fewer laptops than usual are sold. In order
to test this claim, an employee notes the number of laptops sold during a 4-hour period
on a randomly chosen sunny Saturday. In fact only 1 laptop is sold during this period.
Test the manager’s claim at the 10% significance level.
Exercise 5.20: Cloth made at a certain factory has been found to have an average of 0.1 faults
per square metre. Suki claims that the cloth made by her machine contains, on average, more
than 0.1 faults per square metre. In a random sample of 5 m2 of cloth from Suki’s machine, it
was found that there were 2 faults. Assuming that the number of faults per square metre has
a Poisson distribution,
ii) test at the 10% significance level whether Suki’s claim is justified.
Exercise 5.21: The number of sports injuries per month at a certain college has a Poisson
distribution. In the past the mean has been 1.1 injuries per month. The principal recently
introduced new safety guidelines and she decides to test, at the 2% significance level, whether
the mean number of sports injuries has been reduced. She notes the number of sports injuries
during a 6-month period.
i) Find the critical region for the test and state the probability of a Type I error.
iii) During the 6-month period there are a total of 2 sports injuries. Carry out the test.
iv) Assuming that the mean remains 1.1, calculate the probability that there will be fewer
than 30 sports injuries during a 36-month period.
Exercise 5.22: The number of absences by boys from the same class on any day is modelled
by an independent random variable with distribution Po(0.3). The teacher claims that, during
the football season, there are more absences by boys than usual. In order to test this claim at
the 5% significance level, he notes the number of absences by boys during a randomly chosen
5-day period during the football season.
iii) In fact there were 4 absences by boys during this period. Test the teacher’s claim at the
5% significance level.
Exercise 5.23: At a doctors’ surgery, the number of missed appointments per day has a
Poisson distribution. In the past the mean number of missed appointments per day has been
0.9. Following some publicity, the manager carries out a hypothesis test to determine whether
this mean has decreased. If there are fewer than 3 missed appointments in a randomly chosen
5-day period, she will conclude that the mean has decreased.
iii) Find the probability of a Type II error if the mean number of missed appointments per
day is 0.2.
Exercise 5.24: The number of accidents per month, X, at a factory has a Poisson distribution.
In the past the mean has been 1.1 accidents per month. Some new machinery is introduced
and the management wish to test whether the mean has increased. They note the number of
accidents in a randomly chosen month and carry out a hypothesis test at the 1% significance level.
i) Show that the critical region for the test is X ≥ 5. Given that the number of accidents is
6, carry out the test.
Later they carry out a similar test, also at the 1% significance level.
ii) Explain the meaning of a Type I error in this context and state its probability.
iii) Given that the mean is now 7.0, find the probability of a Type II error.
Exercise 5.25: In the past the number of accidents per month on a certain road was modelled
by a random variable with distribution Po(0.47). After the introduction of speed restrictions,
the government wished to test, at the 5% significance level, whether the mean number of
accidents had decreased. They noted the number of accidents during the next 12 months. It
is assumed that accidents occur randomly and that a Poisson model is still appropriate.
i) Given that the total number of accidents during the 12 months was 2, carry out the test.
It is given that the mean number of accidents per month is now in fact 0.05.
iii) Using another random sample of 12 months the same test is carried out again, with the
same significance level. Find the probability of a Type II error.
Recall that if X ∼ Po(λ) and λ is large, then the distribution of X is approximately N(λ, λ).
Therefore if the probability required for the test is too tedious to calculate using the Poisson
distribution, it can be approximated using the normal distribution (and continuity correction).
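A minimal sketch of the normal approximation with continuity correction is given below, assuming statistics.NormalDist in place of the normal table. The function name and the numbers (λ1 = 36, observed x = 26) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def poisson_tail_normal_approx(x: int, lam1: float, tail: str) -> float:
    """Approximate Poisson tail probability under H0: lambda = lam1.

    X ~ Po(lam1) is approximated by Y ~ N(lam1, lam1).
    """
    approx = NormalDist(lam1, sqrt(lam1))  # second argument is the standard deviation
    if tail == "lower":
        return approx.cdf(x + 0.5)      # P(X <= x) becomes P(Y < x + 0.5)
    if tail == "upper":
        return 1 - approx.cdf(x - 0.5)  # P(X >= x) becomes P(Y > x - 0.5)
    raise ValueError("tail must be 'lower' or 'upper'")


# Hypothetical lower-tail test: H0: lambda = 36, observed count x = 26.
print(poisson_tail_normal_approx(26, 36, "lower"))
```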
Exercise 5.26: At factory A the mean number of accidents per year is 32. At factory B the
records of numbers of accidents before 2018 have been lost, but the number of accidents during
2018 was 21. It is known that the number of accidents per year can be well modelled by a
Poisson distribution. Use an approximating distribution to test at the 2% significance level
whether the mean number of accidents at factory B is less than at factory A.
Exercise 5.27: The number of calls per day to an enquiry desk has a Poisson distribution.
In the past the mean has been 5. In order to test whether the mean has changed, the number
of calls on a random sample of 10 days was recorded. The total number of calls was found to
be 61. Use an approximate distribution to test at the 10% significance level whether the mean
has changed.
A claim is made that the value of p, the proportion of the population having a certain attribute,
has changed. To test the claim at an α% significance level, a random sample of size n of the
population is taken.
Let X be the random variable for the number of members in samples of size n having the
attribute. Then X is binomially distributed, i.e. X ∼ B(n, p).
Remark: For the distribution to be binomial, it is assumed that the n trials are independent.
$H_0 : p = p_1$
$H_1 : p < p_1$ (lower-tail) or $H_1 : p > p_1$ (upper-tail) or $H_1 : p \neq p_1$ (two-tail)
Let x be the number of members in the sample taken with the attribute. Then x is the test
statistic.
Assuming the null hypothesis, i.e. X ∼ B(n, p1 ), the probability that the number of members
with the attribute in a sample is equal to or more extreme than x is calculated, i.e.
• P(X ≤ x), for lower-tail
• P(X ≥ x), for upper-tail
• P(X ≤ x) or P(X ≥ x), for two-tail
Remark: For a two-tail test, it is sufficient to calculate the smaller probability, usually
P(X ≤ x) if x is closer to 0, and P(X ≥ x) if x is closer to n.
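The binomial tail probabilities can be computed by summing $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$. The sketch below is illustrative only; the function names and the numbers (n = 10, p1 = 0.5, observed x = 8) are hypothetical, and math.comb requires Python 3.8 or later.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p); returns 0 when k is negative."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def binomial_tail(x: int, n: int, p1: float, tail: str) -> float:
    """Probability, under H0: p = p1, of a count at least as extreme as x."""
    if tail == "lower":
        return binomial_cdf(x, n, p1)          # P(X <= x)
    if tail == "upper":
        return 1 - binomial_cdf(x - 1, n, p1)  # P(X >= x) = 1 - P(X <= x - 1)
    raise ValueError("tail must be 'lower' or 'upper'")


# Hypothetical upper-tail test: H0: p = 0.5, n = 10, observed x = 8.
print(binomial_tail(8, 10, 0.5, "upper"))  # compared with the significance level
```

With these hypothetical values the probability is about 5.5%, which is not less than 5%, so the null hypothesis would not be rejected at the 5% level.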
As the distribution of X is discrete, taking only non-negative integer values of at most n, the
concept of the critical region and errors is the same as in the previous section.
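As in the Poisson case, the critical region can be found by a search over the cumulative probabilities, and the probability of a Type I error is the probability of the critical region under the null hypothesis. The sketch below covers a lower-tail test; the function names and the numbers (n = 20, p1 = 0.5, 5% level) are hypothetical.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p); returns 0 when k is negative."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def lower_critical_value(n: int, p1: float, alpha: float):
    """Largest integer c with P(X <= c) < alpha under H0, or None if there is none."""
    c = None
    k = 0
    while k <= n and binomial_cdf(k, n, p1) < alpha:
        c = k
        k += 1
    return c


# Hypothetical lower-tail test: H0: p = 0.5 with n = 20 at the 5% level.
c = lower_critical_value(20, 0.5, 0.05)
type_1 = binomial_cdf(c, 20, 0.5)  # P(X <= c) when H0 is true
print(c, type_1)
```

Here the probability of a Type I error is about 2.1%, which is below the 5% significance level because X can only take integer values.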
Exercise 5.28: A hockey player found that she scored a goal on 82% of her penalty shots.
After attending a coaching course, she scored a goal on 19 out of 20 penalty shots. Making an
assumption that should be stated, test at the 10% significance level whether she has improved.
Exercise 5.29: At the 2009 election, $\frac{1}{3}$ of the voters in Chington voted for the Citizens Party.
One year later, a researcher questioned 20 randomly selected voters in Chington. Exactly 3 of
these 20 voters said that if there were an election next week they would vote for the Citizens
Party. Test at the 2.5% significance level whether there is evidence of a decrease in support
for the Citizens Party in Chington, since the 2009 election.
Exercise 5.30: A die has six faces numbered 1, 2, 3, 4, 5, 6. Manjit suspects that the die is
biased so that it shows a six on fewer throws than it would if it were fair. In order to test her
suspicion, she throws the die a certain number of times and counts the number of sixes.
i) State suitable null and alternative hypotheses for Manjit’s test.
ii) There are no sixes in the first 15 throws. Show that this result is not significant at the 5%
level.
iii) Find the smallest value of n such that, if there are no sixes in the first n throws, this result
is significant at the 5% level.
Exercise 5.31: A cereal manufacturer claims that 25% of cereal packets contain a free gift.
Lola suspects that the true proportion is less than 25%. In order to test the manufacturer’s
claim at the 5% significance level, she checks a random sample of 20 packets.
i) Find the critical region for the test.
ii) Hence find the probability of a Type I error.
Lola finds that 2 packets in her sample contain a free gift.
iii) State, with a reason, the conclusion she should draw.
Exercise 5.32: Marie claims that she can predict the winning horse at the local races. There
are 8 horses in each race. Nadine thinks that Marie is just guessing, so she proposes a test.
She asks Marie to predict the winners of the next 10 races and, if she is correct in 3 or more
races, Nadine will accept Marie’s claim.
i) State suitable null and alternative hypotheses.
ii) Calculate the probability of a Type I error.
iii) State the significance level of the test.
Exercise 5.33: At an election in 2010, 15% of voters in Bratfield voted for the Renewal Party.
One year later, a researcher asked 30 randomly selected voters in Bratfield whether they would
vote for the Renewal Party if there were an election next week. 2 of these 30 voters said that
they would.
i) Use a binomial distribution to test, at the 4% significance level, the null hypothesis that
there has been no change in the support for the Renewal Party in Bratfield against the
alternative hypothesis that there has been a decrease in support since the 2010 election.
ii) a) Explain why the conclusion in part i) cannot involve a Type I error.
b) State the circumstances in which the conclusion in part i) would involve a Type II
error.
Exercise 5.34: Jeevan thinks that a six-sided die is biased in favour of six. In order to test
this, Jeevan throws the die 10 times. If the die shows a six on at least 4 throws out of 10, she
will conclude that she is correct.
iv) If the die is actually biased so that the probability of throwing a six is $\frac{1}{2}$, calculate the
probability of a Type II error.
Exercise 5.35: At the last election, 70% of people in Apoli supported the president. Luigi believes
that the same proportion support the president now. Maria believes that the proportion
who support the president now is 35%. In order to test who is right, they agree on a hypothesis
test, taking Luigi’s belief as the null hypothesis. They will ask 6 people from Apoli, chosen at
random, and if more than 3 support the president they will accept Luigi’s belief.
iii) In fact 2 of the 6 people say that they support the president. State which error, Type I
or Type II, might be made. Explain your answer.
Exercise 5.36: The four sides of a spinner are A, B, C, D. The spinner is supposed to be fair,
but Sonam suspects that the spinner is biased so that the probability, p, that it will land on
side A is greater than $\frac{1}{4}$. He spins the spinner 10 times and finds that it lands on side A 6
times.
Later Sonam carries out a similar test at the 1% significance level, using another 10 spins of
the spinner.
iii) Assuming that the value of p is actually $\frac{3}{5}$, calculate the probability of a Type II error.
Recall that if X ∼ B(n, p) and n is large relative to p and q = 1 − p, then the distribution of
X is approximately N(np, npq).
Therefore if the probability required for the test is too tedious to calculate using the binomial
distribution, it can be approximated using the normal distribution (and continuity correction).
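A minimal sketch of the normal approximation to the binomial, with continuity correction, is given below. It assumes statistics.NormalDist in place of the normal table; the function name and the numbers (n = 400, p1 = 0.2, observed x = 95) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def binomial_tail_normal_approx(x: int, n: int, p1: float, tail: str) -> float:
    """Approximate binomial tail probability under H0: p = p1.

    X ~ B(n, p1) is approximated by Y ~ N(n * p1, n * p1 * q1), q1 = 1 - p1.
    """
    mean = n * p1
    standard_deviation = sqrt(n * p1 * (1 - p1))
    approx = NormalDist(mean, standard_deviation)
    if tail == "lower":
        return approx.cdf(x + 0.5)      # P(X <= x) becomes P(Y < x + 0.5)
    if tail == "upper":
        return 1 - approx.cdf(x - 0.5)  # P(X >= x) becomes P(Y > x - 0.5)
    raise ValueError("tail must be 'lower' or 'upper'")


# Hypothetical upper-tail test: H0: p = 0.2, n = 400, observed x = 95.
print(binomial_tail_normal_approx(95, 400, 0.2, "upper"))
```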
Exercise 5.37: Sami claims that he can read minds. He asks each of 50 people to choose one
of the 5 letters A, B, C, D or E. He then tells each person which letter he believes they have
chosen. He gets 13 correct. Sami says “This shows that I can read minds, because 13 is more
than I would have got right if I were just guessing.”
ii) Test at the 10% significance level whether Sami’s claim is justified.
Exercise 5.38: A six-sided die shows a six on 25 throws out of 200 throws. Test at the 10%
significance level the null hypothesis: P(throwing a six) = $\frac{1}{6}$, against the alternative
hypothesis: P(throwing a six) < $\frac{1}{6}$.
Exercise 5.39: Deng throws a coin 100 times in order to test, at the 5% significance level,
whether it is biased towards Heads.
Let X be the number of times an event occurs (with probability p) in n independent trials, i.e.
X ∼ B(n, p).
To test if the mean of X has changed, a random sample of size m of X is taken, and the total
number of times the event occurs, x, is noted.
Recall that if n is large and p is small relative to n, then the distribution of X is approximately
Po(np).
Using this approximate distribution, the probability of getting a value at least as extreme as x is
calculated and compared with the significance level.
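The procedure might be sketched as follows for a lower-tail test. The sketch assumes that the m observations are independent, so that their total is treated as approximately Po(mnp) (a sum of independent Poisson variables is again Poisson). The function names and the numbers (X ∼ B(500, 0.004), m = 3, observed total x = 1) are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); returns 0 when k is negative."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def total_lower_tail(x_total: int, n: int, p: float, m: int) -> float:
    """Approximate P(total <= x_total) over m independent observations of B(n, p).

    Each observation is approximated by Po(n * p), so the total over m
    observations is approximated by Po(m * n * p).
    """
    lam_total = m * n * p
    return poisson_cdf(x_total, lam_total)


# Hypothetical lower-tail test: X ~ B(500, 0.004), m = 3 observations, total x = 1.
print(total_lower_tail(1, 500, 0.004, 3))  # compared with the significance level
```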
Exercise 5.40: All the seats on a certain daily flight are always sold. The number of passengers
who have bought seats but fail to arrive for this flight on a particular day is modelled by the
distribution B(320, 0.005). After some changes, the airline wishes to test whether the mean
number of passengers per day who fail to arrive for this flight has decreased. During 5 randomly
chosen days, a total of 2 passengers failed to arrive. Carry out the test at the 2.5% significance
level.