Topic4 Revision Session Worksheet (1)

The document is a revision session worksheet with a total of 591 marks, covering various mathematical topics such as temperature conversion, statistics, probability, and hypothesis testing. It includes multiple questions that require calculations, interpretations, and conclusions based on given data. Each section has a maximum mark allocation, indicating the complexity and depth of the questions.

Uploaded by

blisspilatesbaku

Topic4 (Revision Session Worksheet) [591 marks]

1. [Maximum mark: 6]
The formula F = 1.8C + 32 is used to convert a temperature in degrees Celsius, C, to degrees Fahrenheit, F.

(a.i) Find a formula for converting a temperature in degrees Fahrenheit to degrees Celsius. [2]

(a.ii) Find the temperature in degrees Celsius that is recorded as 77 degrees Fahrenheit. [1]

Over one year, the mean daily temperature in Mexico City was calculated to be 17 degrees Celsius with a standard deviation of 9
degrees Celsius.

(b) For the same year, find in degrees Fahrenheit

(b.i) the mean daily temperature in Mexico City. [1]

(b.ii) the standard deviation of the daily temperature in Mexico City. [2]
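
One way to check answers to question 1 is a minimal Python sketch like the following. The function and variable names are illustrative assumptions, not part of the worksheet. It relies on the fact that adding a constant shifts the mean but not the spread, so only the factor 1.8 affects the standard deviation.

```python
def fahrenheit_to_celsius(f):
    """Invert F = 1.8C + 32 to obtain C = (F - 32) / 1.8."""
    return (f - 32) / 1.8


def celsius_to_fahrenheit(c):
    """Apply the worksheet formula F = 1.8C + 32."""
    return 1.8 * c + 32


celsius_at_77f = fahrenheit_to_celsius(77)   # part (a.ii)
mean_f = celsius_to_fahrenheit(17)           # part (b.i): the mean converts like a single value
sd_f = 1.8 * 9                               # part (b.ii): the +32 shift leaves the spread unchanged

print(celsius_at_77f, mean_f, sd_f)
```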

2. [Maximum mark: 7]
The following data show the heights, in metres, of six players in a basketball team.

(a) For these six players, find

(a.i) the mean height. [2]

(a.ii) the median height. [1]

(a.iii) the modal height. [1]

(a.iv) the range of the heights. [2]

A new player, Gheorghe, joins the team. Their height is measured as 1.98 metres to the nearest centimetre.

(b) Write down the shortest possible height of Gheorghe. [1]

3. [Maximum mark: 6]
A teacher surveys their students to find out if they have eaten at the local Thai and Indian cafés. The results of the survey are shown
in the following Venn diagram.
(a) Write down the number of students surveyed. [1]

(b) Write down the number of students who have not eaten at the Indian café. [1]

A student is chosen at random from those surveyed.

(c) Find the probability this student has eaten at both the Thai café and the Indian café. [1]

Let T be the event: a student has eaten at the Thai café.


Let I be the event: a student has eaten at the Indian café.

(d) Find P(T ∪ I). [1]

(e) State whether the events T and I are mutually exclusive. Justify your answer. [2]

4. [Maximum mark: 8]
Zac raises funds for a library by running a game where players spin a needle. The final position of the needle results in an outcome
where a player wins or loses money. The outcomes, with associated probabilities, are shown in the following diagram.

Let X represent the amount that a player of this game wins.

(a.i) Find the expected value of X. [2]

(a.ii) Interpret your answer to part (a)(i). [1]


To encourage a person to keep playing this game, Zac increases the winning prize for the second game they play from $5 to $6.
For each successive game they play, the winning prize continues to increase by $1.

Emily plays k games. The kth game is fair.

(b.i) Find the value of k. [4]

(b.ii) Explain why Zac expects to raise money from the games Emily plays. [1]

5. [Maximum mark: 6]
Jerry makes handcrafted chocolates. On average, 1 in 25 of the chocolates that Jerry makes is flawed. Whether or not a chocolate is
flawed is independent of all other chocolates.

(a) In a batch of 20 chocolates, chosen at random, find the probability that

(a.i) two are flawed. [2]

(a.ii) more than two are flawed. [2]

Jerry sells the perfect chocolates for 50 pesos each and the flawed ones for 15 pesos each.

(b) Calculate the expected number of pesos Jerry makes from selling a batch of 20 randomly selected chocolates. [2]
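
Question 5 can be checked with a short binomial calculation. This is a sketch under the assumption that the number of flawed chocolates in a batch follows B(20, 1/25); the helper `binom_pmf` is a hypothetical name introduced here for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 1 / 25

p_two = binom_pmf(2, n, p)                                       # part (a.i)
p_more_than_two = 1 - sum(binom_pmf(k, n, p) for k in range(3))  # part (a.ii)

# Part (b): each chocolate earns 50 pesos if perfect and 15 pesos if flawed.
expected_pesos = n * ((1 - p) * 50 + p * 15)

print(round(p_two, 4), round(p_more_than_two, 4), expected_pesos)
```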

6. [Maximum mark: 7]
The prices, in dollars, of 10 different garden chairs are:

79 139 255 99 50 209 229 193 69 49

(a) Find the range of the prices of the 10 chairs. [2]

(b) Use your graphic display calculator to find

(b.i) the mean price of the chairs. [2]

(b.ii) the standard deviation of the price of the chairs. [1]

In a sale, the price of each of the 10 garden chairs is reduced by $20.

(c) Write down

(c.i) the new mean. [1]

(c.ii) the new standard deviation. [1]
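
The summary statistics in question 6 can be reproduced without a graphic display calculator. The sketch below assumes the population standard deviation (`statistics.pstdev`) is the one intended; a sample standard deviation would give a slightly larger value.

```python
from statistics import mean, pstdev

prices = [79, 139, 255, 99, 50, 209, 229, 193, 69, 49]

price_range = max(prices) - min(prices)   # part (a)
mean_price = mean(prices)                 # part (b.i)
sd_price = pstdev(prices)                 # part (b.ii), population SD (assumption)

# Part (c): reduce every price by 20 dollars.
sale_prices = [x - 20 for x in prices]
sale_mean = mean(sale_prices)             # shifts down by 20
sale_sd = pstdev(sale_prices)             # unchanged by a constant shift

print(price_range, mean_price, round(sd_price, 2), sale_mean, round(sale_sd, 2))
```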

7. [Maximum mark: 5]
Sunita sorts 300 peppers into sizes of small, medium or large. Some peppers are red, some are green, and some are yellow.

The following table shows her results.


Sunita wants to test, at the 5% significance level, whether the size of the peppers is independent of the colour.

(a) State the null and alternative hypotheses for this test. [1]

The critical value for this test is 9.49.

(b.i) Calculate χ²_calc. [2]

(b.ii) State a conclusion to the test. Give a reason for your answer. [2]

8. [Maximum mark: 7]
Gustav plays a game in which he first tosses an unbiased coin and then rolls an unbiased six-sided die.

If the coin shows tails, the score on the die is Gustav’s final number of points.

If the coin shows heads, one is added to the score on the die for Gustav’s final number of points.

(a) Find the probability that Gustav’s final number of points is 7. [2]

(b) Complete the following table.

[3]

(c) Calculate the expected value of Gustav’s final number of points. [2]

9. [Maximum mark: 7]
Billy is a keen walker who keeps a record of his performance. The following table shows the time, in minutes, it takes him to walk
one kilometre up hills with different gradients. The gradient of each hill is constant.

(a.i) Find the equation of the regression line of T on G. [2]

(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]

On Sunday, Billy intends to walk up a hill with a gradient of 13%.

(b) Estimate the time it will take Billy to walk one kilometre up the hill. [2]

This morning, Billy walked one kilometre up a hill, and it took 22 minutes.
(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]

10. [Maximum mark: 8]


Mrs Whitehouse is a chemistry teacher. After grading her final exams, she creates the following box and whisker diagram to
compare the grades of her two classes.

(a) Identify which two of the following statements must be true according to the box and whisker diagram. Indicate
your choices by placing tick marks in the second column of the following table.

[2]

At the end of the year, Mrs Whitehouse surveyed a random sample of students from each of her two large classes to determine how
satisfied they were with her teaching.

Each student independently selected a value from 1 to 10, with 1 meaning that they were not satisfied at all and 10 meaning that
they were very satisfied.

Her collected data from the student surveys is shown.

Mrs Whitehouse believes that there was no difference in the general satisfaction between the two classes. She assumes that the
data is drawn from a population that can be modelled by a normal distribution and proposes to conduct a t-test at the 5%
significance level.

(b) Write down the null and alternative hypotheses for her test. [2]

(c) Find the p-value for her test. [2]

(d) Write down the conclusion to the test. Give a reason for your answer. [2]
11. [Maximum mark: 7]
Rita is playing a game. In the game, she must roll a fair six-sided die. If she gets a five or six then she wins a prize. If not, then she has
another chance but this time she must flip a fair coin which will result in the coin landing on heads or tails. If the coin lands on
heads, then Rita wins a prize.

(a) Complete the tree diagram by writing in the three missing probabilities.

[2]

(b) Find the probability that Rita does not win a prize. [2]

(c) Given that Rita won a prize, find the probability that she got a five or six when she rolled the die. [3]

12. [Maximum mark: 6]


Nicole works at a local school 5 days each week. She drives an old car to work that has a 72% probability of starting on any given
morning. The probability of the car starting on a given morning is independent of it starting on any other morning.

(a) Find the probability that Nicole’s car starts on exactly three mornings in a particular 5-day workweek. [2]

Nicole walks to work on mornings when her car does not start and it is not raining. Nicole takes the bus to work on mornings when
her car does not start and it is raining.

Where Nicole lives, there is a 42% probability of rain on any given morning, independent of any other morning. The probability of
Nicole’s car starting is independent of the weather.

(b) Find the probability that Nicole will not have to take the bus in a particular workweek. [4]
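
A possible check for question 12 is sketched below. It assumes, as the question states, that car starts and rain are independent of each other and from day to day, so a "bus morning" has probability (1 − 0.72) × 0.42. The variable names are illustrative.

```python
from math import comb

p_start = 0.72   # car starts on a given morning
p_rain = 0.42    # rain on a given morning
days = 5

# Part (a): exactly three starts out of five mornings, X ~ B(5, 0.72).
p_three_starts = comb(days, 3) * p_start**3 * (1 - p_start) ** 2

# Part (b): the bus is needed only when the car fails AND it rains.
p_bus_one_day = (1 - p_start) * p_rain
p_no_bus_week = (1 - p_bus_one_day) ** days

print(round(p_three_starts, 4), round(p_no_bus_week, 4))
```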

13. [Maximum mark: 6]


Thurston believes that more popular musical artists sell more albums.

He begins to investigate this belief by randomly selecting eight musical artists and collecting data on the number of followers each
of the artists has on a particular social media platform. He then collects data on the number of albums each artist sold in the first
week after releasing an album. His data is shown in Table 1.
Thurston decides to calculate the Spearman’s rank correlation coefficient.
(a) Complete the table of ranks shown in Table 2.

Table 2

[1]

(b) Calculate the value of r_s, Spearman’s rank correlation coefficient. [2]

Thurston believes that artists with a higher number of social media followers sell more albums in the first week. He carries out a
hypothesis test using a 10% significance level with the following null hypothesis:

H0: In the population, there is no monotonic relationship between the number of social media followers and the number of
albums sold in the first week.

(c) Write down Thurston’s alternative hypothesis. [1]

The critical value of r_s for this test is 0.643.

(d) State the conclusion of the hypothesis test, giving a reason. [2]

14. [Maximum mark: 7]


Joel is a keen cyclist who keeps a record of his performance. The following table shows the time, in minutes, it takes him to ride one
kilometre on hills with different gradients. The gradient of each hill is constant.

(a.i) Find the equation of the regression line of T on G. [2]

(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]

On Saturday, Joel intends to ride a hill with a gradient of 17%.


(b) Estimate the time it will take Joel to ride one kilometre up the hill. [2]

This morning, Joel rode one kilometre up a hill, and it took 22 minutes.

(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]

15. [Maximum mark: 6]


The decathlon is a competition where athletes compete in ten events. Two of those events are long jump and high jump. In both
events, a greater distance means a better ranking.

The table shows results for these two events at the World Championships.

Athlete’s country   Long jump (m)   High jump (m)   Long jump rank   High jump rank
Germany             7.64            2.11            1
France              7.52            2.08            2
Estonia             7.49            1.84            3
Canada              7.44            2.02            4
Netherlands         7.33            2.05            5
Ukraine             7.28            2.02            6
Algeria             7.22            1.90            7
Austria             7.11            1.87            8
Grenada             6.98            1.99            9
Japan               6.64            1.96            10

The Spearman’s rank correlation coefficient is used to determine if there is a linear correlation between an athlete’s ranking in long
jump and their ranking in high jump.

(a) Complete the table to show the athletes’ rankings in high jump. [2]

(b) Find the value of the Spearman’s rank correlation coefficient r_s. [2]

The following guide is used by the coach to determine the strength of the correlation between the ranks for long jump and high
jump.

|r_s|            Strength
0.000 to 0.199   Very weak
0.200 to 0.399   Weak
0.400 to 0.599   Moderate
0.600 to 0.799   Strong
0.800 to 1.000   Very strong

(c) State the strength of the correlation between the rankings as indicated by the table and interpret this in the
context of the question. [2]
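
The ranking and correlation in question 15 can be sketched in Python as follows. Two assumptions are made here: tied values (the two high jumps of 2.02 m) share the average of the ranks they occupy, and the Spearman coefficient is computed as the Pearson correlation of the two rank lists, which is how calculators typically handle ties. The helpers `average_ranks` and `pearson` are hypothetical names.

```python
from math import sqrt


def average_ranks(values):
    """Rank from largest (rank 1) downwards; tied values share their mean rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


long_jump = [7.64, 7.52, 7.49, 7.44, 7.33, 7.28, 7.22, 7.11, 6.98, 6.64]
high_jump = [2.11, 2.08, 1.84, 2.02, 2.05, 2.02, 1.90, 1.87, 1.99, 1.96]

long_rank = average_ranks(long_jump)
high_rank = average_ranks(high_jump)   # part (a)
r_s = pearson(long_rank, high_rank)    # part (b)

print(high_rank, round(r_s, 3))
```

With these assumptions the coefficient falls in the 0.400 to 0.599 band of the coach's guide.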
16. [Maximum mark: 6]
Carys believes that, on a memory retention test, the mean score of bilingual people (μ_b) will be higher than the mean score of
monolingual people (μ_m). Carys gave a memory retention test to a random sample of students in her class. The results are shown in
the two tables.

Carys performs a one-tailed t-test at a 5% level of significance. It is assumed that the scores are normally distributed and the
samples have equal variances.

(a) State the null and alternative hypotheses. [2]

(b) Calculate the p-value for this test. [2]

(c) State the conclusion of the test in the context of the question. Justify your answer. [2]

17. [Maximum mark: 7]


On a specific day, the speed of cars as they pass a speed camera can be modelled by a normal distribution with a mean of
67.3 km h⁻¹.

A speed of 75.7 km h⁻¹ is two standard deviations from the mean.

(a) Find the standard deviation for the speed of the cars. [2]

Speeding tickets are issued to all drivers travelling at a speed greater than 72 km h⁻¹.

(b) Find the probability that a randomly selected driver who passes the speed camera receives a speeding ticket. [2]

It is found that 82% of cars on this road travel at speeds between p km h⁻¹ and q km h⁻¹, where p < q. This interval includes
cars travelling at a speed of 74 km h⁻¹.

(c) Show that the region of the normal distribution between p and q is not symmetrical about the mean. [3]

18. [Maximum mark: 7]


In a school, 200 students solved a problem in a mathematics competition. Their times to solve the problem were recorded and the
following cumulative frequency graph was produced.
(a) Use the graph to find

(a.i) the median time; [1]

(a.ii) the lower quartile; [1]

(a.iii) the upper quartile; [1]

(a.iv) the interquartile range. [1]

Cedric took 14 seconds to solve the problem.

(b) Determine whether Cedric’s time is an outlier. [3]

19. [Maximum mark: 6]


At a running club, Sung-Jin conducts a test to determine if there is any association between an athlete’s age and their best time
taken to run 100 m. Eight athletes are chosen at random, and their details are shown below.

Athlete          A     B     C     D     E     F     G     H

Age (years)      13    17    22    18    19    25    11    36

Time (seconds)   13.4  14.6  13.4  12.9  12.0  11.8  17.0  13.1

Sung-Jin decides to calculate the Spearman’s rank correlation coefficient for his set of data.

(a) Complete the table of ranks.

Athlete A B C D E F G H

Age rank 3
Time rank 1 [2]

(b) Calculate the Spearman’s rank correlation coefficient, r_s. [2]

(c) Interpret this value of r_s in the context of the question. [1]

(d) Suggest a mathematical reason why Sung-Jin may have decided not to use Pearson’s product-moment
correlation coefficient with his data from the original table. [1]

20. [Maximum mark: 4]


The following frequency distribution table shows the test grades for a group of students.

Grade 1 2 3 4 5 6 7

Frequency 1 4 7 9 p 9 4

For this distribution, the mean grade is 4.5.

(a) Write down the total number of students in terms of p. [1]

(b) Calculate the value of p. [3]
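
Question 20 amounts to solving (148 + 5p) / (34 + p) = 4.5 for p. The sketch below checks this by a simple search, assuming p is a whole number between 0 and 100 (the search range is an assumption made for illustration).

```python
grades = [1, 2, 3, 4, 5, 6, 7]


def mean_grade(p):
    """Mean grade when the frequency of grade 5 is p."""
    freqs = [1, 4, 7, 9, p, 9, 4]
    return sum(g * f for g, f in zip(grades, freqs)) / sum(freqs)


# Try whole-number frequencies until the mean matches 4.5.
p_solution = next(p for p in range(0, 101) if abs(mean_grade(p) - 4.5) < 1e-9)
total_students = 34 + p_solution

print(p_solution, total_students)
```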

21. [Maximum mark: 6]


A company that owns many restaurants wants to determine if there are differences in the quality of the food cooked for three
different meals: breakfast, lunch and dinner.

Their quality assurance team randomly selects 500 items of food to inspect. The quality of this food is classified as perfect,
satisfactory, or poor. The data is summarized in the following table.

An item of food is chosen at random from these 500.

(a) Find the probability that its quality is not perfect, given that it is from breakfast. [2]

A χ² test at the 5% significance level is carried out to determine if there is significant evidence of a difference in the quality of the
food cooked for the three meals.

The critical value for this test is 9.488.

The hypotheses for this test are:


H0: The quality of the food and the type of meal are independent.
H1: The quality of the food and the type of meal are not independent.

(b) Find the χ² statistic. [2]

(c) State, with justification, the conclusion for this test. [2]

22. [Maximum mark: 6]


The lengths of the seeds from a particular mango tree are approximated by a normal distribution with a mean of 4 cm and a
standard deviation of 0.25 cm.

A seed from this mango tree is chosen at random.

(a) Calculate the probability that the length of the seed is less than 3.7 cm. [2]

It is known that 30% of the seeds have a length greater than k cm.

(b) Find the value of k. [2]

For a seed of length d cm, chosen at random, P(4 − m < d < 4 + m) = 0.6.

(c) Find the value of m. [2]
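
The normal-distribution calculations in question 22 can be checked with `statistics.NormalDist` from the Python standard library. This is a sketch with illustrative variable names. For part (c) it uses symmetry: a central probability of 0.6 leaves 0.2 in each tail, so 4 + m is the 80th percentile.

```python
from statistics import NormalDist

seed_length = NormalDist(mu=4, sigma=0.25)

p_below = seed_length.cdf(3.7)        # part (a): P(length < 3.7)
k = seed_length.inv_cdf(0.70)         # part (b): 30% of seeds are longer than k
m = seed_length.inv_cdf(0.80) - 4     # part (c): half-width of the central 60%

print(round(p_below, 4), round(k, 3), round(m, 3))
```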

23. [Maximum mark: 5]


In a game, balls are thrown to hit a target. The random variable X is the number of times the target is hit in five attempts. The
probability distribution for X is shown in the following table.

x          0     1     2     3     4     5

P(X = x)   0.15  0.2   k     0.16  2k    0.25

(a) Find the value of k. [2]

The player has a chance to win money based on how many times they hit the target.

The gain for the player, in $, is shown in the following table, where a negative gain means that the player loses money.

x 0 1 2 3 4 5

Player’s gain ($) −4 −3 −1 0 1 4

(b) Determine whether this game is fair. Justify your answer. [3]
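
For question 23, the probabilities must sum to 1, which fixes k, and the game is fair only if the expected gain is zero. A minimal sketch using exact fractions (variable names are illustrative):

```python
from fractions import Fraction

# Known probabilities for x = 0, 1, 3, 5; the other two entries are k and 2k.
known = [Fraction(15, 100), Fraction(20, 100), Fraction(16, 100), Fraction(25, 100)]
k = (1 - sum(known)) / 3   # k + 2k = 3k makes up the remainder

probs = [Fraction(15, 100), Fraction(20, 100), k, Fraction(16, 100), 2 * k, Fraction(25, 100)]
gains = [-4, -3, -1, 0, 1, 4]

expected_gain = sum(p * g for p, g in zip(probs, gains))
is_fair = expected_gain == 0

print(float(k), float(expected_gain), is_fair)
```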
24. [Maximum mark: 7]
The following Venn diagram shows two independent events, R and S . The values in the diagram represent probabilities.

(a) Find the value of x. [3]

(b) Find the value of y. [2]

(c) Find P(R′|S′). [2]

25. [Maximum mark: 5]


Roy is a member of a motorsport club and regularly drives around the Port Campbell racetrack.

The times he takes to complete a lap are normally distributed with mean 59 seconds and standard deviation 3 seconds.

(a) Find the probability that Roy completes a lap in less than 55 seconds. [2]

Roy will complete a 20-lap race. It is expected that 8.6 of the laps will take more than t seconds.

(b) Find the value of t. [3]

26. [Maximum mark: 7]


Taizo plays a game where he throws one ball at two bottles that are sitting on a table. The probability of knocking over bottles, in
any given game, is shown in the following table.

(a) Taizo plays two games that are independent of each other. Find the probability that Taizo knocks over a total of
two bottles. [4]

In any given game, Taizo will win k points if he knocks over two bottles, win 4 points if he knocks over one bottle and lose 8 points
if no bottles are knocked over.

(b) Find the value of k such that the game is fair. [3]
27. [Maximum mark: 5]
Sergio is interested in whether an adult’s favourite breakfast berry depends on their income level. He obtains the following data for
341 adults and decides to carry out a χ² test for independence, at the 10% significance level.

(a) Write down the null hypothesis. [1]

(b) Find the value of the χ² statistic. [2]

The critical value of this χ² test is 7.78.

(c) Write down Sergio’s conclusion to the test in context. Justify your answer. [2]

28. [Maximum mark: 5]


Manny and Annabelle, mathematics teachers at Burnham High School, give their students the same examination. A random
sample of the examination scores were collected from each of their classes.

Annabelle uses these scores to conduct a two-tailed t-test to compare the means of the two classes, at the 5% level of significance.
It is assumed the examination scores for both classes have the same variance and are normally distributed.

The null hypothesis is μ1 = μ2, where μ1 is the mean examination score from Manny’s class and μ2 is the mean examination score
from Annabelle’s class.

(a) Write down the alternative hypothesis. [1]

(b) Find the p-value for this test. Give your answer correct to five decimal places. [2]

Annabelle concludes there is insufficient evidence to reject the null hypothesis.

(c) State whether Annabelle’s conclusion is correct. Give a reason for your answer. [2]

29. [Maximum mark: 7]


In the first month of a reforestation program, the town of Neerim plants 85 trees. Each subsequent month the number of trees
planted will increase by an additional 30 trees.

The number of trees to be planted in each of the first three months are shown in the following table.
(a) Find the number of trees to be planted in the 15th month. [3]

(b) Find the total number of trees to be planted in the first 15 months. [2]

(c) Find the mean number of trees planted per month during the first 15 months. [2]
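
Question 29 describes an arithmetic sequence with first term 85 and common difference 30. A brief sketch that lists the monthly plantings directly, rather than using the closed-form formulas, with illustrative variable names:

```python
first_month, increase, months = 85, 30, 15

# Trees planted in month n: 85 + 30(n - 1).
trees = [first_month + increase * (n - 1) for n in range(1, months + 1)]

trees_month_15 = trees[-1]           # part (a)
total_trees = sum(trees)             # part (b)
mean_per_month = total_trees / months  # part (c)

print(trees_month_15, total_trees, mean_per_month)
```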

30. [Maximum mark: 6]


A factory produces bags of sugar with a labelled weight of 500 g. The weights of the bags are normally distributed with a mean of
500 g and a standard deviation of 3 g.

(a) Write down the percentage of bags that weigh more than 500 g. [1]

A bag that weighs less than 495 g is rejected by the factory for being underweight.

(b) Find the probability that a randomly chosen bag is rejected for being underweight. [2]

(c) A bag that weighs more than k grams is rejected by the factory for being overweight. The factory rejects 2% of
bags for being overweight.

Find the value of k. [3]

31. [Maximum mark: 7]


Leo is investigating whether a six-sided die is fair. He rolls the die 60 times and records the observed frequencies in the following
table:

Leo carries out a χ² goodness of fit test at a 5% significance level.

(a) Write down the null and alternative hypotheses. [1]

(b) Write down the degrees of freedom. [1]

(c) Write down the expected frequency of rolling a 1. [1]

(d) Find the p-value for the test. [2]

(e) State the conclusion of the test. Give a reason for your answer. [2]

32. [Maximum mark: 6]


Karl has three brown socks and four black socks in his drawer. He takes two socks at random from the drawer.
(a) Complete the tree diagram.

[1]

(b) Find the probability that Karl takes two socks of the same colour. [2]

(c) Given that Karl has two socks of the same colour find the probability that he has two brown socks. [3]

33. [Maximum mark: 6]


A study was conducted to investigate whether the mean reaction time of drivers who are talking on mobile phones is the same as
the mean reaction time of drivers who are talking to passengers in the vehicle. Two independent groups were randomly selected
for the study.

To gather data, each driver was put in a car simulator and asked to either talk on a mobile phone or talk to a passenger. Each driver
was instructed to apply the brakes as soon as they saw a red light appear in front of the car. The reaction times of the drivers, in
seconds, were recorded, as shown in the following table.

At the 10% level of significance, a t-test was used to compare the mean reaction times of the two groups. Each data set is assumed
to be normally distributed, and the population variances are assumed to be the same.

Let μ1 and μ2 be the population means for the two groups. The null hypothesis for this test is H0: μ1 − μ2 = 0.

(a) State the alternative hypothesis. [1]

(b) Calculate the p-value for this test. [2]

(c.i) State the conclusion of the test. Justify your answer. [2]

(c.ii) State what your conclusion means in context. [1]


34. [Maximum mark: 5]
A college runs a mathematics course in the morning. Scores for a test from this class are shown below.

25 33 51 62 63 63 70 74 79 79 81 88 90 90 98

For these data, the lower quartile is 62 and the upper quartile is 88.

(a) Show that the test score of 25 would not be considered an outlier. [3]

The box and whisker diagram showing these scores is given below.

Test scores

Another mathematics class is run by the college during the evening. A box and whisker diagram showing the scores from this class
for the same test is given below.

Test scores

A researcher reviews the box and whisker diagrams and believes that the evening class performed better than the morning class.

(b) With reference to the box and whisker diagrams, state one aspect that may support the researcher’s opinion and
one aspect that may counter it. [2]
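
Part (a) of question 34 can be checked with the usual 1.5 × IQR rule. The sketch below assumes that rule is the intended outlier criterion and uses the quartiles given in the question; the variable names are illustrative.

```python
q1, q3 = 62, 88          # quartiles stated in the question
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr   # values below this count as outliers
upper_fence = q3 + 1.5 * iqr   # values above this count as outliers

score = 25
is_outlier = score < lower_fence or score > upper_fence

print(iqr, lower_fence, upper_fence, is_outlier)
```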

35. [Maximum mark: 6]


A group of 130 applicants applied for admission into either the Arts programme or the Sciences programme at a university. The
outcomes of their applications are shown in the following table.

(a) Find the probability that a randomly chosen applicant from this group was accepted by the university. [1]

An applicant is chosen at random from this group. It is found that they were accepted into the programme of their choice.

(b) Find the probability that the applicant applied for the Arts programme. [2]

(c) Two different applicants are chosen at random from the original group.

Find the probability that both applicants applied to the Arts programme. [3]
36. [Maximum mark: 7]
A polygraph test is used to determine whether people are telling the truth or not, but it is not completely accurate. When a person
tells the truth, they have a 20% chance of failing the test. Each test outcome is independent of any previous test outcome.

10 people take a polygraph test and all 10 tell the truth.

(a) Calculate the expected number of people who will pass this polygraph test. [2]

(b) Calculate the probability that exactly 4 people will fail this polygraph test. [2]

(c) Determine the probability that fewer than 7 people will pass this polygraph test. [3]
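
Question 36 is another binomial setting. The sketch below assumes the number of truthful people who pass follows B(10, 0.8), since each has a 20% chance of failing; the variable names are illustrative.

```python
from math import comb

n = 10
p_fail = 0.2
p_pass = 1 - p_fail

expected_passes = n * p_pass                                # part (a)
p_four_fail = comb(n, 4) * p_fail**4 * p_pass ** (n - 4)    # part (b)

# Part (c): fewer than 7 passes means 0 to 6 passes.
p_fewer_than_7_pass = sum(
    comb(n, k) * p_pass**k * p_fail ** (n - k) for k in range(7)
)

print(expected_passes, round(p_four_fail, 4), round(p_fewer_than_7_pass, 4))
```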

37. [Maximum mark: 5]


The masses of Fuji apples are normally distributed with a mean of 163 g and a standard deviation of 6.83 g.

When Fuji apples are picked, they are classified as small, medium, large or extra large depending on their mass. Large apples have a
mass of between 172 g and 183 g.

(a) Determine the probability that a Fuji apple selected at random will be a large apple. [2]

Approximately 68% of Fuji apples have a mass within the medium-sized category, which is between k and 172 g.

(b) Find the value of k. [3]

38. [Maximum mark: 5]


In a city, 32% of people have blue eyes. If someone has blue eyes, the probability that they also have fair hair is 58%. This
information is represented in the following tree diagram.

(a) Write down the value of a. [1]

(b) Find an expression, in terms of b, for the probability of a person not having blue eyes and having fair hair. [1]

It is known that 41% of people in this city have fair hair.

Calculate the value of

(c.i) b. [2]

(c.ii) c . [1]
39. [Maximum mark: 6]
Eduardo believes that there is a linear relationship between the age of a male runner and the time it takes them to run 5000
metres.

To test this, he recorded the age, x years, and the time, t minutes, for eight males in a single 5000 m race. His results are presented
in the following table and scatter diagram.

(a) For this data, find the value of the Pearson’s product-moment correlation coefficient, r. [2]

Eduardo looked in a sports science text book. He found that the following information about r was appropriate for athletic
performance.

(b) Comment on your answer to part (a), using the information that Eduardo found. [1]

(c) Write down the equation of the regression line of t on x, in the form t = ax + b . [1]

(d) A 57-year-old male also ran in the 5000 m race.

Use the equation of the regression line to estimate the time he took to complete the 5000 m race. [2]

40. [Maximum mark: 8]


A group of 120 students sat a history exam. The cumulative frequency graph shows the scores obtained by the students.
(a) Find the median of the scores obtained. [1]

The students were awarded a grade from 1 to 5, depending on the score obtained in the exam. The number of students receiving
each grade is shown in the following table.

(b) Find an expression for a in terms of b. [2]

The mean grade for these students is 3.65.

(c.i) Find the number of students who obtained a grade 5. [3]

(c.ii) Find the minimum score needed to obtain a grade 5. [2]

41. [Maximum mark: 5]


Arriane has geese on her farm. She claims the mean weight of eggs from her black geese is less than the mean weight of eggs from
her white geese.

She recorded the weights of eggs, in grams, from a random selection of geese. The data is shown in the table.

In order to test her claim, Arriane performs a t-test at a 10% level of significance. It is assumed that the weights of eggs are normally
distributed and the samples have equal variances.
(a) State, in words, the null hypothesis. [1]

(b) Calculate the p-value for this test. [2]

(c) State whether the result of the test supports Arriane’s claim. Justify your reasoning. [2]

42. [Maximum mark: 7]


A game is played where two unbiased dice are rolled and the score in the game is the greater of the two numbers shown. If the two
numbers are the same, then the score in the game is the number shown on one of the dice. A diagram showing the possible
outcomes is given below.

Let T be the random variable “the score in a game”.

(a) Complete the table to show the probability distribution of T .

[2]

Find the probability that

(b.i) a player scores at least 3 in a game. [1]

(b.ii) a player scores 6, given that they scored at least 3. [2]

(c) Find the expected score of a game. [2]
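
The distribution in question 42 can be built by enumerating all 36 equally likely outcomes of two dice. This is a sketch with illustrative names; exact fractions are used so the results can be compared with hand calculations.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
outcomes = list(product(range(1, 7), repeat=2))

# Part (a): P(T = t), where T is the greater of the two numbers.
dist = {
    t: Fraction(sum(1 for a, b in outcomes if max(a, b) == t), len(outcomes))
    for t in range(1, 7)
}

p_at_least_3 = sum(dist[t] for t in range(3, 7))   # part (b.i)
p_6_given_at_least_3 = dist[6] / p_at_least_3      # part (b.ii)
expected_score = sum(t * dist[t] for t in dist)    # part (c)

print(dist, p_at_least_3, p_6_given_at_least_3, float(expected_score))
```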

43. [Maximum mark: 4]


Deb used a thermometer to record the maximum daily temperature over ten consecutive days. Her results, in degrees Celsius (°C),
are shown below.

14, 15, 14, 11, 10, 9, 14, 15, 16, 13

For this data set, find the value of

(a) the mode. [1]


(b) the mean. [2]

(c) the standard deviation. [1]

44. [Maximum mark: 6]


A newspaper vendor in Singapore is trying to predict how many copies of The Straits Times they will sell. The vendor forms a model to
predict the number of copies sold each weekday. According to this model, they expect the same number of copies will be sold each
day.

To test the model, they record the number of copies sold each weekday during a particular week. This data is shown in the table.

A goodness of fit test at the 5% significance level is used on this data to determine whether the vendor’s model is suitable.

The critical value for the test is 9.49 and the hypotheses are

H0: The data satisfies the model.
H1: The data does not satisfy the model.

(a) Find an estimate for how many copies the vendor expects to sell each day. [1]

(b.i) Write down the degrees of freedom for this test. [1]

(b.ii) Write down the conclusion to the test. Give a reason for your answer. [4]

45. [Maximum mark: 6]


At Springfield University, the weights, in kg, of 10 chinchilla rabbits and 10 sable rabbits were recorded. The aim was to find out
whether chinchilla rabbits are generally heavier than sable rabbits. The results obtained are summarized in the following table.

A t-test is to be performed at the 5% significance level.

(a) Write down the null and alternative hypotheses. [2]

(b) Find the p-value for this test. [2]

(c) Write down the conclusion to the test. Give a reason for your answer. [2]

46. [Maximum mark: 5]


The number of sick days taken by each employee in a company during a year was recorded. The data was organized in a box and
whisker diagram as shown below:
For this data, write down

(a.i) the minimum number of sick days taken during the year. [1]

(a.ii) the lower quartile. [1]

(a.iii) the median. [1]

(b) Paul claims that this box and whisker diagram can be used to infer that the percentage of employees who took
fewer than six sick days is smaller than the percentage of employees who took more than eleven sick days.

State whether Paul is correct. Justify your answer. [2]

47. [Maximum mark: 19]


Xavie conducted a study to see if there is a relationship between the price of an apartment, y, and its distance, x, from the city
centre of Melbourne.

They took a random sample of six typical apartments along a train line in the city. Xavie obtained the data shown in the following
table.

A plot of these data is seen in the following graph.

(a) Write down the value of the Spearman’s rank correlation coefficient, $r_s$. [1]

(b.i) Find the Pearson’s product-moment correlation coefficient, r. [2]

(b.ii) Use your value of r to state which two of the following would best describe the correlation between the variables.

[2]

The relationship between the variables can be modelled by the regression equation $y = ax + b$.

(c.i) Write down the value of a. [1]

(c.ii) Write down the value of b. [1]

(c.iii) According to this model, state in context what the value of b represents. [1]

(d) Xavie uses the regression equation to estimate the price of a typical apartment located 19.6 km from the city
centre.

(d.i) Find this estimated price. [3]

(d.ii) State two reasons that Xavie might use to justify the validity of this estimate. [2]

To verify whether this relationship applies in a different direction from the city centre, Xavie considers two locations, A and B, both
an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices
(in millions of dollars) in the following tables.

Xavie conducts a t-test, at the 5% level of significance, to see if the mean apartment price in location A is different to the mean
apartment price in location B. They assume the population variances are the same.

For this test, Xavie takes the null hypothesis to be $\mu_A = \mu_B$.

(e) Write down the alternative hypothesis. [1]

(f) Find the p-value for this test. [2]

(g) State the conclusion of the test. Justify your answer. [2]

(h) State one additional assumption Xavie has made about the distributions to conduct this test. [1]

48. [Maximum mark: 12]


A type of generator will only function if a particular switch is working. The generator has a main switch, A, and a ‘back up’ switch, B.

The manufacturer claims the probability of switch A failing within one month of being fitted is 0.1 and the probability of the
cheaper switch B failing within one month is 0.3. Whether or not a switch fails is independent of the state of the other switch.

If both switches fail, the generator needs to shut down to replace the switches. Both switches are replaced after a month of use
(whether they have failed or not) or whenever the generator needs to be shut down.
The following tree diagram shows the probabilities of a switch failing within one month of them both being replaced, assuming
the manufacturer’s claim is correct.

(a) Write down the values of a, b and c. [2]

(b) Hence find the probability that the generator needs to shut down within one month of the switches being
replaced. [1]

The owner of the generator is suspicious of the switch manufacturer’s claims, so they look back through the past 200 occasions
when the switches were replaced. The records show whether no switches, one switch or two switches had failed.

The data the owner collected are shown in the following table.

(c) Show that the expected value of no switches failing in the generator, during the last 200 occasions when the
switches were replaced, is 126. [2]

(d) Perform a $\chi^2$ goodness of fit test at the 5% significance level to test whether the manufacturer’s claims are correct
using the following hypotheses.

H0 : The manufacturer’s claims are correct.


H1 : The manufacturer’s claims are not both correct. [7]
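The expected frequencies needed for parts (c) and (d) follow from the claimed failure probabilities. The Python sketch below is one way to check the arithmetic, assuming the manufacturer's claim holds (0.1 for A, 0.3 for B, independent switches). The variable names are illustrative, and the observed frequencies from the table are not reproduced here.

```python
# Claimed failure probabilities, taken from the question.
p_a_fails = 0.1
p_b_fails = 0.3
occasions = 200

# Independence lets us multiply along the branches of the tree diagram.
p_none = (1 - p_a_fails) * (1 - p_b_fails)
p_both = p_a_fails * p_b_fails
p_one = 1 - p_none - p_both

expected = {
    "no switches fail": occasions * p_none,
    "one switch fails": occasions * p_one,
    "two switches fail": occasions * p_both,
}

for outcome, count in expected.items():
    print(f"{outcome}: {count:.1f}")
```

With three categories, a goodness of fit test on these expected values has 2 degrees of freedom.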

49. [Maximum mark: 16]


In a given week, the number of students in a particular primary school that were absent due to headlice (H), influenza (I) and/or
chickenpox (C) were recorded as follows.

The primary school has 500 students.

35 students had headlice only
20 students had influenza only
5 students had chickenpox only
4 students had headlice and influenza but not chickenpox
2 students had headlice and chickenpox but not influenza
3 students had influenza and chickenpox but not headlice
1 student had headlice, influenza and chickenpox

(a) Draw a Venn diagram to represent this information. [3]


(b) Calculate the number of students who did not have headlice or influenza or chickenpox. [2]

A student is chosen at random from all the students in the school.

(c) Find the probability that this student has

(c.i) headlice. [2]

(c.ii) influenza given that the student has headlice. [2]

Diego is a teacher in the school. He believes that the number of students, n, who have had influenza during the first t days of the
school year, can be modelled by the function

$n(t) = 250 - 240(2)^{kt}$, $k \in \mathbb{R}$.

(d) Use Diego’s model to calculate the number of students who started the school year with influenza. [2]

It is known that 130 students have had influenza during the first 10 days of the school year.

(e) Find the value of k. [2]

(f) Using this model, calculate how many days it will take for 200 students to have had influenza since the start of the
school year. [2]

By the last day of the school year, it is known that 300 students have had influenza.

(g) Comment on the appropriateness of Diego’s model. [1]
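The counting in parts (b) and (c) and the algebra in parts (d) to (f) can be sketched in Python. This is a minimal check, assuming Diego's model reads n(t) = 250 − 240 · 2^(kt). The dictionary keys and helper names are illustrative.

```python
import math

school_size = 500

# Counts for each region of the Venn diagram, as listed in the question.
regions = {
    "H only": 35, "I only": 20, "C only": 5,
    "H and I only": 4, "H and C only": 2, "I and C only": 3,
    "H, I and C": 1,
}
absent = sum(regions.values())
unaffected = school_size - absent

# Every region whose label mentions H lies inside the headlice circle.
headlice = sum(count for name, count in regions.items() if "H" in name)
headlice_and_flu = regions["H and I only"] + regions["H, I and C"]
p_headlice = headlice / school_size
p_flu_given_headlice = headlice_and_flu / headlice


def n(t, k):
    """Diego's model, assumed to read n(t) = 250 - 240 * 2**(k*t)."""
    return 250 - 240 * 2 ** (k * t)


# n(10) = 130 gives 2**(10k) = 120/240.
k = math.log2(120 / 240) / 10
# n(t) = 200 gives 2**(k*t) = 50/240.
days_to_200 = math.log2(50 / 240) / k

print(unaffected, p_headlice, p_flu_given_headlice)
print(n(0, k), k, days_to_200)
```

Because k is negative under this reading, the term 240 · 2^(kt) shrinks towards zero and the model never exceeds 250, which is relevant to part (g).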

50. [Maximum mark: 19]


A recent study found that the heights of Dutch women can be modelled by a normal distribution with mean 170.7 cm and
standard deviation 6.3 cm.

A Dutch woman is chosen at random.

(a) Calculate the probability that her height is

(a.i) less than 160 cm. [2]

(a.ii) between 160 cm and 170 cm. [2]

27 % of Dutch women have a height of more than h metres.

(b) Calculate the value of h.

[2]

Janneke selects a random sample of 200 Dutch women from Amsterdam and measures their heights. She wants to determine
whether this sample could have been chosen from a normally distributed population with mean of 170.7 cm and standard
deviation of 6.3 cm.

She performs a $\chi^2$ goodness of fit test at the 5% significance level. She begins by creating the following frequency table.

(c) Calculate, correct to four significant figures, the value of

(c.i) a . [1]

(c.ii) b. [1]

The hypotheses for Janneke’s test are

H0 : the heights are drawn from a normally distributed population with mean 170.7 cm and standard deviation 6.3 cm

H1 : the heights are not drawn from a normally distributed population with mean 170.7 cm and standard deviation 6.3 cm

(d) Write down the degrees of freedom for this test. [1]

The critical value for this test is 7.815.

(e) Perform the $\chi^2$ goodness of fit test and state your conclusion, justifying your reasoning. [4]

Gundega claims that, on average, Latvian women are taller than Dutch women.

Random samples of 10 Latvian women and 10 Dutch women are chosen, and their heights are measured.

Gundega performs a t-test at the 5% significance level. It is assumed that the populations are normally distributed and have equal
variances.

(f) Write down the null and alternative hypotheses for this test. [2]

(g) Perform the t-test and state the conclusion, justifying your reasoning. [4]
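The normal probabilities in parts (a) and (b) can be checked without a graphic display calculator. The sketch below uses `statistics.NormalDist` from the Python standard library. It works in centimetres throughout, and the variable names are illustrative.

```python
from statistics import NormalDist

# Model from the question: mean 170.7 cm, standard deviation 6.3 cm.
heights = NormalDist(mu=170.7, sigma=6.3)

# (a.i) P(X < 160)
p_below_160 = heights.cdf(160)

# (a.ii) P(160 < X < 170)
p_160_to_170 = heights.cdf(170) - heights.cdf(160)

# (b) 27% of heights lie above h, so h is the 73rd percentile (in cm).
h_cm = heights.inv_cdf(1 - 0.27)

print(round(p_below_160, 4), round(p_160_to_170, 4), round(h_cm, 1))
```

The same `cdf` differences, multiplied by 200, would give the expected frequencies for the goodness of fit table in part (c), whose class boundaries are not reproduced here.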

51. [Maximum mark: 18]


The running times, t (minutes), of 200 family movies are recorded in the following table.

(a.i) Write down the mid-interval value of 70 ≤ t < 80. [1]

(a.ii) Calculate an estimate of the mean running time of the 200 movies. [2]

This table is used to create the following cumulative frequency graph.

(b) Use the cumulative frequency curve to estimate the interquartile range. [2]

“Star Feud” is a movie in the data set and its running time is 100 minutes.

(c) Use your answer to part (b) to estimate whether “Star Feud’s” running time is an outlier for this data. Justify your
answer. [3]

It is believed that the running times of family movies follow a normal distribution with mean 88 minutes and standard deviation
6.75 minutes.

It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 movies could have plausibly
been drawn from an underlying distribution $N(88, 6.75^2)$.

(d) Write down the null and the alternative hypotheses for the test. [2]

As part of the test, the following table is created.

(e.i) Find the value of a and the value of b. [4]

(e.ii) Hence, perform the test to a 5% significance level, clearly stating the conclusion in context. [4]

52. [Maximum mark: 18]

The heights, h, of 200 university students are recorded in the following table.

(a.i) Write down the mid-interval value of 140 ≤ h < 160. [1]

(a.ii) Calculate an estimate of the mean height of the 200 students. [2]

This table is used to create the following cumulative frequency graph.

(b) Use the cumulative frequency curve to estimate the interquartile range. [2]

Laszlo is a student in the data set and his height is 204 cm.

(c) Use your answer to part (b) to estimate whether Laszlo’s height is an outlier for this data. Justify your answer. [3]

It is believed that the heights of university students follow a normal distribution with mean 176 cm and standard deviation
13.5 cm.

It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 students could have
plausibly been drawn from an underlying distribution $N(176, 13.5^2)$.

(d) Write down the null and the alternative hypotheses for the test. [2]

As part of the test, the following table is created.


(e.i) Find the value of a and the value of b. [4]

(e.ii) Hence, perform the test to a 5% significance level, clearly stating the conclusion in context. [4]

53. [Maximum mark: 15]


The mean annual temperatures for Earth, recorded at fifty-year intervals, are shown in the table.

Year (x)                 1708  1758  1808  1858  1908  1958  2008
Temperature in °C (y)    8.73  9.22  9.10  9.12  9.13  9.45  9.76

Tami creates a linear model for this data by finding the equation of the straight line passing through the points with coordinates
(1708, 8.73) and (1958, 9.45).

(a) Calculate the gradient of the straight line that passes through these two points. [2]

(b.i) Interpret the meaning of the gradient in the context of the question. [1]

(b.ii) State appropriate units for the gradient. [1]

(c) Find the equation of this line giving your answer in the form y = mx + c. [2]

(d) Use Tami’s model to estimate the mean annual temperature in the year 2000. [2]

Thandizo uses linear regression to obtain a model for the data.

(e.i) Find the equation of the regression line y on x. [2]

(e.ii) Find the value of r, the Pearson’s product-moment correlation coefficient. [1]

(f) Use Thandizo’s model to estimate the mean annual temperature in the year 2000. [2]

Thandizo uses his regression line to predict the year when the mean annual temperature will first exceed 15 °C.

(g) State two reasons why Thandizo’s prediction may not be valid. [2]
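Both models in this question can be reproduced from the table. The Python sketch below computes Tami's two-point line and Thandizo's least squares line by hand, using the standard formulas a = S_xy / S_xx and r = S_xy / sqrt(S_xx · S_yy). The variable names are illustrative.

```python
# Data copied from the table in the question.
years = [1708, 1758, 1808, 1858, 1908, 1958, 2008]
temps = [8.73, 9.22, 9.10, 9.12, 9.13, 9.45, 9.76]

# Tami's model: the straight line through the 1708 and 1958 points.
m_tami = (9.45 - 8.73) / (1958 - 1708)
c_tami = 8.73 - m_tami * 1708
tami_2000 = m_tami * 2000 + c_tami

# Thandizo's model: least squares regression of y on x.
n = len(years)
x_bar = sum(years) / n
y_bar = sum(temps) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_yy = sum((y - y_bar) ** 2 for y in temps)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, temps))

a = s_xy / s_xx          # gradient of the regression line
b = y_bar - a * x_bar    # intercept of the regression line
r = s_xy / (s_xx * s_yy) ** 0.5
thandizo_2000 = a * 2000 + b

print(f"Tami:     y = {m_tami:.5f}x + {c_tami:.3f}, y(2000) = {tami_2000:.2f}")
print(f"Thandizo: y = {a:.5f}x + {b:.3f}, r = {r:.3f}, y(2000) = {thandizo_2000:.2f}")
```

The year 2000 lies inside the range of the data, so both estimates are interpolations. Predicting when the temperature reaches 15 °C would extend either line far beyond the data.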

54. [Maximum mark: 17]


It is claimed that a new remedy cures 82% of the patients with a particular medical problem.

This remedy is to be used by 115 patients, and it is assumed that the 82% claim is true.
(a) Find the probability that exactly 90 of these patients will be cured. [3]

(b) Find the probability that at least 95 of these patients will be cured. [2]

(c) Find the variance in the possible number of patients that will be cured. [2]

The probability that at least n patients will be cured is less than 30%.

(d) Find the least value of n. [3]

A clinic is interested to see if the mean recovery time of their patients who tried the new remedy is less than that of their patients
who continued with an older remedy. The clinic randomly selects some of their patients and records their recovery time in days.
The results are shown in the table below.

The data is assumed to follow a normal distribution and the population variance is the same for the two groups. A t-test is used to
compare the means of the two groups at the 10% significance level.

(e) State the appropriate null and alternative hypotheses for this t-test. [2]

(f) Find the p-value for this test. [2]

(g) State the conclusion for this test. Give a reason for your answer. [2]

(h) Explain what the p-value represents. [1]
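Parts (a) to (d) rest on a binomial model with 115 trials and success probability 0.82. The sketch below evaluates it with `math.comb`, assuming each patient is cured independently of the others. The helper names `pmf` and `at_least` are illustrative.

```python
from math import comb

n_patients = 115
p_cure = 0.82


def pmf(k, n=n_patients, p=p_cure):
    """Binomial probability of exactly k cures among n patients."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def at_least(k, n=n_patients):
    """Probability of k or more cures."""
    return sum(pmf(j, n) for j in range(k, n + 1))


p_exactly_90 = pmf(90)                         # part (a)
p_at_least_95 = at_least(95)                   # part (b)
variance = n_patients * p_cure * (1 - p_cure)  # part (c): np(1 - p)

# Part (d): the smallest n with P(X >= n) < 0.30.
least_n = next(k for k in range(n_patients + 1) if at_least(k) < 0.30)

print(p_exactly_90, p_at_least_95, variance, least_n)
```

The t-test in parts (e) to (h) needs the recovery times from the table, which are not reproduced here.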

55. [Maximum mark: 17]


Elsie, a librarian, wants to investigate the length of time, T minutes, that people spent in her library on a particular day.

(a) State whether the variable T is discrete or continuous. [1]

Elsie’s data for 160 people who visited the library on that particular day is shown in the following table.

(b) Find the value of k. [2]

(c.i) Write down the modal class. [1]

(c.ii) Write down the mid-interval value for this class. [1]

(d) Use Elsie’s data to calculate an estimate of the mean time that people spent in the library. [2]
(e) Using the table, write down the maximum possible number of people who spent 35 minutes or less in the library
on that day. [1]

Elsie assumes her data to be representative of future visitors to the library.

(f) Find the probability a visitor spends at least 60 minutes in the library. [2]

The following box and whisker diagram shows the times, in minutes, that the 160 visitors spent in the library.

(g) Write down the median time spent in the library. [1]

(h) Find the interquartile range. [2]

(i) Hence show that the longest time that a person spent in the library is not an outlier. [3]

Elsie believes the box and whisker diagram indicates that the times spent in the library are not normally distributed.

(j) Identify one feature of the box and whisker diagram which might support Elsie’s belief. [1]

56. [Maximum mark: 16]


At Mirabooka Primary School, a survey found that 68% of students have a dog and 36% of students have a cat. 14% of students
have both a dog and a cat.

This information can be represented in the following Venn diagram, where m, n, p and q represent the percentage of students
within each region.

Find the value of

(a.i) m. [1]

(a.ii) n. [1]

(a.iii) p. [1]

(a.iv) q. [1]

(b) Find the percentage of students who have a dog or a cat or both. [1]

Find the probability that a randomly chosen student

(c.i) has a dog but does not have a cat. [1]

(c.ii) has a dog given that they do not have a cat. [2]

Each year, one student is chosen randomly to be the school captain of Mirabooka Primary School.

Tim is using a binomial distribution to make predictions about how many of the next 10 school captains will own a dog. He
assumes that the percentages found in the survey will remain constant for future years and that the events “being a school captain”
and “having a dog” are independent.

Use Tim’s model to find the probability that in the next 10 years

(d.i) 5 school captains have a dog. [2]

(d.ii) more than 3 school captains have a dog. [2]

(d.iii) exactly 9 school captains in succession have a dog. [3]

John randomly chooses 10 students from the survey.

(e) State why John should not use the binomial distribution to find the probability that 5 of these students have a
dog. [1]
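The percentages in the four regions and the binomial probabilities in part (d) can be checked with a short Python sketch. The region names are illustrative, since the Venn diagram that assigns m, n, p and q to regions is not reproduced here. For part (d.iii) the sketch assumes that "exactly 9 in succession" means the single captain without a dog is either the first or the last of the ten.

```python
from math import comb

# Survey percentages from the question.
dog, cat, both = 68, 36, 14

dog_only = dog - both        # 54
cat_only = cat - both        # 22
either = dog + cat - both    # 90
neither = 100 - either       # 10

# (c.ii) P(dog | no cat): restrict to the students without a cat.
p_dog_given_no_cat = dog_only / (100 - cat)

# Tim's model: X ~ B(10, 0.68).
p = dog / 100


def pmf(k, n=10):
    """Binomial probability that exactly k of n captains have a dog."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_five = pmf(5)
p_more_than_three = 1 - sum(pmf(k) for k in range(4))
# Two arrangements: the captain without a dog comes first or last.
p_nine_in_a_row = 2 * p**9 * (1 - p)

print(dog_only, cat_only, either, neither, p_dog_given_no_cat)
print(p_five, p_more_than_three, p_nine_in_a_row)
```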

57. [Maximum mark: 16]


The scores of the eight highest scoring countries in the 2019 Eurovision song contest are shown in the following table.

For this data, find

(a.i) the upper quartile. [2]

(a.ii) the interquartile range. [2]

(b) Determine if the Netherlands’ score is an outlier for this data. Justify your answer. [3]

Chester is investigating the relationship between the highest-scoring countries’ Eurovision score and their population size to
determine whether population size can reasonably be used to predict a country’s score.
The populations of the countries, to the nearest million, are shown in the table.

Chester finds that, for this data, the Pearson’s product moment correlation coefficient is r = 0.249.
(c) State whether it would be appropriate for Chester to use the equation of a regression line for y on x to predict a
country’s Eurovision score. Justify your answer. [2]

Chester then decides to find the Spearman’s rank correlation coefficient for this data, and creates a table of ranks.

Write down the value of:

(d.i) a. [1]

(d.ii) b. [1]

(d.iii) c. [1]

(e.i) Find the value of the Spearman’s rank correlation coefficient $r_s$. [2]

(e.ii) Interpret the value obtained for $r_s$. [1]

(f) When calculating the ranks, Chester incorrectly read the Netherlands’ score as 478. Explain why the value of the
Spearman’s rank correlation $r_s$ does not change despite this error. [1]

58. [Maximum mark: 15]


The aircraft for a particular flight has 72 seats. The airline’s records show that historically for this flight only 90% of the people who
purchase a ticket arrive to board the flight. They assume this trend will continue and decide to sell extra tickets and hope that no
more than 72 passengers will arrive.

The number of passengers that arrive to board this flight is assumed to follow a binomial distribution with a probability of 0.9.
(a) The airline sells 74 tickets for this flight. Find the probability that more than 72 passengers arrive to board the
flight. [3]

(b.i) Write down the expected number of passengers who will arrive to board the flight if 72 tickets are sold. [2]

(b.ii) Find the maximum number of tickets that could be sold if the expected number of passengers who arrive to board
the flight must be less than or equal to 72. [2]

Each passenger pays $150 for a ticket. If too many passengers arrive, then the airline will give $300 in compensation to each
passenger that cannot board.

(c) Find, to the nearest integer, the expected increase or decrease in the money made by the airline if they decide to
sell 74 tickets rather than 72. [8]
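The overbooking calculation can be laid out step by step in Python. This is a sketch under two assumptions that the question leaves implicit: ticket income is kept whether or not a passenger arrives, and compensation is paid on top of it. The variable names are illustrative.

```python
from math import comb

seats = 72
ticket_price = 150
compensation = 300
p_arrive = 0.9


def pmf(k, n, p=p_arrive):
    """Binomial probability that exactly k of n ticket holders arrive."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# (a) More than 72 arrivals when 74 tickets are sold.
p_73 = pmf(73, 74)
p_74 = pmf(74, 74)
p_overbooked = p_73 + p_74

# (b) Expected arrivals n * p, and the largest n with n * 0.9 <= 72.
expected_arrivals_72 = 72 * p_arrive
max_tickets = seats * 10 // 9  # integer arithmetic avoids float rounding

# (c) Two extra tickets bring in 2 * 150. Compensation goes to one
# passenger if 73 arrive and to two passengers if 74 arrive.
extra_income = 2 * ticket_price
expected_compensation = compensation * (1 * p_73 + 2 * p_74)
expected_change = extra_income - expected_compensation

print(p_overbooked, expected_arrivals_72, max_tickets, round(expected_change))
```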

59. [Maximum mark: 17]


Mackenzie conducted an experiment on the reaction times of teenagers. The results of the experiment are displayed in the
following cumulative frequency graph.

Use the graph to estimate the

(a.i) median reaction time. [1]

(a.ii) interquartile range of the reaction times. [3]

(b) Find the estimated number of teenagers who have a reaction time greater than 0.4 seconds. [2]

(c) Determine the 90th percentile of the reaction times from the cumulative frequency graph. [2]

Mackenzie created the cumulative frequency graph using the following grouped frequency table.
(d.i) Write down the value of a. [1]

(d.ii) Write down the value of b. [1]

(e) Write down the modal class from the table. [1]

(f ) Use your graphic display calculator to find an estimate of the mean reaction time. [2]

Upon completion of the experiment, Mackenzie realized that some values were grouped incorrectly in the frequency table. Some
reaction times recorded in the interval 0 < t ≤ 0.2 should have been recorded in the interval 0.2 < t ≤ 0.4.

(g) Suggest how, if at all, the estimated mean and estimated median reaction times will change if the errors are
corrected. Justify your response. [4]

60. [Maximum mark: 16]


A group of 1280 students were asked which electronic device they preferred. The results per age group are given in the following
table.

A student from the group is chosen at random. Calculate the probability that the student

(a.i) prefers a tablet. [2]

(a.ii) is 11–13 years old and prefers a mobile phone. [2]

(a.iii) prefers a laptop given that they are 17–18 years old. [2]

(a.iv) prefers a tablet or is 14–16 years old. [3]

A $\chi^2$ test for independence was performed on the collected data at the 1% significance level. The critical value for the test is
13.277.

(b) State the null and alternative hypotheses. [1]

(c) Write down the number of degrees of freedom. [1]

(d.i) Write down the $\chi^2$ test statistic. [2]

(d.ii) Write down the p-value. [1]

(d.iii) State the conclusion for the test in context. Give a reason for your answer. [2]

61. [Maximum mark: 14]


Arianne plays a game of darts.

The distance that her darts land from the centre, O, of the board can be modelled by a normal distribution with mean 10 cm and
standard deviation 3 cm.

Find the probability that

(a.i) a dart lands less than 13 cm from O. [2]

(a.ii) a dart lands more than 15 cm from O. [1]

Each of Arianne’s throws is independent of her previous throws.

(b) Find the probability that Arianne throws two consecutive darts that land more than 15 cm from O. [2]

In a competition a player has three darts to throw on each turn. A point is scored if a player throws all three darts to land within a
central area around O. When Arianne throws a dart the probability that it lands within this area is 0.8143.

(c) Find the probability that Arianne does not score a point on a turn of three darts. [2]

In the competition Arianne has ten turns, each with three darts.

(d.i) Find the probability that Arianne scores at least 5 points in the competition. [3]

(d.ii) Find the probability that Arianne scores at least 5 points and less than 8 points. [2]

(d.iii) Given that Arianne scores at least 5 points, find the probability that Arianne scores less than 8 points. [2]
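This question chains a normal model, independence and a binomial model. The Python sketch below follows that chain, assuming (as the question states) that throws are independent, so the ten turns are independent as well. The names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Distance from O: mean 10 cm, standard deviation 3 cm.
distance = NormalDist(mu=10, sigma=3)

p_within_13 = distance.cdf(13)       # (a.i)
p_beyond_15 = 1 - distance.cdf(15)   # (a.ii)
p_two_beyond_15 = p_beyond_15 ** 2   # (b) independent throws

# (c) A point needs all three darts inside the central area.
p_area = 0.8143
p_point = p_area ** 3
p_no_point = 1 - p_point


def pmf(k, n=10, p=p_point):
    """Binomial probability of exactly k points in n turns."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_at_least_5 = sum(pmf(k) for k in range(5, 11))   # (d.i)
p_5_to_7 = sum(pmf(k) for k in range(5, 8))        # (d.ii)
p_under_8_given_5_plus = p_5_to_7 / p_at_least_5   # (d.iii)

print(p_within_13, p_beyond_15, p_two_beyond_15, p_no_point)
print(p_at_least_5, p_5_to_7, p_under_8_given_5_plus)
```

Part (d.iii) reuses the two previous answers because the event "at least 5 and less than 8" is contained in the event "at least 5".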

62. [Maximum mark: 13]


The stopping distances for bicycles travelling at 20 km h$^{-1}$ are assumed to follow a normal distribution with mean 6.76 m and
standard deviation 0.12 m.

Under this assumption, find, correct to four decimal places, the probability that a bicycle chosen at random travelling at 20 km h$^{-1}$
manages to stop

(a.i) in less than 6.5 m. [2]

(a.ii) in more than 7 m. [1]

1000 randomly selected bicycles are tested and their stopping distances when travelling at 20 km h$^{-1}$ are measured.

Find, correct to four significant figures, the expected number of bicycles tested that stop between

(b.i) 6.5 m and 6.75 m. [2]

(b.ii) 6.75 m and 7 m. [1]

The measured stopping distances of the 1000 bicycles are given in the table.

It is decided to perform a $\chi^2$ goodness of fit test at the 5% level of significance to decide whether the stopping distances of
bicycles travelling at 20 km h$^{-1}$ can be modelled by a normal distribution with mean 6.76 m and standard deviation 0.12 m.

(c) State the null and alternative hypotheses. [2]

(d) Find the p-value for the test. [3]

(e) State the conclusion of the test. Give a reason for your answer. [2]

63. [Maximum mark: 18]


As part of his mathematics exploration about classic books, Jason investigated the time taken by students in his school to read the
book The Old Man and the Sea. He collected his data by stopping and asking students in the school corridor, until he reached his target of
10 students from each of the literature classes in his school.

(a) State which of the two sampling methods, systematic or quota, Jason has used. [1]

Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this
book.
(b) Write down the median time to read the book. [1]

(c) Calculate the interquartile range. [2]

Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.

(d) Determine whether Jason is correct. Support your reasoning. [4]

For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea (x), measured in hours, and paired this with
their percentage score on the final exam (y). These data are represented on the scatter diagram.

(e) Describe the correlation. [1]

Jason correctly calculates the equation of the regression line y on x for these students to be

y = −1.54x + 98.8.

He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1.5 hours.

(f) Find the percentage score calculated by Jason. [2]

(g) State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer. [2]

Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the
number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.

Jason intends to analyse the data using Spearman’s rank correlation coefficient, $r_s$.

(h) Copy and complete the information in the following table.

[2]

(i.i) Calculate the value of $r_s$. [2]

(i.ii) Interpret your result. [1]


64. [Maximum mark: 14]

It is known that the weights of male Persian cats are normally distributed with mean 6.1 kg and variance $0.5^2$ kg$^2$.

(a) Sketch a diagram showing the above information. [2]

(b) Find the proportion of male Persian cats weighing between 5.5 kg and 6.5 kg. [2]

A group of 80 male Persian cats are drawn from this population.

(c) Determine the expected number of cats in this group that have a weight of less than 5.3 kg. [3]

(d) It is found that 12 of the cats weigh more than x kg. Estimate the value of x. [3]

(e) Ten of the cats are chosen at random. Find the probability that exactly one of them weighs over 6.25 kg. [4]
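Parts (b) to (e) can be checked with `statistics.NormalDist` and a single binomial term. The sketch assumes the standard deviation is 0.5 kg (the square root of the stated variance) and, for part (e), treats the ten cats as independent draws from the population. The names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Variance 0.5^2 kg^2 means a standard deviation of 0.5 kg.
weights = NormalDist(mu=6.1, sigma=0.5)

# (b) Proportion between 5.5 kg and 6.5 kg.
p_55_to_65 = weights.cdf(6.5) - weights.cdf(5.5)

# (c) Expected number out of 80 weighing less than 5.3 kg.
expected_under_53 = 80 * weights.cdf(5.3)

# (d) 12 of 80 weigh more than x, so x is the (1 - 12/80) quantile.
x_estimate = weights.inv_cdf(1 - 12 / 80)

# (e) Exactly one of ten cats weighs over 6.25 kg.
p_over_625 = 1 - weights.cdf(6.25)
p_exactly_one = comb(10, 1) * p_over_625 * (1 - p_over_625) ** 9

print(p_55_to_65, expected_under_53, x_estimate, p_exactly_one)
```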

65. [Maximum mark: 19]


A medical centre is testing patients for a certain disease. This disease occurs in 5% of the population.

They test every patient who comes to the centre on a particular day.

(a) State the sampling method being used. [1]

It is intended that if a patient has the disease, they test “positive”, and if a patient does not have the disease, they test “negative”.

However, the tests are not perfect, and only 99% of people who have the disease test positive. Also, 2% of people who do not
have the disease test positive.

The tree diagram shows some of this information.

Write down the value of

(b.i) a. [1]

(b.ii) b. [1]

(b.iii) c. [1]

(b.iv) d. [1]

Use the tree diagram to find the probability that a patient selected at random

(c.i) will not have the disease and will test positive. [2]

(c.ii) will test negative. [3]

(c.iii) has the disease given that they tested negative. [3]
(d) The medical centre finds the actual number of positive results in their sample is different than predicted by the
tree diagram. Explain why this might be the case. [1]

The staff at the medical centre looked at the care received by all visiting patients on a randomly chosen day. All the patients
received at least one of these services: they had medical tests (M), were seen by a nurse (N), or were seen by a doctor (D). It was
found that:

78 had medical tests;
45 were seen by a nurse;
30 were seen by a doctor;
9 had medical tests and were seen by a doctor and a nurse;
18 had medical tests and were seen by a doctor but were not seen by a nurse;
11 patients were seen by a nurse and had medical tests but were not seen by a doctor;
2 patients were seen by a doctor without being seen by a nurse and without having medical tests.

(e) Draw a Venn diagram to illustrate this information, placing all relevant information on the diagram. [3]

(f) Find the total number of patients who visited the centre during this day. [2]
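The tree diagram probabilities in part (c) and the Venn diagram counts in parts (e) and (f) can both be worked through in Python. The sketch uses only the figures quoted in the question. The variable names are illustrative, and the labels a to d are not used because the tree diagram is not reproduced here.

```python
# Tree diagram: prevalence and test accuracy from the question.
p_disease = 0.05
p_pos_given_disease = 0.99
p_pos_given_healthy = 0.02

# (c.i) No disease and a positive test.
p_healthy_and_pos = (1 - p_disease) * p_pos_given_healthy

# (c.ii) Negative test: add the two branches that end in "negative".
p_disease_and_neg = p_disease * (1 - p_pos_given_disease)
p_healthy_and_neg = (1 - p_disease) * (1 - p_pos_given_healthy)
p_negative = p_disease_and_neg + p_healthy_and_neg

# (c.iii) Conditional probability P(disease | negative).
p_disease_given_neg = p_disease_and_neg / p_negative

# Venn diagram: fill the regions from the inside out.
m_total, n_total, d_total = 78, 45, 30
all_three = 9
m_and_d_only = 18
m_and_n_only = 11
d_only = 2

n_and_d_only = d_total - all_three - m_and_d_only - d_only
m_only = m_total - all_three - m_and_d_only - m_and_n_only
n_only = n_total - all_three - m_and_n_only - n_and_d_only

total_patients = (m_only + n_only + d_only + m_and_d_only
                  + m_and_n_only + n_and_d_only + all_three)

print(p_healthy_and_pos, p_negative, p_disease_given_neg)
print(n_and_d_only, m_only, n_only, total_patients)
```

Every patient received at least one service, so no one lies outside the three circles and the total is the sum of the seven regions.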

© International Baccalaureate Organization, 2025
