Topic4 Revision Session Worksheet (1)
1. [Maximum mark: 6]
The formula F = 1.8C + 32 is used to convert a temperature in degrees Celsius, C, to degrees Fahrenheit, F.
(a.i) Find a formula for converting a temperature in degrees Fahrenheit to degrees Celsius. [2]
(a.ii) Find the temperature in degrees Celsius that is recorded as 77 degrees Fahrenheit. [1]
Over one year, the mean daily temperature in Mexico City was calculated to be 17 degrees Celsius with a standard deviation of 9
degrees Celsius.
(b.ii) the standard deviation of the daily temperature in Mexico City. [2]
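As a check on the rearrangement in part (a) and on how a linear conversion acts on summary statistics, a short Python sketch follows. The function and variable names are illustrative only, and it assumes that part (b) asks for the Mexico City statistics in degrees Fahrenheit (the stem of part (b) is not shown above).

```python
def celsius_to_fahrenheit(c):
    """Formula given in the question: F = 1.8C + 32."""
    return 1.8 * c + 32


def fahrenheit_to_celsius(f):
    """Rearranged formula: C = (F - 32) / 1.8."""
    return (f - 32) / 1.8


# Part (a.ii): 77 degrees Fahrenheit expressed in degrees Celsius.
temp_c = fahrenheit_to_celsius(77)

# Assumed part (b): convert the Celsius statistics to Fahrenheit.
# The mean goes through the full formula; the standard deviation is
# only scaled by 1.8, because adding 32 shifts every value equally.
mean_f = celsius_to_fahrenheit(17)
sd_f = 1.8 * 9

print(temp_c, mean_f, sd_f)
```

The point to notice is that the added constant 32 changes the mean but leaves the spread untouched.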
2. [Maximum mark: 7]
The following data show the heights, in metres, of six players in a basketball team.
A new player, Gheorghe, joins the team. Their height is measured as 1.98 metres to the nearest centimetre.
3. [Maximum mark: 6]
A teacher surveys their students to find out if they have eaten at the local Thai and Indian cafés. The results of the survey are shown
in the following Venn diagram.
(a) Write down the number of students surveyed. [1]
(b) Write down the number of students who have not eaten at the Indian café. [1]
(c) Find the probability this student has eaten at both the Thai café and the Indian café. [1]
(e) State whether the events T and I are mutually exclusive. Justify your answer. [2]
4. [Maximum mark: 8]
Zac raises funds for a library by running a game where players spin a needle. The final position of the needle results in an outcome
where a player wins or loses money. The outcomes, with associated probabilities, are shown in the following diagram.
(b.ii) Explain why Zac expects to raise money from the games Emily plays. [1]
5. [Maximum mark: 6]
Jerry makes handcrafted chocolates. On average, 1 in 25 of the chocolates that Jerry makes is flawed. Whether or not a chocolate is
flawed is independent of all other chocolates.
Jerry sells the perfect chocolates for 50 pesos each and the flawed ones for 15 pesos each.
(b) Calculate the expected number of pesos Jerry makes from selling a batch of 20 randomly selected chocolates. [2]
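One way to check the expected income in part (b) is to weight each price by its probability and scale up to the batch. The sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

p_flawed = Fraction(1, 25)   # 1 in 25 chocolates is flawed
price_perfect = 50           # pesos
price_flawed = 15            # pesos
batch_size = 20

# Expected income from one chocolate, then from the whole batch.
expected_per_chocolate = (1 - p_flawed) * price_perfect + p_flawed * price_flawed
expected_batch_income = batch_size * expected_per_chocolate

print(expected_batch_income)
```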
6. [Maximum mark: 7]
The prices, in dollars, of 10 different garden chairs are:
7. [Maximum mark: 5]
Sunita sorts 300 peppers into sizes of small, medium or large. Some peppers are red, some are green, and some are yellow.
(b.i) Calculate $\chi^2_{\text{calc}}$. [2]
(b.ii) State a conclusion to the test. Give a reason for your answer. [2]
8. [Maximum mark: 7]
Gustav plays a game in which he first tosses an unbiased coin and then rolls an unbiased six-sided die.
If the coin shows tails, the score on the die is Gustav’s final number of points.
If the coin shows heads, one is added to the score on the die for Gustav’s final number of points.
(a) Find the probability that Gustav’s final number of points is 7. [2]
[3]
(c) Calculate the expected value of Gustav’s final number of points. [2]
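The rules of Gustav's game can be enumerated directly, which gives a way to check parts (a) and (c). This is a sketch only; the names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

p_coin_side = Fraction(1, 2)   # unbiased coin
p_die_face = Fraction(1, 6)    # unbiased six-sided die

# Build the distribution of the final number of points.
distribution = defaultdict(Fraction)
for bonus in (0, 1):           # 0 if tails, 1 if heads
    for die in range(1, 7):
        distribution[die + bonus] += p_coin_side * p_die_face

p_seven = distribution[7]
expected_points = sum(points * prob for points, prob in distribution.items())

print(p_seven, expected_points)
```

A final score of 7 is only possible with heads followed by a six, which is why it has the smallest non-zero probability.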
9. [Maximum mark: 7]
Billy is a keen walker who keeps a record of his performance. The following table shows the time, in minutes, it takes him to walk
one kilometre up hills with different gradients. The gradient of each hill is constant.
[2]
(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]
(b) Estimate the time it will take Billy to walk one kilometre up the hill. [2]
This morning, Billy walked one kilometre up a hill, and it took 22 minutes.
(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]
(a) Identify which two of the following statements must be true according to the box and whisker diagram. Indicate
your choices by placing tick marks in the second column of the following table.
[2]
At the end of the year, Mrs Whitehouse surveyed a random sample of students from each of her two large classes to determine how
satisfied they were with her teaching.
Each student independently selected a value from 1 to 10, with 1 meaning that they were not satisfied at all and 10 meaning that
they were very satisfied.
Mrs Whitehouse believes that there was no difference in the general satisfaction between the two classes. She assumes that the
data is drawn from a population that can be modelled by a normal distribution and proposes to conduct a t-test at the 5 %
significance level.
(b) Write down the null and alternative hypotheses for her test. [2]
(d) Write down the conclusion to the test. Give a reason for your answer. [2]
11. [Maximum mark: 7]
Rita is playing a game. In the game, she must roll a fair six-sided die. If she gets a five or six then she wins a prize. If not, then she has
another chance but this time she must flip a fair coin which will result in the coin landing on heads or tails. If the coin lands on
heads, then Rita wins a prize.
(a) Complete the tree diagram by writing in the three missing probabilities.
[2]
(b) Find the probability that Rita does not win a prize. [2]
(c) Given that Rita won a prize, find the probability that she got a five or six when she rolled the die. [3]
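The tree diagram described in this question can be checked numerically. The sketch below follows the two branches (die, then coin) and applies the conditional probability formula for part (c); the variable names are illustrative.

```python
from fractions import Fraction

p_win_on_die = Fraction(2, 6)   # rolling a five or a six
p_heads = Fraction(1, 2)        # fair coin

# Rita reaches the coin only if the die roll did not win.
p_win_on_coin = (1 - p_win_on_die) * p_heads
p_win = p_win_on_die + p_win_on_coin

p_no_prize = 1 - p_win                   # part (b)
p_die_given_win = p_win_on_die / p_win   # part (c): P(five or six | prize)

print(p_no_prize, p_die_given_win)
```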
(a) Find the probability that Nicole’s car starts on exactly three mornings in a particular 5 day workweek. [2]
Nicole walks to work on mornings when her car does not start and it is not raining. Nicole takes the bus to work on mornings when
her car does not start and it is raining.
Where Nicole lives, there is a 42 % probability of rain on any given morning, independent of any other morning. The probability of
Nicole’s car starting is independent of the weather.
(b) Find the probability that Nicole will not have to take the bus in a particular workweek. [4]
He begins to investigate this belief by randomly selecting eight musical artists and collecting data on the number of followers each
of the artists has on a particular social media platform. He then collects data on the number of albums each artist sold in the first
week after releasing an album. His data is shown in Table 1.
Thurston decides to calculate the Spearman’s rank correlation coefficient.
(a) Complete the table of ranks shown in Table 2.
Table 2
[1]
Thurston believes that artists with a higher number of social media followers sell more albums in the first week. He carries out a
hypothesis test using a 10 % significance level with the following null hypothesis:
$H_0$: In the population, there is no monotonic relationship between the number of social media followers and the number of
albums sold in the first week.
(d) State the conclusion of the hypothesis test, giving a reason. [2]
(a.ii) Describe the correlation between T and G with reference to the value of r, the Pearson’s product-moment
correlation coefficient. [2]
This morning, Joel rode one kilometre up a hill, and it took 22 minutes.
(c) Explain why it would be inappropriate to use the equation found in part (a) to estimate the gradient of this hill. [1]
The table shows results for these two events at the World Championships.
Athlete’s Country   Long Jump (m)   High Jump (m)   Long Jump Rank   High Jump Rank
Germany             7.64            2.11            1
France              7.52            2.08            2
Estonia             7.49            1.84            3
Canada              7.44            2.02            4
Netherlands         7.33            2.05            5
Ukraine             7.28            2.02            6
Algeria             7.22            1.90            7
Austria             7.11            1.87            8
Grenada             6.98            1.99            9
Japan               6.64            1.96            10
The Spearman’s rank correlation coefficient is used to determine if there is a linear correlation between an athlete’s ranking in long
jump and their ranking in high jump.
(a) Complete the table to show the athletes’ rankings in high jump. [2]
(b) Find the value of the Spearman’s rank correlation coefficient $r_s$. [2]
The following guide is used by the coach to determine the strength of the correlation between the ranks for long jump and high
jump.
$|r_s|$          Strength
0.000 to 0.199   Very weak
0.200 to 0.399   Weak
0.400 to 0.599   Moderate
0.600 to 0.799   Strong
0.800 to 1.000   Very strong
(c) State the strength of the correlation between the rankings as indicated by the table and interpret this in the
context of the question. [2]
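For checking work on this question, the ranking and the coefficient can be sketched in Python. The sketch assumes the usual convention that tied values share the mean of the ranks they occupy, and that Spearman's coefficient is Pearson's coefficient applied to the ranks; the helper names are illustrative and not part of the question.

```python
from math import sqrt


def average_ranks(values):
    """Rank from largest (rank 1) to smallest; ties share the mean rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first position occupied by v
        count = ordered.count(v)       # number of tied entries
        ranks.append(first + (count - 1) / 2)
    return ranks


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Data in table order: Germany, France, Estonia, Canada, Netherlands,
# Ukraine, Algeria, Austria, Grenada, Japan.
long_jump = [7.64, 7.52, 7.49, 7.44, 7.33, 7.28, 7.22, 7.11, 6.98, 6.64]
high_jump = [2.11, 2.08, 1.84, 2.02, 2.05, 2.02, 1.90, 1.87, 1.99, 1.96]

long_ranks = average_ranks(long_jump)
high_ranks = average_ranks(high_jump)
r_s = pearson(long_ranks, high_ranks)

print(high_ranks)
print(round(r_s, 3))
```

Canada and Ukraine both jumped 2.02 m, so they share the rank 4.5 rather than taking ranks 4 and 5.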
16. [Maximum mark: 6]
Carys believes that, on a memory retention test, the mean score of bilingual people ($\mu_b$) will be higher than the mean score of
monolingual people ($\mu_m$). Carys gave a memory retention test to a random sample of students in her class. The results are shown in
Carys performs a one-tailed t-test at a 5% level of significance. It is assumed that the scores are normally distributed and the
samples have equal variances.
(c) State the conclusion of the test in the context of the question. Justify your answer. [2]
A speed of 75.7 km h$^{-1}$ is two standard deviations from the mean.
(a) Find the standard deviation for the speed of the cars. [2]
Speeding tickets are issued to all drivers travelling at a speed greater than 72 km h$^{-1}$.
(b) Find the probability that a randomly selected driver who passes the speed camera receives a speeding ticket. [2]
(c) Show that the region of the normal distribution between p and q is not symmetrical about the mean. [3]
Athlete A B C D E F G H
Age (years) 13 17 22 18 19 25 11 36
Time (seconds) 13.4 14.6 13.4 12.9 12.0 11.8 17.0 13.1
Sung-Jin decides to calculate the Spearman’s rank correlation coefficient for his set of data.
Athlete A B C D E F G H
Age rank 3
Time rank 1 [2]
(d) Suggest a mathematical reason why Sung-Jin may have decided not to use Pearson’s product-moment
correlation coefficient with his data from the original table. [1]
Grade 1 2 3 4 5 6 7
Frequency 1 4 7 9 p 9 4
Their quality assurance team randomly selects 500 items of food to inspect. The quality of this food is classified as perfect,
satisfactory, or poor. The data is summarized in the following table.
(a) Find the probability that its quality is not perfect, given that it is from breakfast. [2]
A $\chi^2$ test at the 5% significance level is carried out to determine if there is significant evidence of a difference in the quality of the
(c) State, with justification, the conclusion for this test. [2]
(a) Calculate the probability that the length of the seed is less than 3.7 cm. [2]
It is known that 30% of the seeds have a length greater than k cm.
x          0     1    2  3     4   5
P(X = x)   0.15  0.2  k  0.16  2k  0.25
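Because the probabilities of a discrete distribution must sum to 1, the value of k in the table is determined by the other entries. A minimal sketch of that check, with illustrative names:

```python
from fractions import Fraction

# Known probabilities from the table, for x = 0, 1, 3 and 5.
known = [Fraction("0.15"), Fraction("0.2"), Fraction("0.16"), Fraction("0.25")]

# The remaining entries are k (x = 2) and 2k (x = 4), so 3k fills the gap.
k = (1 - sum(known)) / 3

probabilities = {
    0: Fraction("0.15"),
    1: Fraction("0.2"),
    2: k,
    3: Fraction("0.16"),
    4: 2 * k,
    5: Fraction("0.25"),
}

print(float(k))
```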
The player has a chance to win money based on how many times they hit the target.
The gain for the player, in $, is shown in the following table, where a negative gain means that the player loses money.
x 0 1 2 3 4 5
(b) Determine whether this game is fair. Justify your answer. [3]
24. [Maximum mark: 7]
The following Venn diagram shows two independent events, R and S . The values in the diagram represent probabilities.
The times he takes to complete a lap are normally distributed with mean 59 seconds and standard deviation 3 seconds.
(a) Find the probability that Roy completes a lap in less than 55 seconds. [2]
Roy will complete a 20 lap race. It is expected that 8.6 of the laps will take more than t seconds.
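The normal model for Roy's lap times can be explored with Python's standard library. The sketch below checks part (a) and, assuming the following part asks for the value of t, converts the expected count of 8.6 laps out of 20 into a probability before using the inverse normal. The variable names are illustrative.

```python
from statistics import NormalDist

lap_time = NormalDist(mu=59, sigma=3)

# Part (a): probability of a lap in less than 55 seconds.
p_under_55 = lap_time.cdf(55)

# Expected number of slow laps = 20 * P(T > t), so P(T > t) = 8.6 / 20.
p_slower_than_t = 8.6 / 20
t = lap_time.inv_cdf(1 - p_slower_than_t)

print(round(p_under_55, 4), round(t, 2))
```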
(a) Taizo plays two games that are independent of each other. Find the probability that Taizo knocks over a total of
two bottles. [4]
In any given game, Taizo will win k points if he knocks over two bottles, win 4 points if he knocks over one bottle and lose 8 points
if no bottles are knocked over.
(b) Find the value of k such that the game is fair. [3]
27. [Maximum mark: 5]
Sergio is interested in whether an adult’s favourite breakfast berry depends on their income level. He obtains the following data for
341 adults and decides to carry out a $\chi^2$ test for independence, at the 10% significance level.
(c) Write down Sergio’s conclusion to the test in context. Justify your answer. [2]
Annabelle uses these scores to conduct a two-tailed t-test to compare the means of the two classes, at the 5% level of significance.
It is assumed the examination scores for both classes have the same variance and are normally distributed.
The null hypothesis is $\mu_1 = \mu_2$, where $\mu_1$ is the mean examination score from Manny’s class and $\mu_2$ is the mean examination score
(b) Find the p-value for this test. Give your answer correct to five decimal places. [2]
(c) State whether Annabelle’s conclusion is correct. Give a reason for your answer. [2]
The number of trees to be planted in each of the first three months are shown in the following table.
(a) Find the number of trees to be planted in the 15th month. [3]
(b) Find the total number of trees to be planted in the first 15 months. [2]
(c) Find the mean number of trees planted per month during the first 15 months. [2]
(a) Write down the percentage of bags that weigh more than 500 g. [1]
A bag that weighs less than 495 g is rejected by the factory for being underweight.
(b) Find the probability that a randomly chosen bag is rejected for being underweight. [2]
(c) A bag that weighs more than k grams is rejected by the factory for being overweight. The factory rejects 2% of
bags for being overweight.
(e) State the conclusion of the test. Give a reason for your answer. [2]
[1]
(b) Find the probability that Karl takes two socks of the same colour. [2]
(c) Given that Karl has two socks of the same colour find the probability that he has two brown socks. [3]
To gather data, each driver was put in a car simulator and asked to either talk on a mobile phone or talk to a passenger. Each driver
was instructed to apply the brakes as soon as they saw a red light appear in front of the car. The reaction times of the drivers, in
seconds, were recorded, as shown in the following table.
At the 10% level of significance, a t-test was used to compare the mean reaction times of the two groups. Each data set is assumed
to be normally distributed, and the population variances are assumed to be the same.
Let $\mu_1$ and $\mu_2$ be the population means for the two groups. The null hypothesis for this test is $H_0: \mu_1 - \mu_2 = 0$.
(c.i) State the conclusion of the test. Justify your answer. [2]
25 33 51 62 63 63 70 74 79 79 81 88 90 90 98
For these data, the lower quartile is 62 and the upper quartile is 88.
(a) Show that the test score of 25 would not be considered an outlier. [3]
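The outlier check in part (a) can be sketched as follows, assuming the usual rule that a value is an outlier if it lies more than 1.5 × IQR below the lower quartile or above the upper quartile. The variable names are illustrative.

```python
scores = [25, 33, 51, 62, 63, 63, 70, 74, 79, 79, 81, 88, 90, 90, 98]

q1, q3 = 62, 88                   # quartiles stated in the question
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr      # values below this would be outliers

lowest_score = min(scores)
is_outlier = lowest_score < lower_fence

print(iqr, lower_fence, is_outlier)
```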
The box and whisker diagram showing these scores is given below.
[Box and whisker diagram; axis label: Test scores]
Another mathematics class is run by the college during the evening. A box and whisker diagram showing the scores from this class
for the same test is given below.
[Box and whisker diagram; axis label: Test scores]
A researcher reviews the box and whisker diagrams and believes that the evening class performed better than the morning class.
(b) With reference to the box and whisker diagrams, state one aspect that may support the researcher’s opinion and
one aspect that may counter it. [2]
(a) Find the probability that a randomly chosen applicant from this group was accepted by the university. [1]
An applicant is chosen at random from this group. It is found that they were accepted into the programme of their choice.
(b) Find the probability that the applicant applied for the Arts programme. [2]
(c) Two different applicants are chosen at random from the original group.
Find the probability that both applicants applied to the Arts programme. [3]
36. [Maximum mark: 7]
A polygraph test is used to determine whether people are telling the truth or not, but it is not completely accurate. When a person
tells the truth, they have a 20% chance of failing the test. Each test outcome is independent of any previous test outcome.
(a) Calculate the expected number of people who will pass this polygraph test. [2]
(b) Calculate the probability that exactly 4 people will fail this polygraph test. [2]
(c) Determine the probability that fewer than 7 people will pass this polygraph test. [3]
When Fuji apples are picked, they are classified as small, medium, large or extra large depending on their mass. Large apples have a
mass of between 172 g and 183 g.
(a) Determine the probability that a Fuji apple selected at random will be a large apple. [2]
Approximately 68% of Fuji apples have a mass within the medium-sized category, which is between k and 172 g.
(b) Find an expression, in terms of b, for the probability of a person not having blue eyes and having fair hair. [1]
(c.i) b. [2]
(c.ii) c . [1]
39. [Maximum mark: 6]
Eduardo believes that there is a linear relationship between the age of a male runner and the time it takes them to run 5000
metres.
To test this, he recorded the age, x years, and the time, t minutes, for eight males in a single 5000 m race. His results are presented
in the following table and scatter diagram.
(a) For this data, find the value of the Pearson’s product-moment correlation coefficient, r. [2]
Eduardo looked in a sports science text book. He found that the following information about r was appropriate for athletic
performance.
(b) Comment on your answer to part (a), using the information that Eduardo found. [1]
(c) Write down the equation of the regression line of t on x, in the form t = ax + b . [1]
Use the equation of the regression line to estimate the time he took to complete the 5000 m race. [2]
The students were awarded a grade from 1 to 5, depending on the score obtained in the exam. The number of students receiving
each grade is shown in the following table.
She recorded the weights of eggs, in grams, from a random selection of geese. The data is shown in the table.
In order to test her claim, Arriane performs a t-test at a 10% level of significance. It is assumed that the weights of eggs are normally
distributed and the samples have equal variances.
(a) State, in words, the null hypothesis. [1]
(c) State whether the result of the test supports Arriane’s claim. Justify your reasoning. [2]
[2]
To test the model, they record the number of copies sold each weekday during a particular week. This data is shown in the table.
A goodness of fit test at the 5% significance level is used on this data to determine whether the vendor’s model is suitable.
The critical value for the test is 9.49 and the hypotheses are
(a) Find an estimate for how many copies the vendor expects to sell each day. [1]
(b.i) Write down the degrees of freedom for this test. [1]
(b.ii) Write down the conclusion to the test. Give a reason for your answer. [4]
(c) Write down the conclusion to the test. Give a reason for your answer. [2]
(a.i) the minimum number of sick days taken during the year. [1]
(b) Paul claims that this box and whisker diagram can be used to infer that the percentage of employees who took
fewer than six sick days is smaller than the percentage of employees who took more than eleven sick days.
They took a random sample of six typical apartments along a train line in the city. Xavie obtained the data shown in the following
table.
(a) Write down the value of the Spearman’s rank correlation coefficient, $r_s$. [1]
(b.i) Find the Pearson’s product-moment correlation coefficient, r. [2]
(b.ii) Use your value of r to state which two of the following would best describe the correlation between the variables.
[2]
The relationship between the variables can be modelled by the regression equation $y = ax + b$.
(c.iii) According to this model, state in context what the value of b represents. [1]
(d) Xavie uses the regression equation to estimate the price of a typical apartment located 19.6 km from the city
centre.
(d.ii) State two reasons that Xavie might use to justify the validity of this estimate. [2]
To verify whether this relationship applies in a different direction from the city centre, Xavie considers two locations, A and B, both
an equal distance from the city centre. They take a random sample of seven apartments from each location and record the prices
(in millions of dollars) in the following tables.
Xavie conducts a t-test, at the 5 % level of significance, to see if the mean apartment price in location A is different to the mean
apartment price in location B. They assume the population variances are the same.
(g) State the conclusion of the test. Justify your answer. [2]
(h) State one additional assumption Xavie has made about the distributions to conduct this test. [1]
The manufacturer claims the probability of switch A failing within one month of being fitted is 0.1 and the probability of the
cheaper switch B failing within one month is 0.3. Whether or not a switch fails is independent of the state of the other switch.
If both switches fail, the generator needs to shut down to replace the switches. Both switches are replaced after a month of use
(whether they have failed or not) or whenever the generator needs to be shut down.
The following tree diagram shows the probabilities of a switch failing within one month of them both being replaced, assuming
the manufacturer’s claim is correct.
(b) Hence find the probability that the generator needs to shut down within one month of the switches being
replaced. [1]
The owner of the generator is suspicious of the switch manufacturer’s claims, so they look back through the past 200 occasions
when the switches were replaced. The records show whether no switches, one switch or two switches had failed.
The data the owner collected are shown in the following table.
(c) Show that the expected value of no switches failing in the generator, during the last 200 occasions when the
switches were replaced, is 126. [2]
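The expected frequencies implied by the manufacturer's claims follow from the independence of the two switches. The sketch below reproduces the value 126 quoted in part (c) and the other two expected frequencies; the variable names are illustrative.

```python
from fractions import Fraction

p_a_fails = Fraction(1, 10)   # manufacturer's claim for switch A
p_b_fails = Fraction(3, 10)   # manufacturer's claim for switch B
occasions = 200

# Independence lets the branch probabilities be multiplied.
p_none = (1 - p_a_fails) * (1 - p_b_fails)
p_both = p_a_fails * p_b_fails
p_one = 1 - p_none - p_both

# Expected counts for no, one and two switches failing.
expected = [occasions * p for p in (p_none, p_one, p_both)]

print(expected)
```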
(d) Perform a $\chi^2$ goodness of fit test at the 5 % significance level to test whether the manufacturer’s claims are correct
Diego is a teacher in the school. He believes that the number of students, n, who have had influenza during the first t days of the
school year, can be modelled by the function
(d) Use Diego’s model to calculate the number of students who started the school year with influenza. [2]
It is known that 130 students have had influenza during the first 10 days of the school year.
(f) Using this model, calculate how many days it will take for 200 students to have had influenza since the start of the
school year. [2]
By the last day of the school year, it is known that 300 students have had influenza.
[2]
Janneke selects a random sample of 200 Dutch women from Amsterdam and measures their heights. She wants to determine
whether this sample could have been chosen from a normally distributed population with mean of 170.7 cm and standard
deviation of 6.3 cm.
She performs a $\chi^2$ goodness of fit test at the 5 % significance level. She begins by creating the following frequency table.
(c) Calculate, correct to four significant figures, the value of
(c.i) a . [1]
(c.ii) b. [1]
$H_0$: the heights are drawn from a normally distributed population with mean 170.7 cm and standard deviation 6.3 cm
$H_1$: the heights are not drawn from a normally distributed population with mean 170.7 cm and standard deviation 6.3 cm
(d) Write down the degrees of freedom for this test. [1]
(e) Perform the $\chi^2$ goodness of fit test and state your conclusion, justifying your reasoning. [4]
Gundega claims that, on average, Latvian women are taller than Dutch women.
Random samples of 10 Latvian women and 10 Dutch women are chosen, and their heights are measured.
Gundega performs a t-test at the 5 % significance level. It is assumed that the populations are normally distributed and have equal
variances.
(f) Write down the null and alternative hypotheses for this test. [2]
(g) Perform the t-test and state the conclusion, justifying your reasoning. [4]
(a.ii) Calculate an estimate of the mean running time of the 200 movies. [2]
(b) Use the cumulative frequency curve to estimate the interquartile range. [2]
“Star Feud” is a movie in the data set and its running time is 100 minutes.
(c) Use your answer to part (b) to estimate whether “Star Feud’s” running time is an outlier for this data. Justify your
answer. [3]
It is believed that the running times of family movies follow a normal distribution with mean 88 minutes and standard deviation
6.75 minutes.
It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 movies could have plausibly
(d) Write down the null and the alternative hypotheses for the test. [2]
(e.ii) Hence, perform the test to a 5 % significance level, clearly stating the conclusion in context. [4]
52. [Maximum mark: 18]
The heights, h, of 200 university students are recorded in the following table.
(a.i) Write down the mid-interval value of 140 ≤ h < 160 . [1]
(a.ii) Calculate an estimate of the mean height of the 200 students. [2]
(b) Use the cumulative frequency curve to estimate the interquartile range. [2]
Laszlo is a student in the data set and his height is 204 cm.
(c) Use your answer to part (b) to estimate whether Laszlo’s height is an outlier for this data. Justify your answer. [3]
It is believed that the heights of university students follow a normal distribution with mean 176 cm and standard deviation
13.5 cm.
It is decided to perform a $\chi^2$ goodness of fit test on the data to determine whether this sample of 200 students could have
(d) Write down the null and the alternative hypotheses for the test. [2]
(e.ii) Hence, perform the test to a 5 % significance level, clearly stating the conclusion in context. [4]
Year °C (y) 8.73 9.22 9.10 9.12 9.13 9.45 9.76
Tami creates a linear model for this data by finding the equation of the straight line passing through the points with coordinates
(1708, 8.73) and (1958, 9.45).
(a) Calculate the gradient of the straight line that passes through these two points. [2]
(b.i) Interpret the meaning of the gradient in the context of the question. [1]
(c) Find the equation of this line giving your answer in the form y = mx + c . [2]
(d) Use Tami’s model to estimate the mean annual temperature in the year 2000. [2]
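Tami's two-point model can be checked with a few lines of arithmetic. The sketch below computes the gradient, the intercept and the estimate for the year 2000 from the two points given in the question; the variable names are illustrative.

```python
# The two points Tami uses: (year, mean annual temperature in degrees C).
x1, y1 = 1708, 8.73
x2, y2 = 1958, 9.45

# Gradient: change in temperature per year.
m = (y2 - y1) / (x2 - x1)

# Intercept from y = mx + c, using the first point.
c = y1 - m * x1

# Part (d): estimate for the year 2000.
estimate_2000 = m * 2000 + c

print(round(m, 5), round(c, 3), round(estimate_2000, 2))
```

The gradient is the modelled increase in mean annual temperature, in degrees Celsius, for each additional year.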
(e.ii) Find the value of r, the Pearson’s product-moment correlation coefficient. [1]
(f) Use Thandizo’s model to estimate the mean annual temperature in the year 2000. [2]
Thandizo uses his regression line to predict the year when the mean annual temperature will first exceed 15 °C.
(g) State two reasons why Thandizo’s prediction may not be valid. [2]
This remedy is to be used by 115 patients, and it is assumed that the 82% claim is true.
(a) Find the probability that exactly 90 of these patients will be cured. [3]
(b) Find the probability that at least 95 of these patients will be cured. [2]
(c) Find the variance in the possible number of patients that will be cured. [2]
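Parts (a) to (c) describe a binomial model with 115 trials and success probability 0.82. A sketch of how the three quantities could be computed without a graphic display calculator is given below; `binomial_pmf` is an illustrative helper, not a library function.

```python
from math import comb

n, p = 115, 0.82   # number of patients and the claimed cure probability


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_exactly_90 = binomial_pmf(90, n, p)                                 # part (a)
p_at_least_95 = sum(binomial_pmf(k, n, p) for k in range(95, n + 1))  # part (b)
variance = n * p * (1 - p)                                            # part (c)

print(p_exactly_90, p_at_least_95, variance)
```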
The probability that at least n patients will be cured is less than 30%.
A clinic is interested to see if the mean recovery time of their patients who tried the new remedy is less than that of their patients
who continued with an older remedy. The clinic randomly selects some of their patients and records their recovery time in days.
The results are shown in the table below.
The data is assumed to follow a normal distribution and the population variance is the same for the two groups. A t-test is used to
compare the means of the two groups at the 10% significance level.
(e) State the appropriate null and alternative hypotheses for this t-test. [2]
(g) State the conclusion for this test. Give a reason for your answer. [2]
Elsie’s data for 160 people who visited the library on that particular day is shown in the following table.
(c.ii) Write down the mid-interval value for this class. [1]
(d) Use Elsie’s data to calculate an estimate of the mean time that people spent in the library. [2]
(e) Using the table, write down the maximum possible number of people who spent 35 minutes or less in the library
on that day. [1]
(f) Find the probability a visitor spends at least 60 minutes in the library. [2]
The following box and whisker diagram shows the times, in minutes, that the 160 visitors spent in the library.
(g) Write down the median time spent in the library. [1]
(i) Hence show that the longest time that a person spent in the library is not an outlier. [3]
Elsie believes the box and whisker diagram indicates that the times spent in the library are not normally distributed.
(j) Identify one feature of the box and whisker diagram which might support Elsie’s belief. [1]
This information can be represented in the following Venn diagram, where m, n, p and q represent the percentage of students
within each region.
(a.i) m . [1]
(a.ii) n . [1]
(a.iii) p . [1]
(a.iv) q . [1]
(b) Find the percentage of students who have a dog or a cat or both. [1]
(c.ii) has a dog given that they do not have a cat. [2]
Each year, one student is chosen randomly to be the school captain of Mirabooka Primary School.
Tim is using a binomial distribution to make predictions about how many of the next 10 school captains will own a dog. He
assumes that the percentages found in the survey will remain constant for future years and that the events “being a school captain”
and “having a dog” are independent.
Use Tim’s model to find the probability that in the next 10 years
(e) State why John should not use the binomial distribution to find the probability that 5 of these students have a
dog. [1]
(b) Determine if the Netherlands’ score is an outlier for this data. Justify your answer. [3]
Chester is investigating the relationship between the highest-scoring countries’ Eurovision score and their population size to
determine whether population size can reasonably be used to predict a country’s score.
The populations of the countries, to the nearest million, are shown in the table.
Chester finds that, for this data, the Pearson’s product moment correlation coefficient is r = 0.249.
(c) State whether it would be appropriate for Chester to use the equation of a regression line for y on x to predict a
country’s Eurovision score. Justify your answer. [2]
Chester then decides to find the Spearman’s rank correlation coefficient for this data, and creates a table of ranks.
(d.i) a . [1]
(d.ii) b. [1]
(d.iii) c . [1]
(f) When calculating the ranks, Chester incorrectly read the Netherlands’ score as 478. Explain why the value of the
Spearman’s rank correlation $r_s$ does not change despite this error. [1]
The number of passengers that arrive to board this flight is assumed to follow a binomial distribution with a probability of 0.9.
(a) The airline sells 74 tickets for this flight. Find the probability that more than 72 passengers arrive to board the
flight. [3]
(b.i) Write down the expected number of passengers who will arrive to board the flight if 72 tickets are sold. [2]
(b.ii) Find the maximum number of tickets that could be sold if the expected number of passengers who arrive to board
the flight must be less than or equal to 72. [2]
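Parts (a) and (b) can be checked with the binomial model stated in the question. The sketch below uses exact fractions for the arrival probability so that the boundary case in part (b.ii) is not affected by rounding; `binomial_pmf` and the search limit of 200 tickets are illustrative choices.

```python
from fractions import Fraction
from math import comb

p_arrive = Fraction(9, 10)


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part (a): more than 72 of the 74 ticket holders arrive (73 or 74).
p_more_than_72 = float(sum(binomial_pmf(k, 74, p_arrive) for k in (73, 74)))

# Part (b.i): expected arrivals when 72 tickets are sold.
expected_with_72 = 72 * p_arrive

# Part (b.ii): largest n with expected arrivals n * 0.9 <= 72.
# The upper limit 200 is an arbitrary bound for the search.
max_tickets = max(n for n in range(1, 200) if n * p_arrive <= 72)

print(p_more_than_72, float(expected_with_72), max_tickets)
```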
Each passenger pays $150 for a ticket. If too many passengers arrive, then the airline will give $300 in compensation to each
passenger that cannot board.
(c) Find, to the nearest integer, the expected increase or decrease in the money made by the airline if they decide to
sell 74 tickets rather than 72. [8]
(b) Find the estimated number of teenagers who have a reaction time greater than 0.4 seconds. [2]
(c) Determine the 90th percentile of the reaction times from the cumulative frequency graph. [2]
Mackenzie created the cumulative frequency graph using the following grouped frequency table.
(d.i) Write down the value of a. [1]
(e) Write down the modal class from the table. [1]
(f ) Use your graphic display calculator to find an estimate of the mean reaction time. [2]
Upon completion of the experiment, Mackenzie realized that some values were grouped incorrectly in the frequency table. Some
reaction times recorded in the interval 0 < t ≤ 0.2 should have been recorded in the interval 0.2 < t ≤ 0.4.
(g) Suggest how, if at all, the estimated mean and estimated median reaction times will change if the errors are
corrected. Justify your response. [4]
A student from the group is chosen at random. Calculate the probability that the student
(a.iii) prefers a laptop given that they are 17–18 years old. [2]
A $\chi^2$ test for independence was performed on the collected data at the 1% significance level. The critical value for the test is
13.277.
(d.iii) State the conclusion for the test in context. Give a reason for your answer. [2]
The distance that her darts land from the centre, O, of the board can be modelled by a normal distribution with mean 10 cm and
standard deviation 3 cm.
(b) Find the probability that Arianne throws two consecutive darts that land more than 15 cm from O. [2]
In a competition a player has three darts to throw on each turn. A point is scored if a player throws all three darts to land within a
central area around O. When Arianne throws a dart the probability that it lands within this area is 0.8143.
(c) Find the probability that Arianne does not score a point on a turn of three darts. [2]
In the competition Arianne has ten turns, each with three darts.
(d.i) Find the probability that Arianne scores at least 5 points in the competition. [3]
(d.ii) Find the probability that Arianne scores at least 5 points and less than 8 points. [2]
(d.iii) Given that Arianne scores at least 5 points, find the probability that Arianne scores less than 8 points. [2]
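The calculations for parts (b) to (d.iii) can be checked with a short script. This is a minimal sketch using only the Python standard library in place of a graphic display calculator. It assumes that individual darts are thrown independently and that the ten turns are independent, so the number of points follows a binomial distribution; the helper names normal_cdf and binom_pmf are illustrative.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ N(mean, sd^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# (b) Two consecutive darts each land more than 15 cm from O.
p_far = 1.0 - normal_cdf(15, mean=10, sd=3)
p_two_far = p_far**2

# (c) A point needs all three darts inside the central area.
p_point = 0.8143**3
p_no_point = 1.0 - p_point

# (d) Points over ten turns: X ~ B(10, p_point), assuming independent turns.
n_turns = 10
p_at_least_5 = sum(binom_pmf(k, n_turns, p_point) for k in range(5, n_turns + 1))
p_5_to_7 = sum(binom_pmf(k, n_turns, p_point) for k in range(5, 8))

# (d.iii) Conditional probability P(X < 8 | X >= 5).
p_less_8_given_at_least_5 = p_5_to_7 / p_at_least_5

print(p_two_far, p_no_point, p_at_least_5, p_5_to_7, p_less_8_given_at_least_5)
```

Part (d.iii) reuses the two answers from (d.i) and (d.ii), since the event "at least 5 and less than 8" is the intersection of the two events in the conditional probability.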
manages to stop
1000 randomly selected bicycles are tested and their stopping distances when travelling at 20 km h⁻¹ are measured.
Find, correct to four significant figures, the expected number of bicycles tested that stop between
The measured stopping distances of the 1000 bicycles are given in the table.
It is decided to perform a χ² goodness of fit test at the 5% level of significance to decide whether the stopping distances of
bicycles travelling at 20 km h⁻¹ can be modelled by a normal distribution with mean 6.76 m and standard deviation 0.12 m.
(e) State the conclusion of the test. Give a reason for your answer. [2]
(a) State which of the two sampling methods, systematic or quota, Jason has used. [1]
Jason constructed the following box and whisker diagram to show the number of hours students in the sample took to read this
book.
(b) Write down the median time to read the book. [1]
Mackenzie, a member of the sample, took 25 hours to read the novel. Jason believes Mackenzie’s time is not an outlier.
For each student interviewed, Jason recorded the time taken to read The Old Man and the Sea (x), measured in hours, and paired this with
their percentage score on the final exam (y). These data are represented on the scatter diagram.
Jason correctly calculates the equation of the regression line y on x for these students to be
He uses the equation to estimate the percentage score on the final exam for a student who read the book in 1.5 hours.
(g) State whether it is valid to use the regression line y on x for Jason’s estimate. Give a reason for your answer. [2]
Jason found a website that rated the ‘top 50’ classic books. He randomly chose eight of these classic books and recorded the
number of pages. For example, Book H is rated 44th and has 281 pages. These data are shown in the table.
Jason intends to analyse the data using Spearman’s rank correlation coefficient, r_s.
[2]
(b) Find the proportion of male Persian cats weighing between 5.5 kg and 6.5 kg. [2]
(c) Determine the expected number of cats in this group that have a weight of less than 5.3 kg. [3]
(d) It is found that 12 of the cats weigh more than x kg. Estimate the value of x. [3]
(e) Ten of the cats are chosen at random. Find the probability that exactly one of them weighs over 6.25 kg. [4]
They test every patient who comes to the centre on a particular day.
It is intended that if a patient has the disease, they test “positive”, and if a patient does not have the disease, they test “negative”.
However, the tests are not perfect, and only 99% of people who have the disease test positive. Also, 2% of people who do not
have the disease test positive.
(b.i) a. [1]
(b.ii) b. [1]
(b.iii) c. [1]
(b.iv) d. [1]
Use the tree diagram to find the probability that a patient selected at random
(c.i) will not have the disease and will test positive. [2]
(c.iii) has the disease given that they tested negative. [3]
(d) The medical centre finds the actual number of positive results in their sample is different than predicted by the
tree diagram. Explain why this might be the case. [1]
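The tree-diagram calculations in parts (c.i) and (c.iii) follow the usual pattern of multiplying along branches and then conditioning. The sketch below shows that pattern under one loud assumption: the proportion of patients who have the disease is not stated in this excerpt, so it is left as the hypothetical parameter prevalence. The 99% and 2% rates are taken from the text; the function and key names are illustrative.

```python
def tree_probabilities(prevalence, sensitivity=0.99, false_positive_rate=0.02):
    """Branch products for a disease/test tree diagram.

    `prevalence` is a hypothetical input: P(patient has the disease).
    `sensitivity` is P(positive | disease) = 0.99 from the text.
    `false_positive_rate` is P(positive | no disease) = 0.02 from the text.
    """
    # Multiply along each pair of branches.
    disease_and_negative = prevalence * (1 - sensitivity)
    healthy_and_positive = (1 - prevalence) * false_positive_rate
    healthy_and_negative = (1 - prevalence) * (1 - false_positive_rate)

    # Total probability of a negative test, then condition on it.
    p_negative = disease_and_negative + healthy_and_negative
    disease_given_negative = disease_and_negative / p_negative

    return {
        "healthy_and_positive": healthy_and_positive,      # pattern for (c.i)
        "disease_given_negative": disease_given_negative,  # pattern for (c.iii)
    }
```

Calling tree_probabilities with the proportion given in the full question reproduces the two requested probabilities; any value used here without that source is only a placeholder.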
The staff at the medical centre looked at the care received by all visiting patients on a randomly chosen day. All the patients
received at least one of these services: they had medical tests (M), were seen by a nurse (N), or were seen by a doctor (D). It was
found that:
18 patients had medical tests and were seen by a doctor but were not seen by a nurse;
11 patients were seen by a nurse and had medical tests but were not seen by a doctor;
2 patients were seen by a doctor without being seen by a nurse and without having medical tests.
(e) Draw a Venn diagram to illustrate this information, placing all relevant information on the diagram. [3]
(f) Find the total number of patients who visited the centre during this day. [2]