
Unit 3 Revision

This document appears to be a test or exam with multiple questions about statistical concepts and analyses. It includes questions about:
- Calculating probabilities and expected values for games of chance with biased dice
- Conducting chi-squared and t-tests to analyze independence and differences of means
- Stating null and alternative hypotheses
- Interpreting p-values and conclusions from statistical tests
- Calculating correlations and performing regressions
- Conducting sampling and determining sample sizes
- Analyzing distributions, percentiles, and box plots
- Modeling data using Poisson, normal, and binomial distributions

The questions cover a wide range of statistical topics and require calculating, stating, and interpreting various statistical measures to analyze datasets and

Uploaded by

P Rushita

Revision [234 marks]

Jae Hee plays a game involving a biased six-sided die.


The faces of the die are labelled −3, −1, 0, 1, 2 and 5.
The score for the game, X, is the number which lands face up after the die is
rolled.
The following table shows the probability distribution for X.

1a. Find the exact value of p. [1 mark]

Jae Hee plays the game once.

1b. Calculate the expected score. [2 marks]

1c. Jae Hee plays the game twice and adds the two scores together. [3 marks]
Find the probability Jae Hee has a total score of −3.

As part of a study into healthy lifestyles, Jing visited Surrey Hills University. Jing
recorded a person’s position in the university and how frequently they ate a salad.
Results are shown in the table.

Jing conducted a χ² test for independence at a 5% level of significance.

2a. State the null hypothesis. [1 mark]

2b. Calculate the p-value for this test. [2 marks]


2c. State, giving a reason, whether the null hypothesis should be accepted. [2 marks]

Ms Calhoun measures the heights of students in her mathematics class. She is
interested to see if the mean height of male students, μ_1, is the same as the mean
height of female students, μ_2. The information is recorded in the table.

At the 10% level of significance, a t-test was used to compare the means of the
two groups. The data is assumed to be normally distributed and the standard
deviations are equal between the two groups.

3a. State the null hypothesis. [1 mark]

3b. State the alternative hypothesis. [1 mark]

3c. Calculate the p-value for this test. [2 marks]

3d. State, giving a reason, whether Ms Calhoun should accept the null [2 marks]
hypothesis.

The number of fish that can be caught in one hour from a particular lake can be
modelled by a Poisson distribution.
The owner of the lake, Emily, states in her advertising that the average number of
fish caught in an hour is three.
Tom, a keen fisherman, is not convinced and thinks it is less than three. He
decides to set up the following test. Tom will fish for one hour and if he catches
fewer than two fish he will reject Emily’s claim.

4a. State suitable null and alternative hypotheses for Tom’s test. [1 mark]

4b. Find the probability of a Type I error. [2 marks]

4c. The average number of fish caught in an hour is actually 2.5. [3 marks]
Find the probability of a Type II error.
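A minimal numerical sketch of parts 4b and 4c, assuming the hypotheses are H0: λ = 3 and H1: λ < 3 and that the rejection rule "fewer than two fish" means X ≤ 1. The helper name `poisson_cdf` is introduced here for illustration only.

```python
import math


def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Po(lam); the sum is empty (0) when k < 0."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))


# Tom rejects H0 when he catches fewer than two fish, i.e. when X <= 1.
CRITICAL_MAX = 1

# Type I error: H0 is rejected although the claimed mean of 3 is true.
type_1 = poisson_cdf(CRITICAL_MAX, 3.0)

# Type II error: H0 is not rejected although the true mean is 2.5.
type_2 = 1 - poisson_cdf(CRITICAL_MAX, 2.5)

print(f"P(Type I error)  = {type_1:.3f}")
print(f"P(Type II error) = {type_2:.3f}")
```

Under these assumptions the Type I probability equals 4e⁻³ and the Type II probability equals 1 − 3.5e⁻²·⁵.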
The Malvern Aquatic Center hosted a 3 metre spring board diving event. The
judges, Stan and Minsun, awarded 8 competitors a score out of 10. The raw data is
collated in the following table.

5a. Write down the value of the Pearson’s product–moment correlation [2 marks]
coefficient, r.

5b. Using the value of r, interpret the relationship between Stan’s score [2 marks]
and Minsun’s score.

5c. Write down the equation of the regression line y on x. [2 marks]

5d. Use your regression equation from part (c) to estimate Minsun’s score [2 marks]
when Stan awards a perfect 10.

5e. State whether this estimate is reliable. Justify your answer. [2 marks]

The Commissioner for the event would like to find the Spearman’s rank correlation
coefficient.

5f. Copy and complete the information in the following table. [2 marks]

5g. Find the value of the Spearman’s rank correlation coefficient, r_s. [2 marks]

5h. Comment on the result obtained for r_s. [2 marks]

5i. The Commissioner believes Minsun’s score for competitor G is too high [1 mark]
and so decreases the score from 9.5 to 9.1.
Explain why the value of the Spearman’s rank correlation coefficient r_s does not
change.

A school consists of 740 students divided into 5 grade levels. The numbers of
students in each grade are shown in the table below.

The Principal of the school wishes to select a sample of 25 students. She wishes to
ensure that, as closely as possible, the proportion of the students from each grade
in the sample is the same as the proportions in the school.

6a. Calculate the number of grade 12 students who should be in the sample. [3 marks]

6b. The Principal selects the students for the sample by asking those who [2 marks]
took part in a previous survey if they would like to take part in another.
She takes the first of those who reply positively, up to the maximum needed for
the sample.
State which two of the sampling methods listed below best describe the method
used.
Stratified, Quota, Convenience, Systematic, Simple random

The weights of apples on a tree can be modelled by a normal distribution with a
mean of 85 grams and a standard deviation of 7.5 grams.

7a. Find the probability that an apple from the tree has a weight greater [2 marks]
than 90 grams.
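Part 7a can be checked with the standard library, as in the sketch below. It assumes only the stated model W ~ N(85, 7.5²); the variable names are illustrative.

```python
from statistics import NormalDist

# Model from the question: mean 85 g, standard deviation 7.5 g.
weights = NormalDist(mu=85, sigma=7.5)

# P(W > 90) is the upper tail beyond 90 g.
p_over_90 = 1 - weights.cdf(90)

print(f"P(W > 90) = {p_over_90:.3f}")
```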

Samples of apples are taken from 2 trees, A and B, in different parts of the
orchard.
The data is shown in the table below.

The owner of the orchard wants to know whether the mean weight of the apples
from tree A (μ_A) is greater than the mean weight of the apples from tree B (μ_B)
so sets up the following test:
H0: μ_A = μ_B and H1: μ_A > μ_B

7b. Find the p-value for the owner’s test. [2 marks]

7c. The test is performed at the 5% significance level. [2 marks]
State the conclusion of the test, giving a reason for your answer.

A food scientist measures the weights of 760 potatoes taken from a single field
and the distribution of the weights is shown by the cumulative frequency curve
below.

8a. Find the number of potatoes in the sample with a weight of more than [2 marks]
200 grams.

8b. Find the median weight. [1 mark]

8c. Find the lower quartile. [1 mark]

8d. Find the upper quartile. [1 mark]

8e. The weight of the smallest potato in the sample is 20 grams and the [2 marks]
weight of the largest is 400 grams.
Use the scale shown below to draw a box and whisker diagram showing the
distribution of the weights of the potatoes. You may assume there are no outliers.

The cars for a fairground ride hold four people. They arrive at the platform for
loading and unloading every 30 seconds.
During the hour from 9 am the arrival of people at the ride in any interval of t
minutes can be modelled by a Poisson distribution with a mean of 9t (0 < t < 60).
When the 9 am car leaves there is no one in the queue to get on the ride.
Shunsuke arrives at 9.01 am.

9a. Find the probability that more than 7 people arrive at the ride before [2 marks]
Shunsuke.

9b. Find the probability there will be space for him on the 9.01 car. [6 marks]
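One way to organise question 9 numerically is sketched below. It rests on assumptions that the question leaves implicit: people board in order of arrival, the car leaving 30 seconds after 9 am takes up to four of those who arrived in the first half minute, arrivals in the two half-minute intervals are independent, and Shunsuke queues behind everyone who arrived before him. The helper names are illustrative.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


# 9a: arrivals in the first minute have mean 9 * 1 = 9.
p_more_than_7 = 1 - poisson_cdf(7, 9.0)

# 9b: split the minute into two 30-second intervals, each with mean 9 * 0.5.
HALF_MEAN = 4.5
SEATS = 4

p_space = 0.0
# With 8 or more arrivals in the first interval, four people are left
# over and fill the 9.01 car, so only 0..7 can leave a seat free.
for first in range(2 * SEATS):
    left_over = max(first - SEATS, 0)   # still queuing after the first car
    max_second = SEATS - 1 - left_over  # later arrivals that still leave one seat
    p_space += poisson_pmf(first, HALF_MEAN) * poisson_cdf(max_second, HALF_MEAN)

print(f"P(more than 7 arrive) = {p_more_than_7:.3f}")
print(f"P(space on 9.01 car)  = {p_space:.3f}")
```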

Charles wants to measure the strength of the relationship between the price of a
house and its distance from the city centre where he lives. He chooses houses of
a similar size and plots a graph of price, P (in thousands of dollars) against
distance from the city centre, d (km).

10a. Explain why it is not appropriate to use Pearson’s product moment [1 mark]
correlation coefficient to measure the strength of the relationship
between P and d.

10b. Explain why it is appropriate to use Spearman’s rank correlation [1 mark]
coefficient to measure the strength of the relationship between P and d.

The data from the graph is shown in the table.

10c. Calculate Spearman’s rank correlation coefficient for this data. [6 marks]

10d. State what conclusion Charles can make from the answer in part (c). [1 mark]

Eggs at a farm are sold in boxes of six. Each egg is either brown or white. The
owner believes that the number of brown eggs in a box can be modelled by a
binomial distribution. He examines 100 boxes and obtains the following data.

11a. Calculate the mean number of brown eggs in a box. [1 mark]

11b. Hence estimate p, the probability that a randomly chosen egg is brown. [1 mark]

11c. By calculating an appropriate χ² statistic, test, at the 5% significance [8 marks]
level, whether or not the binomial distribution gives a good fit to these data.

The numbers of telephone calls received by a helpline over 80 one-minute periods
are summarized in the table below.

12a. Find the exact value of the mean of this distribution. [2 marks]

12b. Test, at the 5% level of significance, whether or not the data can be [12 marks]
modelled by a Poisson distribution.

The heights, x metres, of the 241 new entrants to a men’s college were measured
and the following statistics calculated.

∑ x = 412.11, ∑ x² = 705.5721

13a. Calculate unbiased estimates of the population mean and the [3 marks]
population variance.
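The unbiased estimates in part 13a follow from the summary statistics alone: the sample mean is ∑x / n, and the variance estimate is (∑x² − (∑x)² / n) / (n − 1). A short sketch, with illustrative variable names:

```python
# Summary statistics given in the question.
n = 241
sum_x = 412.11
sum_x2 = 705.5721

# Unbiased estimate of the population mean.
mean_hat = sum_x / n

# Unbiased estimate of the population variance (divisor n - 1).
var_hat = (sum_x2 - sum_x**2 / n) / (n - 1)

print(f"mean estimate     = {mean_hat:.4f} m")
print(f"variance estimate = {var_hat:.4f} m^2")
```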

The Head of Mathematics decided to use a χ² test to determine whether or not
these heights could be modelled by a normal distribution. He therefore divided the
data into classes as follows.

13b. State suitable hypotheses. [1 mark]

13c. Calculate the value of the χ² statistic and state your conclusion using [11 marks]
a 10% level of significance.

Jim writes a computer program to generate 500 values of a variable Z. He obtains
the following table from his results.

14a. Use a chi-squared goodness of fit test to investigate whether or not, at [12 marks]
the 5% level of significance, the N(0, 1) distribution can be used to
model these results.

In this situation, state briefly what is meant by

14b. a Type I error. [2 marks]

14c. a Type II error. [2 marks]

The curve y = f (x) is shown in the graph, for 0 ⩽ x ⩽ 10.

The curve y = f (x) passes through the following points.

It is required to find the area bounded by the curve, the x-axis, the y-axis and the
line x = 10.

15a. Use the trapezoidal rule to find an estimate for the area. [3 marks]

One possible model for the curve y = f (x) is a cubic function.

15b. Use all the coordinates in the table to find the equation of the least [3 marks]
squares cubic regression curve.

15c. Write down the coefficient of determination. [1 mark]

15d. Write down an expression for the area enclosed by the cubic regression [1 mark]
curve, the x-axis, the y-axis and the line x = 10.

15e. Find the value of this area. [2 marks]


This question explores methods to determine the area bounded by an unknown
curve.
The curve y = f (x) is shown in the graph, for 0 ⩽ x ⩽ 4.4.

The curve y = f (x) passes through the following points.

It is required to find the area bounded by the curve, the x-axis, the y-axis and the
line x = 4.4 .

16a. Use the trapezoidal rule to find an estimate for the area. [3 marks]

16b. With reference to the shape of the graph, explain whether your answer [2 marks]
to part (a) will be an over-estimate or an underestimate of the area.

One possible model for the curve y = f (x) is a cubic function.

16c. Use all the coordinates in the table to find the equation of the least [3 marks]
squares cubic regression curve.

16d. Write down the coefficient of determination. [1 mark]

16e. Write down an expression for the area enclosed by the cubic function, [2 marks]
the x-axis, the y-axis and the line x = 4.4 .

16f. Find the value of this area. [2 marks]

A second possible model for the curve y = f (x) is an exponential function,
y = pe^(qx), where p, q ∈ ℝ.

16g. Show that ln y = qx + ln p. [2 marks]

16h. Hence explain how a straight line graph could be drawn using the [1 mark]
coordinates in the table.

16i. By finding the equation of a suitable regression line, show that p = 1.83 [5 marks]
and q = 0.986.

16j. Hence find the area enclosed by the exponential function, the x-axis, the [2 marks]
y-axis and the line x = 4.4.

Consider two events A and B such that P(A) = k, P(B) = 3k, P(A ∩ B) = k²
and P(A ∪ B) = 0.5.

17a. Calculate k; [3 marks]

17b. Find P(A′ ∩ B). [3 marks]
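Question 17 reduces to a quadratic: P(A ∪ B) = P(A) + P(B) − P(A ∩ B) gives k + 3k − k² = 0.5, that is, k² − 4k + 0.5 = 0. The sketch below solves it and keeps the root for which P(B) = 3k is a valid probability; variable names are illustrative.

```python
import math

# k^2 - 4k + 0.5 = 0, from k + 3k - k^2 = 0.5.
disc = 4**2 - 4 * 1 * 0.5
roots = [(4 - math.sqrt(disc)) / 2, (4 + math.sqrt(disc)) / 2]

# Only a root with 0 <= 3k <= 1 can give a valid P(B).
k = next(r for r in roots if 0 <= 3 * r <= 1)

# P(A' ∩ B) = P(B) - P(A ∩ B).
p_not_a_and_b = 3 * k - k**2

print(f"k         = {k:.4f}")
print(f"P(A' ∩ B) = {p_not_a_and_b:.4f}")
```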

When carpet is manufactured, small faults occur at random. The number of faults
in Premium carpets can be modelled by a Poisson distribution with mean 0.5
faults per 20 m². Mr Jones chooses Premium carpets to replace the carpets in his
office building. The office building has 10 rooms, each with the area of 80 m².

18a. Find the probability that the carpet laid in the first room has fewer than [3 marks]
three faults.

18b. Find the probability that exactly seven rooms will have fewer than three [3 marks]
faults in the carpet.
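A possible numerical check of question 18, assuming the Poisson mean scales with area (so an 80 m² room has mean 0.5 × 80 / 20 = 2 faults) and that the ten rooms are independent, which makes the count of rooms with fewer than three faults binomial. Variable names are illustrative.

```python
import math

RATE_PER_20_M2 = 0.5
ROOM_AREA_M2 = 80
ROOMS = 10

# Mean number of faults in one room, scaling the rate by area.
room_mean = RATE_PER_20_M2 * ROOM_AREA_M2 / 20

# 18a: P(X < 3) = P(X <= 2) for X ~ Po(room_mean).
p_room = sum(
    math.exp(-room_mean) * room_mean**i / math.factorial(i) for i in range(3)
)

# 18b: exactly seven of ten independent rooms, Y ~ B(10, p_room).
p_seven = math.comb(ROOMS, 7) * p_room**7 * (1 - p_room) ** (ROOMS - 7)

print(f"P(fewer than 3 faults in a room) = {p_room:.3f}")
print(f"P(exactly 7 such rooms)          = {p_seven:.3f}")
```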

Events A and B are independent with P(A ∩ B) = 0.2 and P(A′ ∩ B) = 0.6.

19a. Find P(B). [2 marks]

19b. Find P(A ∪ B). [4 marks]
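The steps in question 19 can be sketched with exact fractions. B splits into the disjoint parts A ∩ B and A′ ∩ B, and independence gives P(A) = P(A ∩ B) / P(B). Variable names are illustrative.

```python
from fractions import Fraction

p_a_and_b = Fraction(2, 10)      # P(A ∩ B) = 0.2
p_not_a_and_b = Fraction(6, 10)  # P(A' ∩ B) = 0.6

# 19a: B is the disjoint union of A ∩ B and A' ∩ B.
p_b = p_a_and_b + p_not_a_and_b

# Independence: P(A ∩ B) = P(A) P(B).
p_a = p_a_and_b / p_b

# 19b: addition rule.
p_union = p_a + p_b - p_a_and_b

print(f"P(B) = {float(p_b)}, P(A) = {float(p_a)}, P(A ∪ B) = {float(p_union)}")
```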


On a work day, the probability that Mr Van Winkel wakes up early is 4/5.

If he wakes up early, the probability that he is on time for work is p.

If he wakes up late, the probability that he is on time for work is 1/4.

20a. Complete the tree diagram below. [2 marks]

The probability that Mr Van Winkel arrives on time for work is 3/5.

20b. Find the value of p. [4 marks]
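Part 20b follows from the law of total probability along the two branches of the tree: P(on time) = P(early) · p + P(late) · P(on time | late). The sketch below rearranges this for p, reading the three given probabilities as 4/5, 1/4 and 3/5 (an assumption about the fractions printed in the question). Variable names are illustrative.

```python
from fractions import Fraction

p_early = Fraction(4, 5)
p_on_time_given_late = Fraction(1, 4)
p_on_time = Fraction(3, 5)

p_late = 1 - p_early

# p_on_time = p_early * p + p_late * p_on_time_given_late, solved for p.
p = (p_on_time - p_late * p_on_time_given_late) / p_early

print(f"p = {p}")
```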

The table below shows the distribution of test grades for 50 IB students at
Greendale School.

21a. Calculate the mean test grade of the students; [2 marks]

21b. Calculate the standard deviation. [1 mark]

21c. Find the median test grade of the students. [1 mark]

21d. Find the interquartile range. [2 marks]


A student is chosen at random from these 50 students.

21e. Find the probability that this student scored a grade 5 or higher. [2 marks]

A second student is chosen at random from these 50 students.

21f. Given that the first student chosen at random scored a grade 5 or [3 marks]
higher, find the probability that both students scored a grade 6.

The number of minutes that the 50 students spent preparing for the test was
normally distributed with a mean of 105 minutes and a standard deviation of 20
minutes.

21g. Calculate the probability that a student chosen at random spent at [2 marks]
least 90 minutes preparing for the test.

21h. Calculate the expected number of students that spent at least 90 [2 marks]
minutes preparing for the test.
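Parts 21g and 21h can be checked as below, assuming the stated model T ~ N(105, 20²) and taking the expected count as 50 times the probability. Variable names are illustrative.

```python
from statistics import NormalDist

STUDENTS = 50

# Preparation time model from the question: mean 105 min, sd 20 min.
prep_time = NormalDist(mu=105, sigma=20)

# 21g: P(T >= 90).
p_at_least_90 = 1 - prep_time.cdf(90)

# 21h: expected number of the 50 students with T >= 90.
expected_students = STUDENTS * p_at_least_90

print(f"P(T >= 90)        = {p_at_least_90:.3f}")
print(f"expected students = {expected_students:.1f}")
```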

Events A and B are such that P(A ∪ B) = 0.95, P(A ∩ B) = 0.6 and
P(A|B) = 0.75.

22a. Find P(B). [2 marks]

22b. Find P(A). [2 marks]

22c. Hence show that events A′ and B are independent. [2 marks]
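A sketch of the chain of reasoning in question 22, using exact fractions so that the independence check in part (c) is an equality and not a rounded comparison. Variable names are illustrative.

```python
from fractions import Fraction

p_union = Fraction(95, 100)      # P(A ∪ B)
p_a_and_b = Fraction(6, 10)      # P(A ∩ B)
p_a_given_b = Fraction(75, 100)  # P(A | B)

# 22a: P(A | B) = P(A ∩ B) / P(B).
p_b = p_a_and_b / p_a_given_b

# 22b: addition rule rearranged for P(A).
p_a = p_union - p_b + p_a_and_b

# 22c: A' and B are independent if P(A' ∩ B) = P(A') P(B).
p_not_a_and_b = p_b - p_a_and_b
independent = p_not_a_and_b == (1 - p_a) * p_b

print(f"P(B) = {float(p_b)}, P(A) = {float(p_a)}, independent: {independent}")
```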


In a group of 20 girls, 13 take history and 8 take economics. Three girls take both
history and economics, as shown in the following Venn diagram. The values p and
q represent numbers of girls.

23a. Find the value of p; [2 marks]

23b. Find the value of q. [2 marks]

23c. A girl is selected at random. Find the probability that she takes [2 marks]
economics but not history.

Jim heated a liquid until it boiled. He measured the temperature of the liquid as it
cooled. The following table shows its temperature, d degrees Celsius, t minutes
after it boiled.

24a. Write down the independent variable. [1 mark]

24b. Write down the boiling temperature of the liquid. [1 mark]

Jim believes that the relationship between d and t can be modelled by a linear
regression equation.

24c. Jim describes the correlation as very strong. Circle the value below [2 marks]
which best represents the correlation coefficient.
0.992    0.251    0    −0.251    −0.992

24d. Jim’s model is d = −2.24t + 105, for 0 ⩽ t ⩽ 20. Use his model to [2 marks]
predict the decrease in temperature for any 2 minute interval.

A quadratic function f can be written in the form f(x) = a(x − p)(x − 3). The
graph of f has axis of symmetry x = 2.5 and y-intercept at (0, −6).

25a. Find the value of p. [3 marks]

25b. Find the value of a. [3 marks]

25c. The line y = kx − 5 is a tangent to the curve of f. Find the values of k. [8 marks]
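Question 25 can be worked through as sketched below. The axis of symmetry lies midway between the roots p and 3, the y-intercept fixes a, and the line is tangent when f(x) = kx − 5 has a repeated root (zero discriminant). The expanded coefficients `b` and `c` are names introduced for illustration.

```python
import math

KNOWN_ROOT = 3
AXIS = 2.5
Y_INTERCEPT = -6

# 25a: the axis of symmetry is the mean of the two roots.
p = 2 * AXIS - KNOWN_ROOT

# 25b: f(0) = a(0 - p)(0 - 3) = Y_INTERCEPT.
a = Y_INTERCEPT / ((0 - p) * (0 - KNOWN_ROOT))

# Expanded form f(x) = a x^2 + b x + c.
b = -a * (p + KNOWN_ROOT)
c = a * p * KNOWN_ROOT

# 25c: a x^2 + (b - k) x + (c + 5) = 0 must have a repeated root,
# so (b - k)^2 = 4 a (c + 5).
rhs = 4 * a * (c + 5)
k_values = sorted([b - math.sqrt(rhs), b + math.sqrt(rhs)])

print(f"p = {p}, a = {a}, k = {k_values}")
```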

© International Baccalaureate Organization 2023


International Baccalaureate® - Baccalauréat International® - Bachillerato Internacional®

Printed for WOODSTOCK SCH 09-12
