Exam Questions Chapter2&3

The document contains exam questions on analyzing and interpreting data distributions and outliers. The questions cover calculating proportions, estimating means and standard deviations, identifying outliers, comparing distributions, and interpreting correlation. The data displays used include box plots, histograms, stem-and-leaf plots, and tables. Students are asked to analyze the distributions, find measures of center and spread, and draw conclusions.

Uploaded by

Aisha Qasim
Copyright
© All Rights Reserved

Exam Questions Chapter 2 and 3


Q1. (Q02 6683/01, June 2007)

The box plot in Figure 1 shows a summary of the weights of the luggage, in kg, for each musician in an
orchestra on an overseas tour.

The airline's recommended weight limit for each musician's luggage was 45 kg. Given that none of the
musicians' luggage weighed exactly 45 kg,

(a) state the proportion of the musicians whose luggage was below the recommended weight limit.
(1)
A quarter of the musicians had to pay a charge for taking heavy luggage.

(b) State the smallest weight for which the charge was made.
(1)
(c) Explain what you understand by the + on the box plot in Figure 1, and suggest an instrument that the
owner of this luggage might play.
(2)
(d) Describe the skewness of this distribution. Give a reason for your answer.
(2)
One musician of the orchestra suggests that the weights of luggage, in kg, can be modelled by a normal
distribution with quartiles as given in Figure 1.

(e) Find the standard deviation of this normal distribution.


(4)

(Total 10 marks)

Q2. (Q05 6683/01, June 2007)

Figure 2 shows a histogram for the variable t which represents the time taken, in minutes, by a group of
people to swim 500m.

(a) Complete the frequency table for t.

(2)
(b) Estimate the number of people who took longer than 20 minutes to swim 500m.
(2)
(c) Find an estimate of the mean time taken.
(4)
(d) Find an estimate for the standard deviation of t.
(3)
(e) Find the median and quartiles for t.
(4)

One measure of skewness is found using [formula not shown].

(f) Evaluate this measure and describe the skewness of these data.
(2)

(Total 17 marks)

Q3. (Q03 6683/01, Jan 2008)

The histogram in Figure 1 shows the time taken, to the nearest minute, for 140 runners to complete a fun
run.

Use the histogram to calculate the number of runners who took between 78.5 and 90.5 minutes to
complete the fun run.
(5)

(Total 5 marks)

Q4. (Q02 6683/01, Jan 2008)

Cotinine is a chemical that is made by the body from nicotine which is found in cigarette smoke. A doctor
tested the blood of 12 patients, who claimed to smoke a packet of cigarettes a day, for cotinine. The
results, in appropriate units, are shown below.

(a) Find the mean and standard deviation of the level of cotinine in a patient's blood.
(4)
(b) Find the median, upper and lower quartiles of these data.
(3)
A doctor suspects that some of his patients have been smoking more than a packet of cigarettes per day.
He decides to use Q3+1.5(Q3–Q1) to determine if any of the cotinine results are far enough away from the
upper quartile to be outliers.

(c) Identify which patient(s) may have been smoking more than a packet of cigarettes a day. Show your
working clearly.
(4)
Research suggests that cotinine levels in the blood form a skewed distribution.

One measure of skewness is found using [formula not shown].

(d) Evaluate this measure and describe the skewness of these data.
(3)

(Total 14 marks)

Q5. (Q02 6683/01, June 2008)

The ages, in years, of the residents of two hotels are shown in the back-to-back stem and leaf diagram
below.

For the Balmoral Hotel,

(a) write down the mode of the age of the residents,


(1)
(b) find the values of the lower quartile, the median and the upper quartile.
(3)
(c) (i) Find the mean of the age of the residents.
(ii) Given that ∑x² = 81 213, find the standard deviation of the age of the residents.
(4)
One measure of skewness is found using [formula not shown]

(d) Evaluate this measure for the Balmoral Hotel.


(2)
For the Abbey Hotel, the mode is 39, the mean is 33.2, the standard deviation is 12.7 and the measure of
skewness is –0.454.

(e) Compare the two age distributions of the residents of each hotel.
(3)

(Total 13 marks)

Q6. (Q04 6683/01, Jan 2009)

In a study of how students use their mobile telephones, the phone usage of a random sample of 11
students was examined for a particular week.

The total lengths of calls, y minutes, for the 11 students were


17, 23, 35, 36, 51, 53, 54, 55, 60, 77, 110

(a) Find the median and quartiles for these data.


(3)
A value that is greater than Q3 + 1.5 × (Q3 − Q1) or smaller than Q1 − 1.5 × (Q3 − Q1) is defined as an
outlier.

(b) Show that 110 is the only outlier.


(2)
(c) Using the graph paper on page 15 draw a box plot for these data indicating clearly the position of the
outlier.
(3)
The value of 110 is omitted.

(d) Show that Syy for the remaining 10 students is 2966.9


(3)
These 10 students were each asked how many text messages, x, they sent in the same week.

The values of Sxx and Sxy for these 10 students are Sxx = 3463.6 and Sxy = −18.3.

(e) Calculate the product moment correlation coefficient between the number of text messages sent and
the total length of calls for these 10 students.
(2)
A parent believes that a student who sends a large number of text messages will spend fewer minutes on
calls.

(f) Comment on this belief in the light of your calculation in part (e).
(1)

(Total 14 marks)
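Parts (a), (b), (d) and (e) of this question can be checked numerically. The Python sketch below is not a model answer. It assumes a common S1 quartile convention that the question itself does not state: take the position n/4 (or n/2, 3n/4), round up if it is not a whole number, and otherwise average that value with the next one. The helper name `quartile` is illustrative only.

```python
import math

calls = [17, 23, 35, 36, 51, 53, 54, 55, 60, 77, 110]  # y, already in order

def quartile(sorted_data, fraction):
    """Assumed convention: position = fraction * n; round up if it is not
    a whole number, otherwise average that value with the next one."""
    n = len(sorted_data)
    pos = fraction * n
    if pos != int(pos):
        return sorted_data[math.ceil(pos) - 1]
    k = int(pos)
    return (sorted_data[k - 1] + sorted_data[k]) / 2

# part (a): quartiles and median
q1, q2, q3 = (quartile(calls, f) for f in (0.25, 0.5, 0.75))

# part (b): outlier limits as defined in the question
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [y for y in calls if y < lower_fence or y > upper_fence]

# part (d): Syy = sum(y^2) - (sum(y))^2 / n for the remaining 10 students
remaining = [y for y in calls if y not in outliers]
n = len(remaining)
s_yy = sum(y * y for y in remaining) - sum(remaining) ** 2 / n

# part (e): r = Sxy / sqrt(Sxx * Syy), with Sxx and Sxy as given
s_xx, s_xy = 3463.6, -18.3
r = s_xy / math.sqrt(s_xx * s_yy)

print(q1, q2, q3)                          # 35 53 60
print(lower_fence, upper_fence, outliers)  # -2.5 97.5 [110]
print(round(s_yy, 1), round(r, 4))         # 2966.9 -0.0057
```

A value of r this close to zero indicates essentially no linear relationship, which is the point part (f) asks about.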

Q7. (Q05 6683/01, Jan 2009)

In a shopping survey a random sample of 104 teenagers were asked how many hours, to the nearest
hour, they spent shopping in the last month. The results are summarised in the table below.

A histogram was drawn and the group (8 − 10) hours was represented by a rectangle that was 1.5 cm
wide and 3 cm high.

(a) Calculate the width and height of the rectangle representing the group (16 − 25) hours.
(3)
(b) Use linear interpolation to estimate the median and interquartile range.
(5)
(c) Estimate the mean and standard deviation of the number of hours spent shopping.
(4)
(d) State, giving a reason, the skewness of these data.
(2)
(e) State, giving a reason, which average and measure of dispersion you would recommend to use to
summarise these data.
(2)
(Total 16 marks)

Q8. (Q04 6683/01, June 2009)

A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The lengths are
summarised in the table below.

(a) Use interpolation to estimate the median of this distribution.


(2)
(b) Calculate estimates for the mean and the standard deviation of these data.
(6)
One measure of skewness is given by

Coefficient of skewness = [formula not shown]
(c) Evaluate this coefficient and comment on the skewness of these data.
(3)
Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old children.
(d) Using the value found in part (c), comment on Greg's suggestion, giving a reason for your answer.
(2)
(Total 13 marks)

Q9. (Q03 6683/01, June 2009)

The variable x was measured to the nearest whole number. Forty observations are given in the table
below.

A histogram was drawn and the bar representing the 10 - 15 class has a width of 2 cm and a height of 5
cm. For the 16 - 18 class find
(a) the width,
(1)
(b) the height
(2)
of the bar representing this class.
(Total 3 marks)
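Because x is recorded to the nearest whole number, the 10 - 15 class runs from 9.5 to 15.5 and the 16 - 18 class from 15.5 to 18.5. The Python sketch below illustrates the scaling idea behind parts (a) and (b). The function names are illustrative, and the height in part (b) needs the class frequencies from the table, which is not reproduced here, so only the width is evaluated.

```python
def bar_width_cm(ref_bounds, ref_width_cm, bounds):
    """Drawn width is proportional to class width (using class boundaries)."""
    ref_class_width = ref_bounds[1] - ref_bounds[0]
    class_width = bounds[1] - bounds[0]
    return ref_width_cm * class_width / ref_class_width

def bar_height_cm(ref_freq, ref_width_cm, ref_height_cm, freq, width_cm):
    """Area is proportional to frequency, so height = scaled area / width."""
    cm2_per_observation = ref_width_cm * ref_height_cm / ref_freq
    return cm2_per_observation * freq / width_cm

# part (a): the 10 - 15 bar (boundaries 9.5 to 15.5) is drawn 2 cm wide
width = bar_width_cm((9.5, 15.5), 2, (15.5, 18.5))
print(width)  # 1.0
```

The same two steps apply to the other histogram questions in this set: scale the width by class width first, then use area proportional to frequency to get the height.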

Q10. (Q02 6683/01, Jan 2010)

The 19 employees of a company take an aptitude test. The scores out of 40 are illustrated in the stem
and leaf diagram below.

Find

(a) the median score,


(1)
(b) the interquartile range.
(3)

The company director decides that any employees whose scores are so low that they are outliers will
undergo retraining.

An outlier is an observation whose value is less than the lower quartile minus 1.0 times the interquartile
range.

(c) Explain why there is only one employee who will undergo retraining.
(2)
(d) On the graph paper on page 5, draw a box plot to illustrate the employees' scores.
(3)

(Total 9 marks)

Q11. (Q03 6683/01, Jan 2010)

The birth weights, in kg, of 1500 babies are summarised in the table below.

[You may use ∑fx = 4841 and ∑fx² = 15 889.5]

(a) Write down the missing midpoints in the table above.


(2)
(b) Calculate an estimate of the mean birth weight.
(2)
(c) Calculate an estimate of the standard deviation of the birth weight.
(3)
(d) Use interpolation to estimate the median birth weight.
(2)
(e) Describe the skewness of the distribution. Give a reason for your answer.
(2)

(Total 11 marks)
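Reading the two given totals as Σfx = 4841 and Σfx² = 15 889.5 for the 1500 babies, parts (b) and (c) reduce to a few lines of arithmetic. The Python check below is a sketch only, and it assumes the divide-by-n form of the variance.

```python
import math

n = 1500             # number of babies
sum_fx = 4841.0      # given total of f*x
sum_fx2 = 15889.5    # given total of f*x^2

mean = sum_fx / n                    # part (b)
variance = sum_fx2 / n - mean ** 2   # mean of squares minus square of mean
sd = math.sqrt(variance)             # part (c)

print(round(mean, 2), round(sd, 3))  # 3.23 0.421
```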

Q12. (Q05 6683/01, June 2010)

A teacher selects a random sample of 56 students and records, to the nearest hour, the time spent
watching television in a particular week.

(a) Find the mid-points of the 21-25 hour and 31-40 hour groups.
(2)
A histogram was drawn to represent these data. The 11-20 group was represented by a bar of width 4 cm
and height 6 cm.

(b) Find the width and height of the 26-30 group.


(3)
(c) Estimate the mean and standard deviation of the time spent watching television by these students.
(5)
(d) Use linear interpolation to estimate the median length of time spent watching television by these
students.
(2)
The teacher estimated the lower quartile and the upper quartile of the time spent watching television to be
15.8 and 29.3 respectively.

(e) State, giving a reason, the skewness of these data.


(2)
(Total 14 marks)

Q13. (Q07 6683/01, June 2010)

The distances travelled to work, D km, by the employees at a large company are normally distributed with
D ~ N(30, 8²).

(a) Find the probability that a randomly selected employee has a journey to work of more than 20 km.
(3)
(b) Find the upper quartile, Q3, of D.
(3)
(c) Write down the lower quartile, Q1 , of D.
(1)
An outlier is defined as any value of D such that D < h or D > k where

h = Q1 − 1.5 × (Q3 − Q1) and k = Q3 + 1.5 × (Q3 − Q1)

(d) Find the value of h and the value of k.


(2)
An employee is selected at random.

(e) Find the probability that the distance travelled to work by this employee is an outlier.
(3)
(Total 12 marks)
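Reading the model as D ~ N(30, 8²), every part of this question can be checked with Python's standard-library `statistics.NormalDist`. This is a numerical sketch, not the table-based method an exam answer would show, so the last digits may differ slightly from values read from printed tables.

```python
from statistics import NormalDist

D = NormalDist(mu=30, sigma=8)

p_more_than_20 = 1 - D.cdf(20)          # part (a)
q3 = D.inv_cdf(0.75)                    # part (b)
q1 = 2 * D.mean - q3                    # part (c), by symmetry about the mean
h = q1 - 1.5 * (q3 - q1)                # part (d)
k = q3 + 1.5 * (q3 - q1)
p_outlier = D.cdf(h) + (1 - D.cdf(k))   # part (e): below h or above k

print(round(p_more_than_20, 4))         # 0.8944
print(round(q1, 2), round(q3, 2))       # 24.6 35.4
print(round(h, 2), round(k, 2))         # 8.42 51.58
print(round(p_outlier, 4))              # 0.007
```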

Q14. (Q02 6683/01, Jan 2011)

Keith records the amount of rainfall, in mm, at his school, each day for a week. The results are given
below.

Jenny then records the amount of rainfall, x mm, at the school each day for the following 21 days. The
results for the 21 days are summarised below.

(a) Calculate the mean amount of rainfall during the whole 28 days.
(2)
Keith realises that he has transposed two of his figures. The number 9.4 should have been 4.9 and the
number 0.5 should have been 5.0
Keith corrects these figures.

(b) State, giving your reason, the effect this will have on the mean.
(2)
(Total 4 marks)
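For part (b), only the change in the total matters, since the mean is the total divided by 28. A small Python check of the two corrections follows; `Decimal` is used here just to keep the one-decimal-place arithmetic exact.

```python
from decimal import Decimal

recorded = [Decimal("9.4"), Decimal("0.5")]    # figures as first written
corrected = [Decimal("4.9"), Decimal("5.0")]   # figures after correction

change_in_total = sum(corrected) - sum(recorded)
change_in_mean = change_in_total / 28

print(change_in_total, change_in_mean)  # both are zero
```

One figure falls by 4.5 and the other rises by 4.5, so the total, and therefore the mean, is unchanged.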

Q15. (Q05 6683/01, Jan 2011)

On a randomly chosen day, each of the 32 students in a class recorded the time, t minutes to the nearest
minute, they spent on their homework. The data for the class is summarised in the following table.

(a) Use interpolation to estimate the value of the median.


(2)
Given that

(b) find the mean and the standard deviation of the times spent by the students on their homework.
(3)
(c) Comment on the skewness of the distribution of the times spent by the students on their homework.
Give a reason for your answer.
(2)
(Total 7 marks)

Q16. (Q03 6683/01, Jan 2011)

Over a long period of time a small company recorded the amount it received in sales per month. The
results are summarised below.

An outlier is an observation that falls


either 1.5 × interquartile range above the upper quartile
or 1.5 × interquartile range below the lower quartile.

(a) On the graph paper below, draw a box plot to represent these data, indicating clearly any outliers.
(5)

(b) State the skewness of the distribution of the amount of sales received. Justify your answer.
(2)
(c) The company claims that for 75% of the months, the amount received per month is greater than
£10 000. Comment on this claim, giving a reason for your answer.
(2)
(Total 9 marks)

Q17. (Q05 6683/01, June 2011)

A class of students had a sudoku competition. The time taken for each student to complete the sudoku
was recorded to the nearest minute and the results are summarised in the table below.

(You may use ∑fx² = 8603.75)

(a) Write down the mid-point for the 9 - 12 interval.


(1)
(b) Use linear interpolation to estimate the median time taken by the students.
(2)
(c) Estimate the mean and standard deviation of the times taken by the students.
(5)
The teacher suggested that a normal distribution could be used to model the times taken by the students
to complete the sudoku.

(d) Give a reason to support the use of a normal distribution in this case.
(1)
On another occasion the teacher calculated the quartiles for the times taken by the students to complete
a different sudoku and found

Q1 = 8.5    Q2 = 13.0    Q3 = 21.0

(e) Describe, giving a reason, the skewness of the times on this occasion.
(2)
(Total 11 marks)
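One way to justify the answer to part (e) is to compare the distances from the median to each quartile. The short Python sketch below applies that comparison to the three quartiles given. It is one of several acceptable justifications, not the only one.

```python
q1, q2, q3 = 8.5, 13.0, 21.0

lower_gap = q2 - q1   # median minus lower quartile
upper_gap = q3 - q2   # upper quartile minus median

if upper_gap > lower_gap:
    skew = "positive"
elif upper_gap < lower_gap:
    skew = "negative"
else:
    skew = "symmetric"

print(lower_gap, upper_gap, skew)  # 4.5 8.0 positive
```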

Q18. (Q01 6683/01, Jan 2012)

The histogram in Figure 1 shows the time, to the nearest minute, that a random sample of 100 motorists
were delayed by roadworks on a stretch of motorway.

Figure 1

(a) Complete the table.

(2)
(b) Estimate the number of motorists who were delayed between 8.5 and 13.5 minutes by the roadworks.
(2)

(Total 4 marks)

Q19. (Q04 6683/01, Jan 2012)

The marks, x, of 45 students randomly selected from those students who sat a mathematics examination
are shown in the stem and leaf diagram below.

(a) Write down the modal mark of these students.


(1)
(b) Find the values of the lower quartile, the median and the upper quartile.
(3)
For these students ∑x = 2497 and ∑x² = 143 369

(c) Find the mean and the standard deviation of the marks of these students.
(3)
(d) Describe the skewness of the marks of these students, giving a reason for your answer.
(2)
The mean and standard deviation of the marks of all the students who sat the examination were 55 and
10 respectively. The examiners decided that the total mark of each student should be scaled by
subtracting 5 marks and then reducing the mark by a further 10%.

(e) Find the mean and standard deviation of the scaled marks of all the students.
(4)

(Total 13 marks)
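Parts (c) and (e) can be checked with a few lines of Python. The sketch reads the two totals as Σx = 2497 and Σx² = 143 369 for the 45 students, uses the divide-by-n variance, and interprets the scaling in part (e) as new mark = 0.9 × (x − 5). These are assumptions about the intended reading.

```python
import math

# part (c): sample of 45 students
n = 45
sum_x, sum_x2 = 2497, 143369

mean = sum_x / n
sd = math.sqrt(sum_x2 / n - mean ** 2)
print(round(mean, 1), round(sd, 1))            # 55.5 10.3

# part (e): all students, scaled mark = 0.9 * (x - 5)
all_mean, all_sd = 55, 10
scaled_mean = 0.9 * (all_mean - 5)   # both the shift and the factor affect the mean
scaled_sd = 0.9 * all_sd             # only the factor affects the spread
print(round(scaled_mean, 2), round(scaled_sd, 2))  # 45.0 9.0
```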

Q20. (Q05 6683/01, June 2012)

A policeman records the speed of the traffic on a busy road with a 30 mph speed limit. He records the
speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.

(a) Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample.
(4)
(b) Estimate the value of the mean speed of the cars in the sample.
(3)
(c) Estimate, to 1 decimal place, the value of the median speed of the cars in the sample.
(2)
(d) Comment on the shape of the distribution. Give a reason for your answer.
(2)
(e) State, with a reason, whether the estimate of the mean or the median is a better representation of the
average speed of the traffic on the road.
(2)
(Total 13 marks)

Q21. (Q05 6683/01, Jan 2013)

A survey of 100 households gave the following results for weekly income £y.

(You may use ∑fy² = 12 452 800)

A histogram was drawn and the class 200 ≤ y < 240 was represented by a rectangle of width 2 cm and
height 7 cm.

(a) Calculate the width and the height of the rectangle representing the class 320 ≤ y < 400
(3)
(b) Use linear interpolation to estimate the median weekly income to the nearest pound.
(2)
(c) Estimate the mean and the standard deviation of the weekly income for these data.
(4)

One measure of skewness is [formula not shown].

(d) Use this measure to calculate the skewness for these data and describe its value.
(2)
Katie suggests using the random variable X which has a normal distribution with mean 320 and standard
deviation 150 to model the weekly income for these data.

(e) Find P(240 < X < 400).


(2)
(f) With reference to your calculations in parts (d) and (e) and the data in the table, comment on Katie's
suggestion.
(2)
(Total 15 marks)
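The probability in part (e) can be checked numerically under Katie's model, X ~ N(320, 150²). The Python sketch below uses the standard-library `statistics.NormalDist`. An answer read from printed tables, with z rounded to two decimal places, will differ slightly in the third decimal place.

```python
from statistics import NormalDist

X = NormalDist(mu=320, sigma=150)   # Katie's suggested model

p = X.cdf(400) - X.cdf(240)         # P(240 < X < 400)
print(round(p, 3))                  # about 0.406
```

Comparing this probability with the proportion of the 100 households that the table places between 240 and 400 is one way into part (f).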

Q22. (Q04 6683/01, June 2013)

The following table summarises the times, t minutes to the nearest minute, recorded for a group of
students to complete an exam.

[You may use ∑ft² = 134281.25]

(a) Estimate the mean and standard deviation of these data.


(5)
(b) Use linear interpolation to estimate the value of the median.
(2)
(c) Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
(1)
(d) Estimate the interquartile range of this distribution.
(2)
(e) Give a reason why the mean and standard deviation are not the most appropriate summary statistics
to use with these data.
(1)
The person timing the exam made an error and each student actually took 5 minutes less than the times
recorded above. The table below summarises the actual times.

(f) Without further calculations, explain the effect this would have on each of the estimates found in parts
(a), (b), (c) and (d).
(3)
(Total 14 marks)

Q23. (Q02 6683/01, June 2013)

The marks of a group of female students in a statistics test are summarised in Figure 1

Figure 1

(a) Write down the mark which is exceeded by 75% of the female students.
(1)
The marks of a group of male students in the same statistics test are summarised by the stem and leaf
diagram below.

(b) Find the median and interquartile range of the marks of the male students.
(3)
An outlier is a mark that is

either more than 1.5 × interquartile range above the upper quartile

or more than 1.5 × interquartile range below the lower quartile.

(c) In the space provided on Figure 1 draw a box plot to represent the marks of the male students,
indicating clearly any outliers.
(5)
(d) Compare and contrast the marks of the male and the female students.
(2)
(Total 11 marks)

Q24. (Q03 6683/01R, June 2013)

An agriculturalist is studying the yields, y kg, from tomato plants. The data from a random sample of 70
tomato plants are summarised below.

A histogram has been drawn to represent these data.

The bar representing the yield 5 ≤ y < 10 has a width of 1.5 cm and a height of 8 cm.

(a) Calculate the width and the height of the bar representing the yield 15 ≤ y < 25
(3)
(b) Use linear interpolation to estimate the median yield of the tomato plants.
(2)
(c) Estimate the mean and the standard deviation of the yields of the tomato plants.
(4)
(d) Describe, giving a reason, the skewness of the data.
(2)
(e) Estimate the number of tomato plants in the sample that have a yield of more than 1 standard
deviation above the mean.
(2)
(Total 13 marks)

Q25. (Q02 6683/01, June 2014)

The mark, x, scored by each student who sat a statistics examination is coded using

y = 1.4x − 20

The coded marks have mean 60.8 and standard deviation 6.60

Find the mean and the standard deviation of x.


(4)

(Total 4 marks)
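Undoing the coding y = 1.4x − 20 means rearranging to x = (y + 20) / 1.4. The subtraction shifts the mean but not the spread, while the factor 1.4 scales both. A short Python check of that reasoning:

```python
mean_y, sd_y = 60.8, 6.60   # coded mean and standard deviation

mean_x = (mean_y + 20) / 1.4   # undo the shift, then the scale factor
sd_x = sd_y / 1.4              # the shift of 20 has no effect on spread

print(round(mean_x, 1), round(sd_x, 2))  # 57.7 4.71
```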

Q26. (Q01 6683/01, June 2014)

A random sample of 35 homeowners was taken from each of the villages Greenslax and Penville and
their ages were recorded. The results are summarised in the back-to-back stem and leaf diagram below.

Key: 7 | 3 | 1 means 37 years for Greenslax and 31 years for Penville

Some of the quartiles for these two distributions are given in the table below.

(a) Find the value of a and the value of b.


(2)
An outlier is a value that falls either

more than 1.5 × (Q3 − Q1) above Q3

or more than 1.5 × (Q3 − Q1) below Q1

(b) On the graph paper opposite draw a box plot to represent the data from Penville. Show clearly any
outliers.
(4)
(c) State the skewness of each distribution. Justify your answers.
(3)
(Total 9 marks)

Q27. (Q05 6683/01R, June 2014)

The table shows the time, to the nearest minute, spent waiting for a taxi by each of 80 people one Sunday
afternoon.

(a) Write down the upper class boundary for the 2–4 minute interval.
(1)
A histogram is drawn to represent these data. The height of the tallest bar is 6 cm.

(b) Calculate the height of the second tallest bar.


(3)
(c) Estimate the number of people with a waiting time between 3.5 minutes and 7 minutes.
(2)
(d) Use linear interpolation to estimate the median, the lower quartile and the upper quartile of the waiting
times.
(4)
(e) Describe the skewness of these data, giving a reason for your answer.
(2)

(Total 12 marks)

Q28. (Q06 6683/01, June 2014)

The times, in seconds, spent in a queue at a supermarket by 85 randomly selected customers, are
summarised in the table below.

A histogram was drawn to represent these data. The 30 – 60 group was represented by a bar of width 1.5
cm and height 1 cm.

(a) Find the width and the height of the 70 – 80 group.


(3)
(b) Use linear interpolation to estimate the median of this distribution.
(2)
Given that x denotes the midpoint of each group in the table and

∑fx = 6460    ∑fx² = 529 400

(c) calculate an estimate for


(i) the mean,
(ii) the standard deviation,
for the above data.
(3)
One measure of skewness is given by

coefficient of skewness = [formula not shown]

(d) Evaluate this coefficient and comment on the skewness of these data.
(3)

(Total 11 marks)

Q29. (Q01 6683/01, June 2015)

Each of 60 students was asked to draw a 20° angle without using a protractor. The size of
each angle drawn was measured. The results are summarised in the box plot below.

(a) Find the range for these data.


(1)
(b) Find the interquartile range for these data.
(1)
The students were then asked to draw a 70° angle.
The results are summarised in the table below.

(c) Use linear interpolation to estimate the size of the median angle drawn. Give your
answer to 1 decimal place.
(2)
(d) Show that the lower quartile is 63°
(2)
For these data, the upper quartile is 75°, the minimum is 55° and the maximum is 84°

An outlier is an observation that falls either


more than 1.5 × (interquartile range) above the upper quartile or
more than 1.5 × (interquartile range) below the lower quartile.

(e) (i) Show that there are no outliers for these data.
(ii) Draw a box plot for these data on the grid on page 3.
(5)
(f) State which angle the students were more accurate at drawing. Give reasons for
your answer.
(3)
(Total for question = 14 marks)
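The check in part (e)(i) only needs the five summary values quoted for the 70° data, taking the lower quartile as the 63° that part (d) asks to be shown. A minimal Python sketch of the outlier limits:

```python
q1, q3 = 63, 75              # lower and upper quartiles, in degrees
smallest, largest = 55, 84   # minimum and maximum angles drawn

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# An outlier must lie strictly beyond a fence, so it is enough to
# check the two extreme values.
no_outliers = lower_fence <= smallest and largest <= upper_fence

print(lower_fence, upper_fence, no_outliers)  # 45.0 93.0 True
```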



Q30. (Q05 6683/01, June 2016)

A midwife records the weights, in kg, of a sample of 50 babies born at a hospital. Her results are given in
the table below.

[You may use ∑fx² = 611.375]

A histogram has been drawn to represent these data.

The bar representing the weight 2 ≤ w < 3 has a width of 1 cm and a height of 4 cm.

(a) Calculate the width and height of the bar representing a weight of 3 ≤ w < 3.5
(3)
(b) Use linear interpolation to estimate the median weight of these babies.
(2)
(c) (i) Show that an estimate of the mean weight of these babies is 3.43 kg.
(ii) Find an estimate of the standard deviation of the weights of these babies.
(3)
Shyam decides to model the weights of babies born at the hospital, by the random variable W, where
W ~ N(3.43, 0.65²)

(d) Find P(W < 3)


(3)
(e) With reference to your answers to (b), (c)(i) and (d) comment on Shyam's decision.
(3)
A newborn baby weighing 3.43 kg is born at the hospital.

(f) Without carrying out any further calculations, state, giving a reason, what effect the addition of this
newborn baby to the sample would have on your estimate of the
(i) mean,
(ii) standard deviation.
(3)

(Total for question = 17 marks)



Q31. (Q02 6683/01, June 2017)

An estate agent is studying the cost of office space in London. He takes a random sample of 90 offices
and calculates the cost, £x per square foot. His results are given in the table below.

A histogram is drawn for these data and the bar representing 50 ≤ x < 60 is 2 cm wide and 8 cm high.

(a) Calculate the width and height of the bar representing 20 ≤ x < 40
(3)
(b) Use linear interpolation to estimate the median cost.
(2)
(c) Estimate the mean cost of office space for these data.
(2)
(d) Estimate the standard deviation for these data.
(2)
(e) Describe, giving a reason, the skewness.
(1)
Rika suggests that the cost of office space in London can be modelled by a normal distribution with mean
£50 and standard deviation £10

(f) With reference to your answer to part (e), comment on Rika's suggestion.
(1)
(g) Use Rika's model to estimate the 80th percentile of the cost of office space in London.
(3)

(Total for question = 14 marks)
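Under Rika's model in part (g), the 80th percentile is the cost c with P(X < c) = 0.80 for X ~ N(50, 10²). A short numerical check using the standard-library `statistics.NormalDist` follows. It is a sketch, not the table lookup an exam answer would show.

```python
from statistics import NormalDist

cost = NormalDist(mu=50, sigma=10)   # Rika's suggested model, in £ per sq ft

p80 = cost.inv_cdf(0.80)             # value with 80% of the distribution below it
print(round(p80, 1))                 # 58.4
```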



Q32. (Q02 6683/01, June 2018)

The following grouped frequency distribution summarises the number of minutes, to the nearest minute,
that a random sample of 100 motorists were delayed by roadworks on a stretch of motorway one Monday.

A histogram has been drawn to represent these data.

The bar representing a delay of (3–6) minutes has a width of 2 cm and a height of 9.5 cm.

(a) Calculate the width and the height of the bar representing a delay of (11–15) minutes.
(3)
(b) Use linear interpolation to estimate the median delay.
(2)
(c) Calculate an estimate of the mean delay.
(2)
(d) Calculate an estimate of the standard deviation of the delays.
(2)

One coefficient of skewness is given by [formula not shown]

(e) Evaluate this coefficient for the above data, giving your answer to 2 significant figures.
(1)
On the following Friday, the coefficient of skewness for the delays on this stretch of motorway was −0.22

(f) State, giving a reason, how the delays on this stretch of motorway on Friday are different from the
delays on Monday.
(2)

(Total for question = 12 marks)



Q33. (Q01 6683/01, June 2019)

The box plot summarises the weights of luggage for a group of tourists at an airport.

(a) Write down the range.


(1)
A quarter of the tourists had to pay an extra charge because their luggage was too heavy.

(b) Write down an estimate of the maximum weight that is allowed before having to pay the extra charge.
(1)
Four tourists with luggage weighing 10 kg, 17 kg, 21 kg and 25 kg join the group.

(c) State what, if any, changes this will make to the box plot.
Give a reason for your answer.
(1)

(Total for question = 3 marks)



Q34. (Q02 6683/01, June 2019)

Peta recorded the times, t minutes, a group of children took to swim 300 metres. She summarised these
times in the following histogram.

(a) Calculate the number of children who took part in the swim.
(2)
Adam used the histogram to estimate the mean and standard deviation of the times taken by the children
to complete the swim.

(b) Find Adam's estimate of


(i) the mean,
(2)
(ii) the standard deviation.
(3)
Adam used linear interpolation to estimate the median time taken by the children to complete the swim.

(c) Find Adam's estimate of the median.


(2)
Peta also calculated the mean, standard deviation and median of the times taken by the children but she
used each child's actual time taken to complete the swim. She obtained a mean time of 20.8 minutes, a
standard deviation of 5.51 minutes and a median time of 20.5 minutes.

(d) Explain an assumption Adam made about these data that has led him to get different answers to
Peta.
(1)
Adam and Peta each calculate a coefficient of skewness by using their statistics in the formula [formula not shown]

(e) (i) Calculate the coefficients of skewness found by Adam and Peta.
(ii) Suggest how Peta could improve her histogram to describe the data more accurately.
(2)

(Total for question = 12 marks)



Q35. (Q02 WST01/01, Jan 2014)

A rugby club coach uses club records to take a random sample of 15 players from 1990 and an
independent random sample of 15 players from 2010. The body weight of each player was recorded to
the nearest kg and the results from 2010 are summarised in the table below.

(a) Find the estimated values in kg of the summary statistics a, b and c in the table below.

Give your answers to 3 significant figures.


(6)
The rugby coach claims that players' body weight increased between 1990 and 2010.

(b) Using the table in part (a), comment on the rugby coach's claim.
(2)

(Total for question = 8 marks)



Q36. (Q08 WST01/01, Jan 2014)

A manager records the number of hours of overtime claimed by 40 staff in a month.

The histogram in Figure 1 represents the results.

Figure 1

(a) Calculate the number of staff who have claimed less than 10 hours of overtime in the month.
(4)
(b) Estimate the median number of hours of overtime claimed by these 40 staff in the month.
(2)
(c) Estimate the mean number of hours of overtime claimed by these 40 staff in the month.
(2)
The manager wants to compare these data with overtime data he collected earlier to find out if the
overtime claimed by staff has decreased.

(d) State, giving a reason, whether the manager should use the median or the mean to compare the
overtime claimed by staff.
(2)

(Total for question = 10 marks)



Q37. (Q02 WST01/01, Jan 2015)

A sports teacher recorded the number of press-ups done by his students in two minutes.
He recorded this information for a Year 7 class and for a Year 11 class.

The back-to-back stem and leaf diagram shows this information.

Key: 2 | 4 | 0 means 42 press-ups for a Year 7 student and 40 press-ups for a Year 11 student

(a) Find the median number of press-ups for each class.


(2)
For the Year 11 class, the lower quartile is 38 and the upper quartile is 59

(b) Find the lower quartile and the upper quartile for the Year 7 class.
(2)
(c) Use the medians and quartiles to describe the skewness of each of the two distributions.
(3)
(d) Give two reasons why the normal distribution should not be used to model the number
of press-ups done by the Year 11 class.
(2)

(Total for question = 9 marks)



Q38. (Q06 WST01/01, Jan 2016)

Yujie is investigating the weights of 10 young rabbits. She records the weight, x grams, of each rabbit and
the results are summarised below.

Σx = 8360 and Σ(x − x̄)² = 63840

(a) Calculate the mean and the standard deviation of the weights of these rabbits.
(3)

Given that the median weight of these rabbits is 815 grams,

(b) describe, giving a reason, the skewness of these data.


(2)

Two more rabbits weighing 776 grams and 896 grams are added to make a group of 12 rabbits.

(c) State, giving a reason, how the inclusion of these two rabbits would affect the mean.
(2)

(d) By considering the change in Σ(x − x̄)², state what effect the inclusion of these two rabbits would have
on the standard deviation.
(2)

(Total for question = 9 marks)
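
Worked sketch for this question, reading the summary statistics as Σx = 8360 and Σ(x − x̄)² = 63840 for n = 10 (the summation signs are assumed, since they are missing from some copies of the text). It uses the divide-by-n standard deviation. The variable names are illustrative.

```python
import math

n = 10
sum_x = 8360          # sum of the weights, as read from the question
sum_sq_dev = 63840    # sum of squared deviations from the mean

mean = sum_x / n
sd = math.sqrt(sum_sq_dev / n)

# Parts (c) and (d): add rabbits weighing 776 g and 896 g.
new = [776, 896]
new_n = n + len(new)
new_mean = (sum_x + sum(new)) / new_n

# The two new weights average to the old mean, so the mean is unchanged
# and their squared deviations simply add to the existing total.
assert new_mean == mean
new_sum_sq_dev = sum_sq_dev + sum((x - new_mean) ** 2 for x in new)
new_sd = math.sqrt(new_sum_sq_dev / new_n)

print(mean, round(sd, 1), new_mean, round(new_sd, 1))
```

The sum of squared deviations grows by 2 × 60² = 7200, about 11%, while n grows by 20%, so the standard deviation falls.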



Q39. (Q01 WST01/01, Jan 2017)

Ralph records the weights, in grams, of 100 tomatoes. This information is displayed in the histogram
below.

Given that 5 of the tomatoes have a weight between 2 and 3 grams,

(a) find the number of tomatoes with a weight between 0 and 2 grams.
(2)
One of the tomatoes is selected at random.

(b) Find the probability that it weighs more than 3 grams.


(2)
(c) Estimate the proportion of the tomatoes with a weight greater than 6.25 grams.
(2)
(d) Using your answer to part (c), explain whether or not the median is greater than 6.25 grams.
(1)
Given that the mean weight of these tomatoes is 6.25 grams and using your answer to part (d),

(e) describe the skewness of the distribution of the weights of these tomatoes. Give a reason for your
answer.
(1)
Two of these 100 tomatoes are selected at random.

(f) Estimate the probability that both tomatoes weigh within 0.75 grams of the mean.
(4)

(Total for question = 12 marks)



Q40. (Q01 WST01/01, June 2017)

Nina weighed a random sample of 50 carrots from her shop and recorded the weight, in grams to the
nearest gram, for each carrot. The results are summarised below.

(a) Use linear interpolation to estimate the median weight of these carrots.
(2)
(b) Find an estimate for the mean weight of these carrots.
(2)
(c) Find an estimate for the standard deviation of the weights of these carrots.
(2)
A carrot is selected at random from Nina's shop.

(d) Estimate the probability that the weight of this carrot is more than 70 grams.
(2)

(Total for question = 8 marks)
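
Sketch of the linear interpolation method asked for in part (a). The frequency table for this question is not reproduced in the text, so the data below are hypothetical and only illustrate the technique. The function name `interpolated_median` is illustrative, and the sketch assumes the median is taken as the n/2-th value. Because weights are recorded to the nearest gram, the classes are given by their boundaries (for example 44.5 to 54.5).

```python
def interpolated_median(classes):
    """classes: ordered list of (lower_boundary, upper_boundary, frequency).

    Returns the median estimated by linear interpolation within the
    class that contains the n/2-th value.
    """
    n = sum(f for _, _, f in classes)
    target = n / 2
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            # Fraction of the way through this class needed to reach the target.
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("frequency table is empty")

# Hypothetical table, NOT the data from this question.
example = [(44.5, 54.5, 10), (54.5, 64.5, 25), (64.5, 74.5, 15)]
print(interpolated_median(example))
```

The same function applies to the other interpolation parts in this collection once the relevant class boundaries and frequencies are read from each table.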



Q41. (Q01 WST01/01, Jan 2018)

Two classes of students, class A and class B, sat a test.

Class A has 10 students. Class B has 15 students.

Each student achieved a score, x, on the test and their scores are summarised in the table below.

The mean score for Class A is 77 and the mean score for Class B is 61

(a) Find the value of t


(1)
(b) Calculate the variance of the test scores for each class.
(3)
The highest score on the test was 95 and the lowest score was 45

These were each scored by students from the same class.

(c) State, with a reason, which class you believe they were from.
(1)
The two classes are combined into one group of 25 students.

(d) (i) Find the mean test score for all 25 students.
(ii) Find the variance of the test scores for all 25 students.
(4)
The teacher of class A later realises that he added up the test scores for his class incorrectly. Each
student's test score in class A should be increased by 3

(e) Without further calculations, state, with a reason, the effect this will have on
(i) the variance of the test scores for class A
(ii) the mean test score for all 25 students
(iii) the variance of the test scores for all 25 students.
(3)

(Total for question = 12 marks)



Q42. (Q02 WST01/01, June 2018)

Two youth clubs, Eastyou and Westyou, decided to raise money for charity by running a 5 km race. All
the members of the youth clubs took part and the time, in minutes, taken for each member to run the 5 km
was recorded.
The times for the Westyou members are summarised in Figure 1.

(a) Write down the time that is exceeded by 75% of Westyou members.
(1)
The times for the Eastyou members are summarised by the stem and leaf diagram below.

(b) Find the value of the median and interquartile range for the Eastyou members.
(3)
An outlier is a value that falls either

more than 1.5 × (Q3 − Q1) above Q3

or more than 1.5 × (Q3 − Q1) below Q1

(c) On the grid on page 7, draw a box plot to represent the times of the Eastyou members.
(4)
(d) State the skewness of each distribution. Give reasons for your answers.
(3)

(Total for question = 11 marks)



Q43. (Q05 WST01/01, June 2018)

The weights, in grams, of a random sample of 48 broad beans are summarised in the table.

(You may assume Σfy² = 101.56)

A histogram was drawn to represent these data. The 2.1 < x ≤ 2.7 class was represented by a bar of
width 1.5 cm and height 1 cm.

(a) Find the width and height of the 0.9 < x ≤ 1.1 class.
(3)
(b) Give a reason to justify the use of a histogram to represent these data.
(1)
(c) Estimate the mean and the standard deviation of the weights of these broad beans.
(4)
(d) Use linear interpolation to estimate the median of the weights of these broad beans.
(2)
One of these broad beans is selected at random.

(e) Estimate the probability that its weight lies between 1.1 grams and 1.6 grams.
(1)
One of these broad beans having a recorded weight of 0.95 grams was incorrectly weighed. The correct
weight is 1.4 grams.

(f) State, giving a reason, the effect this would have on your answers to part (c). Do not carry out any
further calculations.
(2)

(Total for question = 13 marks)



Q44. (Q04 WST01/01, Jan 2019)

A group of 100 adults recorded the amount of time, t minutes, they spent exercising each day. Their
results are summarised in the table below.

[You may use Σfx² = 455 512.5]

A histogram is drawn to represent these data.

The bar representing the time 0 ≤ t < 15 has width 0.5 cm and height 6 cm.

(a) Calculate the width and height of the bar representing a time of 60 ≤ t < 120
(3)
(b) Use linear interpolation to estimate the median time spent exercising by these adults each day.
(2)
(c) Find an estimate of the mean time spent exercising by these adults each day.
(2)
(d) Calculate an estimate for the standard deviation of these times.
(2)
(e) Describe, giving a reason, the skewness of these data.
(1)
Further analysis of the above data revealed that 18 of the 25 adults in the 0 ≤ t < 15 group took no
exercise each day.

(f) State, giving a reason, what effect, if any, this new information would have on your answers to
(i) the estimate of the median in part (b),
(ii) the estimate of the mean in part (c),
(iii) the estimate of the standard deviation in part (d).
(3)

(Total for question = 13 marks)



Q45. (Q01 WST01/01, June 2019)

The heights, x metres, of 40 children were recorded by a teacher. The results are summarised as follows

(a) Find the mean and the variance of the heights of these 40 children.
(3)
The teacher decided that these statistics would be more useful in centimetres.

(b) Find
(i) the mean of these heights in centimetres,
(ii) the standard deviation of these heights in centimetres.
(2)
Two more children join the group. Their heights are 130 cm and 160 cm.

(c) (i) State, giving a reason, the mean height of the 42 children.
(ii) Without recalculating the standard deviation, state, giving a reason, whether the standard deviation
of the heights of the 42 children will be greater than, less than or the same as the standard deviation of
the heights of the group of 40 children.
(4)

(Total for question = 9 marks)



Q46. (Q02 WST01/01, June 2019)

Chi wanted to summarise the scores of the 39 competitors in a village quiz. He started to produce the
following stem and leaf diagram

He did not complete the stem and leaf diagram but instead produced the following box plot.

Chi defined an outlier as a value that is


greater than Q3 + 1.5 × (Q3 − Q1)
or
less than Q1 − 1.5 × (Q3 − Q1)
(a) Find
(i) the interquartile range
(ii) the range.
(2)
(b) Describe, giving a reason, the skewness of the distribution of scores.
(2)
Albert and Beth asked for their scores to be checked.

Albert's score was changed from 25 to 37


Beth's score was changed from 54 to 60

(c) On the grid, draw an updated box plot.


Show clearly any calculations that you used.

(7)
Some of the competitors complained that the questions were biased towards the younger generation. The
product moment correlation coefficient between the age of the competitors and their score in the quiz is
−0.187

(d) State, giving a reason, whether or not the complaint is supported by this statistic.
(2)
(Total for question = 13 marks)

Q47. (Q04 WST01/01, Oct 2020)

A group of students took some tests. A teacher is analysing the average mark for each student. Each
student obtained a different average mark.

For these average marks, the lower quartile is 24, the median is 30 and the interquartile range (IQR) is 10

The three lowest average marks are 8, 10 and 15.5 and the three highest average marks are 45, 52.5 and
56

The teacher defines an outlier to be a value that is either

more than 1.5 × IQR below the lower quartile or


more than 1.5 × IQR above the upper quartile

(a) Determine any outliers in these data.


(4)
(b) On the grid below draw a box plot for these data, indicating clearly any outliers.
(3)

(c) Use the quartiles to describe the skewness of these data.

Give a reason for your answer.
(2)


Two more students also took the tests. Their average marks, which were both less than 45, are added to
the data and the box plot redrawn.

The median and the upper quartile are the same but the lower quartile is now 26

(d) Redraw the box plot on the grid below.


(3)

(e) Give ranges of values within which each of these students' average marks must lie.
(2)

Only use these grids if you need to redraw your answer for part (b) or part (d).

Copy of grid for part (b)


Copy of grid for part (d)

(Total for question = 14 marks)
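
Worked sketch for part (a), using only the values stated in the question: lower quartile 24 and IQR 10, so the upper quartile is 34. Only the three lowest and three highest marks are checked, since every other mark lies between 15.5 and 45. The helper name `outlier_fences` is illustrative.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) limits beyond which a value is an outlier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

q1, iqr = 24, 10
q3 = q1 + iqr
low, high = outlier_fences(q1, q3)

# The three lowest and three highest average marks from the question.
extremes = [8, 10, 15.5, 45, 52.5, 56]
outliers = [x for x in extremes if x < low or x > high]
print(low, high, outliers)
```

The same helper, with `k` changed as needed, fits the other outlier definitions used in this collection.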



Q48. (Q02 WST11/01, Specimen papers) (Q002 WST01/01, June 2016)

The time taken to complete a puzzle, in minutes, is recorded for each person in a club. The times are
summarised in a grouped frequency distribution and represented by a histogram.

One of the class intervals has a frequency of 20 and is shown by a bar of width 1.5 cm and height 12 cm
on the histogram. The total area under the histogram is 94.5 cm²

Find the number of people in the club.

(Total for question = 3 marks)
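
Worked sketch: in a histogram, area is proportional to frequency, so the known bar fixes the scale. The variable names are illustrative.

```python
bar_frequency = 20
bar_area = 1.5 * 12          # width x height of the known bar, in cm^2
total_area = 94.5            # total area under the histogram, in cm^2

# Area is proportional to frequency: people = total_area * (frequency / area).
people = total_area * bar_frequency / bar_area
print(people)
```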



Q49. (Q04 WST11/01, Specimen papers) (Q004 WST01/01, June 2016)

A researcher recorded the time, t minutes, spent using a mobile phone during a particular afternoon, for
each child in a club.

The researcher coded the data using and the results are summarised in the table below.

(a) Write down the value of a and the value of b.


(1)
(b) Calculate an estimate of the mean of v.
(1)
(c) Calculate an estimate of the standard deviation of v.
(2)
(d) Use linear interpolation to estimate the median of v.
(2)
(e) Hence describe the skewness of the distribution. Give a reason for your answer.
(2)
(f) Calculate estimates of the mean and the standard deviation of the time spent using a mobile phone
during the afternoon by the children in this club.
(4)

(Total for question = 12 marks)



Q50. (Q02 WST01/01, Jan 2021)

The stem and leaf diagram below shows the ages (in years) of the residents in a care home.

(a) Find the median age of the residents.


(1)
(b) Find the interquartile range (IQR) of the ages of the residents.
(2)
An outlier is defined as a value that is either
more than 1.5 × (IQR) below the lower quartile or
more than 1.5 × (IQR) above the upper quartile.
(c) Determine any outliers in these data. Show clearly any calculations that you use.
(3)
(d) On the grid below, draw a box plot to summarise these data.
(3)

(Total for question = 9 marks)



Q51. (Q06 WST01/01, Jan 2021)

A disc of radius 1 cm is rolled onto a horizontal grid of rectangles so that the disc is equally likely to land
anywhere on the grid. Each rectangle is 5 cm long and 3 cm wide. There are no gaps between the
rectangles and the grid is sufficiently large so that no discs roll off the grid.

If the disc lands inside a rectangle without covering any part of the edges of the rectangle then a prize is
won.

By considering the possible positions for the centre of the disc,

(a) show that the probability of winning a prize on any particular roll is 1/5
(3)
A group of 15 students each roll the disc onto the grid twenty times and record the number of times, x,
that each student wins a prize. Their results are summarised as follows

(b) Find the standard deviation of the number of prizes won per student.
(2)
A second group of 12 students each roll the disc onto the grid twenty times and the mean number of
prizes won per student is 3.5 with a standard deviation of 2

(c) Find the mean and standard deviation of the number of prizes won per student for the whole group of
27 students.
(7)
The 27 students also recorded the number of times that the disc covered a corner of a rectangle and
estimated the probability to be 0.2216 (to 4 decimal places).

(d) Explain how this probability could be used to find an estimate for the value of π and state the value of
your estimate.
(3)

(Total for question = 15 marks)
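
Worked sketch for parts (a) and (d), based on where the centre of the disc can land within one 5 cm by 3 cm rectangle. Parts (b) and (c) need the summary statistics that are not reproduced in the text, so they are left out. The variable names are illustrative.

```python
length, width, radius = 5, 3, 1
rectangle_area = length * width

# Part (a): to win, the centre must be at least `radius` from every edge,
# which leaves an inner rectangle of (5 - 2) by (3 - 2).
p_win = (length - 2 * radius) * (width - 2 * radius) / rectangle_area

# Part (d): the disc covers a corner when its centre is within `radius`
# of that corner. Inside one rectangle this region is four quarter-circles,
# total area pi * radius^2, so P(corner) = pi * radius^2 / rectangle_area.
p_corner = 0.2216                      # estimate quoted in the question
pi_estimate = p_corner * rectangle_area / radius ** 2

print(p_win, round(pi_estimate, 3))
```

The quarter-circles do not overlap because the corners are at least 3 cm apart, which is more than twice the radius.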



Q52. (Q03 WST01/01, June 2021)

A random sample of 100 carrots is taken from a farm and their lengths, L cm, recorded.
The data are summarised in the following table.

A histogram is drawn to represent these data.


The bar representing the class 5 ≤ L < 8 is 1.5 cm wide and 1 cm high.

(a) Find the width and height of the bar representing the class 15 ≤ L < 20
(3)
(b) Use linear interpolation to estimate the median length of these carrots.
(2)
(c) Estimate
(i) the mean length of these carrots,
(2)
(ii) the standard deviation of the lengths of these carrots.
(3)
A supermarket will only buy carrots with length between 9 cm and 22 cm.

(d) Estimate the proportion of carrots from the farm that the supermarket will buy.
(2)
Any carrots that the supermarket does not buy are sold as animal feed.

The farm makes a profit of 2.2 pence on each carrot sold to the supermarket, a profit of 0.8 pence on
each carrot longer than 22 cm and a loss of 1.2 pence on each carrot shorter than 9 cm.

(e) Find an estimate of the mean profit per carrot made by the farm.
(2)

(Total for question = 14 marks)



Q53. (Q03 WST01/01, Oct 2021)

The stem and leaf diagram shows the ages of the 35 male passengers on a cruise.

(a) Find the median age of the male passengers.


(1)
(b) Show that the interquartile range (IQR) of these ages is 16
(2)
An outlier is defined as a value that is more than
1.5 × IQR above the upper quartile
or
1.5 × IQR below the lower quartile
(c) Show that there are 3 outliers amongst these ages.
(3)
(d) On the grid in Figure 1 below, draw a box plot for the ages of the male passengers on the cruise.
(4)
Figure 1 below also shows a box plot for the ages of the female passengers on the cruise.

(e) Comment on any difference in the distributions of ages of male and female passengers on the cruise.
State the values of any statistics you have used to support your comment.
(1)
Anja, along with her 2 daughters and a granddaughter, now join the cruise.
Anja's granddaughter is younger than both of Anja's daughters.
Anja had her 23rd birthday on the day her eldest daughter was born.
When their 4 ages are included with the other female passengers on the cruise, the box plot does not
change.

(f) State, giving reasons, what you can say about


(i) the granddaughter's age
(ii) Anja's age.
(3)

(Total for question = 14 marks)



Q54. (Q03 WST01/01, Jan 2022)

The stem and leaf diagram shows the number of deliveries made by Pat each day for 24 days

where a, b and c are positive integers with a < b < c

An outlier is defined as any value greater than 1.5 × interquartile range above the upper quartile.

Given that there is only one outlier for these data,

(a) show that c = 9


(3)
The number of deliveries made by Pat each day is represented by d

The data in the stem and leaf diagram are coded using

x = d – 125

and the following summary statistics are obtained

(b) Find the mean number of deliveries.


(3)
(c) Find the standard deviation of the number of deliveries.
(2)
One of these 24 days is selected at random. The random variable D represents the number of deliveries
made by Pat on this day.

The random variable X = D – 125

(d) Find P(D > 118 |X < 0)


(2)

(Total for question = 10 marks)



Q55. (Q01 WST01/01, June 2022)

The company Seafield requires contractors to record the number of hours they work each week. A
random sample of 38 weeks is taken and the number of hours worked per week by contractor Kiana is
summarised in the stem and leaf diagram below.

The quartiles for this distribution are summarised in the table below.

(a) Find the values of w, x and y


(3)
Kiana is looking for outliers in the data. She decides to classify as outliers any observations greater than

Q3 + 1.0 × (Q3 - Q1)

(b) Showing your working clearly, identify any outliers that Kiana finds.
(2)
(c) Draw a box plot for these data in the space provided on the grid opposite.
(3)
(d) Use the formula

to find the skewness of these data. Give your answer to 2 significant figures.
(2)
Kiana's new employer, Landacre, wishes to know the average number of hours per week she worked
during her employment at Seafield to help calculate the cost of employing her.

(e) Explain why Landacre might prefer to know Kiana's mean, rather than median, number of hours
worked per week.
(1)

(Total for question = 11 marks)



Q56. (Q03 WST01/01, June 2022)

Gill buys a bag of logs to use in her stove. The lengths, l cm, of the 88 logs in the bag are summarised in
the table below.

A histogram is drawn to represent these data.

The bar representing logs with length has a width of 1.5 cm and a height of 4 cm.

(a) Calculate the width and height of the bar representing log lengths of
(3)
(b) Use linear interpolation to estimate the median of l
(2)
The maximum length of log Gill can use in her stove is 26 cm.

Gill estimates, using linear interpolation, that x logs from the bag will fit into her stove.

(c) Show that x = 62


(1)
Gill randomly selects 4 logs from the bag.

(d) Using x = 62 , find the probability that all 4 logs will fit into her stove.
(2)
The weights, W grams, of the logs in the bag are coded using y = 0.5w - 255 and summarised by

n = 88    ∑y = 924    ∑y² = 12 862

(e) Calculate
(i) the mean of W
(3)
(ii) the variance of W
(3)

(Total for question = 14 marks)
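
Worked sketch for parts (d) and (e), using only figures stated in the question: x = 62 of the 88 logs fit, and the coded summaries n = 88, ∑y = 924, ∑y² = 12 862 with y = 0.5w − 255. The variable names are illustrative, and the divide-by-n variance is assumed.

```python
from math import prod  # available from Python 3.8

# Part (d): 4 logs chosen without replacement, 62 of the 88 fit the stove.
p_all_fit = prod((62 - i) / (88 - i) for i in range(4))

# Part (e): summary statistics for the coded variable y.
n, sum_y, sum_y2 = 88, 924, 12862
mean_y = sum_y / n
var_y = sum_y2 / n - mean_y ** 2

# Undo the coding: y = 0.5w - 255  gives  w = 2y + 510.
mean_w = 2 * mean_y + 510      # a shift and a scale both affect the mean
var_w = 2 ** 2 * var_y         # only the scale factor affects the variance

print(round(p_all_fit, 4), mean_w, round(var_w, 1))
```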



Q57. (Q01 WST01/01, Oct 2022)

The stem lengths of a sample of 120 tulips are recorded in the grouped frequency table below.

A histogram is drawn to represent these data.

The area of the bar representing the class is 16.5 cm²

(a) Calculate the exact area of the bar representing the class.
(2)
The height of the tallest bar in the histogram is 10 cm.

(b) Find the exact height of the second tallest bar.


(3)
Q1 for these data is 45 cm.

(c) Use linear interpolation to find an estimate for


(i) Q2
(ii) the interquartile range.
(4)
One measure of skewness is given by

(d) By calculating this measure, describe the skewness of these data.


(2)

(Total for question = 11 marks)



Q58. (Q03 WST01/01, Oct 2022)

Morgan is investigating the body length, b centimetres, of squirrels.

A random sample of 8 squirrels is taken and the data for each squirrel is coded using

The results for the coded data are summarised below

(a) Find the mean of b


(3)
(b) Find the standard deviation of b
(3)
A 9th squirrel is added to the sample. Given that for all 9 squirrels

(c) find
(i) the body length of the 9th squirrel,
(2)
(ii) the standard deviation of x for all 9 squirrels.
(2)

(Total for question = 10 marks)



Q59. (Q01 WST01/01, Jan 2023)

The histogram shows the times taken, t minutes, by each of 100 people to swim 500 metres.

(a) Use the histogram to complete the frequency table for the times taken by the 100 people to swim 500
metres.

(1)
(b) Estimate the number of people who took less than 16 minutes to swim 500 metres.
(2)
(c) Find an estimate for the mean time taken to swim 500 metres.
(2)
Given that ∑ft² = 41 033

(d) find an estimate for the standard deviation of the times taken to swim 500 metres.
(2)
Given that Q3 = 23

(e) use linear interpolation to estimate the interquartile range of the times taken to swim 500 metres.
(3)

(Total for question = 10 marks)



Q60. (Q03 WST01/01, June 2023)

Jim records the length, l mm, of 81 salmon. The data are coded using x = l – 600 and the following
summary statistics are obtained.

(a) Find the mean length of these salmon.


(3)
(b) Find the variance of the lengths of these salmon.
(2)
The weight, w grams, of each of the 81 salmon is recorded to the nearest gram.
The recorded results for the 81 salmon are summarised in the box plot below.

(c) Find the maximum number of salmon that have weights in the interval

4600 < w ≤ 7700


(1)
Raj says that the box plot is incorrect as Jim has not included outliers.

For these data an outlier is defined as a value that is more than

1.5 × IQR above the upper quartile or 1.5 × IQR below the lower quartile

(d) Show that there are no outliers.


(3)

(Total for question = 9 marks)



Q61. (Q01 WST01/01, June 2023)

The histogram shows the distances, in km, that 274 people travel to work.

Given that 60 of these people travel between 10 km and 20 km to work, estimate

(a) the number of people who travel between 22 km and 45 km to work,


(3)
(b) the median distance travelled to work by these 274 people,
(2)
(c) the mean distance travelled to work by these 274 people.
(3)

(Total for question = 8 marks)
