S1 HW Booklet Students
Name
Teacher
Grid
Contents Marks %
Representation and Summary
Boxplot
Histograms
Probability
Correlation and Regression
Discrete Random Variable
Normal Distribution
Munem Ahmed
Representation and Summary
1. The number of caravans on Seaview caravan site on each night in August last year is
summarised in the following stem and leaf diagram.
During the same month, the least number of caravans on Northcliffe caravan site was
31. The maximum number of caravans on this site on any night that month was 72. The
three quartiles for this site were 38, 45 and 52 respectively.
(b) On page 4 and using the same scale, draw box plots to represent the data for
both caravan sites. You may assume that there are no outliers.
(6)
2. Over a period of time, the number of people x leaving a hotel each morning was
recorded. These data are summarised in the stem and leaf diagram below.
(mean – mode) / standard deviation.
(d) Evaluate this measure to show that these data are negatively skewed.
(2)
(e) Give two other reasons why these data are negatively skewed.
(4)
(Total 14 marks)
3. In a study of how students use their mobile telephones, the phone usage of a random
sample of 11 students was examined for a particular week.
17, 23, 35, 36, 51, 53, 54, 55, 60, 77, 110
A value that is greater than Q3 + 1.5 × (Q3 – Q1) or smaller than Q1 – 1.5 × (Q3 – Q1) is
defined as an outlier.
(c) Using the graph below draw a box plot for these data indicating clearly the
position of the outlier.
(3)
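The outlier rule above can be checked numerically. The sketch below is a minimal illustration: the helper `quartile` is a hypothetical name, and it assumes one common textbook convention for quartiles of a small data set (round the position up when it is not whole, average two neighbouring values when it is). Other conventions can give slightly different quartiles.

```python
import math

def quartile(sorted_values, fraction):
    """Quartile under an assumed textbook convention:
    position = fraction * n; if whole, average that value and the next,
    otherwise round the position up."""
    n = len(sorted_values)
    position = fraction * n
    if position == int(position):
        k = int(position)
        return (sorted_values[k - 1] + sorted_values[k]) / 2
    return sorted_values[math.ceil(position) - 1]

# Phone usage values listed in the question
data = sorted([17, 23, 35, 36, 51, 53, 54, 55, 60, 77, 110])

q1 = quartile(data, 0.25)
q3 = quartile(data, 0.75)
iqr = q3 - q1

# Limits from the definition: Q1 - 1.5 x (Q3 - Q1) and Q3 + 1.5 x (Q3 - Q1)
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [v for v in data if v < lower_limit or v > upper_limit]
print(q1, q3, lower_limit, upper_limit, outliers)
```

Under this convention only one value lies beyond the limits, which agrees with the question's reference to "the outlier".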
These 10 students were each asked how many text messages, x, they sent in the same week.
The values of Sxx and Sxy for these 10 students are Sxx = 3463.6 and Sxy = –18.3.
(e) Calculate the product moment correlation coefficient between the number of
text messages sent and the total length of calls for these 10 students.
(2)
A parent believes that a student who sends a large number of text messages will spend
fewer minutes on calls.
(f) Comment on this belief in the light of your calculation in part (e).
(1)
(Total 14 marks)
Boxplot
1. Aeroplanes fly from City A to City B. Over a long period of time the number of minutes
delay in take-off from City A was recorded. The minimum delay was 5 minutes and the
maximum delay was 63 minutes. A quarter of all delays were at most 12 minutes, half
were at most 17 minutes and 75% were at most 28 minutes. Only one of the delays was
longer than 45 minutes.
An outlier is an observation that falls either more than 1.5 × (interquartile range) above
the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.
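As a rough check of how this definition applies to the delay data, the following sketch computes the two outlier limits from the quartiles stated in the question. The function name `outlier_limits` is illustrative and not part of the question.

```python
def outlier_limits(lower_quartile, upper_quartile):
    """Return (lower limit, upper limit) using the 1.5 x IQR rule."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - 1.5 * iqr, upper_quartile + 1.5 * iqr

# From the question: a quarter of delays were at most 12 minutes,
# and 75% were at most 28 minutes.
low, high = outlier_limits(12, 28)
max_delay = 63

print(low, high)         # limits in minutes
print(max_delay > high)  # True means the maximum delay counts as an outlier
```

A negative lower limit simply means that no delay can be an outlier at the low end.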
(c) Suggest how the distribution might be interpreted by a passenger who frequently
flies from City A to City B.
(1)
(Total 10 marks)
2. (a) Describe the main features and uses of a box plot.
(3)
Children from school A and B took part in a fun run for charity. The times to the nearest
minute, taken by the children from school A are summarised in the figure below.
[Figure: box plot for School A. Horizontal axis: Time (minutes), scale 10 to 60.]
(b) (i) Write down the time by which 75% of the children in school A had
completed the run.
(1)
(c) Explain what you understand by the two crosses (×) on the figure above.
(2)
For school B the least time taken by any of the children was 25 minutes and the longest time
was 55 minutes. The three quartiles were 30, 37 and 50 respectively.
(4)
Histograms
1. The variable x was measured to the nearest whole number. Forty observations are
given in the table below.
x 10 – 15 16 – 18 19 –
Frequency 15 9 16
A histogram was drawn and the bar representing the 10 – 15 class has a width of 2 cm
and a height of 5 cm. For the 16 – 18 class find
2. A teacher selects a random sample of 56 students and records, to the nearest hour, the
time spent watching television in a particular week.
(a) Find the mid-points of the 21–25 hour and 31–40 hour groups.
(2)
A histogram was drawn to represent these data. The 11–20 group was represented by a
bar of width 4 cm and height 6 cm.
(c) Estimate the mean and standard deviation of the time spent watching television
by these students.
(5)
(d) Use linear interpolation to estimate the median length of time spent watching
television by these students.
(2)
The teacher estimated the lower quartile and the upper quartile of the time spent
watching television to be 15.8 and 29.3 respectively.
3. In a shopping survey a random sample of 104 teenagers were asked how many hours,
to the nearest hour, they spent shopping in the last month. The results are summarised
in the table below.
A histogram was drawn and the group (8 – 10) hours was represented by a rectangle
that was 1.5 cm wide and 3 cm high.
(a) Calculate the width and height of the rectangle representing the group (16 – 25)
hours.
(3)
(b) Use linear interpolation to estimate the median and interquartile range.
(5)
(c) Estimate the mean and standard deviation of the number of hours spent
shopping.
(4)
(e) State, giving a reason, which average and measure of dispersion you would
recommend to use to summarise these data.
(2)
(Total 16 marks)
Probability
1. A jar contains 2 red, 1 blue and 1 green bead. Two beads are drawn at random from the
jar without replacement.
(a) Draw a tree diagram to illustrate all the possible outcomes and associated
probabilities. State your probabilities clearly.
(3)
(b) Find the probability that a blue bead and a green bead are drawn from the jar.
(2)
(Total 5 marks)
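The answer to part (b) can be checked by listing every ordered pair of draws, which mirrors the branches of the tree diagram. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

beads = ["red", "red", "blue", "green"]

# Every ordered pair of distinct beads is equally likely without replacement.
draws = list(permutations(range(len(beads)), 2))

# Keep the draws that give one blue and one green bead, in either order.
favourable = [
    (i, j) for i, j in draws if {beads[i], beads[j]} == {"blue", "green"}
]

p_blue_and_green = Fraction(len(favourable), len(draws))
print(p_blue_and_green)
```

Treating the two red beads as distinguishable keeps all 12 ordered draws equally likely, which is what makes simple counting valid here.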
2. A fair die has six faces numbered 1, 2, 2, 3, 3 and 3. The die is rolled twice and the
number showing on the uppermost face is recorded each time.
Find the probability that the sum of the two numbers recorded is at least 5.
(Total 5 marks)
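One way to check an answer to this question is to enumerate all 36 equally likely pairs of faces. The sketch below does this with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 2, 3, 3, 3]

# Each of the 6 x 6 ordered pairs of faces is equally likely for a fair die.
rolls = list(product(faces, repeat=2))
at_least_five = [pair for pair in rolls if sum(pair) >= 5]

p_sum_at_least_5 = Fraction(len(at_least_five), len(rolls))
print(p_sum_at_least_5)
```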
3. A company assembles drills using components from two sources. Goodbuy supplies 85%
of the components and Amart supplies the rest. It is known that 3% of the components
supplied by Goodbuy are faulty and 6% of those supplied by Amart are faulty.
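A setup like this is usually handled with a two-stage tree diagram. As a hedged illustration (the two quantities below are chosen for illustration and are not necessarily the ones asked for), this sketch finds the overall probability that a component is faulty and the conditional probability that a faulty component came from Amart.

```python
from fractions import Fraction

p_goodbuy = Fraction(85, 100)
p_amart = 1 - p_goodbuy  # Amart supplies the rest

p_faulty_given_goodbuy = Fraction(3, 100)
p_faulty_given_amart = Fraction(6, 100)

# Add the two "faulty" branches of the tree diagram.
p_faulty = p_goodbuy * p_faulty_given_goodbuy + p_amart * p_faulty_given_amart

# Conditional probability: P(Amart | faulty) = P(Amart and faulty) / P(faulty)
p_amart_given_faulty = p_amart * p_faulty_given_amart / p_faulty

print(p_faulty, float(p_faulty))
print(p_amart_given_faulty)
```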
4. (a) Given that P(A) = a and P(B) = b, express P(A ∪ B) in terms of a and b when
(d) P(R).
(2)
(Total 7 marks)
5. There are 180 students at a college following a general course in computing. Students
on this course can choose to take up to three extra options.
Students who want to become technicians take systems support and networking. Given
that a randomly chosen student wants to become a technician,
(d) find the probability that this student takes all three extra options.
(2)
(Total 9 marks)
Correlation and Regression
1. Gary compared the total attendance, x, at home matches and the total number of goals,
y, scored at home during a season for each of 12 football teams playing in a league. He
correctly calculated:
(a) Calculate the product moment correlation coefficient for these data.
(2)
Helen was given the same data to analyse. In view of the large numbers involved she
decided to divide the attendance figures by 100. She then calculated the product
moment correlation coefficient between x/100 and y.
2. The volume of a sample of gas is kept constant. The gas is heated and the pressure, p, is
measured at 10 different temperatures, t. The results are summarised below.
3. An experiment carried out by a student yielded pairs of (x, y) observations such that
4. The weight, w grams, and the length, l mm, of 10 randomly selected newborn turtles
are given in the table below.
l 49.0 52.0 53.0 54.5 54.1 53.4 50.0 51.6 49.5 51.2
w 29 32 34 39 38 35 30 31 29 30
(a) Find the equation of the regression line of w on l in the form w = a + bl.
(5)
(b) Use your regression line to estimate the weight of a newborn turtle of length 60
mm.
(2)
(c) Comment on the reliability of your estimate giving a reason for your answer.
(2)
(Total 9 marks)
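The least squares calculation for this question can be sketched as follows, using the usual summary statistics S_ll and S_lw with b = S_lw / S_ll and a = (mean of w) - b × (mean of l). The variable and function names are illustrative, and the printed values are only a check on hand working.

```python
lengths = [49.0, 52.0, 53.0, 54.5, 54.1, 53.4, 50.0, 51.6, 49.5, 51.2]  # l, mm
weights = [29, 32, 34, 39, 38, 35, 30, 31, 29, 30]                      # w, grams

n = len(lengths)
mean_l = sum(lengths) / n
mean_w = sum(weights) / n

# Summary statistics: S_ll = sum(l^2) - (sum l)^2 / n, S_lw = sum(lw) - (sum l)(sum w) / n
s_ll = sum(l * l for l in lengths) - sum(lengths) ** 2 / n
s_lw = sum(l * w for l, w in zip(lengths, weights)) - sum(lengths) * sum(weights) / n

b = s_lw / s_ll          # gradient of the regression line of w on l
a = mean_w - b * mean_l  # intercept

def predict_weight(length_mm):
    """Estimated weight in grams from the fitted line w = a + b * l."""
    return a + b * length_mm

print(f"w = {a:.2f} + {b:.3f} l")
print(f"observed lengths: {min(lengths)} mm to {max(lengths)} mm")
print(f"estimate at 60 mm: {predict_weight(60):.1f} g")
```

Printing the observed range next to the estimate shows whether a length of 60 mm lies inside the data, which is relevant when judging reliability.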
Discrete Random Variable
1. The discrete random variable X can take only the values 2, 3 or 4. For these values the
cumulative distribution function is defined by
F(x) = (x + k)² / 25, for x = 2, 3, 4
(a) Find k.
(2)
2. The random variable X has the discrete uniform distribution
P(X = x) = 1/5, x = 1, 2, 3, 4, 5.
(a) Write down the value of E(X) and show that Var(X) = 2.
(3)
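The values in part (a) can be verified with exact fractions, using E(X) = Σ x P(X = x) and Var(X) = E(X²) - [E(X)]². This is a minimal check rather than a substitute for showing the working.

```python
from fractions import Fraction

values = [1, 2, 3, 4, 5]
p = Fraction(1, 5)  # each value is equally likely

expectation = sum(x * p for x in values)
expectation_of_square = sum(x * x * p for x in values)
variance = expectation_of_square - expectation ** 2

print(expectation, expectation_of_square, variance)
```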
Find
3. A discrete random variable X has a probability function as shown in the table below, where a
and b are constants.
x 0 1 2 3
P(X = x) 0.2 0.3 b a
Find
Normal Distribution
1. The lifetimes of batteries used for a computer game have a mean of 12 hours and a
standard deviation of 3 hours. Battery lifetimes may be assumed to be normally
distributed.
Find the lifetime, t hours, of a battery such that 1 battery in 5 will have a lifetime longer
than t.
(Total 6 marks)
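A table-based answer for t can be compared with a direct calculation. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); small differences from a value obtained with printed tables are to be expected.

```python
from statistics import NormalDist

lifetime = NormalDist(mu=12, sigma=3)

# 1 battery in 5 lasts longer than t, so P(X > t) = 0.2 and P(X < t) = 0.8.
t = lifetime.inv_cdf(0.8)
print(round(t, 2))
```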
2. The random variable X has a normal distribution with mean 20 and standard deviation
4.
(b) Find the value of d such that P(20 < X < d) = 0.4641
(4)
(Total 7 marks)
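For part (b), the key step is that 20 is the mean, so P(X < 20) = 0.5 and therefore P(X < d) = 0.5 + 0.4641. The following sketch, again assuming Python 3.8 or later for `statistics.NormalDist`, applies that step.

```python
from statistics import NormalDist

x = NormalDist(mu=20, sigma=4)

# P(20 < X < d) = 0.4641 and P(X < 20) = 0.5, so P(X < d) = 0.9641.
d = x.inv_cdf(0.5 + 0.4641)
print(round(d, 1))
```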
3. The random variable X is normally distributed with mean µ and variance σ².
4. The weight of coffee in glass jars labelled 100 g is normally distributed with mean
101.80 g and standard deviation 0.72 g. The weight of an empty glass jar is normally
distributed with mean 260.00 g and standard deviation 5.45 g. The weight of a glass jar
is independent of the weight of the coffee it contains.
Find the probability that a randomly selected jar weighs less than 266 g and contains
less than 100 g of coffee. Give your answer to 2 significant figures.
(Total 8 marks)
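Because the two weights are independent, the required probability is a product of two normal probabilities. The sketch below assumes that "jar weighs less than 266 g" refers to the empty glass jar; under a different reading the calculation would change.

```python
from statistics import NormalDist

coffee = NormalDist(mu=101.80, sigma=0.72)
empty_jar = NormalDist(mu=260.00, sigma=5.45)

p_jar_under_266 = empty_jar.cdf(266)
p_coffee_under_100 = coffee.cdf(100)

# Independence lets the two probabilities be multiplied.
p_both = p_jar_under_266 * p_coffee_under_100
print(f"{p_both:.2g}")  # formatted to 2 significant figures
```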
5. The weights of bags of popcorn are normally distributed with a mean of 200 g, and 60%
of all bags weigh between 190 g and 210 g.
(b) Find the standard deviation of the weights of the bags of popcorn.
(5)
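The standard deviation in part (b) follows from symmetry about the mean: if 60% of bags lie between 190 g and 210 g, then P(X < 210) = 0.5 + 0.3 = 0.8. A minimal sketch of that reasoning, assuming Python 3.8 or later for `statistics.NormalDist`:

```python
from statistics import NormalDist

mean = 200

# z such that P(Z < z) = 0.8 for the standard normal distribution
z = NormalDist().inv_cdf(0.8)

# Standardising the upper value: (210 - mean) / sigma = z
sigma = (210 - mean) / z
print(round(sigma, 2))

# Check: the fitted distribution should put about 60% between 190 and 210.
fitted = NormalDist(mu=mean, sigma=sigma)
print(round(fitted.cdf(210) - fitted.cdf(190), 4))
```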
A shopkeeper finds that customers will complain if their bag of popcorn weighs less
than 180 g.