
IB Questionbank Mathematical Studies 3rd Edition

The document presents data from two experiments: 1. The relationship between moisture content and heat output of wood samples. A scatter plot is drawn and line of best fit calculated. The regression equation is used to estimate heat output at 25% moisture. 2. Distance traveled by cyclists over time. A scatter plot is drawn and line of best fit calculated. The regression equation is used to estimate distance at a given time and evaluate the reliability of such estimates.

Uploaded by

Arva Malpani

1.

The heat output in thermal units from burning 1 kg of wood changes according to the wood’s percentage
moisture content. The moisture content and heat output of 10 blocks of the same type of wood each
weighing 1 kg were measured. These are shown in the table.

Moisture content % (x)    8  15  22  30  34  45  50  60  74  82
Heat output (y)          80  77  74  69  68  61  61  55  50  45

(a) Draw a scatter diagram to show the above data. Use a scale of 2 cm to represent 10 % on the x-axis
and a scale of 2 cm to represent 10 thermal units on the y-axis.
(4)

(b) Write down

(i) the mean percentage moisture content, $\bar{x}$;

(ii) the mean heat output, $\bar{y}$.


(2)

(c) Plot the point ($\bar{x}$, $\bar{y}$) on your scatter diagram and label this point M.
(2)

(d) Write down the product-moment correlation coefficient, r.


(2)

The equation of the regression line y on x is y = –0.470x + 83.7.

(e) Draw the regression line y on x on your scatter diagram.


(2)

(f) Estimate the heat output in thermal units of a 1 kg block of wood that has 25 % moisture content.
(2)

(g) State, with a reason, whether it is appropriate to use the regression line y on x to estimate the heat
output in part (f).
(2)
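
The quoted regression line in question 1 can be checked against the table with a short least-squares calculation. This is a minimal sketch rather than a model answer; the function name `least_squares_line` and the variable names are illustrative assumptions, not part of the question.

```python
# Moisture content (%) and heat output (thermal units) from the table in question 1.
moisture = [8, 15, 22, 30, 34, 45, 50, 60, 74, 82]
heat = [80, 77, 74, 69, 68, 61, 61, 55, 50, 45]


def least_squares_line(xs, ys):
    """Return (gradient, intercept) of the regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x  # the line passes through the mean point M
    return gradient, intercept


gradient, intercept = least_squares_line(moisture, heat)
estimate_25 = gradient * 25 + intercept

# 25 % lies inside the observed moisture range, so the estimate is an interpolation.
is_interpolation = min(moisture) <= 25 <= max(moisture)

print(f"y = {gradient:.3f}x + {intercept:.1f}")
print(f"estimate at 25 % moisture: {estimate_25:.1f}")
print(f"inside the data range: {is_interpolation}")
```

The range check is one way to support the reasoning asked for in part (g): an estimate made inside the range of the data is generally more trustworthy than one made outside it.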

4. Alex and Kris are riding their bicycles together along a bicycle trail and note the following distance
markers at the given times.

Time (t hours) 1 2 3 4 5 6 7
Distance (d km) 57 65 72 81 89 97 107

(a) Draw a scatter diagram of the data. Use 1 cm to represent 1 hour and 1 cm to represent 10 km.
(3)

(b) Write down for this set of data

(i) the mean time, $\bar{t}$;

(ii) the mean distance, $\bar{d}$.


(2)

(c) Mark and label the point M($\bar{t}$, $\bar{d}$) on your scatter diagram.


(2)

(d) Draw the line of best fit on your scatter diagram.


(2)

(e) Using your graph, estimate the time when Alex and Kris pass the 85 km distance marker. Give
your answer correct to one decimal place.
(2)

(f) Write down the equation of the regression line for the data given.
(2)

(g) (i) Using your equation calculate the distance marker passed by the cyclists at 10.3 hours.

(ii) Is this estimate of the distance reliable? Give a reason for your answer.
(4)
(Total 17 marks)

6. In a mountain region there appears to be a relationship between the number of trees growing in the region
and the depth of snow in winter. A set of 10 areas was chosen, and in each area the number of trees was
counted and the depth of snow measured.
The results are given in the table below.

Number of trees (x)   Depth of snow in cm (y)
45                    30
75                    50
66                    40
27                    25
44                    30
28                    5
60                    35
35                    20
73                    45
47                    25

(a) Use your graphic display calculator to find

(i) the mean number of trees;

(ii) the standard deviation of the number of trees;

(iii) the mean depth of snow;

(iv) the standard deviation of the depth of snow.


(4)

The covariance, Sxy = 188.5.

(b) Write down the product-moment correlation coefficient, r.


(2)

(c) Write down the equation of the regression line of y on x.


(2)

(d) If the number of trees in an area is 55, estimate the depth of snow.
(2)

(e) (i) Use the equation of the regression line to estimate the depth of snow in an area with 100
trees.

(ii) Decide whether the answer in (e)(i) is a valid estimate of the depth of snow in the area. Give
a reason for your answer.
(3)
(Total 13 marks)
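
The calculator steps in question 6 can be mirrored in a few lines of code. This is a hedged sketch that assumes the population (divide-by-n) definitions of standard deviation and covariance, since those reproduce the quoted covariance of 188.5; the helper names `predict` and `in_data_range` are illustrative only.

```python
from statistics import fmean, pstdev

# Data from the table in question 6.
trees = [45, 75, 66, 27, 44, 28, 60, 35, 73, 47]
snow = [30, 50, 40, 25, 30, 5, 35, 20, 45, 25]

mean_x, mean_y = fmean(trees), fmean(snow)
s_x, s_y = pstdev(trees), pstdev(snow)  # assumption: divide-by-n standard deviations

# Population covariance: mean of the products minus the product of the means.
s_xy = fmean([x * y for x, y in zip(trees, snow)]) - mean_x * mean_y

r = s_xy / (s_x * s_y)
gradient = s_xy / s_x ** 2
intercept = mean_y - gradient * mean_x


def predict(x):
    """Depth of snow estimated from the regression line of y on x."""
    return gradient * x + intercept


def in_data_range(x):
    """True when x lies between the smallest and largest observed tree counts."""
    return min(trees) <= x <= max(trees)


print(f"s_xy = {s_xy:.1f}, r = {r:.3f}")
for x in (55, 100):
    print(f"{x} trees: estimate {predict(x):.1f} cm, inside data range: {in_data_range(x)}")
```

The `in_data_range` check separates the two estimates: 55 trees lies within the observed counts, while 100 trees requires extrapolating beyond them.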

8. The following table shows the cost in AUD of seven paperback books chosen at random, together with
the number of pages in each book.

Book 1 2 3 4 5 6 7
Number of pages (x) 50 120 200 330 400 450 630
Cost (y AUD) 6.00 5.40 7.20 4.60 7.60 5.80 5.20

(a) Plot these pairs of values on a scatter diagram. Use a scale of 1 cm to represent 50 pages on the
horizontal axis and 1 cm to represent 1 AUD on the vertical axis.
(3)

(b) Write down the linear correlation coefficient, r, for the data.
(2)

(c) Stephen wishes to buy a paperback book that has 350 pages in it. He plans to draw a line of best fit
to determine the price. State whether or not this is an appropriate method in this case and justify
your answer.
(2)
(Total 7 marks)

10. The figure below shows the lengths in centimetres of fish found in the net of a small trawler.

[Figure: histogram of number of fish (vertical axis) against length in cm (horizontal axis).]

(a) Find the total number of fish in the net.


(2)

(b) Find (i) the modal length interval;

(ii) the interval containing the median length;

(iii) an estimate of the mean length.


(5)

(c) (i) Write down an estimate for the standard deviation of the lengths.

(ii) How many fish (if any) have length greater than three standard deviations above the mean?
(3)

The fishing company must pay a fine if more than 10 % of the catch have lengths less than 40 cm.

(d) Do a calculation to decide whether the company is fined.


(2)

A sample of 15 of the fish was weighed. The weight, W, was plotted against length, L, as shown below.

[Figure: scatter plot of W (kg) against L (cm).]

(e) Exactly two of the following statements about the plot could be correct. Identify the two correct
statements.
(2)

Note: You do not need to enter data in a GDC or to calculate r exactly.

(i) The value of r, the correlation coefficient, is approximately 0.871.

(ii) There is an exact linear relation between W and L.

(iii) The line of regression of W on L has equation W = 0.012L + 0.008.

(iv) There is negative correlation between the length and weight.

(v) The value of r, the correlation coefficient, is approximately 0.998.

(vi) The line of regression of W on L has equation W = 63.5L + 16.5.


(Total 14 marks)

17. A number of employees at a factory were given x additional training sessions each. They were then timed
on how long (y seconds) it took them to complete a task. The results are shown in the scatter diagram
below. A list of descriptive statistics is also given.

[Figure: scatter diagram of time taken (seconds) against number of additional training sessions.]

n = 9,

sum of x values: ∑x = 54,

sum of y values: ∑y = 81,

mean of x values: $\bar{x}$ = 6,

mean of y values: $\bar{y}$ = 9,

standard deviation of x: sx = 1.94,

standard deviation of y: sy = 2.35,

covariance: sxy = –3.77.

(a) Determine the product-moment correlation coefficient (r) for this data.
(2)

(b) What is the nature of the relationship between the amount of additional training and the time taken
to complete the task?
(2)

(c) Calculate $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ given that the covariance sxy = –3.77.
(1)

(d) (i) Determine the equation of the linear regression line for y on x.

(ii) Find the expected time to complete the task for an employee who only attended three
additional training sessions.
(4)
(Total 9 marks)
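
Question 17 can be worked entirely from the listed summary statistics. The sketch below assumes the covariance is defined with a divisor of n, so that the sum of products equals n times sxy; the variable names are illustrative.

```python
# Summary statistics quoted in question 17.
n = 9
mean_x, mean_y = 6, 9
s_x, s_y, s_xy = 1.94, 2.35, -3.77

# Product-moment correlation coefficient.
r = s_xy / (s_x * s_y)

# Assumption: s_xy = (1/n) * sum of (x_i - mean_x)(y_i - mean_y),
# so the sum of products is recovered by multiplying by n.
sum_of_products = n * s_xy

# Regression line of y on x: y - mean_y = (s_xy / s_x^2)(x - mean_x).
gradient = s_xy / s_x ** 2
intercept = mean_y - gradient * mean_x

time_for_three_sessions = gradient * 3 + intercept

print(f"r = {r:.3f}")
print(f"sum of products = {sum_of_products:.2f}")
print(f"y = {gradient:.2f}x + {intercept:.1f}")
print(f"expected time after 3 sessions: {time_for_three_sessions:.1f} seconds")
```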

19. A study was carried out to investigate possible links between the weights of baby rabbits and their
mothers. A sample of 20 pairs of mother rabbits (x) and baby rabbits (y) was chosen at random and their
weights noted. This information was plotted on a scatter diagram and various statistical calculations were
made. These appear below.

[Figure: scatter diagram of baby rabbit’s weight (kg) against mother rabbit’s weight (kg).]

mean of x   mean of y   sx      sy      sxy     sum of x   sum of y
3.78        3.46        0.850   0.689   0.442   75.6       69.2

(a) Show that the product-moment correlation coefficient r for this data is 0.755.
(2)

(b) (i) Write the equation of the regression line for y on x in the form y = ax + b.
(3)

(ii) Use your equation for the regression line to estimate the weight of a rabbit given that its
mother weighs 3.71 kg.
(2)
(Total 7 marks)
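
As a worked illustration for question 19, the two standard formulas involved are written out below with the tabulated values substituted. It is a sketch of one possible route, using the usual definitions of r and of the gradient of the regression line of y on x.

```latex
r = \frac{s_{xy}}{s_x s_y}
  = \frac{0.442}{0.850 \times 0.689}
  = \frac{0.442}{0.58565}
  \approx 0.755

y - \bar{y} = \frac{s_{xy}}{s_x^{2}} \, (x - \bar{x}),
\qquad
\frac{s_{xy}}{s_x^{2}} = \frac{0.442}{0.850^{2}} \approx 0.612
```

Substituting $\bar{x}$ = 3.78 and $\bar{y}$ = 3.46 into the second line and rearranging gives the equation in the form y = ax + b.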

20. The sketches below represent scatter diagrams for the way in which variables x, y and z change over time,
t, in a given chemical experiment. They are labelled 1, 2 and 3.

[Figure: three scatter diagrams, labelled 1, 2 and 3, showing x, y and z respectively against time t.]

(a) State which of the diagrams indicate that the pair of variables

(i) is not correlated;


(1)

(ii) shows strong linear correlation.


(1)

(b) A student is given a piece of paper with five numbers written on it. She is told that three of these
numbers are the product moment correlation coefficients for the three pairs of variables shown
above. The five numbers are

0.9, –0.85, –0.20, 0.04, 1.60

(i) For each sketch above state which of these five numbers is the most appropriate value for
the correlation coefficient.
(3)

(ii) For the two remaining numbers, state why you reject them for this experiment.
(2)

(c) Another variable, w, over time, t, gave the following information

∑t = 124 ∑w = 250 st = 6.08 sw = 10.50 stw = 55.00

for 20 data points.

Calculate

(i) the product moment correlation coefficient for this data;


(2)

(ii) the equation of the regression line of w on t in the form w = at + b.


(5)

21. A group of 15 students was given a test on mathematics. The students then played a computer game. The
diagram below shows the scores on the test and the game.

[Figure: scatter diagram of game score against mathematics score, with the point M marked.]

The mean score on the mathematics test was 56.9 and the mean score for the computer game was 45.9.
The point M has coordinates (56.9, 45.9).

(a) Describe the relationship between the two sets of scores.

A straight line of best fit passes through the point (0, 69).

(b) On the diagram draw this straight line of best fit.

Jane took the tests late and scored 45 at mathematics.

(c) Using your graph or otherwise, estimate the score Jane expects on the computer game, giving your
answer to the nearest whole number.

(Total 8 marks)

22. The following are the results of a survey of the scores of 10 people on both a mathematics (x) and a
science (y) aptitude test:

Student   Mathematics (x)   Science (y)
1         90                85
2         38                60
3         58                78
4         85                70
5         73                65
6         82                71
7         56                80
8         73                90
9         95                96
10        80                85

$\bar{x}$ = 73, $\bar{y}$ = 78, Sx = 16.7, Sy = 10.8, Sxy = 100.1

(a) Copy the graph below on graph paper and fill in the missing points for students 7–10 on the graph.

[Figure: graph of science scores against mathematics scores.]
(4)

(b) Plot the point M($\bar{x}$, $\bar{y}$) on the graph.


(1)

(c) Find the equation of the regression line of y on x in the form
y = ax + b.
(2)

(d) Graph this line on the above graph.


(2)

(e) Given that a student receives an 88 on the mathematics test, what would you expect this student's
science score to be? Show how you arrived at your result.
(2)
(Total 11 marks)

23. Several candy bars were purchased and the following table shows the weight and the cost of each bar.

               Yummy  Chox   Marz   Twin   Chunx  Lite   BigC   Bite
Weight (g)     60     85     80     65     95     50     100    45
Cost (Euros)   1.10   1.50   1.40   1.20   1.80   1.00   1.70   0.90

(a) Given that sx = 19.2, sy = 0.307 and sxy = 5.81, find the correlation coefficient, r, giving your
answer correct to 3 decimal places.
(2)

(b) Describe the correlation between the weight of a candy bar and its cost.
(1)

(c) Calculate the equation of the regression line for y on x.


(3)

(d) Use your equation to estimate the cost of a candy bar weighing 109 g.
(2)
(Total 8 marks)

24. Eight students in Mr. O’Neil’s Physical Education class did pushups and situps. Their results are shown
in the following table.

Student 1 2 3 4 5 6 7 8
number of pushups (x) 24 18 32 51 35 42 45 25
number of situps (y) 32 28 38 40 30 52 48 52

The graph below shows the results for the first seven students.
[Figure: scatter graph of number of situps (y) against number of pushups (x).]

(a) Plot the results for the eighth student on the graph.

(b) Given that $\bar{x}$ = 34 and $\bar{y}$ = 40, draw a line of best fit on the graph.

(c) A student can do 60 pushups. How many situps can the student be expected to do?

(Total 8 marks)

25. The scatter diagram below shows the relationship between the number of vehicles per thousand of
population and the number of people killed in road accidents over an eight year period in Calmville.

[Figure: Relationship between number of vehicles and people killed in road accidents in Calmville. Scatter diagram of number of people killed in road accidents against number of vehicles per 1000 of population.]

Let x be the number of vehicles per thousand and y be the number of people killed. The following
information is known.

$\bar{x}$ = 270, $\bar{y}$ = 650, sx = 22.3, sy = 96.2, sxy = 2077.75

(a) (i) Calculate the product–moment correlation coefficient (r).

(ii) Explain clearly the statistical relationship between the variables x and y.
(4)

(b) Write the equation of the regression line of y on x, expressing it in the form y = mx + c (where m
and c are given correct to 3 significant figures).
(4)

(c) Use your equation in part (b) to answer the following questions.

(i) There were 250 vehicles per 1000 of population. Find the number of people killed.

(ii) Explain why it is not a good idea to use the regression line to estimate the number of people
killed when the number of vehicles is 150 per thousand.
(3)
(Total 11 marks)

26. A shopkeeper wanted to investigate whether or not there was a correlation between the prices of food 10
years ago, in 1992, and their prices today. He chose 8 everyday items; the prices are given in the table
below.

             sugar   milk    eggs    rolls   tea bags   coffee   potatoes   flour
1992 price   $1.44   $0.80   $2.16   $1.80   $0.92      $3.16    $1.32      $1.12
2002 price   $2.20   $1.04   $2.64   $3.00   $1.32      $2.28    $1.92      $1.44

(a) Calculate the mean and the standard deviation of the prices

(i) in 1992;

(ii) in 2002.
(4)

(b) (i) Given that sxy = 0.3104, calculate the correlation coefficient.

(ii) Comment on the relationship between the prices.


(4)

(c) Find the equation of the line of the best fit in the form y = mx + c.
(3)

(d) What would you expect to pay now for an item costing $2.60 in 1992?
(1)

(e) Which item would you omit to increase the correlation coefficient?
(2)
(Total 14 marks)
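
Part (e) of question 26 can be explored by recomputing r with each item left out in turn. This is a sketch under the assumption of divide-by-n standard deviations and covariance, which reproduce the quoted sxy = 0.3104; the function name `correlation` and the dictionary `r_without` are illustrative.

```python
from statistics import fmean, pstdev

# Data from the table in question 26.
items = ["sugar", "milk", "eggs", "rolls", "tea bags", "coffee", "potatoes", "flour"]
price_1992 = [1.44, 0.80, 2.16, 1.80, 0.92, 3.16, 1.32, 1.12]
price_2002 = [2.20, 1.04, 2.64, 3.00, 1.32, 2.28, 1.92, 1.44]


def correlation(xs, ys):
    """Product-moment correlation using divide-by-n covariance and deviations."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    s_xy = fmean([(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)])
    return s_xy / (pstdev(xs) * pstdev(ys))


r_all = correlation(price_1992, price_2002)

# Recompute r with each item removed in turn.
r_without = {}
for i, item in enumerate(items):
    xs = price_1992[:i] + price_1992[i + 1:]
    ys = price_2002[:i] + price_2002[i + 1:]
    r_without[item] = correlation(xs, ys)

best_item = max(r_without, key=r_without.get)

print(f"r with all items: {r_all:.3f}")
for item, value in r_without.items():
    print(f"r without {item}: {value:.3f}")
print(f"largest r is obtained by omitting: {best_item}")
```

The item picked out this way is the one lying furthest from the trend of the others on a scatter diagram, which is the visual argument the question is looking for.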

27. Ten students were asked for their average grade at the end of their last year of high school and their
average grade at the end of their last year at university. The results were put into a table as follows:

Student   High School grade, x   University grade, y
1         90                     3.2
2         75                     2.6
3         80                     3.0
4         70                     1.6
5         95                     3.8
6         85                     3.1
7         90                     3.8
8         70                     2.8
9         95                     3.0
10        85                     3.5

Total     835                    30.4

(a) Given that sx = 8.96, sy = 0.610 and sxy = 4.16, find the correlation coefficient r, giving your answer
to two decimal places.
(2)

(b) Describe the correlation between the high school grades and the university grades.
(2)

(c) Find the equation of the regression line for y on x in the form y = ax + b.
(2)
(Total 6 marks)

28. The heights and weights of 10 students selected at random are shown in the table below.

Student       1    2    3    4    5    6    7    8    9    10
Height x cm   155  161  173  150  182  165  170  185  175  145
Weight y kg   50   75   80   46   81   79   64   92   74   108

(a) Plot this information on a scatter graph. Use a scale of 1 cm to represent 20 cm on the
x-axis and 1 cm to represent 10 kg on the y-axis.
(4)

(b) Calculate the mean height.
(1)

(c) Calculate the mean weight.
(1)

(d) It is given that Sxy = 44.31.

(i) By first calculating the standard deviation of the heights, correct to two decimal places,
show that the gradient of the line of regression of y on x is 0.276.

(ii) Calculate the equation of the line of best fit.

(iii) Draw the line of best fit on your graph.


(6)
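
A worked sketch of the calculation behind part (d)(i) of question 28, assuming the divide-by-n form of the standard deviation; the sum of squares and the mean height are computed from the table.

```latex
s_x = \sqrt{\frac{\sum x^{2}}{n} - \bar{x}^{2}}
    = \sqrt{\frac{277499}{10} - 166.1^{2}}
    = \sqrt{160.69}
    \approx 12.68

\text{gradient} = \frac{s_{xy}}{s_x^{2}}
                = \frac{44.31}{12.68^{2}}
                = \frac{44.31}{160.7824}
                \approx 0.276
```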

(e) Use your line to estimate

(i) the weight of a student of height 190 cm;

(ii) the height of a student of weight 72 kg.


(2)

(f) It is decided to remove the data for student number 10 from all calculations. Explain briefly what
effect this will have on the line of best fit.
(1)

29. Statements I, II, III, IV and V represent descriptions of the correlation between two variables.

I High positive linear correlation


II Low positive linear correlation
III No correlation
IV Low negative linear correlation
V High negative linear correlation

Which statement best represents the relationship between the two variables shown in each of the scatter
diagrams below?

[Figure: four scatter diagrams, (a) to (d), each with x and y axes running from 0 to 10.]

Answers:

(a) …………………………………………

(b) …………………………………………

(c) …………………………………………

(d) …………………………………………
(Total 4 marks)

30. Ten students were given two tests, one on Mathematics and one on English.
The table shows the results of the tests for each of the ten students.

Student A B C D E F G H I J
Mathematics (x) 8.6 13.4 12.8 9.3 1.3 9.4 13.1 4.9 13.5 9.6
English (y) 33 51 30 48 12 23 46 18 36 50

(a) Given sxy (the covariance) is 35.85, calculate, correct to two decimal places, the product moment
correlation coefficient (r).
(6)

(b) Use your result from part (a) to comment on the statement:

‘Those who do well in Mathematics also do well in English.’


(2)
(Total 8 marks)

31. The following table gives the heights and weights of five sixteen-year-old boys.

Name    Height   Weight
Blake   182 cm   73 kg
Jorge   173 cm   68 kg
Chin    162 cm   60 kg
Ravi    178 cm   66 kg
Derek   190 cm   75 kg

(a) Find

(i) the mean height;

(ii) the mean weight.

(b) Plot the above data on the grid below and draw the line of best fit.

[Figure: grid with height (cm) on the vertical axis and weight (kg) on the horizontal axis.]

(Total 4 marks)

32. The Type Fast secretarial training agency has a new computer software spreadsheet package. The agency
investigates the number of hours it takes people of varying ages to reach a level of proficiency using this
package. Fifteen individuals are tested and the results are summarised in the table below.

Age (x)             32  40  21  45  24  19  17  21  27  54  33  37  23  45  18
Time in hours (y)   10  12   8  15   7   8   6   9  11  16   t  13   9  17   5

(a) (i) Given that Sy = 3.5 and Sxy = 36.7, calculate the product-moment correlation coefficient r for
this data.
(4)

(ii) What does the value of the correlation coefficient suggest about the relationship between the
two variables?
(1)

(b) Given that the mean time taken was 10.6 hours, write the equation of the regression line for y on x
in the form y = ax + b.
(3)

(c) Use your equation for the regression line to predict

(i) the time that it would take a 33 year old person to reach proficiency, giving your answer
correct to the nearest hour;
(2)

(ii) the age of a person who would take 8 hours to reach proficiency, giving your answer correct
to the nearest year.
(2)
(Total 12 marks)
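
Question 32 mixes raw data with quoted statistics, and the sketch below follows that mix. It assumes the divide-by-n standard deviation for the ages, treats the unknown table entry t as recoverable from the quoted mean time, and uses illustrative variable names throughout.

```python
from statistics import fmean, pstdev

# Ages from the table in question 32.
ages = [32, 40, 21, 45, 24, 19, 17, 21, 27, 54, 33, 37, 23, 45, 18]
# Times from the table, leaving out the unknown entry t (recorded for the 33-year-old).
known_times = [10, 12, 8, 15, 7, 8, 6, 9, 11, 16, 13, 9, 17, 5]

n = len(ages)
mean_y, s_y, s_xy = 10.6, 3.5, 36.7  # values quoted in the question

# The quoted mean fixes the total time, and hence the missing entry t.
missing_t = n * mean_y - sum(known_times)

mean_x = fmean(ages)
s_x = pstdev(ages)  # assumption: population (divide-by-n) standard deviation

r = s_xy / (s_x * s_y)
gradient = s_xy / s_x ** 2
intercept = mean_y - gradient * mean_x

# Part (c)(i): predict y from x.
time_at_33 = gradient * 33 + intercept
# Part (c)(ii): the same y-on-x line solved for x, as the question instructs.
age_for_8_hours = (8 - intercept) / gradient

print(f"t = {missing_t:.0f}, r = {r:.3f}")
print(f"y = {gradient:.3f}x + {intercept:.2f}")
print(f"time at age 33: {round(time_at_33)} hours")
print(f"age for 8 hours: {round(age_for_8_hours)} years")
```

Solving the y-on-x line for x follows the wording of part (c)(ii); a separate regression line of x on y would in general give a slightly different age.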

33. The diagram below shows the marks scored by pupils in a French test and a German test. The mean score
on the French test is 29 marks and on the German test is 31 marks.

[Figure: scatter diagram of German marks against French marks.]

(a) Describe the relationship between the marks scored in the two tests.

(b) On the graph mark the point M which represents the mean of the distribution.

(c) Draw a suitable line of best fit.

(d) Idris scored 32 marks on the French test. Use your graph to estimate the mark Idris scored on the
German test.

(Total 4 marks)

34. The length and width of 10 leaves are shown on the scatter diagram below.

[Figure: Relationship between leaf length and width. Scatter diagram of width (mm) against length (mm).]

(a) Plot the point M(97, 43) which represents the mean length and the mean width.

(b) Draw a suitable line of best fit.

(c) Write a sentence describing the relationship between leaf length and leaf width for this sample.

(Total 4 marks)
