Linear Regression
1. The length and width of 10 leaves are shown on the scatter diagram below.
[Scatter diagram: leaf width (mm) on the vertical axis, marked from 10 to 70, against leaf length.]
(a) Plot the point M(97, 43) which represents the mean length and the mean width.
(c) Write a sentence describing the relationship between leaf length and leaf width for this
sample.
(Total 4 marks)
2. The Type Fast secretarial training agency has a new computer software spreadsheet package.
The agency investigates the number of hours it takes people of varying ages to reach a level of
proficiency using this package. Fifteen individuals are tested and the results are summarised in
the table below.
Age (x)              32 40 21 45 24 19 17 21 27 54 33 37 23 45 18
Time in hours (y)    10 12  8 15  7  8  6  9 11 16  t 13  9 17  5
(a) (i) Given that Sy = 3.5 and Sxy = 36.7, calculate the product-moment correlation
coefficient r for this data.
(4)
(ii) What does the value of the correlation coefficient suggest about the relationship
between the two variables?
(1)
(b) Given that the mean time taken was 10.6 hours, write the equation of the regression line
for y on x in the form y = ax + b.
(3)
(i) the time that it would take a 33 year old person to reach proficiency, giving your
answer correct to the nearest hour;
(2)
(ii) the age of a person who would take 8 hours to reach proficiency, giving your
answer correct to the nearest year.
(2)
(Total 12 marks)
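
One possible way to check the arithmetic in parts (a)(i) and (b) is sketched below. It uses only the ages in the table and the values stated in the question (Sy = 3.5, Sxy = 36.7, mean time 10.6). It assumes that the standard deviations use the population form (dividing by n); the variable names are illustrative and not part of the question.

```python
from statistics import mean, pstdev

# Ages (x) copied from the table in question 2.
ages = [32, 40, 21, 45, 24, 19, 17, 21, 27, 54, 33, 37, 23, 45, 18]

# Values stated in the question.
s_y = 3.5      # standard deviation of the times
s_xy = 36.7    # covariance of age and time
mean_y = 10.6  # mean time in hours

# Assumption: standard deviations divide by n (population form).
mean_x = mean(ages)
s_x = pstdev(ages)

# Product-moment correlation coefficient: r = s_xy / (s_x * s_y).
r = s_xy / (s_x * s_y)

# Regression line of y on x: gradient s_xy / s_x^2,
# and the line passes through the mean point (mean_x, mean_y).
a = s_xy / s_x**2
b = mean_y - a * mean_x

print(f"s_x = {s_x:.2f}, r = {r:.2f}")
print(f"y = {a:.3f}x + {b:.2f}")
```

If a different convention for the standard deviation is intended (dividing by n - 1), the value of Sx and therefore r and a would change slightly.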
3. Statements I, II, III, IV and V represent descriptions of the correlation between two variables.
Which statement best represents the relationship between the two variables shown in each of the
scatter diagrams below?
[Four scatter diagrams, (a) to (d), each plotting y against x with both axes marked from 0 to 10.]
Answers:
(a) …………………………………………
(b) …………………………………………
(c) …………………………………………
(d) …………………………………………
4. The diagram below shows the marks scored by pupils in a French test and a German test. The
mean score on the French test is 29 marks and on the German test is 31 marks.
[Scatter diagram: German test marks (vertical axis) against French test marks (horizontal axis), both axes marked from 0 to 40.]
(a) Describe the relationship between the marks scored in the two tests.
(b) On the graph mark the point M which represents the mean of the distribution.
(d) Idris scored 32 marks on the French test. Use your graph to estimate the mark Idris scored
on the German test.
(Total 4 marks)
5. Ten students were given two tests, one on Mathematics and one on English.
The table shows the results of the tests for each of the ten students.
Student            A     B     C     D     E     F     G     H     I     J
Mathematics (x)    8.6   13.4  12.8  9.3   1.3   9.4   13.1  4.9   13.5  9.6
English (y)        33    51    30    48    12    23    46    18    36    50
(a) Given sxy (the covariance) is 35.85, calculate, correct to two decimal places, the product
moment correlation coefficient (r).
(6)
(b) Use your result from part (a) to comment on the statement:
6. Eight students in Mr. O'Neil's Physical Education class did pushups and situps. Their results are
shown in the following table.
Student 1 2 3 4 5 6 7 8
number of pushups (x) 24 18 32 51 35 42 45 25
number of situps (y) 32 28 38 40 30 52 48 52
The graph below shows the results for the first seven students.
[Scatter diagram: number of situps (y) against number of pushups (x), both axes marked from 0 to 60.]
(a) Plot the results for the eighth student on the graph.
(c) A student can do 60 pushups. How many situps can the student be expected to do?
(Total 8 marks)
7. Ten students were asked for their average grade at the end of their last year of high school and
their average grade at the end of their last year at university. The results were put into a table as
follows:
(a) Find the correlation coefficient r, giving your answer to two decimal places.
(2)
(b) Describe the correlation between the high school grades and the university grades.
(2)
(c) Find the equation of the regression line for y on x in the form y = ax + b.
(2)
(Total 6 marks)
8. A shopkeeper wanted to investigate whether there was a correlation between the prices of
food 10 years ago, in 1992, and their prices today. He chose 8 everyday items and the prices are
given in the table below.
(a) Calculate the mean and the standard deviation of the prices
(i) in 1992;
(ii) in 2002.
(4)
(c) Find the equation of the line of the best fit in the form y = mx + c.
(3)
(d) What would you expect to pay now for an item costing $2.60 in 1992?
(1)
(e) Which item would you omit to increase the correlation coefficient?
(2)
(Total 14 marks)
9. A group of 15 students was given a test on mathematics. The students then played a computer
game. The diagram below shows the scores on the test and the game.
[Scatter diagram: game score (vertical axis, marked from 0 to 100) against mathematics score (horizontal axis, marked from 0 to 95), with the point M marked.]
The mean score on the mathematics test was 56.9 and the mean score for the computer game
was 45.9. The point M has coordinates (56.9, 45.9).
A straight line of best fit passes through the point (0, 69).
(c) Using your graph or otherwise, estimate the score Jane expects on the computer game,
giving your answer to the nearest whole number.
(Total 8 marks)
10. The sketches below represent scatter diagrams for the way in which variables x, y and z change
over time, t, in a given chemical experiment. They are labelled 1, 2 and 3.
[Three scatter diagrams, labelled 1, 2 and 3, showing x, y and z respectively plotted against time t.]
(a) State which of the diagrams indicate that the pair of variables
(b) A student is given a piece of paper with five numbers written on it. She is told that three
of these numbers are the product moment correlation coefficients for the three pairs of
variables shown above. The five numbers are
(i) For each sketch above state which of these five numbers is the most appropriate
value for the correlation coefficient.
(3)
(ii) For the two remaining numbers, state why you reject them for this experiment.
(2)
11. The following table gives the amount of fuel in a car's fuel tank, and the number of kilometres
travelled after filling the tank.
Distance travelled (km)            0    220  276  500  680  850
Amount of fuel in tank (litres)    55   43   30   24   10   6
[Scatter diagram: fuel in litres (vertical axis, marked from 0 to 60) against distance in km (horizontal axis, marked from 0 to 1000).]
The mean distance travelled is 421 km (x̄), and the mean amount of fuel in the tank is 28 litres
(ȳ). This point is plotted on the scatter diagram.
(c) Use your line of best fit to estimate the amount of fuel left in the tank.
(Total 6 marks)
12. It is decided to take a random sample of 10 students to see if there is any linear relationship
between height and shoe size. The results are given in the table below.
Height (cm) (x) Shoe size (y)
175 8
160 9
180 8
155 7
178 10
159 8
166 9
185 11
189 10
173 9
(a) Write down the equation of the regression line of shoe size (y) on height (x), giving your
answer in the form y = mx + c.
(3)
(b) Use your equation in part (a) to predict the shoe size of a student who is 162 cm in height.
(2)
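
The calculation behind parts (a) and (b) can be sketched as follows, as a check on a calculator result rather than a required method. It fits the least-squares line of y on x directly from the table; the helper name predict_shoe_size is hypothetical and introduced only for illustration.

```python
# (height in cm, shoe size) pairs copied from the table in question 12.
data = [(175, 8), (160, 9), (180, 8), (155, 7), (178, 10),
        (159, 8), (166, 9), (185, 11), (189, 10), (173, 9)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Sums of products and of squares of the deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
s_xx = sum((x - mean_x) ** 2 for x, _ in data)

# Least-squares gradient and intercept of the line y = mx + c.
m = s_xy / s_xx
c = mean_y - m * mean_x


def predict_shoe_size(height_cm):
    """Hypothetical helper: evaluate the fitted line at a given height."""
    return m * height_cm + c


print(f"y = {m:.4f}x + {c:.2f}")
print(f"predicted shoe size at 162 cm: {predict_shoe_size(162):.1f}")
```

A height of 162 cm lies inside the range of the sample (155 cm to 189 cm), so the prediction is an interpolation.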
13. The heights and weights of 10 students selected at random are shown in the table below.
Student         1    2    3    4    5    6    7    8    9    10
Height, x cm    155  161  173  150  182  165  170  185  175  145
Weight, y kg    50   75   80   46   81   79   64   92   74   108
(a) Plot this information on a scatter graph. Use a scale of 1 cm to represent 20 cm on the
x-axis and 1 cm to represent 10 kg on the y-axis.
(4)
(b) Calculate the mean height.
(1)
(i) By first calculating the standard deviation of the heights, correct to two decimal
places, show that the gradient of the line of regression of y on x is 0.276.
(f) It is decided to remove the data for student number 10 from all calculations. Explain
briefly what effect this will have on the line of best fit.
(1)
(Total 15 marks)
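
A minimal sketch of the arithmetic behind parts (i) and (f) is given below. It assumes the population forms of the standard deviation and covariance (dividing by n), and it computes the covariance from the table itself. The function names are illustrative only.

```python
from statistics import mean, pstdev

# Heights (cm) and weights (kg) copied from the table in question 13.
heights = [155, 161, 173, 150, 182, 165, 170, 185, 175, 145]
weights = [50, 75, 80, 46, 81, 79, 64, 92, 74, 108]


def covariance(xs, ys):
    """Population covariance (divides by n); an assumed convention."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def gradient(xs, ys):
    """Gradient of the regression line of y on x: s_xy / s_x^2."""
    return covariance(xs, ys) / pstdev(xs) ** 2


s_x = pstdev(heights)
grad_all = gradient(heights, weights)

# Part (f): drop student 10, the last entry in each list.
grad_without_10 = gradient(heights[:-1], weights[:-1])

print(f"s_x = {s_x:.2f}")
print(f"gradient with all 10 students = {grad_all:.3f}")
print(f"gradient without student 10   = {grad_without_10:.3f}")
```

Student 10 is the shortest student but also the heaviest, so comparing the two gradients shows how much that single point affects the fitted line.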