Topic 4 - Bivariate Analysis

This document discusses several examples of analyzing bivariate data using regression analysis and correlation. Multiple choice questions are asked about finding equations of regression lines, using lines to make predictions, interpreting correlation coefficients, and describing sampling techniques.

Uploaded by

barry1528
Copyright
© All Rights Reserved

Topic 4 - Bivariate Analysis [92 marks]

1. [Maximum mark: 5] SPM.2.SL.TZ0.5


The following table shows the marks scored by seven students on two
different mathematics tests.

Let L1 be the regression line of x on y. The equation of the line L1 can be written in
the form x = ay + b.

(a) Find the value of a and the value of b. [2]

(b) Let L2 be the regression line of y on x. The lines L1 and L2 pass
through the same point with coordinates (p, q).

Find the value of p and the value of q. [3]
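The idea behind part (b) can be checked numerically. The sketch below uses made-up marks, since the question's table is not reproduced here, and the names `least_squares`, `c` and `d` are illustrative only. It fits both regression lines by least squares and compares their intersection with the mean point.

```python
# Hypothetical marks for seven students (NOT the data from the question's table).
x = [12, 18, 25, 31, 36, 40, 44]  # first test
y = [15, 22, 24, 35, 33, 41, 38]  # second test


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def least_squares(u, v):
    """Return (slope, intercept) of the least-squares regression line of v on u."""
    mu, mv = mean(u), mean(v)
    s_uv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    s_uu = sum((ui - mu) ** 2 for ui in u)
    slope = s_uv / s_uu
    return slope, mv - slope * mu


# L1: regression line of x on y, written as x = a*y + b.
a, b = least_squares(y, x)
# L2: regression line of y on x, written here as y = c*x + d.
c, d = least_squares(x, y)

# Intersection of L1 and L2: substitute y = c*x + d into x = a*y + b.
p = (a * d + b) / (1 - a * c)
q = c * p + d

print(f"L1: x = {a:.3f}y + {b:.3f}")
print(f"L2: y = {c:.3f}x + {d:.3f}")
print(f"intersection: ({p:.3f}, {q:.3f})")
print(f"mean point:   ({mean(x):.3f}, {mean(y):.3f})")
```

With any data set that is not perfectly linear, the last two printed lines should agree up to rounding, because each least-squares line passes through the point of means.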


2. [Maximum mark: 7] SPM.2.AHL.TZ0.4
The following table shows the marks scored by seven students on two
different mathematics tests.

Let L1 be the regression line of x on y. The equation of the line L1 can be written in
the form x = ay + b.

(a) Find the value of a and the value of b. [2]

Let L2 be the regression line of y on x. The lines L1 and L2 pass through the same
point with coordinates (p, q).

(b) Find the value of p and the value of q. [3]


(c) Jennifer was absent for the first test but scored 29 marks on the
second test. Use an appropriate regression equation to estimate
Jennifer’s mark on the first test. [2]
3. [Maximum mark: 8] EXN.2.SL.TZ0.4
The following table shows the systolic blood pressures, p mmHg, and the ages, t
years, of 6 male patients at a medical clinic.

(a.i) Determine the value of Pearson’s product‐moment correlation
coefficient, r, for these data. [2]

(a.ii) Interpret, in context, the value of r found in part (a)(i). [1]

The relationship between t and p can be modelled by the regression line of p on
t with equation p = at + b.

(b) Find the equation of the regression line of p on t. [2]


A 50‐year‐old male patient enters the medical clinic for his appointment.

(c) Use the regression equation from part (b) to predict this
patient’s systolic blood pressure. [2]

(d) A 16‐year‐old male patient enters the medical clinic for his
appointment.

Explain why the regression equation from part (b) should not
be used to predict this patient’s systolic blood pressure. [1]
4. [Maximum mark: 15] EXM.2.SL.TZ0.1
The principal of a high school is concerned about the effect social media use
might be having on the self-esteem of her students. She decides to survey a
random sample of 9 students to gather some data. She wants the number of
students in each grade in the sample to be, as far as possible, in the same
proportion as the number of students in each grade in the school.

(a) State the name for this type of sampling technique. [1]

The number of students in each grade in the school is shown in the following table.

(b.i) Show that 3 students will be selected from grade 12. [3]

(b.ii) Calculate the number of students in each grade in the sample. [2]
In order to select the 3 students from grade 12, the principal lists their names in
alphabetical order and selects the 28th, 56th and 84th student on the list.

(c) State the name for this type of sampling technique. [1]
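The selection rule described above, taking names at a fixed interval from an ordered list, is usually called systematic sampling. A minimal sketch follows, assuming a placeholder list of 100 names because the real size of grade 12 is not given here. `students`, `step` and `sample_size` are illustrative names.

```python
# Hypothetical alphabetical list of grade 12 students; the real list is not given.
students = [f"Student {i:03d}" for i in range(1, 101)]

step = 28        # interval implied by choosing the 28th, 56th and 84th names
sample_size = 3  # number of grade 12 students required

# Positions are 1-based, as in the question.
positions = [step * k for k in range(1, sample_size + 1)]
selected = [students[pos - 1] for pos in positions]

print(positions)
print(selected)
```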

Once the principal has obtained the names of the 9 students in the random
sample, she surveys each student to find out how long they used social media
the previous day and measures their self-esteem using the Rosenberg scale. The
Rosenberg scale is a number between 10 and 40, where a high number
represents high self-esteem.

(d.i) Calculate Pearson’s product moment correlation coefficient, r. [2]


(d.ii) Interpret the meaning of the value of r in the context of the
principal’s concerns. [1]

(d.iii) Explain why the value of r makes it appropriate to find the
equation of a regression line. [1]

(e) Another student at the school, Jasmine, has a self-esteem value
of 29.

By finding the equation of an appropriate regression line,
estimate the time Jasmine spent on social media the previous
day. [4]
5. [Maximum mark: 7] 23M.2.SL.TZ1.4
The total number of children, y, visiting a park depends on the highest
temperature, T, in degrees Celsius (°C). A park official predicts the total number
of children visiting his park on any given day using the model
$y = -0.6T^2 + 23T + 110$, where $10 \le T \le 35$.

(a) Use this model to estimate the number of children in the park
on a day when the highest temperature is 25 °C. [2]

An ice cream vendor investigates the relationship between the total number of
children visiting the park and the number of ice creams sold, x. The following
table shows the data collected on five different days.

Total number of children (y) |  81 | 175 | 202 | 346 | 360
Ice creams sold (x)          |  15 |  27 |  23 |  35 |  46

(b) Find an appropriate regression equation that will allow the
vendor to predict the number of ice creams sold on a day when
there are y children in the park. [3]
(c) Hence, use your regression equation to predict the number of
ice creams that the vendor sells on a day when the highest
temperature is 25°C. [2]
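One way to check the chain of reasoning in this question (temperature to number of children, then children to ice creams) is sketched below. It assumes that the appropriate line in part (b) is the least-squares regression of x on y, since x is being predicted from y. The helper names `children_model` and `least_squares` are illustrative.

```python
def children_model(T):
    """Park official's model for the number of children, stated for 10 <= T <= 35."""
    if not 10 <= T <= 35:
        raise ValueError("model is only stated for 10 <= T <= 35")
    return -0.6 * T**2 + 23 * T + 110


# Data from the table in the question.
children = [81, 175, 202, 346, 360]  # y
ice_creams = [15, 27, 23, 35, 46]    # x


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def least_squares(u, v):
    """Return (slope, intercept) of the least-squares regression line of v on u."""
    mu, mv = mean(u), mean(v)
    s_uv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    s_uu = sum((ui - mu) ** 2 for ui in u)
    slope = s_uv / s_uu
    return slope, mv - slope * mu


# Regression of x (ice creams) on y (children): x = slope*y + intercept.
slope, intercept = least_squares(children, ice_creams)

y_25 = children_model(25)            # part (a): children at T = 25
x_pred = slope * y_25 + intercept    # part (c): ice creams for that many children

print(f"children at 25 °C: {y_25:.0f}")
print(f"x = {slope:.4f}y + {intercept:.2f}")
print(f"predicted ice creams: {x_pred:.1f}")
```

A graphic display calculator's linear regression function should give the same coefficients up to rounding.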
6. [Maximum mark: 5] 22N.2.SL.TZ0.1
The following table shows the Mathematics test scores (x) and the Science test
scores (y) for a group of eight students.

The regression line of y on x for this data can be written in the form
y = ax + b.

(a) Find the value of a and the value of b. [2]

(b) Write down the value of the Pearson’s product-moment
correlation coefficient, r. [1]

(c) Use the equation of your regression line to predict the Science
test score for a student who has a score of 78 on the
Mathematics test. Express your answer to the nearest integer. [2]
7. [Maximum mark: 7] 22M.1.SL.TZ1.3
A survey at a swimming pool is given to one adult in each family. The age of
the adult, a years old, and of their eldest child, c years old, are recorded.

The ages of the eldest child are summarized in the following box and whisker
diagram.

(a) Find the largest value of c that would not be considered an
outlier. [3]

The regression line of a on c is $a = \frac{7}{4}c + 20$. The regression line of c on a is
$c = \frac{1}{2}a - 9$.

(b.i) One of the adults surveyed is 42 years old. Estimate the age of
their eldest child. [2]
(b.ii) Find the mean age of all the adults surveyed. [2]
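Parts (b.i) and (b.ii) can be worked through with exact fractions, as in the sketch below. It relies on the standard fact that both least-squares regression lines pass through the point of means, so solving the two equations simultaneously gives the mean ages. The variable names are illustrative.

```python
from fractions import Fraction

# Regression line of a on c: a = (7/4)c + 20
m_ac, k_ac = Fraction(7, 4), Fraction(20)
# Regression line of c on a: c = (1/2)a - 9
m_ca, k_ca = Fraction(1, 2), Fraction(-9)

# (b.i) Estimating a child's age from an adult's age uses the line of c on a.
child_estimate = m_ca * 42 + k_ca

# (b.ii) Both lines pass through (mean c, mean a). Substitute c = m_ca*a + k_ca
# into a = m_ac*c + k_ac and solve for a.
mean_a = (m_ac * k_ca + k_ac) / (1 - m_ac * m_ca)
mean_c = m_ca * mean_a + k_ca

print("estimated age of eldest child:", child_estimate)
print("mean adult age:", mean_a)
print("mean eldest-child age:", mean_c)
```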
8. [Maximum mark: 7] 22M.1.SL.TZ1.3
A survey at a swimming pool is given to one adult in each family. The age of
the adult, a years old, and of their eldest child, c years old, are recorded.

The ages of the eldest child are summarized in the following box and whisker
diagram.

(a) Find the largest value of c that would not be considered an
outlier. [3]

The regression line of a on c is $a = \frac{7}{4}c + 20$. The regression line of c on a is
$c = \frac{1}{2}a - 9$.

(b.i) One of the adults surveyed is 42 years old. Estimate the age of
their eldest child. [2]
(b.ii) Find the mean age of all the adults surveyed. [2]
9. [Maximum mark: 5] 21N.2.SL.TZ0.1
In Lucy’s music academy, eight students took their piano diploma examination
and achieved scores out of 150. For her records, Lucy decided to record the
average number of hours per week each student reported practising in the
weeks prior to their examination. These results are summarized in the table
below.

(a) Find Pearson’s product-moment correlation coefficient, r, for
these data. [2]

(b) The relationship between the variables can be modelled by the
regression equation D = ah + b. Write down the value of a
and the value of b. [1]

(c) One of these eight students was disappointed with her result
and wished she had practised more. Based on the given data,
determine how her score could have been expected to alter had
she practised an extra five hours per week. [2]
10. [Maximum mark: 7] 21N.2.AHL.TZ0.1
In Lucy’s music academy, eight students took their piano diploma examination
and achieved scores out of 150. For her records, Lucy decided to record the
average number of hours per week each student reported practising in the
weeks prior to their examination. These results are summarized in the table
below.

(a) Find Pearson’s product-moment correlation coefficient, r, for
these data. [2]

(b) The relationship between the variables can be modelled by the
regression equation D = ah + b. Write down the value of a
and the value of b. [1]

(c) One of these eight students was disappointed with her result
and wished she had practised more. Based on the given data,
determine how her score could have been expected to alter had
she practised an extra five hours per week. [2]
(d) Lucy asserts that the number of hours a student practises has a
direct effect on their final diploma result. Comment on the
validity of Lucy’s assertion. [1]

(e) Lucy suspected that each student had not been practising as
much as they reported. In order to compensate for this, Lucy
deducted a fixed number of hours per week from each of
the students’ recorded hours.

State how, if at all, the value of r would be affected. [1]
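The effect of a fixed deduction on r can be explored numerically. The sketch below uses invented hours and scores, not the question's table, and an arbitrary deduction of 2 hours. `pearson_r` is a hand-written helper that implements the usual product-moment formula.

```python
from math import sqrt

# Hypothetical practice hours and diploma scores (NOT the data from the question).
hours = [5, 8, 10, 12, 15, 18, 20, 24]
scores = [70, 82, 88, 95, 101, 118, 115, 130]


def pearson_r(u, v):
    """Pearson's product-moment correlation coefficient of two equal-length lists."""
    mu = sum(u) / len(u)
    mv = sum(v) / len(v)
    s_uv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    s_uu = sum((ui - mu) ** 2 for ui in u)
    s_vv = sum((vi - mv) ** 2 for vi in v)
    return s_uv / sqrt(s_uu * s_vv)


deduction = 2  # hypothetical fixed number of hours removed from every student
adjusted_hours = [h - deduction for h in hours]

r_reported = pearson_r(hours, scores)
r_adjusted = pearson_r(adjusted_hours, scores)

print(f"r with reported hours: {r_reported:.4f}")
print(f"r with adjusted hours: {r_adjusted:.4f}")
```

Subtracting the same constant from every value shifts the mean by that constant, so each deviation from the mean is unchanged; comparing the two printed values illustrates this.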


11. [Maximum mark: 7] 21M.2.SL.TZ1.2
The following table shows the data collected from an experiment.

The data is also represented on the following scatter diagram.

The relationship between x and y can be modelled by the regression line of y
on x with equation y = ax + b, where a, b ∈ ℝ.

(a) Write down the value of a and the value of b. [2]

(b) Use this model to predict the value of y when x = 18. [2]
(c) Write down the value of $\bar{x}$ and the value of $\bar{y}$. [1]

(d) Draw the line of best fit on the scatter diagram. [2]
12. [Maximum mark: 6] 21M.2.SL.TZ2.1
At a café, the waiting time between ordering and receiving a cup of coffee is
dependent upon the number of customers who have already ordered their
coffee and are waiting to receive it.

Sarah, a regular customer, visited the café on five consecutive days. The
following table shows the number of customers, x, ahead of Sarah who have
already ordered and are waiting to receive their coffee and Sarah’s waiting time,
y minutes.

The relationship between x and y can be modelled by the regression line of y
on x with equation y = ax + b.

(a.i) Find the value of a and the value of b. [2]

(a.ii) Write down the value of Pearson’s product-moment correlation
coefficient, r. [1]

(b) Interpret, in context, the value of a found in part (a)(i). [1]


(c) On another day, Sarah visits the café to order a coffee. Seven
customers have already ordered their coffee and are waiting to
receive it.

Use the result from part (a)(i) to estimate Sarah’s waiting time to
receive her coffee. [2]
13. [Maximum mark: 6] 20N.2.SL.TZ0.S_2
Lucy sells hot chocolate drinks at her snack bar and has noticed that she sells
more hot chocolates on cooler days. On six different days, she records the
maximum daily temperature, T, measured in degrees centigrade, and the
number of hot chocolates sold, H. The results are shown in the following table.

The relationship between H and T can be modelled by the regression line
with equation H = aT + b.

(a.i) Find the value of a and of b. [3]

(a.ii) Write down the correlation coefficient. [1]

(b) Using the regression equation, estimate the number of hot
chocolates that Lucy will sell on a day when the maximum
temperature is 12°C. [2]

© International Baccalaureate Organization, 2024
