Statistics and Probability Quarter 4: Week 8-Module 16 Regression Analysis
Copyright © 2021
La Union Schools Division
Region I
All rights reserved. No part of this module may be reproduced in any form
without written permission from the copyright owners.
In this lesson, we will take a deeper look at the trend line. We will analyze it
more precisely by finding its mathematical equation and learning how that
equation is used in prediction.
After going through this lesson, you are expected to:
Subtasks:
1. Find the regression line
Before going on, check how much you know about this topic. Answer
the pretest below on a separate sheet of paper.
Pre-test
Directions: Choose the letter of the correct answer. Write your answer on a separate
sheet of paper.
A. Line A
B. Line B
C. Line C
D. None of the lines fit the data.
2. Schon distributed a survey to his fellow students asking them how many
hours they'd spent playing sports in the past day. He also asked them to rate
their mood on a scale from 0 to 10, with 10 being the happiest. A line was fit
to the data to model the relationship. Which of these linear equations best
describes the given model?
A. Y = 5X + 1.5
B. Y = 1.5X + 5
C. Y = -1.5X+ 5
D. Y = -5X + 1.5
3. Refer to the model in number 2. Estimate the mood rating for a student who
spent 2.5 hours playing sports.
A. 10 B. 9.5 C. 8.75 D. 1.25
5. Regression analysis was applied between sales (y) and advertising (x) across
all the branches of a major national corporation. The following regression
function was obtained, y = 5000 + 7.25x. If the advertising budgets of two
branches of the corporation differ by 30,000, then what will be the predicted
difference in their sales?
A. 217,500 B. 222,500 C. 5000 D. 7.25
10. You studied the impact of the dose of a new drug treatment for high blood
pressure. You think that the drug might be more effective in people with very
high blood pressure. Because you expect a bigger change in those patients
who start the treatment with high blood pressure, you use regression to
analyze the relationship between the initial blood pressure of a patient (x) and
the change in blood pressure after treatment with the new drug (y). If you find
a very strong positive association between these variables, what would be your
conclusion?
A. There is evidence that the higher the patient’s initial blood pressure, the
bigger the impact of the new drug.
B. There is evidence that the higher the patient’s initial blood pressure, the
smaller the impact of the new drug.
C. There is evidence for an association of some kind between the patient’s
initial blood pressure and the impact of the new drug on the patient’s
blood pressure.
D. None of these are correct, this is a case of regression fallacy.
11. The relationship between number of beers consumed (x) and blood alcohol
content (y) was studied in 16 male college students by using least squares
regression. The following regression equation was obtained from this study:
Y’ = -0.0127 + 0.0180x. What does the equation imply?
A. Each beer consumed increases blood alcohol by 1.27%.
B. Each beer consumed increases blood alcohol by exactly 0.018.
C. On the average it takes 1.8 beers to increase blood alcohol content by 1%.
D. Each beer consumed increases blood alcohol by an average amount of
1.8%
13. In the case of an algebraic model for a straight line, if a value for the x variable
is specified, what would happen to the value for y?
A. The computed response to the independent value will always give a
minimal residual variable.
B. The computed value of y will always be the best estimate of the mean
response.
C. The exact value of the response variable can be computed.
D. None of these alternatives is correct.
14. A regression analysis between sales (in P1000) and price (in peso) resulted in
the following equation: Y’ = 50,000 - 8X. What is the implication of the given
equation?
A. increase of P 1 in price is associated with a decrease of P 8 in sales
B. increase of P 1 in price is associated with a decrease of P 8000 in sales
C. increase of P 8 in price is associated with an increase of P 8,000 in sales
D. increase of P 1 in price is associated with a decrease of P 42,000 in sales
15. Regression analysis was used to study the relationship between the return
rate (x: % of birds that return to the colony in a given year) and the
immigration rate (y: % of new adults that join the colony per year) of
sparrowhawk colonies. The following regression equation was obtained:
Y’ = 31.9 - 0.34x. Based on this estimated regression equation, if the return
rate were to decrease by 10%, how would the predicted immigration rate of the
colony change?
A. Increase by 34% B. Increase by 3.4%
C. Decrease by 0.34% D. Decrease by 3.4%
Jumpstart
For you to understand the lesson well, do the following
activities. Have fun and good luck!
X 0 1 2 3 4 5
Y 4 6 8 10 12 14
When the trend line is drawn, we observe that some points are on the line
while others are below or above it. In other words, the points in the scatterplot
regress with reference to the line. If the sum of the squared vertical (y) distances
of the points from this line is the least possible, then we call this line the
regression line, or the line of “best fit” for the scatterplot. The regression line is
the same as the trend line.
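One common way to make "least distance" concrete is the least-squares criterion, which adds up the squared vertical distances from the points to a line. The sketch below compares three candidate lines on a small set of hypothetical points (the points and the candidate lines are assumptions chosen for illustration, not data from this module).

```python
# Hypothetical points chosen only for illustration; they are not from the module.
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]


def sum_squared_distances(slope, intercept, data):
    """Add up the squared vertical (y) distances from each point to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in data)


# Candidate lines as (slope, intercept). The first is the least-squares line
# for these points; the other two are nearby guesses.
candidates = [(0.9, 1.3), (1.0, 1.0), (0.8, 1.5)]

totals = {line: sum_squared_distances(line[0], line[1], points) for line in candidates}
best_line = min(totals, key=totals.get)

for (slope, intercept), total in totals.items():
    print(f"y = {slope}x + {intercept}: sum of squared distances = {total:.2f}")
print("Smallest total:", best_line)
```

Among these three candidates, the line with slope 0.9 and y-intercept 1.3 gives the smallest total, which is what "best fit" means under this criterion.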
To find the regression line, we write its equation in the same way as the
equation of a line in Algebra, using the slope-intercept form:
Y’ = bX + a
where:
a = y-intercept
Formula for the y-intercept: $a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2}$
b = slope of the regression line
Formula for the slope of the regression line: $b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}$
n = number of cases
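The two formulas can be sketched in Python as follows. This is a minimal illustration, not part of the module: the function name `regression_coefficients` is hypothetical, and the sample values are the X and Y rows of the Jumpstart table.

```python
def regression_coefficients(xs, ys):
    """Return (a, b): the y-intercept and slope from the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Both formulas share the same denominator: n(ΣX²) − (ΣX)²
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("All X values are equal, so the slope is undefined.")

    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


# X and Y values from the Jumpstart table
a, b = regression_coefficients([0, 1, 2, 3, 4, 5], [4, 6, 8, 10, 12, 14])
print(a, b)  # 4.0 2.0
```

Because the Jumpstart points lie exactly on a straight line, the formulas return that line's own y-intercept and slope.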
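STEPS AND SOLUTION
1. Identify the dependent and independent variables.
Dependent variable (Y): height of the son
Independent variable (X): height of the father
2. Compute the correlation coefficient using the formula
$r = \frac{n(\sum XY) - (\sum X)(\sum Y)}{\sqrt{[n(\sum X^2) - (\sum X)^2][n(\sum Y^2) - (\sum Y)^2]}}$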
X     Y     X²      Y²      XY
71    71    5041    5041    5041
69    69    4761    4761    4761
69    71    4761    5041    4899
65    68    4225    4624    4420
66    68    4356    4624    4488
63    66    3969    4356    4158
68    70    4624    4900    4760
70    72    4900    5184    5040
60    65    3600    4225    3900
58    60    3364    3600    3480
ΣX = 659    ΣY = 680    ΣX² = 43601    ΣY² = 46356    ΣXY = 44947
$r = \frac{10(44947) - (659)(680)}{\sqrt{[10(43601) - (659)^2][10(46356) - (680)^2]}}$
r = 0.95
3. Test the significance using the formula
$t = r\sqrt{\frac{n-2}{1-r^2}}$
With n = 10 and r = 0.95:
$t = 0.95\sqrt{\frac{10-2}{1-(0.95)^2}}$
t = 8.61
4. Compare the computed t-value to the critical t-value.
Using df = n - 2 = 10 - 2 = 8 and a 0.05 level of significance for a two-tailed
test, we find from the table that the critical value of t is 2.306.
5. Make a decision.
Since the computed t = 8.61 is greater than the critical t = 2.306, we reject the
null hypothesis. So, there is a significant relationship between the two variables.
6. Summarize.
There is sufficient evidence to conclude that there is a significant relationship
between the height of the father and the height of the son. Thus, we will
proceed to regression analysis.
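7. Compute the values of a and b in the regression equation using the formulas
$a = \frac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2}$
$b = \frac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}$
Substituting the sums from the table:
$a = \frac{(680)(43601) - (659)(44947)}{10(43601) - (659)^2}$
a = 16.55
$b = \frac{10(44947) - (659)(680)}{10(43601) - (659)^2}$
b = 0.78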
8. Form the regression equation.
Plug the values of a and b into the equation Y’ = bX + a:
Y’ = 0.78X + 16.55
The regression equation for predicting the height of the son given the height
of the father is Y’ = 0.78X + 16.55.
9. Predict the height of the son if the height of the father is 78 inches.
X = 78
Y’ = 0.78X + 16.55
Y’ = 0.78(78) + 16.55
= 77.39 or 77 inches
So, the predicted height of the son whose father’s height is 78 inches is
77 inches.
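The arithmetic in the steps above can be checked with a short Python sketch. It is an illustration under stated assumptions rather than part of the module: the variable names are hypothetical, the critical value 2.306 is copied from the t-table lookup in step 4, and r, a, and b are rounded to two decimal places before they are reused, as the hand computation does. Carrying full precision instead would give a somewhat larger t and a slightly different prediction.

```python
import math

# Heights in inches, from the X and Y columns of the table in step 2
father = [71, 69, 69, 65, 66, 63, 68, 70, 60, 58]  # X
son = [71, 69, 71, 68, 68, 66, 70, 72, 65, 60]     # Y

n = len(father)
sum_x, sum_y = sum(father), sum(son)
sum_x2 = sum(x * x for x in father)
sum_y2 = sum(y * y for y in son)
sum_xy = sum(x * y for x, y in zip(father, son))

# Step 2: correlation coefficient, rounded as in the hand computation
numerator = n * sum_xy - sum_x * sum_y
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = round(r, 2)

# Steps 3 to 5: t statistic compared with the critical value from the table
t = r * math.sqrt((n - 2) / (1 - r ** 2))
critical_t = 2.306  # df = 8, 0.05 level, two-tailed (taken from the t-table)
significant = t > critical_t

# Step 7: y-intercept and slope, rounded as in the hand computation
denominator = n * sum_x2 - sum_x ** 2
a = round((sum_y * sum_x2 - sum_x * sum_xy) / denominator, 2)
b = round(numerator / denominator, 2)

# Step 9: predicted height of the son when the father is 78 inches tall
predicted = b * 78 + a

print(f"r = {r}, t = {t:.2f}, significant = {significant}")
print(f"Y' = {b}X + {a}, predicted height = {predicted:.2f}")
```

Rounding at each stage is kept only so that the output matches the worked solution; in practice, rounding once at the end is the more accurate habit.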
Explore
[Scatterplot: both axes run from 0 to 6]
a. Compute r.
b. Find the slope (b) and the y-intercept (a).
c. Find the regression line.
d. Plot the regression line. Is the line closest to the points?
e. Predict the value of Y when X = 2.5. Show this in the graph.
Deepen
The following data show the age of a car and the average mileage/liter.
Age (in years)            0     1     2     3     4     5     6
Mileage (km per liter)    20.6  18.1  16.3  15.5  14.1  13.9  11.2
a. Find the regression line that will predict the average mileage per liter of
the car.
b. Predict the average mileage per liter of the car when it is 10 years old.
Gauge
Directions: Choose the letter of the correct answer. Write your answer on a separate
sheet of paper.
3. Which of the following is the same as the point-slope form equation of a line
in algebra?
A. Correlation B. Regression Line
C. Regression analysis D. Slope-intercept form
11. A regression analysis between sales (in P1000) and price (in peso) resulted in
the following equation: Y’ = 50,000 - 8X. What is the implication of the given
equation?
A. increase of P 1 in price is associated with a decrease of P 8 in sales
B. increase of P 1 in price is associated with a decrease of P 8000 in sales
C. increase of P 8 in price is associated with an increase of P 8,000 in sales
D. increase of P 1 in price is associated with a decrease of P 42,000 in sales
14. For the regression line y = 2.6x + 0.56, what will be the value of y if x = 3.5?
A. 7.66 B. 8.66 C. 9.66 D. 10.66
15. The relationship between number of beers consumed (x) and blood alcohol
content (y) was studied in 16 male college students by using least squares
regression. The following regression equation was obtained from this study:
Y’ = -0.0127 + 0.0180x. What does the equation imply?
A. Each beer consumed increases blood alcohol by 1.27%.
B. Each beer consumed increases blood alcohol by exactly 0.018.
C. On the average it takes 1.8 beers to increase blood alcohol content
by 1%.
D. Each beer consumed increases blood alcohol by an average amount
of 1.8%
Answer Key
Pre-Test
1. A
2. B
3. C
4. D
5. A
6. D
7. A
8. C
9. C
10. D
11. D
12. D
13. C
14. B
15. B
Jumpstart
Activity 1
Slope: 2
Y-intercept: 4
Equation of the line: y = 2x + 4
Activity 2
   Dependent                Independent
1. Altitude                 Acceleration
2. Demand                   Price of goods
3. Annual income            Monthly salary
4. Academic performance     IQ
5. Volume of air balloon    Temperature
Explore
Activity 3
a. r = 0.75
b. b = 0.7
a = 1.1
c. y’ = 0.7x + 1.1
d. [Graph of the regression line]
e. y’ = 2.85
Deepen
a. y’ = -1.39x + 19.83
b. Average mileage = 5.93 km per liter, or about 6 km per liter
Gauge
1. A
2. C
3. B
4. D
5. C
6. D
7. D
8. C
9. B
10. D
11. B
12. B
13. C
14. C
15. D