
Lecture 2

The document discusses calculating and interpreting the slope and y-intercept of a regression line. It provides an example of regression analysis on student data, calculating the slope as 0.945 and y-intercept as 79.078. This means for every 1 unit increase in study hours, grade increases by 0.945 on average, and the predicted grade is 79.078 with zero study hours. The document then discusses using the regression equation to predict dependent variable values. An example predicts a score of 97.829 for a student studying 14 hours based on the regression line of y’=75.667 + 1.583x.

Uploaded by

Norhan Esmail

Lesson 1 Calculation and Interpretation of the Slope and Y-intercept of a Regression Line

The regression line is also called the line of best fit. It lets us interpret trends in the data and make predictions based on that data; prediction is discussed further in the next lesson.
Again, please take note that before doing a regression, you first need to check the following assumptions:
a. There exists a relationship between the variables; and
b. The relationship has been tested and found to be significant.

These conditions must be met first; otherwise, doing a regression analysis would be pointless.
A scatterplot is one way of illustrating a line of best fit. The figure below shows a scatterplot of data on two variables. Notice that several lines can be drawn on the graph near the points, but only one of them is the line of best fit. Best fit means that the sum of the squares of the vertical distances from each point to the line is at a minimum.
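
To make the "best fit" criterion concrete, here is a minimal Python sketch that computes the sum of squared vertical distances for two candidate lines. The three points and both candidate lines are hypothetical, made up only for this illustration, and the function name `sum_of_squared_errors` is an assumption rather than anything from the lesson.

```python
def sum_of_squared_errors(points, a, b):
    """Sum of squared vertical distances from each point to the line y' = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in points)


# Hypothetical points, made up only for this illustration
points = [(1, 2), (2, 3), (3, 5)]

# Two candidate lines drawn "near the points"
sse_line_1 = sum_of_squared_errors(points, a=0.5, b=1.5)
sse_line_2 = sum_of_squared_errors(points, a=1.0, b=1.0)

print(sse_line_1)  # 0.25
print(sse_line_2)  # 1.0
```

Of these two candidates, the first has the smaller sum of squares, so it fits these points better. The line of best fit is the one for which this sum is smallest among all possible lines.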

The Equation of a Regression Line


Recall from algebra that the equation of a line is given by 𝑦 = 𝑚𝑥 + 𝑏, where 𝑚 stands for the slope and 𝑏 for the y-intercept. Similarly, the equation of a regression line is given by 𝑦′ = 𝑎 + 𝑏𝑥, where 𝑏 is the slope and 𝑎 is the y-intercept. Note that 𝑏 plays a different role in the two equations.
Furthermore, the corresponding formulas for the y-intercept 𝑎 and the slope 𝑏 are as follows:

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2}$$

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}$$

where 𝑛 is the number of data pairs.


Round both 𝑎 and 𝑏 to three decimal places.
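
The two formulas translate almost line by line into code. The following is a minimal sketch, not a definitive implementation: the function name `regression_coefficients` and the zero-denominator check are assumptions added for illustration, and the small data set in the usage lines is hypothetical (points lying exactly on 𝑦 = 1 + 2𝑥).

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for the regression line y' = a + b*x, rounded to 3 decimals."""
    n = len(xs)                                   # number of data pairs
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Both formulas share the same denominator: n(sum x^2) - (sum x)^2
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Assumption for this sketch: reject data where every x is the same
        raise ValueError("all x values are identical; the slope is undefined")

    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # y-intercept
    b = (n * sum_xy - sum_x * sum_y) / denominator        # slope
    return round(a, 3), round(b, 3)


# Hypothetical check: points that lie exactly on y = 1 + 2x
a, b = regression_coefficients([0, 1, 2], [1, 3, 5])
print(a, b)  # 1.0 2.0
```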

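Example:
Given the data below, find the equation of the regression line and provide an interpretation of the results.

| Student | No. of Study Hours (𝑥) | Final Grade in Math (𝑦) |
|---------|------------------------|--------------------------|
| A       | 2                      | 79                       |
| B       | 3                      | 83                       |
| C       | 5                      | 85                       |
| D       | 9                      | 88                       |
| E       | 11                     | 89                       |
| F       | 15                     | 93                       |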

NORHAN A. SARIP 1
Solution

Before we can solve for the equation of the regression line, we first need the necessary summations. A completed table like the one shown below is of great help.

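| Student | No. of Study Hours (𝑥) | Final Grade in Math (𝑦) | 𝑥𝑦   | 𝑥²  |
|---------|------------------------|--------------------------|------|-----|
| A       | 2                      | 79                       | 158  | 4   |
| B       | 3                      | 83                       | 249  | 9   |
| C       | 5                      | 85                       | 425  | 25  |
| D       | 9                      | 88                       | 792  | 81  |
| E       | 11                     | 89                       | 979  | 121 |
| F       | 15                     | 93                       | 1395 | 225 |
| Total   | 45                     | 517                      | 3998 | 465 |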

The values needed for solving the equation are as follows:


𝑛 = 6, since there are six pairs of data
∑ 𝑥 = 45
∑ 𝑦 = 517
∑ 𝑥𝑦 = 3998
∑ 𝑥² = 465
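
The table columns and their totals can be checked with a few lines of Python. This is only a sketch of the bookkeeping step; the variable names are illustrative assumptions, while the data are the six student records from the example.

```python
# Data from the example: study hours (x) and final grades in math (y)
hours = [2, 3, 5, 9, 11, 15]
grades = [79, 83, 85, 88, 89, 93]

# Extra columns of the completed table
xy_column = [x * y for x, y in zip(hours, grades)]
x2_column = [x ** 2 for x in hours]

# Summations needed by the formulas for a and b
n = len(hours)
sum_x = sum(hours)
sum_y = sum(grades)
sum_xy = sum(xy_column)
sum_x2 = sum(x2_column)

print(n, sum_x, sum_y, sum_xy, sum_x2)  # 6 45 517 3998 465
```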

Solving for the y-intercept 𝑎, we get

$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} = \frac{(517)(465) - (45)(3998)}{6(465) - 45^2} = \frac{240405 - 179910}{2790 - 2025} = \frac{60495}{765} \approx 79.078$$

Solving for the slope 𝑏, we also get

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} = \frac{6(3998) - (45)(517)}{6(465) - 45^2} = \frac{23988 - 23265}{2790 - 2025} = \frac{723}{765} \approx 0.945$$

Hence, the equation of the regression line 𝑦′ = 𝑎 + 𝑏𝑥 is 𝑦′ = 79.078 + 0.945𝑥, where the slope is 0.945 and the y-intercept is 79.078. The y-intercept is the value of 𝑦′ when 𝑥 = 0. That is, it is the value at the point where the line intersects the y-axis.
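
As a cross-check, the same coefficients can be obtained from an algebraically equivalent form of the least-squares formulas that works with deviations from the means. This form is not the one used in the lesson; it is swapped in here only to confirm the result, and the variable names are assumptions.

```python
# Data from the example: study hours (x) and final grades in math (y)
hours = [2, 3, 5, 9, 11, 15]
grades = [79, 83, 85, 88, 89, 93]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(grades) / n

# Deviation form: b = S_xy / S_xx and a = mean_y - b * mean_x
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, grades))
s_xx = sum((x - mean_x) ** 2 for x in hours)

b = s_xy / s_xx            # slope
a = mean_y - b * mean_x    # y-intercept

print(round(a, 3), round(b, 3))  # 79.078 0.945
```

Both forms agree to three decimal places, which supports the hand computation above.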

Interpretation
Marginal change is the magnitude of the change in one variable when the other variable changes by exactly one unit. In the problem, the value of the slope 𝑏, which is 0.945, is the marginal change. This means that for every one-unit increase in 𝑥, the number of study hours, the value of 𝑦, the grade, increases by 0.945 units on average. Similarly, the value of the y-intercept 𝑎 is 79.078. This means that the predicted grade of a student with zero hours of study is 79.078.
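
The meaning of the slope as a marginal change can be seen by evaluating the regression line at two study times that differ by one hour. This is a small sketch using the rounded coefficients from the example; the function name `predicted_grade` and the choice of 5 and 6 hours are illustrative assumptions.

```python
A = 79.078  # y-intercept from the example
B = 0.945   # slope from the example


def predicted_grade(hours):
    """Evaluate the regression line y' = a + b*x for a given number of study hours."""
    return A + B * hours


# One extra hour of study changes the predicted grade by the slope
change = predicted_grade(6) - predicted_grade(5)
print(round(change, 3))        # 0.945

# With zero study hours, the prediction equals the y-intercept
print(predicted_grade(0))      # 79.078
```

The difference is the same for any pair of values one hour apart, because the line has a constant slope.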

Lesson 2 Solving Problems Involving Regression Analysis

Today, you will learn how to use the equation of a regression line to make predictions about the value of the dependent variable. That's right! You heard it properly: prediction, or shall I say estimation, of the value of the dependent variable for a value of the independent variable that is not present in your data.

To give you an idea of how to make such a prediction (or estimation), let me start by showing you a sample problem.

Example:

Below are sample data on the top-achieving students of a school, giving their number of study hours (𝑥) and their score in the math final exam (𝑦). Find the equation of the regression line and predict the value of the dependent variable when the value of the independent variable is 14.

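| Student | No. of Study Hours (𝑥) | Score out of 100 (𝑦) |
|---------|------------------------|-----------------------|
| A       | 5                      | 83                    |
| B       | 7                      | 87                    |
| C       | 8                      | 89                    |
| D       | 11                     | 93                    |
| E       | 13                     | 96                    |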

Before we proceed with the computation, we must remember that in a regression analysis the data must be correlated and the correlation must be significant. For the sake of this discussion, let us assume that these requirements have been met.
Now, as in the previous lesson, we first need to solve for the values needed to find the y-intercept 𝑎 and the slope 𝑏. Hence, we should come up with the following:

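| Student | No. of Study Hours (𝑥) | Score out of 100 (𝑦) | 𝑥𝑦   | 𝑥²  |
|---------|------------------------|-----------------------|------|-----|
| A       | 5                      | 83                    | 415  | 25  |
| B       | 7                      | 87                    | 609  | 49  |
| C       | 8                      | 89                    | 712  | 64  |
| D       | 11                     | 93                    | 1023 | 121 |
| E       | 13                     | 96                    | 1248 | 169 |
| Total   | 44                     | 448                   | 4007 | 428 |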

The values needed for solving the equation are as follows:


𝑛 = 5, since there are five pairs of data
∑ 𝑥 = 44
∑ 𝑦 = 448
∑ 𝑥𝑦 = 4007
∑ 𝑥² = 428

Solving for the y-intercept 𝑎, we get


$$a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n(\sum x^2) - (\sum x)^2} = \frac{(448)(428) - (44)(4007)}{5(428) - 44^2} = \frac{191744 - 176308}{2140 - 1936} = \frac{15436}{204} \approx 75.667$$

Solving for the slope 𝑏, we also get

$$b = \frac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2} = \frac{5(4007) - (44)(448)}{5(428) - 44^2} = \frac{20035 - 19712}{2140 - 1936} = \frac{323}{204} \approx 1.583$$

Hence, the equation of the regression line 𝑦 ′ = 𝑎 + 𝑏𝑥 is 𝑦 ′ = 75.667 + 1.583𝑥 where the slope
is 1.583 and the y-intercept is 75.667.

Interpretation
In the regression line equation, our slope 𝑏 is 1.583, which means that for every one-unit increase in 𝑥, the number of study hours, the value of 𝑦, the score, increases by 1.583 units on average. Similarly, the value of the y-intercept 𝑎 is 75.667. This means that the predicted score of a student with zero hours of study is 75.667.

Now, since our main objective is to predict the value of 𝑦 when the value of 𝑥 is 14, we will now
use our newfound equation. We will replace 𝑥 with 14.

𝑦 ′ = 75.667 + 1.583𝑥
𝑦 ′ = 75.667 + 1.583(14)
𝑦 ′ = 75.667 + 22.162
𝑦 ′ = 97.829

Hence, if a student studies for 14 hours, his/her expected score in the math exam would be 97.829.
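
The whole procedure for this example, from the raw data to the prediction at 𝑥 = 14, can be sketched in Python as follows. The variable names are illustrative assumptions; the data, the formulas, and the three-decimal rounding rule are those of the lesson.

```python
# Data from the example: study hours (x) and scores out of 100 (y)
hours = [5, 7, 8, 11, 13]
scores = [83, 87, 89, 93, 96]

n = len(hours)
sum_x = sum(hours)
sum_y = sum(scores)
sum_xy = sum(x * y for x, y in zip(hours, scores))
sum_x2 = sum(x * x for x in hours)

denominator = n * sum_x2 - sum_x ** 2
a_exact = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # y-intercept
b_exact = (n * sum_xy - sum_x * sum_y) / denominator        # slope

# Apply the rounding rule before predicting, as in the hand computation
a = round(a_exact, 3)
b = round(b_exact, 3)
prediction = round(a + b * 14, 3)
print(a, b, prediction)  # 75.667 1.583 97.829

# For comparison only: predicting with the unrounded coefficients
unrounded_prediction = round(a_exact + b_exact * 14, 3)
print(unrounded_prediction)  # 97.833
```

The small gap between 97.829 and 97.833 comes only from rounding 𝑎 and 𝑏 before substituting. The lesson follows the first approach.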

PLEASE TAKE NOTE:

When using a regression line, you can only apply the interpretations of the slope and y-intercept over the range of 𝑥 values observed in the data. It is dangerous to make predictions or statements beyond the scope of what you observed in the data set.
In our example, we found that a student who studies for 14 hours would have an expected score of 97.829. But should we use that same equation to predict scores when the number of study hours is very large, say 100? Definitely not. The equation would give 𝑦′ = 75.667 + 1.583(100) = 233.967, which is impossible on an exam scored out of 100.
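
One way to keep this caution in view is to have the prediction step report whether the requested 𝑥 lies inside the range of observed study hours. The sketch below is only an illustration: the function name `predict_score` and the idea of returning a true/false flag are assumptions, not part of the lesson.

```python
# Observed study hours and rounded coefficients from the example
OBSERVED_HOURS = [5, 7, 8, 11, 13]
A = 75.667  # y-intercept
B = 1.583   # slope


def predict_score(hours):
    """Return (predicted score, whether `hours` is within the observed x range)."""
    prediction = round(A + B * hours, 3)
    inside_range = min(OBSERVED_HOURS) <= hours <= max(OBSERVED_HOURS)
    return prediction, inside_range


print(predict_score(10))   # (91.497, True)
print(predict_score(100))  # (233.967, False)
```

Under this strict check, 𝑥 = 14 is also flagged, since it lies just outside the observed range of 5 to 13 hours. The further a value is from that range, the less the prediction can be trusted.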
