Cea Ece069 Sas-17-1

The document is a student activity sheet for ECE 069: Engineering Data Analysis, focusing on simple linear regression and correlation coefficients. It outlines lesson objectives, provides a lesson preview, and includes activities for understanding and applying regression analysis, including examples and exercises. Key concepts covered include the regression equation, correlation coefficient, and assumptions of simple linear regression.

Uploaded by

ryanjay7165

ECE 069: Engineering Data Analysis

Module #17 Student Activity Sheet

Name: _________________________________________________ Class number: _______


Section: ____________ Schedule: __________________________ Date: _______________

Lesson Title: SIMPLE LINEAR REGRESSION & CORRELATION COEFFICIENT

Materials: Board, marker, and calculator (Casio fx-350EX model)

Lesson Objectives: At the end of this session the students will be able to:
1. solve for the regression equation and the correlation coefficient of two variables, then interpret the extent of the relationship between the response variable and the predictor variable.

References:
http://onlinestatbook.com/
https://www.scribbr.com/
https://www.thebalancesmb.com/
https://wps.prenhall.com/

Productivity Tip: Stay Healthy.

A. LESSON PREVIEW / REVIEW

1) Introduction (2 mins)

• Simple linear regression is a statistical method for obtaining a formula to predict the scores on one variable from the scores on a second variable. The variable we are predicting is called the criterion variable (also called the response variable) and is referred to as Y. The variable we are basing our predictions on is called the predictor variable and is referred to as X. When there is only one predictor variable, the prediction method is called simple regression.
• In simple linear regression, the predictions of Y, when plotted as a function of X, form a straight line.
• Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line.

2) Activity 1: What I Know Chart, part 1 (3 mins)


In the first column of the table below, write what you already know about each question in the second column.

What I Know | QUESTION | What I Learned
            | What is simple linear regression? |
            | What is the correlation coefficient? |

This document is a property of PHINMA EDUCATION


B. MAIN LESSON

1) Activity 2: Content Notes (13 mins)

Simple Linear Regression Model

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

where:
$Y_i$ = the response variable
$X_i$ = the predictor (or explanatory) variable
$\beta_0$ and $\beta_1$ = the regression coefficients
$\varepsilon_i$ = the residual error

The residual error is $\varepsilon_i = Y_i - \hat{Y}_i$, where $Y_i$ is the observed value and $\hat{Y}_i = \beta_0 + \beta_1 X_i$ is the predicted value. The error term accounts for the variability in y that cannot be explained by the linear relationship between x and y. If ε were not present, knowing x would provide enough information to determine the value of y exactly.

The intercept of the regression line, $\beta_0$, and the slope of the regression line, $\beta_1$ (the coefficient of $X_i$), are estimated by minimizing the sum of the squares of the residual errors. This procedure is known as the Method of Least Squares.

$$\text{minimize} \quad \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat{Y}_i\right)^2$$
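As a rough illustration of the quantity being minimized, the Python sketch below computes the sum of squared residuals for two candidate lines. The function name `sum_squared_error` and the five sample points are assumptions made up for this illustration; they are not part of the lesson data.

```python
def sum_squared_error(xs, ys, b0, b1):
    """Sum of squared residuals for the candidate line y = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Made-up points, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sse_guess = sum_squared_error(xs, ys, 2.0, 0.5)  # an arbitrary guess at the line
sse_best = sum_squared_error(xs, ys, 2.2, 0.6)   # the least-squares line for these points

print(sse_guess, sse_best)
```

The least-squares line gives the smaller of the two sums; any other choice of intercept and slope for these points would give a larger sum.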

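From calculus, we arrive at the following values of $\beta_1$ and $\beta_0$.

$$\beta_1 = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} \qquad \text{(Equation 1)}$$

$$\beta_0 = \frac{1}{n}\left(\sum_{i=1}^{n} y_i - \beta_1 \sum_{i=1}^{n} x_i\right) \qquad \text{(Equation 2)}$$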
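The calculus step behind Equation 1 and Equation 2 can be sketched as follows. The symbol $S$ for the sum of squared residuals is introduced here only for this derivation. Setting both partial derivatives of $S$ to zero gives two linear equations in $\beta_0$ and $\beta_1$:

$$S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right)^2$$

$$\frac{\partial S}{\partial \beta_0} = -2\sum_{i=1}^{n} \left(y_i - \beta_0 - \beta_1 x_i\right) = 0 \;\Rightarrow\; \sum_{i=1}^{n} y_i = n\beta_0 + \beta_1 \sum_{i=1}^{n} x_i$$

$$\frac{\partial S}{\partial \beta_1} = -2\sum_{i=1}^{n} x_i\left(y_i - \beta_0 - \beta_1 x_i\right) = 0 \;\Rightarrow\; \sum_{i=1}^{n} x_i y_i = \beta_0 \sum_{i=1}^{n} x_i + \beta_1 \sum_{i=1}^{n} x_i^2$$

Solving the first equation for $\beta_0$ gives Equation 2. Substituting that expression into the second equation and multiplying through by $n$ gives

$$n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i = \beta_1\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right],$$

which rearranges to Equation 1.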

We then substitute the values of $\beta_0$ and $\beta_1$ into the equation of a line to obtain the regression line equation.

$$Y = \beta_0 + \beta_1 X \qquad \text{(Equation 3)}$$
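Equations 1 to 3 translate almost directly into code. The following is a minimal Python sketch, not a definitive implementation; the function names `fit_line` and `predict` and the sample points are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")

    b1 = (n * sum_xy - sum_x * sum_y) / denom  # Equation 1 (slope)
    b0 = (sum_y - b1 * sum_x) / n              # Equation 2 (intercept)
    return b0, b1


def predict(b0, b1, x):
    """Equation 3: value of the regression line at x."""
    return b0 + b1 * x


# Made-up points, for illustration only
b0, b1 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(b0, b1, predict(b0, b1, 6))
```

The slope is computed first because Equation 2 needs it, which is the same order used when solving by hand.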

Assumptions of Simple Linear Regression


Simple linear regression is a parametric method, meaning that it makes certain assumptions about the data. These assumptions are:
• Homogeneity of variance: the size of the error in our prediction doesn't change significantly across the values of the independent variable.
• Independence of observations.
• Normality: the data (more precisely, the residual errors) follow a normal distribution.
• Linearity: the relationship between the independent and dependent variable is linear, that is, the line of best fit through the data points is a straight line (rather than a curve).

Correlation Coefficient, r

• One of the most commonly used correlation coefficients is Pearson's correlation coefficient, r.
• The correlation coefficient, r, measures the strength and direction of the linear relationship between the response variable and the explanatory variable. Its value lies between -1 and +1.

Formula of the Correlation Coefficient, r

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} \qquad \text{(Equation 4)}$$
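Equation 4 can be sketched in Python as below. The function name `pearson_r` and the sample points are assumptions made for this illustration only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed with Equation 4."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("x or y has no variation; r is undefined")
    return numerator / denominator


# Made-up points, for illustration only
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r, 4))
```

Points that lie exactly on a rising straight line, such as (1, 2), (2, 4), (3, 6), give r = 1.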

Coefficient of Determination, $r^2$
• The square of the correlation coefficient.
• It is the proportion of variation in the response variable explained by the regression model.
• The most common interpretation of the coefficient of determination is how well the regression model fits the observed data. For example, a coefficient of determination of 60% shows that 60% of the variation in the response variable is explained by the regression model. Generally, a higher coefficient indicates a better fit for the model.
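The "proportion of variation explained" reading can be checked numerically. The Python sketch below, using made-up points chosen only for illustration, computes the coefficient of determination in two ways: as one minus the ratio of unexplained variation to total variation, and as the square of r from Equation 4. The variable names are assumptions of this sketch.

```python
# Made-up points, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Least-squares slope and intercept (Equations 1 and 2)
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = (sum_y - b1 * sum_x) / n

# Variation left unexplained by the line, and total variation about the mean of y
mean_y = sum_y / n
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1 - sse / sst

# Square of r from Equation 4, written without the square root
r_squared_eq4 = (n * sum_xy - sum_x * sum_y) ** 2 / (
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(round(r_squared, 4), round(r_squared_eq4, 4))
```

For these points both values come out to 0.6, so 60% of the variation in y is explained by the fitted line.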

Interpretation of the correlation coefficient, r.

[Figure: Graph of data points on the regression line at various values of r]

Example 1. A study was done on the effect of ambient temperature, x, on the electric power consumed, y, by an industrial plant. Other factors were held constant. Below are the data collected from the experiment. Find the equation of the regression line and estimate the electric power consumption when x = 70 °F.

Trial   y (BTU)   x (°F)
1       250       27
2       285       45
3       320       72
4       295       58
5       265       31
6       298       60
7       267       31
8       321       74

We extend the columns of the above table to solve for $\beta_0$ and $\beta_1$.

Trial   y      x     x*y      x^2     y^2
1       250    27    6750     729     62500
2       285    45    12825    2025    81225
3       320    72    23040    5184    102400
4       295    58    17110    3364    87025
5       265    31    8215     961     70225
6       298    60    17880    3600    88804
7       267    31    8277     961     71289
8       321    74    23754    5476    103041
Sum     2301   398   117851   22300   666509
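The extended columns and their sums can be reproduced with a few lines of Python, which is a convenient way to check hand or calculator arithmetic. This is a sketch; the variable names `rows` and `sums` are assumptions chosen for illustration.

```python
# Data from Example 1
ys = [250, 285, 320, 295, 265, 298, 267, 321]  # electric power consumed (BTU)
xs = [27, 45, 72, 58, 31, 60, 31, 74]          # ambient temperature (deg F)

# One row per trial: y, x, x*y, x^2, y^2
rows = [(y, x, x * y, x * x, y * y) for x, y in zip(xs, ys)]

# Column totals, in the same order as the row entries
sums = tuple(sum(column) for column in zip(*rows))

for trial, row in enumerate(rows, start=1):
    print(trial, *row)
print("sum", *sums)
```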

From this table, we have n = 8; $\sum y_i$ = 2,301; $\sum x_i$ = 398; $\sum x_i y_i$ = 117,851; $\sum x_i^2$ = 22,300; and $\sum y_i^2$ = 666,509. We then substitute these values into Equation 1, then into Equation 2, to solve for $\beta_1$ and $\beta_0$.

$$\beta_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{8(117{,}851) - (398)(2{,}301)}{8(22{,}300) - (398)^2} = \frac{27{,}010}{19{,}996} \approx 1.35$$

$$\beta_0 = \frac{1}{n}\left(\sum y_i - \beta_1 \sum x_i\right) = \frac{1}{8}\left[2{,}301 - 1.35(398)\right] \approx 220.5$$

• Substitute the values of $\beta_0$ and $\beta_1$ into Equation 3; hence, the regression line equation is
y = 220.5 + 1.35x.

• To predict the power consumption at x = 70 °F, we substitute this value into the regression line equation.

y = 220.5 + 1.35(70) = 315 BTU
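The arithmetic of Example 1 can be checked from the column sums alone. The Python sketch below keeps full precision instead of rounding the slope to 1.35 first, so its intercept is about 220.42 rather than 220.5; the prediction at 70 °F still rounds to 315 BTU. The variable names are assumptions of this sketch.

```python
# Column sums from the extended table of Example 1
n = 8
sum_y, sum_x, sum_xy, sum_x2 = 2301, 398, 117851, 22300

b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # Equation 1
b0 = (sum_y - b1 * sum_x) / n                                  # Equation 2
y_at_70 = b0 + b1 * 70                                         # Equation 3 at x = 70

print(round(b1, 2), round(b0, 2), round(y_at_70, 1))
```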

Example 2. What is the correlation coefficient for the data in Example 1? Interpret your result.

From the table in Example 1, we have n = 8; $\sum y_i$ = 2,301; $\sum x_i$ = 398; $\sum x_i y_i$ = 117,851; $\sum x_i^2$ = 22,300; and $\sum y_i^2$ = 666,509. Substitute these values into Equation 4 to solve for r.

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{8(117{,}851) - (398)(2{,}301)}{\sqrt{\left[8(22{,}300) - (398)^2\right]\left[8(666{,}509) - (2{,}301)^2\right]}} \approx 0.99$$

The value r = 0.99 indicates a very strong positive linear relationship between electric power consumption and ambient temperature: electric power consumption increases as ambient temperature increases. Furthermore, the coefficient of determination of 0.98 ($r^2 = 0.99^2$; about 0.97 if r is not rounded before squaring) indicates that about 98% of the variation in electric power consumption is explained by the regression on ambient temperature.
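As a check on the arithmetic of Example 2, the Python sketch below evaluates Equation 4 from the column sums of Example 1. It also squares r without rounding it first, which gives a coefficient of determination of about 0.97 instead of the 0.98 obtained from the rounded r = 0.99. The variable names are assumptions of this sketch.

```python
import math

# Column sums from the extended table of Example 1
n = 8
sum_y, sum_x = 2301, 398
sum_xy, sum_x2, sum_y2 = 117851, 22300, 666509

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator  # Equation 4
r_squared = r ** 2           # coefficient of determination, unrounded r

print(round(r, 2), round(r_squared, 2))
```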

2) Activity 3: Skill-building Activities (with answer key) (18 mins + 2 mins checking)
Given below is a data set on y and x. Let y be the response variable and x be the predictor variable. Find the equation of the regression line and the value of the correlation coefficient, r. Interpret your result.

x y

0 2
1 3
2 5
3 4
4 6

3) Activity 4: What I Know Chart, Part 2 (2 mins)


You may now fill in the third column of the table from Activity 1 based on what you know now.

What I Know | QUESTION | What I Learned
            | What is simple linear regression? |
            | What is the correlation coefficient? |

4) Activity 5: Check for Understanding (5 mins)


Multiple Choice. Encircle the best answer.
1. Regression analysis . . .
a. estimates the mean of two variables.
b. establishes cause and effect.
c. establishes a relation between two variables.
d. measures confidence.

2. If $r^2$ = 0.99, how confident are you in using the regression line to estimate the response variable given the predictor variable?
a. not confident
b. very confident
c. the relationship is too weak to predict
d. the relationship cannot be predicted

3. If the correlation coefficient is 0.90, the percentage of variation in the response variable explained by the variation in the predictor variable is . . .
a. 0.90%
b. 90%
c. 81%
d. 0.81%

4. The correlation coefficient is used to determine . . .


a. a value of the y-variable given a specific value of the x-variable.
b. a value of the x-variable given a specific value of the y-variable.
c. the strength of the relationship between the x and y variables.
d. none of the above.

5. Larger values of $r^2$ tell us that the observations are more closely grouped about the . . .
a. average value of the independent variables.
b. average value of the dependent variable
c. least squares line.
d. none of the above.

C. LESSON WRAP-UP
1) Activity 6: Thinking about Learning (5 mins)
You are done with the session! Let's track your progress.
Period 1 Period 2 Period 3
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17

Form groups of three. Search for a problem (with given data points) related to your profession that uses regression analysis. Solve for the regression line and the correlation coefficient, then interpret your result.

KEY TO CORRECTION
Activity #3
Extending the columns of the preceding table.

x y x*y x^2 y^2

0 2 0 0 4
1 3 3 1 9
2 5 10 4 25
3 4 12 9 16
4 6 24 16 36

SUM 10 20 49 30 90

From this table, we have n = 5; $\sum y_i$ = 20; $\sum x_i$ = 10; $\sum x_i y_i$ = 49; $\sum x_i^2$ = 30; $\sum y_i^2$ = 90.

We then substitute these values into Equation 1, then into Equation 2, to solve for $\beta_1$ and $\beta_0$.

$$\beta_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{5(49) - (10)(20)}{5(30) - (10)^2} = \frac{45}{50} = 0.9$$

$$\beta_0 = \frac{1}{n}\left(\sum y_i - \beta_1 \sum x_i\right) = \frac{1}{5}\left[20 - 0.9(10)\right] = 2.20$$

Substitute the values of $\beta_0$ and $\beta_1$ into Equation 3.

• The regression line equation
y = 2.20 + 0.9x
• The correlation coefficient, r

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} = \frac{5(49) - (10)(20)}{\sqrt{\left[5(30) - (10)^2\right]\left[5(90) - (20)^2\right]}} = \frac{45}{50} = 0.90$$

The value r = 0.90 indicates a very strong positive linear relationship between the y and x variables. Furthermore, the coefficient of determination of 0.81 ($r^2 = 0.90^2$) indicates that 81% of the variation in y is explained by the regression on x.
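The key for Activity 3 can be checked with a short Python sketch that applies Equations 1, 2, and 4 to the five data points. The intercept works out to (20 - 0.9 × 10) / 5 = 2.2, and the line passes through the point of means (2, 4), as a least-squares line always does. The variable names are assumptions of this sketch.

```python
import math

# Data from Activity 3
xs = [0, 1, 2, 3, 4]
ys = [2, 3, 5, 4, 6]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # Equation 1
b0 = (sum_y - b1 * sum_x) / n                                  # Equation 2
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)                                                              # Equation 4

print(round(b0, 2), round(b1, 2), round(r, 2), round(r ** 2, 2))
```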
