Regression Models - Follow
Regression is an approach for modeling the relationship between a quantitative dependent variable Y and one or more explanatory variables (or independent variables), represented by X(s). The case of one explanatory variable is called simple linear regression.
The main purposes of regression analysis are to understand the relationship between or among variables and to predict one variable based on the other(s).
At the completion of the Spring 2023 semester, students will be able to:
4.1 – Identify variables, visualize them using a scatter diagram, and use them in a regression model
4.2 – Develop simple linear regression equations from collected data and interpret the slope and intercept
4.3 – Compute the coefficient of determination and the coefficient of correlation and interpret their meanings
4.4 – List assumptions used in regression and use a residual plot to identify problems
4.7 – Develop a multiple regression model using Excel and use it for prediction
We are going to use the example below to go through learning objectives 4.1 to 4.6.
A cafeteria at a local college would like to come up with a regression model that would predict what a student would spend for lunch based on what they spent for breakfast. The cafeteria collected data from randomly selected students, and the results are shown below:
Speculation: There seems to be a negative linear relationship between what a given student spends for breakfast and lunch. For this scatter plot, $ breakfast is our input, independent, or explanatory variable, whereas $ lunch is our output, dependent, or response variable.
4.2 – Developing a simple linear model
The estimated regression line is Ŷ = b0 + b1X, where b0 and b1 are the estimated values of the intercept and the slope, chosen so that the errors are at a minimum. Note that errors still exist and can be computed as E = (actual value Y) – (predicted value Ŷ).
Here, the least-squares estimates are:
b1 = Σ(X – X̄)(Y – Ȳ) / Σ(X – X̄)²
b0 = Ȳ – b1X̄
The best way to do this is to develop a table (you can use Excel):
X     Y     (X–X̄)²           (X–X̄)(Y–Ȳ)
5     12    (5–8)²  =  9     (5–8)(12–7)  = –15
6     11    (6–8)²  =  4     (6–8)(11–7)  =  –8
7     9     (7–8)²  =  1     (7–8)(9–7)   =  –2
7     8     (7–8)²  =  1     (7–8)(8–7)   =  –1
9     4     (9–8)²  =  1     (9–8)(4–7)   =  –3
10    3     (10–8)² =  4     (10–8)(3–7)  =  –8
12    2     (12–8)² = 16     (12–8)(2–7)  = –20
Sum:  ΣX = 56, ΣY = 49, Σ(X–X̄)² = 36, Σ(X–X̄)(Y–Ȳ) = –57
X̄ = 56/7 = 8
Ȳ = 49/7 = 7
b1 = –57/36 = –1.58
b0 = 7 – (–1.58)(8) = 19.67
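The table computation above can be sketched in a few lines of plain Python (the data and column sums are the ones from the table; no library is needed):

```python
# Least-squares slope and intercept for the cafeteria example.
xs = [5, 6, 7, 7, 9, 10, 12]   # $ spent on breakfast (X)
ys = [12, 11, 9, 8, 4, 3, 2]   # $ spent on lunch (Y)

n = len(xs)
x_bar = sum(xs) / n            # 56/7 = 8
y_bar = sum(ys) / n            # 49/7 = 7

sxx = sum((x - x_bar) ** 2 for x in xs)                       # Σ(X-X̄)² = 36
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # Σ(X-X̄)(Y-Ȳ) = -57

b1 = sxy / sxx           # slope: -57/36 ≈ -1.58
b0 = y_bar - b1 * x_bar  # intercept: 7 - (-1.58)(8) ≈ 19.67

print(round(b1, 2), round(b0, 2))  # -1.58 19.67
```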
To know for a fact that the model developed is good enough to be used for prediction, we must start by computing the coefficient of determination (R²) and the coefficient of correlation (r).
The coefficient of determination (represented by R²) gives the proportion of the variation in the dependent variable (Y) that is predictable from the regression with the independent variable (X).
Interpretation: About 94% of the variation in Y (money spent on lunch) can be explained by the regression with X (money spent on breakfast). The remaining 6% is due to other factors (that is, to error).
Coefficient of correlation
The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables.
r = ±√R². Important: r has the same sign as the slope (b1) of the regression line.
The value of r is such that –1 ≤ r ≤ +1. The + and – signs are used for positive linear correlations and negative linear correlations, respectively.
Strength scale: |r| above 0.7 is strong, between 0.5 and 0.7 is moderate, and below 0.5 is weak; the sign gives the direction (negative toward –1, positive toward +1).
In this case, r = –√0.936 = –0.967.
This is negative because the slope of the line of regression is also negative.
Interpretation: there is a strong negative linear relationship between the amount of money spent on breakfast and
the amount of money spent on lunch.
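A minimal sketch of the R² and r computation, using R² = 1 – SSE/SST with the exact (unrounded) slope and intercept. Note the exact arithmetic gives R² ≈ 0.940, slightly higher than the 0.936 above, which comes from rounding b1 to –1.58 first; the conclusion is the same either way:

```python
import math

xs = [5, 6, 7, 7, 9, 10, 12]
ys = [12, 11, 9, 8, 4, 3, 2]
b1 = -57 / 36                  # exact slope from the table sums
b0 = 7 - b1 * 8                # exact intercept

y_bar = sum(ys) / len(ys)
sst = sum((y - y_bar) ** 2 for y in ys)                       # total variation
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))   # unexplained variation

r2 = 1 - sse / sst     # coefficient of determination, ≈ 0.94
r = -math.sqrt(r2)     # negative because the slope b1 is negative

print(round(r2, 2), round(r, 2))  # 0.94 -0.97
```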
We stated earlier that the linear regression model comes with errors because we are not dealing with a perfectly aligned set of points. In other terms, the SSE is not always equal to 0, and R² is not always 100%. Therefore, we have to make some assumptions about the errors in the regression model so that we can test it for significance. We must make the following assumptions about the errors: they are independent, they are normally distributed, they have a mean of zero, and they have constant variance.
When the assumptions are met, a plot of the errors against the independent variable should appear to be random.
In our example, we are going to plot X against the residual (Y – Ŷ) and check for randomness.
X     Y     Ŷ = 19.67 – 1.58X    Residual (Y – Ŷ)
5     12    11.77                 0.23
6     11    10.19                 0.81
7     9     8.61                  0.39
7     8     8.61                 –0.61
9     4     5.45                 –1.45
10    3     3.87                 –0.87
12    2     0.71                  1.29
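The residual table can be reproduced with a short loop, using the rounded coefficients exactly as in the table:

```python
xs = [5, 6, 7, 7, 9, 10, 12]
ys = [12, 11, 9, 8, 4, 3, 2]
b0, b1 = 19.67, -1.58   # rounded coefficients, as in the table above

preds = [b0 + b1 * x for x in xs]               # Ŷ for each X
residuals = [y - p for y, p in zip(ys, preds)]  # Y - Ŷ

for x, y, p, e in zip(xs, ys, preds, residuals):
    print(x, y, round(p, 2), round(e, 2))
```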
Using Excel:
[Residual plot: Residual (Y – Ŷ), from –2 to 1.5 on the vertical axis, plotted against X, from 4 to 13 on the horizontal axis. The points show no obvious pattern.]
We can see that the scatter plot appears to be random. You can use Figures 4.4A, 4.4B, and 4.4C on page 118 to check for the likelihood of randomness; we want the residual plot to look like Figure 4.4A.
The next step is to estimate the variance.
While the errors are assumed to have constant variance (σ²), that variance can only be estimated once a sample is collected. The mean squared error (MSE, or s²) is a good estimate of the population variance σ²:
s² = MSE = SSE/(n – k – 1), where n is the number of observations (pairs of points) and k is the number of independent variables.
From the sample variance, we can estimate the standard deviation by taking the square root of s².
Here, s = √1.15 = 1.07. This is also called the standard error of the estimate, or the standard deviation of the regression.
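The MSE and standard error computation can be sketched as follows (same data and rounded coefficients as the residual table):

```python
import math

xs = [5, 6, 7, 7, 9, 10, 12]
ys = [12, 11, 9, 8, 4, 3, 2]
b0, b1 = 19.67, -1.58
n, k = len(xs), 1       # 7 observations, 1 independent variable

# Sum of squared errors from the fitted line.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

mse = sse / (n - k - 1)  # s², estimate of the error variance σ²
s = math.sqrt(mse)       # standard error of the estimate

print(round(mse, 2), round(s, 2))  # 1.15 1.07
```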
1. Determine the null hypothesis (H0) and the alternative hypothesis (H1). This is always H0: β1 = 0 (there is no linear relationship) versus H1: β1 ≠ 0 (the model is significant).
2. Select the level of significance α and find the critical value of F on the F table:
Locate df1, the degrees of freedom of the numerator (entry column on the F table). df1 is the number of independent variables, k.
Locate df2, the degrees of freedom of the denominator (entry row on the F table). The value of df2 is given by n – k – 1 (sample size – number of independent variables – 1).
The critical value of F, or F-critical(df1, df2), is the number located at the junction of the identified entry column (df1) and entry row (df2).
3. Compute the calculated value of F. For our course, we will read that value on the regression summary output.
4. Reject H0 if F-calculated is greater than F-critical (on the F table), and interpret the finding.
For our example:
Step 1: H0: β1 = 0; H1: β1 ≠ 0.
Step 2: α = 0.05, df1 = k = 1, df2 = n – k – 1 = 7 – 1 – 1 = 5.
Step 3: From the regression summary output, MSE = 1.15 and F-calculated = 78.1536.
Step 4: Decision: reject H0 if the test statistic is greater than F-critical (from the F table). We go to the F table in Appendix D, look for α = 0.05 (the first F distribution table), and go to the first column and fifth row: F0.05, 1, 5 = 6.61.
Here, since F-calculated of 78.1536 is greater than F-critical of 6.61, we are going to reject H0. Therefore, the regression model is significant. That is, predictions generated by the linear model ŷ = 19.67 – 1.58X will be reliable.
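The F test can be sketched as below. The F-critical value 6.61 is the table value quoted above; computing F from the rounded coefficients gives roughly 78.4, close to (but not exactly) the 78.1536 read from the regression output, because of rounding along the way:

```python
xs = [5, 6, 7, 7, 9, 10, 12]
ys = [12, 11, 9, 8, 4, 3, 2]
b0, b1 = 19.67, -1.58
n, k = len(xs), 1

y_bar = sum(ys) / n
sst = sum((y - y_bar) ** 2 for y in ys)                       # total variation
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))   # error variation
ssr = sst - sse                                               # explained variation

msr = ssr / k            # mean square due to regression
mse = sse / (n - k - 1)  # mean squared error
f_calc = msr / mse       # F test statistic, ≈ 78

f_critical = 6.61        # F(0.05; df1=1, df2=5) from the F table
print(f_calc > f_critical)  # True -> reject H0: the model is significant
```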
Try this
Additional example. Given the following pairs of points:
X:  4   5   6   8   10
Y:  13  16  8   3   2
a. Draw a scatter diagram and speculate on the linear relationship between X and Y.
b. Find the equation of the regression line.
c. Compute the coefficient of determination and explain what it means.
d. Find the coefficient of correlation and determine the strength of the relationship between X and Y.
e. Is the linear relationship significant? Use α = 0.05 to test this hypothesis for significance.