Lecture 3

This document discusses measuring the goodness of fit of a regression line. It introduces the coefficient of determination (R-squared) as a measure of how well the regression line fits the data. R-squared compares the regression line to the mean line, and indicates what proportion of the variation in the dependent variable is explained by the independent variable. The document explains how to calculate R-squared and provides an example using data on mean consumption and income. It finds an R-squared value of 0.98, suggesting income explains 98% of the variation in consumption. The document also introduces the F-test as a way to test if the R-squared value is statistically significant.

Uploaded by

Watani Bidami

Introductory Econometrics for Finance

3. Measuring the goodness of fit of the regression line

Siraj M. Sep, 2023

Goodness of fit

• Having calculated the regression line, we now ask whether it provides a good fit for the data.
• Do the observations tend to lie close to, or far away from, the line?
• Even though we have fitted a regression line, by itself this tells us nothing about the closeness of the fit.
• If the fit is poor, perhaps the effect of X upon Y is not so strong after all.
• Even when X has little effect on Y, the estimated slope b is likely to be small but unlikely to be exactly zero.
• Measuring the goodness of fit of the data to the line helps us to distinguish between good and bad regressions.
• Generally, there will be some positive residuals $\hat{\mu}_i$ and some negative ones.
• What we hope for is that these residuals around the regression line are as small as possible.

• The coefficient of determination is a summary measure that tells how well the sample regression line fits the data.

• The goodness of fit is calculated by comparing two lines: the regression line and the 'mean line' (i.e. a horizontal line drawn at the mean value of Y).
• The regression line must fit the data better (if the mean line were the best fit, that is also where the regression line would be), but the question is: how much better?

Goodness of fit: Calculation of R squared

This is illustrated in Figure 3.1, which demonstrates the principle behind the calculation of the coefficient of determination, denoted by $R^2$ and usually more simply referred to as 'R squared'.

• The figure shows the mean value of Y, the calculated sample regression line and an arbitrarily chosen sample observation $(X_i, Y_i)$.
• The difference between $Y_i$ and $\bar{Y}$ (length $Y_i - \bar{Y}$) can be divided up into:
  • That part 'explained' by the regression line, $\hat{Y}_i - \bar{Y}$ (i.e. explained by the value of $X_i$).
  • The error term $Y_i - \hat{Y}_i$.
• In algebraic terms, $Y_i - \bar{Y} = (Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})$ (3.1)
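
As a numerical check on equation (3.1), the sketch below fits a least-squares line to a small made-up data set and splits one observation's deviation from the mean into its explained and error parts. The data and the helper name `fit_line` are assumptions for illustration only; they are not the Lecture 1 consumption data.

```python
# Hypothetical illustrative data; NOT the consumption data from Lecture 1.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(x, y):
    """Least-squares intercept and slope for a one-regressor model."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(X, Y)
y_bar = sum(Y) / len(Y)

i = 0  # an arbitrarily chosen observation (X_i, Y_i)
y_hat_i = intercept + slope * X[i]

total = Y[i] - y_bar          # Y_i - Y-bar
explained = y_hat_i - y_bar   # Y-hat_i - Y-bar
error = Y[i] - y_hat_i        # Y_i - Y-hat_i

print(f"total={total:.2f}, explained={explained:.2f}, error={error:.2f}")
```

For this made-up data the fitted line is $\hat{Y}_i = 2.2 + 0.6 X_i$, and the first observation's deviation of -2.0 splits into an explained part of -1.2 and an error of -0.8.
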
• A good regression model should 'explain' a large part of the differences between the $Y_i$ values and $\bar{Y}$, i.e. the length $(\hat{Y}_i - \bar{Y})$ should be large relative to $Y_i - \hat{Y}_i$.
• A measure of fit could therefore be:
  $\frac{\hat{Y}_i - \bar{Y}}{Y_i - \bar{Y}}$

• We need to apply this to all observations rather than just a single one; hence we could sum this expression over all the sample observations.
• A problem with this is that some of the terms would take a negative value and offset the positive terms.
• To measure the goodness of fit, we do not want the positive and negative terms to cancel each other out.
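
The cancellation problem can be seen numerically. The sketch below uses a small made-up data set (an assumption, not the Lecture 1 data) and sums each of the three deviation terms over all observations without squaring. For a least-squares line with an intercept, each raw sum comes out at zero up to rounding, so it says nothing about fit; the squared errors do not cancel.

```python
# Hypothetical illustrative data; NOT the consumption data from Lecture 1.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
         / sum((x - x_bar) ** 2 for x in X))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in X]

# Summing the raw terms: positives and negatives offset each other.
sum_total = sum(y - y_bar for y in Y)
sum_explained = sum(f - y_bar for f in fitted)
sum_error = sum(y - f for y, f in zip(Y, fitted))

# Squaring first removes the cancellation.
sum_sq_error = sum((y - f) ** 2 for y, f in zip(Y, fitted))

print("raw sums:", sum_total, sum_explained, sum_error)
print("sum of squared errors:", sum_sq_error)
```
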
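• Hence, to get round this problem, we square each of the terms in equation (3.1) to make them all non-negative, and then sum over the observations. This gives three sums of squares:
  $TSS = \sum_i (Y_i - \bar{Y})^2$ (total sum of squares)
  $RSS = \sum_i (\hat{Y}_i - \bar{Y})^2$ (regression sum of squares)
  $ESS = \sum_i (Y_i - \hat{Y}_i)^2$ (error sum of squares)
  with $TSS = RSS + ESS$ for a least-squares line that includes an intercept.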
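
A minimal sketch of the squaring-and-summing step, assuming the usual names TSS (total), RSS (regression) and ESS (error) for the three sums of squares. The function name `sums_of_squares` and the data are hypothetical and chosen only for illustration.

```python
def sums_of_squares(x, y):
    """Return (TSS, RSS, ESS) for a least-squares line with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total
    rss = sum((fi - y_bar) ** 2 for fi in fitted)            # regression
    ess = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # error
    return tss, rss, ess


# Hypothetical illustrative data; NOT the consumption data from Lecture 1.
tss, rss, ess = sums_of_squares([1.0, 2.0, 3.0, 4.0, 5.0],
                                [2.0, 4.0, 5.0, 4.0, 5.0])
print(f"TSS={tss:.2f}  RSS={rss:.2f}  ESS={ess:.2f}")
```

With this made-up data the total of 6.0 splits into 3.6 explained by the line and 2.4 left in the errors.
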

• The measure of goodness of fit, $R^2$, is then defined as the ratio of the regression sum of squares to the total sum of squares, i.e.
  $R^2 = \frac{RSS}{TSS}$ (3.2)
• The better the divergences between $Y_i$ and $\bar{Y}$ are explained by the regression line, the better the goodness of fit, and the higher the calculated value of $R^2$.
• A value of $R^2 = 1$ (and hence ESS = 0) indicates that all the sample observations lie exactly on the regression line (equivalent to perfect correlation).
• If $R^2 = 0$, then the regression line is of no use at all: X does not influence Y (linearly) at all, and to predict a value of Y one might as well use the mean $\bar{Y}$ rather than the value $X_i$ inserted into the sample regression equation.
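
The two extreme cases can be reproduced with a short sketch of equation (3.2). The function name `r_squared` and all three data sets are assumptions made up for illustration; none of them is the Lecture 1 data.

```python
def r_squared(x, y):
    """R^2 = RSS / TSS for a least-squares line with an intercept."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    rss = sum((fi - y_bar) ** 2 for fi in fitted)
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return rss / tss  # not defined when every Y value is identical (TSS = 0)


# All points lie exactly on Y = 1 + 2X, so the fit is perfect.
perfect = r_squared([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])

# The fitted slope is zero here, so the line coincides with the mean line.
no_linear = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 1.0])

# An in-between case.
partial = r_squared([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])

print(perfect, no_linear, partial)
```
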
• We illustrate the econometric theory developed so far by considering the data given in Lecture 1, which relate mean consumption (Y) and income (X).
• Basic economic theory tells us that, among many variables, income is an important determinant of consumption.
• From the data given in Lecture 1, we obtain the estimated regression line as follows:
  $\hat{Y}_i = 124.316 + 0.6086 X_i$ (3.3)
• Geometrically, the estimated regression line is as shown in the following figure.

• As we know, each point on the regression line gives an estimate of the mean value of Y corresponding to the chosen X value.
• The value of $\hat{\beta}_2 = 0.6086$, which measures the slope of the line, shows that, within the sample range of X values, as X increases by 1 unit, the estimated increase in mean consumption is about 61 cents.
• That is, each additional unit of income, on average, increases personal consumption by about 61 cents.
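
The slope interpretation can be checked directly from equation (3.3). In the sketch below the two coefficients come from the lecture, while the function name `predicted_consumption` and the income level of 1500 are assumptions chosen only for illustration. The sample range of X is not stated here, so whether 1500 falls inside it is not known.

```python
# Coefficients taken from equation (3.3) in the lecture.
INTERCEPT = 124.316
SLOPE = 0.6086


def predicted_consumption(income):
    """Estimated mean consumption at a given income level."""
    return INTERCEPT + SLOPE * income


income = 1500.0  # hypothetical income level, for illustration only
base = predicted_consumption(income)
plus_one = predicted_consumption(income + 1.0)

print(f"estimated mean consumption at {income:.0f}: {base:.3f}")
print(f"change for one extra unit of income: {plus_one - base:.4f}")
```

Whatever starting income is used, the difference between the two predictions equals the slope, which is what "about 61 cents per additional unit of income" means.
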
• The $R^2$ value of about 0.98 suggests that income explains about 98 percent of the variation in personal consumption.
• Consumption varies around the overall mean value of 1087, and 98% of this variation is explained by variation in national income.
• This is quite a respectable figure to obtain, leaving only 2% of the variation in Y to be explained by other factors (or pure random variation).
• The regression seems to make a worthwhile contribution to explaining why consumption differs.
• However, it does not explain the mechanism by which higher income leads to higher consumption.

Testing the significance of $R^2$: the F test

• Another check of the quality of the regression equation is to test whether the $R^2$ value, calculated earlier, is significantly greater than zero.
• This is a test using the F distribution.
• The null hypothesis for the test is $H_0: R^2 = 0$, implying once again that X does not influence Y (hence equivalent to $\beta_2 = 0$).

• The test statistic is:
  $F = \frac{R^2 / 1}{(1 - R^2) / (n - 2)}$
  or equivalently
  $F = \frac{RSS / 1}{ESS / (n - 2)}$
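
The equivalence of the two forms follows from equation (3.2). A short derivation, assuming $TSS = RSS + ESS$ (which holds for a least-squares line with an intercept), so that $1 - R^2 = ESS / TSS$:

```latex
\begin{aligned}
F &= \frac{R^2 / 1}{(1 - R^2)/(n - 2)}
   = \frac{RSS / TSS}{\bigl(1 - RSS/TSS\bigr)/(n - 2)} \\
  &= \frac{RSS / TSS}{(ESS / TSS)/(n - 2)}
   = \frac{RSS / 1}{ESS/(n - 2)}
\end{aligned}
```

The common factor $1/TSS$ cancels between numerator and denominator, which is why either form can be used.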

• The F statistic is therefore the ratio of the regression sum of squares to the error sum of squares, each divided by its degrees of freedom (for the RSS there is one degree of freedom because of the one explanatory variable; for the ESS there are n - 2 degrees of freedom).
• A high value of the F statistic (i.e. RSS is large relative to ESS) leads to rejection of $H_0$ in favor of the alternative hypothesis, $H_1: R^2 > 0$.
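
A minimal sketch of both forms of the F statistic for the one-regressor case. The function names `f_from_r2` and `f_from_sums` and the numbers fed to them are hypothetical; they are not the lecture's consumption results, whose sample size is not stated here.

```python
def f_from_r2(r2, n):
    """F statistic from R^2 with 1 and n - 2 degrees of freedom."""
    if n <= 2 or not 0.0 <= r2 < 1.0:
        raise ValueError("need n > 2 and 0 <= R^2 < 1")
    return (r2 / 1) / ((1 - r2) / (n - 2))


def f_from_sums(rss, ess, n):
    """F statistic from the regression and error sums of squares."""
    if n <= 2 or ess <= 0:
        raise ValueError("need n > 2 and ESS > 0")
    return (rss / 1) / (ess / (n - 2))


# Hypothetical values for illustration only.
n = 5
rss, ess = 3.6, 2.4
r2 = rss / (rss + ess)

f_a = f_from_r2(r2, n)
f_b = f_from_sums(rss, ess, n)
print(f"F from R^2: {f_a:.3f}   F from sums of squares: {f_b:.3f}")
```

Both routes give the same value (4.5 for these made-up numbers). The decision itself still requires comparing the statistic with the critical value of the F distribution with 1 and n - 2 degrees of freedom at the chosen significance level, which this sketch does not compute.
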

• Evaluating the test for the consumption data used so far gives:

End of Lecture 3
