Ec410 Lecture 4 - Simple Regression II

• Write your discussion section number at the top of your problem set!
Outline

• Three Properties of OLS
• Goodness of Fit

Warm-up question: Suppose we estimate our OLS regression line, then calculate the average of the residuals. What would we get? Why?
Properties of OLS

• Recall – we derived the OLS estimators so that they minimized the sum of squared residuals. When we did that, we got these two first-order conditions (FOCs):

$$\sum_{i=1}^{n}\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n}\hat u_i = 0$$

$$\sum_{i=1}^{n} x_i\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n} x_i\hat u_i = 0$$

(using the definition of the residual, $\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$)

• If these conditions aren't satisfied, then the line can't possibly be the OLS regression line!

• The first of these equations assures us that the sum of the residuals from the estimated OLS regression line must be zero. How shall we interpret this?

• Dividing both sides by n shows that the average of the residuals is zero as well:

$$\bar{\hat u} = \frac{1}{n}\sum_{i=1}^{n}\hat u_i = 0$$
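As a quick numerical illustration of the first property (a minimal sketch with simulated data, not taken from the lecture; the variable names are illustrative):

```python
import numpy as np

# Simulated data, for illustration only
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

# Closed-form OLS estimates for the simple regression y = b0 + b1*x + u
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

u_hat = y - (b0_hat + b1_hat * x)  # residuals
print(u_hat.sum())   # ~0 up to floating-point error (first FOC)
print(u_hat.mean())  # ~0: the average residual is zero
```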
• So from these two first-order conditions, we've established that:

$$\sum_{i=1}^{n}\hat u_i = 0 \quad\Longrightarrow\quad \bar{\hat u} = 0$$

$$\sum_{i=1}^{n} x_i\hat u_i = 0 \quad\Longrightarrow\quad \widehat{\mathrm{cov}}(x,\hat u) = 0$$

• To see the second implication, consider the sample covariance between x and the residuals. Since $\bar{\hat u} = 0$ by the first FOC:

$$\widehat{\mathrm{cov}}(x,\hat u) = n^{-1}\sum_{i=1}^{n}(x_i - \bar x)(\hat u_i - \bar{\hat u}) = n^{-1}\sum_{i=1}^{n}(x_i - \bar x)\hat u_i$$

$$= n^{-1}\Big[\sum_{i=1}^{n} x_i\hat u_i - \bar x\sum_{i=1}^{n}\hat u_i\Big] = 0$$

• Both sums in the brackets are zero by the FOCs, so the sample covariance between the regressor and the residuals is exactly zero.

• Let's get some graphical intuition for these two properties…
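Before the graphs, the second property can be checked numerically too (same simulated sketch, repeated so it runs on its own):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
u_hat = y - (b0_hat + b1_hat * x)

print(np.sum(x * u_hat))                # ~0 (second FOC)
print(np.mean((x - x.mean()) * u_hat))  # sample cov(x, u_hat), also ~0
```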
Properties of OLS: the first property, graphically

[Figure: two scatterplots of y against x with the estimated OLS regression line; below each, a plot of the average residuals, which are centered on zero.]

• Recall the first FOC and its implication:

$$\sum_{i=1}^{n}\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n}\hat u_i = 0 \quad\Longleftrightarrow\quad \bar{\hat u} = 0$$

• So the interpretation of this first property is that the regression line shouldn't sit too high or too low – but where exactly will it sit?

• Obviously one answer is: wherever the average residual is zero – but can we say more?

• Consider our equation for the estimated regression line:

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$$

• Last lecture we showed that the OLS estimates also satisfy:

$$\bar y = \hat\beta_0 + \hat\beta_1\bar x$$

[Figure: the same scatterplots with the point (x̄, ȳ) marked; the OLS line passes through it.]

• In other words, at the average value of x, a regression line estimated via OLS will always predict the average value of y: the line must go through the point (x̄, ȳ). (A quick numerical check of this follows below.)
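A quick check of the claim that the line passes through (x̄, ȳ) (same simulated sketch as above):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

# The fitted line evaluated at the average x returns the average y
print(b0_hat + b1_hat * x.mean())  # equals y.mean() up to rounding
print(y.mean())
```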
Properties of OLS: the second property, graphically

• Recall the two FOCs and their implications:

$$\sum_{i=1}^{n}\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n}\hat u_i = 0 \quad\Longleftrightarrow\quad \bar{\hat u} = 0$$

$$\sum_{i=1}^{n} x_i\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n} x_i\hat u_i = 0 \quad\Longleftrightarrow\quad \widehat{\mathrm{cov}}(x,\hat u) = 0$$

• So the interpretation of this second property is that the regression line shouldn't be too steep or too shallow.

• How do we achieve this? By making sure x is uncorrelated with the residuals! The second FOC ensures that the slope results in zero covariance between x and the residuals.

[Figure: scatterplot of y against x with several candidate lines of different slopes through the point (x̄, ȳ).]
Goodness of Fit

• Our next goal: we'd like to be able to measure how well our model fits the data.

• How do we define "well"? Is this a useful criterion and, if so, what is it useful for?

• Let's start by considering two regression lines…

[Figure: two scatterplots of y against x, each with a fitted regression line.]
• To measure this, we need to define a few new statistics.

• The first: the Total Sum of Squares (SST)

$$SST = \sum_{i=1}^{n}(y_i - \bar y)^2 = SST_y$$

Sometimes it's useful to specify which variable we've measured SST for; when unspecified, the default is always the dependent variable of the regression.

• Does this equation look familiar?

• The second: the Explained Sum of Squares (SSE)

$$SSE = \sum_{i=1}^{n}(\hat y_i - \bar y)^2$$

• It's a measure of how much of the variability in y is explained by the regressor.
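• (One possible answer to the question above, not spelled out on this slide: dividing SST by $n-1$ gives the familiar sample variance of y, $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar y)^2$, so SST is simply an unscaled measure of the total variation in the dependent variable.)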
• The third: the Residual Sum of Squares (SSR)

$$SSR = \sum_{i=1}^{n}\hat u_i^2$$

• It's a measure of how much of the variability in y is not explained by the regressor.

• Given these definitions, it's probably intuitive that the total variation should equal the explained variation plus the residual (aka unexplained) variation:

$$\underbrace{SST}_{\text{Total}} = \underbrace{SSE}_{\text{Explained}} + \underbrace{SSR}_{\text{Residual}}$$

• And we can show that this is indeed the case:

$$SST = \sum_{i=1}^{n}(y_i - \bar y)^2 = \sum_{i=1}^{n}\big[(y_i - \hat y_i) + (\hat y_i - \bar y)\big]^2 = \sum_{i=1}^{n}\big[\hat u_i + (\hat y_i - \bar y)\big]^2$$

$$= \underbrace{\sum_{i=1}^{n}\hat u_i^2}_{SSR} + \underbrace{2\sum_{i=1}^{n}\hat u_i(\hat y_i - \bar y)}_{=0?} + \underbrace{\sum_{i=1}^{n}(\hat y_i - \bar y)^2}_{SSE}$$

(A numerical check of the decomposition follows below.)
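Here is that numerical check (same simulated sketch as before):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
y_hat = b0_hat + b1_hat * x
u_hat = y - y_hat

SST = np.sum((y - y.mean()) ** 2)      # total variation
SSE = np.sum((y_hat - y.mean()) ** 2)  # explained variation
SSR = np.sum(u_hat ** 2)               # residual variation

print(SST, SSE + SSR)  # equal up to floating-point error
```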
• So we are left wanting to show that the middle (cross) term is equal to zero. Substituting $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ gives:

$$2\sum_{i=1}^{n}\hat u_i(\hat y_i - \bar y) = 2\hat\beta_0\sum_{i=1}^{n}\hat u_i + 2\hat\beta_1\sum_{i=1}^{n}\hat u_i x_i - 2\bar y\sum_{i=1}^{n}\hat u_i$$

• What are these terms equal to? All three are zero: the first and third because $\sum_i\hat u_i = 0$ (the first FOC), and the second because $\sum_i x_i\hat u_i = 0$ (the second FOC). So the cross term vanishes and SST = SSE + SSR.

R-squared

• With this in hand, we can now propose a possible measure for goodness of fit:

$$R^2 = \frac{SSE}{SST} = \frac{SST - SSR}{SST} = 1 - \frac{SSR}{SST} = 1 - \frac{\text{Residual Sum of Squares}}{\text{Total Sum of Squares}}$$

• Do you know why we call this statistic R-squared?
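A sketch computing R² both ways on the simulated data, plus one hint about the name that this slide only teases (a standard result, stated here as an aside): in a simple regression with an intercept, R² equals the square of the sample correlation coefficient r between x and y.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
y_hat = b0_hat + b1_hat * x

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum((y_hat - y.mean()) ** 2)
SSR = np.sum((y - y_hat) ** 2)

print(SSE / SST)                     # R-squared, first form
print(1 - SSR / SST)                 # R-squared, second form (same value)
print(np.corrcoef(x, y)[0, 1] ** 2)  # squared sample correlation r^2, also the same
```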
Where does R-squared get its name?

• To get some intuition, consider an extreme case: imagine we are regressing y on a variable x that tells us nothing at all about y.

[Figure: scatterplot of pure noise with a flat estimated regression line.]

• What will our estimate of the slope be? And what will our estimate of the intercept be?

• Recall that:

$$SSE = \sum_{i=1}^{n}(\hat y_i - \bar y)^2$$

• So what will R² be?

$$R^2 = \frac{SSE}{SST}$$

• Now consider another case: imagine there is a perfect relationship between y and x.

[Figure: scatterplot of points lying exactly on the estimated regression line.]

• What will the residuals be?

• Recall that:

$$SSR = \sum_{i=1}^{n}\hat u_i^2$$

• So what will R² be?

$$R^2 = 1 - \frac{SSR}{SST}$$

(A numerical sketch of both cases follows below.)
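That numerical sketch (simulated data; the helper function is illustrative):

```python
import numpy as np

def r_squared(x, y):
    """R^2 = 1 - SSR/SST from a simple OLS regression of y on x."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    u = y - (b0 + b1 * x)
    return 1 - np.sum(u ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=1000)

y_noise = rng.normal(0, 1, size=1000)  # x tells us nothing about y
y_exact = 2 + 3 * x                    # a perfect linear relationship

print(r_squared(x, y_noise))  # ~0: slope ~0, intercept ~ybar, so SSE ~0
print(r_squared(x, y_exact))  # 1 (up to floating-point error): all residuals ~0
```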
• These are the most extreme possible cases, so: $0 \le R^2 \le 1$

• To interpret R², we multiply by 100 and treat it as a percentage.

• Example: If R² = 0.37, we would say that 37 percent of the sample variation in y has been explained by x.

• Your interpretation should be context-specific, so instead of "y" and "x" you should be clear about what's actually on the left- and right-hand side of the regression model.

• Does a higher R² mean the regression is more important? Sometimes you'll come across researchers who treat R² this way, so why do we avoid it? Easiest to illustrate with a couple of examples…

[Figure: two panels – Project 1: Celsius plotted against Fahrenheit; Project 2: Life Expectancy plotted against Dose. Is the project with the higher R² also more important?]
Example: Voting

[Figure: regression output – the vote share going to a candidate regressed on the share of money spent by that candidate.]

• What is the R-squared? And how do we interpret it for this example?

• 85.61% of the variation in the vote share going to a candidate can be explained by the share of money spent by that candidate.