Ec410 Lecture 4 - Simple Regression II

• Write your discussion section number at the top of your problem set!
Outline

• Three Properties of OLS
• Goodness of Fit

Warm-up question: Suppose we estimate our OLS regression line, then calculate the average of the residuals. What would we get? Why?
Properties of OLS

• Recall – we derived the OLS estimators so that they minimized the sum of squared residuals. When we did that, we got these two first-order conditions (FOCs):

$$\sum_{i=1}^{n}\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n}\hat u_i = 0$$

$$\sum_{i=1}^{n} x_i\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n} x_i\hat u_i = 0$$

(using the definition of the residual, $\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$)

• If these conditions aren't satisfied, then the line can't possibly be the OLS regression line!

• The first of these equations assures us that the sum of the residuals from the estimated OLS regression line must be zero. How shall we interpret this?

• Dividing both sides by n shows that the average of the residuals is zero as well:

$$\bar{\hat u} = \frac{1}{n}\sum_{i=1}^{n}\hat u_i = 0$$
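As a quick numerical illustration of the first property (a minimal sketch with simulated data, not taken from the lecture; the variable names are illustrative):

```python
import numpy as np

# Simulated data, for illustration only
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

# Closed-form OLS estimates for the simple regression y = b0 + b1*x + u
b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

u_hat = y - (b0_hat + b1_hat * x)  # residuals
print(u_hat.sum())   # ~0 up to floating-point error (first FOC)
print(u_hat.mean())  # ~0: the average residual is zero
```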
• So from these two first-order conditions, we've established that:

$$\sum_{i=1}^{n}\hat u_i = 0 \quad\Longrightarrow\quad \bar{\hat u} = 0$$

$$\sum_{i=1}^{n} x_i\hat u_i = 0 \quad\Longrightarrow\quad \widehat{\mathrm{cov}}(x,\hat u) = 0$$

• To see the second implication, consider the sample covariance between x and the residuals. Since $\bar{\hat u} = 0$ by the first FOC:

$$\widehat{\mathrm{cov}}(x,\hat u) = n^{-1}\sum_{i=1}^{n}(x_i - \bar x)(\hat u_i - \bar{\hat u}) = n^{-1}\sum_{i=1}^{n}(x_i - \bar x)\hat u_i$$

$$= n^{-1}\Big[\sum_{i=1}^{n} x_i\hat u_i - \bar x\sum_{i=1}^{n}\hat u_i\Big] = 0$$

• Both sums in the brackets are zero by the FOCs, so the sample covariance between the regressor and the residuals is exactly zero.

• Let's get some graphical intuition for these two properties…
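Before the graphs, the second property can be checked numerically too (same simulated sketch, repeated so it runs on its own):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
u_hat = y - (b0_hat + b1_hat * x)

print(np.sum(x * u_hat))                # ~0 (second FOC)
print(np.mean((x - x.mean()) * u_hat))  # sample cov(x, u_hat), also ~0
```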
Properties of OLS: the first property, graphically

[Figure: two scatterplots of y against x with the estimated OLS regression line; below each, a plot of the average residuals, which are centered on zero.]

• Recall the first FOC and its implication:

$$\sum_{i=1}^{n}\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n}\hat u_i = 0 \quad\Longleftrightarrow\quad \bar{\hat u} = 0$$

• So the interpretation of this first property is that the regression line shouldn't sit too high or too low – but where exactly will it sit?

• Obviously one answer is: wherever the average residual is zero – but can we say more?

• Consider our equation for the estimated regression line:

$$\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$$

• Last lecture we showed that the OLS estimates also satisfy:

$$\bar y = \hat\beta_0 + \hat\beta_1\bar x$$

[Figure: the same scatterplots with the point (x̄, ȳ) marked; the OLS line passes through it.]

• In other words, at the average value of x, a regression line estimated via OLS will always predict the average value of y: the line must go through the point (x̄, ȳ). (A quick numerical check of this follows below.)
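A quick check of the claim that the line passes through (x̄, ȳ) (same simulated sketch as above):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

# The fitted line evaluated at the average x returns the average y
print(b0_hat + b1_hat * x.mean())  # equals y.mean() up to rounding
print(y.mean())
```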
Properties of OLS: the second property, graphically

• Recall the two FOCs and their implications:

$$\sum_{i=1}^{n}\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n}\hat u_i = 0 \quad\Longleftrightarrow\quad \bar{\hat u} = 0$$

$$\sum_{i=1}^{n} x_i\big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \quad\Longleftrightarrow\quad \sum_{i=1}^{n} x_i\hat u_i = 0 \quad\Longleftrightarrow\quad \widehat{\mathrm{cov}}(x,\hat u) = 0$$

• So the interpretation of this second property is that the regression line shouldn't be too steep or too shallow.

• How do we achieve this? By making sure x is uncorrelated with the residuals! The second FOC ensures that the slope results in zero covariance between x and the residuals.

[Figure: scatterplot of y against x with several candidate lines of different slopes through the point (x̄, ȳ).]
Goodness of Fit

• Our next goal: we'd like to be able to measure how well our model fits the data.

• How do we define "well"? Is this a useful criterion and, if so, what is it useful for?

• Let's start by considering two regression lines…

[Figure: two scatterplots of y against x, each with a fitted regression line.]
• To measure this, we need to define a few new statistics.

• The first: the Total Sum of Squares (SST)

$$SST = \sum_{i=1}^{n}(y_i - \bar y)^2 = SST_y$$

Sometimes it's useful to specify which variable we've measured SST for; when unspecified, the default is always the dependent variable of the regression.

• Does this equation look familiar?

• The second: the Explained Sum of Squares (SSE)

$$SSE = \sum_{i=1}^{n}(\hat y_i - \bar y)^2$$

• It's a measure of how much of the variability in y is explained by the regressor.
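• (One possible answer to the question above, not spelled out on this slide: dividing SST by $n-1$ gives the familiar sample variance of y, $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar y)^2$, so SST is simply an unscaled measure of the total variation in the dependent variable.)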
• The third: the Residual Sum of Squares (SSR)

$$SSR = \sum_{i=1}^{n}\hat u_i^2$$

• It's a measure of how much of the variability in y is not explained by the regressor.

• Given these definitions, it's probably intuitive that the total variation should equal the explained variation plus the residual (aka unexplained) variation:

$$\underbrace{SST}_{\text{Total}} = \underbrace{SSE}_{\text{Explained}} + \underbrace{SSR}_{\text{Residual}}$$

• And we can show that this is indeed the case:

$$SST = \sum_{i=1}^{n}(y_i - \bar y)^2 = \sum_{i=1}^{n}\big[(y_i - \hat y_i) + (\hat y_i - \bar y)\big]^2 = \sum_{i=1}^{n}\big[\hat u_i + (\hat y_i - \bar y)\big]^2$$

$$= \underbrace{\sum_{i=1}^{n}\hat u_i^2}_{SSR} + \underbrace{2\sum_{i=1}^{n}\hat u_i(\hat y_i - \bar y)}_{=0?} + \underbrace{\sum_{i=1}^{n}(\hat y_i - \bar y)^2}_{SSE}$$

(A numerical check of the decomposition follows below.)
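Here is that numerical check (same simulated sketch as before):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
y_hat = b0_hat + b1_hat * x
u_hat = y - y_hat

SST = np.sum((y - y.mean()) ** 2)      # total variation
SSE = np.sum((y_hat - y.mean()) ** 2)  # explained variation
SSR = np.sum(u_hat ** 2)               # residual variation

print(SST, SSE + SSR)  # equal up to floating-point error
```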
• So we are left wanting to show that the middle (cross) term is equal to zero. Substituting $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ gives:

$$2\sum_{i=1}^{n}\hat u_i(\hat y_i - \bar y) = 2\hat\beta_0\sum_{i=1}^{n}\hat u_i + 2\hat\beta_1\sum_{i=1}^{n}\hat u_i x_i - 2\bar y\sum_{i=1}^{n}\hat u_i$$

• What are these terms equal to? All three are zero: the first and third because $\sum_i\hat u_i = 0$ (the first FOC), and the second because $\sum_i x_i\hat u_i = 0$ (the second FOC). So the cross term vanishes and SST = SSE + SSR.

R-squared

• With this in hand, we can now propose a possible measure for goodness of fit:

$$R^2 = \frac{SSE}{SST} = \frac{SST - SSR}{SST} = 1 - \frac{SSR}{SST} = 1 - \frac{\text{Residual Sum of Squares}}{\text{Total Sum of Squares}}$$

• Do you know why we call this statistic R-squared?
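A sketch computing R² both ways on the simulated data, plus one hint about the name that this slide only teases (a standard result, stated here as an aside): in a simple regression with an intercept, R² equals the square of the sample correlation coefficient r between x and y.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=100)
y = 5 + 2 * x + rng.normal(0, 0.5, size=100)

b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0_hat = y.mean() - b1_hat * x.mean()
y_hat = b0_hat + b1_hat * x

SST = np.sum((y - y.mean()) ** 2)
SSE = np.sum((y_hat - y.mean()) ** 2)
SSR = np.sum((y - y_hat) ** 2)

print(SSE / SST)                     # R-squared, first form
print(1 - SSR / SST)                 # R-squared, second form (same value)
print(np.corrcoef(x, y)[0, 1] ** 2)  # squared sample correlation r^2, also the same
```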
Where does R-squared get its name?

• To get some intuition, consider an extreme case: imagine we are regressing y on a variable x that tells us nothing at all about y.

[Figure: scatterplot of pure noise with a flat estimated regression line.]

• What will our estimate of the slope be? And what will our estimate of the intercept be?

• Recall that:

$$SSE = \sum_{i=1}^{n}(\hat y_i - \bar y)^2$$

• So what will R² be?

$$R^2 = \frac{SSE}{SST}$$

• Now consider another case: imagine there is a perfect relationship between y and x.

[Figure: scatterplot of points lying exactly on the estimated regression line.]

• What will the residuals be?

• Recall that:

$$SSR = \sum_{i=1}^{n}\hat u_i^2$$

• So what will R² be?

$$R^2 = 1 - \frac{SSR}{SST}$$

(A numerical sketch of both cases follows below.)
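That numerical sketch (simulated data; the helper function is illustrative):

```python
import numpy as np

def r_squared(x, y):
    """R^2 = 1 - SSR/SST from a simple OLS regression of y on x."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    u = y - (b0 + b1 * x)
    return 1 - np.sum(u ** 2) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=1000)

y_noise = rng.normal(0, 1, size=1000)  # x tells us nothing about y
y_exact = 2 + 3 * x                    # a perfect linear relationship

print(r_squared(x, y_noise))  # ~0: slope ~0, intercept ~ybar, so SSE ~0
print(r_squared(x, y_exact))  # 1 (up to floating-point error): all residuals ~0
```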
• These are the most extreme possible cases, so: $0 \le R^2 \le 1$

• To interpret R², we multiply by 100 and treat it as a percentage.

• Example: If R² = 0.37, we would say that 37 percent of the sample variation in y has been explained by x.

• Your interpretation should be context-specific, so instead of "y" and "x" you should be clear about what's actually on the left- and right-hand side of the regression model.

• Does a higher R² mean the regression is more important? Sometimes you'll come across researchers who treat R² this way, so why do we avoid it? Easiest to illustrate with a couple of examples…

[Figure: two panels – Project 1: Celsius plotted against Fahrenheit; Project 2: Life Expectancy plotted against Dose. Is the project with the higher R² also more important?]
Example: Voting

[Figure: regression output – the vote share going to a candidate regressed on the share of money spent by that candidate.]

• What is the R-squared? And how do we interpret it for this example?

• 85.61% of the variation in the vote share going to a candidate can be explained by the share of money spent by that candidate.