Lecture 4
Reference
Chapter 3
Wooldridge, Jeffrey M. (2013). Introductory
Econometrics: A Modern Approach, 4th
Edition, Cengage
Econ 443 1
Multiple Regression Analysis
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$$
1. Estimation
Parallels with Simple Regression
$\beta_0$ is still the intercept
$\beta_1$ to $\beta_k$ are all called slope parameters
$u$ is still the error term (or disturbance)
We still need to make a zero conditional mean assumption, so now assume that
$E(u \mid x_1, x_2, \ldots, x_k) = 0$
We are still minimizing the sum of squared residuals, so we have $k+1$ first-order conditions
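Written out, the $k+1$ first-order conditions from minimizing the sum of squared residuals take the following form. This is a sketch in the notation used elsewhere in these slides; the summation index $i = 1, \ldots, n$ is assumed to run over the sample.

```latex
\begin{align*}
\sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_k x_{ik} \right) &= 0 \\
\sum_{i=1}^{n} x_{ij} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \cdots - \hat{\beta}_k x_{ik} \right) &= 0, \qquad j = 1, \ldots, k
\end{align*}
```

The first condition says the residuals sum to zero; the remaining $k$ say the residuals are uncorrelated in the sample with each regressor.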
Interpreting Multiple Regression
Simple vs Multiple Reg Estimate
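Compare the simple regression $\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$
with the multiple regression $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$
Generally, $\tilde{\beta}_1 \neq \hat{\beta}_1$ unless:
$\hat{\beta}_2 = 0$ (i.e. no partial effect of $x_2$) OR
$x_1$ and $x_2$ are uncorrelated in the sample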
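To see the two slopes differ in practice, here is a minimal sketch assuming NumPy is available. It uses five illustrative observations (the same values as this lecture's Example table); the helper name `ols` is hypothetical, not something defined in the course.

```python
import numpy as np

# Five illustrative observations (same values as the lecture's Example table)
y = np.array([3.0, 4.0, 6.0, 5.0, 8.0])
x1 = np.array([12.0, 8.0, 7.0, 9.0, 10.0])
x2 = np.array([1.0, 4.0, 6.0, 7.0, 8.0])
ones = np.ones_like(y)


def ols(X, y):
    """Least-squares coefficients for design matrix X (hypothetical helper)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


b_tilde = ols(np.column_stack([ones, x1]), y)      # simple regression on x1
b_hat = ols(np.column_stack([ones, x1, x2]), y)    # multiple regression on x1, x2

print("simple slope on x1:  ", b_tilde[1])
print("multiple slope on x1:", b_hat[1])
print("sample corr(x1, x2): ", np.corrcoef(x1, x2)[0, 1])
```

Because $x_1$ and $x_2$ are correlated in this sample and the estimated coefficient on $x_2$ is not zero, the two slopes on $x_1$ do not coincide.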
Matrix Approach
Example
y x1 x2
3 12 1
4 8 4
6 7 6
5 9 7
8 10 8
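The data above can be fit by solving the normal equations $(X'X)\hat{\beta} = X'y$. The following is a minimal sketch, assuming NumPy and assuming the standard normal-equations form of the matrix approach; the variable names are illustrative.

```python
import numpy as np

# The five observations from the example table
y = np.array([3.0, 4.0, 6.0, 5.0, 8.0])
x1 = np.array([12.0, 8.0, 7.0, 9.0, 10.0])
x2 = np.array([1.0, 4.0, 6.0, 7.0, 8.0])

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones_like(y), x1, x2])

# Solve (X'X) b = X'y directly instead of forming the inverse of X'X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ beta_hat   # fitted values
u_hat = y - y_hat      # residuals

print("beta_hat: ", beta_hat)
print("residuals:", u_hat)
```

Solving the linear system is usually preferred over computing $(X'X)^{-1}$ explicitly, since it is numerically more stable and gives the same coefficients.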
Goodness-of-Fit
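We can think of each observation as being made up of an explained part and an unexplained part,
$y_i = \hat{y}_i + \hat{u}_i$. We then define the following:
$\sum (y_i - \bar{y})^2$ is the total sum of squares (SST)
$\sum (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)
$\sum \hat{u}_i^2$ is the residual sum of squares (SSR)
Then SST = SSE + SSR, and
$R^2 = SSE/SST = 1 - SSR/SST$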
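The decomposition can be checked numerically. This is a small sketch, assuming NumPy and reusing the five observations from the lecture's Example table; the names `sst`, `sse`, and `ssr` are chosen here for illustration.

```python
import numpy as np

y = np.array([3.0, 4.0, 6.0, 5.0, 8.0])
x1 = np.array([12.0, 8.0, 7.0, 9.0, 10.0])
x2 = np.array([1.0, 4.0, 6.0, 7.0, 8.0])

X = np.column_stack([np.ones_like(y), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)   # explained sum of squares
ssr = np.sum(u_hat ** 2)                # residual sum of squares

r2 = sse / sst
print("SST:", sst, " SSE + SSR:", sse + ssr)
print("R^2:", r2, " 1 - SSR/SST:", 1 - ssr / sst)
```

The equality SST = SSE + SSR relies on the regression including an intercept, which is the case here.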
Goodness-of-Fit (continued)
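We can also think of $R^2$ as being equal to the squared correlation coefficient between
the actual $y_i$ and the fitted values $\hat{y}_i$:
$$R^2 = \frac{\left(\sum (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right)^2}{\left(\sum (y_i - \bar{y})^2\right)\left(\sum (\hat{y}_i - \bar{\hat{y}})^2\right)}$$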
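As a quick numerical check of this equivalence, the sketch below (assuming NumPy, and again using the five observations from the lecture's Example table) compares the squared sample correlation between $y$ and $\hat{y}$ with $1 - SSR/SST$.

```python
import numpy as np

y = np.array([3.0, 4.0, 6.0, 5.0, 8.0])
x1 = np.array([12.0, 8.0, 7.0, 9.0, 10.0])
x2 = np.array([1.0, 4.0, 6.0, 7.0, 8.0])

X = np.column_stack([np.ones_like(y), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

# R^2 from the sums of squares
ssr = np.sum((y - y_hat) ** 2)
sst = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ssr / sst

# R^2 as the squared correlation between actual and fitted values
corr = np.corrcoef(y, y_hat)[0, 1]

print("1 - SSR/SST:     ", r2)
print("corr(y, y_hat)^2:", corr ** 2)
```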
More about R-squared
$R^2$ can never decrease when another
independent variable is added to a
regression, and usually will increase
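The point can be illustrated by adding a regressor that is pure noise. This is a sketch on simulated data, assuming NumPy; the data-generating values (intercept 1, slope 2, seed 0) and the helper `r_squared` are arbitrary choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
noise = rng.normal(size=n)   # unrelated to y by construction


def r_squared(X, y):
    """R^2 = 1 - SSR/SST for an OLS fit of y on X (X includes the intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


ones = np.ones(n)
r2_small = r_squared(np.column_stack([ones, x1]), y)
r2_big = r_squared(np.column_stack([ones, x1, noise]), y)

print("R^2 without noise regressor:", r2_small)
print("R^2 with noise regressor:   ", r2_big)
```

Even though the added variable has no relationship to $y$ in the population, the larger model cannot fit the sample worse, so its $R^2$ is at least as large.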
Assumptions for Unbiasedness
Population model is linear in parameters:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + u$
We can use a random sample of size $n$,
$\{(x_{i1}, x_{i2}, \ldots, x_{ik}, y_i): i = 1, 2, \ldots, n\}$, from the
population model, so that the sample model
is $y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i$
$E(u \mid x_1, x_2, \ldots, x_k) = 0$, implying that all of the
explanatory variables are exogenous
None of the x’s is constant, and there are no
exact linear relationships among them
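The last assumption (no perfect collinearity) can be checked by looking at the rank of the design matrix. The sketch below assumes NumPy; the regressor `x3` is a hypothetical variable constructed as an exact linear combination of the others to show the failure case.

```python
import numpy as np

x1 = np.array([12.0, 8.0, 7.0, 9.0, 10.0])
x2 = np.array([1.0, 4.0, 6.0, 7.0, 8.0])
x3 = 2.0 * x1 + x2            # exact linear relationship with x1 and x2
ones = np.ones_like(x1)

X_ok = np.column_stack([ones, x1, x2])
X_bad = np.column_stack([ones, x1, x2, x3])

rank_ok = np.linalg.matrix_rank(X_ok)
rank_bad = np.linalg.matrix_rank(X_bad)

# Full column rank means X'X is invertible and OLS is uniquely defined
print("X_ok: ", rank_ok, "of", X_ok.shape[1], "columns")
print("X_bad:", rank_bad, "of", X_bad.shape[1], "columns")
```

When the rank is smaller than the number of columns, $X'X$ is singular and the OLS estimates are not uniquely determined.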
Too Many or Too Few Variables
What happens if we include variables in
our specification that don’t belong?
Including an irrelevant variable does not bias
our parameter estimates: OLS remains unbiased,
although the sampling variance of the estimates can increase
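A small Monte Carlo experiment illustrates this. It is a sketch under stated assumptions: NumPy is available, the true parameter values, sample size, and seed are arbitrary, and the regressors are held fixed across replications so that only the errors are redrawn.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 5000
beta0, beta1 = 1.0, 2.0                 # assumed true values; x2 has true coefficient 0

x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)      # irrelevant, but correlated with x1
X = np.column_stack([np.ones(n), x1, x2])

# Each row of Y is one simulated sample of y from the true model (x2 excluded)
U = rng.normal(size=(reps, n))
Y = beta0 + beta1 * x1 + U

# OLS of y on (1, x1, x2) for all replications at once; B has shape (3, reps)
B = np.linalg.solve(X.T @ X, X.T @ Y.T)
mean_b = B.mean(axis=1)

print("average estimate of beta1:", mean_b[1], "(true value 2.0)")
print("average estimate on x2:   ", mean_b[2], "(true value 0.0)")
```

Across replications the estimates on $x_1$ average out close to the true $\beta_1$, and the estimates on the irrelevant $x_2$ average out close to zero.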
Omitted Variable Bias
Suppose the true model is given as
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, but we
estimate $y = \tilde{\beta}_0 + \tilde{\beta}_1 x_1 + \tilde{u}$, then
$$\tilde{\beta}_1 = \frac{\sum (x_{i1} - \bar{x}_1) y_i}{\sum (x_{i1} - \bar{x}_1)^2}$$
Omitted Variable Bias (cont)
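Recall the true model, so that
$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i$, so the
numerator becomes
$$\sum (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i) = \beta_1 \sum (x_{i1} - \bar{x}_1)^2 + \beta_2 \sum (x_{i1} - \bar{x}_1) x_{i2} + \sum (x_{i1} - \bar{x}_1) u_i$$
using $\sum (x_{i1} - \bar{x}_1) = 0$ and $\sum (x_{i1} - \bar{x}_1) x_{i1} = \sum (x_{i1} - \bar{x}_1)^2$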
Omitted Variable Bias (cont)
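$$\tilde{\beta}_1 = \beta_1 + \beta_2 \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} + \frac{\sum (x_{i1} - \bar{x}_1) u_i}{\sum (x_{i1} - \bar{x}_1)^2}$$
Since $E(u_i \mid x_1, x_2) = 0$, taking expectations gives
$$E(\tilde{\beta}_1) = \beta_1 + \beta_2 \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$$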
Omitted Variable Bias (cont)
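Consider the regression of $x_2$ on $x_1$, whose slope is
$$\tilde{\delta}_1 = \frac{\sum (x_{i1} - \bar{x}_1) x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$$
so $E(\tilde{\beta}_1) = \beta_1 + \beta_2 \tilde{\delta}_1$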
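The bias formula can be checked by simulation. The following is a sketch, assuming NumPy; the true parameters, sample size, and seed are arbitrary, and $x_1$ and $x_2$ are held fixed across replications so that the expectation is conditional on the regressors.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 5000
beta0, beta1, beta2 = 1.0, 2.0, 3.0     # assumed true parameters

x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)      # x2 is correlated with x1

d1 = x1 - x1.mean()                     # deviations of x1 from its mean
sst1 = d1 @ d1
delta1 = (d1 @ x2) / sst1               # slope from regressing x2 on x1

# Simulate y from the true model, then regress y on x1 only (x2 omitted)
U = rng.normal(size=(reps, n))
Y = beta0 + beta1 * x1 + beta2 * x2 + U
beta1_tilde = (Y @ d1) / sst1           # simple-regression slope per replication

predicted = beta1 + beta2 * delta1
print("average simple-regression slope:", beta1_tilde.mean())
print("beta1 + beta2 * delta1:         ", predicted)
```

The average of the short-regression slopes lines up with $\beta_1 + \beta_2 \tilde{\delta}_1$ rather than with $\beta_1$ itself.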
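Summary of Direction of Bias
                 Corr(x1, x2) > 0    Corr(x1, x2) < 0
$\beta_2 > 0$    Positive bias       Negative bias
$\beta_2 < 0$    Negative bias       Positive bias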
Omitted Variable Bias Summary
Two cases where bias is equal to zero:
- $\beta_2 = 0$, that is, $x_2$ doesn’t really belong in the model
- $x_1$ and $x_2$ are uncorrelated in the sample
Variance of the OLS Estimators
Now we know that the sampling
distribution of our estimate is centered
around the true parameter
Want to think about how spread out this
distribution is
Much easier to think about this variance
under an additional assumption, so
Assume $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2$
(Homoskedasticity)
Variance of OLS (cont)
Let $\mathbf{x}$ stand for $(x_1, x_2, \ldots, x_k)$
Assuming that $\mathrm{Var}(u \mid \mathbf{x}) = \sigma^2$ also implies
that $\mathrm{Var}(y \mid \mathbf{x}) = \sigma^2$
Variance of OLS (cont)
Given the Gauss-Markov Assumptions,
$$\mathrm{Var}(\hat{\beta}_j) = \frac{\sigma^2}{SST_j (1 - R_j^2)}, \text{ where}$$
$SST_j = \sum (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the $R^2$
from regressing $x_j$ on all of the other x’s
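The scalar formula agrees with the corresponding diagonal entry of $\sigma^2 (X'X)^{-1}$, which the sketch below checks numerically for $j = 1$. It assumes NumPy; the simulated regressors, the value of $\sigma^2$, and the seed are arbitrary illustration choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2 = 200, 4.0                    # assumed error variance
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# Matrix form: Var(beta_hat | X) = sigma^2 (X'X)^{-1}; entry [1, 1] belongs to x1
var_matrix = sigma2 * np.linalg.inv(X.T @ X)[1, 1]

# Scalar form: sigma^2 / (SST_1 * (1 - R_1^2))
sst_1 = np.sum((x1 - x1.mean()) ** 2)
Z = np.column_stack([np.ones(n), x2])   # the other regressors, with intercept
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
resid = x1 - Z @ g
r2_1 = 1.0 - (resid @ resid) / sst_1    # R^2 from regressing x1 on x2
var_scalar = sigma2 / (sst_1 * (1.0 - r2_1))

print("matrix form:", var_matrix)
print("scalar form:", var_scalar)
print("sigma^2 / SST_1 (x2 omitted):", sigma2 / sst_1)
```

The last printed value is the variance that would apply if $x_2$ were left out; it is never larger than the variance from the full model, and the gap widens as $R_1^2$ grows.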
Misspecified Models
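For the misspecified model
$\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$, we have $\mathrm{Var}(\tilde{\beta}_1) = \sigma^2 / SST_1$
Thus, $\mathrm{Var}(\tilde{\beta}_1) < \mathrm{Var}(\hat{\beta}_1)$ unless $x_1$ and
$x_2$ are uncorrelated in the sample, in which case the two variances are equal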
Misspecified Models (cont)
While the variance of the estimator is
smaller for the misspecified model, unless
$\beta_2 = 0$ the misspecified model is biased
Estimating the Error Variance
We don’t know what the error variance, $\sigma^2$,
is, because we don’t observe the errors, $u_i$
Error Variance Estimate (cont)
$$\hat{\sigma}^2 = \left(\sum \hat{u}_i^2\right) / (n - k - 1) \equiv SSR / df$$
df = n – (k + 1), or df = n – k – 1
df (i.e. degrees of freedom) is the (number
of observations) – (number of estimated
parameters)
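For the five observations in this lecture's Example table, with two regressors plus an intercept, the calculation looks like this. It is a minimal sketch assuming NumPy; the variable names are illustrative.

```python
import numpy as np

y = np.array([3.0, 4.0, 6.0, 5.0, 8.0])
x1 = np.array([12.0, 8.0, 7.0, 9.0, 10.0])
x2 = np.array([1.0, 4.0, 6.0, 7.0, 8.0])

X = np.column_stack([np.ones_like(y), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
u_hat = y - X @ beta_hat                # residuals

n, n_params = X.shape                   # n observations, k + 1 estimated parameters
k = n_params - 1
df = n - k - 1                          # degrees of freedom

ssr = np.sum(u_hat ** 2)
sigma2_hat = ssr / df                   # estimate of the error variance

print("n =", n, " k =", k, " df =", df)
print("sigma^2 hat:", sigma2_hat)
```

With $n = 5$ and $k = 2$ there are only 2 degrees of freedom, so the estimate here is very imprecise; the example is meant only to show the mechanics.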
The Gauss-Markov Theorem
Given our 5 Gauss-Markov Assumptions it
can be shown that OLS is “BLUE”
Best
Linear
Unbiased
Estimator
Thus, if the assumptions hold, use OLS