Lecture 7. Multiple Regression

Based on lecture notes by Nicolas de Roos.

Outline
1 Multiple regression
2 Estimation by OLS
3 Assumptions of OLS
4 Properties of OLS

The multiple regression model

y = β0 + β1 x1 + β2 x2 + ... + βk xk + u

y: dependent variable
x1, ..., xk: independent variables
β0: intercept
β1, ..., βk: slope parameters
u: random error
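To make the anatomy of the model concrete, here is a minimal simulation sketch in Python; the parameter values, the choice of two regressors, and the numpy usage are illustrative assumptions, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Assumed population parameters: intercept beta0 and slopes beta1, beta2
beta0, beta1, beta2 = 1.0, 0.5, -2.0

x1 = rng.normal(size=n)            # independent variables
x2 = rng.normal(size=n)
u = rng.normal(size=n)             # random error, drawn independently of x1, x2

y = beta0 + beta1 * x1 + beta2 * x2 + u   # dependent variable
```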
OLS estimation

Start with a random sample {(xi1, ..., xik, yi) : i = 1, ..., n}. The OLS estimates β̂0, β̂1, ..., β̂k minimise the sum of squared residuals.
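A minimal sketch of computing the OLS estimates numerically, assuming numpy and simulated data as in the earlier sketch; lstsq solves the least-squares problem that the normal equations (X'X)β̂ = X'y characterise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 2.0 * x2 + rng.normal(size=n)

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones(n), x1, x2])

# Minimise the sum of squared residuals (y - X b)'(y - X b) over b
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta_hat)   # approximately [1.0, 0.5, -2.0]
```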
Multiple regression

Example (Arrest records)

narr86-hat = 0.712 − 0.150 pcnv − 0.034 ptime86 − 0.104 qemp86
n = 2,725, R² = 0.0413

narr86: number of times arrested in 1986
pcnv: proportion of prior arrests that led to a conviction
ptime86: months in prison in 1986
qemp86: quarters employed in 1986

Interpretation
- another quarter of employment is associated with a decrease in the arrest rate of 0.104
- a 10 percentage point increase in the probability of prior conviction is associated with a decrease in the arrest rate of 0.015

Example (Arrest records, continued)

Suppose an additional explanatory variable is added:

narr86-hat = 0.707 − 0.151 pcnv + 0.0074 avgsen − 0.037 ptime86 − 0.103 qemp86
n = 2,725, R² = 0.0422

avgsen: average sentence in prior convictions

Interpretation
- a longer prior sentence increases the number of arrests
- limited additional explanatory power (small increase in R²)

Comment on R²
- even if R² is small, the regression may still provide good estimates of ceteris paribus effects
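A tiny sketch of the ceteris paribus arithmetic behind these interpretations, plugging the slope coefficients reported in the first fitted equation into predicted changes; the variable names are just labels for this illustration.

```python
# Slope coefficients reported in the first arrest equation
b_pcnv, b_ptime86, b_qemp86 = -0.150, -0.034, -0.104

# One extra quarter of employment, holding the other regressors fixed
print(b_qemp86 * 1)     # -0.104: predicted change in narr86

# A 10 percentage point rise in the prior-conviction probability
print(b_pcnv * 0.10)    # -0.015: predicted change in narr86
```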
Standard assumptions for multiple regression

Assumption MLR.3 (No perfect collinearity)
In the sample (and also in the population), none of the independent variables is constant, and there are no exact linear relationships between the independent variables.

Remarks
- the assumption rules out perfect correlation between explanatory variables
- if an explanatory variable is a perfect linear combination of other explanatory variables, it can be eliminated
- constant variables are also ruled out (collinear with the intercept)

Examples of perfect collinearity

Small samples

avgscore = β0 + β1 expend + β2 avginc + u

- avginc might coincidentally be an exact multiple of expend (this is rare, even in small samples)
- it will then be impossible to disentangle their effects

Relationships between regressors

voteA = β0 + β1 shareA + β2 shareB + u

- there is an exact linear relationship between shareA and shareB: shareA + shareB = 1
- either shareA or shareB will have to be dropped
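A quick numerical illustration of the voteA example, assuming numpy: the intercept column, shareA, and shareB are exactly linearly dependent, so the design matrix is rank deficient and X'X cannot be inverted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Two-candidate vote shares, which sum to one by construction
shareA = rng.uniform(0.3, 0.7, size=n)
shareB = 1.0 - shareA

X = np.column_stack([np.ones(n), shareA, shareB])

# shareA + shareB equals the intercept column, so rank(X) = 2 < 3
print(np.linalg.matrix_rank(X))   # 2: the OLS coefficients are not identified
```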
Standard assumptions for multiple regression

Assumption MLR.4 (Zero conditional mean)

E(u | x1, x2, ..., xk) = 0

The zero conditional mean assumption

- If avginc were not included in the model, it would end up in the error term, and expend would likely be correlated with u.
Statistical properties of OLS

Theorem (Unbiasedness of OLS)
Under assumptions MLR.1 - MLR.4,

E(β̂j) = βj,  j = 0, 1, 2, ..., k

Interpretation
- in any given random sample, the estimated coefficients may be larger or smaller than the true values
- on average in repeated samples, they will be equal to the values determined by the population relationship between y and the explanatory variables
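A small Monte Carlo sketch of what unbiasedness means across repeated samples; the true parameter values and the numpy setup are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
beta = np.array([1.0, 0.5, -2.0])   # assumed true parameters
n, reps = 200, 5000

estimates = np.empty((reps, 3))
for r in range(reps):
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    u = rng.normal(size=n)          # zero conditional mean holds by construction
    y = beta[0] + beta[1] * x1 + beta[2] * x2 + u
    X = np.column_stack([np.ones(n), x1, x2])
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0]

# Individual estimates vary from sample to sample, but their average is
# close to the true parameter vector
print(estimates.mean(axis=0))       # approximately [1.0, 0.5, -2.0]
```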
Omitted variable bias

Example (Wage equation)

wage = β0 + β1 educ + β2 abil + u
abil = δ0 + δ1 educ + v

Substituting gives

wage = (β0 + β2 δ0) + (β1 + β2 δ1) educ + (β2 v + u)

The return to education β1 will be overestimated if abil is omitted, because β2 δ1 > 0:
- it will look as if people with high education earn very high wages
- this is partly because people with more education have greater ability on average

When is there no omitted variable bias?
- if the omitted variable is irrelevant (i.e. β2 = 0) or uncorrelated with educ (i.e. δ1 = 0)

Omitted variable bias in the general model

Consider the more general model with k = 3:

y = β0 + β1 x1 + β2 x2 + β3 x3 + u   (true model)
y = β0 + β1 x1 + β2 x2 + u           (estimated model)

No general statements are possible about the direction of bias.

Example (Wage equation)

wage = β0 + β1 educ + β2 exper + β3 abil + u

- if abil is omitted, the direction of the bias is unclear
- however, if we know exper is approximately uncorrelated with educ and abil, then the direction of the bias can be analysed as in the simple two-variable case
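A simulation sketch of the wage example's logic; all parameter values are assumed for illustration. When abil is omitted, the educ coefficient centres on β1 + β2 δ1 rather than β1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

beta1, beta2 = 0.08, 0.05    # assumed returns to educ and abil
delta1 = 0.5                 # abil = delta0 + delta1 educ + v, with delta1 > 0

educ = rng.normal(12.0, 2.0, size=n)
abil = 1.0 + delta1 * educ + rng.normal(size=n)
wage = 0.5 + beta1 * educ + beta2 * abil + rng.normal(size=n)

# Short regression: wage on educ only, so abil is pushed into the error term
X_short = np.column_stack([np.ones(n), educ])
b_short = np.linalg.lstsq(X_short, wage, rcond=None)[0]

print(b_short[1])               # about 0.105, not the true 0.08
print(beta1 + beta2 * delta1)   # 0.105: the omitted variable bias formula
```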
Standard assumptions for multiple regression

Assumption MLR.5 (Homoskedasticity)

Var(u | educ, exper, tenure) = σ²   (e.g. in a wage equation)

- this assumption may also be hard to justify

Short-hand notation

Var(u | x) = σ²,  x = (x1, x2, ..., xk)
Components of the OLS variance

Under assumptions MLR.1 - MLR.5,

Var(β̂j) = σ² / [SSTj (1 − Rj²)],  j = 1, ..., k

where
- σ² = Var(u | x) is the variance of the error term
- SSTj = Σi (xij − x̄j)², the sum over i = 1, ..., n, is the total sample variation in xj
- Rj² is the R² from a regression of explanatory variable xj on all other explanatory variables (including a constant)
Multicollinearity

Example (Test scores)

avgscore = β0 + β1 teachexp + β2 matexp + β3 othexp + ...

avgscore: average standardised test score of a school
teachexp: expenditure on teachers
matexp: expenditure on instructional materials
othexp: other expenditures

The different expenditure categories may be strongly correlated: if a school has a lot of resources, it will spend a lot on everything
- it will be hard to estimate the differential effects of different spending categories
- for precise estimates, we need information from situations in which expenditure categories change differentially

Discussion
- with multicollinearity, the effects of different explanatory variables are hard to disentangle
- dropping some explanatory variables may reduce multicollinearity (but it may lead to omitted variable bias!)
- only the sampling variance of the explanatory variables involved will be inflated; estimates of other effects may still be precise
- note that multicollinearity is not a violation of MLR.3
- multicollinearity can be detected with variance inflation factors (see the sketch below):

  VIFj = 1 / (1 − Rj²)

- as a rule of thumb, we should not have VIFj > 10
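A minimal sketch of computing VIFs via the auxiliary regressions described above, assuming numpy and a design matrix X whose columns are the regressors (intercept excluded); the data-generating numbers are illustrative.

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2) for each column of X (shape n x k, no intercept)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        xj = X[:, j]
        # Auxiliary regression of x_j on the other regressors plus a constant
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        resid = xj - Z @ np.linalg.lstsq(Z, xj, rcond=None)[0]
        r2_j = 1.0 - (resid @ resid) / ((xj - xj.mean()) @ (xj - xj.mean()))
        out[j] = 1.0 / (1.0 - r2_j)
    return out

# Strongly correlated "expenditure" columns driven by a common resources factor
rng = np.random.default_rng(0)
resources = rng.normal(size=500)
X = np.column_stack([resources + 0.1 * rng.normal(size=500),
                     resources + 0.1 * rng.normal(size=500),
                     rng.normal(size=500)])
print(vif(X))   # first two VIFs far above 10; the third is near 1
```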
Variance in misspecified models

The choice of whether to include a variable depends on the trade-off between bias and variance.

y = β0 + β1 x1 + β2 x2 + u        (true model)
ŷ = β̂0 + β̂1 x1 + β̂2 x2          (estimated model 1)
ỹ = β̃0 + β̃1 x1                   (estimated model 2)

It may be that the omitted variable bias is compensated by a smaller variance:

Var(β̂1) = σ² / [SST1 (1 − R1²)]
Var(β̃1) = σ² / SST1

- conditional on x1 and x2, the variance in model 2 is always smaller than in model 1

Case 1: suppose β2 = 0
- E(β̂1) = β1, E(β̃1) = β1, Var(β̃1) < Var(β̂1)
- conclusion: do not include irrelevant regressors

Case 2: suppose β2 ≠ 0
- E(β̂1) = β1, E(β̃1) ≠ β1, Var(β̃1) < Var(β̂1)
- conclusion: trade-off between bias and variance
- beware: the bias will not vanish even in large samples
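A simulation sketch of the two cases; the parameter values (true slope 0.5, correlated regressors) are assumptions for illustration. The short model estimates β1 with the smaller variance here, but is biased once β2 ≠ 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 4000

def estimate_both(beta2):
    long_b1, short_b1 = np.empty(reps), np.empty(reps)
    for r in range(reps):
        x1 = rng.normal(size=n)
        x2 = 0.6 * x1 + 0.5 * rng.normal(size=n)    # x2 correlated with x1
        y = 1.0 + 0.5 * x1 + beta2 * x2 + rng.normal(size=n)
        Xl = np.column_stack([np.ones(n), x1, x2])  # estimated model 1 (long)
        Xs = np.column_stack([np.ones(n), x1])      # estimated model 2 (short)
        long_b1[r] = np.linalg.lstsq(Xl, y, rcond=None)[0][1]
        short_b1[r] = np.linalg.lstsq(Xs, y, rcond=None)[0][1]
    return long_b1, short_b1

for beta2 in (0.0, 0.7):                            # Case 1 and Case 2
    lb, sb = estimate_both(beta2)
    print(beta2, lb.mean(), sb.mean(), lb.var(), sb.var())
# Case 1: both means near 0.5; the short model has the smaller variance
# Case 2: short mean near 0.5 + 0.7*0.6 = 0.92 (biased); variance still smaller
```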
Estimating standard errors

To estimate the standard errors of your estimates:
1. estimate your regression model to obtain the parameter estimates β̂
2. obtain the residuals from your regression, ûi, i = 1, ..., n
3. estimate the error variance:
   σ̂² = Σi ûi² / (n − k − 1)
4. calculate the standard error:
   se(β̂j) = sqrt( σ̂² / [SSTj (1 − Rj²)] )

OR use a software package! (A numerical sketch of these steps follows the Gauss-Markov theorem below.)

Efficiency of OLS

Under assumptions MLR.1 - MLR.5, we know OLS is unbiased
- we would like the unbiased estimator with the smallest variance

Theorem (Gauss-Markov Theorem)
Under assumptions MLR.1 - MLR.5, the OLS estimators are the best linear unbiased estimators (BLUEs) of the regression coefficients, i.e. for all j = 0, 1, ..., k,

Var(β̂j) ≤ Var(β̃j)  for all β̃j = Σi wij yi for which E(β̃j) = βj

Notes
- the β̃j are linear estimators (the OLS estimator is of this form)
- OLS is only the best estimator if MLR.1 - MLR.5 hold
- e.g. if there is heteroskedasticity, there are more efficient estimators
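A minimal numpy sketch of the four-step recipe above, on simulated data (all values illustrative). It uses the standard identity that, for the slope coefficients, the j-th diagonal entry of (X'X)^(-1) equals 1 / [SSTj (1 − Rj²)].

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 2
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 0.5 * x1 - 2.0 * x2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# Step 1: parameter estimates
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
# Step 2: residuals
u_hat = y - X @ beta_hat
# Step 3: error variance estimate with n - k - 1 degrees of freedom
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)
# Step 4: standard errors from the diagonal of sigma2_hat * (X'X)^(-1)
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))
print(beta_hat)   # estimates
print(se)         # their standard errors
```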