Multivariate Regression, slides
(Chapters 6, 7.1)
Introduction
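\[ E[\hat{u}_i \cdot \text{Education}_i] = 0 \]
\[ E[\hat{u}_i \cdot \text{AFQT}_i] = 0 \]
\[ E[\hat{u}_i] = 0 \]
where
\[ \hat{u}_i = Y_i - \left[ \hat{\beta}_0^M + \hat{\beta}_1^M \cdot \text{Education}_i + \hat{\beta}_2^M \cdot \text{AFQT}_i \right] \]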
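The sample counterparts of these three conditions can be checked numerically. The following is a minimal sketch on simulated data; the variable names, coefficients, and noise levels are illustrative assumptions and are not taken from the slides' data set.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated stand-ins for the slide's variables (assumed values, not real data).
afqt = rng.uniform(0, 100, n)
education = 10 + 0.05 * afqt + rng.normal(0, 2, n)
earnings = 5 + 6 * education + 0.3 * afqt + rng.normal(0, 20, n)

# Multivariate regression of earnings on a constant, education, and AFQT.
X = np.column_stack([np.ones(n), education, afqt])
beta_hat = np.linalg.lstsq(X, earnings, rcond=None)[0]
u_hat = earnings - X @ beta_hat

# Sample averages of u_hat, u_hat * education, u_hat * AFQT.
# All three are zero up to floating-point error.
moments = X.T @ u_hat / n
print(moments)
```

The three entries of `moments` correspond, in order, to the conditions on the constant, education, and AFQT.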
Interpreting a multivariate regression function
[Figure: Earnings by education. Annual earnings (1,000$), 0 to 400, against years of education, 8 to 20.]
Income and education
[Figure: Earnings by education. Annual earnings (1,000$), 0 to 150, against years of education, 8 to 20.]
[Figure: Earnings by education. Annual earnings (1,000$), 0 to 150, against years of education, 8 to 20. Slope = 10.1.]
Income and education regressions
[Figure: Education by AFQT. Years of education, 8 to 20, against AFQT, 0 to 100.]
[Figures: Annual earnings (1,000$) against a horizontal variable centered at zero (axis label not recovered). The final panel reports Slope = 6.7.]
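A standard way to obtain a multivariate coefficient from a two-variable plot is partialling out (the Frisch-Waugh-Lovell result): regress education on the other regressor, then regress earnings on the residual. Assuming that this is what the panels with a zero-centered horizontal axis depict, the sketch below reproduces the idea on simulated data. All numbers are illustrative assumptions and will not match the slopes reported on the slides.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 500

# Simulated stand-ins (assumed values, not the slides' data).
afqt = rng.uniform(0, 100, n)
education = 10 + 0.05 * afqt + rng.normal(0, 2, n)
earnings = 5 + 6 * education + 0.3 * afqt + rng.normal(0, 20, n)
const = np.ones(n)

# Short regression: earnings on education only.
slope_short = ols(np.column_stack([const, education]), earnings)[1]

# Long regression: earnings on education and AFQT.
slope_long = ols(np.column_stack([const, education, afqt]), earnings)[1]

# Partialling out: residualize education on (constant, AFQT), then take the
# bivariate slope of earnings on that residual.
Z = np.column_stack([const, afqt])
educ_resid = education - Z @ ols(Z, education)
slope_fwl = (educ_resid @ earnings) / (educ_resid @ educ_resid)

print(slope_short, slope_long, slope_fwl)
```

Here `slope_fwl` equals `slope_long` up to floating-point error, while `slope_short` differs because education and AFQT are correlated in the simulated data.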
R² visual
[Figure: scatter of Y against X, shown first on its own and then with the linear fit and the unconditional mean overlaid.]
R² formula
\[ TSS = \sum_{i=1}^{N} (Y_i - \bar{Y})^2 \]
\[ SSR = \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{N} \hat{u}_i^2 \]
\[ R^2 = 1 - \frac{SSR}{TSS} \]
R² is the share of the squared deviations from the mean (TSS) that we eliminate by predicting with the fitted values rather than with the mean.
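The calculation can be sketched in a few lines. The data below are simulated purely for illustration; the coefficients and sample size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Simulated X and Y (assumed values, for illustration only).
x = rng.normal(2, 3, n)
y = 1 + 0.8 * x + rng.normal(0, 2, n)

# Linear fit of Y on a constant and X.
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta_hat

tss = np.sum((y - y.mean()) ** 2)  # squared deviations from the unconditional mean
ssr = np.sum((y - y_hat) ** 2)     # squared residuals from the linear fit
r2 = 1 - ssr / tss

print(r2)
```

With a single regressor and a constant, `r2` coincides with the squared sample correlation between `x` and `y`.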
Comments on R²
Outcome: ∆GDP
                               (1)       (2)       (3)
∆G                           -0.87     -0.96     -0.83
(SE)                        (0.89)    (1.04)    (0.82)
Lagged ∆G                               0.29
                                      (0.76)
Lagged ∆GDP                                       -0.14
                                                (0.30)
Variance of û               20,322    20,287    19,927
Residual variation of ∆G       442       401       442
Note: constant term not shown
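Tip: Adding the lagged outcome variable is often a good trick. It tends to have good explanatory power, which lowers the variance of û, and it is often not as correlated with the other regressors, which leaves their residual variation largely intact.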
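The two table rows "Variance of û" and "Residual variation" are the two ingredients of the textbook homoskedastic standard error: SE(β̂) is the square root of the residual variance divided by the residual variation of the regressor. The sketch below checks this identity on simulated data and shows the tip at work. It assumes homoskedastic standard errors, which may not be the kind used in the table, and the helper names `ols_with_se` and `residual_variation` are hypothetical.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS coefficients, homoskedastic standard errors, and residual variance."""
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    u = y - X @ beta
    sigma2 = (u @ u) / (n - k)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, sigma2

def residual_variation(x, Z):
    """Sum of squared residuals from regressing x on the columns of Z."""
    r = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    return r @ r

rng = np.random.default_rng(3)
n = 300

# Simulated data (assumed values): w explains y but is independent of x.
x = rng.normal(0, 1, n)
w = rng.normal(0, 1, n)
y = 1 + 0.5 * x + 2.0 * w + rng.normal(0, 1, n)
const = np.ones(n)

# Without and with the extra control w.
_, se_short, s2_short = ols_with_se(np.column_stack([const, x]), y)
_, se_long, s2_long = ols_with_se(np.column_stack([const, x, w]), y)

rv_short = residual_variation(x, const[:, None])
rv_long = residual_variation(x, np.column_stack([const, w]))

# SE of the coefficient on x equals sqrt(variance of u_hat / residual variation of x).
print(se_short[1], np.sqrt(s2_short / rv_short))
print(se_long[1], np.sqrt(s2_long / rv_long))
```

In this simulation the control lowers the residual variance sharply while barely changing the residual variation of `x`, so the standard error on `x` falls.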
“OVB for SEs” takeaways