ECON 5350 Class Notes Least Squares: 2.1 The Problem
Least Squares
1 Introduction
We are interested in estimating the population parameters from the regression equation

$$Y = X\beta + \varepsilon.$$

The sample counterpart to the error term ($\varepsilon$) is called the residual ($e$). The two are related according to

$$Y = X\beta + \varepsilon = Xb + e.$$
2 Least Squares
We want to estimate the parameter $\beta$ by choosing a fitting criterion that makes the sample regression line fit the data as closely as possible. The least-squares criterion is the sum of squared residuals, $e'e = (Y - Xb)'(Y - Xb)$. The criterion is minimized by choosing $b$. Taking the (vector) derivative with respect to $b$ and setting it equal to zero gives

$$\frac{\partial e'e}{\partial b} = -2X'Y + 2X'Xb = 0. \qquad (2)$$

Solving for $b$ yields

$$b = (X'X)^{-1}X'Y, \qquad (3)$$

which satisfies the second-order condition for a minimum since $X'X$ is a positive-definite matrix if $X$ is of full rank (Greene A-114).
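As an illustrative sketch (not part of the original notes), the matrix solution $b = (X'X)^{-1}X'Y$ can be computed with NumPy on simulated data. The data-generating values below are hypothetical; note that solving the normal equations $X'Xb = X'Y$ directly is numerically preferable to forming the inverse.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
# Design matrix: a constant column plus one regressor (hypothetical data)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])                 # assumed true parameters
Y = X @ beta + rng.normal(size=n)           # Y = X*beta + eps

# Solve the normal equations X'X b = X'Y rather than inverting X'X
b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b                               # residuals
print(b)
```

The estimates should be close to, but not exactly equal to, the true $\beta$ because of the simulated error term.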
2.2 Example: UW Enrollment and Energy Prices
Consider the bivariate regression over the sample period 1957-2006 where the variables are

Y = UW enrollment
X = price of oil.

The population regression is

$$y_t = \beta_1 + \beta_2 x_t + \varepsilon_t.$$
The least-squares criterion is

$$\sum_{t=1}^{T} e_t^2 = \sum_{t=1}^{T} (y_t - b_1 - b_2 x_t)^2$$

and the first-order conditions are

$$\frac{\partial \left(\sum_t e_t^2\right)}{\partial b_1} = -2\sum_t (y_t - b_1 - b_2 x_t) = 0 \qquad (4)$$

$$\frac{\partial \left(\sum_t e_t^2\right)}{\partial b_2} = -2\sum_t (y_t - b_1 - b_2 x_t)x_t = 0. \qquad (5)$$
Equations (4) and (5) can be arranged to produce the normal equations
$$\sum_t y_t = T b_1 + b_2 \sum_t x_t$$

$$\sum_t y_t x_t = b_1 \sum_t x_t + b_2 \sum_t x_t^2.$$
Solving the normal equations gives

$$b_1 = \bar{y} - b_2 \bar{x}$$

$$b_2 = \frac{\sum_t (y_t - \bar{y})(x_t - \bar{x})}{\sum_t (x_t - \bar{x})^2}.$$
This is the same answer you get via matrix algebra, $b = (b_1, b_2)' = (X'X)^{-1}(X'Y)$, for appropriately defined $X$ and $Y$.
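As a hedged sketch (simulated data, not the actual UW enrollment and oil-price series), the scalar formulas $b_1 = \bar{y} - b_2\bar{x}$ and $b_2 = \sum(y_t-\bar{y})(x_t-\bar{x})/\sum(x_t-\bar{x})^2$ can be checked against the matrix solution:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50
x = rng.normal(size=T)                      # hypothetical regressor
y = 3.0 + 0.5 * x + rng.normal(size=T)      # hypothetical dependent variable

# Scalar (summation) formulas from the normal equations
b2 = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
b1 = y.mean() - b2 * x.mean()

# Matrix formula b = (X'X)^{-1} X'Y with X = [1, x]
X = np.column_stack([np.ones(T), x])
b_matrix = np.linalg.solve(X.T @ X, X.T @ y)
print(b1, b2, b_matrix)
```

The two routes produce identical coefficients up to floating-point error.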
The least-squares solution (3) implies that the residuals are orthogonal to the regressors:

$$X'(Y - Xb) = X'e = 0. \qquad (6)$$

When the first column of $X$ is a column of ones (a constant term), (6) has the following implications:
1. The first column of $X$ implies $\sum_i e_i = 0$. Positive and negative residuals exactly cancel out.

2. $\sum_i e_i = 0$ implies that $\bar{e} = \bar{y} - \bar{x}'b = 0$, which implies $\bar{y} = \bar{x}'b$. The regression hyperplane passes through the point of means.
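Both algebraic properties can be verified numerically. The sketch below uses simulated data (names and parameter values are hypothetical) with a constant term in $X$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 80
# Constant column plus two hypothetical regressors
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b

print(e.sum())                        # residuals sum to (numerically) zero
print(Y.mean(), X.mean(axis=0) @ b)   # hyperplane passes through the means
```

Both checks rely on the constant column: without it, neither property is guaranteed.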
Let a regression have two sets of explanatory variables, X1 and X2 , such that
$$Y = X_1\beta_1 + X_2\beta_2 + \varepsilon.$$
The normal equations are

$$\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} X_1'Y \\ X_2'Y \end{bmatrix}.$$
Solving for $b_2$ gives

$$b_2 = [X_2'(I - X_1(X_1'X_1)^{-1}X_1')X_2]^{-1}\,[X_2'(I - X_1(X_1'X_1)^{-1}X_1')Y] = [X_2'M_1X_2]^{-1}[X_2'M_1Y],$$

where $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$ can be interpreted as a residual-maker matrix (i.e., premultiplying any conformable matrix by $M_1$ will generate the residuals associated with a regression on $X_1$). Note the following:
Define $e_{Y1} = M_1 Y$ and $e_{21} = M_1 X_2$. Then

$$b_2 = [X_2'M_1X_2]^{-1}[X_2'M_1Y] = [e_{21}'e_{21}]^{-1}[e_{21}'e_{Y1}].$$
This is the result that makes multiple regression analysis so powerful for applied economics. We can interpret $b_2$ as the impact of $X_2$ on $Y$ while "partialing or netting out" the effect of $X_1$. The results for $b_1$ are analogous.
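The partialling-out result can be demonstrated numerically. The sketch below (simulated data; all names and parameter values are hypothetical) compares $b_2$ from the joint regression with $[e_{21}'e_{21}]^{-1}[e_{21}'e_{Y1}]$, the coefficient from regressing the residualized $Y$ on the residualized $X_2$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 120
X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
X2 = rng.normal(size=(n, 1)) + X1[:, [1]]       # deliberately correlated with X1
Y = X1 @ np.array([1.0, 2.0]) - 1.5 * X2[:, 0] + rng.normal(size=n)

# Joint regression of Y on [X1, X2]
X = np.hstack([X1, X2])
b_full = np.linalg.solve(X.T @ X, X.T @ Y)

# Residual-maker M1 = I - X1 (X1'X1)^{-1} X1'
M1 = np.eye(n) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
e21 = M1 @ X2                                   # X2 with X1 netted out
eY1 = M1 @ Y                                    # Y with X1 netted out
b2 = np.linalg.solve(e21.T @ e21, e21.T @ eY1)
print(b_full[2], b2[0])
```

The X2 coefficient is identical across the two routes, which is the partialling-out result in action.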
2.5 Goodness of Fit and Analysis of Variance
We will now assess how well the regression model fits the data. Begin by writing the sample regression
equation Y = Xb + e in deviation from its mean form using the following matrix
$$M^0 = I_n - \frac{1}{n}ii' = \begin{bmatrix} 1-\frac{1}{n} & -\frac{1}{n} & \cdots & -\frac{1}{n} \\ -\frac{1}{n} & 1-\frac{1}{n} & \cdots & -\frac{1}{n} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{1}{n} & -\frac{1}{n} & \cdots & 1-\frac{1}{n} \end{bmatrix}$$

so that

$$Y - \bar{Y}i = M^0 Y = M^0(Xb + e) = M^0Xb + e, \qquad (7)$$

where the last equality uses $M^0 e = e$: the residuals already have mean zero when $X$ contains a constant.
Premultiplying (7) by itself transposed, and noting that M 0 is a symmetric and idempotent matrix, gives
$$(Y - \bar{Y}i)'(Y - \bar{Y}i) = Y'M^0Y = b'X'M^0Xb + e'e,$$

or SST = SSR + SSE, where the three terms stand for total, regression, and error sum of squares, respectively. The coefficient of determination is

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}, \qquad 0 \le R^2 \le 1.$$
An alternative measure is $\bar{R}^2 = 1 - \frac{SSE/(n-k)}{SST/(n-1)}$, the adjusted $R^2$. This measure adds a penalty for including additional regressors that do not sufficiently improve the fit. The value of $R^2$ will also depend on the type of data (e.g., cross-sectional data tends to produce low $R^2$s, while time-series data tends to produce high $R^2$s).
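The variance decomposition and both fit measures can be computed directly. The sketch below (simulated data; names and parameter values are hypothetical) builds $M^0$ explicitly and confirms SST = SSR + SSE:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
e = Y - X @ b

# Demeaning matrix M0 = I - (1/n) ii'
i = np.ones(n)
M0 = np.eye(n) - np.outer(i, i) / n

SST = Y @ M0 @ Y                  # total sum of squares
SSR = (X @ b) @ M0 @ (X @ b)      # regression sum of squares
SSE = e @ e                       # error sum of squares

R2 = SSR / SST
R2_adj = 1 - (SSE / (n - k)) / (SST / (n - 1))
print(R2, R2_adj)
```

In practice one would compute $Y - \bar{Y}i$ directly rather than forming the $n \times n$ matrix $M^0$; it is built here only to mirror the derivation.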