
ECON 5350 Class Notes

Least Squares

1 Introduction

We are interested in estimating the population parameters from the regression equation

Y = X\beta + \varepsilon.

The population values are $\beta$, $\sigma^2$ and $\varepsilon$. Their sample counterparts are $b$, $\hat{\sigma}^2$ and $e$. The sample counterpart to the error term ($\varepsilon$) is called the residual ($e$). The two are related according to

Y = X\beta + \varepsilon = Xb + e.

2 Least Squares

2.1 The Problem

We want to estimate the parameter $\beta$ by choosing a fitting criterion that makes the sample regression line as close as possible to the data points. Our criterion is

\min_b \; e'e = (Y - Xb)'(Y - Xb) = Y'Y - b'X'Y - Y'Xb + b'X'Xb.   (1)

The criterion is minimized by choosing $b$. Taking the (vector) derivative with respect to $b$ and setting it equal to zero gives

\frac{\partial e'e}{\partial b} = -2X'Y + 2X'Xb = 0.   (2)

Provided $X'X$ is nonsingular (guaranteed by Classical assumption two), we solve to get

b = (X'X)^{-1} X'Y.   (3)

The second-order condition gives

\frac{\partial^2 (e'e)}{\partial b \, \partial b'} = 2X'X,

which satisfies the condition for a minimum since $X'X$ is a positive-definite matrix if $X$ is of full rank (Greene A-114).
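A minimal MATLAB sketch of equation (3) on simulated data (the dimensions, coefficients, and data below are illustrative assumptions, not part of the notes):

```matlab
% Illustrative sketch only: simulated data and assumed coefficients.
n = 100; k = 3;
X = [ones(n,1) randn(n,k-1)];        % design matrix with a constant term
beta = [1; 0.5; -2];                 % assumed "population" coefficients
Y = X*beta + randn(n,1);             % Y = X*beta + epsilon

b = (X'*X)\(X'*Y);                   % least squares estimate, equation (3)
e = Y - X*b;                         % residuals
```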

2.2 Example: UW Enrollment and Energy Prices

Consider the bivariate regression over the sample period 1957-2006 where the variables are

Y = UW resident undergraduate enrollment
X = price of oil.

Assume the population regression equation is

y_t = \beta_1 + \beta_2 x_t + \varepsilon_t.

The objective is to choose $b_1$ and $b_2$ to minimize

\sum_{t=1}^{T} e_t^2 = \sum_{t=1}^{T} (y_t - b_1 - b_2 x_t)^2,

which gives the two first-order conditions

\frac{\partial \sum_t e_t^2}{\partial b_1} = -2 \sum_t (y_t - b_1 - b_2 x_t) = 0   (4)

\frac{\partial \sum_t e_t^2}{\partial b_2} = -2 \sum_t (y_t - b_1 - b_2 x_t) x_t = 0.   (5)

Equations (4) and (5) can be arranged to produce the normal equations

\sum_t y_t = T b_1 + b_2 \sum_t x_t

\sum_t y_t x_t = b_1 \sum_t x_t + b_2 \sum_t x_t^2.

Finally, solving for $b_1$ and $b_2$ gives

b_1 = \bar{y} - b_2 \bar{x}

b_2 = \frac{\sum_t (y_t - \bar{y})(x_t - \bar{x})}{\sum_t (x_t - \bar{x})^2}.

This is the same answer you get via matrix algebra, $b = (b_1, b_2)' = (X'X)^{-1}(X'Y)$, for appropriately defined $X$ and $Y$. See MATLAB example 10 for more details.
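MATLAB example 10 is not reproduced in these notes. As a stand-in, the following minimal sketch (simulated data, not the UW enrollment and oil-price series) verifies that the bivariate formulas and the matrix-algebra formula give the same coefficients:

```matlab
% Illustrative sketch only: simulated data, not the UW enrollment / oil-price series.
T = 50;
x = randn(T,1);
y = 3 + 0.8*x + randn(T,1);          % assumed intercept 3 and slope 0.8

b2 = sum((y - mean(y)).*(x - mean(x))) / sum((x - mean(x)).^2);
b1 = mean(y) - b2*mean(x);

X = [ones(T,1) x];                   % matrix-algebra version
b = (X'*X)\(X'*y);                   % b(1), b(2) match b1, b2 above
```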

2.3 Algebra of Least Squares

Consider the normal equations

X'(Y - Xb) = X'e = 0.   (6)

Three interesting results follow from equation (6) (assuming a constant term):

1. The first column of $X$ implies $\sum_i e_i = 0$: positive and negative residuals exactly cancel out.

2. $\sum_i e_i = 0$ implies that $\bar{e} = \bar{Y} - \bar{X}b = 0$, which implies $\bar{Y} = \bar{X}b$. The regression hyperplane passes through the sample means.

3. $\hat{Y}'e = (Xb)'e = b'X'e = 0$. The fitted values are orthogonal to the residuals.
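A minimal sketch (simulated data; the checks rely on $X$ containing a constant column) that verifies the three results numerically:

```matlab
% Illustrative sketch only: simulated data with an assumed coefficient vector.
n = 100;
X = [ones(n,1) randn(n,2)];          % constant term included, as assumed in the text
Y = X*[1; 0.5; -2] + randn(n,1);
b = (X'*X)\(X'*Y);
e = Y - X*b;

sum(e)                               % result 1: residuals sum to (numerically) zero
[mean(Y) mean(X)*b]                  % result 2: Ybar equals Xbar*b
(X*b)'*e                             % result 3: fitted values orthogonal to residuals
```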

2.4 Partitioned and Partial Regressions

Let a regression have two sets of explanatory variables, $X_1$ and $X_2$, such that

Y = X_1\beta_1 + X_2\beta_2 + \varepsilon.

The normal equations can be written in partitioned form as

\begin{bmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} X_1'Y \\ X_2'Y \end{bmatrix}.

Solving for $b_2$ gives

b_2 = [X_2'(I - X_1(X_1'X_1)^{-1}X_1')X_2]^{-1} [X_2'(I - X_1(X_1'X_1)^{-1}X_1')Y]
    = [X_2'M_1X_2]^{-1} [X_2'M_1Y],

where $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$ can be interpreted as a residual-maker matrix (i.e., premultiplying any conformable matrix by $M_1$ will generate the residuals associated with a regression on $X_1$). Note the following:

- Define $e_{Y1} = M_1Y$.

- Define $e_{21} = M_1X_2$.

- $M_1$ is symmetric and idempotent (i.e., $M_1 = M_1'$ and $M_1 = M_1M_1$).

This implies that we can write

b_2 = [X_2'M_1X_2]^{-1} [X_2'M_1Y] = [e_{21}'e_{21}]^{-1} [e_{21}'e_{Y1}].

This is the result that makes multiple regression analysis so powerful for applied economics. We can interpret $b_2$ as the impact of $X_2$ on $Y$ while "partialing or netting out" the effect of $X_1$. The results for $b_1$ are analogous.
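A minimal sketch (simulated data; variable names and dimensions are assumptions) illustrating that regressing the partialed-out variables on each other reproduces $b_2$ from the full regression:

```matlab
% Illustrative sketch only: simulated data with assumed coefficients.
n = 200;
X1 = [ones(n,1) randn(n,1)];
X2 = randn(n,2);
Y  = X1*[1; 2] + X2*[0.5; -1] + randn(n,1);

X = [X1 X2];
b = (X'*X)\(X'*Y);                   % full regression; b(3:4) is b2

M1  = eye(n) - X1*((X1'*X1)\X1');    % residual-maker matrix for X1
e21 = M1*X2;                         % X2 with the effect of X1 netted out
eY1 = M1*Y;                          % Y with the effect of X1 netted out
b2  = (e21'*e21)\(e21'*eY1);         % matches b(3:4)
```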

2.5 Goodness of Fit and Analysis of Variance

We will now assess how well the regression model fits the data. Begin by writing the sample regression equation $Y = Xb + e$ in deviation-from-mean form using the following matrix

M^0 = I_n - \frac{1}{n}ii' = \begin{bmatrix} 1 - \frac{1}{n} & -\frac{1}{n} & \cdots & -\frac{1}{n} \\ -\frac{1}{n} & 1 - \frac{1}{n} & \cdots & -\frac{1}{n} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{1}{n} & -\frac{1}{n} & \cdots & 1 - \frac{1}{n} \end{bmatrix}

where $i$ is the unit column vector. We can then write

Y - \bar{Y} = M^0Y = M^0(Xb + e) = M^0Xb + e.   (7)

Premultiplying (7) by its transpose, and noting that $M^0$ is a symmetric and idempotent matrix, gives

(Y - \bar{Y})'(Y - \bar{Y}) = Y'M^0Y = b'X'M^0Xb + e'e,

or SST = SSR + SSE, where the three terms stand for the total, regression and error sums of squares, respectively.

A natural measure of goodness of fit is

R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}.

A few notes about $R^2$:

- $0 \le R^2 \le 1$.

- By adding additional explanatory variables, you can never make $R^2$ smaller.

- An alternative measure is $\bar{R}^2 = 1 - \frac{SSE/(n-k)}{SST/(n-1)}$, the adjusted $R^2$. This measure adds a penalty for additional explanatory variables.

- Be cautious interpreting $R^2$ when no constant is included.

- The value of $R^2$ will depend on the type of data (e.g., cross-sectional data tends to produce low $R^2$s and time-series data often produces high $R^2$s).

- Comparing $R^2$s requires comparable dependent variables.
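A minimal sketch (simulated data; $n$ and $k$ below are assumptions) of the variance decomposition and the two goodness-of-fit measures in this subsection:

```matlab
% Illustrative sketch only: simulated data with an assumed coefficient vector.
n = 100; k = 3;
X = [ones(n,1) randn(n,k-1)];
Y = X*[1; 0.5; -2] + randn(n,1);
b = (X'*X)\(X'*Y);
e = Y - X*b;

M0  = eye(n) - ones(n)/n;            % M^0 = I_n - (1/n)*i*i'
SST = Y'*M0*Y;
SSR = b'*(X'*M0*X)*b;
SSE = e'*e;                          % SST equals SSR + SSE (up to rounding)

R2    = SSR/SST;                     % equivalently 1 - SSE/SST
R2adj = 1 - (SSE/(n-k))/(SST/(n-1)); % adjusted R-squared
```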
