
Econometrics I

Professor William Greene


Stern School of Business
Department of Economics

3-1/29 Part 3: Least Squares Algebra


Econometrics I

Part 3 – Least Squares Algebra

3-2/29 Part 3: Least Squares Algebra


Vocabulary

•  Some terms to be used in the discussion:
   population characteristics and entities vs. sample quantities and analogs;
   residuals and disturbances;
   population regression line and sample regression line.
•  Objective: Learn about the conditional mean function. 'Estimate' β and σ2.
•  First step: Mechanics of fitting a line (hyperplane) to a set of data.

3-3/29 Part 3: Least Squares Algebra


Fitting Criteria

•  The set of points in the sample
•  Fitting criteria - what are they?
   LAD: Minimize (over b)  Σi |yi - xi'bLAD|
   Least squares: Minimize (over b)  Σi (yi - xi'bLS)2
   ... and so on
•  Why least squares?
   A fundamental result:
   Sample moments are "good" estimators of
   their population counterparts.
   We will examine this principle and apply it to least
   squares computation.

3-4/29 Part 3: Least Squares Algebra


An Analogy Principle for Estimating β

In the population,  E[y | X] = Xβ,  so
E[y - Xβ | X] = 0.
Continuing, (assumed) E[xi εi] = 0 for every i.
Summing,  Σi E[xi εi] = Σi 0 = 0.
Exchange Σi and E[·]:  E[Σi xi εi] = E[X'ε] = 0, so
E[X'(y - Xβ)] = 0.
So, if Xβ is the conditional mean, then E[X'ε] = 0.
We choose b, the estimator of β, to mimic this population
result: i.e., mimic the population mean with the sample
mean.
Find b such that  (1/n) X'e = (1/n) X'(y - Xb) = 0.
As we will see, the solution is the least squares coefficient
vector.
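A minimal numerical sketch of this sample analog (assuming Python with NumPy and simulated data, which are not part of the slides): choosing b to solve the sample moment condition makes (1/n)X'e numerically zero.

```python
import numpy as np

# Simulated data (assumption: any X with full column rank and any y will do)
rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -2.0])
y = X @ beta + rng.normal(size=n)

# Choose b to satisfy the sample moment condition (1/n) X'(y - Xb) = 0,
# i.e., solve X'Xb = X'y.
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(X.T @ e / n)   # numerically zero: the sample analog of E[X'eps] = 0
```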

3-5/29 Part 3: Least Squares Algebra


Population Moments

We assumed that E[εi | xi] = 0. (Slide 2:40)

It follows that Cov[xi, εi] = 0.
Proof: Cov(xi, εi) = Cov(xi, E[εi | xi]) = Cov(xi, 0) = 0.
(Theorem B.2.) If E[yi | xi] = xi'β, then
β = (Var[xi])-1 Cov[xi, yi].
Proof: Cov[xi, yi] = Cov[xi, E[yi | xi]] = Cov[xi, xi'β] = Var[xi] β.
This will provide a population analog to the statistics we
compute with the data.
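A small sketch of the sample analog (simulated data in NumPy; here x denotes the non-constant regressors, so the sample Var[x] is nonsingular): the moment-based slopes [Var(x)]-1 Cov(x, y) coincide with the least squares slopes from a regression that includes a constant.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=(n, 2)) @ np.array([[1.0, 0.3], [0.3, 1.0]])
y = 1.0 + x @ np.array([0.5, -2.0]) + rng.normal(size=n)

S_xx = np.cov(x, rowvar=False)                                # sample Var[x], 2 x 2
s_xy = np.cov(np.column_stack([x, y]), rowvar=False)[:2, 2]   # sample Cov[x, y]
slopes_moment = np.linalg.solve(S_xx, s_xy)

# Least squares slopes from a regression that includes a constant
X = np.column_stack([np.ones(n), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

print(slopes_moment, b[1:])   # the two slope vectors agree
```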

3-6/29 Part 3: Least Squares Algebra


U.S. Gasoline Market, 1960-1995

3-7/29 Part 3: Least Squares Algebra


Least Squares

•  The example will be Gi regressed on xi = [1, PGi, Yi]

•  Fitting criterion: the fitted equation will be
   yi = b1xi1 + b2xi2 + ... + bKxiK.

•  The criterion is based on the residuals:
   ei = yi - (b1xi1 + b2xi2 + ... + bKxiK) = yi - xi'b
   Make the ei as small as possible.
   Form a criterion and minimize it.

3-8/29 Part 3: Least Squares Algebra


Fitting Criteria


•  Sum of residuals:  Σi ei
•  Sum of squared residuals:  Σi ei2
•  Sum of absolute values of the residuals:  Σi |ei|
•  Absolute value of the sum of residuals:  |Σi ei|

We focus on Σi ei2 now and Σi |ei| later.
(Sums run over i = 1, ..., n.)

3-9/29 Part 3: Least Squares Algebra


Least Squares Algebra

Σi ei2 = Σi (yi - xi'b)2 = e'e = (y - Xb)'(y - Xb)

Matrix and vector derivatives.


Derivative of a scalar with respect to a vector
Derivative of a column vector wrt a row vector
Other derivatives

3-10/29 Part 3: Least Squares Algebra


Least Squares Normal Equations

e'e = Σi ei2 = Σi (yi - xi'b)2

∂(e'e)/∂b = Σi ∂(yi - xi'b)2/∂b = Σi 2(yi - xi'b)(-xi)
          = -2 Σi xi yi + 2 Σi xi xi' b
          = -2X'y + 2X'Xb

3-11/29 Part 3: Least Squares Algebra


Least Squares Normal Equations

∂[(y - Xb)'(y - Xb)]/∂b = -2X'(y - Xb) = 0

Dimensions:  (1 x 1) differentiated with respect to (K x 1):
(-2)(n x K)'(n x 1) = (-2)(K x n)(n x 1) = K x 1
Note: the derivative of a (1 x 1) scalar wrt a (K x 1) vector is a (K x 1) vector.
Solution:  -2X'(y - Xb) = 0   =>   X'y = X'Xb
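As a check on the derivation, a sketch (simulated data, NumPy; none of this is from the slides) comparing the analytic gradient -2X'y + 2X'Xb with a numerical gradient of e'e, and confirming the gradient vanishes at the least squares solution.

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

def sse(b):
    e = y - X @ b
    return e @ e

b0 = rng.normal(size=K)                      # an arbitrary trial vector
analytic = -2 * X.T @ y + 2 * X.T @ X @ b0   # gradient formula from the slide

# Central-difference numerical gradient for comparison
h = 1e-6
numeric = np.array([(sse(b0 + h * np.eye(K)[k]) - sse(b0 - h * np.eye(K)[k])) / (2 * h)
                    for k in range(K)])
print(np.max(np.abs(analytic - numeric)))    # small: the two gradients agree

# At the least squares solution the gradient is zero (the normal equations)
b = np.linalg.solve(X.T @ X, X.T @ y)
print(np.max(np.abs(-2 * X.T @ y + 2 * X.T @ X @ b)))   # ~0
```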

3-12/29 Part 3: Least Squares Algebra


Least Squares Solution

Assuming it exists: b = (X'X)-1X'y


Note the analogy:  β = [Var(x)]-1 Cov(x, y)

b = [(1/n) X'X]-1 [(1/n) X'y]
  = [(1/n) Σi xi xi']-1 [(1/n) Σi xi yi]
Suggests something desirable about least squares
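A brief sketch (simulated data, NumPy) showing three equivalent ways to compute b; the textbook formula matches the solutions obtained by solving the normal equations or calling an SVD-based least squares routine.

```python
import numpy as np

rng = np.random.default_rng(3)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

b_inv   = np.linalg.inv(X.T @ X) @ (X.T @ y)     # textbook formula (X'X)^{-1}X'y
b_solve = np.linalg.solve(X.T @ X, X.T @ y)      # solve the normal equations directly
b_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]   # SVD-based least squares routine

print(np.max(np.abs(b_inv - b_solve)), np.max(np.abs(b_solve - b_lstsq)))
```

In practice, solving the normal equations (or using a routine such as lstsq) is preferred to forming (X'X)-1 explicitly, for numerical stability; the explicit inverse appears here only because it matches the algebra on the slide.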

3-13/29 Part 3: Least Squares Algebra


Second Order Conditions
Necessary condition: first derivatives = 0
∂[(y - Xb)'(y - Xb)]/∂b = -2X'(y - Xb)

Sufficient condition: second derivatives ...
∂2[(y - Xb)'(y - Xb)]/∂b∂b' = ∂[-2X'(y - Xb)]/∂b'
                            = ∂[-2X'y + 2X'Xb]/∂b'
                            = 2X'X
(The derivative of a K x 1 column vector with respect to a
1 x K row vector is a K x K matrix.)
3-14/29 Part 3: Least Squares Algebra
Side Result: Sample Moments

X'X = [ Σi xi1^2     Σi xi1 xi2   ...   Σi xi1 xiK ]
      [ Σi xi2 xi1   Σi xi2^2     ...   Σi xi2 xiK ]
      [    ...          ...       ...      ...     ]
      [ Σi xiK xi1   Σi xiK xi2   ...   Σi xiK^2   ]

    = Σi [ xi1^2      xi1 xi2    ...   xi1 xiK ]
         [ xi2 xi1    xi2^2      ...   xi2 xiK ]
         [    ...        ...     ...      ...  ]
         [ xiK xi1    xiK xi2    ...   xiK^2   ]

    = Σi [ xi1 ]
         [ xi2 ]  [ xi1  xi2  ...  xiK ]
         [ ... ]
         [ xiK ]

    = Σi xi xi'
3-15/29 Part 3: Least Squares Algebra
Does b Minimize e'e?

∂2(e'e)/∂b∂b' = 2X'X = 2 Σi xi xi'
(the K x K matrix of sums of squares and cross products from the previous slide)

If there were a single b, we would require this to be
positive, which it would be:  2x'x = 2 Σi xi2 > 0.  OK.

The matrix counterpart of a positive number is a
positive definite matrix.

3-16/29 Part 3: Least Squares Algebra


A Positive Definite Matrix
Matrix C is positive definite if a'Ca > 0 for every nonzero a.
Generally hard to check. Requires a look at
characteristic roots (later in the course).
For some matrices, it is easy to verify. X'X is
one of these.

a'X'Xa = (a'X')(Xa) = (Xa)'(Xa) = v'v = Σk vk2  ≥  0

Could v = 0?  v = 0 means Xa = 0 for some nonzero a. Is this possible?
No, as long as the columns of X are linearly independent (X has full column rank).

Conclusion: b = ( X'X)-1 X'y does indeed minimize e'e.
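A small illustration (not a proof; simulated full-column-rank X in NumPy) that quadratic forms in X'X are positive and that all of its eigenvalues are positive.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 4))
XtX = X.T @ X

for _ in range(5):
    a = rng.normal(size=4)
    print(a @ XtX @ a > 0)                   # True each time: a'(X'X)a > 0

print(np.all(np.linalg.eigvalsh(XtX) > 0))   # all eigenvalues positive => positive definite
```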

3-17/29 Part 3: Least Squares Algebra


Algebraic Results - 1

In the population:  E[X'ε] = 0

In the sample:  (1/n) Σi xi ei = 0

X'e = 0 means that for each column xk of X, xk'e = 0.

(1) Each column of X is orthogonal to e.
(2) One of the columns of X is a column of ones, so
    i'e = Σi ei = 0. The residuals sum to zero.
(3) It follows that (1/n) Σi ei = 0, which mimics E[εi] = 0.

3-18/29 Part 3: Least Squares Algebra


Residuals vs. Disturbances
Disturbances (population):  yi = xi'β + εi
Partitioning y:  y = E[y | X] + ε
              =  conditional mean + disturbance
Residuals (sample):  yi = xi'b + ei
Partitioning y:  y = Xb + e
              =  projection + residual
(Note: projection into the column space of X, i.e., the
set of linear combinations of the columns of X. Xb is one of these.)

3-19/29 Part 3: Least Squares Algebra


Algebraic Results - 2

•  A "residual maker":  M = I - X(X'X)-1X'
•  e = y - Xb = y - X(X'X)-1X'y = My
•  My = the residuals that result when y is regressed on X
•  MX = 0  (This result is fundamental!)
   How do we interpret this result in terms of residuals?
   When a column of X is regressed on all of X, we get a
   perfect fit and zero residuals.
•  (Therefore)  My = MXb + Me = Me = e.
   (You should be able to prove this.)
•  y = Py + My, where P = X(X'X)-1X' = I - M, and
   PM = MP = 0.
•  Py is the projection of y into the column space of X.
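A sketch of these identities on a small simulated data set (NumPy); forming P and M explicitly is only sensible for illustration at this scale.

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto the column space of X
M = np.eye(n) - P                      # residual maker

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(np.allclose(M @ y, e))           # My = e
print(np.allclose(P @ y + M @ y, y))   # y = Py + My
print(np.allclose(M @ X, 0))           # MX = 0
print(np.allclose(P @ M, 0))           # PM = 0
```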

3-20/29 Part 3: Least Squares Algebra


The M Matrix

•  M = I - X(X'X)-1X' is an n x n matrix

•  M is symmetric: M = M'
•  M is idempotent: MM = M
   (just multiply it out)
•  M is singular: M-1 does not exist.
   (We will prove this later as a side result
   in another derivation.)
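A quick numerical check of these three properties (simulated data, NumPy); singularity shows up as rank n - K < n.

```python
import numpy as np

rng = np.random.default_rng(7)
n, K = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(M, M.T))               # symmetric
print(np.allclose(M @ M, M))             # idempotent
print(np.linalg.matrix_rank(M), n - K)   # rank equals n - K < n, so M is singular
```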

3-21/29 Part 3: Least Squares Algebra


Results when X Contains a Constant Term

•  X = [1, x2, ..., xK]
•  The first column of X is a column of ones.
•  Since X'e = 0, x1'e = 0: the residuals sum to zero.

y = Xb + e
Define i = [1, 1, ..., 1]', a column of n ones.
i'y = Σi yi = n ȳ
i'y = i'Xb + i'e = i'Xb, so (1/n)i'y = (1/n)i'Xb, which
implies
ȳ = x̄'b  (the regression line passes through the means).
These do not apply if the model has no constant term.
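A sketch (simulated data, NumPy) contrasting the two cases: with a constant term the residuals sum to zero and the fitted line passes through (x̄, ȳ); without one, neither property holds in general.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 100
x = rng.normal(loc=2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

# With a constant term
X1 = np.column_stack([np.ones(n), x])
b1 = np.linalg.solve(X1.T @ X1, X1.T @ y)
e1 = y - X1 @ b1
print(e1.sum())                          # ~0: residuals sum to zero
print(y.mean() - X1.mean(axis=0) @ b1)   # ~0: line passes through the means

# Without a constant term
X0 = x.reshape(-1, 1)
b0 = np.linalg.solve(X0.T @ X0, X0.T @ y)
e0 = y - X0 @ b0
print(e0.sum())                          # generally not zero
```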

3-22/29 Part 3: Least Squares Algebra


U.S. Gasoline Market, 1960-1995

3-23/29 Part 3: Least Squares Algebra


Least Squares Algebra

3-24/29 Part 3: Least Squares Algebra


Least Squares

3-25/29 Part 3: Least Squares Algebra


Residuals

3-26/29 Part 3: Least Squares Algebra


Least Squares Residuals (autocorrelated)

3-27/29 Part 3: Least Squares Algebra


Least Squares Algebra-3

I X XX X M

M is n  n potentially huge
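A sketch (simulated data, NumPy) of the practical implication: the residuals e = My are computed as y - Xb, which requires only K x K and K x 1 arrays, so the n x n matrix M is never formed.

```python
import numpy as np

rng = np.random.default_rng(9)
n, K = 1_000_000, 5                     # an explicit n x n M would have 10^12 entries
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ rng.normal(size=K) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)   # only a K x K system is solved
e = y - X @ b                           # equals My, without ever forming M
print(e.shape, np.abs(X.T @ e).max())   # residual vector; X'e is numerically ~0
```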

3-28/29 Part 3: Least Squares Algebra


Least Squares Algebra-4

MX = 0

3-29/29 Part 3: Least Squares Algebra
