
Econometrics II, Fall 2017

Department of Economics, University of Copenhagen


Morten Nyboe Tabor

Solution Guide

#5.3 OLS, IV and Linear GMM Estimation


Consider the linear regression model

$$y_t = x_{1t}'\beta_1 + x_{2t}'\beta_2 + \varepsilon_t = x_t'\beta + \varepsilon_t, \qquad t = 1, 2, ..., T, \qquad (5.3)$$

where $x_{1t}$ is a $K_1 \times 1$ vector, $x_{2t}$ is a $K_2 \times 1$ vector, while $x_t = (x_{1t}', x_{2t}')'$ and $\beta = (\beta_1', \beta_2')'$ are both $K \times 1$ vectors with $K = K_1 + K_2$. Assume that $y_t$ and $x_t$ are stationary and weakly dependent, so that the usual statistical results hold.

(1) (OLS) State the minimal conditions for consistency of the OLS estimator in the regression model (5.3). How is this related to the interpretation of (5.3) as a conditional expectation: $E[y_t \mid x_t] = x_t'\beta$?
Discuss how this assumption can be used to construct moment conditions,

$$g(\beta) = E[f(y_t, x_t, \beta)] = 0,$$

for estimating $\beta$. Write the corresponding sample moment conditions,

$$g_T(\beta) = \frac{1}{T}\sum_{t=1}^{T} f(y_t, x_t, \beta) = 0,$$

and derive the OLS estimator, $\hat\beta_{OLS}$.

Now assume that the $K_2$ variables in $x_{2t}$ are endogenous, in the sense

$$E[x_{2t}\varepsilon_t] \neq 0.$$

How does that affect the properties of OLS?

Solution: The minimal condition for consistency of the OLS estimator in the linear regression model (5.3) is that the moment condition $E(x_t\varepsilon_t) = 0$ holds.
The assumption of predeterminedness, $E(\varepsilon_t \mid x_t) = 0$, implies that (5.3) can be interpreted as a conditional expectation, $E(y_t \mid x_t) = x_t'\beta$. The assumption of predeterminedness also implies the moment condition,

$$E(x_t\varepsilon_t) = E(E(x_t\varepsilon_t \mid x_t)) = E(x_t E(\varepsilon_t \mid x_t)) = E(x_t \cdot 0) = E(0) = 0,$$

where we have used the predeterminedness assumption $E(\varepsilon_t \mid x_t) = 0$ in the step from the third to the fourth expression.
We conclude that if the linear regression model can be interpreted as representing the conditional expectation of $y_t$ given $x_t$ (so the predeterminedness assumption holds), then the OLS estimator is consistent, as the moment condition $E(x_t\varepsilon_t) = 0$ is implied by the predeterminedness assumption.

The $K$ moment conditions are given by,

$$g(\beta) = E(f(w_t, z_t, \beta)) = E(x_t\varepsilon_t) = 0, \qquad (5.4)$$

which corresponds to the simple linear expression for the function, $f(w_t, z_t, \beta) = x_t\varepsilon_t$, where $w_t$ are the model variables, $w_t = (y_t, x_t')'$, and $z_t$ are the instruments (here there are no instruments beyond the regressors, so the regressors act as their own instruments). As there are $K$ moment conditions and $K$ parameters in $\beta$, we have exact identification and we can find a unique set of parameters, $\beta$, which satisfies the population moment conditions in (5.4). By plugging in $\varepsilon_t = y_t - x_t'\beta$, we get,

$$\begin{aligned}
E(x_t(y_t - x_t'\beta)) &= E(x_t y_t - x_t x_t'\beta) \\
&= E(x_t y_t) - E(x_t x_t'\beta) \\
&= E(x_t y_t) - E(x_t x_t')\beta \\
&= 0, \qquad (5.5)
\end{aligned}$$

from which we find the population expression for the parameters $\beta$,

$$\beta = E(x_t x_t')^{-1} E(x_t y_t),$$

provided that there is no perfect multicollinearity in $x_t$, so that $E(x_t x_t')$ is non-singular and can be inverted.
The corresponding sample moment conditions are given by,

$$g_T(\beta) = \frac{1}{T}\sum_{t=1}^{T} f(w_t, z_t, \beta) = \frac{1}{T}\sum_{t=1}^{T} x_t\varepsilon_t = 0. \qquad (5.6)$$

As we have exact identification ($K$ equations and $K$ parameters), we can solve the sample moment conditions and find the method of moments (MM) estimator, $\hat\beta_{MM}$. Again, we plug in $\varepsilon_t = y_t - x_t'\beta$, to get,

$$\begin{aligned}
g_T(\hat\beta) &= \frac{1}{T}\sum_{t=1}^{T} x_t\varepsilon_t \\
&= \frac{1}{T}\sum_{t=1}^{T} x_t(y_t - x_t'\hat\beta) \\
&= \frac{1}{T}\sum_{t=1}^{T} x_t y_t - \frac{1}{T}\sum_{t=1}^{T} x_t x_t'\hat\beta \\
&= 0, \qquad (5.7)
\end{aligned}$$

from which we find,

$$\begin{aligned}
\frac{1}{T}\sum_{t=1}^{T} x_t x_t'\hat\beta &= \frac{1}{T}\sum_{t=1}^{T} x_t y_t \\
\hat\beta_{MM} &= \Big(\sum_{t=1}^{T} x_t x_t'\Big)^{-1} \sum_{t=1}^{T} x_t y_t, \qquad (5.8)
\end{aligned}$$

which corresponds to the OLS estimator, $\hat\beta_{OLS}$. To compute the estimator, the matrix $\sum_{t=1}^{T} x_t x_t'$ must be non-singular, so that it can be inverted, so we assume that there is no perfect multicollinearity in $x_t$.

Now, assume that the $K_2$ variables in $x_{2t}$ are endogenous, so that,

$$E(x_{2t}\varepsilon_t) \neq 0.$$

This implies that the moment condition fails, $E(x_t\varepsilon_t) \neq 0$, where $x_t = (x_{1t}', x_{2t}')'$, so the MM/OLS estimator is inconsistent.
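The mapping from the sample moment conditions to the estimator is easy to verify numerically. Below is a minimal sketch (our own illustration, not part of the original solution guide; the simulated data-generating process and all variable names are assumptions) that solves (5.6) for $\hat\beta_{MM}$ and confirms that it coincides with OLS.

```python
import numpy as np

# Minimal sketch: solve the sample moment conditions (1/T) sum_t x_t e_t = 0
# for a simulated model y_t = x_t' beta + eps_t. The DGP below is assumed
# purely for illustration.
rng = np.random.default_rng(0)
T, K = 500, 3
X = rng.normal(size=(T, K))              # exogenous regressors, E[x_t eps_t] = 0
beta_true = np.array([1.0, -0.5, 2.0])
y = X @ beta_true + rng.normal(size=T)

# MM estimator, eq. (5.8): (sum_t x_t x_t')^{-1} sum_t x_t y_t
beta_mm = np.linalg.solve(X.T @ X, X.T @ y)

# Identical to OLS computed as a least-squares problem
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_mm, beta_ols)
print(beta_mm)                           # close to beta_true
```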

(2) (IV) Now we assume the existence of $K_2$ new instrumental variables, $z_{2t}$, with the property

$$E[z_{2t}\varepsilon_t] = 0. \qquad (5.9)$$

Should the new instruments, $z_{2t}$, fulfill other requirements besides (5.9) for being valid and relevant instruments?
Define the $K \times 1$ vector of instruments, $z_t = (x_{1t}', z_{2t}')'$. State the population moment conditions for the instrumental variables (IV) estimator, $\hat\beta_{IV}$, in this model. Write the corresponding sample moment conditions and derive the IV estimator.
Discuss why the simple IV estimator does not work if the number of instruments is larger than the number of parameters.

Solution: The $K_2$ instruments in $z_{2t}$ are valid instruments if they satisfy the moment condition, $E(z_{2t}\varepsilon_t) = 0$, and they are relevant instruments if they are correlated with the endogenous variables, so that $\frac{1}{T}\sum_{t=1}^{T} z_{2t}x_{2t}'$ is non-singular.
We next define the $K \times 1$ vector of instruments, $z_t = (x_{1t}', z_{2t}')'$, which contains the $K_1$ model variables $x_{1t}$ (we say $x_{1t}$ are instruments for themselves) and the $K_2$ new instruments $z_{2t}$. We have the two sets of valid moment conditions,

$$\begin{aligned}
E(x_{1t}\varepsilon_t) &= 0 \qquad (5.10) \\
E(z_{2t}\varepsilon_t) &= 0, \qquad (5.11)
\end{aligned}$$

or equivalently, the population moment conditions are given by,

$$g(\beta) = E(f(w_t, z_t, \beta)) = E(z_t\varepsilon_t) = 0, \qquad (5.12)$$

which is enough to identify the parameters $\beta$, as we have $K$ moment conditions and $K$ parameters. Plugging in for $\varepsilon_t$, we get,

$$\begin{aligned}
g(\beta) &= E(z_t\varepsilon_t) \\
&= E(z_t(y_t - x_t'\beta)) \\
&= E(z_t y_t) - E(z_t x_t'\beta) \\
&= E(z_t y_t) - E(z_t x_t')\beta \\
&= 0, \qquad (5.13)
\end{aligned}$$

which identifies the parameters,

$$\beta = E(z_t x_t')^{-1} E(z_t y_t), \qquad (5.14)$$

provided that $E(z_t x_t')$ is non-singular, so that it can be inverted.


The corresponding sample moment conditions are given by,

$$g_T(\beta) = \frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t. \qquad (5.15)$$

As we have exact identification ($K$ moment conditions and $K$ parameters), we can derive the method of moments estimator from,

$$\begin{aligned}
g_T(\hat\beta) &= \frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t \\
&= \frac{1}{T}\sum_{t=1}^{T} z_t(y_t - x_t'\hat\beta) \\
&= \frac{1}{T}\sum_{t=1}^{T} z_t y_t - \frac{1}{T}\sum_{t=1}^{T} z_t x_t'\hat\beta \\
&= 0, \qquad (5.16)
\end{aligned}$$

which gives the estimator,

$$\hat\beta_{MM} = \Big(\frac{1}{T}\sum_{t=1}^{T} z_t x_t'\Big)^{-1} \frac{1}{T}\sum_{t=1}^{T} z_t y_t, \qquad (5.17)$$

provided that the instruments, $z_t$, are relevant, so that the $K \times K$ matrix $\frac{1}{T}\sum_{t=1}^{T} z_t x_t'$ is non-singular. Note that the method of moments estimator is identical to the usual instrumental variables estimator, $\hat\beta_{IV}$.

Assume that we have instead $R$ instruments, $z_t$, where $R > K$. In that case, we have over-identification and, in general, no solution exists to,

$$g_T(\hat\beta) = \frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t = 0, \qquad (5.18)$$

so we cannot derive a closed-form solution for the estimator. In particular, the $R \times K$ matrix $\frac{1}{T}\sum_{t=1}^{T} z_t x_t'$ is no longer square and therefore cannot be inverted.
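As a hedged numerical illustration of (5.17) (our own sketch; the DGP with one exogenous regressor, one endogenous regressor, and one instrument is assumed), the exactly identified IV estimator recovers the true parameters while OLS does not:

```python
import numpy as np

# Sketch of the exactly identified IV estimator, eq. (5.17). The simulated
# DGP is an assumption for demonstration only.
rng = np.random.default_rng(1)
T = 2000
x1 = rng.normal(size=T)              # exogenous regressor, instrument for itself
z2 = rng.normal(size=T)              # new instrument, E[z_2t eps_t] = 0
eps = rng.normal(size=T)
x2 = 0.8 * z2 + 0.6 * eps + rng.normal(size=T)   # endogenous: E[x_2t eps_t] != 0
y = 1.0 * x1 + 2.0 * x2 + eps

X = np.column_stack([x1, x2])        # x_t = (x_1t, x_2t)'
Z = np.column_stack([x1, z2])        # z_t = (x_1t, z_2t)'

# beta_IV = (sum_t z_t x_t')^{-1} sum_t z_t y_t
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print("IV: ", beta_iv)               # close to (1, 2)
print("OLS:", beta_ols)              # inconsistent under endogeneity
```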

(3) (GMM) Now assume that the number of instruments in $z_t$, $R$, is larger than the number of parameters, $K$. Explain the intuition for the GMM estimator by referring to the quadratic form:

$$Q_T(\beta) = g_T(\beta)' W_T g_T(\beta). \qquad (5.19)$$

What is the role of the weight matrix $W_T$, and how should it be optimally chosen?
State the sample moments, $g_T(\beta)$, for the case $R > K$. Insert the moment conditions in (5.19) and derive the GMM estimator for a given weight matrix, $\hat\beta_{GMM}(W_T)$, as the solution to

$$\frac{\partial Q_T(\beta)}{\partial \beta} = 0.$$

Can you think of any difficulties in implementing GMM estimation in practice?

Solution: We now have $R$ instruments in $z_t$ with $R > K$. Note that a subset of the instruments can be the model variables; for example, $z_t$ can still contain the model variables $x_{1t}$.
We have the $R$ population moment conditions,

$$g(\beta) = E(f(w_t, z_t, \beta)) = E(z_t\varepsilon_t) = 0, \qquad (5.20)$$

and $K$ parameters in $\beta$.

The corresponding sample moment conditions are,

$$g_T(\beta) = \frac{1}{T}\sum_{t=1}^{T} f(w_t, z_t, \beta) = \frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t, \qquad (5.21)$$

but as we have over-identification, $R > K$, there is no solution to $g_T(\beta) = 0$.
To derive the generalized method of moments (GMM) estimator, we instead consider the quadratic form,

$$Q_T(\beta) = g_T(\beta)' W_T g_T(\beta) = \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big)' W_T \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big), \qquad (5.22)$$

where $W_T$ is a symmetric and positive definite weight matrix of dimensions $R \times R$. As $W_T$ is positive definite, a positive weight is attached to all sample moments. The weight matrix, $W_T$, attaches weights to the $R$ sample moments in the quadratic form. Note that as $g_T(\beta)$ is an $R \times 1$ vector of sample moments and $W_T$ has dimensions $R \times R$, the quadratic form $Q_T(\beta)$ is a scalar. Given the data ($y_t$, $x_t$, and $z_t$), some weight matrix, $W_T$, and some values for the parameters, $\beta$, the quadratic form is simply a number. Also note that since $Q_T(\beta)$ is a quadratic form with a positive definite weight matrix, it holds that $Q_T(\beta) \geq 0$.
The GMM estimator is derived by minimizing the quadratic form,

$$\hat\beta_{GMM} = \arg\min_\beta \{Q_T(\beta)\} = \arg\min_\beta \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big)' W_T \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big), \qquad (5.23)$$

which, in this case, we can solve via the first-order conditions,

$$\frac{\partial Q_T(\beta)}{\partial \beta} = \frac{\partial \Big[\Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big)' W_T \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big)\Big]}{\partial \beta} = \underset{(K\times 1)}{0}. \qquad (5.24)$$

Note that as we differentiate a scalar function with respect to a $(K \times 1)$ vector of parameters, $\beta$, there are $K$ first-order conditions. In the case we consider here, the function $f(w_t, z_t, \beta) = z_t\varepsilon_t = z_t y_t - z_t x_t'\beta$ is linear, and we can solve the first-order conditions and find an analytical solution for the GMM estimator, $\hat\beta_{GMM}$. In other cases, for example in non-linear models, numerical procedures are used to find the estimator, as an analytical solution cannot be derived.
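To illustrate the numerical route mentioned above, the following sketch (our own construction; it assumes SciPy is available and simulates an over-identified data set) minimizes $Q_T(\beta)$ with a generic optimizer. For the linear model this simply reproduces the closed-form solution derived below.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch: minimize Q_T(beta) = g_T(beta)' W_T g_T(beta) numerically.
# The over-identified DGP (R = 3 instruments, K = 2 parameters) is assumed.
rng = np.random.default_rng(2)
T, K, R = 1000, 2, 3
Z = rng.normal(size=(T, R))                      # instruments, E[z_t eps_t] = 0
eps = rng.normal(size=T)
X = Z @ rng.normal(size=(R, K)) + 0.5 * eps[:, None] + rng.normal(size=(T, K))
y = X @ np.array([1.0, -1.0]) + eps              # endogenous regressors

def Q_T(beta, W):
    g = Z.T @ (y - X @ beta) / T                 # sample moments, eq. (5.21)
    return g @ W @ g                             # scalar quadratic form, eq. (5.22)

W = np.eye(R)                                    # any symmetric PD weight matrix
res = minimize(Q_T, x0=np.zeros(K), args=(W,))
print(res.x)                                     # close to (1, -1)
```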

To derive the GMM estimator, it is convenient to rewrite the moment conditions in matrix notation.

Defining the matrices,

$$\underset{(T\times 1)}{Y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{pmatrix} \qquad (5.25)$$

$$\underset{(T\times K)}{X} = \begin{pmatrix} x_1' \\ x_2' \\ \vdots \\ x_T' \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \dots & x_{1K} \\ x_{21} & x_{22} & \dots & x_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ x_{T1} & x_{T2} & \dots & x_{TK} \end{pmatrix} \qquad (5.26)$$

$$\underset{(T\times R)}{Z} = \begin{pmatrix} z_1' \\ z_2' \\ \vdots \\ z_T' \end{pmatrix} = \begin{pmatrix} z_{11} & z_{12} & \dots & z_{1R} \\ z_{21} & z_{22} & \dots & z_{2R} \\ \vdots & \vdots & \ddots & \vdots \\ z_{T1} & z_{T2} & \dots & z_{TR} \end{pmatrix} \qquad (5.27)$$

$$\underset{(T\times 1)}{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_T \end{pmatrix}, \qquad (5.28)$$

we can write the model in matrix notation as,

$$\underset{(T\times 1)}{Y} = X\beta + \varepsilon. \qquad (5.29)$$

The $R$ moment conditions can be written as,

$$g(\beta) = E(z_t\varepsilon_t) = E(Z'\varepsilon) = E(Z'(Y - X\beta)) = E(Z'Y - Z'X\beta) = 0, \qquad (5.30)$$

and the sample moment conditions as,

$$g_T(\beta) = \frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t = \frac{1}{T} Z'(Y - X\beta). \qquad (5.31)$$

Finally, the quadratic form can be written as,

$$\begin{aligned}
Q_T(\beta) &= g_T(\beta)' W_T g_T(\beta) \\
&= \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big)' W_T \Big(\frac{1}{T}\sum_{t=1}^{T} z_t\varepsilon_t\Big) \\
&= (T^{-1} Z'(Y - X\beta))' W_T (T^{-1} Z'(Y - X\beta)) \\
&= T^{-2}(Y'Z - \beta'X'Z) W_T (Z'Y - Z'X\beta) \\
&= T^{-2}(Y'Z W_T Z'Y - Y'Z W_T Z'X\beta - \beta'X'Z W_T Z'Y + \beta'X'Z W_T Z'X\beta) \\
&= T^{-2}(Y'Z W_T Z'Y - 2\beta'X'Z W_T Z'Y + \beta'X'Z W_T Z'X\beta), \qquad (5.32)
\end{aligned}$$

where the last step follows as $(Y'Z W_T Z'X\beta)$ and $(\beta'X'Z W_T Z'Y)$ are identical scalar terms.
We differentiate the quadratic form, $Q_T(\beta)$, with respect to the $K$ parameters in the vector $\beta$,

$$\begin{aligned}
\frac{\partial Q_T(\beta)}{\partial \beta} &= 0 - 2T^{-2} X'Z W_T Z'Y + T^{-2}(X'Z W_T Z'X + (X'Z W_T Z'X)')\beta \\
&= -2T^{-2} X'Z W_T Z'Y + T^{-2}(X'Z W_T Z'X + X'Z W_T Z'X)\beta \\
&= -2T^{-2} X'Z W_T Z'Y + T^{-2}(2 X'Z W_T Z'X)\beta \\
&= -2T^{-2} X'Z W_T Z'Y + 2T^{-2} X'Z W_T Z'X\beta, \qquad (5.33)
\end{aligned}$$

where we have used $(7^*)$ and $(8^*)$ in Lecture Note 3, Introduction to Vector and Matrix Differentiation, in the first line.
We find the GMM estimator as the solution to the first-order conditions,

$$\begin{aligned}
-2T^{-2} X'Z W_T Z'Y + 2T^{-2} X'Z W_T Z'X\hat\beta &= 0 \\
2T^{-2} X'Z W_T Z'X\hat\beta &= 2T^{-2} X'Z W_T Z'Y \\
\hat\beta_{GMM} &= (X'Z W_T Z'X)^{-1} X'Z W_T Z'Y. \qquad (5.34)
\end{aligned}$$

Note that the GMM estimator depends on the weight matrix $W_T$.

The efficient weight matrix (or the optimal weight matrix) minimizes the asymptotic variance of the GMM estimator by setting,

$$W_T^{opt} = S_T^{-1}, \qquad (5.35)$$

where $S_T$ is an estimator of the asymptotic variance of the moments, $S = T \cdot V(g_T(\beta))$. Hence, moments with a smaller variance receive a greater weight, as they are more informative. How to estimate $S_T$ depends on whether the moments are (1) independent and identically distributed (IID) over time, (2) independent over time, but heteroskedastic, or (3) autocorrelated and heteroskedastic over time. In practice, choosing the weight matrix $W_T$ is often a difficult part of GMM estimation.
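In code, the closed form (5.34) is one line of linear algebra once $W_T$ is chosen. The sketch below (our own illustration) also shows the common two-step pattern: a first step with a simple positive definite weight matrix, then the efficient weight matrix $W_T = S_T^{-1}$ estimated from first-step residuals, here in the heteroskedasticity-robust form. This implementation pattern is standard in the GMM literature but is stated as an assumption, not taken verbatim from this guide.

```python
import numpy as np

def gmm_linear(y, X, Z, W):
    """Closed-form linear GMM estimator, eq. (5.34):
    beta_hat = (X'Z W Z'X)^{-1} X'Z W Z'Y."""
    XZ = X.T @ Z
    return np.linalg.solve(XZ @ W @ Z.T @ X, XZ @ W @ Z.T @ y)

def gmm_two_step(y, X, Z):
    """Two-step GMM: first-step weight (Z'Z/T)^{-1}, then W = S_T^{-1} with
    S_T = (1/T) sum_t e_t^2 z_t z_t' from first-step residuals."""
    T = len(y)
    W1 = np.linalg.inv(Z.T @ Z / T)          # simple PD first-step weight
    b1 = gmm_linear(y, X, Z, W1)
    e = y - X @ b1                           # first-step residuals
    S = (Z * (e**2)[:, None]).T @ Z / T      # robust estimate of S
    return gmm_linear(y, X, Z, np.linalg.inv(S))
```

Applied to the simulated data from the previous sketch, `gmm_two_step(y, X, Z)` returns estimates close to the true parameter vector.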

(4) (GIV) Discuss how the optimal weight matrix, $W_T^{opt}$, can be estimated if $\varepsilon_t$ is identically and independently distributed, IID.
Insert the optimal weight matrix in the formula for the estimator to obtain $\hat\beta_{GMM}(W_T^{opt})$ and show that it simplifies to the generalized IV estimator. Show that the GIV estimator, $\hat\beta_{GIV}$, can be derived as a two-stage least squares estimator.

Solution: If we assume that $\varepsilon_t$ is identically and independently distributed (IID) over time with variance $\sigma^2$, then the asymptotic variance of the moments, $S$, is given by,

$$\begin{aligned}
S &= T \cdot V(g_T(\beta)) \\
&= T \cdot V\Big(\frac{1}{T}\sum_{t=1}^{T} f(w_t, z_t, \beta)\Big) \\
&= \frac{1}{T} \cdot V\Big(\sum_{t=1}^{T} f(w_t, z_t, \beta)\Big) \\
&= \frac{1}{T} \cdot \sum_{t=1}^{T} V\big(f(w_t, z_t, \beta)\big) \\
&= \frac{1}{T} \cdot \sum_{t=1}^{T} E\big(f(w_t, z_t, \beta) f(w_t, z_t, \beta)'\big) \\
&= \frac{1}{T} \cdot \sum_{t=1}^{T} E\big(z_t\varepsilon_t(z_t\varepsilon_t)'\big) \\
&= \frac{1}{T} \cdot \sum_{t=1}^{T} E\big(z_t\varepsilon_t\varepsilon_t' z_t'\big) \\
&= \frac{1}{T} \cdot \sum_{t=1}^{T} E\big(\varepsilon_t^2 z_t z_t'\big) \\
&= \frac{\sigma^2}{T} \cdot \sum_{t=1}^{T} E\big(z_t z_t'\big) \\
&= \frac{\sigma^2}{T} Z'Z, \qquad (5.36)
\end{aligned}$$

where the fourth line uses independence over time, the fifth uses that the moments have mean zero, and the last line replaces the expectation $E(z_t z_t')$ by its sample counterpart, so that $\sum_{t=1}^{T} z_t z_t' = Z'Z$. Given that $\varepsilon_t$ is assumed IID with variance $\sigma^2$, a natural consistent estimator of $\sigma^2$ is given by $\hat\sigma^2 = T^{-1}\sum_{t=1}^{T}\hat\varepsilon_t^2$, and a consistent estimator of $S$ is,

$$S_T = \frac{\hat\sigma^2}{T} Z'Z. \qquad (5.37)$$
The optimal weight matrix becomes,

$$W_T^{opt} = S_T^{-1} = (T^{-1}\hat\sigma^2 Z'Z)^{-1}, \qquad (5.38)$$

and we find the GMM estimator as,

$$\begin{aligned}
\hat\beta_{GMM} &= (X'Z W_T Z'X)^{-1} X'Z W_T Z'Y \\
&= \big(X'Z(T^{-1}\hat\sigma^2 Z'Z)^{-1} Z'X\big)^{-1} X'Z(T^{-1}\hat\sigma^2 Z'Z)^{-1} Z'Y \\
&= \big(X'Z(\hat\sigma^2 Z'Z)^{-1} Z'X\big)^{-1} X'Z(\hat\sigma^2 Z'Z)^{-1} Z'Y \\
&= \big(X'Z(Z'Z)^{-1} Z'X\big)^{-1} X'Z(Z'Z)^{-1} Z'Y, \qquad (5.39)
\end{aligned}$$

where the scalar factors $T^{-1}$ and $\hat\sigma^2$ cancel between the two terms. This is identical to the generalized instrumental variables (GIV) estimator, $\hat\beta_{GIV}$.

Finally, we want to show that the GIV estimator can be derived as a two-stage least squares (2SLS) estimator. The 2SLS estimator is intuitively easy to understand. In the first step, we regress $X$ on the instruments $Z$ and save the predicted values, $\hat X$. In the second step, we regress $Y$ on the predicted values from the first step, $\hat X$, to get an estimate of the model parameters $\beta$.
We consider again the model in matrix notation,

$$Y = X\beta + \varepsilon, \qquad (5.40)$$

and are interested in estimating the $K$ parameters $\beta$ using the $R$ instruments $Z$.


Step 1: Regress $X$ on the instruments $Z$,

$$X = Z\gamma + U, \qquad (5.41)$$

which gives the OLS estimates,

$$\hat\gamma = (Z'Z)^{-1} Z'X, \qquad (5.42)$$

and the predicted values,

$$\hat X = Z\hat\gamma = Z(Z'Z)^{-1} Z'X. \qquad (5.43)$$

Step 2: Regress $Y$ on the predicted values, $\hat X$,

$$Y = \hat X B + E, \qquad (5.44)$$

which gives the OLS estimates,

$$\hat B = (\hat X'\hat X)^{-1} \hat X' Y. \qquad (5.45)$$

Plugging in the expression for $\hat X$, we find the 2SLS estimator,

$$\begin{aligned}
\hat B_{2SLS} &= (\hat X'\hat X)^{-1} \hat X' Y \\
&= \big((Z(Z'Z)^{-1}Z'X)'\, Z(Z'Z)^{-1}Z'X\big)^{-1} (Z(Z'Z)^{-1}Z'X)'\, Y \\
&= \big((X'Z(Z'Z)^{-1}Z')\, Z(Z'Z)^{-1}Z'X\big)^{-1} (X'Z(Z'Z)^{-1}Z')\, Y \\
&= \big(X'Z(Z'Z)^{-1}Z'X\big)^{-1} X'Z(Z'Z)^{-1}Z'Y, \qquad (5.46)
\end{aligned}$$

which is identical to the GIV estimator and the GMM estimator based on the optimal weight matrix derived under the assumption of IID residuals.
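The algebraic equivalence of GIV and 2SLS can also be checked numerically. The sketch below (our own illustration; the over-identified DGP is assumed) computes the estimator both directly via (5.39)/(5.46) and literally in two OLS steps, and verifies that the two routes coincide.

```python
import numpy as np

def giv(y, X, Z):
    """GIV estimator, eq. (5.39)/(5.46):
    (X'Z(Z'Z)^{-1}Z'X)^{-1} X'Z(Z'Z)^{-1}Z'Y."""
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)    # projection matrix Z(Z'Z)^{-1}Z'
    return np.linalg.solve(X.T @ P @ X, X.T @ P @ y)

def two_stage_ls(y, X, Z):
    """The same estimator computed literally in two OLS steps."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)          # step 1 fit, (5.43)
    return np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)   # step 2, eq. (5.45)

# Quick check on simulated over-identified data (DGP assumed for illustration).
rng = np.random.default_rng(3)
T, K, R = 500, 2, 4
Z = rng.normal(size=(T, R))
eps = rng.normal(size=T)
X = Z @ rng.normal(size=(R, K)) + 0.5 * eps[:, None] + rng.normal(size=(T, K))
y = X @ np.array([1.0, -1.0]) + eps
assert np.allclose(giv(y, X, Z), two_stage_ls(y, X, Z))
print(giv(y, X, Z))                          # close to (1, -1)
```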
