
Pesaran Common Correlated Effects Estimation

This document summarizes a working paper that extends the common correlated effects (CCE) approach to estimate heterogeneous dynamic panel data models with weakly exogenous regressors. The CCE mean group estimator remains valid if a sufficient number of lags of cross-sectional averages are included in individual equations and the number of cross-sectional averages is at least as large as the number of unobserved common factors. The paper establishes consistency rates, derives the asymptotic distribution, and considers bias correction methods. Extensive Monte Carlo experiments show the proposed estimators perform well when the time series dimension is sufficiently large.


Chudik, Alexander; Pesaran, M. Hashem

Working Paper
Common Correlated Effects Estimation of
Heterogeneous Dynamic Panel Data Models with
Weakly Exogenous Regressors

CESifo Working Paper, No. 4232

Provided in Cooperation with:


Ifo Institute – Leibniz Institute for Economic Research at the University of Munich

Suggested Citation: Chudik, Alexander; Pesaran, M. Hashem (2013) : Common Correlated Effects Estimation of Heterogeneous Dynamic Panel Data Models with Weakly Exogenous Regressors, CESifo Working Paper, No. 4232, Center for Economic Studies and ifo Institute (CESifo), Munich

This Version is available at:


http://hdl.handle.net/10419/74513

Terms of use:

Documents in EconStor may be saved and copied for your personal and scholarly purposes.

You are not to copy documents for public or commercial purposes, to exhibit the documents publicly, to make them publicly available on the internet, or to distribute or otherwise use the documents in public.

If the documents have been made available under an Open Content Licence (especially Creative Commons Licences), you may exercise further usage rights as specified in the indicated licence.

CATEGORY 12: EMPIRICAL AND THEORETICAL METHODS
MAY 2013

An electronic version of the paper may be downloaded
• from the SSRN website: www.SSRN.com
• from the RePEc website: www.RePEc.org
• from the CESifo website: www.CESifo-group.org/wp
Abstract

This paper extends the Common Correlated Effects (CCE) approach developed by Pesaran
(2006) to heterogeneous panel data models with lagged dependent variable and/or weakly
exogenous regressors. We show that the CCE mean group estimator continues to be valid but
the following two conditions must be satisfied to deal with the dynamics: a sufficient number
of lags of cross section averages must be included in individual equations of the panel, and
the number of cross section averages must be at least as large as the number of unobserved
common factors. We establish consistency rates, derive the asymptotic distribution, suggest
using covariates to deal with the effects of multiple unobserved common factors, and consider
jackknife and recursive de-meaning bias correction procedures to mitigate the small sample
time series bias. Theoretical findings are accompanied by extensive Monte Carlo experiments,
which show that the proposed estimators perform well so long as the time series dimension of
the panel is sufficiently large.
JEL-Code: C310, C330.
Keywords: large panels, lagged dependent variable, cross sectional dependence, coefficient
heterogeneity, estimation and inference, common correlated effects, unobserved common
factors.

Alexander Chudik
Federal Reserve Bank of Dallas
2200 N. Pearl Street
Dallas / Texas / USA
[email protected]

M. Hashem Pesaran
Department of Economics
University of Southern California
3620 South Vermont Ave
Los Angeles / California / USA
[email protected]

May 2013
We are grateful to Ron Smith, Vanessa Smith, Takashi Yamagata and Qiankun Zhou for
helpful comments. In writing this paper, Chudik benefited from a visit to the Center for
Applied Financial Economics (CAFE). Pesaran acknowledges financial support from ESRC
grant no. ES/I031626/1. The views expressed in this paper are those of the authors and do not
necessarily reflect those of the Federal Reserve Bank of Dallas or the Federal Reserve
System.
1 Introduction

In a recent paper, Pesaran (2006) proposed the Common Correlated Effects (CCE) approach to estimation of panel data models with multi-factor error structure, which has been further developed by Kapetanios, Pesaran, and Yamagata (2011), Pesaran and Tosetti (2011), and Chudik, Pesaran, and Tosetti (2011). The CCE method is shown to be robust to different types of cross section dependence of errors, possible unit roots in factors, and slope heterogeneity. However, the CCE approach as it was originally proposed does not cover the case where the panel includes a lagged dependent variable and/or weakly exogenous variables as regressors.[1] This paper extends the CCE approach to allow for such regressors. This extension is not straightforward because coefficient heterogeneity in the lags of the dependent variable introduces infinite order lag polynomials in the large $N$ relationships between cross-sectional averages and the unobserved factors (Chudik and Pesaran, 2013a). Our focus is on stationary heterogeneous panels with weakly exogenous regressors where the cross-sectional dimension ($N$) and the time series dimension ($T$) are sufficiently large. We focus on estimation and inference of the mean coefficients, and consider the application of bias correction techniques to deal with the small $T$ bias of the estimators.

Recent literature on large dynamic panels focuses mostly on how to deal with cross-sectional (CS) dependence assuming slope homogeneity. Estimation of panel data models with lagged dependent variables and cross-sectionally dependent errors has been considered in Moon and Weidner (2010a and 2010b), who propose a Gaussian quasi maximum likelihood estimator (QMLE).[2] Moon and Weidner's analysis assumes homogeneous coefficients, and therefore is not applicable to dynamic panels with heterogeneous coefficients.[3] Similarly, the interactive-effects estimator (IFE) developed by Bai (2009) also allows for cross-sectionally dependent errors, but assumes homogeneous slopes.[4] Song (2013) extends the analysis of Bai (2009) by allowing for a lagged dependent variable as well as coefficient heterogeneity, but provides results on the estimation of cross-section specific coefficients only. This paper provides an alternative CCE type estimation approach to Song's extension of the IFE estimator. In addition, we propose a mean group estimator of the mean coefficients, and show that CCE type estimators, once augmented with a sufficient number of lags and cross-sectional averages, perform well even in the case of dynamic models with weakly exogenous regressors. We also show that the asymptotic distribution of the CCE estimators developed in the literature continues to be applicable to the more general setting considered in this paper. Our method could extend to Song's IFE and we also investigate the performance of the mean group estimator based on Song's unit-specific coefficient estimates.

[1] See Everaert and Groote (2012) who derive asymptotic bias of CCE pooled estimators in the case of dynamic homogeneous panels.
[2] See also Lee, Moon, and Weidner (2011) for an extension of this framework to panels with measurement errors.
[3] Pesaran and Smith (1995) show that in the presence of coefficient heterogeneity pooled estimators are inconsistent in the case of panel data models with lagged dependent variables.
[4] Earlier literature on large panels typically ignores cross section dependence of errors, including pooled mean group estimation proposed by Pesaran, Shin, and Smith (1999), fully modified OLS estimation by Pedroni (2000) or the panel dynamic OLS estimation by Mark and Sul (2003). These papers can also handle panels with nonstationary data. There is also a large literature on dynamic panels with large $N$ but finite $T$, which assumes homogeneous slopes.

More specifically, in this paper we consider estimation of autoregressive distributed lag (ARDL) panel data models where the dependent variable of the $i$th cross section unit at time $t$, $y_{it}$, is explained by its lagged values, current and lagged values of $k$ weakly exogenous regressors, $x_{it}$, $m$ unobserved (possibly serially correlated) common factors, $f_t$, and a serially uncorrelated idiosyncratic error. In addition to the regressors included in the panel ARDL model, following Pesaran, Smith, and Yamagata (2013) we also assume that there exists a set of additional covariates, $g_{it}$, that are affected by the same set of unobserved common factors, $f_t$. This seems reasonable considering that agents in making their decisions face a common set of factors such as technology, institutional set-ups and general economic conditions, which then get manifested in many variables, whether included in the panel data model under consideration or not. Similar arguments also underlie forecasting using a large number of regressors popularized recently in econometrics by Stock and Watson (2002) and Forni et al. (2005).

A necessary condition for the CCE mean group (CCEMG) estimator to be valid in the case of ARDL panel data models is that the number of cross-sectional averages based on $x_{it}$ and $g_{it}$ must be at least as large as the number of unobserved common factors minus one ($m-1$). In practice, where the number of unobserved factors is unknown, it is sufficient to assume that the number of available cross-sectional averages is at least $m_{\max}-1$, where $m_{\max}$ denotes the assumed maximum number of unobserved factors. In most economic applications $m_{\max}$ is likely to be relatively small.[5]

We also report on the small sample properties of CCEMG estimators for panel ARDL models, using a comprehensive set of Monte Carlo experiments. In particular, we investigate two bias correction methods, namely the half-panel jackknife due to Dhaene and Jochmans (2012), and the recursive mean adjustment due to So and Shin (1999). We find that the proposed estimators have satisfactory performance under different dynamic parameter configurations, and regardless of the number of unobserved factors, so long as they do not exceed the number of cross-sectional averages, and the time dimension is sufficiently large. We compare the performance of CCEMG with the mean group estimator based on Song's IFE, and also with Moon and Weidner's QMLE, and Bai's IFE estimators developed for slope homogeneous ARDL panels. We find that jackknife bias correction is more effective in dealing with the small sample bias than the recursive mean adjustment procedure. Also, the bias correction seems to be helpful only for the coefficients of the lagged dependent variable. The uncorrected CCEMG estimators of the coefficients of the regressors, $x_{it}$, seem to work fine even in the case of panels with a relatively small time dimension.

[5] Stock and Watson (2002), Giannone, Reichlin, and Sala (2005) conclude that only few, perhaps two, factors explain much of the predictable variations, while Bai and Ng (2007) estimate four factors and Stock and Watson (2005) estimate as many as seven factors.

The remainder of the paper is organized as follows. Section 2 extends the multifactor residual panel data model considered in Pesaran (2006) by introducing lagged dependent variables and allowing the regressors to be weakly exogenous. Section 3 develops a dynamic version of the CCEMG estimator for panel ARDL models. Section 4 discusses the jackknife and recursive de-meaning bias correction procedures. Section 5 introduces the mean group estimator based on Song's individual estimates, describes the Monte Carlo experiments, and reports the small sample results. Mathematical proofs are provided in the Appendix and additional Monte Carlo findings are provided in a Supplement.

2 Panel ARDL Model with a Multifactor Error Structure

Suppose that the dependent variable, $y_{it}$, the regressors, $x_{it}$, and the covariates, $g_{it}$, are generated according to the following linear covariance stationary dynamic heterogeneous panel data model,

$$y_{it} = c_{yi} + \phi_i y_{i,t-1} + \beta_{0i}' x_{it} + \beta_{1i}' x_{i,t-1} + u_{it}, \tag{1}$$

$$u_{it} = \gamma_i' f_t + \varepsilon_{it}, \tag{2}$$

and

$$\omega_{it} = \begin{pmatrix} x_{it} \\ g_{it} \end{pmatrix} = c_{\omega i} + \alpha_i y_{i,t-1} + \Gamma_i' f_t + v_{it}, \tag{3}$$

for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, where $c_{yi}$ and $c_{\omega i}$ are individual fixed effects for unit $i$, $x_{it}$ is a $k_x \times 1$ vector of regressors specific to cross-section unit $i$ at time $t$, $g_{it}$ is a $k_g \times 1$ vector of covariates specific to unit $i$, $k_x + k_g = k$, $f_t$ is an $m \times 1$ vector of unobserved common factors, $\varepsilon_{it}$ are the idiosyncratic errors, $\Gamma_i$ is an $m \times k$ matrix of factor loadings, $\alpha_i$ is a $k \times 1$ vector of unknown coefficients, and $v_{it}$ is assumed to follow a general linear covariance stationary process distributed independently of the idiosyncratic errors, $\varepsilon_{it}$.
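To make the data generating process in (1)-(3) concrete, the following is a minimal simulation sketch in Python with NumPy. The function name `simulate_panel_ardl`, the AR(1) law for the factors, the uniform and normal distributions, and every numeric value are illustrative assumptions rather than the design used in the paper.

```python
import numpy as np

def simulate_panel_ardl(N=50, T=100, kx=1, kg=1, m=2, burn=50, seed=0):
    """Simulate (y, x, g) from equations (1)-(3); all parameter values are hypothetical."""
    rng = np.random.default_rng(seed)
    k = kx + kg
    TT = T + burn
    # Common factors f_t: a stationary AR(1), chosen only for illustration.
    f = np.zeros((TT, m))
    for t in range(1, TT):
        f[t] = 0.6 * f[t - 1] + rng.standard_normal(m)
    # Heterogeneous coefficients, drawn independently across units.
    phi = rng.uniform(0.0, 0.8, N)            # phi_i, support inside the unit circle
    beta0 = rng.uniform(0.5, 1.0, (N, kx))    # beta_{0i}
    beta1 = rng.uniform(-0.5, 0.0, (N, kx))   # beta_{1i}
    alpha = rng.uniform(0.0, 0.1, (N, k))     # feedback alpha_i; zeros give strict exogeneity
    gamma = 1.0 + 0.5 * rng.standard_normal((N, m))      # gamma_i
    Gamma = 0.5 + 0.5 * rng.standard_normal((N, m, k))   # Gamma_i (m x k for each unit)
    c_y = rng.standard_normal(N)
    c_w = rng.standard_normal((N, k))
    y = np.zeros((N, TT))
    omega = np.zeros((N, TT, k))
    for t in range(1, TT):
        v = rng.standard_normal((N, k))
        eps = rng.standard_normal(N)
        # Equation (3): omega_it = c_wi + alpha_i * y_{i,t-1} + Gamma_i' f_t + v_it
        omega[:, t, :] = (c_w + alpha * y[:, t - 1, None]
                          + np.einsum("imk,m->ik", Gamma, f[t]) + v)
        x_t, x_lag = omega[:, t, :kx], omega[:, t - 1, :kx]
        # Equations (1)-(2): ARDL(1,1) in y with a multifactor error
        y[:, t] = (c_y + phi * y[:, t - 1] + np.sum(beta0 * x_t, axis=1)
                   + np.sum(beta1 * x_lag, axis=1) + gamma @ f[t] + eps)
    # Drop the burn-in so that start-up values have little influence.
    return y[:, burn:], omega[:, burn:, :kx], omega[:, burn:, kx:]
```

Setting `alpha` to zero in this sketch reproduces the strictly exogenous case, which makes it easy to compare the two regimes on otherwise identical draws.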

The process for the exogenous variables, (3), can also be written equivalently as a panel ARDL model in $\omega_{it}$. But we have chosen to work with this particular specification as it allows us to distinguish between cases of strictly and weakly exogenous regressors in terms of the feedback coefficients, $\alpha_i$. The case of strictly exogenous regressors, covered in Pesaran (2006), refers to the special case when $\alpha_i = 0_{k \times 1}$. As in the earlier literature, the above specification also allows the regressors to be correlated with the unobserved common factors. Lags of $x_{it}$ and $g_{it}$ are not included in (3), but they could be readily included. In order to keep the notation and exposition simple we also abstract from observed common effects, additional lags of the dependent variable, and other deterministic terms in (1) and (3). Such additional regressors can be readily accommodated at the cost of further notational complexity.

In the above ARDL formulation, we specify the same lag orders for $y_{it}$ and $x_{it}$ because it is desirable in empirical applications to start with a balanced lag order to avoid potential problems connected with persistent regressors. It is also worth noting that a number of panel data models investigated in the literature can be derived as special cases of (1)-(3). The analysis of Moon and Weidner (2010a and 2010b) assumes that $\beta_{0i} = \beta_0$, $\beta_{1i} = \beta_1$ and $\phi_i = \phi$. Bai (2009) assumes $\beta_{0i} = \beta_0$, $\beta_{1i} = \beta_1$ and $\phi_i = 0$. Under the restriction

$$\beta_{1i} = -\phi_i \beta_{0i}, \tag{4}$$

we have

$$y_{it} - \theta_i' x_{it} = c_{yi} + \phi_i \left( y_{i,t-1} - \theta_i' x_{i,t-1} \right) + u_{it},$$

where $\theta_i = -\beta_{1i}/\phi_i$, which in turn can be written as (assuming that $|\phi_i| < 1$)

$$y_{it} = c_{yi}^{*} + \theta_i' x_{it} + \gamma_i' f_t^{*} + \varepsilon_{it}^{*}, \tag{5}$$

where $c_{yi}^{*} = c_{yi}/(1-\phi_i)$, $\varepsilon_{it}^{*} = (1-\phi_i L)^{-1} \varepsilon_{it}$ is a serially correlated error term, and $f_t^{*}$ is a new set of unobserved common factors. Estimation and inference in panel model (5) have been studied by Pesaran (2006) who introduced the CCE approach. This approach has been shown to be robust to an unknown number of unobserved common factors (Pesaran, 2006, and Chudik, Pesaran, and Tosetti, 2011), possible unit roots in factors (Kapetanios, Pesaran, and Yamagata, 2011), serial correlation of unknown form in $\varepsilon_{it}$ (Pesaran, 2006), spatial or other forms of weak cross-sectional dependence in $\varepsilon_{it}$ (Pesaran and Tosetti, 2011, and Chudik, Pesaran, and Tosetti, 2011). However, if the restrictions set out in (4) on $\beta_{0i}$ and $\beta_{1i}$ do not hold then the CCE approach is no longer applicable and the standard CCE estimators could be seriously biased, even asymptotically.[6] Our objective in this paper is to consider estimation and inference in the panel ARDL model (1)-(3), where the parameter restrictions (4) do not necessarily hold, and the slope coefficients $\pi_i = (\phi_i, \beta_{0i}', \beta_{1i}')'$ are allowed to vary across units.

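For future reference, partition matrix $\Gamma_i = (\Gamma_{xi}, \Gamma_{gi})$ into $m \times k_x$ and $m \times k_g$ matrices $\Gamma_{xi}$ and $\Gamma_{gi}$, vector $\alpha_i = (\alpha_{xi}', \alpha_{gi}')'$ into $k_x \times 1$ and $k_g \times 1$ vectors $\alpha_{xi}$ and $\alpha_{gi}$, and similarly $v_{it} = (v_{xit}', v_{git}')'$ into $k_x \times 1$ and $k_g \times 1$ vectors $v_{xit}$ and $v_{git}$.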

3 Estimation

Let $z_{it} = (y_{it}, x_{it}', g_{it}')'$ and write (1)-(3) compactly as

$$A_{0i} z_{it} = c_i + A_{1i} z_{i,t-1} + C_i f_t + e_{it}, \tag{6}$$

where $c_i = (c_{yi}, c_{\omega i}')'$, $C_i = (\gamma_i, \Gamma_i)'$,

$$A_{0i} = \begin{pmatrix} 1 & -\beta_{0i}' & 0_{1 \times k_g} \\ 0_{k_x \times 1} & I_{k_x} & 0_{k_x \times k_g} \\ 0_{k_g \times 1} & 0_{k_g \times k_x} & I_{k_g} \end{pmatrix}, \quad A_{1i} = \begin{pmatrix} \phi_i & \beta_{1i}' & 0_{1 \times k_g} \\ \alpha_{xi} & 0_{k_x \times k_x} & 0_{k_x \times k_g} \\ \alpha_{gi} & 0_{k_g \times k_x} & 0_{k_g \times k_g} \end{pmatrix},$$

and $e_{it} = (\varepsilon_{it}, v_{it}')'$ is a serially correlated error process. $A_{0i}$ is invertible (for any $i$) and multiplying (6) by $A_{0i}^{-1}$, we obtain the following reduced form VAR(1) representation of $z_{it}$ with serially correlated errors,

$$z_{it} = c_{zi} + A_i z_{i,t-1} + A_{0i}^{-1} C_i f_t + e_{zit},$$

where $c_{zi} = A_{0i}^{-1} c_i$, $e_{zit} = A_{0i}^{-1} e_{it}$, and $A_i = A_{0i}^{-1} A_{1i}$.

[6] See Everaert and Groote (2012) for derivation of asymptotic bias of CCE pooled estimators in the case of dynamic homogeneous panels.
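The reduced form matrix $A_i = A_{0i}^{-1} A_{1i}$ governs the dynamics of $z_{it}$, so it can be useful to build it numerically and inspect its eigenvalues. The sketch below does this for a single unit; the helper names `reduced_form_matrix` and `spectral_radius` and the numeric inputs are hypothetical and serve only as an illustration.

```python
import numpy as np

def reduced_form_matrix(phi, beta0, beta1, alpha_x, alpha_g):
    """Return A_i = A0i^{-1} A1i for one unit, following the layout in (6)."""
    beta0, beta1 = np.asarray(beta0, float), np.asarray(beta1, float)
    alpha_x, alpha_g = np.asarray(alpha_x, float), np.asarray(alpha_g, float)
    kx, kg = beta0.size, alpha_g.size
    k = kx + kg
    A0 = np.eye(k + 1)
    A0[0, 1:kx + 1] = -beta0          # first row: (1, -beta0', 0)
    A1 = np.zeros((k + 1, k + 1))
    A1[0, 0] = phi                    # first row: (phi, beta1', 0)
    A1[0, 1:kx + 1] = beta1
    A1[1:kx + 1, 0] = alpha_x         # feedback from y_{t-1} to x_t
    A1[kx + 1:, 0] = alpha_g          # feedback from y_{t-1} to g_t
    return np.linalg.solve(A0, A1)    # solves A0 @ A = A1

def spectral_radius(A):
    """Largest eigenvalue of A in absolute value."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

# Hypothetical unit with one regressor and one covariate.
A = reduced_form_matrix(phi=0.5, beta0=[1.0], beta1=[-0.2], alpha_x=[0.1], alpha_g=[0.05])
print(spectral_radius(A) < 1.0)       # stationarity requires a spectral radius below one
```

With strictly exogenous regressors (`alpha_x` and `alpha_g` equal to zero) the only nonzero eigenvalue of $A_i$ is $\phi_i$, so the check collapses to $|\phi_i| < 1$.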

We postulate the following assumptions for the estimation of the short-run coefficients.

ASSUMPTION 1 (Individual Specific Errors) The individual specific errors $\varepsilon_{it}$ and $v_{jt'}$ are independently distributed for all $i$, $j$, $t$ and $t'$. The vector of errors $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{Nt})'$ is spatially correlated according to

$$\varepsilon_t = R \varsigma_{\varepsilon t}, \tag{7}$$

where the $N \times N$ matrix $R$ has bounded row and column matrix norms, namely $\|R\|_{\infty} < K$ and $\|R\|_{1} < K$, respectively, for some constant $K < \infty$, which does not depend on $N$, diagonal elements of $RR'$ are bounded away from zero, $\varsigma_{\varepsilon t} = (\varsigma_{\varepsilon 1t}, \varsigma_{\varepsilon 2t}, \ldots, \varsigma_{\varepsilon Nt})'$, and $\varsigma_{\varepsilon it}$, for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, are independently and identically distributed (IID) with mean 0, unit variances, and finite fourth-order moments. For each $i = 1, 2, \ldots, N$, $v_{it}$ follows a linear stationary process with absolutely summable autocovariances (uniformly in $i$),

$$v_{it} = \sum_{\ell=0}^{\infty} S_{i\ell} \varsigma_{v,i,t-\ell}, \tag{8}$$

where $\varsigma_{vit}$ is a $k \times 1$ vector of IID random variables, with mean zero, variance matrix $I_k$ and finite fourth-order moments. In particular,

$$\left\| Var(v_{it}) \right\| = \left\| \sum_{\ell=0}^{\infty} S_{i\ell} S_{i\ell}' \right\| \leq K < \infty, \tag{9}$$

for $i = 1, 2, \ldots, N$, where $\|A\|$ is the spectral norm of the matrix $A$.

ASSUMPTION 2 (Common Effects) The $m \times 1$ vector of unobserved common factors, $f_t = (f_{1t}, f_{2t}, \ldots, f_{mt})'$, is covariance stationary with absolutely summable autocovariances, distributed independently of the individual specific errors $\varepsilon_{it'}$ and $v_{it'}$ for all $i$, $t$ and $t'$. Fourth moments of $f_{\ell t}$, for $\ell = 1, 2, \ldots, m$, are bounded.

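ASSUMPTION 3 (Factor Loadings) The factor loadings $\gamma_i$ and $\Gamma_i$ are independently and identically distributed across $i$, and of the common factors $f_t$, for all $i$ and $t$, with mean $\gamma$ and $\Gamma$, respectively, and bounded second moments. In particular,

$$\gamma_i = \gamma + \eta_{\gamma i}, \quad \eta_{\gamma i} \sim IID\left( 0_{m \times 1}, \Omega_{\gamma} \right), \text{ for } i = 1, 2, \ldots, N,$$

and

$$vec(\Gamma_i) = vec(\Gamma) + \eta_{\Gamma i}, \quad \eta_{\Gamma i} \sim IID\left( 0_{km \times 1}, \Omega_{\Gamma} \right), \text{ for } i = 1, 2, \ldots, N,$$

where $\Omega_{\gamma}$ and $\Omega_{\Gamma}$ are $m \times m$ and $km \times km$ symmetric nonnegative definite matrices, $\|\gamma\| < K$, $\|\Omega_{\gamma}\| < K$, $\|\Gamma\| < K$, and $\|\Omega_{\Gamma}\| < K$.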

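ASSUMPTION 4 (Heterogeneous Coefficients) The $(2k_x + 1) \times 1$ dimensional vector of coefficients $\pi_i = (\phi_i, \beta_{0i}', \beta_{1i}')'$ follows the random coefficient model

$$\pi_i = \pi + \upsilon_{\pi i}, \quad \upsilon_{\pi i} \sim IID\left( 0_{(2k_x+1) \times 1}, \Omega_{\pi} \right), \text{ for } i = 1, 2, \ldots, N, \tag{10}$$

where $\pi = (\phi, \beta_0', \beta_1')'$, $\|\pi\| < K$, $\|\Omega_{\pi}\| < K$, $\Omega_{\pi}$ is a $(2k_x + 1) \times (2k_x + 1)$ symmetric nonnegative definite matrix, and the random deviations $\upsilon_{\pi i}$ are independently distributed of $\gamma_j$, $\Gamma_j$, $\varepsilon_{jt}$, $v_{jt}$, and $f_t$ for all $i$, $j$, and $t$. Furthermore, the support of $\phi_i$ lies strictly inside the unit circle, and $E\|c_i\| < K$ for all $i$.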

ASSUMPTION 5 (Regressors and Covariates) Regressors and covariates in $\omega_{it} = (x_{it}', g_{it}')'$ are either strictly exogenous and generated according to the canonical factor model (3) with $\alpha_i = 0_{k \times 1}$, or weakly exogenous and generated according to (3) with $\alpha_i$, for $i = 1, 2, \ldots, N$, IID across $i$ and independently distributed of $\pi_j$, $\gamma_j$, $\Gamma_j$, $\varepsilon_{jt}$, $v_{jt}$, and $f_t$ for all $i$, $j$ and $t$. In the case where the regressors are weakly exogenous we also assume:

(i) the support of $\lambda_1(A_i)$ lies strictly inside the unit circle, for $i = 1, 2, \ldots, N$, where $A_i = A_{0i}^{-1} A_{1i}$, and $\lambda_1(A_i)$ denotes the largest eigenvalue (in absolute value) of $A_i$; and

(ii) the inverse of polynomial $\Lambda(L) = \sum_{\ell=0}^{\infty} \Lambda_{\ell} L^{\ell}$, where $\Lambda_{\ell} = E\left( A_i^{\ell} A_{0i}^{-1} \right)$, exists and has exponentially decaying coefficients.

Let $w = (w_1, w_2, \ldots, w_N)'$ be an $N \times 1$ vector of non-stochastic (or pre-determined) weights that satisfies the following 'granularity' conditions

$$\|w\| = O\left( N^{-1/2} \right), \tag{11}$$

$$\frac{w_i}{\|w\|} = O\left( N^{-1/2} \right) \text{ uniformly in } i, \tag{12}$$

and the normalization condition

$$\sum_{i=1}^{N} w_i = 1. \tag{13}$$

The weights vector $w$ depends on $N$, but we suppress the subscript $N$ to simplify notation.
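As a quick illustration of conditions (11)-(13), consider equal weights. The choice $w_i = 1/N$ is an assumption made here for the example only; it corresponds to simple cross-sectional averages:

$$w_i = \frac{1}{N} \;\Longrightarrow\; \|w\| = \left( \sum_{i=1}^{N} N^{-2} \right)^{1/2} = N^{-1/2}, \qquad \frac{w_i}{\|w\|} = \frac{N^{-1}}{N^{-1/2}} = N^{-1/2}, \qquad \sum_{i=1}^{N} w_i = 1.$$

By contrast, weights that place a fixed positive share on a single unit, say $w_1 = 1/2$ for all $N$, give $\|w\| \geq 1/2$ and therefore violate (11).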

Next, we derive a large $N$ representation for cross-sectional averages of $z_{it}$ following Chudik and Pesaran (2013a). Since the support of the eigenvalues of $A_i$ is assumed to lie strictly inside the unit circle, $z_{it}$ is an invertible covariance stationary process and can be written as

$$z_{it} = \sum_{\ell=0}^{\infty} A_i^{\ell} \left( c_{zi} + A_{0i}^{-1} C_i f_{t-\ell} + e_{z,i,t-\ell} \right),$$

for $i = 1, 2, \ldots, N$. Taking weighted cross-sectional averages of the above and making use of the fact that under our assumptions the elements of $e_{zit}$ are weakly cross-sectionally dependent, together with the random coefficients Assumptions 3-5, we have

$$\sum_{i=1}^{N} w_i \sum_{\ell=0}^{\infty} A_i^{\ell} e_{z,i,t-\ell} = O_p\left( N^{-1/2} \right).$$

Since (under Assumptions 3-5) $A_i$ and $A_{0i}$ are independently distributed of $C_i$, and $A_i$, $A_{0i}$ and $C_i$ are independently distributed across $i$, we have

$$\sum_{i=1}^{N} w_i \sum_{\ell=0}^{\infty} A_i^{\ell} A_{0i}^{-1} C_i f_{t-\ell} = \sum_{\ell=0}^{\infty} E\left( A_i^{\ell} A_{0i}^{-1} C_i \right) f_{t-\ell} + O_p\left( N^{-1/2} \right) = \Lambda(L) C f_t + O_p\left( N^{-1/2} \right),$$

where $C = E(C_i) = (\gamma, \Gamma)'$. This yields the following large $N$ representation

$$\tilde{z}_{wt} = \Lambda(L) C f_t + O_p\left( N^{-1/2} \right), \tag{14}$$

where $\tilde{z}_{wt} = \bar{z}_{wt} - \bar{c}_{zw}$ is the $k + 1$ dimensional vector of de-trended cross section averages, $\bar{z}_{wt} = (\bar{y}_{wt}, \bar{x}_{wt}', \bar{g}_{wt}')' = \sum_{i=1}^{N} w_i z_{it}$ is the $k + 1$ dimensional vector of cross section averages, and $\bar{c}_{zw} = \sum_{i=1}^{N} w_i (I_{k+1} - A_i)^{-1} c_{zi}$.

Multiplying (14) by the inverse of $\Lambda(L)$ now yields the following large $N$ expression for a linear combination of the unobserved common factors,

$$C f_t = \Lambda^{-1}(L) \tilde{z}_{wt} + O_p\left( N^{-1/2} \right). \tag{15}$$

Consider now the special case where $\alpha_i = 0_{k \times 1}$, and the regressors are strictly exogenous. In this case the regressors are independently distributed of the coefficients in $\pi_i = (\phi_i, \beta_{0i}', \beta_{1i}')'$, which simplifies the derivation of the large $N$ representation for $\tilde{z}_{wt}$. In particular, $(1 - \phi_i L)$ is invertible for any $i = 1, 2, \ldots, N$ under Assumption 4, and multiplying (1) by $(1 - \phi_i L)^{-1}$ we have

$$y_{it} = \sum_{\ell=0}^{\infty} \phi_i^{\ell} c_{yi} + \sum_{\ell=0}^{\infty} \phi_i^{\ell} \beta_{0i}' x_{i,t-\ell} + \sum_{\ell=0}^{\infty} \phi_i^{\ell} \beta_{1i}' x_{i,t-\ell-1} + \sum_{\ell=0}^{\infty} \phi_i^{\ell} \gamma_i' f_{t-\ell} + \sum_{\ell=0}^{\infty} \phi_i^{\ell} \varepsilon_{i,t-\ell}. \tag{16}$$

Taking weighted cross-sectional averages, under Assumptions 1-5, and assuming $\alpha_i = 0_{k \times 1}$, we obtain

$$\bar{y}_{wt} = \bar{c}_{yw} + a(L) \gamma' f_t + a(L) \left( \beta_0' + \beta_1' L \right) \bar{x}_{wt} + O_p\left( N^{-1/2} \right), \tag{17}$$

and

$$\bar{\omega}_{wt} = \bar{c}_{\omega w} + \Gamma' f_t + O_p\left( N^{-1/2} \right), \tag{18}$$

where $\bar{c}_{yw} = \sum_{i=1}^{N} w_i c_{yi} (1 - \phi_i)^{-1}$, $\bar{c}_{\omega w} = \sum_{i=1}^{N} w_i c_{\omega i}$, and $a(L) = \sum_{\ell=0}^{\infty} a_{\ell} L^{\ell}$ with its elements given by the moments of $\phi_i$, namely $a_{\ell} = E\left( \phi_i^{\ell} \right)$, for $\ell = 0, 1, 2, \ldots$. Note that under Assumption 4, which constrains the support of $\phi_i$ to lie strictly inside the unit circle, the rate of decay of the coefficients in $a(L)$ is exponential. This restriction on the support of $\phi_i$ also ensures the existence of all moments of $\phi_i$. The rate of decay of the coefficients of $a(L)$ will not necessarily be exponential if the support of $\phi_i$ covered 1, and depending on the properties of the distribution of $\phi_i$ in the neighborhood of 1, $a(L)$ need not be absolutely summable, in which case $\bar{y}_{wt}$ could converge (in quadratic mean) to a long memory process as $N \to \infty$. Such possibilities are ruled out by Assumption 4.

Under Assumption 4 and by Lemma A.1 of Chudik and Pesaran (2013b), the inverse of $a(L)$ exists and has exponentially decaying coefficients. Pre-multiplying both sides of (17) by $b(L) = a^{-1}(L)$, we obtain

$$\gamma' f_t = b(L) \bar{y}_{wt} - b(1) \bar{c}_{yw} - \beta_0' \bar{x}_{wt} - \beta_1' \bar{x}_{w,t-1} + O_p\left( N^{-1/2} \right). \tag{19}$$
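The polynomials $a(L)$ and $b(L) = a^{-1}(L)$ used in (17) and (19) can be computed numerically once a distribution for $\phi_i$ is fixed. The sketch below assumes, purely for illustration, that $\phi_i$ is uniform on $[0, 0.8]$, so that $a_{\ell} = E(\phi_i^{\ell}) = 0.8^{\ell}/(\ell+1)$; the function name `invert_lag_polynomial` is hypothetical.

```python
import numpy as np

def invert_lag_polynomial(a):
    """First len(a) coefficients of b(L) = a(L)^{-1}; requires a[0] != 0."""
    a = np.asarray(a, dtype=float)
    b = np.zeros(a.size)
    b[0] = 1.0 / a[0]
    for n in range(1, a.size):
        # The coefficient of L^n in a(L) b(L) must vanish:
        # a_0 b_n + a_1 b_{n-1} + ... + a_n b_0 = 0
        b[n] = -np.dot(a[1:n + 1], b[n - 1::-1]) / a[0]
    return b

# Moments of a uniform variable on [0, 0.8] (illustrative assumption).
n_terms = 20
a = np.array([0.8 ** l / (l + 1) for l in range(n_terms)])
b = invert_lag_polynomial(a)

# a(L) b(L) should equal 1 up to the truncation order.
product = np.convolve(a, b)[:n_terms]
print(np.allclose(product, np.eye(n_terms)[0]))
```

Printing `np.abs(b)` for this example gives a direct view of how quickly the coefficients of $b(L)$ die out, which is what determines how many lags of the cross-sectional averages are needed in practice.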

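Stacking equations (18) and (19), we obtain (15) with $\Lambda^{-1}(L)$ reduced (in the strictly exogenous case) to

$$\Lambda^{-1}(L) = \begin{pmatrix} b(L) & -\beta_0' - \beta_1' L & 0_{1 \times k_g} \\ 0_{k_x \times 1} & I_{k_x} & 0_{k_x \times k_g} \\ 0_{k_g \times 1} & 0_{k_g \times k_x} & I_{k_g} \end{pmatrix}. \tag{20}$$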

It follows from (15) that when $rank(C) = m$ and regardless of whether the regressors are weakly or strictly exogenous, de-trended cross section averages $\tilde{z}_{wt}$ and their lags can be used as proxies for the unobserved common factors, assuming that $N$ is sufficiently large, namely we have

$$f_t = G(L) \tilde{z}_{wt} + O_p\left( N^{-1/2} \right), \tag{21}$$

where

$$G(L) = \left( C'C \right)^{-1} C' \Lambda^{-1}(L).$$

Note that the coefficients of the distributed lag function, $G(L)$, decay at an exponential rate. In particular, in the case of strictly exogenous regressors (where $\alpha_i = 0_{k \times 1}$), the decay rate of the coefficients in $G(L)$ is given by the decay rate of the coefficients in $b(L)$, see (20) and (23). As established by Lemma A.1 of Chudik and Pesaran (2013b), the decay rate of the coefficients in $b(L)$ is exponential under Assumption 4, which confines the support of $\phi_i$ to lie strictly within the unit circle. In the case of weakly exogenous regressors, an exponential rate of decay of the coefficients in $\Lambda^{-1}(L)$ is ensured by Assumption 5-ii.

The full column rank of $C$ ensures that $C'C$ is invertible and this rank condition is required for the estimation of unit-specific coefficients. In contrast, the rank condition is not always necessary for estimation of the cross-sectional mean of the coefficients, as we shall see below.

ASSUMPTION 6 The $(k + 1) \times m$ dimensional matrix $C = (\gamma, \Gamma)'$ has full column rank.

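Substituting the large $N$ representation for the unobserved common factors (21) into (1), we obtain

$$y_{it} = c_{yi}^{*} + \phi_i y_{i,t-1} + \beta_{0i}' x_{it} + \beta_{1i}' x_{i,t-1} + \delta_i'(L) \bar{z}_{wt} + \varepsilon_{it} + O_p\left( N^{-1/2} \right), \tag{22}$$

where

$$\delta_i(L) = \sum_{\ell=0}^{\infty} \delta_{i\ell} L^{\ell} = G'(L) \gamma_i, \tag{23}$$

and $c_{yi}^{*} = c_{yi} - \delta_i'(1) \bar{c}_{zw}$.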

Consider now the following cross-sectionally augmented regressions, based on (22),

$$y_{it} = c_{yi}^{*} + \phi_i y_{i,t-1} + \beta_{0i}' x_{it} + \beta_{1i}' x_{i,t-1} + \sum_{\ell=0}^{p_T} \delta_{i\ell}' \bar{z}_{w,t-\ell} + e_{yit}, \tag{24}$$

where $p_T$ is the number of lags (assumed to be the same across units, for simplicity of exposition). The error term, $e_{yit}$, can be decomposed into three parts: an idiosyncratic term, $\varepsilon_{it}$, an error component due to the truncation of the possibly infinite polynomial distributed lag function, $\delta_i(L)$, and an error component due to the approximation of unobserved common factors, namely

$$e_{yit} = \varepsilon_{it} + \sum_{\ell=p_T+1}^{\infty} \delta_{i\ell}' \bar{z}_{w,t-\ell} + O_p\left( N^{-1/2} \right).$$

Note that the coefficients of the distributed lag function, $\delta_i(L) = G'(L) \gamma_i$, decay at an exponential rate.
Let $\hat{\pi}_i = (\hat{\phi}_i, \hat{\beta}_{0i}', \hat{\beta}_{1i}')'$ be the least squares estimates of $\pi_i$ based on the cross-sectionally augmented regression (24). Also consider the following data matrices

$$\Xi_i = \begin{pmatrix} y_{i,p_T} & x_{i,p_T+1}' & x_{i,p_T}' \\ y_{i,p_T+1} & x_{i,p_T+2}' & x_{i,p_T+1}' \\ \vdots & \vdots & \vdots \\ y_{i,T-1} & x_{iT}' & x_{i,T-1}' \end{pmatrix}, \quad \bar{Q}_w = \begin{pmatrix} 1 & \bar{z}_{w,p_T+1}' & \bar{z}_{w,p_T}' & \cdots & \bar{z}_{w,1}' \\ 1 & \bar{z}_{w,p_T+2}' & \bar{z}_{w,p_T+1}' & \cdots & \bar{z}_{w,2}' \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \bar{z}_{w,T}' & \bar{z}_{w,T-1}' & \cdots & \bar{z}_{w,T-p_T}' \end{pmatrix}, \tag{25}$$

and the projection matrix

$$\bar{M}_q = I_{T-p_T} - \bar{Q}_w \left( \bar{Q}_w' \bar{Q}_w \right)^{+} \bar{Q}_w',$$

where $I_{T-p_T}$ is a $(T - p_T) \times (T - p_T)$ dimensional identity matrix, and $A^{+}$ denotes the Moore-Penrose generalized inverse of $A$. Matrices $\Xi_i$, $\bar{Q}_w$, and $\bar{M}_q$ depend also on $p_T$, $N$ and $T$, but we omit these subscripts to simplify notation. We summarize and introduce additional notation that will be useful (for proofs) in Appendix A.1.

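$\hat{\pi}_i$ can now be written as

$$\hat{\pi}_i = \left( \Xi_i' \bar{M}_q \Xi_i \right)^{-1} \Xi_i' \bar{M}_q y_i, \tag{26}$$

where $y_i = (y_{i,p_T+1}, y_{i,p_T+2}, \ldots, y_{i,T})'$. The mean group estimator of $\pi = E(\pi_i) = (\phi, \beta_0', \beta_1')'$ is given by

$$\hat{\pi}_{MG} = \frac{1}{N} \sum_{i=1}^{N} \hat{\pi}_i. \tag{27}$$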
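The estimator in (24)-(27) can be sketched in a few lines of NumPy. This is a hedged illustration rather than the authors' code: the function name `ccemg`, the array layout, and the use of equal weights $w_i = 1/N$ are assumptions, and time indices are zero-based, so the first usable row is `max(p, 1)` rather than $p_T + 1$.

```python
import numpy as np

def ccemg(y, x, p, g=None):
    """Dynamic CCE mean group sketch for an ARDL(1,1) panel.

    y: (N, T) array, x: (N, T, kx) array, g: optional (N, T, kg) covariates,
    p: number of lags of the cross-section averages (p_T in the text).
    Returns (pi_mg, pi_i); columns of pi_i are (phi_i, beta0_i', beta1_i').
    """
    N, T = y.shape
    kx = x.shape[2]
    parts = [y[:, :, None], x] + ([g] if g is not None else [])
    z = np.concatenate(parts, axis=2)            # z_it = (y_it, x_it', g_it')'
    zbar = z.mean(axis=0)                        # equal-weight averages, shape (T, k+1)
    rows = np.arange(max(p, 1), T)               # observations with all lags available
    # Q_w as in (25): intercept, current and p lagged cross-section averages.
    Q = np.column_stack([np.ones(rows.size)] + [zbar[rows - l] for l in range(p + 1)])
    # Projection M_q, using the Moore-Penrose inverse as in the text.
    M = np.eye(rows.size) - Q @ np.linalg.pinv(Q.T @ Q) @ Q.T
    pi_i = np.empty((N, 2 * kx + 1))
    for i in range(N):
        # Rows of Xi_i: (y_{i,t-1}, x_it', x_{i,t-1}')
        Xi = np.column_stack([y[i, rows - 1], x[i, rows], x[i, rows - 1]])
        pi_i[i] = np.linalg.solve(Xi.T @ M @ Xi, Xi.T @ M @ y[i, rows])   # eq. (26)
    return pi_i.mean(axis=0), pi_i                                        # eq. (27)
```

Passing additional covariates through `g` increases the number of cross-sectional averages in `Q`, which is how the rank condition on $C$ can be satisfied when there are more factors than regressors plus one.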

In addition to Assumptions 1-6 above, we shall also require the following further assumption.

ASSUMPTION 7 (a) Denote the $(t - p_T)$-th row of matrix $\tilde{\Xi}_i = \bar{M}_h \Xi_i$ by $\tilde{\xi}_{it} = (\tilde{\xi}_{i1t}, \tilde{\xi}_{i2t}, \ldots, \tilde{\xi}_{i,2k_x+1,t})'$, where $\bar{M}_h$ is defined in the Appendix by (A.4). Individual elements of $\tilde{\xi}_{it}$ have uniformly bounded fourth moments, namely there exists a positive constant $K$ such that $E\left( \tilde{\xi}_{ist}^4 \right) < K$ for any $t = p_T + 1, p_T + 2, \ldots, T$, $i = 1, 2, \ldots, N$ and $s = 1, 2, \ldots, 2k_x + 1$.

(b) There exist $N_0$ and $T_0$ such that for all $N \geq N_0$, $T \geq T_0$, the $(2k_x + 1) \times (2k_x + 1)$ matrices $\hat{\Psi}_{\xi,iT}^{-1} = \left( \Xi_i' \bar{M}_q \Xi_i / T \right)^{-1}$ exist for all $i$.

(c) The $(2k_x + 1) \times (2k_x + 1)$ dimensional matrix $\Sigma_{i\xi}$ defined in (A.14) in the Appendix is invertible for all $i$ and $\left\| \Sigma_{i\xi}^{-1} \right\| < K < \infty$ for all $i$.

This assumption plays a role similar to that of Assumption 4.6 in Chudik, Pesaran, and Tosetti (2011) and ensures that $\hat{\pi}_i$, $\hat{\pi}_{MG}$ and their asymptotic distributions are well defined.

First, we establish sufficient conditions for the consistency of unit-specific estimates.

Theorem 1 (Consistency of $\hat{\pi}_i$) Suppose $y_{it}$, for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$ is given by the panel ARDL model (1)-(3), and Assumptions 1-7 hold. Then, as $(N, T, p_T) \overset{j}{\to} \infty$, such that $p_T^3/T \to \varkappa$, $0 < \varkappa < \infty$, we have

$$\hat{\pi}_i - \pi_i \overset{p}{\to} 0_{(2k_x+1) \times 1}, \tag{28}$$

where $\hat{\pi}_i = (\hat{\phi}_i, \hat{\beta}_{0i}', \hat{\beta}_{1i}')'$ is given by (26).

No restrictions on the relative expansion rates of $N$ and $T$ to infinity are required for the consistency of $\hat{\pi}_i$ in the theorem above, but the number of lags needs to be restricted so that there are sufficient degrees of freedom for consistent estimation (i.e. the number of lags is not too large, in particular it is required that $p_T^2/T \to 0$) and the bias due to the truncation of (possibly) infinite lag polynomials is sufficiently small (i.e. the number of lags is not too small, in our case $\sqrt{T} \rho^{p_T} \to 0$ for some positive constant $\rho < 1$). Letting $p_T^3/T \to \varkappa$, $0 < \varkappa < \infty$, as $T \to \infty$, ensures that these conditions are met.[7] The rank condition in Assumption 6 is also necessary for the consistency of $\hat{\pi}_i$. This is because the unobserved factors are allowed to be serially correlated as well as being correlated with the regressors.
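The rate condition $p_T^3/T \to \varkappa$ fixes only the order of magnitude of the lag length. One simple rule that satisfies it with $\varkappa = 1$ is the integer part of $T^{1/3}$. The sketch below adopts that rule as an illustrative assumption; the function name `lag_order` is hypothetical, and the integer adjustment guards against floating point rounding at perfect cubes.

```python
def lag_order(T):
    """Largest integer p with p**3 <= T, i.e. the integer part of T**(1/3)."""
    if T < 1:
        raise ValueError("T must be a positive integer")
    p = int(round(T ** (1.0 / 3.0)))
    # Correct any floating point error using exact integer arithmetic.
    while p ** 3 > T:
        p -= 1
    while (p + 1) ** 3 <= T:
        p += 1
    return p

for T in (27, 64, 100, 200):
    print(T, lag_order(T))
```

Under this rule both requirements discussed above hold: $p_T^2/T$ is of order $T^{-1/3}$ and so tends to zero, while $\sqrt{T}\rho^{p_T}$ tends to zero because $\rho^{p_T}$ decays faster than any power of $T$.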

3.1 Consistency and asymptotic distribution of $\hat{\pi}_{MG}$

Consistency of the unit-specific estimates $\hat{\pi}_i$ is not always necessary for the consistency of the mean group estimator of $\pi = E(\pi_i)$, which is established next.

Theorem 2 (Consistency of $\hat{\pi}_{MG}$) Suppose $y_{it}$, for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$ is given by the panel data model (1)-(3), and Assumptions 1-5 and 7 hold, and $(N, T, p_T) \overset{j}{\to} \infty$, such that $p_T^3/T \to \varkappa$, $0 < \varkappa < \infty$. Then,

(i) if Assumption 6 also holds,

$$\hat{\pi}_{MG} - \pi \overset{p}{\to} 0_{(2k_x+1) \times 1}, \tag{29}$$

where $\hat{\pi}_{MG} = (\hat{\phi}_{MG}, \hat{\beta}_{0MG}', \hat{\beta}_{1MG}')'$ is given by (27);

(ii) if Assumption 6 does not hold but $f_t$ is serially uncorrelated, $\hat{\pi}_{MG} - \pi \overset{p}{\to} 0_{(2k_x+1) \times 1}$.

Theorem 2 establishes that $\hat{\pi}_{MG}$ is consistent (as $N$ and $T$ tend jointly to infinity at any rate), regardless of the rank condition, when factors are serially uncorrelated, although they can still be correlated with the regressors. When the factors are serially correlated, the rank condition is required for the consistency of $\hat{\pi}_{MG}$. As we have seen, full column rank of $C$ is sufficient for approximating the unobserved common factors arbitrarily well by cross section averages and their lags. In this case, the serial correlation of factors and the correlation of factors and regressors do not pose any problems. When the rank condition does not hold, but factors are serially uncorrelated, then $\hat{\pi}_i$ could be inconsistent due to the correlation of $x_{it}$ and $f_t$, but the asymptotic bias of $\hat{\pi}_i - \pi_i$ is cross-sectionally weakly dependent with zero mean and consequently the mean group estimator is consistent.

The following theorem establishes the asymptotic distribution of $\hat{\pi}_{MG}$.


7 See also a related discussion in Berk (1974), Chudik and Pesaran (2013b) and Said and Dickey (1984) on the truncation of infinite polynomials in least squares regressions.

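Theorem 3 (Asymptotic distribution of $\hat{\pi}_{MG}$) Suppose $y_{it}$, for $i = 1, 2, \ldots, N$ and $t = 1, 2, \ldots, T$, are generated by the panel ARDL model (1)-(3), Assumptions 1-5 and 7 hold, and $(N, T, p_T) \overset{j}{\to} \infty$ such that $N/T \to \varkappa_1$ and $p_T^3/T \to \varkappa_2$, $0 < \varkappa_1, \varkappa_2 < \infty$. Then,

(i) if Assumption 6 also holds, we have

$$\sqrt{N}\,(\hat{\pi}_{MG} - \pi) \overset{d}{\to} N\Big(\underset{(2k_x+1)\times 1}{0},\ \Omega_{\pi}\Big), \qquad (30)$$

(ii) if Assumption 6 does not hold, but $f_t$ is serially uncorrelated, we have

$$\sqrt{N}\,(\hat{\pi}_{MG} - \pi) \overset{d}{\to} N\Big(\underset{(2k_x+1)\times 1}{0},\ \Sigma_{MG}\Big), \qquad (31)$$

where $\hat{\pi}_{MG} = (\hat{\phi}_{MG}, \hat{\beta}_{0MG}', \hat{\beta}_{1MG}')'$ is given by (27) and $\Sigma_{MG}$ is given by equation (A.84) in the Appendix.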

In both cases, the asymptotic variance of $\hat{\pi}_{MG}$ can be consistently estimated nonparametrically by

$$\hat{\Sigma}_{MG} = \frac{1}{N-1}\sum_{i=1}^{N} (\hat{\pi}_i - \hat{\pi}_{MG})(\hat{\pi}_i - \hat{\pi}_{MG})'. \qquad (32)$$
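As an illustration of (32), the following minimal Python sketch computes the mean group estimate and its nonparametric variance from a stacked array of unit-specific estimates. The function name `mean_group` and the array layout (one row per unit) are assumptions made for this example and are not taken from the paper; the standard errors divide $\hat{\Sigma}_{MG}$ by $N$ because $\hat{\Sigma}_{MG}$ estimates the variance of $\sqrt{N}(\hat{\pi}_{MG} - \pi)$.

```python
import numpy as np

def mean_group(pi_hat):
    """Mean group estimate and nonparametric variance, as in equation (32).

    pi_hat: (N, k) array, one row of unit-specific estimates per unit
    (hypothetical layout chosen for this sketch).
    """
    pi_hat = np.asarray(pi_hat, dtype=float)
    n_units = pi_hat.shape[0]
    pi_mg = pi_hat.mean(axis=0)                 # simple cross-section average
    dev = pi_hat - pi_mg                        # deviations from the MG estimate
    sigma_mg = dev.T @ dev / (n_units - 1)      # equation (32)
    std_err = np.sqrt(np.diag(sigma_mg) / n_units)
    return pi_mg, sigma_mg, std_err
```

Only the cross-section dispersion of the unit estimates enters the variance, so no knowledge of the factor structure is needed at this step.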
The convergence rate of $\hat{\pi}_{MG}$ is $\sqrt{N}$ due to the heterogeneity of the coefficients. Theorem 3 shows that the asymptotic distribution of $\hat{\pi}_{MG}$ differs depending on the rank of the matrix $C$ in Assumption 6. If $C$ has full column rank, then the unit-specific estimates $\hat{\pi}_i$ are consistent, $\Sigma_{MG}$ reduces to $\Omega_{\pi}$, and the asymptotic variance of the mean group estimator is given by the variance of $\pi_i$ alone. If, on the other hand, $C$ does not have full column rank and factors are serially uncorrelated, then the unit-specific estimates are inconsistent (since $f_t$ is correlated with $x_{it}$), but $\hat{\pi}_{MG}$ is consistent and asymptotically normal with a variance that depends not only on $\Omega_{\pi}$ but also on other parameters, including the variance of factor loadings. Pesaran (2006) did not require any restrictions on the relative rate of convergence of $N$ and $T$ for the asymptotic distribution of the common correlated mean group estimator. This is no longer the case in our model due to the $O(T^{-1})$ time series bias of $\hat{\pi}_i$ and $\hat{\pi}_{MG}$ that arises from the presence of lagged values of the dependent variable. This bias dates back at least to Hurwicz (1950) and has been well documented in the literature. Theorem 3 requires $N/T \to \varkappa_1$ for the derivation of the asymptotic distribution of $\hat{\pi}_{MG}$ due to the time series bias, and it is therefore unsuitable for panels with $T$ small relative to $N$.

4 Bias-corrected CCEMG estimators

In this section we review the different procedures proposed in the literature for correcting the small sample time series bias of estimators in dynamic panels and consider the possibility of developing bias-corrected versions of CCEMG estimators for dynamic panels.

Existing literature focuses predominantly on homogeneous panels, where several different ways to correct for the $O(T^{-1})$ time series bias have been proposed. This literature can be divided into the following broad categories: (i) analytical corrections based on an asymptotic bias formula (Bruno, 2005, Bun, 2003, Bun and Carree, 2005 and 2006, Bun and Kiviet, 2003, Hahn and Kuersteiner, 2002 and 2011, Hahn and Moon, 2006, Hahn and Newey, 2004, Kiviet, 1995 and 1999, and Newey and Smith, 2004); (ii) bootstrap and simulation based bias corrections (Everaert and Ponzi, 2007, Phillips and Sul, 2003 and 2007); and (iii) other methods, including jackknife bias corrections (Hahn and Newey, 2004, and Dhaene and Jochmans, 2012) and the recursive mean adjustment correction procedures (So and Shin, 1999).

In contrast, bias correction for dynamic panels with heterogeneous coefficients has been considered in only a few studies. Hsiao, Pesaran, and Tahmiscioglu (1999) investigate bias-corrected mean group estimation, where the Kiviet and Phillips (1993) bias correction is applied to the individual estimates of short-run coefficients. Hsiao, Pesaran, and Tahmiscioglu (1999) also propose a hierarchical Bayesian estimation of short-run coefficients, which they find to have good small sample properties in their Monte Carlo study.8 Pesaran and Zhao (1999) investigate bias correction methods in estimating long-run coefficients and consider, in particular, two analytical corrections based on an approximation of the asymptotic bias of long-run coefficients, a bootstrap bias-corrected estimator, and a "naive" bias-corrected panel estimator computed from bias-corrected short-run coefficients (using a result derived by Kiviet and Phillips, 1993).


8 Zhang and Small (2006) further develop the hierarchical Bayesian approach of Hsiao, Pesaran, and Tahmiscioglu (1999) by imposing a stationarity constraint on each of the cross section units and by considering different possibilities for starting values. A Bayesian approach has also been developed by Canova and Marcet (1999) to study income convergence in a dynamic heterogeneous panel of countries, and by Canova and Ciccarelli (2004 and 2009) to forecast variables and turning points in a panel VAR. Forecasting with Bayesian shrinkage estimators has also been considered by Garcia-Ferrer, Highfield, Palm, and Zellner (1987), Zellner and Hong (1989) and Zellner, Hong, and ki Min (1991).

4.1 Bias corrected versions of $\hat{\pi}_{MG}$

All the bias correction procedures reviewed above are developed for panel data models without unobserved common factors, and are not directly applicable to $\hat{\pi}_{MG}$. This applies to bootstrap-based corrections, as well as to the analytical corrections based on asymptotic bias formulae such as the one derived by Kiviet and Phillips (1993). The development of analytical or bootstrap bias correction procedures for dynamic panel data models with a multifactor error structure is beyond the scope of the present paper and deserves separate investigation. Instead, here we consider the application of jackknife and recursive mean adjustment bias correction procedures to $\hat{\pi}_{MG}$ that do not require any knowledge of the error factor structure and are particularly simple to implement.

4.1.1 Jackknife bias correction

Jackknife bias correction is popular due to its simplicity and wide applicability. It can be applied to the panel mean group estimator, or at the level of unit-specific estimates. Since the mean group estimator is a linear function of the unit-specific estimators, applying the correction to $\hat{\pi}_{MG}$ or to the unit-specific estimates, $\hat{\pi}_i$, yields numerically identical results. We consider the "half-panel jackknife" method discussed by Dhaene and Jochmans (2012), which corrects for the $O(T^{-1})$ bias. Jackknife bias-corrected CCEMG estimators are constructed as:

$$\tilde{\pi}_{MG} = 2\hat{\pi}_{MG} - \frac{1}{2}\left(\hat{\pi}^{a}_{MG} + \hat{\pi}^{b}_{MG}\right),$$

where $\hat{\pi}^{a}_{MG}$ denotes the CCEMG estimator computed from the first half of the available time period, namely over the period $t = 1, 2, \ldots, [T/2]$, where $[T/2]$ denotes the integer part of $T/2$, and $\hat{\pi}^{b}_{MG}$ is the CCEMG estimator computed using the observations over the period $t = [T/2]+1, [T/2]+2, \ldots, T$.
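The half-panel jackknife formula can be sketched in a few lines of Python. The helper name `half_panel_jackknife` and the convention that time runs along the last axis of the data array are assumptions of this example; `estimator` stands for any routine (for instance a CCEMG implementation) that maps a data array to an estimate.

```python
import numpy as np

def half_panel_jackknife(estimator, data):
    """Half-panel jackknife: 2 * full-sample estimate minus the average
    of the two half-sample estimates.

    estimator: callable taking an array with time on the last axis
    (hypothetical interface chosen for this sketch).
    """
    data = np.asarray(data, dtype=float)
    half = data.shape[-1] // 2                       # integer part of T/2
    full = np.asarray(estimator(data))               # t = 1, ..., T
    first = np.asarray(estimator(data[..., :half]))  # t = 1, ..., [T/2]
    second = np.asarray(estimator(data[..., half:])) # t = [T/2]+1, ..., T
    return 2.0 * full - 0.5 * (first + second)
```

For even $T$, a bias term of the exact form $c/T$ cancels: the full sample contributes $2c/T$ and each half contributes $c/(T/2)$, whose average is also $2c/T$.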

4.1.2 Recursive mean adjustment

The second bias correction is based on the recursive mean adjustment method proposed by So and Shin (1999), who advocate demeaning variables using the partial mean based on observations up to the time period $t-1$. In particular, we let

$$\tilde{y}_{it} = y_{it} - \frac{1}{t-1}\sum_{s=1}^{t-1} y_{is},$$

and

$$\tilde{\omega}_{it} = \omega_{it} - \frac{1}{t-1}\sum_{s=1}^{t-1} \omega_{is},$$

for $i = 1, 2, \ldots, N$ and $t = 2, 3, \ldots, T$, where $\omega_{it} = (x_{it}', g_{it}')'$. We then compute the bias-adjusted CCE mean group estimator based on the recursively demeaned variables $\tilde{y}_{it}$ and $\tilde{\omega}_{it}$ (with $T-1$ available time periods, $t = 2, 3, \ldots, T$).
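The recursive demeaning step can be illustrated as follows. This is a minimal sketch, assuming each variable is stored as an array with time along the last axis; the function name `recursive_mean_adjust` is hypothetical.

```python
import numpy as np

def recursive_mean_adjust(v):
    """Subtract from each observation the mean of all earlier observations.

    v: array with time on the last axis (length T). Returns an array with
    T - 1 periods, corresponding to t = 2, ..., T.
    """
    v = np.asarray(v, dtype=float)
    csum = np.cumsum(v, axis=-1)            # running sums up to each period
    counts = np.arange(1, v.shape[-1])      # t - 1 for t = 2, ..., T
    partial_mean = csum[..., :-1] / counts  # mean of observations 1, ..., t-1
    return v[..., 1:] - partial_mean
```

For the series 1, 2, 3, 4 the adjusted values are 1, 1.5 and 2, since the partial means at $t = 2, 3, 4$ are 1, 1.5 and 2.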

5 Monte Carlo Experiments

Our main objective is to investigate the small sample properties of the CCEMG estimator and its bias-corrected versions in panel ARDL models under different assumptions concerning the parameter values and the degree of cross-sectional dependence. We also examine the robustness of the quasi maximum likelihood estimator (QMLE) developed by Moon and Weidner (2010a and 2010b) and the interactive-effects estimator (IFE) proposed by Bai (2009) to coefficient heterogeneity, and include an alternative MG estimator based on Song’s extension of Bai’s IFE approach (denoted as $\hat{\pi}^{s}_{MG}$) and investigate its performance as well.

We start with the description of the data generating process in subsection 5.1, followed by a summary account of the different estimators being considered in subsection 5.2, before providing a summary of our main findings in the final subsection.

5.1 Data Generating Process

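We set $k_x = k_g = 1$ and write (1)-(3) as

$$y_{it} = c_{yi} + \phi_i y_{i,t-1} + \beta_{0i} x_{it} + \beta_{1i} x_{i,t-1} + u_{it}, \quad u_{it} = \gamma_i' f_t + \varepsilon_{it}, \qquad (33)$$

and

$$\begin{pmatrix} x_{it} \\ g_{it} \end{pmatrix} = \begin{pmatrix} c_{xi} \\ c_{gi} \end{pmatrix} + \begin{pmatrix} \alpha_{xi} \\ \alpha_{gi} \end{pmatrix} y_{i,t-1} + \begin{pmatrix} \gamma_{xi}' \\ \gamma_{gi}' \end{pmatrix} f_t + \begin{pmatrix} v_{xit} \\ v_{git} \end{pmatrix}. \qquad (34)$$

The unobserved common factors in $f_t$ and the unit-specific components $v_{it} = (v_{xit}, v_{git})'$ are generated as independent stationary AR(1) processes:

$$f_{t\ell} = \rho_{f\ell} f_{t-1,\ell} + \varsigma_{ft\ell}, \quad \varsigma_{ft\ell} \sim IIDN(0, 1 - \rho_{f\ell}^2), \qquad (35)$$
$$v_{xit} = \rho_{xi} v_{xi,t-1} + \varsigma_{xit}, \quad \varsigma_{xit} \sim IIDN(0, \sigma^2_{vxi}), \qquad (36)$$
$$v_{git} = \rho_{gi} v_{gi,t-1} + \varsigma_{git}, \quad \varsigma_{git} \sim IIDN(0, \sigma^2_{vgi}), \qquad (37)$$

for $i = 1, 2, \ldots, N$, $\ell = 1, 2, \ldots, m$, and for $t = -99, \ldots, 0, 1, 2, \ldots, T$ with the starting values $f_{\ell,-100} = 0$ and $v_{xi,-100} = v_{gi,-100} = 0$. The first 100 time observations ($t = -99, -98, \ldots, 0$) are discarded. We generate $\rho_{xi}$ and $\rho_{gi}$, for $i = 1, 2, \ldots, N$, as $IIDU[0, 0.95]$, and consider two values for $\rho_{f\ell}$, representing the case of serially uncorrelated factors, $\rho_{f\ell} = 0$, for $\ell = 1, 2, \ldots, m$, and the case of serially correlated factors, $\rho_{f\ell} = 0.6$, for $\ell = 1, 2, \ldots, m$. We set $\sigma^2_{vxi} = \sigma^2_{vgi} = \sigma^2_{vi}$, allow $\sigma_{vi}$ to be correlated with $\beta_{0i}$, and set $\sigma_{vi} = \beta_{0i}\sqrt{1 - [E(\rho_{xi})]^2}$.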

As before, we let $z_{it} = (y_{it}, x_{it}, g_{it})'$, and write the data generating process for $z_{it}$ more compactly as (see (6)),

$$z_{it} = c_{zi} + A_i z_{i,t-1} + A_{0i}^{-1} C_i f_t + A_{0i}^{-1} e_{it}, \qquad (38)$$

where $c_{zi} = (c_{yi} + \beta_{0i} c_{xi}, c_{xi}, c_{gi})'$,

$$A_i = \begin{pmatrix} \phi_i + \beta_{0i}\alpha_{xi} & \beta_{1i} & 0 \\ \alpha_{xi} & 0 & 0 \\ \alpha_{gi} & 0 & 0 \end{pmatrix}, \quad A_{0i} = \begin{pmatrix} 1 & -\beta_{0i} & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad C_i = (\gamma_i, \gamma_{xi}, \gamma_{gi})',$$

and $e_{it} = (\varepsilon_{it} + \beta_{0i} v_{xit}, v_{xit}, v_{git})'$ is a serially correlated error vector. We generate $z_{it}$ for $i = 1, 2, \ldots, N$, and $t = -99, \ldots, 0, 1, 2, \ldots, T$ based on (38) with the starting values $z_{i,-100} = 0$, and the first 100 time observations ($t = -99, -98, \ldots, 0$) are discarded as burn-in replications. The fixed effects are generated as $c_{yi} \sim IIDN(1, 1)$, $c_{xi} = c_{yi} + \varsigma_{c_x i}$, and $c_{gi} = c_{yi} + \varsigma_{c_g i}$, where $\varsigma_{c_x i}, \varsigma_{c_g i} \sim IIDN(0, 1)$, thus allowing for dependence between $(x_{it}, g_{it})'$ and $c_{yi}$.

For each $i$ the process $\{z_{it}\}$ is stationary if $f_t$ and $e_{it}$ are stationary and the eigenvalues of $A_i$ lie inside the unit circle. More specifically, the parameter choices for $|\lambda_1(A_i)| < 1$ have to be such that

$$\frac{1}{2}\left|\phi_i + \alpha_{xi}\beta_{0i} \pm \sqrt{(\phi_i + \alpha_{xi}\beta_{0i})^2 + 4\beta_{1i}\alpha_{xi}}\right| < 1.$$

Suppose now that we only consider positive values of $\phi_i$, $\alpha_{xi}$ and $\beta_{0i}$, such that $\phi_i + \alpha_{xi}\beta_{0i} < 2$. Then it is easily seen that sufficient stationary conditions are

$$(\beta_{0i} + \beta_{1i})\,\alpha_{xi} < 1 - \phi_i,$$
$$(\beta_{1i} - \beta_{0i})\,\alpha_{xi} < 1 + \phi_i.$$

Accordingly, we set $\beta_{1i} = -0.5$ for all $i$, and generate $\beta_{0i}$ as $IIDU(0.5, 1)$. When $\alpha_{xi} > 0$, we need to generate $\alpha_{xi}$ such that $0.5\,\alpha_{xi} < 1 - \phi_i$. We consider two possibilities for $\phi_i$: low values, where $\phi_i$ are generated as $IIDU(0, 0.8)$ and $\alpha_{xi}$ as $IIDU(0, 0.35)$; and high values, where we use the draws $\phi_i \sim IIDU(0.5, 0.9)$ and $\alpha_{xi} \sim IIDU(0, 0.15)$. These choices ensure that the support of $\lambda_1(A_i)$ lies strictly inside the unit circle, as required by Assumption 5. Values of $\alpha_{gi}$ do not affect the eigenvalues of $A_i$ and are generated as $\alpha_{gi} \sim IIDU(0, 1)$.
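The stationarity requirement can be checked numerically. The sketch below builds the matrix $A_i$ of (38) for random parameter draws and reports the largest eigenvalue modulus found. It assumes $\beta_{1i} = -0.5$ and the two designs described above; the function names `companion`, `max_modulus` and `worst_case_modulus` are hypothetical and introduced only for this illustration.

```python
import numpy as np

def companion(phi, beta0, beta1, alpha_x, alpha_g):
    """Matrix A_i of the compact representation (38)."""
    return np.array([[phi + beta0 * alpha_x, beta1, 0.0],
                     [alpha_x, 0.0, 0.0],
                     [alpha_g, 0.0, 0.0]])

def max_modulus(A):
    """Largest eigenvalue modulus of A."""
    return float(np.abs(np.linalg.eigvals(A)).max())

def worst_case_modulus(n_draws, high_phi, seed=0):
    """Largest modulus over random draws from the low or high phi design."""
    rng = np.random.default_rng(seed)
    phi_lo, phi_hi, ax_hi = (0.5, 0.9, 0.15) if high_phi else (0.0, 0.8, 0.35)
    worst = 0.0
    for _ in range(n_draws):
        A = companion(rng.uniform(phi_lo, phi_hi),   # phi_i
                      rng.uniform(0.5, 1.0),         # beta_0i
                      -0.5,                          # beta_1i
                      rng.uniform(0.0, ax_hi),       # alpha_xi
                      rng.uniform(0.0, 1.0))         # alpha_gi
        worst = max(worst, max_modulus(A))
    return worst
```

At the corner of the low design ($\phi_i = 0.8$, $\beta_{0i} = 1$, $\alpha_{xi} = 0.35$) the nonzero roots solve $\lambda^2 - 1.15\lambda + 0.175 = 0$, so the largest root is about 0.97, inside the unit circle.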

The above DGP is more general than the DGPs used in other MC experiments in the literature and allows for weakly exogenous regressors. The factors and regressors are allowed to be correlated and persistent, and correlated fixed effects are included.

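All factor loadings are generated independently as

$$\gamma_{i\ell} = \gamma_{\ell} + \eta_{i,\gamma\ell}, \quad \eta_{i,\gamma\ell} \sim IIDN(0, \sigma^2_{\gamma\ell}),$$
$$\gamma_{xi\ell} = \gamma_{x\ell} + \eta_{i,\gamma x\ell}, \quad \eta_{i,\gamma x\ell} \sim IIDN(0, \sigma^2_{\gamma x\ell}),$$
$$\gamma_{gi\ell} = \gamma_{g\ell} + \eta_{i,\gamma g\ell}, \quad \eta_{i,\gamma g\ell} \sim IIDN(0, \sigma^2_{\gamma g\ell}),$$

for $\ell = 1, 2, \ldots, m$, and $i = 1, 2, \ldots, N$. Also, without loss of generality, the factor loadings are calibrated so that $Var(\gamma_i' f_t) = Var(\gamma_{xi}' f_t) = Var(\gamma_{gi}' f_t) = 1$. We also set $\sigma^2_{\gamma\ell} = \sigma^2_{\gamma x\ell} = \sigma^2_{\gamma g\ell} = 0.2^2$, $\gamma_{\ell} = \sqrt{b_{\gamma}}$, $\gamma_{x\ell} = \sqrt{\ell\, b_{x}}$ and $\gamma_{g\ell} = \sqrt{(2\ell - 1)\, b_{g}}$, for $\ell = 1, 2, \ldots, m$, where $b_{\gamma} = 1/m - \sigma^2_{\gamma\ell}$, $b_{x} = 2/[m(m+1)] - 2\sigma^2_{\gamma x\ell}/(m+1)$ and $b_{g} = 1/m^2 - \sigma^2_{\gamma g\ell}/m$. This ensures that the contribution of the unobserved factors to the variance of $y_{it}$ does not rise with $m$. We consider $m = 1, 2$ or 3 unobserved common factors.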
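The calibration can be verified directly: with independent, zero-mean, unit-variance factors that are independent of the loadings, $Var(\gamma_i' f_t) = \sum_{\ell=1}^{m} (\gamma_{\ell}^2 + \sigma^2_{\gamma\ell})$, and similarly for the other two sets of loadings. The Python sketch below computes the loading means implied by the formulas above and the resulting variance; the function names `loading_means` and `implied_variance` are assumptions of this example.

```python
import numpy as np

def loading_means(m, sigma2=0.2 ** 2):
    """Means of the loadings on y, x and g for m factors."""
    ell = np.arange(1, m + 1)
    b_gamma = 1.0 / m - sigma2
    b_x = 2.0 / (m * (m + 1)) - 2.0 * sigma2 / (m + 1)
    b_g = 1.0 / m ** 2 - sigma2 / m
    gamma = np.sqrt(np.full(m, b_gamma))
    gamma_x = np.sqrt(ell * b_x)
    gamma_g = np.sqrt((2 * ell - 1) * b_g)
    return gamma, gamma_x, gamma_g

def implied_variance(means, sigma2=0.2 ** 2):
    """Variance of loading' f_t when factors are independent with unit variance."""
    return float(np.sum(np.asarray(means) ** 2 + sigma2))
```

For each of $m = 1, 2, 3$ the three implied variances equal one, which is the sense in which the factor contribution does not rise with $m$.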

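Finally, the idiosyncratic errors, $\varepsilon_{it}$, are generated to be heteroskedastic and weakly cross-sectionally dependent. Specifically, we adopt the following spatial autoregressive model (SAR) to generate $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{Nt})'$:

$$\varepsilon_t = a_{\varepsilon} S_{\varepsilon} \varepsilon_t + e_{\varepsilon t}, \qquad (39)$$

where the elements of $e_{\varepsilon t}$ are drawn as $IIDN(0, \tfrac{1}{2}\sigma_i^2)$, with $\sigma_i^2$ obtained as independent draws from the $\chi^2(2)$ distribution,

$$S_{\varepsilon} = \begin{pmatrix} 0 & \frac{1}{2} & 0 & \cdots & 0 & 0 \\ \frac{1}{2} & 0 & \frac{1}{2} & \cdots & 0 & 0 \\ 0 & \frac{1}{2} & 0 & \ddots & \vdots & \vdots \\ \vdots & \vdots & \ddots & \ddots & \frac{1}{2} & 0 \\ 0 & 0 & \cdots & \frac{1}{2} & 0 & \frac{1}{2} \\ 0 & 0 & \cdots & 0 & \frac{1}{2} & 0 \end{pmatrix},$$

and the spatial autoregressive parameter is set to $a_{\varepsilon} = 0.4$. Note that $\{\varepsilon_{it}\}$ is cross-sectionally weakly dependent for $|a_{\varepsilon}| < 0.5$.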
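Since (39) implies $\varepsilon_t = (I_N - a_{\varepsilon} S_{\varepsilon})^{-1} e_{\varepsilon t}$, the errors can be generated by solving a linear system. The sketch below is illustrative only: it assumes the weight matrix has $\tfrac{1}{2}$ on both off-diagonals in every row, including the first and last, and the names `sar_weights` and `simulate_sar_errors` are hypothetical.

```python
import numpy as np

def sar_weights(n):
    """Spatial weight matrix with 1/2 on the first off-diagonals (assumed form)."""
    S = np.zeros((n, n))
    idx = np.arange(n - 1)
    S[idx, idx + 1] = 0.5
    S[idx + 1, idx] = 0.5
    return S

def simulate_sar_errors(n, t, a=0.4, seed=0):
    """Draw heteroskedastic innovations and solve (I - a S) eps = e."""
    rng = np.random.default_rng(seed)
    sigma2 = rng.chisquare(2, size=n)                    # unit-specific variances
    e = rng.standard_normal((n, t)) * np.sqrt(0.5 * sigma2)[:, None]
    S = sar_weights(n)
    eps = np.linalg.solve(np.eye(n) - a * S, e)          # reduced form of (39)
    return eps, e, S
```

Solving the system rather than inverting the matrix explicitly is a numerical convenience and does not change the generated errors.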

In addition to these experiments, we also consider pure panel autoregressive experiments where we set $\beta_{0i} = \beta_{1i} = 0$, for all $i$. Table 1 summarizes the various parameter configurations of all the different experiments. In total, we conducted 24 experiments covering the various cases: with or without regressors in the equation for the dependent variable, low or high values of $\phi = E(\phi_i)$, $m = 1, 2$ or 3 common factors, and persistent or serially uncorrelated common factors. We consider the following combinations of sample sizes: $N, T \in \{40, 50, 100, 150, 200\}$, and set the number of replications to $R = 2000$ in all experiments.

5.2 Estimation techniques

The focus of the MC results will be on the estimates of the average parameter values $\phi = E(\phi_i)$ and $\beta_0 = E(\beta_{0i})$, in the case of experiments with regressors, $x_{it}$. But before presenting the outcomes we briefly describe the computation of the alternative estimators being considered.9

5.2.1 Dynamic CCE mean group estimator

We base the CCE mean group estimator on the following cross-sectionally augmented unit-specific regressions,

$$y_{it} = c_{yi} + \phi_i y_{i,t-1} + \beta_{0i} x_{it} + \beta_{1i} x_{i,t-1} + \sum_{\ell=0}^{p_T} \delta_{i\ell}' \bar{z}_{t-\ell} + e_{yit}, \qquad (40)$$

for $i = 1, 2, \ldots, N$, where $\bar{z}_t = N^{-1}\sum_{i=1}^{N} z_{it} = (\bar{y}_t, \bar{x}_t, \bar{g}_t)'$. We set $p_T$ equal to the integer part of $T^{1/3}$, denoted as $p_T = [T^{1/3}]$. This gives the values of $p_T = 3, 3, 4, 5, 5$ for $T = 40, 50, 100, 150, 200$, respectively. The CCE mean group estimator of $\phi$ and $\beta_0$ is then obtained by arithmetic averages of the least squares estimates of $\phi_i$ and $\beta_{0i}$ based on (40).

9 We are grateful to Jushan Bai, Hyungsik Roger Moon, and Martin Weidner for providing us with their Matlab codes.

We also computed bias-corrected versions of the CCEMG estimator using the half-panel jackknife and the recursive mean adjusted estimators as described in Section 4.1.
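A compact Python sketch of the augmented regressions (40) with $k_x = k_g = 1$ may help fix ideas. It is not the code used for the experiments reported in the paper: the function names `lag_order` and `ccemg`, the array layout (units in rows, time in columns) and the choice to drop the first $p_T$ periods so that all lags are available are assumptions of this example.

```python
import numpy as np

def lag_order(T):
    """Integer part of T**(1/3), guarded against floating-point rounding."""
    p = int(round(T ** (1.0 / 3.0)))
    while p ** 3 > T:
        p -= 1
    while (p + 1) ** 3 <= T:
        p += 1
    return p

def ccemg(y, x, g):
    """Unit-by-unit least squares on (40), then a simple average.

    y, x, g: (N, T) arrays. Returns the mean group estimates of
    (phi, beta_0, beta_1) and the (N, 3) array of unit estimates.
    """
    y, x, g = (np.asarray(a, dtype=float) for a in (y, x, g))
    n_units, n_periods = y.shape
    p = lag_order(n_periods)
    # Cross-section averages of z_it = (y_it, x_it, g_it)', one row per period.
    zbar = np.column_stack([y.mean(axis=0), x.mean(axis=0), g.mean(axis=0)])
    t = np.arange(max(p, 1), n_periods)         # periods with all lags available
    zlags = np.hstack([zbar[t - lag] for lag in range(p + 1)])
    const = np.ones(t.size)
    unit_est = np.empty((n_units, 3))
    for i in range(n_units):
        W = np.column_stack([const, y[i, t - 1], x[i, t], x[i, t - 1], zlags])
        coef = np.linalg.lstsq(W, y[i, t], rcond=None)[0]
        unit_est[i] = coef[1:4]                 # (phi_i, beta_0i, beta_1i)
    return unit_est.mean(axis=0), unit_est
```

The guard in `lag_order` matters for perfect cubes such as $T = 64$, where a naive floating-point cube root can fall just below 4.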

5.2.2 QMLE estimator by Moon and Weidner

We deal with fixed effects by demeaning the variables before implementing the QMLE estimation procedure. Denote the demeaned variables as

$$\dot{y}_{it} = y_{it} - T^{-1}\sum_{t=1}^{T} y_{it}, \quad \text{and} \quad \dot{x}_{it} = x_{it} - T^{-1}\sum_{t=1}^{T} x_{it}, \qquad (41)$$

for $s = 1, 2$ and $i = 1, 2, \ldots, N$. We compute the bias-corrected QMLE estimator defined in Corollary 3.7 in Moon and Weidner (2010a) using $\dot{y}_{it}$ as the dependent variable and the vector $\dot{z}_{it} = (\dot{y}_{i,t-1}, \dot{x}_{it}, \dot{x}_{i,t-1})'$ as the vector of explanatory variables. Two options for the number of unobserved factors are considered: the true number of factors and the maximum number, 3, of unobserved factors.

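5.2.3 Interactive-effects estimator by Bai

We deal with the fixed effects in the same way as before. In particular, we use the demeaned variables $\dot{y}_{it}$ and $\dot{x}_{it,s}$ for $s = 1, 2$, to compute the interactive-effects estimator as the solution to the following set of non-linear equations:

$$\hat{\pi}_b = \left(\sum_{i=1}^{N} \dot{\Xi}_i' M_{\hat{F}} \dot{\Xi}_i\right)^{-1} \sum_{i=1}^{N} \dot{\Xi}_i' M_{\hat{F}}\, \dot{y}_i, \qquad (42)$$

$$\frac{1}{NT}\sum_{i=1}^{N} \left(\dot{y}_i - \dot{\Xi}_i \hat{\pi}_b\right)\left(\dot{y}_i - \dot{\Xi}_i \hat{\pi}_b\right)' \hat{F} = \hat{F}\hat{V}, \qquad (43)$$

where $\hat{\pi}_b = (\hat{\phi}_b, \hat{\beta}_{0b}, \hat{\beta}_{1b})'$ is the interactive-effects estimator, $M_{\hat{F}} = I_T - \hat{F}(\hat{F}'\hat{F})^{-1}\hat{F}'$, $\hat{V}$ is a diagonal matrix with the $m$ largest eigenvalues of the matrix $\frac{1}{NT}\sum_{i=1}^{N} (\dot{y}_i - \dot{\Xi}_i \hat{\pi}_b)(\dot{y}_i - \dot{\Xi}_i \hat{\pi}_b)'$ arranged in decreasing order, $\dot{y}_i = (\dot{y}_{i2}, \dot{y}_{i3}, \ldots, \dot{y}_{iT})'$ and

$$\dot{\Xi}_i = \begin{pmatrix} \dot{y}_{i1} & \dot{x}_{i2} & \dot{x}_{i1} \\ \dot{y}_{i2} & \dot{x}_{i3} & \dot{x}_{i2} \\ \vdots & \vdots & \vdots \\ \dot{y}_{i,T-1} & \dot{x}_{iT} & \dot{x}_{i,T-1} \end{pmatrix}.$$

The system of equations (42)-(43) is solved by an iterative method.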
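One simple way to iterate between (42) and (43) is to alternate a principal components step for $\hat{F}$ with a pooled least squares step for the slope vector. The Python sketch below illustrates that idea under stated assumptions: it is not the Matlab code used in the paper, the function name `interactive_effects` is hypothetical, the starting value is pooled least squares, and $\hat{F}$ is normalized to have orthonormal columns, which leaves the projection $M_{\hat{F}}$ unchanged.

```python
import numpy as np

def interactive_effects(y, X, m, max_iter=500, tol=1e-9):
    """Alternating solution of (42)-(43) for a common slope vector.

    y: (N, T) dependent variable; X: (N, T, k) regressors; m: number of factors.
    Returns the slope estimate b (length k) and the factor estimate F (T, m).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n_units, n_periods, k = X.shape
    # Starting value: pooled least squares ignoring the factors.
    b = np.linalg.lstsq(X.reshape(-1, k), y.reshape(-1), rcond=None)[0]
    F = np.zeros((n_periods, m))
    for _ in range(max_iter):
        resid = y - X @ b                              # (N, T) residuals
        S = resid.T @ resid / (n_units * n_periods)    # matrix in (43)
        F = np.linalg.eigh(S)[1][:, -m:]               # eigenvectors, m largest
        M = np.eye(n_periods) - F @ F.T                # projection off F
        A = np.einsum('itk,ts,isl->kl', X, M, X)       # sum_i X_i' M X_i
        c = np.einsum('itk,ts,is->k', X, M, y)         # sum_i X_i' M y_i
        b_new = np.linalg.solve(A, c)                  # equation (42)
        converged = np.max(np.abs(b_new - b)) < tol
        b = b_new
        if converged:
            break
    return b, F
```

Each step lowers the sum of squared residuals given the other block, so the iteration settles at a stationary point, although different starting values can in principle lead to different solutions.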

Bai (2009) does not allow for a lagged dependent variable in the derivation of the asymptotic results for the interactive-effects estimator, but considers this possibility in Monte Carlo experiments and concludes that parameters are also well estimated for the DGP with a lagged dependent variable. As in the case of the QMLE estimator, we consider Bai’s estimates based on the true number of factors, and on the maximum number of factors, namely 3.

5.2.4 Mean Group estimator based on Song’s extension of Bai’s IFE approach

Song (2013) extends Bai’s IFE approach by allowing for coefficient heterogeneity and lags of the dependent variable. Song focuses on the estimates of individual coefficients obtained from the solution to the following system of nonlinear equations, which, as he shows, minimizes the sum of squared errors,

$$\hat{\pi}^{s}_i = \left(\dot{\Xi}_i' M_{\hat{F}} \dot{\Xi}_i\right)^{-1} \dot{\Xi}_i' M_{\hat{F}}\, \dot{y}_i, \quad \text{for } i = 1, 2, \ldots, N, \qquad (44)$$

$$\frac{1}{NT}\sum_{i=1}^{N} \left(\dot{y}_i - \dot{\Xi}_i \hat{\pi}^{s}_i\right)\left(\dot{y}_i - \dot{\Xi}_i \hat{\pi}^{s}_i\right)' \hat{F} = \hat{F}\hat{V}. \qquad (45)$$

Similarly to Bai’s IFE procedure, we use demeaned observations to deal with the presence of fixed effects, and the system of equations (44)-(45) is solved numerically by an iterative method. Song (2013) establishes $\sqrt{T}$ consistency rates of individual estimates $\hat{\pi}^{s}_i$ under asymptotics $N, T \overset{j}{\to} \infty$ such that $T/N^2 \to 0$.

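Given our random coefficient assumption on $\pi_i$, we adopt the following mean group estimator based on Song’s individual estimates,

$$\hat{\pi}^{s}_{MG} = \frac{1}{N}\sum_{i=1}^{N} \hat{\pi}^{s}_i,$$

and investigate the performance of $\hat{\pi}^{s}_{MG}$ with its variance estimated nonparametrically by

$$\hat{\Sigma}^{s}_{MG} = \frac{1}{N-1}\sum_{i=1}^{N} \left(\hat{\pi}^{s}_i - \hat{\pi}^{s}_{MG}\right)\left(\hat{\pi}^{s}_i - \hat{\pi}^{s}_{MG}\right)'.$$

Note that since $\sqrt{T}(\hat{\pi}^{s}_i - \pi_i) = O_p(1)$ (uniformly in $i$) as $N, T \overset{j}{\to} \infty$ such that $T/N^2 \to 0$ (see Song, 2013, Theorem 2), it readily follows that (also see Assumption 4)

$$\hat{\pi}^{s}_{MG} = \frac{1}{N}\sum_{i=1}^{N} \pi_i + O_p\!\left(\frac{1}{\sqrt{T}}\right).$$

However, sufficient conditions for $\sqrt{N}(\hat{\pi}^{s}_{MG} - \pi) \overset{d}{\to} N(0, \Omega_{\pi})$ as $N, T \overset{j}{\to} \infty$ remain to be investigated, and this is outside the scope of the present paper.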

6 Monte Carlo findings

In this section we report some of the main findings, and direct the reader to an online Supplement where the full set of results can be accessed.

Table 2 summarizes the results for the bias (×100) and root mean square error (RMSE, ×100) in the case of the experiment with regressors, $\phi = E(\phi_i) = 0.4$, and one serially correlated unobserved common factor (Experiment 14 in Table 1). The first panel of this table gives the results for the fixed effects estimator (FE), which provides a benchmark against three sources of estimation bias: the time series bias of order $T^{-1}$, the bias from ignoring a serially correlated factor, and the bias due to coefficient (slope) heterogeneity. The latter two biases do not diminish in $T$, and we see that their combined effect remains substantial even for $T = 200$.

Next consider the QMLE estimator due to Moon and Weidner, which allows for unobserved factors, but fails to account for coefficient heterogeneity. As can be seen, this estimator still suffers from a substantial degree of heterogeneity bias which does not diminish in $T$. This is in line with the theoretical results derived in Pesaran and Smith (1995), where it is shown that in the presence of slope heterogeneity pooled least squares estimators are inconsistent in the case of panel data models with lagged dependent variables. This would have been the case even if the unobserved factors could have been estimated without any sampling errors. Initially, for $T = 40$, negative time series bias helps the performance of QMLE in our design, but as $T$ increases, the time series bias diminishes and the positive coefficient heterogeneity bias dominates the outcomes. The bias for $T = 200$ ranges from 0.07 to 0.10, which amounts to 20-25% of the true value. Inclusion of 3 as opposed to 1 unobserved common factor improves the performance but does not fully mitigate the consequences of coefficient heterogeneity. Results for Bai’s IFE approach are similar to those of QMLE and are therefore reported only in the online Supplement to save space.

In contrast, the CCEMG estimator deals with the presence of persistent factors and coefficient heterogeneity, but fails to adequately take account of the time series bias. As can be seen from the results, the uncorrected CCEMG estimator suffers from the time series bias when $T$ is small, with the bias diminishing as $T$ is increased. The sign of the bias is negative, which is in line with the existing literature. The bias of the CCEMG estimator is around $-0.12$ for $T = 40$, and declines to around $-0.02$ when $T = 200$.

Both bias correction methods considered are effective in reducing the time series bias of the CCEMG estimator, but the jackknife bias correction method turns out to be more successful overall. It is also interesting that the jackknife correction tends to slightly over-correct whereas the RMA procedure tends to under-correct. Both bias-correction methods also reduced the overall RMSE for all values of N and T considered.

The mean group estimator based on Song’s individual estimates performs slightly worse than

the jackknife bias-corrected CCEMG, but overall its performance (in terms of bias and RMSE)

seems to be satisfactory. The knowledge of the true number of factors, however, plays a very

important role in improving the performance of this estimator.

Table 3 reports findings for estimation of $\beta_0$ in the same experiment. As before, the FE and QMLE estimators continue to be biased even when $T$ is large. The selection of the number of factors seems to be quite important for the bias of the QMLE estimator (and also Bai’s IFE estimator reported in the Supplement). The bias of CCEMG estimators is, in contrast, very small, between 0.0 and 0.02 for all values of $N$ and $T$. Bias correction does not seem to matter for the CCEMG estimation of $\beta_0$. The small sample $O(T^{-1})$ time series bias for the estimation of $\beta_0$ is much smaller than the bias of the autoregressive coefficient. Bias correction therefore seems not so important for the estimation of $\beta_0$, and the uncorrected version of the CCEMG estimator performs better in terms of RMSE than its bias-corrected versions. $\hat{\pi}^{s}_{MG}$ also performs well, although its RMSE is, in the majority of cases, slightly worse than the RMSE of the uncorrected CCEMG estimator.

An important question is how robust the various estimators are to the number of unobserved

factors. The MC results with more than one factor are summarized in Tables 4-7, and show that

the CCEMG estimator continues to work well regardless of the number of factors and whether the

factors are serially correlated. For m = 2 or 3, the performance of the CCEMG estimator and

its bias-corrected versions is qualitatively similar to the case of m = 1 discussed above. Only a

slight deterioration in bias and RMSE is observed when m is increased to 3, most likely due to the

increased complexity encountered in approximating the space spanned by the unobserved common

factors.

To check the validity of the asymptotic distribution of the CCEMG and other estimators, we now consider the size and power performance of the different estimators under consideration. We compute the size (×100) at the 5% nominal level and the power (×100) for the estimation of $\phi$ and $\beta_0$ with the alternatives $H_1: \phi = 0.5$ and $H_1: \phi = 0.8$, associated with the null values of $\phi = 0.4$ and 0.7, respectively, and the alternative of $H_1: \beta_0 = 0.85$, associated with the null value of $\beta_0 = 0.75$. The results for size and power in the case of Experiments 14, 16 and 18 are summarized in Tables 8-13.
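The summary statistics reported in the tables can be computed from the replications along the following lines. This is a hedged sketch rather than the procedure actually used in the paper: it assumes two-sided t-tests with the normal critical value 1.96, and it computes power as the rejection frequency when the alternative value is used as the hypothesized value while the data are generated under the null value, which is one common convention. The function name `mc_summary` is hypothetical.

```python
import numpy as np

def mc_summary(estimates, std_errors, true_value, alt_value, crit=1.96):
    """Bias, RMSE, size and power (all x100) across Monte Carlo replications."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    bias = 100.0 * (est.mean() - true_value)
    rmse = 100.0 * np.sqrt(np.mean((est - true_value) ** 2))
    # Size: rejection frequency when testing the true parameter value.
    size = 100.0 * np.mean(np.abs((est - true_value) / se) > crit)
    # Power: rejection frequency when testing the (false) alternative value.
    power = 100.0 * np.mean(np.abs((est - alt_value) / se) > crit)
    return {"bias": bias, "rmse": rmse, "size": size, "power": power}
```

With mean group estimators the standard errors would typically come from the nonparametric variance estimator, divided by $N$ before taking square roots.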

As can be seen, the tests based on FE and QMLE estimators and Bai’s IFE (reported in the Supplement) are grossly oversized irrespective of whether the parameter of interest is $\phi$ or $\beta_0$. In contrast, the CCEMG estimator and the MG estimator based on Song’s individual estimates have the correct size if one is interested in making inference about $\beta_0$, but both estimators tend to be over-sized if the aim is to make inference about $\phi$. These results are in line with our theoretical findings and largely reflect the time series bias of order $O(T^{-1})$ which is present in the MG type estimators of $\phi$. The bias-corrected versions of the CCEMG estimator perform much better, with the jackknife bias-correction method generally outperforming the RMA procedure. The condition $N/T \to \varkappa_1$, $0 < \varkappa_1 < \infty$, in Theorem 3 plays an important role in ensuring that the tests based on the CCEMG estimator of $\phi$ have the correct size. In particular, the size worsens with an increase in the ratio $N/T$, especially when $T = 40$. Relatively good size (7%-9%) is achieved only when $T > 100$.

As already noted, the size of the tests based on the CCEMG estimator of $\beta_0$ (Tables 9, 11 and 12) is strikingly well behaved in all experiments and is very close to 5 percent for all values of $N$ and $T$, which is in line with the low biases reported for this estimator. Similar results also hold for $\hat{\pi}^{s}_{MG}$, although there are some incidences of size distortions for this MG estimator when $T$ is relatively small (40-50).

Given the importance of the time series bias for the estimation of $\phi$ and inference on $\phi$, it is also reasonable to check the robustness of our findings to higher values of $\phi$. The estimation bias is likely to increase as $\phi$ is increased towards unity. The results for the experiments with $\phi$ set to 0.7 are reported in the online Supplement and, not surprisingly, are generally worse than the results reported in the tables below for $\phi = 0.4$. Once again, however, the estimates of $\beta_0$ tend not to be much affected by the choice of $\phi$.

The results of the experiments with purely autoregressive panel data models (reported in the Supplement) are very similar to the ones discussed above, although the small sample performance of the CCEMG estimator of $\phi$ is slightly better than in the experiments with regressors.

Overall, our findings suggest that when $\beta_0$ is the parameter of interest, the uncorrected CCEMG estimator seems to be preferred (in terms of bias, RMSE, size, and power), whereas the jackknife-corrected CCEMG estimator seems to be preferred for estimation of $\phi$. The time dimension $T$ nevertheless needs to be relatively large in order to obtain a correct size for the tests of $\phi$ based on the CCEMG type estimators of $\phi$, although some marginal improvements can be achieved if the jackknife bias-corrected version of CCEMG is used.

7 Conclusion

This paper extends the Common Correlated Effects (CCE) approach to estimation and inference in panel data models with a multi-factor error structure, originally proposed in Pesaran (2006), by allowing for the inclusion of lagged values of the dependent variable and weakly exogenous regressors in the panel data model. We show that the CCE mean group estimator continues to be valid asymptotically, but the following two conditions must be satisfied to deal with the presence of lagged dependent variables amongst the regressors: a sufficient number of lags of cross-sectional averages must be included in individual equations, and the number of cross-sectional averages must be at least as large as the number of unobserved common factors. The CCE mean group estimator and its jackknife and recursive mean adjustment bias-corrected versions are easily implemented empirically. Results from an extensive set of Monte Carlo experiments show that the homogeneous slope estimators proposed in the literature can be seriously biased in the presence of slope heterogeneity. In contrast, the uncorrected CCEMG estimator proposed in the paper performs well (in terms of bias, RMSE, size and power) if the parameter of interest is the average slope of the regressors ($\beta_0$), even if $N$ and $T$ are relatively small. But the situation is very different if the parameter of interest is the mean coefficient of the lagged dependent variable ($\phi$). In the case of $\phi$, the uncorrected CCEMG estimator suffers from the time series bias and tests based on it tend to be over-sized, unless $T$ is sufficiently large relative to $N$. The jackknife bias-corrected CCEMG estimator, also proposed in the paper, does help in mitigating the time series bias, but it cannot fully deal with the size distortion unless $T$ is sufficiently large. Improving the small sample properties of the CCEMG estimators of $\phi$ in heterogeneous panel data models remains a challenge to be taken on in the future.

Table 1: Parameters of the Monte Carlo Design

Experiments without regressors              Experiments with regressors
($\beta_{0i} = \beta_{1i} = 0$)             ($\beta_{0i} \sim IIDU[0.5, 1]$, $\beta_{1i} = -0.5$)

Exp.  $\phi = E(\phi_i)$  $m$  $\rho_f$     Exp.  $\phi = E(\phi_i)$  $m$  $\rho_f$
 1    0.4                 1    0            13    0.4                 1    0
 2    0.4                 1    0.6          14    0.4                 1    0.6
 3    0.4                 2    0            15    0.4                 2    0
 4    0.4                 2    0.6          16    0.4                 2    0.6
 5    0.4                 3    0            17    0.4                 3    0
 6    0.4                 3    0.6          18    0.4                 3    0.6
 7    0.7                 1    0            19    0.7                 1    0
 8    0.7                 1    0.6          20    0.7                 1    0.6
 9    0.7                 2    0            21    0.7                 2    0
10    0.7                 2    0.6          22    0.7                 2    0.6
11    0.7                 3    0            23    0.7                 3    0
12    0.7                 3    0.6          24    0.7                 3    0.6

Notes: The dependent variable, regressors and covariates are generated according to (33)-(34) with $\phi_i \sim IIDU[0, 0.8]$ (low value of $\phi = E(\phi_i) = 0.4$) or with $\phi_i \sim IIDU[0.5, 0.9]$ (high value of $\phi = E(\phi_i) = 0.7$), with correlated fixed effects, and with cross-sectionally weakly dependent heteroskedastic idiosyncratic innovations generated from a SAR(1) model (39) with $a_{\varepsilon} = 0.4$. All experiments allow for feedback effects with $\alpha_{xi} \sim IIDU[0, 0.35]$ for the low value of $\phi$, $\alpha_{xi} \sim IIDU[0, 0.15]$ for the high value of $\phi$, and $\alpha_{gi} \sim IIDU[0, 1]$ for both values of $\phi$.

Table 2. Estimation of $\phi$ in experiments with regressors, $\phi = E(\phi_i) = 0.4$, and $m = 1$ correlated common factor. (Experiment 14)

Bias (x100) in the first five columns, RMSE (x100) in the last five columns

(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 13.12 14.74 17.83 18.80 19.61 15.48 16.72 19.12 19.83 20.55
50 13.08 14.79 18.07 19.25 19.60 15.13 16.50 19.14 20.12 20.41
100 13.42 15.11 18.29 19.53 20.12 15.08 16.43 19.00 20.12 20.64
150 13.95 15.05 18.47 19.67 20.23 15.47 16.20 19.09 20.09 20.61
200 13.47 15.27 18.64 19.71 20.23 14.89 16.38 19.21 20.11 20.57
Dynamic CCEMG without bias correction
40 -10.93 -8.25 -3.31 -1.98 -1.18 11.86 9.35 5.12 4.37 3.93
50 -11.12 -8.34 -3.61 -2.02 -1.30 11.88 9.23 5.02 4.05 3.74
100 -11.73 -9.04 -3.99 -2.41 -1.59 12.12 9.44 4.69 3.41 2.88
150 -12.06 -9.25 -4.22 -2.60 -1.76 12.33 9.54 4.68 3.25 2.62
200 -12.13 -9.37 -4.32 -2.68 -1.94 12.35 9.60 4.67 3.17 2.56
Dynamic CCEMG with RMA bias correction
40 -8.58 -5.82 -2.20 -0.84 -0.50 10.23 7.63 4.66 3.98 3.91
50 -8.55 -5.97 -2.14 -1.18 -0.57 9.92 7.47 4.24 3.77 3.44
100 -9.08 -6.17 -2.36 -1.25 -0.80 9.81 6.92 3.54 2.73 2.59
150 -9.29 -6.55 -2.40 -1.48 -0.89 9.80 7.06 3.24 2.49 2.22
200 -9.44 -6.75 -2.61 -1.47 -1.01 9.88 7.13 3.24 2.28 2.03
Dynamic CCEMG with jackknife bias correction
40 3.82 2.64 1.74 1.21 0.85 9.96 7.18 4.91 4.41 4.09
50 4.02 2.66 1.59 1.19 0.77 9.26 6.62 4.38 3.96 3.79
100 3.91 2.35 1.40 0.97 0.66 7.64 4.96 3.23 2.83 2.62
150 3.73 2.48 1.30 0.90 0.59 6.93 4.64 2.72 2.32 2.15
200 4.04 2.52 1.27 0.88 0.47 6.78 4.41 2.45 2.05 1.83
MG based on Song’s individual estimates with 3 factors
40 -9.15 -6.77 -2.74 -1.38 -0.90 10.91 8.58 5.11 4.12 4.03
50 -9.48 -7.03 -2.76 -1.50 -0.95 10.81 8.38 4.52 3.84 3.54
100 -10.20 -7.32 -2.85 -1.72 -1.21 10.85 7.98 3.85 3.00 2.75
150 -10.53 -7.56 -2.98 -1.79 -1.27 10.99 8.02 3.69 2.74 2.33
200 -10.85 -7.78 -3.05 -1.85 -1.36 11.21 8.13 3.58 2.55 2.21
MG based on Song with true number of factors (m = 1)
40 -5.34 -3.95 -1.46 -0.40 -0.01 7.57 6.31 4.55 3.98 3.96
50 -6.03 -4.58 -1.76 -0.79 -0.28 7.61 6.33 4.06 3.60 3.43
100 -7.09 -5.47 -2.36 -1.40 -0.99 7.76 6.17 3.49 2.83 2.65
150 -7.27 -5.70 -2.56 -1.59 -1.11 7.71 6.17 3.33 2.60 2.24
200 -7.43 -5.87 -2.67 -1.67 -1.24 7.76 6.22 3.23 2.41 2.13
Moon and Weidner’s QMLE with 3 factors
40 -2.67 0.94 5.73 7.30 7.73 8.93 7.99 8.68 9.55 9.82
50 -3.34 0.37 5.82 7.23 7.86 8.46 7.04 8.20 9.18 9.62
100 -4.66 -0.57 5.65 7.28 7.99 7.58 5.21 7.06 8.34 8.96
150 -5.74 -1.14 5.38 7.15 8.04 7.71 4.61 6.44 7.87 8.69
200 -6.05 -1.70 5.35 7.05 7.81 7.65 4.31 6.18 7.64 8.32
Moon and Weidner’s QMLE with true number of factors (m = 1)
40 1.87 3.62 6.87 8.08 8.48 8.30 8.56 9.79 10.37 10.74
50 1.83 3.89 7.20 8.23 8.76 7.58 8.08 9.60 10.38 10.77
100 1.99 3.82 7.45 8.67 9.18 5.92 6.45 8.79 9.79 10.21
150 2.24 4.00 7.47 8.66 9.31 5.12 5.88 8.46 9.42 10.02
200 2.36 4.10 7.72 8.83 9.32 5.00 5.68 8.46 9.44 9.87

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).
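
The bias and RMSE entries in tables of this kind are summaries over Monte Carlo replications. The sketch below assumes the standard definitions (mean estimation error, and root mean squared estimation error, each multiplied by 100). The function name `bias_rmse_x100` is hypothetical and the number of replications is not taken from this section.

```python
import numpy as np

def bias_rmse_x100(estimates, true_value):
    """Return (bias x 100, RMSE x 100) over Monte Carlo replications.

    estimates  : one estimate per replication
    true_value : the parameter value used to generate the data
    Standard definitions are assumed here.
    """
    errors = np.asarray(estimates, dtype=float) - true_value
    bias = 100.0 * errors.mean()
    rmse = 100.0 * np.sqrt(np.mean(errors ** 2))
    return bias, rmse
```

For example, two replications with estimates 0.3 and 0.5 around a true value of 0.4 give a bias of zero and an RMSE (x100) of 10, which shows why the two columns need to be read together.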

Table 3. Estimation of $\beta_0$ in experiments with regressors, $\phi = E(\phi_i) = 0.4$, and $m = 1$
correlated common factor. (Experiment 14)

Bias (x100) RMSE (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 51.52 51.31 51.66 51.37 51.49 51.96 51.68 51.88 51.54 51.64
50 50.96 51.08 51.27 51.25 51.33 51.37 51.42 51.47 51.40 51.46
100 51.07 51.13 51.36 51.13 51.35 51.40 51.39 51.52 51.24 51.43
150 51.22 51.11 51.25 51.22 51.32 51.54 51.36 51.38 51.32 51.39
200 50.99 51.28 51.20 51.09 51.20 51.27 51.51 51.32 51.17 51.27
Dynamic CCEMG without bias correction
40 1.37 1.14 0.69 0.45 0.18 5.92 5.28 3.70 3.30 3.08
50 1.05 0.82 0.48 0.28 0.27 5.48 4.59 3.37 2.93 2.84
100 1.11 0.92 0.58 0.30 0.23 3.92 3.37 2.45 2.15 1.93
150 1.23 1.05 0.46 0.26 0.28 3.34 2.88 1.98 1.77 1.61
200 1.24 0.97 0.50 0.33 0.26 2.97 2.51 1.77 1.52 1.37
Dynamic CCEMG with RMA bias correction
40 1.34 0.91 0.60 0.60 0.36 6.84 5.81 4.05 3.43 3.12
50 1.31 1.11 0.55 0.39 0.49 6.06 4.99 3.56 3.02 2.79
100 1.22 0.99 0.66 0.44 0.24 4.50 3.50 2.53 2.24 1.94
150 1.13 0.96 0.56 0.41 0.37 3.59 3.12 2.14 1.81 1.69
200 1.10 0.97 0.53 0.44 0.32 3.27 2.71 1.84 1.64 1.41
Dynamic CCEMG with jackknife bias correction
40 1.60 0.98 0.36 0.20 0.03 12.04 8.25 4.42 3.69 3.29
50 0.85 0.34 0.07 0.11 0.14 11.21 7.32 4.11 3.32 3.03
100 0.58 0.70 0.22 0.00 0.01 7.71 5.42 2.98 2.36 2.07
150 0.97 0.55 0.08 -0.06 0.07 6.49 4.32 2.38 1.99 1.71
200 0.84 0.52 0.08 0.03 0.02 5.65 3.88 2.08 1.68 1.44
MG based on Song’s individual estimates with 3 factors
40 0.10 0.51 0.42 0.44 0.49 8.13 6.45 4.12 3.60 3.50
50 0.29 0.54 0.31 0.38 0.32 6.81 5.40 3.69 3.12 2.90
100 0.49 0.42 0.30 0.35 0.29 4.21 3.58 2.51 2.22 1.95
150 0.56 0.44 0.35 0.27 0.21 3.34 2.81 2.02 1.73 1.59
200 0.62 0.56 0.37 0.32 0.22 2.81 2.42 1.72 1.53 1.41
MG based on Song with true number of factors (m = 1)
40 -2.76 -2.08 -1.58 -1.51 -1.41 8.58 7.78 5.09 4.42 4.15
50 -1.67 -1.33 -1.09 -0.85 -0.95 7.50 5.61 4.09 3.36 3.25
100 0.09 0.04 -0.01 0.03 0.04 3.64 3.26 2.40 2.17 1.89
150 0.44 0.30 0.22 0.13 0.09 3.04 2.57 1.95 1.70 1.56
200 0.57 0.52 0.30 0.25 0.15 2.66 2.26 1.69 1.50 1.39
Moon and Weidner’s QMLE with 3 factors
40 8.09 7.42 6.25 5.51 5.20 10.50 9.56 7.87 6.95 6.68
50 7.40 6.63 5.23 4.87 4.75 9.46 8.46 6.68 6.14 5.92
100 6.26 5.59 4.55 4.12 4.05 7.32 6.58 5.29 4.83 4.69
150 6.02 5.47 4.34 4.08 4.04 6.82 6.12 4.87 4.56 4.49
200 5.95 5.38 4.39 4.09 3.97 6.56 5.89 4.79 4.45 4.31
Moon and Weidner’s QMLE with true number of factors (m = 1)
40 17.09 16.70 16.36 16.08 16.28 19.93 19.20 18.18 17.75 17.80
50 16.84 16.37 16.16 16.34 16.40 19.40 18.58 17.76 17.69 17.66
100 17.19 17.03 16.86 16.75 17.00 18.88 18.45 17.88 17.62 17.75
150 17.86 17.24 17.25 17.31 17.36 19.34 18.47 18.07 17.93 17.89
200 17.27 17.55 17.32 17.32 17.41 18.60 18.65 18.02 17.85 17.87

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).
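
The notes above state that CCEMG uses cross-section averages of the vector formed from y, x and g. A minimal sketch of that averaging step is given below, assuming equal weights across units and balanced arrays of shape (N, T). The function name `cross_section_averages` is hypothetical, and the lags of these averages that enter the individual regressions are not constructed here.

```python
import numpy as np

def cross_section_averages(y, x, g):
    """Equal-weighted cross-section averages of (y, x, g) for each period.

    y, x, g : arrays of shape (N, T), units in rows and periods in columns
    Returns an array of shape (T, 3) with columns (y-bar, x-bar, g-bar).
    Equal weights are an assumption made for this sketch.
    """
    z = np.stack([np.asarray(y, dtype=float),
                  np.asarray(x, dtype=float),
                  np.asarray(g, dtype=float)], axis=-1)  # shape (N, T, 3)
    return z.mean(axis=0)                                # average over units
```

Each row of the result is the period-t average vector, which would then be added (with lags) to every unit's regression.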

Table 4. Estimation of $\phi$ in experiments with regressors, $\phi = E(\phi_i) = 0.4$, and $m = 2$
correlated common factors. (Experiment 16)

Bias (x100) RMSE (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 21.98 23.35 26.19 27.41 27.95 23.66 24.63 26.98 28.00 28.45
50 21.59 23.37 26.36 27.44 27.89 23.10 24.61 27.01 27.95 28.36
100 22.44 23.76 26.67 27.65 28.34 23.74 24.81 27.24 28.03 28.65
150 22.51 23.77 26.68 27.98 28.26 23.76 24.81 27.16 28.31 28.53
200 22.16 23.63 26.77 27.83 28.42 23.37 24.61 27.22 28.13 28.68
Dynamic CCEMG without bias correction
40 -10.66 -7.93 -3.13 -1.58 -0.68 11.66 9.15 5.12 4.21 3.93
50 -10.83 -8.07 -3.23 -1.66 -0.87 11.64 9.02 4.82 3.84 3.64
100 -11.18 -8.31 -3.43 -1.94 -1.20 11.61 8.79 4.28 3.14 2.66
150 -11.45 -8.67 -3.67 -2.02 -1.37 11.74 8.99 4.23 2.87 2.40
200 -11.64 -8.87 -3.78 -2.23 -1.42 11.86 9.11 4.19 2.85 2.23
Dynamic CCEMG with RMA bias correction
40 -8.72 -5.77 -1.98 -0.89 -0.14 10.40 7.66 4.65 4.08 3.89
50 -8.77 -5.88 -2.10 -0.97 -0.38 10.11 7.37 4.29 3.65 3.57
100 -9.14 -6.11 -2.30 -1.28 -0.75 9.89 6.94 3.51 2.83 2.53
150 -9.33 -6.42 -2.45 -1.33 -0.88 9.89 6.97 3.28 2.48 2.18
200 -9.49 -6.56 -2.53 -1.48 -0.87 9.92 7.00 3.17 2.33 1.95
Dynamic CCEMG with jackknife bias correction
40 3.94 2.97 1.93 1.54 1.40 10.00 7.26 5.01 4.45 4.24
50 4.11 2.86 1.79 1.50 1.14 9.39 6.51 4.50 4.02 3.82
100 3.96 2.83 1.63 1.17 0.79 7.73 5.33 3.39 2.92 2.60
150 4.10 2.59 1.45 1.11 0.63 7.18 4.69 2.80 2.46 2.17
200 4.12 2.70 1.46 0.99 0.64 6.88 4.53 2.55 2.12 1.89
MG based on Song’s individual estimates with 3 factors
40 -9.08 -6.33 -2.04 -0.82 -0.32 10.77 8.02 4.56 4.11 3.94
50 -9.02 -6.41 -1.91 -0.94 -0.36 10.26 7.80 4.12 3.61 3.54
100 -9.46 -6.79 -2.29 -1.01 -0.61 10.10 7.49 3.48 2.69 2.56
150 -9.83 -6.89 -2.39 -1.25 -0.75 10.28 7.37 3.21 2.42 2.15
200 -10.30 -7.19 -2.61 -1.37 -0.85 10.64 7.54 3.21 2.24 1.97
MG based on Song with true number of factors (m = 2)
40 -7.57 -5.41 -1.76 -0.62 -0.14 9.20 7.18 4.39 4.04 3.88
50 -7.54 -5.48 -1.62 -0.79 -0.22 8.80 6.90 3.97 3.57 3.52
100 -7.86 -5.87 -2.04 -0.85 -0.47 8.49 6.57 3.31 2.62 2.51
150 -8.13 -5.91 -2.12 -1.09 -0.61 8.55 6.41 3.00 2.35 2.09
200 -8.39 -6.08 -2.32 -1.19 -0.71 8.72 6.44 2.97 2.13 1.90
Moon and Weidner’s QMLE with 3 factors
40 -0.27 3.31 8.40 9.94 10.80 8.95 8.83 10.68 11.76 12.41
50 -1.40 2.26 7.69 9.31 9.96 8.47 7.59 9.65 10.86 11.44
100 -4.23 0.15 6.46 8.16 9.04 7.52 5.54 7.77 9.11 9.80
150 -5.76 -1.28 5.77 7.80 8.49 7.56 4.73 6.79 8.53 9.12
200 -6.44 -1.76 5.41 7.32 8.23 7.76 4.23 6.19 7.90 8.74
Moon and Weidner’s QMLE with true number of factors (m = 2)
40 2.89 5.33 9.61 10.97 11.66 8.99 9.32 11.73 12.80 13.26
50 2.09 4.49 8.85 10.26 10.79 8.15 8.42 10.77 11.77 12.27
100 0.23 3.14 7.60 8.96 9.77 5.46 5.82 8.70 9.83 10.50
150 -0.15 2.59 7.53 9.15 9.77 4.49 4.82 8.29 9.75 10.30
200 -0.37 2.64 7.56 9.13 9.85 3.91 4.39 8.14 9.59 10.28

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).
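
The mean group (MG) estimators reported above are averages of unit-specific estimates. The sketch below shows the textbook mean group calculation together with the usual nonparametric standard error based on the dispersion of the individual estimates. It is assumed rather than taken from this section, the function name `mean_group` is hypothetical, and how the individual estimates are obtained is not shown.

```python
import numpy as np

def mean_group(individual_estimates):
    """Mean group estimate and its nonparametric standard error.

    individual_estimates : one coefficient estimate per cross-section unit
    The standard error uses sum((b_i - b_MG)^2) / (N (N - 1)), the usual
    textbook form (an assumption for this sketch).
    """
    b = np.asarray(individual_estimates, dtype=float)
    n = b.size
    b_mg = b.mean()
    se = np.sqrt(np.sum((b - b_mg) ** 2) / (n * (n - 1)))
    return b_mg, se
```

With individual estimates 0.2, 0.4 and 0.6 the mean group estimate is 0.4, and the standard error reflects only the spread across units.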

Table 5. Estimation of $\beta_0$ in experiments with regressors, $\phi = E(\phi_i) = 0.4$, and $m = 2$
correlated common factors. (Experiment 16)

Bias (x100) RMSE (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 9.94 9.49 9.66 9.61 9.70 14.19 13.18 11.72 11.26 11.02
50 9.95 9.43 9.53 9.87 9.92 13.87 12.86 11.48 11.28 11.12
100 9.85 9.83 9.85 9.80 9.46 13.47 12.63 11.49 11.02 10.45
150 10.15 9.75 9.86 9.86 9.74 13.56 12.58 11.28 10.89 10.59
200 9.62 9.81 9.87 9.95 9.60 13.04 12.49 11.35 10.92 10.40
Dynamic CCEMG without bias correction
40 1.00 0.71 0.43 0.09 0.13 5.75 5.10 3.82 3.31 3.08
50 0.79 0.76 0.24 0.24 0.16 5.23 4.57 3.38 3.00 2.77
100 0.95 0.73 0.30 0.15 -0.01 3.78 3.32 2.40 2.10 1.93
150 1.06 0.61 0.28 0.23 0.07 3.26 2.75 1.98 1.78 1.58
200 0.98 0.75 0.29 0.17 0.08 2.80 2.34 1.71 1.48 1.37
Dynamic CCEMG with RMA bias correction
40 1.12 0.80 0.58 0.24 0.27 6.59 5.65 4.00 3.44 3.12
50 0.82 0.73 0.42 0.41 0.33 5.89 4.95 3.59 3.10 2.81
100 0.99 0.73 0.45 0.33 0.18 4.25 3.58 2.50 2.18 1.98
150 1.07 0.68 0.41 0.40 0.24 3.66 3.02 2.09 1.84 1.63
200 0.98 0.79 0.43 0.30 0.23 3.12 2.54 1.83 1.56 1.43
Dynamic CCEMG with jackknife bias correction
40 1.42 0.54 0.20 0.01 0.06 12.35 8.24 4.62 3.73 3.28
50 0.94 0.45 0.12 0.15 0.12 10.68 7.40 4.05 3.35 2.93
100 0.89 0.52 0.09 0.10 -0.03 7.61 5.17 2.89 2.40 2.09
150 1.22 0.44 0.10 0.11 0.03 6.44 4.42 2.42 1.97 1.70
200 0.95 0.67 0.08 0.01 0.03 5.72 3.73 2.10 1.68 1.49
MG based on Song’s individual estimates with 3 factors
40 0.98 0.52 0.40 0.39 0.18 7.45 5.95 3.94 3.55 3.33
50 0.77 0.59 0.38 0.38 0.32 6.31 5.35 3.65 3.15 3.00
100 0.77 0.77 0.43 0.39 0.33 4.17 3.58 2.64 2.34 2.21
150 0.91 0.70 0.40 0.41 0.39 3.41 2.93 2.22 1.97 1.88
200 0.96 0.75 0.54 0.44 0.35 2.92 2.50 1.92 1.74 1.78
MG based on Song with true number of factors (m = 2)
40 0.87 0.69 0.54 0.43 0.27 6.71 5.58 3.89 3.50 3.25
50 0.82 0.67 0.35 0.43 0.34 5.68 4.96 3.52 3.10 2.92
100 0.90 0.84 0.51 0.40 0.41 3.88 3.43 2.54 2.26 2.16
150 0.94 0.78 0.45 0.43 0.43 3.27 2.88 2.12 1.90 1.79
200 1.00 0.77 0.58 0.45 0.34 2.83 2.41 1.86 1.68 1.70
Moon and Weidner’s QMLE with 3 factors
40 5.21 4.83 4.53 4.20 4.23 7.88 7.45 6.43 5.81 5.89
50 5.06 4.95 4.47 4.57 4.49 7.55 7.08 6.04 5.94 5.78
100 5.54 5.14 4.81 4.53 4.47 6.83 6.29 5.66 5.27 5.18
150 5.62 5.15 4.66 4.57 4.43 6.54 5.95 5.25 5.11 4.91
200 5.68 5.21 4.56 4.45 4.31 6.36 5.81 5.04 4.84 4.69
Moon and Weidner’s QMLE with true number of factors (m = 2)
40 4.94 4.68 4.32 3.95 4.05 7.94 7.52 6.41 5.77 5.86
50 4.91 4.83 4.33 4.43 4.33 7.62 7.08 6.04 5.96 5.74
100 5.43 5.18 4.91 4.65 4.64 6.89 6.47 5.82 5.45 5.39
150 5.59 5.33 4.98 4.96 4.83 6.68 6.27 5.64 5.51 5.34
200 5.75 5.40 5.02 5.00 4.90 6.64 6.16 5.55 5.43 5.31

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).
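
The tables above include a jackknife bias-corrected CCEMG estimator. One common construction is the half-panel jackknife, which combines the full-sample estimate with estimates from the first and second halves of the time dimension. The sketch below assumes that construction; whether it matches the exact correction used for these tables is not confirmed by this section, and the function name `half_panel_jackknife` is hypothetical.

```python
def half_panel_jackknife(full, first_half, second_half):
    """Half-panel jackknife combination (assumed form).

    full        : estimate from all T periods
    first_half  : estimate from the first T/2 periods
    second_half : estimate from the last T/2 periods
    If the bias is proportional to 1/T, each half-sample estimate carries
    twice the full-sample bias, so this combination cancels that term.
    """
    return 2.0 * full - 0.5 * (first_half + second_half)

# Illustration with a hypothetical bias of c/T around a true value theta
theta, c, T = 0.4, -3.0, 50
full = theta + c / T            # bias c/T
half = theta + c / (T / 2)      # bias 2c/T in each half
corrected = half_panel_jackknife(full, half, half)
```

In the illustration the corrected value equals the true value, because the assumed bias is exactly of order 1/T; in practice higher-order terms remain and the variance typically increases.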

Table 6. Estimation of $\phi$ in experiments with regressors, $\phi = E(\phi_i) = 0.4$, and $m = 3$
correlated common factors. (Experiment 18)

Bias (x100) RMSE (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 25.74 27.09 30.09 31.21 31.58 27.09 28.16 30.68 31.67 31.98
50 25.86 27.55 30.06 31.22 31.77 27.17 28.50 30.60 31.65 32.12
100 26.31 27.72 30.40 31.34 31.70 27.37 28.58 30.84 31.65 31.95
150 26.16 27.50 30.46 31.58 31.94 27.14 28.36 30.87 31.87 32.15
200 26.26 27.65 30.51 31.62 32.21 27.26 28.40 30.89 31.88 32.42
Dynamic CCEMG without bias correction
40 -11.29 -8.46 -3.08 -1.41 -0.61 12.26 9.53 5.06 4.12 4.04
50 -11.36 -8.38 -3.34 -1.61 -0.72 12.16 9.26 4.88 3.84 3.56
100 -11.59 -8.71 -3.50 -1.74 -1.09 12.00 9.14 4.32 3.02 2.66
150 -11.64 -8.76 -3.53 -1.88 -1.13 11.94 9.07 4.09 2.75 2.29
200 -11.64 -8.81 -3.62 -1.93 -1.13 11.86 9.05 4.03 2.61 2.07
Dynamic CCEMG with RMA bias correction
40 -9.99 -6.82 -2.32 -1.04 -0.42 11.45 8.41 4.78 4.04 4.04
50 -10.02 -6.86 -2.59 -1.32 -0.58 11.26 8.18 4.52 3.79 3.56
100 -10.44 -7.26 -2.84 -1.53 -1.03 11.13 7.94 3.88 2.97 2.66
150 -10.56 -7.34 -2.93 -1.72 -1.09 11.08 7.84 3.62 2.67 2.29
200 -10.56 -7.37 -3.03 -1.77 -1.15 10.95 7.77 3.56 2.51 2.09
Dynamic CCEMG with jackknife bias correction
40 4.25 2.99 2.17 1.78 1.37 10.26 7.47 5.08 4.47 4.34
50 4.49 3.12 1.90 1.56 1.21 9.65 6.91 4.52 4.05 3.79
100 3.74 2.77 1.71 1.30 0.73 7.59 5.35 3.36 2.96 2.61
150 3.99 2.78 1.58 1.10 0.67 7.19 4.92 2.83 2.41 2.15
200 4.24 2.60 1.50 1.05 0.62 6.99 4.49 2.54 2.14 1.90
MG based on Song with true number of factors (m = 3)
40 -7.94 -4.88 -0.14 0.96 1.54 9.72 6.96 4.17 3.95 4.10
50 -7.86 -5.05 -0.38 0.76 1.32 9.35 6.75 3.77 3.66 3.70
100 -8.79 -5.82 -0.95 0.28 0.73 9.58 6.65 2.83 2.51 2.67
150 -9.28 -6.28 -1.51 -0.30 0.19 9.78 6.84 2.69 2.18 2.03
200 -9.86 -6.76 -1.96 -0.70 -0.21 10.23 7.19 2.78 1.97 1.80
Moon and Weidner’s QMLE with true number of factors (m = 3)
40 2.21 5.83 11.43 12.87 13.18 9.75 10.12 13.26 14.43 14.64
50 0.88 4.70 10.13 11.66 12.49 8.75 8.96 11.79 12.99 13.71
100 -3.20 0.99 7.93 9.91 10.64 7.18 5.48 9.02 10.72 11.33
150 -5.01 -0.42 6.91 9.05 9.88 7.07 4.54 7.75 9.65 10.39
200 -5.70 -1.20 6.25 8.49 9.54 7.01 4.00 6.94 8.97 9.97

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).

Table 7. Estimation of $\beta_0$ in experiments with regressors, $\phi = E(\phi_i) = 0.4$, and $m = 3$
correlated common factors. (Experiment 18)

Bias (x100) RMSE (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 -18.62 -18.43 -18.70 -18.50 -18.38 21.16 20.51 19.99 19.45 19.25
50 -18.31 -18.45 -18.29 -18.80 -18.64 20.83 20.42 19.47 19.70 19.41
100 -18.20 -18.56 -18.40 -18.29 -18.42 20.42 20.29 19.32 18.98 18.98
150 -18.10 -18.24 -18.43 -18.45 -18.32 20.18 19.91 19.33 19.04 18.82
200 -17.87 -18.42 -18.44 -18.73 -18.54 19.90 20.08 19.23 19.31 18.99
Dynamic CCEMG without bias correction
40 0.98 0.84 0.47 0.38 0.27 6.12 5.22 3.76 3.31 3.12
50 0.93 0.73 0.53 0.34 0.05 5.30 4.67 3.43 2.96 2.71
100 0.92 0.66 0.32 0.18 0.10 3.78 3.39 2.42 2.07 1.94
150 0.83 0.65 0.40 0.12 0.15 3.23 2.76 1.94 1.72 1.61
200 0.90 0.73 0.29 0.13 0.11 2.81 2.47 1.67 1.49 1.36
Dynamic CCEMG with RMA bias correction
40 1.01 0.86 0.52 0.46 0.35 6.91 5.63 3.92 3.42 3.18
50 0.73 0.67 0.59 0.45 0.16 6.00 5.15 3.61 3.04 2.72
100 0.93 0.62 0.41 0.28 0.20 4.30 3.63 2.53 2.13 1.99
150 0.81 0.58 0.48 0.24 0.27 3.61 3.03 2.06 1.80 1.65
200 0.87 0.67 0.38 0.25 0.22 3.17 2.67 1.75 1.56 1.40
Dynamic CCEMG with jackknife bias correction
40 1.02 0.93 0.22 0.19 0.15 12.39 8.52 4.56 3.76 3.36
50 1.05 0.68 0.29 0.19 -0.06 10.94 7.73 4.21 3.34 2.91
100 1.39 0.45 0.10 -0.01 0.02 7.99 5.34 2.91 2.33 2.10
150 1.01 0.54 0.17 -0.03 0.09 6.52 4.44 2.32 1.95 1.72
200 1.00 0.58 0.03 -0.01 0.05 5.72 3.88 2.01 1.69 1.47
MG based on Song with true number of factors (m = 3)
40 0.49 0.24 -0.21 -0.08 0.01 7.73 6.23 4.20 3.77 3.59
50 0.20 0.29 0.02 -0.08 -0.09 6.71 5.55 3.91 3.34 3.12
100 0.38 0.26 -0.02 -0.30 -0.19 4.28 3.67 2.78 2.52 2.44
150 0.27 0.28 -0.12 -0.25 -0.20 3.29 2.88 2.32 2.12 2.10
200 0.35 0.22 -0.07 -0.22 -0.22 2.84 2.47 1.95 1.82 1.80
Moon and Weidner’s QMLE with true number of factors (m = 3)
40 4.18 4.51 4.13 4.24 4.12 7.19 7.15 6.10 5.94 5.74
50 4.67 4.75 4.36 4.17 4.08 6.96 6.75 5.92 5.56 5.45
100 5.17 4.88 4.61 4.48 4.46 6.36 5.97 5.38 5.17 5.10
150 5.19 5.01 4.75 4.44 4.57 6.10 5.79 5.24 4.93 5.03
200 5.28 5.15 4.67 4.44 4.41 5.94 5.75 5.07 4.82 4.77

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).

Table 8. Size and Power of estimating $\phi$ in Experiment 14 (with regressors, $\phi = 0.4$, $m = 1$ and
$\rho_f = 0.6$).

Size (x100) Power (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 88.00 91.70 98.10 99.65 99.80 60.85 68.50 83.65 89.45 92.10
50 88.75 94.15 99.30 99.90 99.85 63.25 70.20 85.65 92.00 93.80
100 94.85 98.10 99.90 100.00 100.00 71.35 75.95 91.90 96.20 97.80
150 96.70 99.05 100.00 100.00 100.00 77.30 81.00 95.60 98.65 99.45
200 97.15 99.35 100.00 100.00 100.00 78.40 83.15 96.45 99.15 99.65
Dynamic CCEMG without bias correction
40 72.75 53.85 16.00 10.80 8.90 99.80 99.35 94.15 90.15 88.25
50 80.95 60.75 21.45 12.90 10.00 100.00 99.90 98.00 95.20 92.40
100 98.30 92.60 38.80 19.80 11.90 100.00 100.00 100.00 99.90 99.90
150 99.95 98.60 57.45 28.50 16.75 100.00 100.00 100.00 100.00 100.00
200 100.00 99.70 70.85 37.15 22.10 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG with RMA bias correction
40 42.65 27.85 10.15 6.20 6.90 94.90 92.40 86.25 80.70 78.80
50 49.65 32.65 10.55 8.25 6.35 97.65 96.65 91.90 89.25 89.05
100 79.50 57.80 17.20 8.30 7.20 100.00 100.00 99.85 99.70 99.55
150 91.80 77.40 22.70 11.75 9.20 100.00 100.00 100.00 100.00 100.00
200 95.60 88.95 32.40 14.55 10.55 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG with jackknife bias correction
40 14.15 12.20 9.85 8.65 7.90 20.65 30.10 49.60 59.90 65.00
50 15.40 12.60 9.05 8.15 8.95 21.20 33.70 59.80 69.20 75.00
100 21.70 16.05 10.80 7.80 7.65 34.05 54.95 86.80 93.80 96.25
150 26.85 20.35 10.95 8.65 7.05 42.20 66.00 96.55 99.15 99.50
200 31.90 25.85 11.95 8.60 6.95 48.15 74.15 99.00 99.85 99.95
MG based on Song’s individual estimates with 3 factors
40 51.50 36.20 13.75 7.95 7.15 95.60 94.10 88.20 84.55 81.55
50 62.00 45.45 13.15 9.05 7.05 98.45 98.50 95.65 91.90 89.50
100 90.30 75.70 23.00 11.35 8.95 100.00 100.00 99.95 99.70 99.70
150 97.35 89.50 33.45 16.70 10.60 100.00 100.00 100.00 100.00 100.00
200 99.50 96.20 42.95 20.80 13.60 100.00 100.00 100.00 100.00 100.00
MG based on Song with true number of factors (m = 1)
40 30.45 20.90 10.35 6.55 7.45 91.45 89.00 81.55 76.70 74.85
50 39.85 27.70 10.50 7.15 6.75 96.45 95.85 91.70 88.70 86.00
100 72.45 56.80 17.60 10.45 8.30 100.00 100.00 99.90 99.60 99.55
150 88.60 74.85 26.75 15.10 9.15 100.00 100.00 100.00 100.00 100.00
200 95.45 87.60 34.80 17.10 11.70 99.95 100.00 100.00 100.00 100.00
Moon and Weidner’s QMLE with 3 factors
40 51.95 52.55 71.60 80.20 84.90 81.85 74.10 66.30 67.80 70.55
50 55.15 51.30 74.35 83.35 87.85 85.05 79.20 67.45 69.60 72.35
100 63.80 50.15 81.35 91.85 94.90 96.50 91.50 73.20 70.80 73.05
150 73.00 53.45 84.25 96.10 98.55 99.10 96.55 80.40 74.40 70.95
200 79.65 57.00 89.15 97.60 99.05 99.55 98.80 84.85 77.95 75.15
Moon and Weidner’s QMLE with true number of factors (m = 1)
40 46.30 53.35 72.50 81.15 85.55 67.95 63.15 63.80 65.95 68.25
50 46.15 53.50 76.90 83.45 88.70 69.80 67.05 65.35 68.05 72.60
100 49.60 57.30 88.10 94.60 96.45 80.85 76.50 66.55 67.90 70.05
150 50.65 64.35 93.10 98.50 98.95 86.30 80.85 70.95 68.85 70.60
200 54.20 70.75 96.85 99.30 99.55 88.40 84.10 69.30 68.80 71.75

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).
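
Size and power entries are rejection frequencies across replications: size tests the null at the true parameter value, power tests it at a false value. The sketch below assumes a two-sided t-test with critical value 1.96 (a 5% nominal level under a normal approximation). The nominal level and the alternative used for the power columns are not stated in this section, so both are left as arguments, and the function name `rejection_rate_x100` is hypothetical.

```python
import numpy as np

def rejection_rate_x100(estimates, std_errors, null_value, crit=1.96):
    """Percentage of replications in which |t| exceeds the critical value.

    estimates, std_errors : one entry per Monte Carlo replication
    null_value            : hypothesised parameter value; the true value
                            gives size, a false value gives power
    crit                  : critical value (1.96 is an assumption here)
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    t_stats = (est - null_value) / se
    return 100.0 * np.mean(np.abs(t_stats) > crit)
```

A size close to the nominal level indicates reliable inference, while a size far above it, as in several rows above, means that the test over-rejects a true null and that the corresponding power figures are not directly comparable.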

Table 9. Size and Power of estimating $\beta_0$ in Experiment 14 (with regressors, $\phi = 0.4$, $m = 1$ and
$\rho_f = 0.6$).

Size (x100) Power (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
50 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
100 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
150 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
200 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG without bias correction
40 6.75 7.15 6.85 7.10 6.60 33.35 41.45 74.80 85.20 91.10
50 7.00 6.25 5.05 5.95 7.20 42.55 54.15 82.70 92.45 94.60
100 6.95 6.45 6.10 5.65 5.10 67.05 80.30 98.10 99.65 100.00
150 7.90 7.40 6.30 6.10 5.85 81.55 92.60 99.85 99.95 100.00
200 8.65 7.45 7.00 5.80 5.65 90.80 97.30 99.95 100.00 100.00
Dynamic CCEMG with RMA bias correction
40 6.35 7.65 7.25 6.70 6.45 28.05 40.55 69.55 81.90 90.00
50 6.55 6.25 6.40 6.45 6.55 33.45 44.95 78.75 90.40 94.40
100 7.40 5.80 6.80 7.15 4.70 56.40 74.60 97.45 99.65 100.00
150 7.10 7.95 6.35 6.85 6.50 74.45 88.10 99.80 100.00 100.00
200 8.05 7.25 6.30 7.20 5.50 84.15 94.70 100.00 100.00 100.00
Dynamic CCEMG with jackknife bias correction
40 5.55 6.40 5.55 6.55 6.50 62.10 87.15 99.70 99.95 100.00
50 6.40 4.85 5.65 6.45 6.90 68.45 92.00 100.00 100.00 100.00
100 5.00 6.70 6.00 5.20 5.10 90.00 99.75 100.00 100.00 100.00
150 5.85 5.45 4.95 6.45 5.60 97.60 100.00 100.00 100.00 100.00
200 6.20 5.95 5.40 5.00 4.30 99.45 100.00 100.00 100.00 100.00
MG based on Song’s individual estimates with 3 factors
40 4.30 5.15 4.20 4.60 4.20 30.15 37.45 61.55 71.90 75.50
50 5.20 4.50 4.45 4.30 3.75 36.60 48.30 75.50 84.00 87.60
100 5.45 6.00 5.80 5.85 4.85 68.15 79.75 97.05 99.50 99.75
150 6.30 5.55 5.95 5.05 5.20 85.80 93.70 99.70 99.80 100.00
200 6.50 6.80 6.00 6.00 5.85 93.15 98.30 100.00 100.00 100.00
MG based on Song with true number of factors (m = 1)
40 8.30 9.05 5.80 6.25 6.45 51.45 56.15 78.15 85.10 88.20
50 8.60 6.75 6.55 4.95 4.90 55.60 66.10 85.55 92.25 95.00
100 5.55 6.45 5.25 5.75 4.90 80.80 89.25 98.10 99.60 99.95
150 7.25 6.25 5.60 5.65 4.85 92.30 96.65 99.95 100.00 100.00
200 7.00 6.55 6.15 6.15 5.70 97.20 99.35 99.95 100.00 100.00
Moon and Weidner’s QMLE with 3 factors
40 52.70 54.60 62.20 66.15 69.35 29.50 31.20 48.30 59.40 69.50
50 54.00 54.10 57.90 64.80 68.90 30.65 35.05 58.90 70.00 74.65
100 63.15 62.25 69.60 73.15 76.80 37.90 50.40 80.10 90.55 94.10
150 69.85 73.65 76.25 83.15 86.45 48.30 62.05 91.05 97.10 98.15
200 79.70 81.05 86.30 89.55 92.15 56.20 71.60 95.30 99.15 99.60
Moon and Weidner’s QMLE with true number of factors (m = 1)
40 79.80 84.10 93.40 95.40 97.75 49.40 52.15 60.55 65.20 67.85
50 83.70 86.20 95.35 97.90 98.60 51.40 52.65 60.85 68.15 72.45
100 93.75 96.50 99.30 99.55 99.90 63.55 66.20 76.05 80.00 84.20
150 96.80 98.35 99.75 100.00 100.00 72.75 73.35 81.90 87.20 90.80
200 97.85 99.35 99.95 100.00 100.00 74.70 78.50 85.85 90.60 93.80

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).

Table 10. Size and Power of estimating $\phi$ in Experiment 16 (with regressors, $\phi = 0.4$, $m = 2$ and
$\rho_f = 0.6$).

Size (x100) Power (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 97.30 99.45 99.95 100.00 100.00 84.35 90.30 97.70 99.60 99.80
50 98.50 99.30 100.00 100.00 100.00 85.85 92.40 99.00 99.85 99.90
100 99.50 99.85 100.00 100.00 100.00 92.05 95.50 99.75 100.00 100.00
150 99.85 99.95 100.00 100.00 100.00 93.95 96.05 99.75 100.00 100.00
200 99.95 99.95 100.00 100.00 100.00 94.15 97.40 99.95 100.00 100.00
Dynamic CCEMG without bias correction
40 69.65 50.75 18.30 10.70 9.85 99.65 98.90 93.70 89.20 84.45
50 79.70 59.65 19.80 10.90 9.80 99.90 99.90 98.05 95.05 91.70
100 97.25 87.40 32.95 16.45 10.25 100.00 100.00 99.95 100.00 99.75
150 99.65 97.85 48.65 22.35 13.00 100.00 100.00 100.00 100.00 100.00
200 100.00 99.70 61.80 30.95 17.10 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG with RMA bias correction
40 44.25 26.90 10.15 7.10 6.95 95.35 92.15 84.05 79.70 76.55
50 51.00 33.25 10.80 6.80 7.20 98.25 96.60 91.85 88.55 86.10
100 78.85 57.10 16.00 8.85 6.35 99.85 99.90 99.60 99.65 99.35
150 90.40 75.45 22.85 11.45 7.45 100.00 100.00 99.95 100.00 100.00
200 96.85 86.00 30.55 14.40 8.30 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG with jackknife bias correction
40 14.30 12.35 11.20 8.60 9.15 19.55 27.65 48.30 56.45 62.45
50 15.85 11.50 10.75 8.95 8.00 21.80 32.15 56.20 65.90 72.20
100 22.15 17.95 11.90 9.25 6.45 33.70 49.60 85.30 92.65 96.35
150 29.05 21.40 12.10 10.30 8.00 39.85 64.95 96.00 98.60 99.65
200 34.00 26.40 14.25 10.40 7.30 46.35 72.00 98.85 99.80 100.00
MG based on Song’s individual estimates with 3 factors
40 53.25 34.75 9.40 8.40 6.85 95.95 95.20 86.30 80.10 78.55
50 59.95 40.95 10.05 7.75 7.40 99.00 98.25 92.10 89.50 86.20
100 88.60 70.25 17.50 9.40 6.90 99.90 99.95 99.75 99.60 99.05
150 96.60 83.85 23.60 11.15 8.05 100.00 100.00 100.00 100.00 100.00
200 99.35 94.20 32.80 14.00 8.85 100.00 100.00 100.00 100.00 100.00
MG based on Song with true number of factors (m = 2)
40 44.00 28.40 8.30 7.65 6.50 95.90 94.30 85.15 78.80 77.25
50 50.15 34.35 9.10 8.20 7.45 98.65 97.95 91.50 89.20 85.15
100 79.15 60.35 15.20 8.55 6.65 99.95 100.00 99.65 99.30 99.15
150 92.00 76.05 20.75 10.30 7.55 100.00 100.00 100.00 99.90 100.00
200 97.45 88.45 28.15 12.35 8.25 100.00 100.00 100.00 100.00 100.00
Moon and Weidner’s QMLE with 3 factors
40 53.00 56.05 79.60 89.30 92.00 74.50 68.70 61.50 65.70 71.45
50 56.40 54.65 80.35 90.55 92.65 80.40 72.25 62.35 67.15 70.50
100 64.95 55.05 84.35 94.30 97.85 94.90 89.20 69.90 69.10 68.20
150 76.20 53.65 86.15 96.70 98.40 99.25 96.65 78.60 70.70 72.00
200 82.90 56.15 89.50 98.55 99.20 99.70 98.85 85.75 76.00 73.20
Moon and Weidner’s QMLE with true number of factors (m = 2)
40 51.55 59.30 84.60 90.25 94.30 64.75 59.10 61.95 67.70 70.95
50 52.10 57.55 83.95 92.05 93.50 69.95 63.20 62.40 64.70 70.05
100 45.70 55.20 89.10 96.20 98.45 87.25 78.35 63.25 63.10 66.15
150 45.35 56.55 94.40 98.65 99.65 93.35 88.25 64.35 64.40 67.20
200 46.50 59.80 97.60 99.75 99.85 97.05 92.85 67.70 67.40 70.10

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).

Table 11. Size and Power of estimating $\beta_0$ in Experiment 16 (with regressors, $\phi = 0.4$, $m = 2$
and $\rho_f = 0.6$).

Size (x100) Power (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 67.90 68.90 81.40 85.25 90.30 48.70 49.65 48.70 54.70 55.25
50 70.35 72.50 82.70 89.40 92.85 53.45 51.55 54.60 54.95 57.30
100 79.35 82.80 90.55 94.85 95.55 62.10 60.90 64.05 66.15 66.25
150 82.45 84.30 92.80 95.75 97.95 70.70 68.80 68.30 68.80 71.50
200 85.55 87.80 94.40 96.80 97.60 71.15 73.10 69.40 73.75 73.15
Dynamic CCEMG without bias correction
40 6.00 5.85 6.50 6.65 6.25 35.55 46.10 73.80 87.35 90.75
50 5.60 5.90 5.75 6.80 6.30 43.25 54.30 83.70 92.25 95.70
100 5.40 6.85 5.45 5.50 5.70 69.00 81.85 98.65 99.85 99.95
150 7.20 5.50 5.70 6.60 4.95 85.00 95.30 99.95 100.00 100.00
200 7.30 6.35 6.30 5.20 4.95 93.25 98.05 100.00 100.00 100.00
Dynamic CCEMG with RMA bias correction
40 5.80 6.15 6.50 6.65 6.50 29.25 40.05 68.95 84.30 89.55
50 5.75 6.30 7.25 6.50 6.80 35.50 47.60 80.00 90.15 94.85
100 5.40 5.70 5.55 6.05 6.40 58.50 76.15 97.65 99.65 99.85
150 7.45 5.55 6.90 6.80 5.45 74.05 89.85 99.90 100.00 99.95
200 6.60 6.00 6.50 5.95 5.70 86.75 96.45 100.00 100.00 100.00
Dynamic CCEMG with jackknife bias correction
40 5.80 5.45 5.70 6.00 7.00 61.70 86.45 99.85 100.00 100.00
50 5.30 5.65 6.40 6.05 5.40 68.20 92.55 100.00 100.00 100.00
100 4.25 5.60 5.35 5.30 5.45 91.70 99.70 100.00 100.00 100.00
150 5.65 5.20 5.60 5.85 5.30 98.25 99.95 100.00 100.00 100.00
200 6.75 5.40 5.60 4.90 5.60 99.50 100.00 100.00 100.00 100.00
MG based on Song’s individual estimates with 3 factors
40 9.75 9.10 6.50 5.70 4.80 43.95 51.75 71.55 78.75 83.50
50 9.95 8.65 7.15 5.50 3.75 50.50 60.25 80.45 86.30 90.30
100 9.35 9.40 5.65 5.50 4.65 75.75 85.30 96.20 98.30 98.25
150 11.25 9.35 7.50 5.60 3.75 88.15 94.10 98.95 98.55 98.15
200 11.75 10.15 7.80 5.15 4.60 94.40 97.15 99.35 98.85 98.45
MG based on Song with true number of factors (m = 2)
40 11.05 9.95 6.55 5.90 5.40 48.50 55.85 72.15 80.45 84.00
50 11.40 10.70 7.20 6.10 4.75 56.15 64.75 82.85 88.10 91.20
100 11.70 11.00 6.70 5.80 5.30 80.65 87.85 96.90 99.20 98.85
150 13.15 10.80 7.95 5.85 4.10 91.00 95.35 99.15 99.20 98.85
200 13.00 10.35 7.90 5.40 5.15 95.85 98.25 99.40 99.35 98.70
Moon and Weidner’s QMLE with 3 factors
40 39.00 41.70 51.75 57.60 61.90 37.10 41.60 58.60 70.20 75.20
50 40.95 44.35 53.40 62.00 67.10 41.20 45.70 64.80 70.50 77.15
100 56.90 58.40 71.95 76.85 80.30 46.80 56.70 76.40 86.80 90.15
150 67.05 70.40 80.15 86.05 89.30 54.20 66.55 87.60 93.10 95.80
200 77.05 78.50 86.30 91.25 93.00 59.70 73.85 93.25 97.80 98.40
Moon and Weidner’s QMLE with true number of factors (m = 2)
40 36.25 39.50 47.90 52.55 59.75 36.95 39.80 57.90 69.95 74.85
50 38.35 42.80 50.10 60.15 64.30 39.40 43.70 63.75 69.75 77.00
100 53.20 56.70 70.45 76.20 80.95 47.65 55.20 73.65 84.40 87.40
150 65.85 69.30 81.55 87.65 90.85 53.85 62.65 81.70 88.60 92.80
200 73.55 75.55 88.00 92.80 94.85 59.10 68.30 87.70 93.00 95.90

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).

Table 12. Size and Power of estimating $\phi$ in Experiment 18 (with regressors, $\phi = 0.4$, $m = 3$ and
$\rho_f = 0.6$).

Size (x100) Power (x100)
(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 99.40 99.85 100.00 100.00 100.00 91.55 95.75 99.70 99.95 99.90
50 99.60 100.00 100.00 100.00 100.00 93.80 97.35 99.85 100.00 100.00
100 100.00 100.00 100.00 100.00 100.00 97.25 98.40 100.00 100.00 100.00
150 99.90 100.00 100.00 100.00 100.00 98.05 99.25 100.00 100.00 100.00
200 100.00 100.00 100.00 100.00 100.00 97.75 99.35 100.00 100.00 100.00
Dynamic CCEMG without bias correction
40 73.90 56.55 17.20 10.80 10.25 99.75 99.45 94.00 88.60 82.80
50 82.90 64.40 20.45 11.65 9.45 100.00 99.85 97.80 94.35 91.70
100 98.35 91.30 35.00 14.40 11.10 100.00 100.00 100.00 99.90 99.90
150 99.90 97.75 47.65 20.30 13.05 100.00 100.00 100.00 100.00 100.00
200 100.00 99.75 59.45 26.15 14.70 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG with RMA bias correction
40 52.75 34.05 11.00 6.95 7.60 96.45 95.40 86.00 81.85 76.65
50 61.30 40.35 12.55 8.05 6.85 99.00 97.95 93.35 89.80 87.20
100 86.60 69.35 21.35 10.00 8.55 100.00 100.00 99.95 99.75 99.85
150 95.00 83.40 29.15 13.70 9.45 100.00 100.00 100.00 100.00 100.00
200 98.05 91.25 40.15 17.70 10.70 100.00 100.00 100.00 100.00 100.00
Dynamic CCEMG with jackknife bias correction
40 15.45 13.15 11.15 9.05 9.35 18.70 27.20 45.70 55.00 60.05
50 17.65 14.55 10.30 9.45 8.30 20.05 31.05 56.05 66.20 71.30
100 21.45 17.30 12.00 10.50 6.70 32.40 49.35 84.20 91.75 96.55
150 28.00 21.90 11.40 8.85 6.85 41.35 61.00 95.90 98.80 99.50
200 34.60 25.30 13.85 10.20 7.80 45.50 71.90 98.65 99.70 99.95
MG based on Song with true number of factors (m = 3)
40 44.45 24.95 7.50 6.40 7.90 94.50 90.75 73.55 65.95 62.80
50 50.00 32.45 7.90 7.80 8.50 97.50 95.85 83.20 77.70 73.90
100 82.40 58.70 9.55 6.45 8.65 99.95 99.85 98.95 97.80 96.60
150 94.10 77.90 15.10 8.55 6.60 100.00 100.00 99.90 99.85 99.85
200 98.65 90.05 24.25 9.50 7.20 100.00 100.00 100.00 100.00 100.00
Moon and Weidner’s QMLE with true number of factors (m = 3)
40 57.25 63.35 90.10 95.00 96.05 68.15 61.75 63.30 70.90 74.35
50 58.10 62.10 88.65 94.90 96.60 73.00 67.20 62.75 69.60 73.35
100 61.80 51.50 91.30 98.00 98.90 93.30 86.70 63.20 62.75 68.15
150 74.65 53.20 92.35 98.40 99.70 98.35 95.60 71.35 66.20 66.05
200 79.80 54.85 93.85 99.60 99.90 99.60 98.30 79.85 68.90 68.60

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$.
The QMLE estimator and the MG estimator based on Song’s individual estimates are computed from the demeaned variables $\dot{y}_{it}$
and $\dot{x}_{it}$ defined in (41).

Table 13. Size and power of estimating $\beta_0$ in Experiment 18 (with regressors, $\phi = 0.4$, $m = 3$ and $\rho_f = 0.6$).

Size (x100) is reported in columns 2-6 and Power (x100) in columns 7-11. Rows are indexed by N and columns by T.

(N,T) 40 50 100 150 200 40 50 100 150 200
Fixed Effects estimates
40 88.05 91.30 97.50 99.45 99.50 98.35 99.15 100.00 100.00 100.00
50 89.30 92.90 98.55 99.50 99.95 98.55 99.65 100.00 100.00 100.00
100 93.90 96.20 99.65 100.00 100.00 99.35 99.75 100.00 100.00 100.00
150 94.65 96.65 99.65 100.00 100.00 99.80 99.95 100.00 100.00 100.00
200 95.85 98.10 99.95 100.00 100.00 99.75 100.00 100.00 100.00 100.00
Dynamic CCEMG without bias correction
40 7.20 6.75 6.25 6.00 7.20 36.15 45.80 73.95 84.75 90.35
50 6.50 5.35 7.25 7.10 5.15 42.45 55.15 82.40 91.75 96.10
100 5.75 6.60 6.65 5.80 5.40 68.75 82.30 98.45 99.90 99.95
150 7.05 6.15 5.15 5.75 6.00 85.45 94.85 99.90 100.00 100.00
200 6.35 7.35 5.45 5.20 5.40 93.70 97.80 100.00 100.00 100.00
Dynamic CCEMG with RMA bias correction
40 7.15 6.80 6.10 6.50 6.85 28.90 38.55 69.70 81.05 89.05
50 6.05 7.15 6.40 7.05 5.15 35.55 48.80 77.60 90.25 95.65
100 6.50 6.50 6.70 6.05 5.60 58.10 76.55 97.75 99.70 99.80
150 6.55 6.15 5.55 5.75 5.90 76.20 90.00 99.70 100.00 100.00
200 6.75 7.15 5.35 5.50 5.30 86.30 95.65 99.95 100.00 100.00
Dynamic CCEMG with jackknife bias correction
40 5.90 6.20 5.55 6.50 6.90 59.95 85.85 99.85 100.00 100.00
50 5.10 6.85 5.70 6.85 5.50 66.95 90.30 100.00 100.00 100.00
100 5.75 5.40 5.45 5.30 5.75 91.30 99.30 100.00 100.00 100.00
150 5.05 5.30 5.00 5.35 5.85 97.50 99.95 100.00 100.00 100.00
200 5.55 5.90 4.95 5.30 5.45 99.45 100.00 100.00 100.00 100.00
MG based on Song with true number of factors (m = 3)
40 8.45 8.55 5.15 4.90 5.05 41.95 50.75 72.65 79.20 80.80
50 8.20 9.10 6.85 5.00 4.50 52.00 59.00 79.50 86.95 89.75
100 9.35 8.55 7.10 5.55 4.85 76.70 84.20 95.65 97.85 97.25
150 8.70 8.25 6.05 5.40 4.75 90.15 94.65 98.30 98.45 97.90
200 9.45 8.30 6.65 4.55 4.05 95.20 97.60 98.60 98.90 98.05
Moon and Weidner’s QMLE with true number of factors (m = 3)
40 33.45 39.20 48.30 58.05 63.25 41.35 44.25 61.60 68.50 75.90
50 37.15 43.05 53.85 60.50 65.60 40.65 44.90 64.75 75.35 80.60
100 54.05 56.30 70.35 77.85 83.05 49.60 59.00 80.15 88.40 90.95
150 63.35 68.80 83.45 85.70 90.35 59.10 68.40 88.90 94.95 95.45
200 73.65 78.75 88.80 91.30 93.75 67.30 74.70 94.05 97.55 98.70

Notes: See notes to Table 1. CCEMG is based on (40), which features cross-section averages of $z_{it} = (y_{it}, x_{it}, g_{it})'$. The QMLE estimator and the MG estimator based on Song's individual estimates are computed from the demeaned variables $\dot{y}_{it}$ and $\dot{x}_{it}$ defined in (41).

A Mathematical Appendix
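A.1 Notations and Definitions
We begin by briefly summarizing the notation used in the paper and introduce new notation that will prove useful in the proofs below. All vectors are represented by bold lower case letters and matrices by bold upper case letters. We use $\langle \mathbf{a}, \mathbf{b} \rangle = \mathbf{a}'\mathbf{b}$ to denote the inner product (corresponding to the Euclidean norm) of vectors $\mathbf{a}$ and $\mathbf{b}$. $\|\mathbf{A}\|_1 \equiv \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|$ and $\|\mathbf{A}\|_\infty \equiv \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$ denote the maximum absolute column and row sum norms of $\mathbf{A} \in \mathbb{M}^{n \times n}$, respectively, where $\mathbb{M}^{n \times n}$ is the space of real-valued $n \times n$ matrices. $\|\mathbf{A}\| = \sqrt{\varrho(\mathbf{A}'\mathbf{A})}$ is the spectral norm of $\mathbf{A}$, $\varrho(\mathbf{A}) \equiv \max_{1 \le i \le n} \{|\lambda_i(\mathbf{A})|\}$ is the spectral radius of $\mathbf{A}$, and $|\lambda_1(\mathbf{A})| \ge |\lambda_2(\mathbf{A})| \ge \dots \ge |\lambda_n(\mathbf{A})|$ are the eigenvalues of $\mathbf{A}$. $Col(\mathbf{A})$ denotes the space spanned by the column vectors of $\mathbf{A}$. Note that $\|\mathbf{a}\| = \sqrt{\varrho(\mathbf{a}'\mathbf{a})} = \sqrt{\mathbf{a}'\mathbf{a}}$ corresponds to the Euclidean length of vector $\mathbf{a}$.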
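The norms defined above can be checked numerically. The following is a minimal sketch in Python with NumPy; the function names and the example matrix are illustrative choices and do not come from the paper. It also checks the bound $\|\mathbf{A}\| \le \sqrt{\|\mathbf{A}\|_1 \|\mathbf{A}\|_\infty}$, which is the form in which these norms are used later in the appendix.

```python
import numpy as np

def column_sum_norm(A):
    """Maximum absolute column sum norm, ||A||_1."""
    return np.abs(A).sum(axis=0).max()

def row_sum_norm(A):
    """Maximum absolute row sum norm, ||A||_inf."""
    return np.abs(A).sum(axis=1).max()

def spectral_radius(A):
    """Largest eigenvalue of a square matrix A in absolute value."""
    return np.abs(np.linalg.eigvals(A)).max()

def spectral_norm(A):
    """Spectral norm, the square root of the spectral radius of A'A."""
    return np.sqrt(spectral_radius(A.T @ A))

# Illustrative inputs (hypothetical, chosen only for easy hand checking).
A = np.array([[1.0, -2.0],
              [3.0, 4.0]])
a = np.array([[3.0],
              [4.0]])  # a column vector, so spectral_norm(a) is its Euclidean length

print(column_sum_norm(A), row_sum_norm(A), spectral_norm(A), spectral_norm(a))
```

For a column vector the spectral norm collapses to the Euclidean length, which is the remark that closes the paragraph above.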
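Let
\[
\underset{(T-p_T) \times 1}{\mathbf{y}_i} = (y_{i,p_T+1}, y_{i,p_T+2}, \dots, y_{iT})', \qquad
\underset{(T-p_T) \times 1}{\mathbf{y}_{i,-1}} = (y_{i,p_T}, y_{i,p_T+1}, \dots, y_{i,T-1})',
\]
\[
\underset{(T-p_T) \times k_x}{\mathbf{X}_i} = (\mathbf{x}_{i,p_T+1}, \mathbf{x}_{i,p_T+2}, \dots, \mathbf{x}_{iT})', \qquad
\underset{(T-p_T) \times k_x}{\mathbf{X}_{i,-1}} = (\mathbf{x}_{i,p_T}, \mathbf{x}_{i,p_T+1}, \dots, \mathbf{x}_{i,T-1})',
\]
$\boldsymbol{\tau}_{T-p_T} = (1, 1, \dots, 1)'$ is a $(T-p_T) \times 1$ vector of ones, $\boldsymbol{\xi}_{it} = (y_{i,t-1}, \mathbf{x}_{it}', \mathbf{x}_{i,t-1}')'$,
\[
\underset{(T-p_T) \times (2k_x+1)}{\boldsymbol{\Xi}_i} = (\boldsymbol{\xi}_{i,p_T+1}, \boldsymbol{\xi}_{i,p_T+2}, \dots, \boldsymbol{\xi}_{iT})' = (\mathbf{y}_{i,-1}, \mathbf{X}_i, \mathbf{X}_{i,-1}),
\]
\[
\underset{(T-p_T) \times m}{\mathbf{F}} = (\mathbf{f}_{p_T+1}, \mathbf{f}_{p_T+2}, \dots, \mathbf{f}_T)', \quad \text{and} \quad
\boldsymbol{\varepsilon}_i = (\varepsilon_{i,p_T+1}, \varepsilon_{i,p_T+2}, \dots, \varepsilon_{iT})'.
\]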

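Using the above notations, model (1) can be written as
\[
\mathbf{y}_i = c_{yi} \boldsymbol{\tau}_{T-p_T} + \phi_i \mathbf{y}_{i,-1} + \mathbf{X}_i \boldsymbol{\beta}_{0i} + \mathbf{X}_{i,-1} \boldsymbol{\beta}_{1i} + \mathbf{F} \boldsymbol{\gamma}_i + \boldsymbol{\varepsilon}_i,
\]
or more compactly as
\[
\mathbf{y}_i = c_{yi} \boldsymbol{\tau}_{T-p_T} + \boldsymbol{\Xi}_i \boldsymbol{\pi}_i + \mathbf{F} \boldsymbol{\gamma}_i + \boldsymbol{\varepsilon}_i, \tag{A.1}
\]
for $i = 1, 2, \dots, N$, where $\boldsymbol{\pi}_i = (\phi_i, \boldsymbol{\beta}_{0i}', \boldsymbol{\beta}_{1i}')'$. Let also $\mathbf{z}_{it} = (y_{it}, \boldsymbol{\omega}_{it}')'$, $\bar{\mathbf{z}}_{wt} = (\bar{y}_{wt}, \bar{\boldsymbol{\omega}}_{wt}')' = \sum_{i=1}^{N} w_i \mathbf{z}_{it}$,
\[
\underset{(T-p_T) \times [(k+1)p_T+1]}{\bar{\mathbf{Q}}_w} =
\begin{pmatrix}
1 & \bar{\mathbf{z}}_{w,p_T+1}' & \bar{\mathbf{z}}_{w,p_T}' & \cdots & \bar{\mathbf{z}}_{w,1}' \\
1 & \bar{\mathbf{z}}_{w,p_T+2}' & \bar{\mathbf{z}}_{w,p_T+1}' & \cdots & \bar{\mathbf{z}}_{w,2}' \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \bar{\mathbf{z}}_{w,T}' & \bar{\mathbf{z}}_{w,T-1}' & \cdots & \bar{\mathbf{z}}_{w,T-p_T}'
\end{pmatrix},
\quad \text{and} \quad
\underset{(T-p_T) \times 1}{\boldsymbol{\eta}_i} =
\begin{pmatrix}
\sum_{\ell=p_T+1}^{\infty} \boldsymbol{\delta}_{i\ell}' \bar{\mathbf{z}}_{w,p_T+1-\ell} \\
\sum_{\ell=p_T+1}^{\infty} \boldsymbol{\delta}_{i\ell}' \bar{\mathbf{z}}_{w,p_T+2-\ell} \\
\vdots \\
\sum_{\ell=p_T+1}^{\infty} \boldsymbol{\delta}_{i\ell}' \bar{\mathbf{z}}_{w,T-\ell}
\end{pmatrix}.
\]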

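Model (A.1) can be equivalently written as (see also (22)),
\[
\mathbf{y}_i = \boldsymbol{\Xi}_i \boldsymbol{\pi}_i + \bar{\mathbf{Q}}_w \mathbf{d}_i + \boldsymbol{\varepsilon}_i + \boldsymbol{\eta}_i + \boldsymbol{\vartheta}_i, \tag{A.2}
\]
where $\mathbf{d}_i = (c_{yi}^{*}, \boldsymbol{\delta}_{i0}', \boldsymbol{\delta}_{i1}', \dots, \boldsymbol{\delta}_{ip_T}')'$, $\boldsymbol{\delta}_i(L)$ is given by $\boldsymbol{\delta}_i(L) = \mathbf{G}'(L) \boldsymbol{\gamma}_i$, equivalently $\boldsymbol{\delta}_i'(L) = \boldsymbol{\gamma}_i' (\mathbf{C}'\mathbf{C})^{-1} \mathbf{C}' \boldsymbol{\Lambda}^{-1}(L)$, see (23), $c_{yi}^{*} = c_{yi} - \boldsymbol{\delta}_i'(1) \bar{\mathbf{c}}_{zw}$, and
\[
\boldsymbol{\vartheta}_i = c_{yi} \boldsymbol{\tau}_{T-p_T} + \mathbf{F} \boldsymbol{\gamma}_i - \bar{\mathbf{Q}}_w \mathbf{d}_i - \boldsymbol{\eta}_i
= \mathbf{F} \boldsymbol{\gamma}_i - \tilde{\mathbf{Z}}_w \boldsymbol{\delta}_i(L), \tag{A.3}
\]
in which
\[
\tilde{\mathbf{Z}}_w = \bar{\mathbf{Z}}_w - \boldsymbol{\tau}_{T-p_T} \bar{\mathbf{c}}_{zw}', \qquad
\underset{(T-p_T) \times (k+1)}{\bar{\mathbf{Z}}_w} = (\bar{\mathbf{z}}_{w,p_T+1}, \bar{\mathbf{z}}_{w,p_T+2}, \dots, \bar{\mathbf{z}}_{w,T})', \quad \text{and} \quad
\bar{\mathbf{c}}_{zw} = \sum_{i=1}^{N} w_i (\mathbf{I}_{k+1} - \mathbf{A}_i)^{-1} \mathbf{c}_{zi}.
\]
Note that the individual elements of $\boldsymbol{\vartheta}_i = (\vartheta_{i,p_T+1}, \vartheta_{i,p_T+2}, \dots, \vartheta_{i,T})'$ are $O_p(N^{-1/2})$ uniformly across all $i$ and $t$.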
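Define also the following $(T-p_T) \times (T-p_T)$ projection matrices
\[
\mathbf{P}_h = \mathbf{H}_w (\mathbf{H}_w' \mathbf{H}_w)^{+} \mathbf{H}_w', \quad \text{and} \quad
\mathbf{M}_h = \mathbf{I}_{T-p_T} - \mathbf{H}_w (\mathbf{H}_w' \mathbf{H}_w)^{+} \mathbf{H}_w', \tag{A.4}
\]
in which
\[
\underset{(T-p_T) \times [(k+1)p_T+1]}{\mathbf{H}_w} =
\begin{pmatrix}
1 & \mathbf{h}_{w,p_T+1}' & \mathbf{h}_{w,p_T}' & \cdots & \mathbf{h}_{w,1}' \\
1 & \mathbf{h}_{w,p_T+2}' & \mathbf{h}_{w,p_T+1}' & \cdots & \mathbf{h}_{w,2}' \\
\vdots & \vdots & \vdots & & \vdots \\
1 & \mathbf{h}_{w,T}' & \mathbf{h}_{w,T-1}' & \cdots & \mathbf{h}_{w,T-p_T}'
\end{pmatrix},
\]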

and hwt = w (L) ft + czw , where

N
X 1
w (L) = wi (Ik+1 Ai L) A0;i1 Ci .
i=1
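The construction of the projection matrices in (A.4) can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions: the helper names `lagged_design` and `projectors`, the simulated series and the tolerance `rcond=1e-10` are all hypothetical choices, not the paper's. The sketch stacks a constant, the current value and `p` lags of a vector series, so the matrix has 1 + (number of series) x (p + 1) columns, and it uses the Moore-Penrose inverse because the columns are deliberately collinear.

```python
import numpy as np

def lagged_design(h, p):
    """Stack rows (1, h_t', h_{t-1}', ..., h_{t-p}') for t = p+1, ..., T.

    h is a (T, n) array; the result has T - p rows and 1 + n * (p + 1) columns.
    """
    T = h.shape[0]
    blocks = [np.ones((T - p, 1))]
    for lag in range(p + 1):
        blocks.append(h[p - lag:T - lag])
    return np.hstack(blocks)

def projectors(H, rcond=1e-10):
    """Return (P, M): P projects onto Col(H), M onto its orthogonal complement."""
    P = H @ np.linalg.pinv(H.T @ H, rcond=rcond) @ H.T
    M = np.eye(H.shape[0]) - P
    return P, M

rng = np.random.default_rng(0)
T, p = 30, 2
f = rng.standard_normal((T, 1))
h = np.hstack([f, 2.0 * f])   # two series driven by one factor: rank deficient
H = lagged_design(h, p)
P, M = projectors(H)

print(H.shape, np.linalg.matrix_rank(H))
```

Using the generalized inverse means the projectors stay well defined even though `H` does not have full column rank, which is the situation the rank-deficient case in the appendix has in mind.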

Furthermore, let Vw = Qw Hw , and note that


0 0 0 0
1
0 w;pT +1 wpT w1
B 0 0 0 C N
X
B 0 w;pT +2 w;pT +1 w2 C
Vw = B C, 1
B .. .. .. .. C wt = wi (Ik+1 Ai L) A0;i1 eit ,
@ . . . . A i=1
0 0 0
0 wT w;T 1 w;T pT

and Hw = F w, where 0 1
1 fp0 T +1 fp0 T f10
B C
B 1 fp0 T +2 fp0 T +1 f20 C
F =B
B .. .. .. .. C,
C
T pT 1+mpT @ A
. . . .
1 fT0 0
fT 1 fT0 pT
0 1
1 c0zw c0zw c0zw
B C
B 0 0
w (L) 0 0 C
B m 1 m k+1 m k+1 C
B C N
X
B 0 0 0
w (L) 0 C 1
w =B m 1 m k+1 m k+1 C , and w (L) = wi (Ik+1 Ai L) A0;i1 Ci .
(pT m+1) [pT (k+1)+1] B .. .. .. C
B .. C i=1
B . . . . C
@ A
0
0 0 0 w (L)
m 1 m k+1 m k+1

We also de…ne 0 1
1 0 0
B 1 kx 1 kx C
B 0 0 Ikx C
S =B C, (A.5)
(1+2kx ) (1+2kx ) @ kx 1 kx kx A
0 Ikx 0
kx 1 kx kx

0 0
= yi;t 1 ; x0i;t 1 ; x0it , and note that it = S0 it , and i = i S, where i = i;pT +1 ; i;pT +2 ; :::; iT .
it
Individual elements of it are also denoted as ist for s = 1; 2; :::; 2k + 1, and the vector of observations on
ist is 0 1
i;s;pT +1
B .. C
is =B
@ .
C.
A
T pT 1
isT

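Recall that the panel data model (1)-(3) can be written as the VAR model (6) in $\mathbf{z}_{it} = (y_{it}, \mathbf{x}_{it}', \mathbf{g}_{it}')'$. Hence we have
\[
\mathbf{z}_{it} = \sum_{\ell=0}^{\infty} \mathbf{A}_i^{\ell} \left( \mathbf{c}_{zi} + \mathbf{A}_{0i}^{-1} \mathbf{C}_i \mathbf{f}_{t-\ell} + \mathbf{A}_{0i}^{-1} \mathbf{e}_{i,t-\ell} \right),
\]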

and 0 1
yi;t 1 !
B C S0yx zi;t 1
it = @ xi;t 1 A = =c i + i (L) (Ci ft + eit ) ,
S0x zit
xit
where 0 1
1 0 0
1 kx 1 kg
S0yx =@ A, S0x = kx 1
0 Ikx 0
kx kg
,
kx +1 k+1 0 Ikx 0 kx k+1
kx 1 kx kg

0
c i = i (L) (Syx ; Sx ) czi , and
! 0 1 !
0
kx +1 k+1 A0i1 hSyx (Ik+1 Ai L) L i A0i1 .
i (L) = + 1 (A.6)
(1+2kx ) (k+1) S0x S0x (Ik+1 Ai L) Ik+1

A.2 Statement of Lemmas


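Lemma A.1 Let $\mathbf{A} = (\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_{s_N})$ and $\mathbf{B} = (\mathbf{b}_1, \mathbf{b}_2, \dots, \mathbf{b}_{s_N})$ be $r_N \times s_N$ random matrices, where $r_N$ and $s_N$ are deterministic sequences nondecreasing in $N$. Suppose also that $\|\mathbf{a}_\ell\| = O_p(r_N^{1/2})$ and $\|\mathbf{b}_\ell\| = O_p(r_N^{1/2} N^{-1/2})$, uniformly in $\ell$, for $\ell = 1, 2, \dots, s_N$. Then for any $\boldsymbol{\theta}_{A,1}, \boldsymbol{\theta}_{A,2} \in Col(\mathbf{A})$ for which there exist vectors $\mathbf{c}_1$ and $\mathbf{c}_2$ such that $\boldsymbol{\theta}_{A,1} = \mathbf{A}\mathbf{c}_1$, $\boldsymbol{\theta}_{A,2} = \mathbf{A}\mathbf{c}_2$, $\|\mathbf{c}_1\|_\infty < K$ and $\|\mathbf{c}_2\|_\infty < K$, where the constant $K < \infty$ does not depend on $N$, we have
\[
\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \| = O_p\!\left( \frac{s_N \sqrt{r_N}}{\sqrt{N}} \right), \tag{A.7}
\]
and
\[
\langle \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1}, \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2} \rangle = \boldsymbol{\theta}_{A,1}' \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2} = O_p\!\left( \frac{s_N^2 r_N}{N} \right), \tag{A.8}
\]
where $\mathbf{M}_{A+B}$ is the orthogonal projection matrix that projects onto the orthogonal complement of $Col(\mathbf{A} + \mathbf{B})$.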

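Lemma A.2 Suppose Assumptions 1-5 and 7 hold and $(N, T, p_T) \overset{j}{\to} \infty$. Then
\[
\frac{1}{T} \sum_{t=1}^{T} y_{i,t-1} \varepsilon_{it} \overset{p}{\to} 0, \text{ uniformly in } i, \tag{A.9}
\]
\[
\frac{1}{T} \sum_{t=1}^{T} \boldsymbol{\omega}_{i,t-s} \varepsilon_{it} \overset{p}{\to} \underset{k \times 1}{\mathbf{0}}, \text{ uniformly in } i, \tag{A.10}
\]
and, if also $p_T^3/T \to \varkappa$ for some constant $0 < \varkappa < \infty$,
\[
\frac{1}{T} \sum_{t=1}^{T} \mathbf{h}_{w,t-q} \varepsilon_{it} = O_p(T^{-1/2}), \text{ uniformly in } i \text{ and } q, \tag{A.11}
\]
for $i = 1, 2, \dots, N$, $q = 1, 2, \dots, p_T$, and $s = 0, 1$. The same results hold when $\varepsilon_{it}$ is replaced by $\eta_{it}$ and $\vartheta_{it}$.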
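Lemma A.3 Suppose Assumptions 1-5 and 7 hold and $(N, T, p_T) \overset{j}{\to} \infty$ such that $p_T^3/T \to \varkappa$, $0 < \varkappa < \infty$. Then
\[
\frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\Xi}_i}{T} \overset{p}{\to} \boldsymbol{\Sigma}_{i\xi} \text{ uniformly in } i, \tag{A.12}
\]
and
\[
\frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \mathbf{F}}{T} \overset{p}{\to} \mathbf{Q}_{if} \text{ uniformly in } i, \tag{A.13}
\]
where $\boldsymbol{\Sigma}_{i\xi}$ is positive definite and given by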

i = i + f i, (A.14)

and
Qif = cov [S0 i (L) Ci ft ; Ci ft ] , (A.15)

in which
i = V ar [S0 i (L) eit ] , fi = V ar [S0 i (L) Ci ft ] , (A.16)

Ci = Mc Ci , Mc = Ik+1 CC+ is orthogonal projector onto the orthogonal complement of Col (C),
P1 ` 0 0
i (L) = `=0 i` L is de…ned in (A.6), selection matrix S is de…ned in (A.5) and eit = ("it ; vit ) . When
P1
factors are serially uncorrelated, then f i = `=0 S0 i` (Ci f Ci 0 ) 0 i` S and Qif = S0 i0 (Ci f Ci 0 ),
where f = V ar (ft ).
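Lemma A.4 Suppose Assumptions 1-5 and 7 hold and $(N, T, p_T) \overset{j}{\to} \infty$ such that $p_T^3/T \to \varkappa$ for some constant $0 < \varkappa < \infty$. Then,
\[
\frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\varepsilon}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \text{ uniformly in } i, \tag{A.17}
\]
\[
\frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\eta}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \text{ uniformly in } i, \tag{A.18}
\]
and
\[
\frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\vartheta}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \text{ uniformly in } i. \tag{A.19}
\]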

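Lemma A.5 Suppose Assumptions 1-5 hold and unobserved common factors are serially uncorrelated. Then, as $(N, T, p_T) \overset{j}{\to} \infty$, we have
\[
\frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\Sigma}_{i\xi}^{-1} \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \mathbf{F}}{T} \boldsymbol{\eta}_{\gamma i} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}. \tag{A.20}
\]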

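Lemma A.6 Suppose Assumptions 1-5 hold and $(N, T, p_T) \overset{j}{\to} \infty$ such that $p_T^2/T \to 0$. Then,
\[
\sqrt{N} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i}{T} - \sqrt{N} \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\Xi}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times (2k_x+1)}{\mathbf{0}} \text{ uniformly in } i, \tag{A.21}
\]
\[
\sqrt{N} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\varepsilon}_i}{T} - \sqrt{N} \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\varepsilon}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}} \text{ uniformly in } i, \tag{A.22}
\]
\[
\sqrt{N} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \mathbf{F}}{T} - \sqrt{N} \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \mathbf{F}}{T} \overset{p}{\to} \underset{(2k_x+1) \times m}{\mathbf{0}} \text{ uniformly in } i, \tag{A.23}
\]
\[
\frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\eta}_i}{T} - \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\eta}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \text{ uniformly in } i, \tag{A.24}
\]
and
\[
\frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\vartheta}_i}{T} - \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\vartheta}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \text{ uniformly in } i. \tag{A.25}
\]

Lemma A.7 Suppose Assumptions 1-5 hold and $(N, T, p_T) \overset{j}{\to} \infty$ such that $N/T \to \varkappa$, for some $0 < \varkappa < \infty$, and $p_T^2/T \to 0$. Then,
\[
\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\varepsilon}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}. \tag{A.26}
\]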

A.3 Proofs of Lemmas


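Proof of Lemma A.1. The Hilbert projection theorem (see Rudin, 1987) implies
\[
\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \| \le \| \boldsymbol{\theta}_{A,1} - \boldsymbol{\theta}_{A+B} \|, \tag{A.27}
\]
for any vector $\boldsymbol{\theta}_{A+B} \in Col(\mathbf{A} + \mathbf{B})$. Consider the following choice of $\boldsymbol{\theta}_{A+B}$,
\[
\boldsymbol{\theta}_{A+B} = \sum_{\ell=1}^{s_N} \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell c_{1\ell}, \tag{A.28}
\]
where $\mathbf{P}_{a_\ell + b_\ell}$ is the orthogonal projector onto $Col(\mathbf{a}_\ell + \mathbf{b}_\ell)$, and $c_{1\ell}$, for $\ell = 1, 2, \dots, s_N$, are the elements of vector $\mathbf{c}_1$. Using $\boldsymbol{\theta}_{A,1} = \mathbf{A}\mathbf{c}_1 = \sum_{\ell=1}^{s_N} \mathbf{a}_\ell c_{1\ell}$, (A.27) with $\boldsymbol{\theta}_{A+B}$ given by (A.28) can be written as
\[
\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \| \le \left\| \sum_{\ell=1}^{s_N} \mathbf{a}_\ell c_{1\ell} - \sum_{\ell=1}^{s_N} \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell c_{1\ell} \right\|.
\]
Using now the triangle inequality, we obtain
\[
\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \| \le \sum_{\ell=1}^{s_N} \| \mathbf{a}_\ell c_{1\ell} - \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell c_{1\ell} \|
\le \sum_{\ell=1}^{s_N} |c_{1\ell}| \, \| \mathbf{a}_\ell - \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell \|. \tag{A.29}
\]
Next, we establish an upper bound to $\| \mathbf{a}_\ell - \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell \|$. Consider the triangle given by $\mathbf{a}_\ell$, $\mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell$ and $\mathbf{a}_\ell + \mathbf{b}_\ell$. The Hilbert projection theorem (see Rudin, 1987) implies
\[
\| \mathbf{a}_\ell - \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell \| \le \| \mathbf{a}_\ell - \mu (\mathbf{a}_\ell + \mathbf{b}_\ell) \|,
\]
for any scalar $\mu$, and setting $\mu = 1$ we have
\[
\| \mathbf{a}_\ell - \mathbf{P}_{a_\ell + b_\ell} \mathbf{a}_\ell \| \le \| \mathbf{a}_\ell - \mathbf{a}_\ell - \mathbf{b}_\ell \| \le \| \mathbf{b}_\ell \| = O_p( r_N^{1/2} N^{-1/2} ).
\]
Using this result in (A.29) and noting that $|c_{1\ell}| < K$ by assumption, it follows that
\[
\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \| = O_p\!\left( \frac{s_N r_N^{1/2}}{N^{1/2}} \right),
\]
as desired.
Consider now the inner product of vectors $\mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1}$ and $\mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2}$. Using the Cauchy-Schwarz inequality, we obtain
\[
\boldsymbol{\theta}_{A,1}' \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2} = (\mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1})' (\mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2}) \le \| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \| \, \| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2} \|.
\]
But (A.7) implies that both $\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,1} \|$ and $\| \mathbf{M}_{A+B} \boldsymbol{\theta}_{A,2} \|$ are $O_p( s_N \sqrt{r_N} / \sqrt{N} )$. These results establish (A.8), as desired.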
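The chain of inequalities in the proof of Lemma A.1 implies a deterministic bound that holds for any realization: the norm of the projected vector is at most the sum over columns of $|c_{1\ell}|$ times the norm of $\mathbf{b}_\ell$. A minimal numerical sketch of this bound follows. The dimensions, the random draws and the helper name `annihilator` are hypothetical and serve only as an illustration; the sketch does not verify the stochastic orders themselves.

```python
import numpy as np

def annihilator(X, rcond=1e-10):
    """Orthogonal projector onto the orthogonal complement of Col(X)."""
    return np.eye(X.shape[0]) - X @ np.linalg.pinv(X, rcond=rcond)

rng = np.random.default_rng(1)
r, s, N = 200, 5, 10_000                     # illustrative sizes only
A = rng.standard_normal((r, s))              # columns with norm of order sqrt(r)
B = rng.standard_normal((r, s)) / np.sqrt(N) # columns smaller by a factor sqrt(N)
c1 = rng.uniform(-1.0, 1.0, size=s)          # bounded coefficients
c2 = rng.uniform(-1.0, 1.0, size=s)

theta1, theta2 = A @ c1, A @ c2              # two vectors in Col(A)
M = annihilator(A + B)

lhs = np.linalg.norm(M @ theta1)
bound = np.sum(np.abs(c1) * np.linalg.norm(B, axis=0))  # right side implied by (A.29)
inner = (M @ theta1) @ (M @ theta2)                      # inner product in (A.8)

print(lhs, bound, inner)
```

Because the columns of `B` shrink at rate `N ** -0.5`, the bound shrinks at the same rate, which is the mechanism behind (A.7).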
Proof of Lemma A.2. Note that all processes, $\varepsilon_{it}$, $\eta_{it}$, $\vartheta_{it}$, $y_{it}$, $\boldsymbol{\omega}_{it}$ and $\mathbf{h}_{wt}$, are stationary with absolutely summable autocovariances, and their cross products are ergodic in mean. Lemma A.2 can be established in the same way as Lemma 1 in Chudik and Pesaran (2011) by applying a mixingale weak law.
Proof of Lemma A.3. Lemma A.3 can be established in a similar way as Lemma A.5 in Chudik, Pesaran, and Tosetti (2011) and by observing that $\mathbf{M}_h$ is asymptotically the projector onto the orthogonal complement of the space spanned by $\mathbf{C}\mathbf{f}_t$.
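Proof of Lemma A.4. Let us denote the individual columns of $\boldsymbol{\Xi}_i$ as $\boldsymbol{\xi}_{is}$, for $s = 1, 2, \dots, 2k_x + 1$, and define the scaled vectors $\boldsymbol{\xi}_{is}^{\circ} = T^{-1/2} \boldsymbol{\xi}_{is}$ and $\boldsymbol{\varepsilon}_i^{\circ} = T^{-1/2} \boldsymbol{\varepsilon}_i$. Since the individual elements of $\boldsymbol{\xi}_{is}$ and $\boldsymbol{\varepsilon}_i$ are uniformly $O_p(1)$, we have $\|\boldsymbol{\xi}_{is}\| = O_p(T^{1/2})$, $\|\boldsymbol{\varepsilon}_i\| = O_p(T^{1/2})$ and therefore $\|\boldsymbol{\xi}_{is}^{\circ}\| = O_p(1)$ and $\|\boldsymbol{\varepsilon}_i^{\circ}\| = O_p(1)$. Now consider the inner product
\[
\langle \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ}, \mathbf{M}_h \boldsymbol{\varepsilon}_i^{\circ} \rangle = \langle \boldsymbol{\xi}_{is}^{\circ}, \boldsymbol{\varepsilon}_i^{\circ} \rangle - \langle \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ}, \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} \rangle, \tag{A.30}
\]
where $\langle \mathbf{a}, \mathbf{b} \rangle = \mathbf{a}'\mathbf{b}$ denotes the inner product of vectors $\mathbf{a}$ and $\mathbf{b}$, and $\mathbf{P}_h = \mathbf{H}_w (\mathbf{H}_w' \mathbf{H}_w)^{+} \mathbf{H}_w'$ is the orthogonal projection matrix that projects onto the column space of $\mathbf{H}_w$. Consider the probability limits of the elements in (A.30) as $(N, T, p_T) \overset{j}{\to} \infty$ such that $p_T^3/T \to \varkappa$ for some constant $0 < \varkappa < \infty$. (A.9) and (A.10) of Lemma A.2 establish that
\[
\langle \boldsymbol{\xi}_{is}^{\circ}, \boldsymbol{\varepsilon}_i^{\circ} \rangle \overset{p}{\to} 0, \text{ for } s = 1, 2, \dots, 2k_x + 1. \tag{A.31}
\]
Consider the Euclidean norm of the second term of (A.30). Using the Cauchy-Schwarz inequality we obtain the following upper bound,
\[
\| \langle \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ}, \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} \rangle \| \le \| \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ} \| \, \| \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} \|, \tag{A.32}
\]
where (by Pythagoras' theorem, see footnote 10)
\[
\| \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ} \| \le \| \boldsymbol{\xi}_{is}^{\circ} \| = O_p(1). \tag{A.33}
\]
Footnote 10: Let $\mathbf{M}_h = (\mathbf{I}_{T-p_T} - \mathbf{P}_h)$ and note that $\boldsymbol{\xi}_{is}^{\circ} = \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ} + \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ}$. Vectors $\mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ}$ and $\mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ}$ are orthogonal and therefore $\| \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ} + \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ} \|^2 = \| \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ} \|^2 + \| \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ} \|^2$. It now follows that $\| \boldsymbol{\xi}_{is}^{\circ} \|^2 = \| \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ} \|^2 + \| \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ} \|^2$, but since $\| \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ} \|^2 \ge 0$, we obtain $\| \boldsymbol{\xi}_{is}^{\circ} \|^2 \ge \| \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ} \|^2$.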

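Now we will establish convergence of $\| \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} \|$ in probability. By the spectral theorem there exists a unitary matrix $\mathbf{V}$ such that
\[
\mathbf{V}' \frac{\mathbf{H}_w' \mathbf{H}_w}{T} \mathbf{V} =
\begin{pmatrix}
\mathbf{D} & \underset{(r_c p_T + 1) \times (k+1-r_c) p_T}{\mathbf{0}} \\
\underset{(k+1-r_c) p_T \times (r_c p_T + 1)}{\mathbf{0}} & \underset{(k+1-r_c) p_T \times (k+1-r_c) p_T}{\mathbf{0}}
\end{pmatrix}, \tag{A.34}
\]
where $\mathbf{D}$ is an $r_c p_T + 1$ dimensional diagonal matrix with strictly positive diagonal elements and $r_c = rank(\mathbf{C})$. Also by assumption $\mathbf{f}_t$ is a stationary process with absolutely summable autocovariances, and so is $\mathbf{h}_{wt}$. Furthermore, $\mathbf{H}_w' \mathbf{H}_w / T = O_p(1)$ and the diagonal elements of $\mathbf{D}$ have nonzero (and finite) probability limits. Partition the unitary matrix $\mathbf{V} = (\mathbf{V}_1, \mathbf{V}_2)$ so that $T^{-1} \mathbf{V}_1' \mathbf{H}_w' \mathbf{H}_w \mathbf{V}_1 = \mathbf{D}$ and define $\mathbf{U}_1 = T^{-1/2} \mathbf{H}_w \mathbf{V}_1 \mathbf{D}^{-1/2}$. Note that $\mathbf{U}_1$ is an orthonormal basis of the space spanned by the column vectors of $\mathbf{H}_w$, namely
\[
\mathbf{U}_1' \mathbf{U}_1 = \mathbf{D}^{-1/2} \mathbf{V}_1' \frac{\mathbf{H}_w' \mathbf{H}_w}{T} \mathbf{V}_1 \mathbf{D}^{-1/2} = \mathbf{D}^{-1/2} \mathbf{D} \mathbf{D}^{-1/2} = \mathbf{I}_{r_c p_T + 1}.
\]
The scaled matrix $T^{-1/2} \mathbf{H}_w$ can now be written as $T^{-1/2} \mathbf{H}_w = \mathbf{U}_1 \mathbf{D}^{1/2} \mathbf{V}_1'$. Consider
\[
\mathbf{D}^{-1/2} \mathbf{V}_1' \frac{\mathbf{H}_w' \boldsymbol{\varepsilon}_i}{T} = \mathbf{D}^{-1/2} \mathbf{V}_1' \mathbf{V}_1 \mathbf{D}^{1/2} \mathbf{U}_1' \boldsymbol{\varepsilon}_i^{\circ} = \mathbf{U}_1' \boldsymbol{\varepsilon}_i^{\circ},
\]
where we have used that $\mathbf{V}_1' \mathbf{V}_1$ is an identity matrix since $\mathbf{V}_1$ is unitary. Using now the submultiplicative property of matrix norms and (A.11) of Lemma A.2, we obtain
\[
\| \mathbf{U}_1' \boldsymbol{\varepsilon}_i^{\circ} \|_\infty = \left\| \mathbf{D}^{-1/2} \mathbf{V}_1' \frac{\mathbf{H}_w' \boldsymbol{\varepsilon}_i}{T} \right\|_\infty
\le \| \mathbf{D}^{-1/2} \|_\infty \, \| \mathbf{V}_1' \|_\infty \left\| \frac{\mathbf{H}_w' \boldsymbol{\varepsilon}_i}{T} \right\|_\infty = O_p(T^{-1/2}),
\]
where $\| \mathbf{D}^{-1/2} \|_\infty = O_p(1)$ since the diagonal elements of the diagonal matrix $\mathbf{D}$ have positive probability limits, and $\| \mathbf{V}_1' \|_\infty = O_p(1)$ since $\mathbf{V}_1$ is unitary. This establishes that the individual elements of the vector $\mathbf{U}_1' \boldsymbol{\varepsilon}_i^{\circ}$ are (uniformly) $O_p(T^{-1/2})$. Consider next $\mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ}$, which is an orthogonal projection of $\boldsymbol{\varepsilon}_i^{\circ}$ on the space spanned by the column vectors of $\mathbf{H}_w$. Since $\mathbf{U}_1$ is an orthonormal basis of this space, we can write $\mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ}$ as the following linear combination of basis vectors (see footnote 11),
\[
\mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} = \sum_{j=1}^{r_c p_T + 1} \langle \boldsymbol{\varepsilon}_i^{\circ}, \mathbf{u}_{1j} \rangle \mathbf{u}_{1j}, \tag{A.35}
\]
where $\mathbf{u}_{1j}$, for $j = 1, 2, \dots, r_c p_T + 1$, denote the individual columns of $\mathbf{U}_1$. But we have shown that $| \langle \boldsymbol{\varepsilon}_i^{\circ}, \mathbf{u}_{1j} \rangle | = O_p(T^{-1/2})$ and $\| \mathbf{u}_{1j} \| = 1$ (orthonormality), and therefore
\[
\| \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} \| = O_p\!\left( \frac{p_T}{\sqrt{T}} \right). \tag{A.36}
\]
Footnote 11: The column vectors in $\mathbf{U}_1$ are orthogonal and therefore for any vector $\mathbf{a} \in Col(\mathbf{U}_1)$ we have $\mathbf{a} = \sum_{j=1}^{r_c p_T + 1} \frac{\langle \mathbf{a}, \mathbf{u}_{1j} \rangle}{\langle \mathbf{u}_{1j}, \mathbf{u}_{1j} \rangle} \mathbf{u}_{1j}$. But $\langle \mathbf{u}_{1j}, \mathbf{u}_{1j} \rangle = 1$ since each of the column vectors contained in $\mathbf{U}_1$ has unit length (orthonormality) and we obtain $\mathbf{a} = \sum_{j=1}^{r_c p_T + 1} \langle \mathbf{a}, \mathbf{u}_{1j} \rangle \mathbf{u}_{1j}$. (A.35) now follows by letting $\mathbf{a} = \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ}$ and noting that $\langle \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ}, \mathbf{u}_{1j} \rangle = \langle \boldsymbol{\varepsilon}_i^{\circ}, \mathbf{u}_{1j} \rangle$ since $\mathbf{P}_h \mathbf{u}_{1j} = \mathbf{u}_{1j}$.

Using (A.33) and (A.36) in (A.32) yields
\[
\| \langle \mathbf{P}_h \boldsymbol{\xi}_{is}^{\circ}, \mathbf{P}_h \boldsymbol{\varepsilon}_i^{\circ} \rangle \| = O_p\!\left( \frac{p_T}{\sqrt{T}} \right),
\]
for $s = 1, 2, \dots, 2k_x + 1$, and using this result together with (A.31) in (A.30) we obtain
\[
\| \langle \mathbf{M}_h \boldsymbol{\xi}_{is}^{\circ}, \mathbf{M}_h \boldsymbol{\varepsilon}_i^{\circ} \rangle \|_\infty \overset{p}{\to} 0,
\]
as desired. This completes the proof of (A.17).
(A.18) and (A.19) can be established in a similar way by noting that Lemma A.2 implies $T^{-1} \| \boldsymbol{\Xi}_i' \boldsymbol{\eta}_i \|_\infty \overset{p}{\to} 0$ and $\| \langle \boldsymbol{\eta}_i^{\circ}, T^{-1/2} \mathbf{H}_w \rangle \|_\infty = O_p(T^{-1/2})$ (required to establish (A.18)) and also $T^{-1} \| \boldsymbol{\Xi}_i' \boldsymbol{\vartheta}_i \|_\infty \overset{p}{\to} 0$, $\| \langle \boldsymbol{\vartheta}_i^{\circ}, T^{-1/2} \mathbf{H}_w \rangle \|_\infty = O_p(T^{-1/2})$ (required for (A.19)).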
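The orthonormal-basis argument used in this proof can be traced numerically. The sketch below is a hypothetical illustration, not the paper's code: it builds a rank-deficient stand-in for $\mathbf{H}_w$, diagonalizes $\mathbf{H}_w'\mathbf{H}_w/T$, keeps the eigenvectors with positive eigenvalues to form $\mathbf{U}_1$, and checks that expanding a vector in this basis reproduces the projection obtained from the generalized inverse. The variable names, the eigenvalue cutoff `1e-10` and the simulated data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50
f = rng.standard_normal((T, 2))
# Stand-in for H_w with 4 columns but rank 3 (last column is f1 - f2).
H = np.hstack([np.ones((T, 1)), f, f @ np.array([[1.0], [-1.0]])])

# Spectral decomposition of the symmetric matrix H'H / T.
eigval, V = np.linalg.eigh(H.T @ H / T)
keep = eigval > 1e-10                 # strictly positive eigenvalues only
V1, D = V[:, keep], eigval[keep]

# U1 = T^{-1/2} H V1 D^{-1/2}; each column is scaled by its own eigenvalue.
U1 = H @ V1 / np.sqrt(T * D)

x = rng.standard_normal(T)
P = H @ np.linalg.pinv(H.T @ H, rcond=1e-10) @ H.T
proj_basis = U1 @ (U1.T @ x)          # sum over j of <x, u_j> u_j

print(U1.shape, np.linalg.norm(P @ x), np.linalg.norm(x))
```

The final comparison of norms is the Pythagoras step of footnote 10: projecting can never lengthen a vector.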
Proof of Lemma A.5. De…ne 0
Mh F
'iT = i 1 i i,
T
PN
and consider the cross section average 'T = N 1 i=1 'iT . Note that

E ('iT ) = 0 , (A.37)
2kx +1 1

and
E 'iT '0jT = 0 for i 6= j, i; j = 1; 2; :::; N , (A.38)
2kx +1 2kx +1

since the unobserved common factors are serially uncorrelated and independently distributed of i , and i
is independently distributed across i. Next, we show that the individual elements of E ('iT '0iT ) are bounded
1
in N . i de…ned in Lemma A.3 is invertible under Assumption 7 and in particular i < K < 1. Using
Cauchy-Schwarz inequality, we obtain
r
2 4
E e f`t E eist E f`t
4 4 = O (1) ,
ist i` i`

for s = 1; 2; :::; 2k + 1, and ` = 1; 2; :::; m, where eist are the individual elements of 0i Mh , eist has uniformly
4 4 4 4
bounded 4-th moments under Assumption 7, and E f`t i` = E f`t E i` is also uniformly bounded
under Assumptions 2 and 3. It follows that there exists a constant K < 1, which does not depend on N
and such that
kE ('iT '0iT )k < K. (A.39)

Using now (A.38)-(A.39), we obtain


1
kV ar ('T )k = O N . (A.40)
p
(A.37) and (A.40) imply 'T ! 0, as desired.
Proof of Lemma A.6. Denote the individual columns of i by is , s = 1; 2; :::; 2k + 1 and consider

0 0 2 2
is Mq is is Mh is = Mq is kMh is k , (A.41)

for s = 1; 2; :::; 2k + 1. Hilbert projection theorem (see Rudin, 1987) implies

2 2
Mq is k is qk ,

for any vector q 2 Col Qw . Choose q = Ph is Mq Ph is , where Ph is orthogonal projector matrix


onto Col Qw , and note that q = IT pT Mq Ph is 2 Col Qw . Hence,

2 2
Mq is is Ph is + Mq Ph is
2
Mh is + Mq Ph is
2 2
kMh is k + Mq Ph is + 2 Mh is ; Mq Ph is , (A.42)

2 2 2
where we used Mh = IT pT Ph to obtain the second inequality, and we used ka + bk = kak + kbk +
2 ha; bi, for any vectors a and b, to obtain the third inequality. Similarly, we obtain the following upper
2
bound on kMh is k ,

2 2
kMh is k is Pq is + Mh Pq is
2
Mq is + Mh Pq is
2 2
Mq is + Mh Pq is + 2 Mq is ; Mh Pq is (A.43)

Using (A.42) and (A.43) in (A.41) yields the following lower and upper bounds,

2 2
1;N T Mq is kMh is k 2;N T , (A.44)

where
2
1;N T = Mh Pq is + 2 Mq is ; Mh Pq is , (A.45)

and
2
2;N T = Mq Ph ; Mq Ph is .
is + 2 Mh (A.46) is
p
Note that Pq is belongs to Col Qw and Pq is k is k = Op T pT since the individual elements
of is : are uniformly Op (1). Also, Qw = Hw + Vw , where elements of Vw are uniformly Op N 1=2 ,
whereas the elements of Hw are Op (1). Using Lemma A.1 (by setting A = Hw + Vw , B = Vw and
A;1 = Pq is ), we obtain p
pT T pT
Mh Pq is = Op p . (A.47)
N
Similarly, Lemma A.1 can be used again (by setting A = Hw , B = Vw and A;1 = Ph is ) to show that
p
pT T pT
Mq Ph is = Op p . (A.48)
N

Now consider the inner product on the right side of (A.45). Using Cauchy-Schwarz inequality, we have

Mq is ; Mh Pq is Mq is Mh Pq is ,
pT (T pT )
= Op p (A.49)
N
p 1=2
p
where Mq is k is k = Op T pT , and Mh Pq is = Op p T N T pT by (A.47). Similarly,

p
using kMh is k k is k = Op T pT , (A.48) and the Cauchy-Schwarz inequality, we obtain

Mh is ; Mq Ph is kMh is k Mq Ph is
pT (T pT )
= Op p (A.50)
N

Using (A.47)-(A.50) in (A.45) and (A.46) we obtain


!
2
p2T (T pT ) pT (T pT )
`;N T = Op + Op p , for ` = 1; 2;
N N

and using this result in (A.44) yields

2
!
p 2
Mq is Mh is (T pT ) pT (T pT )
N = Op p2T p + Op ,
T T T2 N T2
p
! 0,

for s = 1; 2; :::; 2k + 1, as (N; T; pT ) ! 1 such that p2T =T ! 0. This establishes that the diagonal elements
of
p 0
Mq i p 0
Mh i
N i N i
T T
tend to 0 in probability uniformly in i.
Now consider the o¤-diagonal elements. Convergence of individual terms

p 0 p 0
is Mq i` is Mh i`
N N , for s 6= `, s; ` = 1; 2; :::; k + 1,
T T
can be established following the same arguments as above, but using (A.8) instead of (A.7) of Lemma A.1.
This completes the proof of (A.21). (A.22)-(A.25) can be established in the same way.
Proof of Lemma A.7. Using the identity Mh = IT pT Ph , where Ph is orthogonal projection matrix
that projects onto Col (Hw ), we write the expression on the left side of (A.26) as:

N N N
1 X 0
i Mh "i 1 X 0i "i 1 X 0
i Ph " i
p =p p : (A.51)
N i=1 T N i=1 T N i=1 T

First we establish convergence of the …rst term on the right side of (A.51). Let TN = T (N ) and pN =
pT [T (N )] be any non-decreasing integer-valued functions of N such that limN !1 TN = 1 and limN !1 p2T =T =
0. The …rst term on the right side of (A.51) can be written as

N TN
1 X 0i "i X
p = N t,
N i=1 TN t=pT +1

where
N
X
1
Nt = p it "it .
TN N i=1
1 1
Let fcN t gt= 1 N =1 be two-dimensional array of constants and set cN t = T1N for all t 2 Z and N 2 N.
it and "jt are independently distributed for any i, j and t, and we have: E ( N t ) = 0; and the elements of

covariance matrix of N t =cN t are bounded, in particular
0
Nt Nt Nt
V ar = E 2 ,
cN t cN t
N N
1 XX 0
= E it jt "it "jt ,
N i=1 j=1

N N
1 XX 0
= E it jt E ("it "jt ) .
N i=1 j=1

0
Noting that E it jt is bounded in i; j and t, and E ("t "0t ) = RR0 under Assumption 1, we obtain

N N
Nt K XX
V ar E ("it "jt ) ,
cN t N i=1 j=1
K
k 0 E ("t "0t ) k ,
N
K
k 0N k kRk kR0 k k Nk .
N
p p
But k 0N k = k N k = N and kRk kRk1 kRk1 < K, where kRk1 and kRk1 are postulated to be
bounded by Assumption 1, and therefore

Nt
V ar = O (1) . (A.52)
cN t

(A.52) implies uniform integrability of f N t =cN t g and the array N t is uniformly integrable L1 -mixingale
array with respect to the constant array cN t . Using a mixingale weak law yields (Davidson, 1994, Theorem
19.11)
TN
X TN
X XN
1 L1
Nt = p it "it ! 0 .
t=pT +1
T N N t=pT +1 i=1
2kx +1 1

Convergence in L1 norm implies convergence in probability. This establishes

N
1 X 0i "i p
p ! 0 , (A.53)
N i=1 T 2kx +1 1

j
as (N; T; pT ) ! 1 and p2T =T ! 0.
Next consider the second term on the right hand side of (A.51), and note that

0 0 +
i Ph " i 1 i Hw H0w Hw H0w "i
= p p ,
T T T T T
1
= p G0iT #"i ,
T

where
0 +
i Hw H0w Hw
G0iT = ,
T T
and
H0 "i
#"i = pw .
T

De…ne also
+
G0i = i hh ,

in which =E e0 e 0 = 1; h0 ; h0
,h 0
i it hwt wt wt w;t 1 ; :::; hw;t pT denotes the individual rows of Hw , is 2kx +
1 (k + 1) pT + 1 dimensional matrix, and hh = E h e wt h
e0
wt is (k + 1) pT + 1 (k + 1) pT + 1 matrix.
Elements of i and hh are uniformly bounded and in particular

+ +
hh 1 = O (1) , hh 1 = O (1) , k i k1 = O (1) and k i k1 = O (1) , (A.54)
P1 P1
because `=0 jE ( ist hw;r2 t ` )j < K and `=0 jE (hw;r1 t hw;r2 t ` )j < K for any r1 ; r2 = 1; 2; :::; k + 1 and
s = 1; 2; :::k + 1, where hw;r1 t for r1 = 1; 2; :::; k + 1 denotes individual elements of hwt = w (L) ft + czw
and ist for s = 1; 2; :::k + 1 denotes individual elements of it . Using these notations, we can now write the
second term on the right side of (A.51) as

N r N
1 X 0
i Ph " i N 1 X 0
p = G #"i
N i=1 T T N i=1 iT
r N N
!
N 1 X 0 1 X 0
= Gi #"i + (GiT G0i ) #"i (A.55)
T N i=1 N i=1

Consider the …rst term inside the brackets on the right side of (A.55), and note that

N
! N
!0 N N
1 X 0 1 X 0 1 XX 0
E G #"i G #"i = G E #"i #0"j Gj . (A.56)
N i=1 i N i=1 i N 2 i=1 j=1 i

e 0 and the stochastic processes in h


Since "i is independently distributed of h e 0 are covariance stationary we
wt wt
also have
1
E #"i #0"j = E H0w "i "0j Hw = ij hh , (A.57)
T
where ij = E ("it "jt ). Using (A.57) in (A.56) and applying the submultiplicative property of matrix norm
yields

N
! N
!0 N N
1 X 0 1 X 0 1 XX 0
E G #"i G #"i = ij Gi hh Gj
N i=1 i N i=1 i N 2 i=1 j=1
1 1
N N
1 XX 0
j ij j kGi k1 k hh k1 kGj k1 ,
N 2 i=1 j=1

0
where k hh k1 = O (1), kG0i k1 = k i hh k1 k i k1 k hh k1 = O (1), and kGj k1 = ( j hh ) 1
=
PN PN
k j hh k1 k j k1 k hh k1 = O (1), see (A.54). Using these results and noting that N 1 i=1 j=1 j ij =
j
O (1) under Assumption 1, we obtain

N
! N
!0 N N
1 X 0 1 X 0 K XX
E G #"i G #"i j ij j
N i=1 i N i=1 i N 2 i=1 j=1
1
K
; (A.58)
N

which in turn implies that
r N
N 1 X 0 p
G #"i ! 0 , (A.59)
T N i=1 i 2kx +1 1

j
as (N; T; pT ) ! 1 such that N=T ! {1 , for some 0 < {1 < 1.
Now consider the second term inside the brackets on the right side of (A.55). Using submultiplicative
property of matrix norms, we have

N N
1 X 0 1 X
(GiT G0i ) #"i kG0iT G0i k1 k#"i k1 . (A.60)
N i=1 N i=1
1

Note that #"i has zero mean and V ar (#"i ) = E #"i #0"j = ij hh , see (A.57), where ij and the elements
of hh are uniformly bounded. It therefore follows that

k#"i k1 = Op (1) uniformly in i and pT . (A.61)


p
Consider now the term T kG0iT G0i k1 , and …rst note that

0 +
i Hw H0w Hw +
G0iT G0i = i hh
T T
" #
0 +
i Hw H0w Hw +
0
i Hw +
= i hh + i hh
T T T
" #
+
H0w Hw +
+ i hh :
T

Hence

0 +
i Hw H0w Hw +
0
i Hw +
kG0iT G0i k1 i hh + i hh 1
T 1 T T 1
1
" #
+
H0w Hw +
+k i k1 hh (A.62)
T
1

0
PT e0 e0
Individual elements of i Hw =T i can be written as , for r =
t=pT +1 i;r;t hw;s;t E i;r;t hw;s;t
e 0
1; 2; :::; k + 1 and s = 1; 2; :::; (k + 1) pT + 1, where i;r;t and hw;s;t are the elements of it and he wt . The
stochastic processes i;r;t and e h0w;s;t are covariance stationary with absolute summable autocovariances and
PT e e
we have t=pT +1 h0 E
i;r;t w;s;t h0 = Op T 1=2 uniformly in i and pT . This implies
i;r;t w;s;t

0
i Hw p
i = Op pT uniformly in i. (A.63)
T 1 T

Lemmas A.7 and A.8 of Chudik and Pesaran (2013b) establish that in the full column rank case where
rank (C) = m and k + 1 = m, we have

1
H0w Hw p
1
hh = Op pT ,
T T
1

where hh
e wt h
= E h e0 is (k + 1) pT + 1 (k + 1) pT + 1 nonsingular matrix (in the full column rank
wt

case with k + 1 = m). Using generalized inverse instead of inverse, the diagonalization of H0w Hw =T in
(A.34) and similar arguments as in Lemmas A.7 and A.8 of Chudik and Pesaran (2013b), the same result
can be established for the more general case when C does not necessarily have full column rank or when
rank (C) = m but k + 1 m, namely:

+
H0w Hw p
+
hh = Op pT (A.64)
T T
1

Using (A.54) and (A.63)-(A.64) in (A.62), we obtain

p
kG0iT G0i k1 = Op pT , uniformly in i. (A.65)
T

Using now (A.61) together with (A.65) in (A.60) yield

N
1 X 0 p
(GiT G0i ) #"i ! 0 , (A.66)
N i=1 2kx +1 1

j
as (N; T; pT ) ! 1; and p2T =T ! 0. Finally, using (A.59) and (A.66) in (A.55), we obtain

N
1 X 0
i Ph "i p
p ! 0 , (A.67)
N i=1 T 2kx +1 1

j
when (N; T; pT ) ! 1 such that N=T ! {, for some 0 < { < 1, and p2T =T ! 0. This completes the proof.

A.4 Proofs of Theorems and Propositions


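Proof of Theorem 1. Equation (24), for $t = p_T + 1, p_T + 2, \dots, T$, can be written as (see (A.2))
\[
\mathbf{y}_i = \boldsymbol{\Xi}_i \boldsymbol{\pi}_i + \bar{\mathbf{Q}}_w \mathbf{d}_i + \boldsymbol{\varepsilon}_i + \boldsymbol{\eta}_i + \boldsymbol{\vartheta}_i, \tag{A.68}
\]
where $\mathbf{d}_i = (c_{yi}^{*}, \boldsymbol{\delta}_{i0}', \boldsymbol{\delta}_{i1}', \dots, \boldsymbol{\delta}_{ip_T}')'$, $\boldsymbol{\varepsilon}_i = (\varepsilon_{i,p_T+1}, \varepsilon_{i,p_T+2}, \dots, \varepsilon_{iT})'$, $\boldsymbol{\eta}_i$ is a $(T-p_T) \times 1$ vector with its elements given by $\sum_{\ell=p_T+1}^{\infty} \boldsymbol{\delta}_{i\ell}' \bar{\mathbf{z}}_{w,t-\ell}$, for $t = p_T + 1, p_T + 2, \dots, T$, and $\boldsymbol{\vartheta}_i$ is a $(T-p_T) \times 1$ vector defined in (A.3) with its elements uniformly bounded by $O_p(N^{-1/2})$. Substituting (A.68) into the definition of $\hat{\boldsymbol{\pi}}_i$ in (26) and noting that $(\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i)^{-1} \boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i \boldsymbol{\pi}_i = \boldsymbol{\pi}_i$, we obtain
\[
\hat{\boldsymbol{\pi}}_i - \boldsymbol{\pi}_i = (\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i)^{-1} \boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q ( \bar{\mathbf{Q}}_w \mathbf{d}_i + \boldsymbol{\varepsilon}_i + \boldsymbol{\eta}_i + \boldsymbol{\vartheta}_i ). \tag{A.69}
\]
Note that $\bar{\mathbf{M}}_q \bar{\mathbf{Q}}_w = \bar{\mathbf{Q}}_w - \bar{\mathbf{Q}}_w (\bar{\mathbf{Q}}_w' \bar{\mathbf{Q}}_w)^{+} \bar{\mathbf{Q}}_w' \bar{\mathbf{Q}}_w = \bar{\mathbf{Q}}_w - \bar{\mathbf{Q}}_w = \mathbf{0}$, a $(T-p_T) \times [(k+1)p_T+1]$ matrix of zeros, and (A.69) reduces to
\[
\hat{\boldsymbol{\pi}}_i - \boldsymbol{\pi}_i = \left( \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i}{T} \right)^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q}{T} ( \boldsymbol{\varepsilon}_i + \boldsymbol{\eta}_i + \boldsymbol{\vartheta}_i ). \tag{A.70}
\]
Consider the asymptotics $(N, T, p_T) \overset{j}{\to} \infty$, such that $p_T^3/T \to \varkappa$, for some constant $0 < \varkappa < \infty$. (A.12) of Lemma A.3 and (A.21) of Lemma A.6 show that $T^{-1} \boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i$ converges in probability to a full rank matrix and therefore
\[
\left( \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i}{T} \right)^{-1} = O_p(1). \tag{A.71}
\]
Moreover, Lemmas A.4 and A.6 establish
\[
\frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\varepsilon}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \quad
\frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\eta}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \quad \text{and} \quad
\frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\vartheta}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}. \tag{A.72}
\]
Using (A.71)-(A.72) in (A.70) establishes (28), as desired.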
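The step from (A.69) to (A.70) relies on the annihilator removing every column of the matrix of cross-section averages, also when that matrix is rank deficient. A minimal sketch of this algebra follows, using simulated stand-ins: `Xi`, `Q`, `pi_true`, `d` and the helper `partialled_out_ols` are hypothetical names and values chosen for illustration, not quantities taken from the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(3)
T, k_xi, k_q = 60, 3, 4
Xi = rng.standard_normal((T, k_xi))          # stand-in for the regressor matrix
Q = rng.standard_normal((T, k_q))            # stand-in for the averages matrix
Q = np.hstack([Q, Q[:, :1] + Q[:, 1:2]])     # add a redundant column on purpose
pi_true = np.array([0.4, 1.0, -0.5])
d = rng.standard_normal(Q.shape[1])
eps = 0.1 * rng.standard_normal(T)
y = Xi @ pi_true + Q @ d + eps

# Annihilator built with the Moore-Penrose inverse, as in the proof.
Mq = np.eye(T) - Q @ np.linalg.pinv(Q.T @ Q, rcond=1e-10) @ Q.T

def partialled_out_ols(y, Xi, Mq):
    """Least squares of y on Xi after projecting out the columns behind Mq."""
    return np.linalg.solve(Xi.T @ Mq @ Xi, Xi.T @ Mq @ y)

pi_hat = partialled_out_ols(y, Xi, Mq)
pi_hat_without_q = partialled_out_ols(Xi @ pi_true + eps, Xi, Mq)

print(pi_hat, pi_hat_without_q)
```

The two estimates coincide because the term multiplying `d` is wiped out exactly, so the estimation error depends only on the remaining disturbance, as in (A.70).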


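Proof of Theorem 2. First suppose that the rank condition stated in Assumption 6 holds and consider the asymptotics $(N, T, p_T) \overset{j}{\to} \infty$, such that $p_T^3/T \to \varkappa$, for some constant $0 < \varkappa < \infty$. Using Theorem 1 and the definition of the mean group estimator $\hat{\boldsymbol{\pi}}_{MG}$ in (27), we have
\[
\hat{\boldsymbol{\pi}}_{MG} - \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\pi}_i \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}. \tag{A.73}
\]
Assumption 4 postulates that $\boldsymbol{\pi}_i = \boldsymbol{\pi} + \boldsymbol{\upsilon}_{\pi i}$, where $\boldsymbol{\upsilon}_{\pi i} \sim IID(\mathbf{0}, \boldsymbol{\Omega}_\pi)$, $\mathbf{0}$ is a $(2k_x+1) \times 1$ vector of zeros, and the norms of $\boldsymbol{\pi}$ and $\boldsymbol{\Omega}_\pi$ are bounded. It follows that $\| Var( N^{-1} \sum_{i=1}^{N} \boldsymbol{\upsilon}_{\pi i} ) \| = \| \boldsymbol{\Omega}_\pi / N \| \to 0$ as $N \to \infty$ and
\[
\frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\pi}_i - \boldsymbol{\pi} = \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\upsilon}_{\pi i} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}, \text{ as } N \to \infty. \tag{A.74}
\]
(A.73) and (A.74) establish (29), as desired.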
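The step behind (A.74) can be spelled out as a short worked derivation. This is a sketch under the assumption that the random coefficient deviations, written here as $\boldsymbol{\upsilon}_{\pi i}$ with covariance $\boldsymbol{\Omega}_\pi$ (notation assumed from Assumption 4), are mean zero and independent across $i$; $\epsilon > 0$ is an arbitrary constant introduced only for this illustration.

\[
Var\!\left( \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\upsilon}_{\pi i} \right) = \frac{1}{N^2} \sum_{i=1}^{N} Var( \boldsymbol{\upsilon}_{\pi i} ) = \frac{\boldsymbol{\Omega}_\pi}{N},
\qquad
\Pr\!\left( \left\| \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\upsilon}_{\pi i} \right\| > \epsilon \right) \le \frac{ tr( \boldsymbol{\Omega}_\pi ) }{ N \epsilon^2 } \to 0 .
\]

The first equality uses independence across $i$, and the inequality is Chebyshev's bound applied to the squared Euclidean norm, whose expectation equals the trace of the covariance matrix.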


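Now suppose that the rank condition does not hold. Using model (1)-(2), the vector of observations on the dependent variable, $\mathbf{y}_i = (y_{i,p_T+1}, y_{i,p_T+2}, \dots, y_{i,T})'$, can be written as (see (A.1))
\[
\mathbf{y}_i = \mathbf{c}_{yi} + \boldsymbol{\Xi}_i \boldsymbol{\pi}_i + \mathbf{F} \boldsymbol{\gamma}_i + \boldsymbol{\varepsilon}_i, \tag{A.75}
\]
where $\mathbf{c}_{yi} = c_{yi} \boldsymbol{\tau}_{T-p_T}$ and $\mathbf{F} = (\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_m)$ with $\mathbf{f}_\ell = (f_{\ell,p_T+1}, f_{\ell,p_T+2}, \dots, f_{\ell,T})'$ for $\ell = 1, 2, \dots, m$. Substituting (A.75) into the definition of $\hat{\boldsymbol{\pi}}_i$ in (26) and noting that $\bar{\mathbf{M}}_q \mathbf{c}_{yi} = \mathbf{0}$, a $(T-p_T) \times 1$ vector of zeros, and $(\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i)^{-1} \boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\Xi}_i \boldsymbol{\pi}_i = \boldsymbol{\pi}_i$, we obtain the following expression for the mean group estimator,
\[
\hat{\boldsymbol{\pi}}_{MG} = \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\pi}_i + \frac{1}{N} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\varepsilon}_i}{T} + \frac{1}{N} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \mathbf{F} \boldsymbol{\gamma}_i}{T}, \tag{A.76}
\]
where $\hat{\boldsymbol{\Psi}}_{\xi,iT}$ is defined in Assumption 7. Consider the asymptotics $(N, T, p_T) \overset{j}{\to} \infty$, such that $p_T^3/T \to \varkappa$, for some constant $0 < \varkappa < \infty$. The probability limit of the first term in (A.76) is established in (A.74). As before (see (A.71)), $\hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} = O_p(1)$ uniformly in $i$ and using also (A.17) and (A.22) of Lemmas A.4 and A.6, respectively, we obtain
\[
\frac{1}{N} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\varepsilon}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}. \tag{A.77}
\]
Finally, consider the last term on the right side of (A.76). Since $\boldsymbol{\Sigma}_{i\xi}$ is nonsingular, (A.12) of Lemma A.3 and (A.21) of Lemma A.6 establish that $\hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \overset{p}{\to} \boldsymbol{\Sigma}_{i\xi}^{-1}$, and together with (A.23) of Lemma A.6 we have
\[
\frac{1}{N} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \mathbf{F}}{T} \boldsymbol{\gamma}_i - \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\Sigma}_{i\xi}^{-1} \frac{\boldsymbol{\Xi}_i' \mathbf{M}_h \mathbf{F}}{T} \boldsymbol{\gamma}_i \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}.
\]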

Note that i = i + w w . F w w does not necessarily belong to the linear space spanned by
the column vectors of Q due to the truncation lag pT and, in particular, we have T 1 Mh F w = Op ( pT ),

1
T Mh F w = Op N 1=2 pT , and T 1 0i Mh F i = T 1 0i Mh F i + Op N 1=2 pT + Op ( pT ), where
1=2
w = Op N , j j < 1 and function ` , for ` = 1; 2; :::, is an upper bound on the exponential decay of
PN 1
coe¢ cients in the polynomial w (L) = i=1 wi (Ik+1 Ai L) A0;i1 Ci in the de…nition of Qw . Now, when
unobserved common factors are serially uncorrelated, we can use Lemma A.5 to obtain

N
1 Xb 1
0
i Mq F p
;iT i ! 0 . (A.78)
N i=1 T 2kx +1 1

Note that when factors are serially correlated and the rank condition does not hold then T 1 0i Mq F i does
not converge to 0 and as a result equation (A.78) would not hold. Using (A.74), (A.77) and (A.78) in
2kx +1 1
j
(A.76) establish b M G ! , when (N; T; pT ) ! 1 such that p3T =T ! { for some constant 0 < { < 1, as
desired.
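Proof of Theorem 3. Multiplying (A.76) by $\sqrt{N}$ and substituting $\boldsymbol{\pi}_i = \boldsymbol{\pi} + \boldsymbol{\upsilon}_{\pi i}$ we obtain
\[
\sqrt{N} ( \hat{\boldsymbol{\pi}}_{MG} - \boldsymbol{\pi} ) = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \boldsymbol{\upsilon}_{\pi i} + \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\varepsilon}_i}{T} + \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \mathbf{F} \boldsymbol{\gamma}_i}{T}, \tag{A.79}
\]
where $\hat{\boldsymbol{\Psi}}_{\xi,iT}$ is defined in Assumption 7. Consider the asymptotics $(N, T, p_T) \overset{j}{\to} \infty$ such that $N/T \to \varkappa_1$ and $p_T^3/T \to \varkappa_2$, for some constants $0 < \varkappa_1, \varkappa_2 < \infty$. We establish convergence of the individual elements on the right side of (A.79) below.
It follows from (A.21) of Lemma A.6 and (A.12) of Lemma A.3 that
\[
\hat{\boldsymbol{\Psi}}_{\xi,iT} - \boldsymbol{\Sigma}_{i\xi} = o_p( N^{-1/2} ) \text{ uniformly in } i. \tag{A.80}
\]
(A.80), (A.22) of Lemma A.6, and (A.26) of Lemma A.7 imply
\[
\frac{1}{\sqrt{N}} \sum_{i=1}^{N} \hat{\boldsymbol{\Psi}}_{\xi,iT}^{-1} \frac{\boldsymbol{\Xi}_i' \bar{\mathbf{M}}_q \boldsymbol{\varepsilon}_i}{T} \overset{p}{\to} \underset{(2k_x+1) \times 1}{\mathbf{0}}. \tag{A.81}
\]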

As in the proof of Theorem 2, i = i + w w , F w w does not necessarily belong to the


linear space spanned by the column vectors of Q due to the truncation lag pT and, in particular, we have
T 1 0i Mh F i = T 1 0i Mh F i + Op N 1=2 pT + Op ( pT ), where w = Op N
1=2
, j j < 1 and
`
function , for ` = 1; 2; :::, is an upper bound on the exponential decay of coe¢ cients in the polynomial
PN 1
w (L) = i=1 wi (Ik+1 A L) A0;i1 Ci in the de…nition of Qw . Using now (A.21) and (A.23) of Lemma
p p i
A.6 and noting that N T ! 0 yields

N N
1 Xb 1 X
0 0 1 0
1 i Mq F i Mh i i Mh F p
p ;iT i p i ! 0 . (A.82)
N i=1 T N i=1 T T 2kx +1 1

Using (A.81)-(A.82) in (A.79), we obtain


p d
N (b M G ) # i, ,

where
N N
1 X 1 X 0 1 0
i Mh i i Mh F
# i =p i +p i, (A.83)
N i=1 N i=1 T T
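and recall that $\boldsymbol{\upsilon}_{\pi i}$ and $\boldsymbol{\eta}_{\gamma i}$ are independently distributed across $i$. It now follows that $\sqrt{N} ( \hat{\boldsymbol{\pi}}_{MG} - \boldsymbol{\pi} ) \overset{d}{\to} N( \mathbf{0}, \boldsymbol{\Sigma}_{MG} )$, where $\mathbf{0}$ is a $(2k_x+1) \times 1$ vector of zeros and
\[
\boldsymbol{\Sigma}_{MG} = \boldsymbol{\Omega}_\pi + \lim_{N \to \infty} \left[ \frac{1}{N} \sum_{i=1}^{N} \boldsymbol{\Sigma}_{i\xi}^{-1} \mathbf{Q}_{if} \boldsymbol{\Omega}_\gamma \mathbf{Q}_{if}' \boldsymbol{\Sigma}_{i\xi}^{-1} \right], \tag{A.84}
\]
in which $\boldsymbol{\Omega}_\pi = Var( \boldsymbol{\pi}_i ) = Var( \boldsymbol{\upsilon}_{\pi i} )$, $\boldsymbol{\Omega}_\gamma = Var( \boldsymbol{\gamma}_i ) = Var( \boldsymbol{\eta}_{\gamma i} )$, and $\boldsymbol{\Sigma}_{i\xi} = p\lim T^{-1} \boldsymbol{\Xi}_i' \mathbf{M}_h \boldsymbol{\Xi}_i$ and $\mathbf{Q}_{if} = p\lim T^{-1} \boldsymbol{\Xi}_i' \mathbf{M}_h \mathbf{F}$ are defined by (A.12) and (A.13) of Lemma A.3, respectively. When the rank condition stated in Assumption 6 holds, then $\mathbf{Q}_{if} = \mathbf{0}$, a $(2k_x+1) \times m$ matrix of zeros, and $\boldsymbol{\Sigma}_{MG}$ reduces to $\boldsymbol{\Sigma}_{MG} = \boldsymbol{\Omega}_\pi$.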
2kx +1 m
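The step from (A.83) to (A.84) can be sketched as a variance calculation. The sketch below is illustrative only. It writes $\upsilon_{\pi i}$ and $\eta_{\gamma i}$ for the random deviations of the slope coefficients and factor loadings from their means. It assumes these are mean zero, independent across $i$ and of each other, and that the sample moment matrices in (A.83) have been replaced by their probability limits $\Sigma_{i\xi}$ and $Q_{if}$, treated as fixed.

```latex
\begin{aligned}
\operatorname{Var}\!\left(\frac{1}{\sqrt{N}}\sum_{i=1}^{N}\upsilon_{\pi i}
  +\frac{1}{\sqrt{N}}\sum_{i=1}^{N}\Sigma_{i\xi}^{-1}Q_{if}\,\eta_{\gamma i}\right)
&=\frac{1}{N}\sum_{i=1}^{N}\operatorname{Var}(\upsilon_{\pi i})
  +\frac{1}{N}\sum_{i=1}^{N}\Sigma_{i\xi}^{-1}Q_{if}\operatorname{Var}(\eta_{\gamma i})Q_{if}'\Sigma_{i\xi}^{-1}\\
&=\Omega_{\pi}+\frac{1}{N}\sum_{i=1}^{N}\Sigma_{i\xi}^{-1}Q_{if}\,\Omega_{\gamma}\,Q_{if}'\Sigma_{i\xi}^{-1}.
\end{aligned}
```

The cross terms vanish by the assumed independence, and $\Sigma_{i\xi}^{-1}$ appears on both sides because $\Sigma_{i\xi}$ is symmetric. Letting $N\to\infty$ gives the expression in (A.84). Setting $Q_{if}=0$ removes the second term and leaves $\Omega_{\pi}$.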
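Consider now the non-parametric variance estimator (32) and the same assumptions on the divergence of $(N,T,p_T)$. We have

$$\hat{\pi}_i-\hat{\pi}_{MG}=\left(\hat{\pi}_i-\pi\right)+\left(\pi-\hat{\pi}_{MG}\right),$$

where $\sqrt{N}\left(\pi-\hat{\pi}_{MG}\right)\overset{d}{\to}N\left(\underset{(2k_x+1)\times 1}{0},\Sigma_{MG}\right)$ with $\left\Vert\Sigma_{MG}\right\Vert<K$. It therefore follows that

$$\frac{1}{N-1}\sum_{i=1}^{N}\left(\hat{\pi}_i-\hat{\pi}_{MG}\right)\left(\hat{\pi}_i-\hat{\pi}_{MG}\right)'=\frac{1}{N-1}\sum_{i=1}^{N}\left(\hat{\pi}_i-\pi\right)\left(\hat{\pi}_i-\pi\right)'+O_p\left(N^{-1/2}\right).$$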

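Consider now $\hat{\pi}_i-\pi$. As before, using the definition of $\hat{\pi}_i$ in (26) and substituting $\pi_i=\pi+\upsilon_{\pi i}$ we obtain

$$\hat{\pi}_i-\pi=\upsilon_{\pi i}+\hat{\Psi}_{\xi,iT}^{-1}\frac{\Xi_i'\bar{M}_q\varepsilon_i}{T}+\hat{\Psi}_{\xi,iT}^{-1}\frac{\Xi_i'\bar{M}_qF\gamma_i}{T}.$$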
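Using (A.81)-(A.82), we have

$$\begin{aligned}
\frac{1}{N-1}\sum_{i=1}^{N}\left(\hat{\pi}_i-\pi\right)\left(\hat{\pi}_i-\pi\right)'
&=\frac{1}{N-1}\sum_{i=1}^{N}\upsilon_{\pi i}\upsilon_{\pi i}'\\
&\quad+\frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{\Xi_i'M_h\Xi_i}{T}\right)^{-1}\frac{\Xi_i'M_hF}{T}\eta_{\gamma i}\eta_{\gamma i}'\frac{F'M_h\Xi_i}{T}\left(\frac{\Xi_i'M_h\Xi_i}{T}\right)^{-1}+o_p(1)\\
&=\frac{1}{N-1}\sum_{i=1}^{N}\upsilon_{\pi i}\upsilon_{\pi i}'+\frac{1}{N-1}\sum_{i=1}^{N}\Sigma_{i\xi}^{-1}Q_{if}\eta_{\gamma i}\eta_{\gamma i}'Q_{if}'\Sigma_{i\xi}^{-1}+o_p(1),
\end{aligned}$$

where $\Sigma_{i\xi}=\operatorname{plim}T^{-1}\Xi_i'M_h\Xi_i$ and $Q_{if}=\operatorname{plim}T^{-1}\Xi_i'M_hF$ are defined by (A.12) and (A.13) of Lemma A.3, respectively. Note that $\upsilon_{\pi i}$ and $\eta_{\gamma i}$ are independently distributed across $i$ and therefore $\frac{1}{N-1}\sum_{i=1}^{N}\left(\hat{\pi}_i-\pi\right)\left(\hat{\pi}_i-\pi\right)'-\Sigma_{MG}\overset{p}{\to}0$ and $\hat{\Sigma}_{MG}\overset{p}{\to}\Sigma_{MG}$, as required.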
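The two computations this proof relies on are the unit-level estimator in (26) and the non-parametric variance estimator in (32). They can be illustrated with a minimal NumPy sketch. The function names `cce_unit_estimate` and `mean_group` are hypothetical and not taken from the paper. The matrix `q` to be partialled out is assumed to be supplied by the caller. In the paper's setting it would hold an intercept together with current and lagged cross-sectional averages, whereas the demo simply treats a single factor as observed and uses static regressors.

```python
import numpy as np


def cce_unit_estimate(y, xi, q):
    """Unit-level estimate (Xi' M Xi)^{-1} Xi' M y, with M = I - Q Q^+.

    y  : (T,) dependent variable for one unit
    xi : (T, k) regressors for that unit
    q  : (T, r) matrix to partial out (assumed given; see lead-in)
    """
    # Least-squares residuals on q equal M y and M Xi. lstsq returns the
    # minimum-norm solution, so a rank-deficient q is handled as well.
    y_res = y - q @ np.linalg.lstsq(q, y, rcond=None)[0]
    xi_res = xi - q @ np.linalg.lstsq(q, xi, rcond=None)[0]
    return np.linalg.lstsq(xi_res, y_res, rcond=None)[0]


def mean_group(pi_hat):
    """Mean group estimate and non-parametric variance.

    pi_hat : (N, k) array, row i holding the estimate for unit i.
    Returns (pi_mg, sigma_mg, se): the simple average, the sample
    covariance of the unit estimates with divisor N - 1, and standard
    errors of pi_mg computed as sqrt(diag(sigma_mg) / N).
    """
    pi_hat = np.asarray(pi_hat, dtype=float)
    n_units = pi_hat.shape[0]
    if n_units < 2:
        raise ValueError("need at least two units")
    pi_mg = pi_hat.mean(axis=0)
    dev = pi_hat - pi_mg
    sigma_mg = dev.T @ dev / (n_units - 1)
    se = np.sqrt(np.diag(sigma_mg) / n_units)
    return pi_mg, sigma_mg, se


if __name__ == "__main__":
    # Illustrative simulation only: one common factor, treated as observed.
    rng = np.random.default_rng(0)
    n_units, n_periods = 50, 200
    pi_true = np.array([0.4, 1.0])
    f = rng.standard_normal((n_periods, 1))
    q = np.column_stack([np.ones(n_periods), f])
    estimates = []
    for _ in range(n_units):
        pi_i = pi_true + 0.2 * rng.standard_normal(2)   # heterogeneous slopes
        xi = rng.standard_normal((n_periods, 2)) + f    # regressors load on f
        loading = rng.standard_normal()                 # unit-specific loading
        y = xi @ pi_i + loading * f[:, 0] + rng.standard_normal(n_periods)
        estimates.append(cce_unit_estimate(y, xi, q))
    pi_mg, sigma_mg, se = mean_group(np.array(estimates))
    print("mean group estimate:", pi_mg)
    print("standard errors:", se)
```

Because the variance estimator uses only the dispersion of the unit-level estimates around their average, it needs no estimate of the factor structure, which is why the argument above only has to show that this dispersion converges to $\Sigma_{MG}$.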

References
Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica 77, 1229–1279.
Bai, J. and S. Ng (2007). Determining the number of primitive shocks in factor models. Journal of Business
and Economic Statistics 25, 52–60.
Berk, K. N. (1974). Consistent autoregressive spectral estimates. The Annals of Statistics 2, 489–502.
Bruno, G. S. (2005). Approximating the bias of the LSDV estimator for dynamic unbalanced panel data
models. Economics Letters 87, 361–366.
Bun, M. J. G. (2003). Bias correction in the dynamic panel data model with a nonscalar disturbance
covariance matrix. Econometric Reviews 22, 29–58.
Bun, M. J. G. and M. A. Carree (2005). Bias-corrected estimation in dynamic panel data models. Journal
of Business and Economic Statistics 23, 200–210.
Bun, M. J. G. and M. A. Carree (2006). Bias-corrected estimation in dynamic panel data models with
heteroscedasticity. Economics Letters 92, 220–227.
Bun, M. J. G. and J. Kiviet (2003). On the diminishing returns of higher order terms in asymptotic
expansions of bias. Economics Letters 79, 145–152.
Canova, F. and M. Ciccarelli (2004). Forecasting and turning point predictions in a Bayesian panel VAR
model. Journal of Econometrics 120, 327–359.
Canova, F. and M. Ciccarelli (2009). Estimating multicountry VAR models. International Economic
Review 50, 929–959.
Canova, F. and A. Marcet (1999). The poor stay poor: Non-convergence across countries and regions.
Mimeo, June 1999.
Chudik, A. and M. H. Pesaran (2011). Infinite dimensional VARs and factor models. Journal of
Econometrics 163, 4–22.
Chudik, A. and M. H. Pesaran (2013a). Aggregation in large dynamic panels. forthcoming in Journal of
Econometrics.
Chudik, A. and M. H. Pesaran (2013b). Econometric analysis of high dimensional VARs featuring a
dominant unit. Econometric Reviews 32, 592–649.
Chudik, A., M. H. Pesaran, and E. Tosetti (2011). Weak and strong cross section dependence and
estimation of large panels. Econometrics Journal 14, C45–C90.
Davidson, J. (1994). Stochastic Limit Theory. Oxford University Press.
Dhaene, G. and K. Jochmans (2012). Split-panel jackknife estimation of fixed-effect models. Mimeo, 21
July 2012.
Everaert, G. and T. D. Groote (2012). Common correlated effects estimation of dynamic panels with
cross-sectional dependence. Mimeo, 9 November 2012.
Everaert, G. and L. Ponzi (2007). Bootstrap-based bias correction for dynamic panels. Journal of Economic
Dynamics and Control 31, 1160–1184.
Forni, M., M. Hallin, M. Lippi, and L. Reichlin (2005). The generalized dynamic factor model: One-sided
estimation and forecasting. Journal of the American Statistical Association 100, 830–840.
Garcia-Ferrer, A., R. A. Highfield, F. Palm, and A. Zellner (1987). Macroeconomic forecasting using
pooled international data. Journal of Business and Economic Statistics 5, 53–67.

Giannone, D., L. Reichlin, and L. Sala (2005). Monetary policy in real time. In M. Gertler and K. Rogoff
(Eds.), NBER Macroeconomics Annual 2004, Volume 19, pp. 161–200. MIT Press.
Hahn, J. and G. Kuersteiner (2002). Asymptotically unbiased inference for a dynamic panel model with
fixed effects when both N and T are large. Econometrica 70, 1639–1657.
Hahn, J. and G. Kuersteiner (2011). Bias reduction for dynamic nonlinear panel models with fixed effects.
Econometric Theory 27, 1152–1191.
Hahn, J. and H. Moon (2006). Reducing bias of MLE in a dynamic panel model. Econometric Theory 22,
499–512.
Hahn, J. and W. Newey (2004). Jackknife and analytical bias reduction for nonlinear panel models.
Econometrica 72, 1295–1319.
Hsiao, C., M. H. Pesaran, and A. K. Tahmiscioglu (1999). Bayes estimation of short-run coefficients in
dynamic panel data models. In C. Hsiao, K. Lahiri, L.-F. Lee, and M. H. Pesaran (Eds.), Analysis of
Panels and Limited Dependent Variables: A Volume in Honour of G. S. Maddala, Chapter 11, pp.
268–296. Cambridge University Press.
Hurwicz, L. (1950). Least squares bias in time series. In T. C. Koopmans (Ed.), Statistical Inference in
Dynamic Economic Models, pp. 365–383. New York: Wiley.
Kapetanios, G., M. H. Pesaran, and T. Yamagata (2011). Panels with nonstationary multifactor error
structures. Journal of Econometrics 160, 326–348.
Kiviet, J. F. (1995). On bias, inconsistency, and efficiency of various estimators in dynamic panel data
models. Journal of Econometrics 68, 53–78.
Kiviet, J. F. (1999). Expectations of expansions for estimators in a dynamic panel data model; some
results for weakly-exogenous regressors. In C. Hsiao, K. Lahiri, L.-F. Lee, and M. H. Pesaran (Eds.),
Analysis of Panel Data and Limited Dependent Variables. Cambridge University Press, Cambridge.
Kiviet, J. F. and G. D. A. Phillips (1993). Alternative bias approximations in regressions with a
lagged-dependent variable. Econometric Theory 9, 62–80.
Lee, N., H. R. Moon, and M. Weidner (2011). Analysis of interactive fixed effects dynamic linear panel
regression with measurement error. Cemmap working paper CWP37/11.
Mark, N. C. and D. Sul (2003). Cointegration vector estimation by panel DOLS and long-run money
demand. Oxford Bulletin of Economics and Statistics 65, 655–680.
Moon, H. R. and M. Weidner (2010a). Dynamic linear panel regression models with interactive fixed
effects. Mimeo, July 2010.
Moon, H. R. and M. Weidner (2010b). Linear regression for panel with unknown number of factors as
interactive fixed effects. Mimeo, July 2010.
Newey, W. K. and R. J. Smith (2004). Higher order properties of GMM and generalized empirical likelihood
estimators. Econometrica 72, 219–255.
Pedroni, P. (2000). Fully modified OLS for heterogeneous cointegrated panels. Advances in
Econometrics 15, 93–130.
Pesaran, M. H. (2006). Estimation and inference in large heterogenous panels with multifactor error
structure. Econometrica 74, 967–1012.
Pesaran, M. H., Y. Shin, and R. P. Smith (1999). Pooled mean group estimation of dynamic heterogeneous
panels. Journal of the American Statistical Association 94, 621–634.
Pesaran, M. H., L. V. Smith, and T. Yamagata (2013). A panel unit root test in the presence of a
multifactor error structure. forthcoming in Journal of Econometrics.

Pesaran, M. H. and R. Smith (1995). Estimating long-run relationships from dynamic heterogeneous
panels. Journal of Econometrics 68, 79–113.
Pesaran, M. H. and E. Tosetti (2011). Large panels with common factors and spatial correlations. Journal
of Econometrics 161, 182–202.
Pesaran, M. H. and Z. Zhao (1999). Bias reduction in estimating long-run relationships from dynamic
heterogenous panels. In C. Hsiao, K. Lahiri, L.-F. Lee, and M. H. Pesaran (Eds.), Analysis of Panels
and Limited Dependent Variables: A Volume in Honour of G. S. Maddala, Chapter 12, pp. 297–322.
Cambridge University Press.
Phillips, P. C. B. and D. Sul (2003). Dynamic panel estimation and homogeneity testing under cross
section dependence. Econometrics Journal 6, 217–259.
Phillips, P. C. B. and D. Sul (2007). Bias in dynamic panel estimation with fixed effects, incidental trends
and cross section dependence. Journal of Econometrics 137, 162–188.
Rudin, W. (1987). Real and Complex Analysis. McGraw-Hill.
Said, E. and D. A. Dickey (1984). Testing for unit roots in autoregressive-moving average models of
unknown order. Biometrika 71, 599–607.
So, B. S. and D. W. Shin (1999). Recursive mean adjustment in time series inferences. Statistics &
Probability Letters 43, 65–73.
Song, M. (2013). Asymptotic theory for dynamic heterogeneous panels with cross-sectional dependence
and its applications. Mimeo, January 2013.
Stock, J. H. and M. W. Watson (2002). Macroeconomic forecasting using diffusion indexes. Journal of
Business and Economic Statistics 20, 147–162.
Stock, J. H. and M. W. Watson (2005). Implications of dynamic factor models for VAR analysis. NBER
Working Paper No. 11467.
Zellner, A. and C. Hong (1989). Forecasting international growth rates using Bayesian shrinkage and other
procedures. Journal of Econometrics 40, 183–202.
Zellner, A., C. Hong, and C.-K. Min (1991). Forecasting turning points in international output growth
rates using Bayesian exponentially weighted autoregression, time-varying parameter, and pooling
techniques. Journal of Econometrics 49, 275–304.
Zhang, P. and D. Small (2006). Bayesian inference for random coe¢ cient dynamic panel data models.
Mimeo, 20 February 2006.
