Chapter 2 Panel Data
Lu Liu
Contents
1 Introduction
1.1 Advantages of Panel Data
2 The Models
2.1 The Fixed Effects Model
2.1.1 Within transformation
2.2 Time-fixed Effects Model
2.3 The Random Effects Model
2.4 Fixed Effects or Random Effects?
2.4.1 The Hausman test
1 Introduction
This chapter draws on Chapter 11 of Brooks (2019).
"Panel data" or "longitudinal data" refers to the pooling of observations on
a cross-section of entities such as households, rms and countries over several
time periods. A panel of data embodies information across both time and space.
Importantly, a panel follows the same entities over time. If the data are not on
the same entities measured over time, this would not be panel data but "repeated
cross-section". In this chapter, we will discuss the important features of panel data
and study the techniques used to model such data.
Econometrically, the most basic setup of panel data is

yit = α + βxit + uit ,   i = 1, . . . , N ;  t = 1, . . . , T    (1.1)

where yit is the dependent variable for entity i at time t, α is the intercept, β is a k × 1 vector of parameters to be estimated on the explanatory variables xit, and uit is the error term. The subscript i denotes the cross-sectional dimension and t the time-series dimension.

1.1 Advantages of Panel Data

Compared with pure cross-sectional or pure time-series data, panel data offer several advantages.
1. Panel data give more variability and less collinearity among the variables.
The additional variation introduced by combining cross-sectional data with
time series data can help to mitigate the problems of multicollinearity that
plague time series models. Let's demonstrate with an empirical example.
Baltagi and Levin (1992) model cigarette consumption as a function of
lagged consumption, price and income in the US. In the aggregate time series
for the US, there is high collinearity between price and income. Instead,
Baltagi and Levin (1992) consider cigarette demand across 46 American
states between 1963 and 1988. High collinearity is less likely with a
panel across American states since the cross-section dimension adds more
informative data on price and income, and hence increases variability in the
data.
2. As will be demonstrated later in this chapter, panel data have the ability to
control for unobserved variables that are either time-invariant or cross-sectionally
invariant, whose omission could bias the estimates in a typical cross-section study or
a time-series study. For example, in the cigarette demand study, state-invariant
variables such as national advertising on TV could affect cigarette sales. So could
time-invariant variables such as the distribution of the religious population across
different states. For example, Utah, which has a high percentage of Mormon
population, has per capita cigarette sales of less than one half of the national
average. If these variables are correlated with price and income, omitting them
from the regression will bias the estimated coefficients on price and income. We
will demonstrate that panel data are able to control for such time-invariant or
cross-sectionally invariant variables, while time series or cross-sectional data cannot.
3. Panel data make it possible to model not only why individual units behave
differently (the cross-sectional dimension) but also why a given unit behaves
differently at different time periods (the time dimension). This allows a researcher
to identify certain parameters or answer questions that cannot be addressed using
pure cross-sectional data or pure time-series data. Consider a situation in which the
average consumption level rises by 2% from one year to the next. It might imply
a 2% increase for all individuals, or it might imply an increase of 4% for one half
of the individuals and no change for the other half (or some other combination).
To discriminate between these possibilities, we need to observe changes in
consumption at the individual level, which is exactly what panel data provide.
2 The Models
In this section we discuss two common models for panel data, namely the fixed
effects and the random effects models, and subsequently discuss the choice between
the two.
2.1 The Fixed Effects Model

The entity-fixed effects model decomposes the error term uit in eq(1.1) into an entity-specific, time-invariant component µi and a remainder disturbance vit:

yit = α + βxit + µi + vit    (2.1)

The term µi encapsulates all of the variables that affect yit cross-sectionally but do not vary over time, for example the sector a firm operates in or the country where a firm has its headquarters. In this model, we need to impose the restriction that µi = 0 for one arbitrary entity i to avoid perfect multicollinearity between the individual-fixed effect terms and the intercept term.
Eq(2.1) could be estimated using dummy variables:

yit = α + βxit + µ2 D2i + µ3 D3i + · · · + µN DNi + vit    (2.2)
where D2i is a dummy variable that takes value 1 for all observations on the second entity (e.g. the second firm) in the sample and zero otherwise, D3i is a dummy variable that takes value 1 for all observations on the third entity (e.g. the third firm) in the sample and zero otherwise, and so on. As stated before, because the intercept term α is included, we set the dummy variable for the first entity to zero in order to avoid the "dummy variable trap", in which there is perfect multicollinearity between the dummy variables and the intercept. Setting the dummy for any one arbitrary entity to zero would do the same job. The parameters α, β, µ2, . . . , µN can be estimated by OLS. The implied estimator for β is referred to as the least squares dummy variable (LSDV) estimator.
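As a concrete illustration, the sketch below builds the LSDV regression of eq(2.2) for a small simulated panel and estimates it by OLS. It is a minimal sketch, not code from the chapter; the variable names (firm, y, x), the simulated data-generating process and the seed are all assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N, T = 5, 8  # small simulated panel: 5 firms observed over 8 periods

# Simulate a panel with firm-specific intercepts mu_i (true beta = 0.5)
firms = np.repeat(np.arange(N), T)
mu = rng.normal(scale=2.0, size=N)[firms]        # entity-fixed effects
x = rng.normal(size=N * T)
y = 1.0 + 0.5 * x + mu + rng.normal(scale=0.3, size=N * T)
df = pd.DataFrame({"firm": firms, "y": y, "x": x})

# LSDV: intercept, x, and N-1 firm dummies (drop the first firm
# to avoid the dummy variable trap), as in eq (2.2)
dummies = pd.get_dummies(df["firm"], prefix="D", drop_first=True, dtype=float)
X = np.column_stack([np.ones(len(df)), df["x"], dummies])
coef, *_ = np.linalg.lstsq(X, df["y"], rcond=None)
print("alpha:", coef[0], "beta:", coef[1])
```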
An alternative setting for the econometric presentation of the fixed effects model is

yit = βxit + µ1 D1i + µ2 D2i + · · · + µN DNi + vit    (2.3)

where the intercept term α is suppressed but dummy variables for all the entities are included. Notice that equations (2.2) and (2.3) are equivalent. Both have N + k parameters to estimate (k parameters in β plus N dummy coefficients in (2.3); k parameters in β plus N − 1 dummy coefficients and α in (2.2)). The β parameters are identical. However, the interpretation of the µi differs: α in eq(2.2) equals the parameter on the first entity's dummy variable, µ1, in eq(2.3); α + µi in eq(2.2) equals µi in eq(2.3) for i > 1.
2.1.1 Within transformation
It is challenging to estimate N + k parameters when N is large. Often there are observations on hundreds of firms, or tens of thousands of households solicited from surveys. Fortunately, we can estimate β in a simpler way: it can be shown that we obtain exactly the same estimator for β if we transform the data. This transformation, known as the within transformation, involves subtracting the time-mean of each entity from the values of the variables. We then run the regression on the deviations from the entity means. Essentially, this eliminates the individual-fixed effects µi.
Let's define ȳi = (1/T) ∑_{t=1}^{T} yit as the time-mean of the observations on y for entity i, and x̄i = (1/T) ∑_{t=1}^{T} xit as the time-mean of the explanatory variables. Eq(2.1) averaged over time gives

ȳi = α + βx̄i + µi + v̄i    (2.4)

Subtracting eq(2.4) from eq(2.1) yields

yit − ȳi = β(xit − x̄i) + (uit − ūi)    (2.5)

This new regression contains demeaned variables only. Note that uit − ūi = (µi + vit) − (µi + v̄i) = vit − v̄i, because µi is time-invariant, so µi = µ̄i. In addition, α is a constant, so it is also dropped by the transformation.

We could write eq(2.5) as

ÿit = βẍit + v̈it    (2.6)
where the double dots above the variables denote the demeaned values. The implied estimator for β after the within transformation is referred to as the fixed-effects estimator or within estimator. Regression (2.6) can now be routinely estimated using OLS on the pooled sample of demeaned data. µ̂i and α̂ can be recovered from eq(2.4): as µ1 = 0, α̂ = ȳ1 − β̂x̄1, and µ̂i = ȳi − α̂ − β̂x̄i for i > 1.
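The following sketch applies the within transformation to a simulated panel and verifies that OLS on the demeaned data reproduces the LSDV slope estimate. It is a minimal sketch; the names (firm, y, x) and the simulated data are assumptions, not material from the chapter.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N, T = 5, 8
firms = np.repeat(np.arange(N), T)
mu = rng.normal(scale=2.0, size=N)[firms]        # entity-fixed effects
x = rng.normal(size=N * T)
y = 1.0 + 0.5 * x + mu + rng.normal(scale=0.3, size=N * T)
df = pd.DataFrame({"firm": firms, "y": y, "x": x})

# Within transformation: subtract each firm's time-mean, as in eq (2.5)/(2.6)
dem = df[["y", "x"]] - df.groupby("firm")[["y", "x"]].transform("mean")
beta_within = (dem["x"] @ dem["y"]) / (dem["x"] @ dem["x"])

# LSDV estimate for comparison: intercept, x, and N-1 firm dummies, eq (2.2)
D = pd.get_dummies(df["firm"], drop_first=True, dtype=float)
X = np.column_stack([np.ones(N * T), df["x"], D])
beta_lsdv = np.linalg.lstsq(X, df["y"], rcond=None)[0][1]

print(beta_within, beta_lsdv)   # identical up to floating-point error
```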
1. It is important to recognize that although estimating the regression (2.6) uses only k degrees of freedom from the N T observations, we also used a further N degrees of freedom in constructing the demeaned variables: we lost one degree of freedom for every one of the N individuals for which we estimated a mean. Hence, the number of degrees of freedom that must be used when estimating standard errors in an unbiased way and when conducting hypothesis tests is N T − N − k.
2. The LSDV estimator obtained directly from estimating the regression with dummies and the within estimator obtained from the within transformation have identical estimated values and standard errors (provided the degrees-of-freedom correction above is applied to the within regression).
3. Testing for fixed effects. To test whether fixed effects are necessary, we can test the joint significance of the µi in eq(2.2), i.e. H0 : µ2 = µ3 = · · · = µN = 0, by performing an F-test. This is a simple Chow test with the restricted residual sum of squares (RRSS) being that of OLS on the pooled model and the unrestricted residual sum of squares (URSS) being that of the LSDV regression. If N is large, one can perform the within transformation and use that residual sum of squares as the URSS.
F0 = [(RRSS − URSS)/(N − 1)] / [URSS/(N T − N − k)] ∼ F(N − 1, N T − N − k)    (2.7)

where RRSS = ∑ ûit², with ûit the residual from the pooled regression (1.1), and URSS = ∑ v̂it², the residual sum of squares from the LSDV (or, equivalently, the within) regression.
If the F-test does not reject the null hypothesis of no fixed effects, the fixed-effects (within) estimator and the LSDV estimator are consistent but not efficient, while the pooled OLS estimator in eq(1.1) is consistent and efficient. If, however, the model with fixed effects is the true model, which implies that the F-test rejects the null, the pooled OLS estimator is biased and inconsistent.
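A sketch of how the F-test in eq(2.7) could be computed from the two residual sums of squares follows. The simulated panel, the dimensions and the variable names are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1)
N, T, k = 30, 10, 1                      # k slope parameters
firms = np.repeat(np.arange(N), T)
mu = rng.normal(scale=1.5, size=N)[firms]
x = rng.normal(size=N * T)
y = 0.5 * x + mu + rng.normal(size=N * T)
df = pd.DataFrame({"firm": firms, "y": y, "x": x})

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid

# Restricted model: pooled OLS (intercept and x only), eq (1.1)
X_pooled = np.column_stack([np.ones(N * T), df["x"]])
RRSS = rss(X_pooled, df["y"].to_numpy())

# Unrestricted model: LSDV with N-1 firm dummies, eq (2.2)
D = pd.get_dummies(df["firm"], drop_first=True, dtype=float)
X_lsdv = np.column_stack([X_pooled, D])
URSS = rss(X_lsdv, df["y"].to_numpy())

# F statistic of eq (2.7): N-1 restrictions, NT - N - k residual dof
F0 = ((RRSS - URSS) / (N - 1)) / (URSS / (N * T - N - k))
p_value = stats.f.sf(F0, N - 1, N * T - N - k)
print(F0, p_value)   # a small p-value rejects H0 of no fixed effects
```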
For consistency of the fixed-effects estimator and the LSDV estimator, it is required that E[(xit − x̄i)(vit − v̄i)] = 0. Because the transformation involves the time-averages, this requires that x is strictly exogenous: E[xit vis] = 0 for all s = 1, . . . , T.
The disadvantage of the within transformation: the fixed-effects (within) estimator cannot estimate the effect of any time-invariant variable, such as gender, religion or the sector the firm operates in. These time-invariant variables are wiped out by demeaning the variables. Equivalently, in the LSDV formulation, time-invariant variables are spanned by the individual dummies.
For consistency of the between estimator (OLS in the regression of the entity means ȳi on x̄i, which uses only the cross-sectional variation), it is required that E[xit µi] = 0 in addition to E[xit vis] = 0. This additional assumption, that the explanatory variables are uncorrelated with the individual-specific effects, may be unreasonable.
2.2 Time-fixed Effects Model

It is also possible that the relevant omitted variables vary over time but are constant across entities. In that case a time-fixed effects model is appropriate:

yit = α + βxit + λt + vit    (2.8)

where λt captures all of the variables that vary over time but are constant cross-sectionally. In the finance literature, for example, the business cycle affects many variables, such as bank credit supply and household saving. A change in the business cycle may influence the credit supply of all banks in the same way. Another example is the regulatory environment or a tax rate change part-way through a sample period.
As in the entity-fixed effects model, to avoid multicollinearity, we set the fixed effect for one time period to zero. A least squares dummy variable (LSDV) model can then be estimated:

yit = α + βxit + λ2 T2t + λ3 T3t + · · · + λT TTt + vit    (2.9)

where Tjt denotes a dummy variable that takes value 1 for time period j and zero elsewhere.
elsewhere. Similarly as in the entity-xed eects model, we can directly estimate
the parameters in eq(2.8) and obtain LSDV estimator for β . Alternatively, we can
avoid estimating the model containing all the dummies by conducting a within
transformation, which subtracts the cross-sectional averages from each observation
1 PN
where ȳt = yit as the mean of the observations on y across entities for
N i=1
each time period.
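A minimal sketch of the cross-sectional demeaning in eq(2.10) is shown below, again on simulated data with assumed variable names (year, y, x).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
N, T = 40, 6
years = np.tile(np.arange(T), N)
lam = rng.normal(scale=1.0, size=T)[years]       # time-fixed effects
x = rng.normal(size=N * T)
y = 0.5 * x + lam + rng.normal(size=N * T)
df = pd.DataFrame({"year": years, "y": y, "x": x})

# Subtract the cross-sectional (per-period) mean from each observation, eq (2.10)
dem = df[["y", "x"]] - df.groupby("year")[["y", "x"]].transform("mean")
beta_timefe = (dem["x"] @ dem["y"]) / (dem["x"] @ dem["x"])
print("time-fixed effects (within) estimate of beta:", beta_timefe)
```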
We can test whether time-fixed effects are necessary by testing the joint significance of the λt in eq(2.9), i.e. H0 : λ2 = λ3 = · · · = λT = 0, with an F-test. Now the unrestricted model is the time-fixed effects one; therefore, the URSS and the degrees of freedom of the unrestricted model are those of the time-fixed effects regression, while the RRSS is again that of the pooled regression (1.1):

F0 = [(RRSS − URSS)/(T − 1)] / [URSS/(N T − T − k)] ∼ F(T − 1, N T − T − k)    (2.11)
Finally, it is possible to allow for both entity-fixed effects and time-fixed effects within the same model. Such a model is termed a two-way fixed effects model or two-way error component model, which combines equations (2.1) and (2.8):

yit = α + βxit + µi + λt + vit    (2.12)

The LSDV equivalent model contains both cross-sectional and time dummies:

yit = α + βxit + µ2 D2i + · · · + µN DNi + λ2 T2t + · · · + λT TTt + vit    (2.13)

The within transformation now proceeds in two steps. In step 1, we average eq(2.12) over time for each entity,

ȳi = α + βx̄i + µi + λ̄ + v̄i    (2.14)

and subtract eq(2.14) from eq(2.12), which removes µi:

ÿit = βẍit + λ̈t + v̈it    (2.15)

where the double dots above the variables denote the values after subtracting the time mean (so that, for example, λ̈t = λt − λ̄). Then, in step 2, we can subtract the cross-sectional mean of the variables in eq(2.15) to remove λ̈t. The same estimates will be obtained if step 2 is performed before step 1.
The two-way within transformation is more complicated to implement than the one-way within transformation. Fortunately, many statistical software packages compute the two-way within estimator for us. Alternatively, we can implement the within transformation only for the large dimension of the data and then estimate the regression on the demeaned variables with dummy variables for the small dimension. For example, microdata panels (panels with large N and small T) solicited from household surveys normally contain observations on many households but over only a few periods. In this case, we can first transform the variables by subtracting the time-mean and then run a regression of the demeaned variables that includes dummy variables for the time periods, as in the sketch below. The opposite applies to macrodata panels (panels with small N and large T), such as panels of observations on countries over many time periods.
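The following is a minimal sketch of that strategy for a large-N, small-T panel: entity means are subtracted first, and the small time dimension is handled with dummies. The simulated data and the names (household, year, y, x) are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
N, T = 1000, 4                     # many households, few periods
hh = np.repeat(np.arange(N), T)
yr = np.tile(np.arange(T), N)
mu = rng.normal(size=N)[hh]        # household-fixed effects
lam = rng.normal(size=T)[yr]       # time-fixed effects
x = rng.normal(size=N * T)
y = 0.5 * x + mu + lam + rng.normal(size=N * T)
df = pd.DataFrame({"household": hh, "year": yr, "y": y, "x": x})

# Step 1: within transformation over the large dimension (households)
dem = df[["y", "x"]] - df.groupby("household")[["y", "x"]].transform("mean")

# Step 2: add dummies for the small dimension (years) and run OLS
Tdum = pd.get_dummies(df["year"], drop_first=True, dtype=float)
X = np.column_stack([np.ones(N * T), dem["x"], Tdum])
coef = np.linalg.lstsq(X, dem["y"], rcond=None)[0]
print("two-way fixed effects estimate of beta:", coef[1])
```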
We can test the two-way fixed effects model against the entity-fixed effects model, the time-fixed effects model, and the pooled OLS model using an F-test, in the same manner as we tested the one-way fixed effects models. The restricted residual sum of squares and the number of degrees of freedom change with the null hypothesis.
2.3 The Random Effects Model

Under the random effects model, the entity-specific effect µi is treated as a random variable that is assumed to be uncorrelated with the explanatory variables, and the composite error term is uit = µi + vit. The model can be estimated efficiently by generalized least squares (GLS). The transformation involved in this GLS procedure is to subtract a weighted mean of the variables. Define the 'quasi-demeaned' data as yit* = yit − θȳi and xit* = xit − θx̄i, where ȳi and x̄i are the time-means of the observations on yit and xit, respectively. The weight θ is a function of the variance of the observation error term, σv², and of the variance of the entity-specific error term, σµ²:

θ = 1 − σv / √(T σµ² + σv²)    (2.17)
With θ defined in this way, the transformation ensures that the error terms in the transformed model are serially uncorrelated. We can then easily compute the random-effects estimator by estimating the transformed model using OLS:

yit* = (1 − θ)α + βxit* + uit*,   where uit* = uit − θūi
2.4 Fixed Effects or Random Effects?

The random effects model is more appropriate when the entities in the sample can be thought of as having been randomly selected from the population. One way to formalize this is to note that the random effects model specifies

E{yit | xit} = α + βxit,

whereas the fixed effects model estimates

E{yit | xit, µi} = α + βxit + µi.

The β coefficients in these two conditional expectations are the same only if E{µi xit} = 0. Therefore, one may prefer the fixed effects estimator if some interest lies in the µi themselves, which makes sense if the number of units is relatively small and of a specific nature.
The random effects approach is valid only when the error term µi is uncorrelated with all explanatory variables. Random-effects estimators will be biased and inconsistent if µi is correlated with some of the explanatory variables. To see how this arises, suppose that we have only one explanatory variable, x2it, that varies positively with yit and also with the error term, µi. The estimator will ascribe all of any increase in y to x when in reality some of it arises from the error term, resulting in biased coefficients. In contrast, the fixed effects estimator is consistent regardless of the relationship between the explanatory variables and µi. Therefore, even if we are not interested in particular individuals and the sample is randomly selected from the population, the fixed effects estimator may be preferred.
The fixed-effects estimator exploits only the within dimension of the data (differences within individuals). The between dimension of the data (differences between individuals) is lost in the within transformation, because the transformation produces observations in deviation from individual averages and thus removes the cross-sectional variation. In contrast, random-effects estimators use more of the variation in the data (specifically, they also use the cross-sectional/between variation). So, if the assumptions of the random effects model are valid (i.e. the random-effects estimators are consistent), the random-effects estimators will be more efficient (have smaller standard errors) than fixed-effects estimators.
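To make the within/between distinction concrete, the sketch below decomposes the total variation of a simulated panel variable into its within and between components; the data and names are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
N, T = 50, 6
firms = np.repeat(np.arange(N), T)
x = rng.normal(size=N)[firms] + 0.5 * rng.normal(size=N * T)
df = pd.DataFrame({"firm": firms, "x": x})

xbar_i = df.groupby("firm")["x"].transform("mean")   # entity time-means
xbar = df["x"].mean()                                 # overall mean

within_ss = ((df["x"] - xbar_i) ** 2).sum()   # variation used by fixed effects
between_ss = ((xbar_i - xbar) ** 2).sum()     # variation discarded by demeaning
total_ss = ((df["x"] - xbar) ** 2).sum()

print(within_ss + between_ss, total_ss)       # the two components add up
```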
An estimator b of a parameter β is unbiased if its sampling distribution has an average equal to the true value β of the parameter. This is formally formulated as

E{b} = β.    (3.1)
Rit − Rft = λ0 + λm β̂i + ui    (4.1)
where the dependent variable is the excess return of entity i at each time t and the independent variable is the estimated beta for i.
If the CAPM holds, λ0 should not be significantly different from zero and λm should approximate the (time-average) equity market risk premium, Rm − Rf. Fama and MacBeth (1973) proposed estimating this second-stage regression separately for each time period and then taking the average of the parameter estimates to conduct hypothesis tests. However, regression (4.1) combines a cross-sectional and a time-series dimension, so we can also achieve a similar objective using a panel approach. For this example, we will use a sample comprising the annual returns and estimated betas for eleven years on 2500 UK firms provided by Brooks (2019): https://www.cambridge.org/as/academic/subjects/economics/finance/introductory-econometrics-finance-4th-edition?format=PB → 'Resources' → 'General Resources' → 'Excel files' → 'panelex.xls'.
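As a rough illustration of the panel approach to (4.1), the sketch below reads the spreadsheet and runs a pooled regression of excess returns on estimated betas. The column names (beta, excess_ret) are hypothetical placeholders and would need to be replaced by the actual headers in panelex.xls.

```python
import numpy as np
import pandas as pd

# Hypothetical column names; check panelex.xls for the actual headers
df = pd.read_excel("panelex.xls")          # long format: one row per firm-year
df = df.rename(columns=str.lower)

# Pooled OLS of excess returns on estimated betas, eq (4.1)
X = np.column_stack([np.ones(len(df)), df["beta"]])
lam = np.linalg.lstsq(X, df["excess_ret"], rcond=None)[0]
print("lambda_0:", lam[0], "lambda_m:", lam[1])
```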
References

Baltagi, B. H. and Levin, D. (1992). Cigarette taxation: raising revenues and reducing consumption. Structural Change and Economic Dynamics, 3(2):321–335.

Brooks, C. (2019). Introductory Econometrics for Finance, 4th edition. Cambridge University Press, Cambridge.

Fama, E. F. and MacBeth, J. D. (1973). Risk, return, and equilibrium: Empirical tests. Journal of Political Economy, 81(3):607–636.

Hausman, J. A. (1978). Specification tests in econometrics. Econometrica, 46(6):1251–1271.