Part2 - FEM and REM

The document discusses fixed effect and random effect models in panel data analysis, detailing the fixed effects transformation and its implications for estimating parameters. It explains how the fixed effects estimator allows for correlation between unobserved effects and explanatory variables while eliminating time-invariant variables. Additionally, it introduces the Hausman test to determine whether to use fixed or random effects based on the correlation between unique errors and regressors.

Panel data analysis

Fixed effect transformation

Consider a model with a single explanatory variable

yit = β1 xit + ai + uit , t = 1, 2, . . . , T (1)

Now for each i, average this equation over time. We get

ȳi = β1 x̄i + ai + ūi (2)

where ȳi = (1/T) Σ_{t=1}^{T} yit , and similarly for x̄i and ūi .
Because ai is fixed over time, it appears in both (1) and (2). If we
subtract (2) from (1) for each t, we wind up with
Fixed effect transformation

yit − ȳi = β1 (xit − x̄i ) + uit − ūi , t = 1, . . . , T (3)


or
ÿit = β1 ẍit + üit , t = 1, . . . , T (4)
where ÿit = yit − ȳi is the time-demeaned data on y, and similarly
for ẍit and üit
▶ The fixed effects transformation is also called the within
transformation
▶ The important thing about equation (4) is that the
unobserved effect, ai , has disappeared.
▶ This suggests we should estimate (4) by pooled OLS.
▶ A pooled OLS estimator that is based on the time-demeaned
variables is called the fixed effects estimator or the within
estimator .
▶ The latter name comes from the fact that OLS on (4) uses the
time variation in y and x within each cross-sectional
observation (a minimal sketch of this estimator follows below).
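The following is a minimal sketch of the within estimator in Python. The DataFrame layout and column names ("id", "y", "x") are illustrative assumptions, not taken from the slides; note also that the standard errors from this manual two-step approach lack the degrees-of-freedom correction that dedicated fixed effects routines apply.

```python
# Within (fixed effects) transformation + pooled OLS on the demeaned data.
import pandas as pd
import statsmodels.api as sm

def within_estimator(df, y_col="y", x_cols=("x",), id_col="id"):
    # Time-demean y and x within each unit i: z_it - zbar_i, as in equation (4).
    cols = [y_col, *x_cols]
    demeaned = df[cols] - df.groupby(id_col)[cols].transform("mean")
    # Pooled OLS on the demeaned variables; no intercept is needed because each
    # demeaned variable has mean zero within every unit.
    return sm.OLS(demeaned[y_col], demeaned[list(x_cols)]).fit()

# Hypothetical usage:
# fe_res = within_estimator(df, y_col="lwage", x_cols=("union", "married"))
# print(fe_res.params)
```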
Fixed effect transformation

▶ The between estimator is obtained as the OLS estimator on
the cross-sectional equation (2) (see the sketch after this list)
▶ We include an intercept, β0
▶ We use the time averages for both y and x.
▶ Then, we run a cross-sectional regression.
▶ We will not study the between estimator in detail because it is
biased when ai is correlated with x̄i .
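A minimal sketch of the between estimator under the same hypothetical DataFrame layout as above: average every variable over time for each unit, then run a single cross-sectional OLS with an intercept.

```python
# Between estimator: cross-sectional OLS on the time averages ybar_i and xbar_i.
import pandas as pd
import statsmodels.api as sm

def between_estimator(df, y_col="y", x_cols=("x",), id_col="id"):
    means = df.groupby(id_col)[[y_col, *x_cols]].mean()   # ybar_i and xbar_i
    X = sm.add_constant(means[list(x_cols)])               # include the intercept beta_0
    return sm.OLS(means[y_col], X).fit()
```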
Fixed effect estimator

▶ Under a strict exogeneity assumption on the explanatory
variables, the fixed effects estimator is unbiased
▶ Roughly, the idiosyncratic error uit should be uncorrelated
with each explanatory variable across all time periods.
▶ The fixed effects estimator allows for arbitrary correlation
between ai and the explanatory variables in any time period.
▶ Because of this, any explanatory variable that is constant over
time for all i gets swept away by the fixed effects
transformation.
▶ Therefore we cannot include variables such as gender or a
city’s distance from a river.
Example 14.2
▶ Each of the 545 men in the sample worked in every year from
1980 through 1987.
▶ Variables in the data set that change over time: experience,
marital status, and union status.
▶ Variables that do not change over time: race and education.
▶ If we use fixed effects (or first differencing), we cannot include
race, education, or experience in the equation (experience cannot be
included either because, after demeaning, it is perfectly collinear
with the full set of year dummies).
▶ However, we can include interactions of educ with year
dummies for 1981 through 1987 to test whether the return to
education was constant over this time period.
▶ We use
▶ log(wage) as the dependent variable,
▶ dummy variables for marital and union status,
▶ a full set of year dummies, and the interaction terms
d81 ∗ educ, d82 ∗ educ, . . . , d87 ∗ educ.
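For concreteness, one way to construct the d81·educ, ..., d87·educ interaction terms in pandas is sketched below; the DataFrame and its column names ("year", "educ") are hypothetical placeholders, not the original data layout.

```python
# Build year-dummy x education interactions for a base year of 1980.
import pandas as pd

def add_educ_year_interactions(df, first=1981, last=1987):
    # df is assumed to have an integer "year" column and a time-constant "educ" column.
    out = df.copy()
    for yr in range(first, last + 1):
        out[f"d{yr}educ"] = (out["year"] == yr).astype(float) * out["educ"]
    # The base year (1980) is omitted, so each d{yr}educ coefficient measures how the
    # return to education in year yr differs from the 1980 return.
    return out
```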
Example 14.2

▶ The estimates on these interaction terms are all positive, and
they generally get larger for more recent years.
▶ The largest coefficient of .030 is on d87educ, with t = 2.48.
▶ In other words, the return to education is estimated to be
about 3 percentage points larger in 1987 than in the base
year, 1980.
▶ Given significance level α = 0.05, the other significant interaction
term is d86educ (coefficient = .027, t = 2.23).
▶ The estimates on the earlier years are smaller and insignificant
at the 5% level against a two-sided alternative.
The dummy variable regression
▶ A traditional view of the fixed effects approach is to assume
that the unobserved effect, ai , is a parameter to be estimated
for each i.
▶ Thus, in equation 1, ai is the intercept for person i (or firm i,
city i, and so on) that is to be estimated along with the βj .
▶ Clearly, we cannot do this with a single cross section: there
would be N + k parameters to estimate with only N
observations. We need at least two time periods.
▶ The way we estimate an intercept for each i is to put in a
dummy variable for each cross-sectional observation, along
with the explanatory variables (and probably dummy variables
for each time period)
▶ This method is usually called the dummy variable regression.
▶ After fixed effects estimation with N of any size, the âi are
easy to compute (see the sketch after equation (5)):

âi = ȳi − β̂1 x̄i1 − . . . − β̂k x̄ik , i = 1, . . . , N (5)
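The sketch below recovers the âi of equation (5) from the within-regression slope estimates; the DataFrame layout and column names are the same illustrative assumptions used in the earlier sketches.

```python
# Recover the estimated fixed effects a_i-hat = ybar_i - beta1_hat*xbar_i1 - ... - betak_hat*xbar_ik.
import pandas as pd

def fixed_effects_intercepts(df, beta_hat, y_col="y", x_cols=("x",), id_col="id"):
    # beta_hat: dict or Series mapping each x column name to its fixed effects estimate.
    means = df.groupby(id_col)[[y_col, *x_cols]].mean()
    fitted = means[list(x_cols)].mul(pd.Series(beta_hat)).sum(axis=1)
    return means[y_col] - fitted          # one a_i-hat per cross-sectional unit
```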


Fixed effects or first differencing?

▶ When T = 2, the FE and FD estimates, as well as all test
statistics, are identical, and so it does not matter which we use.
▶ When T > 2, the FE and FD estimators are not the same.
▶ When the uit are serially uncorrelated, fixed effects is more
efficient than first differencing (and the standard errors
reported from fixed effects are valid).
▶ When the uit are serially correlated, the comparison can go the
other way. For example, if uit follows a random walk (which means
there is very substantial, positive serial correlation), then the
differenced errors ∆uit are serially uncorrelated, and first
differencing is better (see the sketch below).
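For comparison with the within estimator sketched earlier, here is a minimal first-differencing (FD) sketch under the same hypothetical DataFrame layout; with T = 2 it reproduces the FE slope exactly.

```python
# First-difference estimator: OLS on the within-unit first differences.
import pandas as pd
import statsmodels.api as sm

def fd_estimator(df, y_col="y", x_cols=("x",), id_col="id", t_col="year"):
    cols = [y_col, *x_cols]
    # Difference y and x within each unit after sorting by time; drop each unit's first period.
    d = df.sort_values([id_col, t_col]).groupby(id_col)[cols].diff().dropna()
    return sm.OLS(d[y_col], d[list(x_cols)]).fit()
```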
Practice: Exercises at the end of Chapter 14.
Random effect models

We begin with the same unobserved effects model as before,

yit = β0 + β1 xit1 + . . . + βk xitk + ai + uit (6)

where we explicitly include an intercept so that we can make the
assumption that the unobserved effect, ai , has zero mean.

Equation (6) becomes a random effects model when we assume that the
unobserved effect ai is uncorrelated with each explanatory variable:

Cov(xitj , ai ) = 0, t = 1, 2, . . . , T ; j = 1, 2, . . . , k (7)
Random effect models

▶ If we believe that ai is uncorrelated with the explanatory
variables, the βj can be consistently estimated by using a
single cross section: there is no need for panel data at all.
▶ But using a single cross-section disregards much useful
information in the other time periods.
▶ Define the composite error term as

vit = ai + uit (8)

▶ Equation 6 becomes

yit = β0 + β1 xit1 + . . . + βk xitk + vit (9)


Random effect models

▶ Because ai is in the composite error in each time period, the
vit are serially correlated across time.
▶ Let

σa² = Var(ai )
σu² = Var(uit )

▶ Under the random effects assumptions,

Corr(vit , vis ) = σa² / (σa² + σu²), t ≠ s (10)
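A short check of equation (10), using only the definitions above, the uncorrelatedness of ai and uit, and the serial uncorrelatedness of uit:

$$
\operatorname{Cov}(v_{it}, v_{is}) = \operatorname{Cov}(a_i + u_{it},\; a_i + u_{is}) = \sigma_a^2,
\qquad
\operatorname{Var}(v_{it}) = \sigma_a^2 + \sigma_u^2,
$$
$$
\Rightarrow\quad
\operatorname{Corr}(v_{it}, v_{is}) = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2},
\qquad t \neq s .
$$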
Random effect models
Define
λ = 1 − [ σu² / (σu² + T σa²) ]^(1/2)
then
0 ≤ λ ≤ 1 (11)
Then the transformed equation turns out to be

yit − λȳi = β0 (1 − λ) + β1 (xit1 − λx̄i1 ) + . . . + βk (xitk − λx̄ik ) + (vit − λv̄i ) (12)
where the overbar again denotes the time averages.
▶ The random effects transformation subtracts a fraction of
time average ȳi , where the fraction depends on σu2 , σa2 , and
the number of time periods, T .
▶ Then, we can use the generalized least square (GLS) to
estimate the parameters.
▶ It can be shown that the errors vit − λv̄i in the transformed equation (12) are serially uncorrelated (a quasi-demeaning sketch follows below).
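A minimal sketch of the random effects (quasi-demeaning) transformation in equations (12)-(13). The variance components σa² and σu² are taken as given inputs here rather than estimated, the panel is assumed balanced, and the column names are the same hypothetical ones used earlier.

```python
# Quasi-demean y and x by the fraction lambda, then run pooled OLS (feasible GLS style).
import numpy as np
import pandas as pd
import statsmodels.api as sm

def re_quasi_demeaned_ols(df, sigma_a2, sigma_u2, y_col="y", x_cols=("x",), id_col="id"):
    T = df.groupby(id_col).size().iloc[0]                       # balanced panel assumed
    lam = 1.0 - np.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))   # equation (13)
    cols = [y_col, *x_cols]
    transformed = df[cols] - lam * df.groupby(id_col)[cols].transform("mean")
    X = sm.add_constant(transformed[list(x_cols)])
    # With a plain constant of 1, the fitted intercept estimates beta_0*(1 - lam);
    # divide it by (1 - lam) to recover beta_0. The slopes are the RE/GLS slopes.
    return sm.OLS(transformed[y_col], X).fit(), lam
```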
Random effect models

▶ Because REM subtracts only a fraction of the time averages,
explanatory variables that are constant over time are not
eliminated and can remain in the equation.
▶ The deeper reason is that REM assumes the unobserved effect ai
is uncorrelated with every explanatory variable, whether or not
that variable is constant over time.
Random effect models

▶ λ is a constant to be estimated:

λ̂ = 1 − [ 1 / (1 + T · (σ̂a² / σ̂u²)) ]^(1/2) (13)
▶ Pooled OLS is obtained when λ = 0


▶ When λ̂ is close to 0, σ̂a² / σ̂u² ≈ 0
▶ This means σa² is small relative to σu².


▶ So, the unobserved effect ai is relatively unimportant
Random effect models

▶ FEM is obtained when λ = 1


▶ When λ̂ is close to 1, σ̂a² / σ̂u² → ∞
▶ This means σa² is large relative to σu².
▶ So, the unobserved effect ai is relatively important
▶ The composite error in the transformed equation can be decomposed as

vit − λv̄i = (1 − λ)ai + (uit − λūi ) (14)

▶ So the errors in the transformed equation used in random effects
estimation weight the unobserved effect by (1 − λ); a one-line
derivation is given below.
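The decomposition in (14) follows in one line from the definition v̄i = ai + ūi:

$$
v_{it} - \lambda \bar v_i
= (a_i + u_{it}) - \lambda (a_i + \bar u_i)
= (1 - \lambda)\, a_i + (u_{it} - \lambda \bar u_i).
$$

Setting λ = 1 removes ai entirely (the fixed effects case), while λ = 0 leaves it with full weight (pooled OLS).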
Example 14.4

▶ data: WAGEPAN
▶ Three methods
▶ pooled OLS
▶ random effects
▶ fixed effects
▶ In the first two methods, we can include educ and the race
dummies (black and hispan), but these variables drop out of
the fixed effects analysis (a side-by-side sketch follows below).
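One possible way to run the three estimators side by side is sketched below. It relies on the third-party `linearmodels` package (assumed installed), and the data frame and column names are placeholders, so treat it as an illustrative workflow rather than the slides' own code.

```python
# Pooled OLS, random effects, and fixed effects on a long-format panel.
import statsmodels.api as sm
from linearmodels.panel import PanelOLS, PooledOLS, RandomEffects

def three_methods(wagepan, y_col, x_cols, fe_cols, id_col="nr", t_col="year"):
    # x_cols: regressors for pooled OLS / RE (may include educ, black, hispan);
    # fe_cols: time-varying regressors only, for the fixed effects model.
    df = wagepan.set_index([id_col, t_col])   # linearmodels expects an (entity, time) MultiIndex
    y = df[y_col]
    X = sm.add_constant(df[list(x_cols)])
    pooled = PooledOLS(y, X).fit()
    re = RandomEffects(y, X).fit()
    # Time-constant variables are excluded here: the within transformation would
    # turn them into columns of zeros (perfect collinearity).
    fe = PanelOLS(y, df[list(fe_cols)], entity_effects=True).fit()
    return pooled, re, fe
```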
Fixed effect or random effect ?

Hausman test
▶ Null hypothesis: Random effect model
▶ Alternative hypothesis: Fixed effect model.
▶ If the p-value of the Hausman test is less than the significance
level, then reject the null hypothesis and use the Fixed effect
model
▶ Essentially, the test looks to see whether there is a correlation
between the unique errors and the regressors in the model.
The null hypothesis is that there is no correlation between the
two (a hand-computed version of the statistic is sketched below).
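A minimal sketch of the Hausman statistic computed by hand, for example from the `fe` and `re` result objects of the previous sketch (using their `.params` and `.cov` attributes). It uses the classic contrast form H = (β̂_FE − β̂_RE)′ [Var(β̂_FE) − Var(β̂_RE)]⁻¹ (β̂_FE − β̂_RE), compared with a chi-squared distribution whose degrees of freedom equal the number of coefficients common to both models.

```python
# Hausman test: RE (null) versus FE (alternative), computed from the two fitted models.
import numpy as np
from scipy import stats

def hausman(fe_res, re_res):
    # Compare only the slope coefficients estimated by both models (drop the constant).
    common = [c for c in fe_res.params.index if c in re_res.params.index and c != "const"]
    q = np.asarray(fe_res.params[common]) - np.asarray(re_res.params[common])
    V = np.asarray(fe_res.cov.loc[common, common]) - np.asarray(re_res.cov.loc[common, common])
    H = float(q @ np.linalg.inv(V) @ q)   # a pseudo-inverse may be needed if V is not positive definite
    p_value = stats.chi2.sf(H, df=len(common))
    return H, p_value

# A p-value below the chosen significance level rejects the random effects null
# in favour of the fixed effects model.
```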
Practice: Exercises at the end of Chapter 14.
