Panel Data Analysis

Dr. Ivy Das Gupta
Panel Data

• Panel data are constructed through surveys conducted at several points in time using the same cross section units.
• A panel consists of a set of multiple entities from which information on similar issues is collected over time.
• The cross-section units may be households, firms, countries and so on.
• Some explanatory variables are observed and can be controlled, but some information is unobserved and therefore uncontrollable.
• Some variables, like age and wealth profile, are time-dependent, while others, like gender, are time-independent.
• A formulation of the panel data model includes both observed and unobserved explanatory variables.
• Panel data econometric models examine inter-individual differences and intra-individual dynamics by mixing cross section and time series components.
• Panel data can take care of unobserved heterogeneity by estimating cross section-specific effects, time effects or both, which are analysed by fixed effects or random effects models.
• The analysis of panel data has been growing rapidly because of the availability of econometric and statistical programmes.
• Collecting panel data, however, is much more costly than collecting cross section or time series data.
• For this reason, panel data have not become widely available even in many developed countries.
• The National Longitudinal Surveys of Labour Market Experience in the US and the database prepared by the Michigan Panel Study of Income Dynamics (PSID) are well-known panel data sets used by researchers.
• In India, panel data are still not available in official statistics, although the industrial statistics wing of the Central Statistics Office (CSO) has been trying to prepare a factory-level panel from the Annual Survey of Industries (ASI).
• The type of panel data econometric model depends on the cross-section dimension, the time dimension and the nature of the entities (cross section units).
Structure of Panel Data

• Panel data can be organised along three dimensions: the cross section units (i = 1, 2, 3, …, N), the time periods (t = 1, 2, 3, …, T) and the variables (v = 1, 2, …, k).
• To estimate an econometric model with software, these three dimensions must be rearranged into a two-dimensional data matrix.
• The long format is the appropriate way of organising panel data in a computer programme.
• In this format, the data matrix of a balanced panel has N × T rows (one per unit-period pair) and k columns for the variables, alongside the unit and time identifiers.
• The data file therefore holds N × T records for each of the k variables.

Data Matrix of a Single Variable X

• A particular row of the data matrix holds cross section data: the values of X for all entities in one time period.
• The entries in a particular column give the values of X for one entity over time, forming a time series.
• In panel data, X varies across i and over t, and a particular entry in the data matrix is written as $X_{it}$.
• The means:
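In the usual dot notation (assumed here), with rows of the data matrix indexing time periods and columns indexing entities as described above, the matrix and the three means that can be taken from it (over time for each entity, across entities for each period, and over all observations) can be written as:

```latex
X =
\begin{pmatrix}
X_{11} & X_{21} & \cdots & X_{N1} \\
X_{12} & X_{22} & \cdots & X_{N2} \\
\vdots & \vdots & \ddots & \vdots \\
X_{1T} & X_{2T} & \cdots & X_{NT}
\end{pmatrix},
\qquad
\bar{X}_{i\cdot} = \frac{1}{T}\sum_{t=1}^{T} X_{it},
\quad
\bar{X}_{\cdot t} = \frac{1}{N}\sum_{i=1}^{N} X_{it},
\quad
\bar{X}_{\cdot\cdot} = \frac{1}{NT}\sum_{i=1}^{N}\sum_{t=1}^{T} X_{it}
```

These expressions presume a balanced panel; with missing observations the divisor T would be replaced by the number of periods actually observed for each entity.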
Types of Panel Data

• Panel data may be balanced or unbalanced.
• If each cross-section unit is observed in each and every time period, the data are called a balanced panel.
• In a balanced panel there are no missing values in the data set.
• In other words, in a balanced panel all entities have measurements in all time periods.
• In an unbalanced panel, on the other hand, information on some cross section units is not available for the entire time period.
• If there are missing data, the number of measurements, $T_i$, varies between cross section units, and the resulting data set is called an unbalanced panel.
• In other words, not every cross-section unit appears in every time period, and there are missing values.
➢ There are two types of economic panel data, based on the type of cross section unit: micro panels and macro panels.
➢ If the cross section units are micro units, the panel data form a micro panel.
• In a micro panel, the number of cross section units is much larger than the number of time periods (N >> T).
• Large surveys of households or firms over time form micro panels.
• A micro panel is also called a cross section panel or short panel.
• If, on the other hand, the cross section units are macro units, the data form a macro panel.
• In a macro panel, the number of cross section units is much smaller than the number of time periods (N << T).
• A macro panel is also called a long panel or time series panel.
• Because the time dimension is large, time series properties dominate in a macro panel.
Variation in a Panel Data Set

• Sources of variation in a panel data set:
➢ Within variation: values of a variable change over time for a single unit (e.g., a firm, individual or household).
➢ Between variation: values of a variable differ between units (i.e., across firms, individuals or households).
➢ Overall (total) variation = within variation + between variation.
• A labour-market example illustrates these sources of variation. The excerpt (a table on the original slide) comes from a panel data set used for a labour-market study on the determinants of earnings:
● The dependent variable is the log of wage (lwage).
● Every cross section unit is given a unique numeric identifier (person-id).
● The data are in the most standard arrangement for panel data (long form): all years are stacked in a column for each unit.
● In the excerpt, within variation is marked in a single colour for each unit.
● Between variation combines two colours.
● Total variation can be decomposed into within and between variation.
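The decomposition can be checked by hand. The sketch below is an illustration only: it assumes Stata's example data set nlswork (loaded with webuse, which needs internet access), whose variables idcode and ln_wage stand in for person-id and lwage above. It computes the total, within and between sums of squares and displays the total next to the sum of the other two.

```stata
* Example data shipped with Stata; not the data set shown on the slide
webuse nlswork, clear
keep if !missing(ln_wage)

* Person-specific mean and grand mean of log wage
egen double mean_i = mean(ln_wage), by(idcode)
quietly summarize ln_wage
scalar grand = r(mean)

* Squared deviations behind each source of variation
gen double total_sq   = (ln_wage - grand)^2
gen double within_sq  = (ln_wage - mean_i)^2
gen double between_sq = (mean_i - grand)^2

* Sum each of them over all person-year observations
quietly summarize total_sq
scalar ss_total = r(sum)
quietly summarize within_sq
scalar ss_within = r(sum)
quietly summarize between_sq
scalar ss_between = r(sum)

display "total sum of squares     = " ss_total
display "within + between         = " ss_within + ss_between
```

The two displayed numbers should agree up to rounding, because the deviations from a person's own mean sum to zero for every person, so the cross term vanishes.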
Data Description by using STATA

➢ The use command in Stata reads a data set; its clear option removes the data currently in memory before the new data set is loaded.
➢ The list command lists the values of individual observations.
➢ Suppose that we want to look at output per worker (output_per_worker), GDP growth (gdp_growth) and workers in wage employment (wage_workers) for the first 35 observations. We can use the following command:
list country_SA year output_per_worker gdp_growth wage_workers in 1/35, sep(28)
• The following command reshapes the data from wide form to long form:
reshape long output_per_worker gdp_growth wage_workers, i(country_SA) j(year)
• i(country_SA) names the variable that identifies the cross section units, and j(year) names the new variable that will hold the time index taken from the suffixes of the wide variable names.
• The describe command displays basic information about the variables.
• Descriptive statistics such as the mean, standard deviation, minimum and maximum of the variables in the sample data set are obtained with the summarize command.

➢ To use panel data commands, we need to declare which variable is treated as cross section units
and which one is used as time series variables
➢ Using the .xtset command followed by the name of the cross section and time series variables in
order.
➢ For example, if country is the variable name for cross section units and year is the name for time
variable, we can use the following command:
xtset country year
• If the following error appears:

xtset country year
string variables not allowed in varlist;
country is a string variable
r(109);

➢ we need to convert the cross section identifier country to a numeric variable with the command:
encode country, generate(country_i)
• Then execute: xtset country_i year
• To describe the pattern of the panel: xtdescribe
• To explore descriptive statistics: xtsum
• To plot a variable over time for each unit: xtline followed by the variable name
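Put together, these steps might look as follows. The data are made-up values for three hypothetical countries (A, B and C), and the variable name growth is a placeholder; only the sequence of commands is the point.

```stata
clear
* Made-up long-form data; country C has no 2003 observation
input str1 country year growth
"A" 2001 1.0
"A" 2002 1.5
"A" 2003 2.0
"B" 2001 3.0
"B" 2002 2.5
"B" 2003 3.5
"C" 2001 0.5
"C" 2002 1.0
end

* country is a string, so xtset country year would stop with r(109)
encode country, generate(country_i)
xtset country_i year

* Participation pattern (this panel is unbalanced because of country C)
xtdescribe
* Overall, between and within summary statistics
xtsum growth
* One time-series plot per country
xtline growth
```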
Benefits of Panel Data

• Panel data have several advantages over cross section data or time series data alone:
1. The number of data points is increased. With N cross section units and T time periods, the total number of observations is NT, so more degrees of freedom are available.
2. They help in constructing and testing more complicated behavioural hypotheses, e.g., in the presence of unobserved heterogeneity.
3. They contain intertemporal dynamics and may make it possible to control for the effects of unobserved variables when estimating a model.
4. The collinearity between current and lagged variables can be reduced.
5. Panel data help in providing micro foundations for aggregate data analysis. If micro units are heterogeneous, the time series properties of aggregate data can be very different from those of disaggregated data, and predicting aggregate outcomes from aggregate time series may be misleading. Panel data can address this problem by capturing the heterogeneity.
6. If observations on the cross-sectional units are independent, the central limit theorem can be used to show that the limiting distributions of many estimators remain asymptotically normal even for nonstationary series.
Error Component Model

• One way to restore homogeneity across i or over t, and to address the problem of endogeneity (an explanatory variable correlated with the error term), is to decompose the random error. The resulting model is known as the error component model.
• If the error is decomposed in one way, into either a cross section-specific or a time-specific component, it is called a one-way error component model.
• If the error is decomposed into both cross section-specific and time-specific components, it is a two-way error component model.
One-way Error Component Model

• In the one-way error component model, the random disturbance $u_{it}$ is decomposed into a cross section-specific error $\mu_i$ (or a time-specific error $\lambda_t$) and a distinctive (idiosyncratic) error $\varepsilon_{it}$:
$$u_{it} = \mu_i + \varepsilon_{it}$$
or
$$u_{it} = \lambda_t + \varepsilon_{it}$$
• With the one-way error component structure, the multiple linear regression takes the form
$$y_{it} = \mu_i + x_{it}'\beta + \varepsilon_{it}$$
or
$$y_{it} = \lambda_t + x_{it}'\beta + \varepsilon_{it}$$
• We impose the restriction that the slope coefficients are identical across units (or periods) while the intercepts are not; the model is then estimated by applying OLS.
Two-way Error Component Model

• The random error can also be decomposed in two ways, into both cross section-specific and time-specific errors:
$$u_{it} = \mu_i + \lambda_t + \varepsilon_{it}$$
• The two-way error component model is expressed as
$$y_{it} = \mu_i + \lambda_t + x_{it}'\beta + \varepsilon_{it}$$
➢ The error component model can be estimated with either a fixed effects or a random effects specification, depending on the nature of the error component.
Error Component Model

➢ When the error component is assumed to be non-stochastic, the model is a fixed effects model.
➢ When the error component is treated as random, it is a random effects model.

We therefore have four different types of error component model:

1) One-way error component fixed effects model,
2) One-way error component random effects model,
3) Two-way error component fixed effects model,
4) Two-way error component random effects model.
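As a rough illustration of how the one-way and two-way fixed effects variants differ in practice, the sketch below uses Stata's grunfeld example data (an assumption; it is not the data set used in these notes and is loaded with webuse, which needs internet access). Its variables company, year, invest, mvalue and kstock stand in for the unit, time, dependent and explanatory variables. Handling the time effects with year dummies is one common approach, not the only one.

```stata
webuse grunfeld, clear
xtset company year

* One-way fixed effects: company-specific intercepts only
xtreg invest mvalue kstock, fe

* Two-way fixed effects: company-specific intercepts plus year dummies
xtreg invest mvalue kstock i.year, fe

* Joint test that all year effects are zero
testparm i.year
```

The random effects counterparts are not shown here.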
• In a fixed effects model, the cross section-specific or time-specific errors are treated as the coefficients of dummy variables and become part of the intercept term.
• The fixed effects error component model is therefore sometimes called the least squares dummy variable (LSDV) model.
➢ In a random effects model, these errors are combined with the random disturbance.
➢ A simple trick for eliminating the unobserved cross section-specific error is the first-differenced estimator.
First-Differenced Estimator

• One inherent problem with applying OLS directly is that the error contains unobserved heterogeneity, which cannot be estimated separately.
• With panel data, the cross section-specific error can be differenced out by taking differences over time:
$$\Delta y_i = \Delta x_i'\beta + \Delta\varepsilon_i$$
• This is a simple cross-sectional regression equation in differences, without a constant.
• The coefficient vector $\beta$ can be estimated consistently by applying OLS, provided $\Delta x_i$ is uncorrelated with $\Delta\varepsilon_i$.
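The step from levels to differences can be spelled out as follows. This is a sketch for two periods, t = 1, 2, an assumption made here so that each unit contributes exactly one difference, in line with the single subscript i in the differenced equation.

```latex
\begin{aligned}
y_{i2} &= \mu_i + x_{i2}'\beta + \varepsilon_{i2} \\
y_{i1} &= \mu_i + x_{i1}'\beta + \varepsilon_{i1} \\
y_{i2} - y_{i1} &= (\mu_i - \mu_i) + (x_{i2} - x_{i1})'\beta + (\varepsilon_{i2} - \varepsilon_{i1}) \\
\Delta y_i &= \Delta x_i'\beta + \Delta\varepsilon_i
\end{aligned}
```

Because $\mu_i$ does not change over time, it drops out in the third line, which is why no intercept or unit effect remains in the differenced equation.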
First-Differenced Estimator (using STATA)

• We generate the first-differenced variables after declaring the data to be a panel:

* declare the panel structure: unit identifier first, then the time variable
* (country_SA must be numeric; encode it first if it is a string)
xtset country_SA year
* L. is the lag operator, so each line computes x(t) - x(t-1) within a country
gen d_lab = ln_lab - L.ln_lab
gen d_pro = ln_lab_pro - L.ln_lab_pro
gen d_growth = gdp_growth - L.gdp_growth
➢ Then we estimate an OLS regression with no constant:
reg d_lab d_pro d_growth, noconstant
First-Differenced Estimator

• The advantage of FD estimation is that the fixed effects cancel out.
• The intuition behind the FD estimator is that it uses only within-entity changes, bypassing between-entity differences.
• Unobserved time-constant differences between countries no longer bias the estimator.
• However, in the first-differenced model we cannot estimate the measure of heterogeneity, $\mu_i$.
• The fixed effects model can provide estimates of the cross section-specific unobserved heterogeneity.
Fixed Effects Model

• We assume that the individual effects are constant over time but not common across entities.
• The distinctive error varies over both individuals and time.
• In the fixed effects model we can estimate each $\mu_i$ along with $\beta$.
• There are several ways of estimating a fixed effects model.
• One popular method is “within” estimation, or mean-corrected estimation.
• Another is the least squares dummy variable (LSDV) model, which uses dummy variables for the cross section units.
The Within Estimation

• “Within” estimation uses deviations from group (or time period) means, i.e., the variation within each individual or entity.
• Let us start from the one-way error component model with a single regressor:
$$y_{it} = \beta_0 + \mu_i + \beta_1 x_{it} + \varepsilon_{it} \qquad \text{(i)}$$
➢ Taking the mean of this equation over time for each i (the “between” transformation), we have:
$$\bar{y}_{i\cdot} = \beta_0 + \mu_i + \beta_1 \bar{x}_{i\cdot} + \bar{\varepsilon}_{i\cdot} \qquad \text{(ii)}$$
• Averaging equation (ii) across individuals gives the following mean equation:
$$\bar{y}_{\cdot\cdot} = \beta_0 + \beta_1 \bar{x}_{\cdot\cdot} + \bar{\varepsilon}_{\cdot\cdot}$$
• The underlying assumption here is
$$\sum_{i=1}^{N} \mu_i = 0$$
• This restriction on the coefficients of the dummy variables is required to avoid the dummy variable trap. Only $\beta_1$ and $(\beta_0 + \mu_i)$ are estimable from equation (i), not $\beta_0$ and $\mu_i$ separately, unless the restriction is imposed.
➢ On the other hand, subtracting (ii) from (i) for each t (the “within” transformation), we get
$$y_{it} - \bar{y}_{i\cdot} = \beta_1 (x_{it} - \bar{x}_{i\cdot}) + (\varepsilon_{it} - \bar{\varepsilon}_{i\cdot})$$
➢ Here the incidental parameter $\mu_i$ is no longer a problem and the model can be estimated by applying OLS. Time-constant unobserved heterogeneity is no longer an issue in “within” estimation.
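The within transformation can be carried out by hand and compared with Stata's built-in fixed effects estimator. The sketch below assumes the grunfeld example data (loaded with webuse, which needs internet access), with invest as the dependent variable and mvalue as the single regressor; these names belong to that data set, not to these notes.

```stata
webuse grunfeld, clear
xtset company year

* Deviations from company means (the "within" transformation)
foreach v of varlist invest mvalue {
    egen double mean_`v' = mean(`v'), by(company)
    gen double dm_`v' = `v' - mean_`v'
}

* OLS on the demeaned data, without a constant
regress dm_invest dm_mvalue, noconstant

* Built-in within estimator for comparison
xtreg invest mvalue, fe
```

The slope coefficient should be the same in both regressions. The standard errors differ, because the hand-made regression does not account for the degrees of freedom used up in estimating the company means.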
Disadvantages of Within Estimation

The “within” estimation, however, has several disadvantages:
• First, it does not work well with data in which within-cluster variation is minimal.
• Second, the data transformation for “within” estimation wipes out all time-invariant variables, such as gender, citizenship and ethnic group, so the coefficients of such variables cannot be estimated.
• Third, “within” estimation does not report the estimated fixed effects.
LSDV Regression

• The least squares dummy variable (LSDV) regression is an OLS regression that includes a set of dummy variables for the cross section units, within the fixed effects framework.
• In many cases, the unobserved characteristics of the cross section units may be of interest to researchers.
• The within-group method does not estimate the unobserved fixed effects: by construction, they are swept out of the model.
• To estimate the fixed effects, we can treat them as the coefficients of binary variables representing the cross section units.
➢ The LSDV model provides estimates of the fixed effects $\mu_i$ along with the slope parameters.
➢ The LSDV estimator is practical only when N is small, because one dummy coefficient must be estimated for each unit.
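A minimal sketch of the LSDV regression next to the within estimator, again assuming the grunfeld example data (loaded with webuse; the variable names belong to that data set, not to these notes):

```stata
webuse grunfeld, clear
xtset company year

* Within estimator: company effects are swept out
xtreg invest mvalue kstock, fe

* LSDV: one dummy per company (one is dropped as the base category)
regress invest mvalue kstock i.company

* Same slopes, without printing the dummy coefficients
areg invest mvalue kstock, absorb(company)
```

All three commands should give the same slope estimates for mvalue and kstock. The dummy coefficients reported by regress are measured relative to the omitted base company rather than under a sum-to-zero restriction.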
Random Effects Model

• LSDV estimation can involve a substantial loss of degrees of freedom.
• This loss could be avoided if the unobserved effect $\mu_i$ is assumed to be random.
• If the unobserved effects are random, the error component model is a random effects model.
• The random effect of the unobserved heterogeneity is captured by the distribution of the intercepts.
• In the random effects model more degrees of freedom are available, because we do not need to estimate the parameters describing the cross section-specific or time-specific unobserved effects.
• The random effects model is an appropriate specification when the cross section units in a panel are drawn randomly from a large population.
• This type of sampling is more relevant for micro panels.
• The variation of the unobserved effects across entities is assumed to be random and uncorrelated with the independent variables included in the model.
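For orientation, a one-way random effects model can be fitted in Stata as sketched below, assuming the grunfeld example data (loaded with webuse; not the data set used in these notes).

```stata
webuse grunfeld, clear
xtset company year

* One-way random effects (GLS) estimator
xtreg invest mvalue kstock, re
```

In the output, sigma_u and sigma_e are the estimated standard deviations of the unit-specific and idiosyncratic components (what these notes call $\mu_i$ and $\varepsilon_{it}$), and rho is the share of the total error variance due to the unit-specific component.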
Assumptions in Random Effects Model

• The components of the error term are not correlated with each other: $E(\mu_i \varepsilon_{it}) = 0$.
• The $\mu_i$ are independent of the error term $\varepsilon_{it}$ and of the regressors $x_{it}$, for all i and t.
• Therefore, the mean and variance of the composite error $u_{it} = \mu_i + \varepsilon_{it}$ (conditional on the regressors) are
$$E(u_{it}) = 0, \qquad V(u_{it}) = V(y_{it}) = \sigma_y^2 = \sigma_\mu^2 + \sigma_\varepsilon^2$$
• $\sigma_\mu^2$ and $\sigma_\varepsilon^2$ are called the variance components of $\sigma_y^2$.
• The random effects model is also known as the variance components model.
• The covariance of the composite errors of different units is zero: for i ≠ j and any t and s,
$$\operatorname{Cov}(u_{it}, u_{js}) = E(u_{it} u_{js}) = 0$$
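For two observations on the same unit, the composite errors are correlated, because both contain the same $\mu_i$. A short derivation, assuming in addition that $\varepsilon_{it}$ is uncorrelated over time (the usual random effects assumption, stated here as an assumption rather than taken from the slides):

```latex
\begin{aligned}
\operatorname{Cov}(u_{it}, u_{is})
  &= E\big[(\mu_i + \varepsilon_{it})(\mu_i + \varepsilon_{is})\big] \\
  &= E(\mu_i^2) + E(\mu_i \varepsilon_{is}) + E(\varepsilon_{it}\mu_i) + E(\varepsilon_{it}\varepsilon_{is}) \\
  &= \sigma_\mu^2 \qquad (t \neq s), \\[4pt]
\rho = \operatorname{Corr}(u_{it}, u_{is})
  &= \frac{\sigma_\mu^2}{\sigma_\mu^2 + \sigma_\varepsilon^2}
\end{aligned}
```

The larger the share of $\sigma_\mu^2$ in the total error variance, the more strongly the errors of one unit are correlated over time.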
