Panel Data Analysis of Microeconomic Decisions: Fall 2020

Panel data analysis involves studying the same observational units, like individuals, firms or countries, over multiple time periods. This allows researchers to address identification problems like omitted variable bias that are difficult with cross-sectional data alone. Panel data sets are typically larger than cross-sectional or time series data alone, allowing for more efficient parameter estimates. A key advantage is that panel data methods can eliminate the influence of time-invariant characteristics through transformations like first-differencing, overcoming omitted variable bias from unobserved static factors.

Uploaded by

Aish Jamil

Panel Data Analysis of Microeconomic Decisions

Fall 2020
Panel Data Analysis Fall 2020

General Information

• Bettina Siflinger, email: [email protected] ; office: K538

• first half: 10 lectures, 2 computer exercises, all on zoom

• first part assignment due on October 23: submit online via canvas

• second assignment due about two weeks before exam (tba)

• grading: group assignments 30%, final written exam 70%

• textbooks
– M. Verbeek “A Guide to Modern Econometrics”, 5th edition, chapter 10.
– J. Wooldridge “Econometric analysis of cross section and panel data”, 2nd edition, mostly chapter 10.
– C. Cameron & P. Trivedi “Microeconometrics – methods and applications”, chapter 21.


Specifics part I
• first part of this course is rather theoretical. This is required to grasp and correctly
apply panel methods. It also will yield a solid preparation for the second part.

• we will not spend too much time on interpreting coefficients or discussing articles
in detail.
– research articles will be mainly used for illustration and as examples.
– slides contain links to all articles used for the course.
– we will mostly use our own data generating process (DGP) to examine the
behavior of estimators. We will use Stata. Material will be available on Canvas.

• the slides contain exercises. Some exercises will be done during the lecture, others
are to be solved at home. The solution will be provided on Canvas.

• for more extensive calculations, I will provide a set of pdf documents or additional
videos.

• important: Report typos/mistakes! This is just fair towards fellow students.


Outline

• panel data methods for static/dynamic linear models

1. introduction to panel data modeling
– what are panel data?
– advantages over conventional cross sections?
2. static linear model
– fixed effects (FE), first difference (FD), random effects (RE) model
– model comparison (Hausman test)
– instrumental variable (IV) models
3. dynamic linear model
– models for lagged dependent variables
– GMM estimation

Panel Data Analysis 1. Introduction

1. What are panel data?

• panel data are repeated observations for the same unit i = 1, ..., N observed over
t = 1, ..., T periods

• units can be individuals, firms, households (HH), countries, ...

• an observation is the pair $\{y_{it}; x_{it}\}$, where the subscript $i$ denotes the unit (e.g. an individual)
and the subscript $t$ denotes time

• $x_{it}$ can be time-varying, e.g. age, labor force status, health, or time-constant, e.g.
sex, genes, birth date

• a panel can be balanced, $\{y_{it}; x_{it}\}$: $t = 1, \dots, T$; $i = 1, \dots, N$, or unbalanced,
$\{y_{it}; x_{it}\}$: $t = \underline{t}_i, \dots, \bar{t}_i$; $i = 1, \dots, N$

• we only consider panel data with large N and (relatively) small T


Structure of panel data


• a panel data set can be interpreted as a three-dimensional matrix where the 3rd
dimension is time: think of T two-dimensional matrices (one for each period)
stacked behind each other in the 3rd dimension

• most econometric software is written for two-dimensional matrices → need to
represent three dimensions in two

• this can be done by stacking the T two-dimensional matrices below each other
(long format, done here) or next to each other (wide format)

• Exercise: Consider the structure of the data set. Does it represent an iid random
sample? Find one pro and one contra argument.
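
The long and wide layouts described on this slide can be sketched in a few lines. The course itself uses Stata; the following is only a minimal Python illustration on a made-up toy panel. The names `wide`, `wide_to_long` and `long_to_wide` are hypothetical and not part of the course material.

```python
# Hypothetical toy panel in wide format: one row per unit, one column per period.
wide = [
    {"id": 1, "y1": 10.0, "y2": 11.0, "y3": 12.5},
    {"id": 2, "y1": 8.0, "y2": 8.5, "y3": 9.0},
]
T = 3  # number of periods in the toy example


def wide_to_long(rows, n_periods, var="y"):
    """Stack the T period blocks below each other: one row per (t, id) pair."""
    long_rows = []
    for t in range(1, n_periods + 1):      # one block per period
        for row in rows:                   # all units observed in that period
            long_rows.append({"id": row["id"], "t": t, var: row[f"{var}{t}"]})
    return long_rows


def long_to_wide(rows, var="y"):
    """Put the period blocks next to each other again: one row per unit."""
    by_unit = {}
    for row in rows:
        unit = by_unit.setdefault(row["id"], {"id": row["id"]})
        unit[f"{var}{row['t']}"] = row[var]
    return [by_unit[key] for key in sorted(by_unit)]


long_panel = wide_to_long(wide, T)
for row in long_panel:
    print(row)
```

In long format the data set has N × T rows and an explicit time index, which is the layout most panel routines expect.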


Example panel data set: Contoyannis and Rice (2001, link)

• interested in modeling the impact of health on wages in the UK

• data set
– five waves of the British Household Panel Survey (BHPS)
– all individuals are employed during the observation period
– balanced sample containing 1,625 individuals for 5 waves

• variables
– outcome variable $y_{it}$: hourly wage rate (wage)
– time-varying $x_{it}$: health status (sahex, sahgd), age (age), ...
– time-invariant $x_{it}$: race (white), highest educational degree (deg), ...


Example panel data set: structure selected variables


Example panel data set: variation selected variables


Motivation: Why should we use panel data?

• panel data allow the identification of certain parameters or questions without
making overly restrictive assumptions

• example above: observe changes in an individual’s health and link them to wage development

→ 1.1 address identification problems (1st order for causal analysis)

• panel data sets are typically larger than cross-section/time-series data and variables
vary over two dimensions

• estimators based on panel data are often more accurate than those based on other data

→ 1.2 more efficient estimators (even with identical $N$)


1.1 Motivation: The omitted variables problem

• consider the following structural linear model with cross section data

$$y = \beta_0 + x'\beta + \alpha + u$$

$y$, $x \equiv (x_1, \dots, x_k)'$: observed random variables; $\alpha$: unobserved regressor;
$u$: unobserved iid error term with zero conditional mean, $E(u \mid x, \alpha) = 0$

• the population regression function is

$$E(y \mid x_1, \dots, x_k, \alpha) = E(y \mid x, \alpha) = \beta_0 + x'\beta + \alpha$$

– if $\alpha$ is uncorrelated with $x_j$ for some variable $j$, then $\alpha$ is just another variable
affecting $y$
– if $\mathrm{Cov}(x_j, \alpha) \neq 0$, not observing $\alpha$ creates an omitted variable bias (OMV)


• Exercise: Consider the situation above.

1. Why do we obtain an OMV?

2. Show how $\mathrm{Cov}(x_j, \alpha) \neq 0$ leads to an inconsistent estimate of $\beta_j$ using OLS.

3. Which solutions do we have for this problem when data are cross-section?


Panel data and omitted variable bias

• assume we observe the same cross section units at two points in time
– $y_t$, $x_t$ are observed for two time periods, $t = 1, 2$
– assume that $\alpha$ is time-constant, i.e. it does not vary across $t$

• the population regression function is

$$E(y_t \mid x_t, \alpha) = \beta_0 + x_t'\beta + \alpha, \quad t = 1, 2, \qquad (1)$$

$$y_t = \beta_0 + x_t'\beta + \alpha + u_t$$

• the zero conditional mean assumption, $E(u_t \mid x_t, \alpha) = 0$, $t = 1, 2$, implies the
orthogonality condition $E(x_t u_t) = 0$

• problem: OMV still exists as long as $\mathrm{Cov}(x_{jt}, \alpha) \neq 0$

• with panel data we can difference Equation (1) across $t$ to eliminate $\alpha$


• differencing eliminates all time-constant factors, including $\alpha$

$$\Delta y = \Delta x'\beta + \Delta u \qquad (2)$$

where $\Delta y = y_2 - y_1$, $\Delta x = x_2 - x_1$, $\Delta u = u_2 - u_1$

• for a consistent estimate $\hat\beta$, check the orthogonality condition for Equation (2)

$$E(\Delta x \, \Delta u) = 0$$
$$\Longleftrightarrow \; E[(x_2 - x_1)(u_2 - u_1)] = E(x_2 u_2) + E(x_1 u_1) - E(x_1 u_2) - E(x_2 u_1) = 0$$

• since $E(x_t u_t) = 0$ for $t = 1, 2$, the first two terms vanish

• to get rid of the third and fourth terms, we also need to assume $E(x_t u_s) = 0$ for $t \neq s$!

• strict exogeneity is the key assumption for identification with panel data models
and it will be central to this course!
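
The effect of differencing can be checked numerically. The sketch below is a minimal Python simulation (the course uses Stata), and every number in it is an illustrative assumption rather than course material: a scalar regressor, $\beta_0 = 1$, $\beta = 2$, $x_t = 0.5\alpha + v_t$, and standard normal $\alpha$, $v_t$, $u_t$. Under these assumptions the period-1 cross-section slope has population value $2 + 0.5/1.25 = 2.4$, while the first-difference slope should be close to 2.

```python
import random

# Illustrative assumptions only: N, the coefficients and the normal draws.
rng = random.Random(0)
N = 20000
beta0, beta = 1.0, 2.0

alpha = [rng.gauss(0, 1) for _ in range(N)]  # time-constant unobserved effect
# x[t][i] is correlated with alpha[i] in both periods; u is drawn independently,
# so strict exogeneity E(x_t u_s) = 0 holds by construction.
x = [[0.5 * a + rng.gauss(0, 1) for a in alpha] for _ in range(2)]
y = [[beta0 + beta * x[t][i] + alpha[i] + rng.gauss(0, 1) for i in range(N)]
     for t in range(2)]


def ols_slope(xs, ys):
    """Slope of a simple regression of ys on xs with an intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


def slope_no_intercept(xs, ys):
    """Slope of a regression through the origin, as in the differenced equation."""
    return sum(a * b for a, b in zip(xs, ys)) / sum(a * a for a in xs)


b_cross = ols_slope(x[0], y[0])              # alpha is omitted: biased
dx = [x[1][i] - x[0][i] for i in range(N)]   # alpha cancels in the differences
dy = [y[1][i] - y[0][i] for i in range(N)]
b_fd = slope_no_intercept(dx, dy)

print("cross-section slope:", round(b_cross, 2))
print("first-difference slope:", round(b_fd, 2))
```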


Other reasons why panel data are useful for identification

individual dynamics

• individual who has experienced an event in past is more likely to experience that
event in future compared to individual who has not experienced that event

• conditional probability of experiencing event in future is a function of past experience

• two explanations for this empirical regularity

(a) true state dependence: the lagged state, $y_{t-1}$, enters the model in a structural way as an
explanatory variable, e.g. experiencing the event changes preferences
(b) spurious state dependence: individuals differ in unobserved characteristics
which make them more/less likely to experience the event

• panel data allow distinguishing between (a) and (b). Why?


internal instruments
• panel data also allow computing internal instruments to address endogeneity in $x_{it}$

• structural model: $y_{it} = x_{it}'\beta + \alpha_i + u_{it} = x_{it}'\beta + \varepsilon_{it}$

• Example: impact of health on wages

– problem: individuals with higher wages may be healthier (reverse causality)
– consequence: regression of wages on health yields biased estimates
– solution: instrumental variable → correlated with health but uncorrelated with
$\varepsilon_{it} = \alpha_i + u_{it}$
– “internal instrument”: transformed endogenous health as instrument for $x_{it}$,
$z_{it} = x_{it} - \frac{1}{T} \sum_{s=1}^{T} x_{is}$

– crucial assumptions: $\mathrm{Cov}(x_{is}, \alpha_i) = \mathrm{Cov}(x_{it}, \alpha_i)$ and $\mathrm{Cov}(x_{is}, u_{it}) = 0$, $\forall s, t$
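
A short derivation shows why the two assumptions above make the demeaned regressor a valid instrument. It is a sketch for a scalar regressor, writing $c$ for the common covariance between the regressor and $\alpha_i$ (the symbol $c$ is introduced here for illustration only):

```latex
\begin{align*}
\mathrm{Cov}(z_{it}, \alpha_i)
  &= \mathrm{Cov}(x_{it}, \alpha_i) - \frac{1}{T}\sum_{s=1}^{T}\mathrm{Cov}(x_{is}, \alpha_i)
   = c - \frac{1}{T}\, T c = 0, \\
\mathrm{Cov}(z_{it}, u_{it})
  &= \mathrm{Cov}(x_{it}, u_{it}) - \frac{1}{T}\sum_{s=1}^{T}\mathrm{Cov}(x_{is}, u_{it}) = 0 .
\end{align*}
```

Hence the instrument is uncorrelated with the composite error $\alpha_i + u_{it}$. It remains correlated with the regressor as long as the regressor varies over time within a unit.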


1.2 Efficiency of parameter estimates

• panel data comprise a large amount of information; variables vary over more than one
dimension
→ considerable efficiency gains compared to less informative data

• repeated cross section

– the data set extends over several time periods, but in
each time period a new random sample is drawn from the population
– each individual is observed only once in the data → we use variation across
individuals in each time period

• panel data
– the data set extends over several time periods, and in each time period we observe
the same individuals again
– each individual is observed over all periods → we use variation across
individuals in each time period and within individuals over time


Example: Variance of estimator in repeated cross section data vs. panel data

• linear model with unobserved time-constant effect and time dummies

$$y_{it} = \mu_t + \alpha_i + u_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T$$

• interested in estimating time effects, i.e. the change $(\mu_t - \mu_s)$ from one period to the next

• to assess efficiency we require the variance of the estimator

$$\mathrm{Var}(\hat\mu_t - \hat\mu_s) = \mathrm{Var}(\hat\mu_t) + \mathrm{Var}(\hat\mu_s) - 2\,\mathrm{Cov}(\hat\mu_t, \hat\mu_s), \qquad (3)$$

with $\hat\mu_t = \frac{1}{N} \sum_{i=1}^{N} y_{it}$, $t = 1, \dots, T$


• Exercise: Any idea why the variance of the estimator is smaller for panel data
than for repeated cross sections? Use the formula in Equation (3) to come up
with a formal answer that you work out in your group. Assume that $u_{it}$ is an iid
error term that is independent of $\alpha_i$ and $\mu_1, \dots, \mu_T$. Hint:
– derive the covariance, $\mathrm{Cov}(\hat\mu_t, \hat\mu_s)$, and the variance, $\mathrm{Var}(\hat\mu_t)$, as functions of
the unobservables $\alpha_i$ and $u_{it}$
– then plug these into $\mathrm{Var}(\hat\mu_t - \hat\mu_s)$ for panel and cross section data and compare
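
The comparison in this exercise can also be checked by simulation before doing the algebra. The sketch below is a numerical check only, not a substitute for the derivation. It is written in Python (the course uses Stata) and rests on illustrative assumptions: two periods, standard normal draws for the unit effect and the idiosyncratic error, $N = 50$ units and 2000 replications. The function name `simulate_diff` is hypothetical.

```python
import random
from statistics import pvariance


def simulate_diff(panel, n_units, rng, mu=(0.0, 0.5)):
    """Return mu_hat_2 - mu_hat_1 for one simulated two-period sample."""
    alpha1 = [rng.gauss(0, 1) for _ in range(n_units)]
    # Panel: the same units (same alpha_i) are observed again in period 2.
    # Repeated cross section: a fresh sample, hence fresh alpha_i, in period 2.
    alpha2 = alpha1 if panel else [rng.gauss(0, 1) for _ in range(n_units)]
    y1 = [mu[0] + a + rng.gauss(0, 1) for a in alpha1]
    y2 = [mu[1] + a + rng.gauss(0, 1) for a in alpha2]
    return sum(y2) / n_units - sum(y1) / n_units


rng = random.Random(1)
N, R = 50, 2000  # units per sample, number of replications (assumed values)

var_panel = pvariance([simulate_diff(True, N, rng) for _ in range(R)])
var_rcs = pvariance([simulate_diff(False, N, rng) for _ in range(R)])

print("panel data:             ", round(var_panel, 3))
print("repeated cross section: ", round(var_rcs, 3))
```

The simulated variance of the estimated change should come out clearly smaller for the panel design, because the unit effect enters both period means there.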


1.3 Panel data estimators: An overview

• consider the following linear model for $i = 1, \dots, N$ and $t = 1, \dots, T$

$$y_{it} = \beta_0 + x_{it}'\beta + \varepsilon_{it} \qquad (4)$$

– $x_{it}$: single explanatory variable

– $\varepsilon_{it}$: error term, varies over $t$ and captures unobservable factors

• consistent estimation of $\beta$ using OLS requires

– population orthogonality assumption: $E(x_{it}\varepsilon_{it}) = 0$, $t = 1, \dots, T$
– rank assumption or no perfect multicollinearity

• OLS is efficient if $\{\varepsilon_{it} : t = 1, \dots, T\}$ is homoskedastic and serially uncorrelated


• unobserved effects formulation of Equation (4)

$$y_{it} = x_{it}'\beta + \alpha_i + u_{it} \qquad (5)$$

– the error term is $\varepsilon_{it} = \alpha_i + u_{it}$

– $\varepsilon_{it}$: composite error term, comprising a time-invariant component $\alpha_i$ and an
idiosyncratic error component $u_{it}$

• different panel data models differ by how $\alpha_i$ is treated

1. pooled OLS (POLS) estimator: ignores panel dimension and treats data as one
big cross section
– exogeneity assumption: $E(x_{it}\varepsilon_{it}) = 0$ with $E(x_{it}u_{it}) = 0$ and $E(x_{it}\alpha_i) = 0$
– even if the exogeneity assumption is satisfied, POLS has an efficiency problem: $\varepsilon_{it}$
depends on $\alpha_i$ for all $t$
→ correlation between $\varepsilon_{is}$ and $\varepsilon_{it}$ does not decrease as the distance $|t - s|$
increases
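
The constant correlation of the composite error can be made explicit. The following is a sketch under additional assumptions that are not stated on the slide: $u_{it}$ is serially uncorrelated with variance $\sigma_u^2$ and uncorrelated with $\alpha_i$, and $\mathrm{Var}(\alpha_i) = \sigma_\alpha^2$ (both variance symbols are introduced here for illustration).

```latex
\begin{align*}
\mathrm{Cov}(\alpha_i + u_{it},\, \alpha_i + u_{is}) &= \mathrm{Var}(\alpha_i) = \sigma_\alpha^2,
  \qquad t \neq s, \\
\mathrm{Corr}(\alpha_i + u_{it},\, \alpha_i + u_{is}) &= \frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_u^2},
  \qquad t \neq s .
\end{align*}
```

The correlation does not involve $|t - s|$ at all, so the composite error violates the "serially uncorrelated" condition for efficient OLS whenever $\sigma_\alpha^2 > 0$.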


2. random effects (RE) estimator

– similar idea as pooled OLS but strict exogeneity assumption required
– reason: RE estimator is a Generalized Least Squares (GLS) estimator which
requires strict exogeneity to produce consistent estimates
– RE estimator imposes specific structure on composite error term $\varepsilon_{it}$ (exploits
serial correlation in $\varepsilon_{it}$) to achieve efficiency

3. fixed effects (FE) estimator

– like RE estimator, FE estimator also requires strict exogeneity assumption
– FE estimator does not make any assumptions on $\alpha_i$ but allows for arbitrary
dependence between $\alpha_i$ and $x_{it}$
– trick: transform Equation (5) to eliminate the unobserved effect $\alpha_i$ (see e.g.
differencing in 1.1)
– FE estimator is an OLS estimator on transformed data
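
One common transformation of this kind is the within transformation, which subtracts each unit's time average from its observations. The slide does not say which transformation is used, so the sketch below is an assumption: a minimal Python illustration (the course uses Stata) with a scalar regressor and simulated data with true slope 2, in which the regressor is correlated with the unobserved effect. All parameter values are illustrative.

```python
import random

# Illustrative assumptions: N, T, the coefficients and the normal draws.
rng = random.Random(42)
N, T = 2000, 5
beta0, beta1 = 1.0, 2.0

panel = []  # one (x_i, y_i) pair of length-T lists per unit
for _ in range(N):
    a = rng.gauss(0, 1)                                 # unobserved effect alpha_i
    xs = [0.5 * a + rng.gauss(0, 1) for _ in range(T)]  # x_it depends on alpha_i
    ys = [beta0 + beta1 * xv + a + rng.gauss(0, 1) for xv in xs]
    panel.append((xs, ys))


def demean(values):
    """Subtract the unit-specific time average (within transformation)."""
    m = sum(values) / len(values)
    return [v - m for v in values]


x_within, y_within = [], []
for xs, ys in panel:
    x_within.extend(demean(xs))  # alpha_i and the intercept drop out here
    y_within.extend(demean(ys))

# OLS on the transformed data (no intercept needed after demeaning).
b_fe = sum(a * b for a, b in zip(x_within, y_within)) / sum(a * a for a in x_within)
print("within (FE) slope:", round(b_fe, 2))
```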


Stata example: pooled OLS in unobserved effects model

• generate a data set of $N = 200$ units that are observed over $T = 5$ periods, such
that we obtain $N \times T = 1000$ observations

• data generating process (DGP): outcome is generated through

$$y_{it} = \beta_0 + \beta_1 x_{it} + \alpha_i + u_{it}$$
– true parameters $\beta_0 = 1$ and $\beta_1 = 2$
– explanatory variable: $x_{it} = 0.5\alpha_i + v_{it}$, where $\alpha_i \overset{iid}{\sim} N(0, 1)$ across $i$, and
$v_{it} \overset{iid}{\sim} N(0, 1)$ across $i$ and $t$ (implies $x_{it} \sim N(0, 0.5^2 + 1)$)
– idiosyncratic error: $u_{it} \overset{iid}{\sim} N(0, 1)$ across $i$ and $t$; $u_{it} \perp \alpha_i, x_{it}$

• What result do we get for $\beta_1$ from an OLS regression? How does the estimator $\hat\beta_1$
change as $N$ increases?
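
The DGP on this slide can be simulated directly. The course does this in Stata; the following is a minimal Python sketch of the same DGP, with the function name `pooled_ols_slope` and the seed chosen here for illustration. Under this DGP, $\mathrm{Cov}(x_{it}, \alpha_i) = 0.5$ and $\mathrm{Var}(x_{it}) = 1.25$, so the pooled OLS slope has population value $2 + 0.5/1.25 = 2.4$ rather than 2.

```python
import random


def pooled_ols_slope(n_units, n_periods=5, seed=0):
    """Simulate the DGP and return the pooled OLS slope of y on x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_units):
        a = rng.gauss(0, 1)                    # alpha_i, constant over time
        for _ in range(n_periods):
            x = 0.5 * a + rng.gauss(0, 1)      # x_it = 0.5 * alpha_i + v_it
            xs.append(x)
            ys.append(1.0 + 2.0 * x + a + rng.gauss(0, 1))  # beta0 = 1, beta1 = 2
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((p - mx) * (q - my) for p, q in zip(xs, ys))
    sxx = sum((p - mx) ** 2 for p in xs)
    return sxy / sxx


# Increasing N reduces the sampling noise but not the bias.
for n_units in (200, 2000, 20000):
    print(n_units, round(pooled_ols_slope(n_units), 3))
```

A larger $N$ makes the estimate settle more tightly around 2.4, which illustrates that the problem is inconsistency and not small-sample noise.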


Empirical example: The impact of retirement on life satisfaction

• Paper: Bonsang & Klein (2012): “Retirement and subjective well-being” (link)
– study the effect of retirement on life satisfaction using panel data
– distinguish voluntary and involuntary retirement: differences in consumption-
leisure trade-off according to classical life cycle model
– different panel models to identify causal effect of retirement on life satisfaction

• the example aims to illustrate the advantage of having panel data in a situation where
the variable of interest is endogenous: OLS, FE, internal instruments

• Exercise (homework)
– What is the endogeneity problem with retirement? Name at least one source
of endogeneity.
– What role could income play in this relationship? How may the endogeneity
problem with income differ from the endogeneity problem with retirement?
