Econometrics II Slides-1

Chapter One

Regression Analysis with Qualitative Information


1.1. Types of Economic Data
1.2. Describing Qualitative Information
1.3. Dummy as Independent Variables
1.4. Dummy as Dependent Variable
1.4.1. The Linear Probability Model (LPM)
1.4.2. The Logit and Probit Models
1.4.3. Interpreting the Probit and Logit Model Estimates
1.1. Types of Economic Data
• There are four types of data in economics
A) Cross-sectional Data
• Sample of individuals, households, firms, cities, states, countries, or other units
at a specific point of time/in a given period
• Cross-sectional observations are more or less independent
• They can be collected using random sampling from a population
• Ordering of observations is not important
• They are widely used for applied microeconomics
Cross-sectional data can be given as below
Observation  Sex  Age  Income  Experience
1            M    34   4000    5
2            F    28   4500    7
3            M    36   5000    4
4            M    25   3000    3
5            F    30   2500    3
…            …    …    …       …
B) Time Series
• Time series data consist of observations on one or several variables over
time
• E.g., stock prices, money supply, consumer price index, gross domestic product,
annual homicide rates, automobile sales, etc.
• Time series observations are typically serially correlated
• Ordering of observations conveys important information
• Data frequency: daily, weekly, monthly, quarterly, annually,
• Typical features: trends and seasonality
• Typical applications: applied macroeconomics and finance
Time Series data can be given as below
Year GDP in bn Export in bn Import in bn
2001 100 30 50
2002 120 32 60
2003 115 30 85
2004 135 36 70
2005 145 45 80
… … … …
C) Pooled Data
• Two or more cross sections are combined in one data set
• i.e. The data is collected from different cross-sectional units at different periods
of time
• Cross sections are drawn independently of each other
• Pooled cross sections are often used to evaluate policy changes
• Example: To evaluate effect of change in property taxes on house prices
Take random sample of house prices for the year 1993
Take a new random sample of house prices for the year 1995
Then, compare before/after (1993: before reform, 1995: after reform)
Pooled Data are given as below
Observation Year Housing Price Property Tax
1 1993 20,000 1
2 1993 25,000 1
3 1993 35,000 1
4 1995 40,000 0
5 1995 35,000 0
6 1995 45,000 0
D) Panel Data
• The same cross-sectional units are followed over time
• It has a cross-sectional and a time series dimension
• It can be used to account for time-invariant unobservable effects
• It can also be used to model lagged responses
• Example:
City crime statistics; each city is observed in two years
Time-invariant unobserved city characteristics may be modeled
Effect of police on crime rates may exhibit time lag
Panel Data are given as below
Observation Year GDP in bn Export in bn
1 2016 400 100
1 2017 500 120
2 2016 550 200
2 2017 600 250
3 2016 80 30
3 2017 90 35
1.2. Describing Qualitative Information
• We often encounter qualitative information in econometric analysis
• Qualitative variables are variables that cannot be expressed naturally in numerical terms
• They are therefore difficult to quantify directly
• They describe people's behavior and the presence or absence of certain events and
characteristics
• Qualitative variables often come in the form of binary information
• These binary variables are often called dummy variables and take values of 0 or
1 in regression analysis
• Assigning these arbitrary numbers (0 and 1) is useful for interpretation of
parameter estimates
• Dummy variables can be used as independent or dependent variables
Examples of Qualitative Variables
• Sex; male, female
• Residence; urban, rural
• Poverty status; poor, non-poor
• Race; black, white
• Educational status; illiterate, elementary, high school, diploma and TVET,
degree and above
• Employment; unemployed, self-employed, public employed
• Marital status; married, unmarried, divorced, widowed, separated
• If the Qualitative Variable has two categories, one dummy is required (1
category used as base group)
• If the Qualitative Variable has three categories, two dummies are required
• If the Qualitative Variable has n categories, n-1 dummies are required
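• As an illustration of the n-1 rule, a minimal sketch (assuming pandas is available; the data frame and variable names are made up) of generating dummies for a three-category variable while keeping one category as the base group:

```python
import pandas as pd

# Hypothetical data: marital status has three categories, so two dummies are needed
df = pd.DataFrame({"marital": ["married", "unmarried", "divorced", "married", "unmarried"]})

# drop_first=True leaves one category out as the base group
dummies = pd.get_dummies(df["marital"], prefix="marital", drop_first=True)
print(dummies.head())
```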
1.3. Dummy as Independent Variables
• Dummy variables are binary variables having only two values
• They are used to study behaviors of households, firms, cities, countries, etc.
• They are useful for event study (drought, famine, war, disease), policy analysis,
and program evaluation
• They are used to represent ordinal and/or qualitative information in
regressions
• The coefficients of dummy variables in regressions show differences in
intercepts between groups/ categories
• Differences in slope coefficients are captured through interaction variables
(interaction of dummy variables and other explanatory variables)
• Example 1: Interpret the following model (level-level form)
• 𝑤𝑎𝑔𝑒 = 𝛽0 + 𝛽1 𝐸𝑑𝑢𝑖 + 𝛽2 𝐸𝑥𝑝𝑖 + 𝛽3 𝐴𝑔𝑒𝑖 + 𝛽4 𝐷𝑓𝑒𝑚𝑎𝑙𝑒 + 𝑢𝑖
• 𝑤𝑎𝑔𝑒 = 120 + 0.6 𝐸𝑑𝑢𝑖 + 0.7 𝐸𝑥𝑝𝑖 + 0.25 𝐴𝑔𝑒𝑖 − 13 𝐷𝑓𝑒𝑚𝑎𝑙𝑒
• Interpretations
• 𝛽0 =120: wage is 120 when all explanatory variables are zero.
• 𝛽1 =0.6; as education increases by 1 year, wage increases by 0.6 units keeping
the other variables constant.
• 𝛽2 =0.7; as experience increases by 1 year, wage increases by 0.7 units keeping
the other variables constant.
• 𝛽3 =0.25; as age increases by 1 year, wage increases by 0.25 keeping the other
variables constant.
• 𝜷𝟒 =-13; females’ wage is less than males’ wage by 13 units keeping the
other variables constant.
• Example 2: Interpret the following model (log-level form)
• ln𝑤𝑎𝑔𝑒 = 𝛽0 + 𝛽1 𝐸𝑑𝑢𝑖 + 𝛽2 𝐸𝑥𝑝𝑖 + 𝛽3 𝐴𝑔𝑒𝑖 + 𝛽4 𝐷𝑓𝑒𝑚𝑎𝑙𝑒 + 𝑢𝑖
• 𝑙𝑛𝑤𝑎𝑔𝑒 = 120 + 0.06 𝐸𝑑𝑢𝑖 + 0.07 𝐸𝑥𝑝𝑖 + 0.025 𝐴𝑔𝑒𝑖 − 0.12 𝐷𝑓𝑒𝑚𝑎𝑙𝑒
• Interpretations
• 𝛽0 =120: lnwage is 120 when all explanatory variables are zero.
• 𝛽1 =0.06; as education increases by 1 year, wage increases by 6% keeping the
other variables constant.
• 𝛽2 =0.07; as experience increases by 1 year, wage increases by 7% keeping the
other variables constant.
• 𝛽3 =0.025; as age increases by 1 year, wage increases by 2.5% keeping the other
variables constant.
• 𝜷𝟒 =-0.12; females’ wage is less than males’ wage by 12% keeping the other
variables constant.
• Example 3: Interpret the following model (log-log form)
• 𝑙𝑛𝐺𝐷𝑃 = 𝛽0 + 𝛽1 ln 𝐿𝑡 + 𝛽2 ln 𝐾𝑡 + 𝛽3 ln 𝑅𝐹𝑡 + 𝛽4 𝐷𝑤𝑎𝑟 + 𝑢𝑡
• 𝑙𝑛𝐺𝐷𝑃 = 406 + 0.30 ln 𝐿𝑡 + 0.35 ln 𝐾𝑡 − 0.24 ln 𝑅𝐹𝑡 − 0.2 𝐷𝑤𝑎𝑟
• Interpretation
• 𝛽0 =406; lnGDP is 406 when all of the explanatory variables are zero
• 𝛽1 =0.30; as labor increases by 1%, GDP increases by 0.30% keeping the other
variables constant.
• 𝛽2 =0.35; as capital increases by 1%, GDP increases by 0.35% keeping the other
variables constant
• 𝛽3 =-0.24; as rainfall variability increases by 1%, GDP decreases by 0.24%
keeping the other variables constant
• 𝜷𝟒 =-0.2; GDP during a war period is lower than GDP during a peace period by
approximately 20% keeping the other variables constant.
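• A minimal sketch of how a regression like Example 1 could be estimated with statsmodels; the data are simulated purely for illustration, and the coefficient on the female dummy is read as an intercept shift relative to the male base group:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "edu": rng.integers(0, 16, n),      # years of education
    "exp": rng.integers(0, 30, n),      # years of experience
    "age": rng.integers(18, 60, n),
    "female": rng.integers(0, 2, n),    # dummy: 1 = female, 0 = male
})
# Simulated wage with an intercept shift for females (illustrative values only)
df["wage"] = (120 + 0.6*df["edu"] + 0.7*df["exp"] + 0.25*df["age"]
              - 13*df["female"] + rng.normal(0, 5, n))

X = sm.add_constant(df[["edu", "exp", "age", "female"]])
res = sm.OLS(df["wage"], X).fit()
print(res.params)
# The coefficient on 'female' is the intercept difference between females and
# the base group (males), holding the other regressors constant.
```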
1.4. Dummy as Dependent Variable
• We may encounter dummy dependent variables in econometric analysis
• We explain a qualitative event with binary outcome
• Our dependent variable, Y, takes only two values: zero and one
• There are three approaches to estimating regressions with a binary dependent
variable
• They are: the Linear Probability Model (LPM), the Logit model and the Probit model
1.4.1. The Linear Probability Model (LPM)
• The simplest model in terms of interpreting parameter estimates
• Easier to estimate using OLS method
• Given the model:
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋1𝑖 + 𝛽2 𝑋2𝑖 + ⋯ + 𝛽𝑘 𝑋𝑘𝑖 + 𝑢𝑖 … … … … … … … … … 1
• Where the dependent variable, Y, takes the two values 0 and 1, and the Xs are
explanatory variables
 The β can’t be interpreted as the change in Y as a result of a change in X
 Y only changes either from 0 to 1 or from 1 to 0
 Since E(u_i | X) = 0, we have E(Y_i | X) = β0 + β1X1i + β2X2i + ⋯ + βkXki
 Since Y takes only the two values 0 and 1, the probability of "success", i.e. the probability
that Y equals 1, is its expected value:
P(Y = 1 | X) = E(Y_i | X) = β0 + β1X1i + β2X2i + ⋯ + βkXki …… (2)
• Since the sum of probabilities is 1, the probability that Y equals 0 is given by
P(Y = 0 | X) = 1 − P(Y = 1 | X) …… (3)
• The multiple linear regression model with a binary dependent variable is called
Linear Probability Model (LPM)
• The response probability is linear in the parameters, β
• In the LPM, β measures the change in the probability of success when X
changes
• Example: given a model on house ownership
• 𝐻𝑂𝑖 = 𝛽0 + 𝛽1 𝑀𝑖 + 𝛽2 𝑀𝑎𝑟𝑟𝑖 + 𝛽3 𝐷𝑖𝑣𝑜𝑟𝑖 + 𝛽4 𝐹𝑒𝑚𝑖 + 𝑢𝑖
• 𝐻𝑂𝑖 = 0.001 + 0.031 𝑀𝑖 + 0.01 𝑀𝑎𝑟𝑟𝑖 − 0.005 𝐷𝑖𝑣𝑜𝑟𝑖 + 0.015 𝐹𝑒𝑚𝑖
• Interpretation
• 𝛽0 =0.001; the probability of owning a house is 0.001 when all of the
explanatory variables are zero.
• 𝛽1 =0.031; as income increases by 1 unit, the probability of owning a house
increases by 0.031 keeping other variables constant
• 𝛽2 =0.01; married people have higher probability of owning a house than the
unmarried people by 0.01 keeping other variables constant.
• 𝛽3 =-0.005; divorced people have lower probability of owning a house than the
unmarried people by 0.005 keeping other variables constant
• 𝛽4 =0.015; females have a higher probability of owning a house than males by
0.015 keeping other variables constant
• Drawbacks of LPM
• Functional form problem (theoretically not intuitive), non-normality of ui,
heteroscedasticity of ui, the possibility of estimated probabilities lying outside the 0-1
range, and low R2
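• A minimal sketch of estimating an LPM by OLS on simulated data (statsmodels assumed); robust standard errors are used because of the heteroscedasticity noted above, and the last lines check how many fitted probabilities fall outside the 0-1 range:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "income": rng.normal(50, 15, n),
    "married": rng.integers(0, 2, n),
    "female": rng.integers(0, 2, n),
})
# Simulated binary outcome (illustrative data-generating values)
p_true = 1 / (1 + np.exp(-(-3 + 0.06*df["income"] + 0.4*df["married"])))
df["own_house"] = rng.binomial(1, p_true)

X = sm.add_constant(df[["income", "married", "female"]])
lpm = sm.OLS(df["own_house"], X).fit(cov_type="HC1")   # robust SEs: u_i is heteroscedastic
print(lpm.params)

fitted = lpm.fittedvalues
print("share of fitted probabilities outside [0, 1]:",
      ((fitted < 0) | (fitted > 1)).mean())
```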
1.4.2. The Logit and Probit Models
• Logit and Probit models are used to model binary response variables
• The primary interest of logit/ probit models is the response probability
• They are nonlinear in nature
I) The Logit Model
• Consider a binary variable of house ownership
• Let Y=1 means the household owns a house and 0 otherwise and X is income
• The probability of home ownership can be given by;
• $P_i = E(Y = 1 \mid X_i) = \dfrac{1}{1 + e^{-(\beta_0 + \beta_1 X_i)}}$ …… (1)
• Defining $Z_i = \beta_0 + \beta_1 X_i$, equation 1 can be rewritten as:
• $P_i = \dfrac{1}{1 + e^{-Z_i}} = \dfrac{e^{Z_i}}{1 + e^{Z_i}}$ …… (2)
• Equation 2 is the cumulative logistic distribution function
• As X, and hence Z, ranges from −∞ to +∞, Pi ranges from 0 to 1
• At a very low level of income, X, the change in probability of owning house
for a small increase in income is low.
• Similarly, for a large level of income, the change in probability of owning
house for a small change in income is low
• Probability of owning house only changes highly for some medium range of
income level
• Thus, the graph of logistic distribution function is S shaped
• Fig. Cumulative logistic distribution function (S-shaped CDF: probability P on the vertical axis, bounded between 0 and 1, plotted against X on the horizontal axis)
• If Pi in equation 2 is the probability of owning house, the probability
of not owning a house is 1-Pi
$1 - P_i = \dfrac{1}{1 + e^{Z_i}}$ …… (3)
• Now, we can define the Odds ratio in favor of owning a house
• The odds ratio is the ratio of the probability of owning house to the probability of
not owning
$\dfrac{P_i}{1 - P_i} = \dfrac{e^{Z_i}/(1 + e^{Z_i})}{1/(1 + e^{Z_i})} = e^{Z_i}$ …… (4)
• Taking the logarithm of the odds ratio in equation 4, we get
$L_i = \ln\left(\dfrac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \beta_1 X_i + u_i$ …… (5)
• Where L is the log of the odds ratio and is called Logit.
II) The Probit Model
• Another alternative of modelling binary response variables
• Probit model uses normal cumulative distribution function and is called Normit
model
• The Logit and Probit models, though mathematically different, are similar in
outcome
• If a variable X follows a normal distribution with mean μ and variance σ², its
probability density function (PDF) and cumulative distribution function
(CDF) are given by
• $f(X) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$ and $F(X_0) = \displaystyle\int_{-\infty}^{X_0} \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}\, dX$, respectively.
• Suppose home ownership depends on some unobservable utility index I_i
determined by income, X:
𝐼𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 … … … … … … … … … … … … .6
• Let Y = 1 if the family owns a house and 0 otherwise. It is plausible to assume
that there is a critical level of the utility index, say I_c, such that if I_i exceeds I_c
the family owns a house; otherwise it does not.
• Though not observable, it is possible to estimate Ii using equation 6 if it is
normally distributed
• Given the normality assumption, the probability that Ic is less than or equal to Ii
can be calculated from the CDF
$P_i = P(Y = 1 \mid X) = P(I_c \le I_i) = P(Z \le \beta_0 + \beta_1 X_i) = F(\beta_0 + \beta_1 X_i)$ …… (7)
• Where F( ) is the standard normal CDF and is given by;
• $F(I_i) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{I_i} e^{-z^2/2}\, dz = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{\beta_0 + \beta_1 X_i} e^{-z^2/2}\, dz$ …… (8)
1.4.3. Interpreting the Probit and Logit Model Estimates
• Estimated Logit and Probit coefficients show the direction and significance of
the effect of the explanatory variables on the dependent variable
• Probabilities can be computed at some fixed values of the explanatory variables
such as mean
I) Logit Interpretation
• Consider Logit model given by; 𝐿 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝑢𝑖
• Here the slope coefficient 𝛽1 measures the change in log of the odds ratio in
favour of owning house for a unit change in income.
• Example, if the Logit estimate is given as 𝐿 = 1.6 + 0.08𝑋,
• The log of the odds ratio increases by 0.08 for a unit increase in income.
II) Odds Interpretation
• The odds ratio can be calculated by taking the antilog of the estimated slope coefficient; from our
example, antilog(0.08) = e^0.08 = 1.083
• The interpretation is: for a unit increase in income, the odds ratio in favour of owning a
house is multiplied by 1.083 (i.e. it increases by about 8.3%).
III) Computing probabilities
• It is possible to calculate probabilities at a certain level of income, say at the mean
• Example: compute the probability of owning a house at income X = 20
• To calculate, first find L, then take the antilog of L, and finally solve for the probability P_i.
 L = 1.6 + 0.08(20) = 3.2, and P_i/(1 − P_i) = antilog(L) = e^L = e^{3.2} = 24.53. Since we have
defined P_i = e^Z/(1 + e^Z), we get P_i = 24.53/(1 + 24.53) = 0.96, or 96%.
IV) Computing the rate of change of probability
• The rate of change of probability depends on the estimated slope coefficient and the level of
income at which initial probability is measured
• The rate of change in probability is given by β1(1 − P)P. In our example, the change in the
probability of owning a house at income level X = 20 is ΔP = 0.08(1 − 0.96)(0.96) ≈ 0.003.
As income changes from 20 to 21, the probability of owning a house increases by about 0.3 percentage points.
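• A short sketch that reproduces the arithmetic of this worked example (L = 1.6 + 0.08X is the illustrative estimate from the slides):

```python
import numpy as np

b0, b1 = 1.6, 0.08        # illustrative logit estimates from the example
X = 20                    # income level at which we evaluate

L = b0 + b1 * X           # logit (log of the odds ratio)
odds = np.exp(L)          # odds ratio in favour of owning a house
P = odds / (1 + odds)     # probability of owning a house
dP = b1 * P * (1 - P)     # approximate change in P for a one-unit rise in income

print(f"L = {L:.2f}, odds = {odds:.2f}, P = {P:.3f}, dP/dX = {dP:.4f}")
# Roughly: L = 3.20, odds = 24.53, P = 0.961, dP/dX = 0.003
```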
• Example: Suppose you want to analyze the determinants of poverty in Maichew town and
poverty status of a household is given as a function of education, employment status, number
of children, age and sex. The model is specified as;
• 𝑃𝑜𝑣𝑖 = 𝛽0 + 𝛽1 𝐸𝑑𝑢𝑖 + 𝛽2 𝐸𝑚𝑝𝑖 + 𝛽3 𝑁𝑐ℎ𝑖 + 𝛽4 𝐴𝑔𝑒𝑖 + 𝛽5 𝐹𝑒𝑚𝑖 + 𝑢𝑖
• Where Pov=1 if hh is poor and 0 otherwise
• The model is estimated using Logit and Probit models and the estimates are given below
a) Using Logit Model;
𝑃𝑜𝑣𝑖 = 0.01 − 0.013 𝐸𝑑𝑢𝑖 − 0.022 𝐸𝑚𝑝𝑖 + 0.031 𝑁𝑐ℎ𝑖 − 0.015𝐴𝑔𝑒𝑖 + 0.001 𝐹𝑒𝑚𝑖
Interpretation
• 𝛽0 =0.01; the log of the odds ratio is 0.01 when all of the explanatory variables are zero
• 𝛽1 =-0.013; education reduces the probability of being poor keeping other variables constant (or as
education increases by 1 year, the log of odds ratio decreases by 0.013 keeping other variables constant)
• 𝛽2 =-0.022: employed people have lower probability of being poor than the unemployed people keeping
other variables constant (or employed people have lower log of odds ratio than the unemployed people by
0.022 keeping other variables constant)
• 𝛽3 =0.031; number of children increases the probability of being poor keeping other variables constant (or
as number of children increases by 1, the log of odds ratio increases by 0.031 keeping other variables
constant)
• 𝛽4 =-0.015; age reduces the probability of being poor keeping other variables constant (or as age increases
by 1, the log of odds ratio decreases by 0.015 keeping other variables constant)
• 𝛽5 =0.001; females have higher probability of being poor than males keeping other variables constant (or
females have higher log of odds ratio than males by 0.001 keeping other variables constant)
b) Using Probit Model;
𝑃𝑜𝑣𝑖 = 0.024 − 0.035 𝐸𝑑𝑢𝑖 − 0.018 𝐸𝑚𝑝𝑖 + 0.053 𝑁𝑐ℎ𝑖 − 0.021 𝐴𝑔𝑒𝑖 + 0.023 𝐹𝑒𝑚𝑖
Interpretation
• 𝛽0 =0.024
• 𝛽1 =-0.035; education reduces the probability of being poor keeping other variables
constant
• 𝛽2 =-0.018: employed people have lower probability of being poor than the
unemployed people keeping other variables constant
• 𝛽3 =0.053; number of children increases the probability of being poor keeping
other variables constant
• 𝛽4 =-0.021; age reduces the probability of being poor keeping other variables
constant
• 𝛽5 =0.023; females have higher probability of being poor than males keeping other
variables constant
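• A minimal sketch of estimating the poverty model above with statsmodels on simulated data; the variable names follow the example, the data-generating values are made up, and get_margeff() is one way to turn the index coefficients into changes in probability:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 400
df = pd.DataFrame({
    "edu": rng.integers(0, 16, n),
    "emp": rng.integers(0, 2, n),
    "nch": rng.integers(0, 8, n),
    "age": rng.integers(18, 70, n),
    "fem": rng.integers(0, 2, n),
})
# Simulated poverty status from an illustrative latent index
idx = -0.5 - 0.1*df["edu"] - 0.4*df["emp"] + 0.3*df["nch"] - 0.01*df["age"] + 0.1*df["fem"]
df["poor"] = rng.binomial(1, 1/(1 + np.exp(-idx)))

X = sm.add_constant(df[["edu", "emp", "nch", "age", "fem"]])
logit_res = sm.Logit(df["poor"], X).fit(disp=0)
probit_res = sm.Probit(df["poor"], X).fit(disp=0)

print(np.exp(logit_res.params))           # odds ratios implied by the logit coefficients
print(logit_res.get_margeff().summary())  # average marginal effects (changes in probability)
print(probit_res.params)                  # probit coefficients: direction and significance
```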
Chapter Two
Introduction to Basic Regression Analysis with Time Series Data
2.1. The nature of Time Series Data
2.2. Stationary and non-stationary stochastic Processes
2.3. Trend Stationary and Difference Stationary Stochastic Processes
2.4. Integrated Stochastic Process
2.5. Tests of Stationarity: The Unit Root Test
2.6. Co-integration and Error Correction Models
2.1. The nature of Time Series Data
• A sequence of random variables is called a stochastic process
• A particular realization of a stochastic process is called a time series
• Hence, a time series is generated by a stochastic (random) process
• We don’t know, for instance, what the GDP of Ethiopia will be in 2011 in
advance
• But, we will get a single realization at the end from a number of possible
outcomes
• In cross-sectional data we take a sample from a population; in time series we observe one
realization of a stochastic process
• There are two types of time series models: static models and finite distributed lag (FDL) models
a) Static Models
• Static models show contemporaneous relationship among variables
• The variables in the model are dated concurrently
• Used to model relationships when a change in the explanatory variable is
assumed to have immediate effect on the dependent variable
• Show tradeoff between variables
• 𝑖𝑛𝑓𝑡 = 𝛽0 + 𝛽1 𝑢𝑛𝑒𝑚𝑝𝑡 + 𝑢𝑡
b) Finite Distributed Lag Models
• Used to model relationships when a change in explanatory variables is assumed
to have lagged effect on the dependent variable
• 𝑖𝑛𝑓𝑡 = 𝛽0 + 𝛽1 𝑚𝑡 + 𝛽2 𝑚𝑡−1 + 𝛽3 𝑚𝑡−2 + 𝑢𝑡
• A finite distributed lag model of order two (contains two lags)
 𝛽1 is the impact propensity (multiplier) that measures the immediate change
on inflation as a result of a unit change in money supply
 𝛽2 & 𝛽3 measure the change in inflation one period and two periods after the
change in money supply respectively
 If the change in money supply is permanent, the sum of the coefficients gives
the long run propensity (multiplier). [𝐿𝑅𝑃 = 𝛽1 + 𝛽2 + 𝛽3 ]
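• A minimal sketch (simulated money-supply and inflation series, statsmodels assumed) of estimating an FDL model of order two and recovering the long run propensity as the sum of the lag coefficients:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
T = 200
m = pd.Series(rng.normal(0, 1, T))                                # simulated money supply growth
inf = 0.5*m + 0.3*m.shift(1) + 0.2*m.shift(2) + rng.normal(0, 1, T)

df = pd.DataFrame({"inf": inf, "m": m, "m_1": m.shift(1), "m_2": m.shift(2)}).dropna()
X = sm.add_constant(df[["m", "m_1", "m_2"]])
res = sm.OLS(df["inf"], X).fit()

lrp = res.params[["m", "m_1", "m_2"]].sum()   # long run propensity = b1 + b2 + b3
print(res.params)
print("LRP =", round(lrp, 3))
```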
2.2. Stationary and non-stationary stochastic Processes
I) Stationary time series
• The notion of stationarity plays key role in time series analysis
• A stationary time series process is the one whose probability distributions are stable over
time
• If we take any collection of random variables in the sequence and then shift that sequence
ahead h time periods, the joint probability distribution must remain unchanged.
• Two types of stationarity conditions; strict stationarity and weak (covariance) stationarity
a) Strictly stationary series:
 The joint distribution of any set of n observations [𝑋𝑡1 , 𝑋𝑡2 , … , 𝑋𝑡𝑛 ] is the same as the joint
distribution of [𝑋𝑡1+𝑘 , 𝑋𝑡2+𝑘 , … , 𝑋𝑡𝑛+𝑘 ] for all n and k.
 The distribution of variables depends on the lag length ( 𝑡2 − 𝑡1 ) not on time t.
 The mean and variance of a strictly stationary series are constant over time.
 E(X_t) = μ and var(X_t) = σ²
b) Weakly stationary series:
• For weakly stationary time series, the mean is constant and its auto-covariance function (acvf)
depends only on the lag.
• E(X_t) = μ, and cov(X_t, X_{t+k}) = γ_k depends only on the lag k = t₂ − t₁, not on t
• A weakly stationary process is also called wide-sense stationary, covariance stationary or
second order stationary.
• For normally distributed variable, 𝑋𝑡 , weak and strict stationarity are equivalent
II) Non-stationary time series
• A stochastic process that is not stationary is said to be a non-stationary process.
• Consider a variable 𝑌𝑡 generated by a random process of the nature:
• 𝑌𝑡 = 𝑌𝑡−1 + 𝑢𝑡
• Where u_t is a white noise error term with mean 0 and variance σ² (it is stationary by
definition.)
• This model is called random walk without drift.
• If 𝑌0 is the initial value of 𝑌𝑡 , we can have ; 𝑌1 = 𝑌0 + 𝑢1 , 𝑌2 = 𝑌1 + 𝑢2 = 𝑌0 + 𝑢1 + 𝑢2 ,
𝑌3 = 𝑌2 + 𝑢3 = 𝑌0 + 𝑢1 + 𝑢2 + 𝑢3 , and so on
• Thus, at time t we have: $Y_t = Y_0 + u_1 + u_2 + \cdots + u_t = Y_0 + \sum_{j=1}^{t} u_j$ …… (16)
• Then we have: $E(Y_t) = E\!\left(Y_0 + \sum_{j=1}^{t} u_j\right) = Y_0$ and $\operatorname{var}(Y_t) = \operatorname{var}\!\left(Y_0 + \sum_{j=1}^{t} u_j\right) = t\sigma^2$
• Although the mean of the random walk model is constant, its variance changes with time.
Thus, the random walk model without drift is a non-stationary time series
• A random walk is said to have an indefinite memory; any shock designated by non-zero 𝑢𝑡
persists over time.
Other examples of non-stationary time series include;
 Random walk with drift process; 𝑌𝑡 = 𝜇 + 𝑌𝑡−1 + 𝑢𝑡
 Random walk with drift and trend process: Y_t = μ + Y_{t−1} + δt + u_t
 Trend stationary process: Y_t = μ + δt + u_t
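• A short simulation sketch (numpy assumed) contrasting a stationary white-noise series with random walks with and without drift; the growing sample variance of the random walk paths reflects the variance tσ² derived above:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
u = rng.normal(0, 1, T)            # white noise: stationary by definition

white_noise = u
rw = np.cumsum(u)                  # random walk without drift: Y_t = Y_{t-1} + u_t
rw_drift = np.cumsum(0.1 + u)      # random walk with drift:    Y_t = 0.1 + Y_{t-1} + u_t

# Sample variance of the first and second halves: roughly constant for white noise,
# but growing over time for the random walks.
for name, y in [("white noise", white_noise), ("random walk", rw), ("rw with drift", rw_drift)]:
    print(name, round(np.var(y[:T//2]), 2), round(np.var(y[T//2:]), 2))
```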
2.3. Trend Stationary and Difference Stationary Stochastic Processes
• A variable may be non-stationary but tend to move upward or downward along
a trend line according to Y_t = β0 + β1t + u_t
• In this case, the variable is said to be generated by a trend stationary process (TSP)
• Trend stationary processes are converted to stationary by removing the trend (detrending)
• If a variable (𝑌𝑡 ) is not stationary but its differences (∆𝑌𝑡 ) are stationary, the
variable is said to be generated by difference stationary process (DSP).
• Most economic data are DSP type.
2.4. Integrated Stochastic Process
• An interesting property of the pure random walk model (and also the one with drift) is
that its first difference is stationary: ΔY_t = Y_t − Y_{t−1} = u_t, so E(ΔY_t) = E(u_t) = 0 and
var(ΔY_t) = var(u_t) = σ²
• A non-stationary time series Y_t whose first difference is stationary is said to be
integrated of order 1: Y_t ~ I(1)
• A non-stationary time series which has to be differenced d times to make it
stationary is said to be integrated of order d: Y_t ~ I(d)
• A stationary time series is an I(0) variable.
2.5. Tests of Stationarity: The Unit Root Test
• Consider a variable Y_t generated by a first-order autoregressive (AR) random process
of the form Y_t = ρY_{t−1} + u_t
• Substituting ρY_{t−2} + u_{t−1} for Y_{t−1}, ρY_{t−3} + u_{t−2} for Y_{t−2}, and so on, we get:
$Y_t = \rho^n Y_{t-n} + \sum_{j=0}^{n-1} \rho^j u_{t-j}$
• If ρ ≥ 1, then shocks to Y_t never die out and Y_t is non-stationary (ρ = 1 gives a random walk; ρ > 1 is explosive)
• If |ρ| < 1, then shocks die out over time, Y_t fluctuates around a constant mean, and it is stationary
• Our null for the test is: H0: ρ = 1, with alternative HA: ρ < 1. (One -tail test).
However, this null cannot be directly tested and thus we must transform it into
testable hypothesis.
• The equation Y_t = ρY_{t−1} + u_t is equivalent to ΔY_t = αY_{t−1} + u_t, where α = ρ − 1
• Thus, the null H0: ρ = 1 is equivalent to the null H0: α = 0, and the
alternative HA: ρ < 1 is equivalent to HA: α < 0
• When α = ρ − 1 = 0, the autoregressive root equals 1, and hence the name unit root.
• If the null 𝛼 = 0 is not rejected, we say there is unit root and 𝑌𝑡 is non-
stationary. If it is rejected, there is no unit root and 𝑌𝑡 is stationary.
• We do not use the t table for the test. Rather, we compare the computed ratios
against critical values from the τ-statistic table constructed by Dickey and Fuller.
• The test is called the Dickey-Fuller (DF) unit root test
• We can include a drift and time trend in our AR process. Moreover, we can
include lags to account for autocorrelation problem in which case our test is
called Augmented Dickey Fuller test
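• A minimal sketch of the (Augmented) Dickey-Fuller test using statsmodels' adfuller, applied to a simulated random walk where we expect to fail to reject the unit-root null, and to its first difference where we expect to reject it:

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(0, 1, 300))      # simulated random walk (non-stationary)

adf_stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, autolag="AIC")
print("ADF statistic:", round(adf_stat, 3), "p-value:", round(pvalue, 3))
print("critical values:", crit)
# Large p-value: do not reject H0 of a unit root, so y is treated as non-stationary.

# Differencing once and re-testing usually rejects H0 for an I(1) series:
d_stat, d_p = adfuller(np.diff(y), autolag="AIC")[:2]
print("ADF on first difference, p-value:", round(d_p, 3))
```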
• If variables are found to be stationary, run the static (long run) model of the
form;
• 𝒀𝒕 = 𝜷𝟎 + 𝜷𝟏 𝑿𝒕𝟏 + 𝜷𝟐 𝑿𝒕𝟐 + ⋯ + 𝜷𝒌 𝑿𝒕𝒌 + 𝒖𝒕 … … … … … … … … … … 𝟏𝟕
• If variables are non-stationary, we have to test for co-integration (possibility of long
run relationship)
• If there is no co-integration, the variables have no long run relationship
• In that case, they may still have a short run relationship, and we
estimate the short run model given below;
∆𝒀𝒕 = 𝜷𝟎 + 𝜷𝟏 ∆𝑿𝒕𝟏 + 𝜷𝟐 ∆𝑿𝒕𝟐 + ⋯ + 𝜷𝒌 ∆𝑿𝒕𝒌 + 𝒖𝒕 … … … … … … 𝟏𝟖
• For non-stationary variables, it is possible to have long run relationship as well. In
this case, we can go for test of co-integration.
• If we get evidence of co-integration (long run relationship), we can estimate error
correction models
• Engle-Granger method, Johansen method, and ARDL (Autoregressive Distributed
Lag) method are examples of testing and estimating co-integration & error
correction models
2.6. Co-integration and Error Correction Models; Engle-Granger Method
• The notion of co-integration was given a formal treatment in Engle and Granger
(1987)
• If {Yt: t = 0, 1, …} and {Xt: t = 0, 1, …} are two I(1) processes, then, Yt -βXt can be
an I(1) process for any number β
• Moreover, it is possible that for some β≠0, Yt - βXt can be an I(0) process,
which means it has constant mean, constant variance, and autocorrelations
that depend only on the lag length
• If such a β exists, Y and X are said to be co-integrated, and we call β
the co-integration parameter
Testing for Co-integration
• The null hypothesis Ho: The variables are not co-integrated
• Alternative hypothesis H1: The variables are co-integrated
• Engle-Granger Procedure
• Given a model; Yt = β0 + β1 Xt1 + β2 Xt2 + ⋯ + βk Xtk + ut
• Step 1: regress the above model using OLS
• Step 2: predict the residuals from the model (^ut)
• Step 3: test unit root of the residuals (^ut)
• Decision rule
• If the residuals are found to be stationary, reject the null hypothesis of no-co-
integration and go for error correction models
• If the residuals are found to be non-stationary, do not reject the null hypothesis of no
co-integration and estimate the short run model only (equation 18)
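• A minimal sketch of the Engle-Granger steps on simulated co-integrated series; statsmodels' coint bundles the residual-based unit root test with the appropriate critical values (an ordinary DF table is not strictly valid for residuals):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(6)
T = 300
x = np.cumsum(rng.normal(0, 1, T))           # I(1) regressor
y = 2.0 + 0.8*x + rng.normal(0, 1, T)        # co-integrated with x by construction

# Step 1-2: long-run OLS regression and its residuals
long_run = sm.OLS(y, sm.add_constant(x)).fit()
print("long-run slope:", round(long_run.params[1], 3))

# Step 3: residual-based test (coint wraps steps 1-3 with Engle-Granger critical values)
t_stat, pvalue, crit = coint(y, x)
print("EG t-statistic:", round(t_stat, 3), "p-value:", round(pvalue, 3))
# Small p-value: reject H0 of no co-integration and proceed to the error correction model.
```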
Error Correction Model
• If Yt and Xt are co-integrated with parameter β, we have to estimate the error
correction model
• The error correction model contains the short run and long run information about the
relationship b/n variables
• The error correction model can be specified as below;
∆𝒀𝒕 = 𝟎 + 𝟏 ∆𝒀𝒕−𝟏 + 𝜷𝟏 ∆𝑿𝒕 + 𝜷𝟐 ∆𝑿𝒕−𝟏 + 𝒆𝒕−𝟏 + 𝒖𝒕
• Where 𝒆𝒕−𝟏 is called the error correction term and contains long run information
• The parameter  measures the speed of adjustment towards equilibrium
• If  < 𝟎, the system is said to be stable. i.e. any disequilibrium in the short run will be
adjusted towards equilibrium
• If  > 𝟎, the system is said to be instable. i.e. any disequilibrium in the short run will
never be adjusted towards equilibrium
• The variable 𝒆𝒕−𝟏 is the lagged value of the residual from the OLS regression
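• Continuing in the same spirit, a minimal sketch of the two-step error correction model on simulated data: a long run OLS regression, its lagged residual, and then a regression of ΔY on ΔX and e_{t−1}; lags of ΔY and ΔX can be added exactly as in the specification above:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)
T = 300
x = np.cumsum(rng.normal(0, 1, T))
y = 2.0 + 0.8*x + rng.normal(0, 1, T)        # simulated co-integrated pair

# Step 1: long run (static) regression and its residuals e_t
long_run = sm.OLS(y, sm.add_constant(x)).fit()
e = long_run.resid

# Step 2: error correction model in first differences with the lagged residual
df = pd.DataFrame({"dy": np.r_[np.nan, np.diff(y)],
                   "dx": np.r_[np.nan, np.diff(x)],
                   "e_lag": pd.Series(e).shift(1)}).dropna()
ecm = sm.OLS(df["dy"], sm.add_constant(df[["dx", "e_lag"]])).fit()
print(ecm.params)   # the coefficient on e_lag is the speed-of-adjustment term (expected < 0)
```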
Chapter Three
Introduction to Simultaneous Equation Models
3.1. The Nature Of Simultaneous Equation Models
3.2. Simultaneity Bias
3.3. Order and Rank Conditions of Identification
3.4. IV and 2SLS Estimation of the Structural Equations
3.1. The Nature Of Simultaneous Equation Models
• Simultaneous equation models (SEMs) are required when we face endogeneity
problem due to simultaneity
• Simultaneity arises when one or more of the explanatory variables are jointly
determined with the dependent variable
• Instrumental Variable (IV) and two stage least squares (2SLS) regression are the
leading methods of estimating SEMs
• The classical example of SEMs is the equilibrium analysis of supply and demand
in a commodity or factor market
3.2. Simultaneity Bias
• Simultaneity bias refers to the bias in OLS estimation due to the correlation
between explanatory variables and error terms in simultaneous equation
models
• Consider two equation SEMs of;
𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝛽2 𝑍1𝑖 + 𝑢1𝑖
𝑋𝑖 = 𝛼0 + 𝛼1 𝑌𝑖 + 𝛼2 𝑍2𝑖 + 𝑢2𝑖
• Where Yi and Xi are jointly determined within the model and Z1 and Z2 are
exogenous variables
• By definition, Z1 and Z2 are not correlated with either u1 or u2
• However, we can easily verify that Yi is correlated with u2 and Xi is also
correlated with u1
• Solve for Xi to check whether it is correlated with u1 or not
$X_i = \alpha_0 + \alpha_1(\beta_0 + \beta_1 X_i + \beta_2 Z_{1i} + u_{1i}) + \alpha_2 Z_{2i} + u_{2i}$
$X_i = \alpha_0 + \alpha_1\beta_0 + \alpha_1\beta_1 X_i + \alpha_1\beta_2 Z_{1i} + \alpha_1 u_{1i} + \alpha_2 Z_{2i} + u_{2i}$
$(1 - \alpha_1\beta_1) X_i = \alpha_0 + \alpha_1\beta_0 + \alpha_1\beta_2 Z_{1i} + \alpha_2 Z_{2i} + \alpha_1 u_{1i} + u_{2i}$
$X_i = \dfrac{\alpha_0 + \alpha_1\beta_0}{1 - \alpha_1\beta_1} + \dfrac{\alpha_1\beta_2}{1 - \alpha_1\beta_1} Z_{1i} + \dfrac{\alpha_2}{1 - \alpha_1\beta_1} Z_{2i} + \dfrac{\alpha_1}{1 - \alpha_1\beta_1} u_{1i} + \dfrac{1}{1 - \alpha_1\beta_1} u_{2i}$
$X_i = \pi_0 + \pi_1 Z_{1i} + \pi_2 Z_{2i} + \delta_1 u_{1i} + \delta_2 u_{2i}$
• Where $\pi_0 = \dfrac{\alpha_0 + \alpha_1\beta_0}{1 - \alpha_1\beta_1}$, $\pi_1 = \dfrac{\alpha_1\beta_2}{1 - \alpha_1\beta_1}$, $\pi_2 = \dfrac{\alpha_2}{1 - \alpha_1\beta_1}$, $\delta_1 = \dfrac{\alpha_1}{1 - \alpha_1\beta_1}$, and $\delta_2 = \dfrac{1}{1 - \alpha_1\beta_1}$, with $\alpha_1\beta_1 \neq 1$
• From the last equation, we can clearly see the correlation between Xi and u1 and this
causes bias and inconsistency in OLS estimation
3.3. Order and Rank Conditions of Identification
a) The Identification Problem
• Estimation of simultaneous equation models requires identification
• i.e., to estimate an equation from a SEM we first have to identify it
• Consider the simple supply and demand equations for equilibrium quantity Q:
𝑄 = 𝛽1 𝑃 + 𝛽2 𝑍1 + 𝑢1 … … … … … … … .1
𝑄 = 𝛼1 𝑃 + 𝑢2 … … … … … … … … … … . . 2
• Where the 1st equation is the supply function while the 2nd is the demand function
• The presence of the exogenous variable Z1 in the supply function helps us identify the
demand function
• i.e. Z1 serves as an Instrumental Variable (IV) for price in the demand function
• Therefore, we can estimate the demand function
• However, there is no exogenous variable in the demand equation and thus we have
no IV for price in the supply function
• The supply function is said to be unidentified and can’t be estimated
b) Conditions for Identification: Order and Rank Condition
• Order condition: The number of excluded exogenous variables from an
equation should be, at least, as large as the number of right-hand side
endogenous variables
• The order condition is only necessary but not sufficient condition
• Rank condition: The 1st equation in a two-equation simultaneous equation
model is identified, if and only if, the second equation contains at least as many
exogenous variables (with a non-zero coefficient) that are excluded from that
equation as the number of endogenous variables included in it
• The rank condition is the necessary and sufficient condition for identification
• Once an equation is identified, it can be estimated using IV or 2SLS
• Example: consider the following three equation simultaneous equation models
X1 = β1 X2 + β2 X3 + β3 Z1 + u1
X2 = α1 X1 + α2 X3 + α3 Z1 + α4 Z2 + u2
X3 = γ1 X2 + γ2 Z1 + γ3 Z2 + γ4 Z3 + γ5 Z4 + u3 …… (3)
• Where the Xs are endogenous while the Zs are exogenous variables
• In terms of the order condition, the 1st equation is over identified because
while we need two IVs, we have three
• The second equation is just identified while the third is unidentified equation
3.4. IV and 2SLS Estimation of the Structural Equations
• consider the structural models of;
𝑋1 = 𝛽1 𝑋2 + 𝛽2 𝑍1 + 𝑢1 … … … 1
𝑋2 = 𝛼1 𝑋1 + 𝛼2 𝑍2 + 𝑢2 … … … . . 2
• Where, the Xs are endogenous and the Zs are exogenous variables
• Suppose we want to estimate the 1st equation
• Substituting the right-hand side of the 1st equation into the 2nd, we have:
𝑋2 = 𝛼1 [𝛽1 𝑋2 + 𝛽2 𝑍1 + 𝑢1 ] + 𝛼2 𝑍2 + 𝑢2
𝑋2 = 𝛼1 𝛽1 𝑋2 + 𝛼1 𝛽2 𝑍1 + 𝛼2 𝑍2 + 𝛼1 𝑢1 + 𝑢2
1 − 𝛼1 𝛽1 𝑋2 = 𝛼1 𝛽2 𝑍1 + 𝛼2 𝑍2 + 𝛼1 𝑢1 + 𝑢2
$X_2 = \dfrac{\alpha_1\beta_2}{1 - \alpha_1\beta_1} Z_1 + \dfrac{\alpha_2}{1 - \alpha_1\beta_1} Z_2 + \dfrac{\alpha_1 u_1 + u_2}{1 - \alpha_1\beta_1}$
$X_2 = \pi_1 Z_1 + \pi_2 Z_2 + V_2$ …… (3)
• Where $\pi_1 = \dfrac{\alpha_1\beta_2}{1 - \alpha_1\beta_1}$, $\pi_2 = \dfrac{\alpha_2}{1 - \alpha_1\beta_1}$, and $V_2 = \dfrac{\alpha_1 u_1 + u_2}{1 - \alpha_1\beta_1}$
• Equation 3, which expresses X2 in terms of exogenous variables and error terms,
is called the reduced form equation
• The πs are called reduced form equation parameters and are non-linear
functions of the structural parameters
• Substituting the fitted values of X2 from the reduced form equation (3) into the
first equation, we can estimate the following equation using OLS:
X1 = β1 X̂2 + β2 Z1 + u1
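• A minimal sketch of 2SLS done "by hand" on simulated data: stage one regresses X2 on the exogenous variables (the reduced form), stage two regresses X1 on the fitted X̂2 and Z1. Note that the second-stage standard errors from this manual route are not the correct 2SLS standard errors; dedicated IV routines adjust for that:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 1000
z1, z2 = rng.normal(size=n), rng.normal(size=n)
u1, u2 = rng.normal(size=n), rng.normal(size=n)

# Simulated simultaneous system with illustrative parameter values:
#   X1 = 1.0*X2 + 0.5*Z1 + u1,   X2 = 0.4*X1 + 0.8*Z2 + u2
x2 = (0.4*(0.5*z1 + u1) + 0.8*z2 + u2) / (1 - 0.4*1.0)   # reduced form for X2
x1 = 1.0*x2 + 0.5*z1 + u1

# Stage 1: reduced form regression of the endogenous regressor X2 on all exogenous variables
stage1 = sm.OLS(x2, sm.add_constant(np.column_stack([z1, z2]))).fit()
x2_hat = stage1.fittedvalues

# Stage 2: replace X2 by its fitted values in the structural equation for X1
stage2 = sm.OLS(x1, sm.add_constant(np.column_stack([x2_hat, z1]))).fit()
print(stage2.params)   # should be close to the true values (const ~ 0, 1.0, 0.5)
```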
Chapter Four
Introduction to Panel Data Regression Models
4.1. Introduction
4.2. Estimation of Panel Data Regression Model: The Fixed Effects Approach
4.3. Estimation of Panel Data Regression Model: The Random Effects Approach
4.1. Introduction: Pooled Cross Section Vs Panel Data
a) Pooled cross sections
• Are obtained by sampling randomly from a large population at different points
in time
• The respondents are not the same in the different time periods
• They consist of independently sampled observations and it rules out correlation
in the error terms for different observations
• An independently pooled cross section differs from a single random sample in
that pooled cross sections likely lead to observations that are not identically
distributed
• The non-identical distribution of pooled cross sections can be captured by
allowing either the intercept or the slope coefficients to change over time
b) Panel data
• Are data collected at different points in time, like pooled cross sections
• The respondents are the same in all time periods
• Observations are not independently distributed across time; E.g unobserved
factors that affect an individual’s wage in 2000 will affect the wage of that
person in 2005 as well
4.2. Estimation of Panel Data Regression Model: The Fixed Effects Approach
• Two period panel data are the simplest form of panel data models
• The model is given by
𝑌𝑖𝑡 = 𝛽0 + 𝛽1 𝐷2𝑡 + 𝛽2 𝑋𝑖𝑡1 + 𝛽3 𝑋𝑖𝑡2 + ⋯ + 𝛽4 𝑋𝑖𝑡𝑘 + 𝑎𝑖 + 𝑢𝑖𝑡 … … … … … … 1
• Where, the variable D2t is a dummy variable that equals 0 in the first period and 1 in
the second period. It does not change across i, and that is why it has no i subscript.
• The variable 𝑎i captures all unobserved, time-constant factors that affect Yit. Since 𝑎i is
constant over time, it has no time subscript.
• Generically, 𝑎i is called unobserved effect or fixed effect and the above model is called
unobserved effects model or fixed effects model.
• The error uit is often called the idiosyncratic error or time-varying error, because it
represents unobserved factors that change over time and affect Yit.
Estimation
• If 𝑎i is correlated with the explanatory variables, we use first differencing or the fixed
effects transformation, which remove it from the equation
• Fixed effects transformation is used to eliminate the fixed effect, 𝑎i from the model
• Consider a model $Y_{it} = \beta_0 + \beta_1 X_{it} + a_i + u_{it}$ …… (2)
• Averaging this equation over time for each i, we get $\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_i + a_i + \bar{u}_i$ …… (3)
• Where $\bar{Y}_i = \frac{1}{T}\sum_{t=1}^{T} Y_{it}$ and so on. Since $a_i$ is constant over time, it appears in both
equations. Subtracting equation 3 from 2, we get:
• $Y_{it} - \bar{Y}_i = \beta_1 (X_{it} - \bar{X}_i) + (u_{it} - \bar{u}_i)$, i.e. $\ddot{Y}_{it} = \beta_1 \ddot{X}_{it} + \ddot{u}_{it}$ …… (4)
• Where $\ddot{Y}_{it}$, $\ddot{X}_{it}$, and $\ddot{u}_{it}$ are the time-demeaned data on Y, X and u respectively
• A pooled OLS estimator based on the time-demeaned variables is called the fixed
effects estimator or within estimator
• Time-constant variables are eliminated by the within transformation and so do not appear in the
fixed effects output. To estimate the effect of time-constant variables, we interact them with other variables or with time
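• A minimal sketch of the within (fixed effects) estimator on a simulated panel: time-demean Y and X by unit as in equation 4 and run pooled OLS on the demeaned data (a dedicated panel routine would give the same point estimate with degrees-of-freedom-corrected standard errors):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(9)
N, T = 100, 5
ids = np.repeat(np.arange(N), T)
a = np.repeat(rng.normal(0, 2, N), T)              # unobserved fixed effect a_i
x = 0.5*a + rng.normal(size=N*T)                   # X correlated with a_i, so pooled OLS is biased
y = 1.0 + 2.0*x + a + rng.normal(size=N*T)

df = pd.DataFrame({"id": ids, "y": y, "x": x})

# Fixed effects transformation: subtract each unit's time average (this wipes out a_i)
demeaned = df[["y", "x"]] - df.groupby("id")[["y", "x"]].transform("mean")
fe = sm.OLS(demeaned["y"], demeaned[["x"]]).fit()
print("within estimate of beta1:", round(fe.params["x"], 3))   # close to the true value 2.0
```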
4.3. Estimation of Panel Data Regression Model: The Random Effects Approach
• Given the panel data model of the form 𝑌𝑖𝑡 = 𝛽0 + 𝛽1 𝑋𝑖𝑡 + 𝑎𝑖 + 𝑢𝑖𝑡 … … … … … … 5
• Random effects model is used to estimate the parameters when we assume the
unobserved effect 𝑎𝑖 is uncorrelated with the explanatory variables
• If we define a composite error term vit, 𝑣𝑖𝑡 = 𝑎𝑖 + 𝑢𝑖𝑡 , equation 5 can be rewritten as
𝑌𝑖𝑡 = 𝛽0 + 𝛽1 𝑋𝑖𝑡 + 𝑣𝑖𝑡 … … … … … … … … … … … … … 6
• Equation 6 can be estimated using pooled OLS. However, the existence of 𝑎𝑖 in 𝑣𝑖𝑡
may cause serial correlation problem in the composite error term 𝑣𝑖𝑡
• Under random effects assumption, the composite errors are serially correlated and their
correlation is given by;
• $\operatorname{corr}(v_{it}, v_{is}) = \dfrac{\sigma_a^2}{\sigma_a^2 + \sigma_u^2}$, for $t \neq s$, where $\sigma_a^2 = \operatorname{var}(a_i)$ and $\sigma_u^2 = \operatorname{var}(u_{it})$
• Therefore, we use GLS estimation to solve the serial correlation problem
GLS estimation of random effects model
• Define $\theta = 1 - \left[\dfrac{\sigma_u^2}{\sigma_u^2 + T\sigma_a^2}\right]^{1/2}$ …… (7)
• The GLS transformation equation can be given by
• $Y_{it} - \theta\bar{Y}_i = \beta_0(1-\theta) + \beta_1(X_{it} - \theta\bar{X}_i) + (v_{it} - \theta\bar{v}_i)$ …… (8)
• This model involves a quasi-demeaned data on each variable
• In fixed effects model, we subtract the time averages while in random effects model we
subtract a fraction of that time average from the corresponding variable
• The GLS estimator is the pooled OLS estimator of equation 8
• For large T, θ approaches 1 and the random effects estimator approaches the fixed effects
estimator
• For small T, θ approaches 0 and the random effects estimator approaches the pooled OLS
estimator
• The fixed effects estimator is obtained when θ = 1 and pooled OLS is obtained when θ = 0
• The Hausman test is used to determine whether to use the random effects or the fixed effects model
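• A minimal sketch of the random effects GLS transformation on a simulated panel, using the θ of equation 7; the variance components are taken as known here purely for illustration (in practice they are estimated, and a Hausman test would guide the FE-vs-RE choice):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(10)
N, T = 200, 4
ids = np.repeat(np.arange(N), T)
a = np.repeat(rng.normal(0, 1.5, N), T)            # random effect, uncorrelated with x
x = rng.normal(size=N*T)
y = 1.0 + 2.0*x + a + rng.normal(size=N*T)
df = pd.DataFrame({"id": ids, "y": y, "x": x})

# Quasi-demeaning with theta from equation 7; sigma_a^2 and sigma_u^2 are assumed
# known here (1.5^2 and 1.0) only to keep the sketch short
sigma_a2, sigma_u2 = 1.5**2, 1.0
theta = 1 - np.sqrt(sigma_u2 / (sigma_u2 + T*sigma_a2))

means = df.groupby("id")[["y", "x"]].transform("mean")
y_star = df["y"] - theta*means["y"]
x_star = sm.add_constant(df["x"] - theta*means["x"])
re = sm.OLS(y_star, x_star).fit()                  # pooled OLS on quasi-demeaned data = RE (GLS)
print("theta =", round(theta, 3), "RE estimate of beta1:", round(re.params.iloc[-1], 3))
```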