Advanced Econometrics Assignment

Uploaded by istiak ahmed

Dynamic Econometric Model

Autoregressive Model:
Autoregressive models are a class of statistical and machine learning (ML) models that predict the next element in a sequence from the elements that precede it. Autoregression is a technique used in time-series analysis that assumes the current value of a time series is a function of its past values. Autoregressive models use regression techniques to estimate the probabilistic relationship between elements in a sequence, and then use that knowledge to predict the next element of a new sequence. For example, during training, an autoregressive language model might process many English sentences and learn that the word “is” frequently follows the word “there.” When generating a new sequence, it then tends to produce “there is” together.

Linear regression

You can think of linear regression as drawing the straight line that best represents the values distributed on a two-dimensional graph. From that line, the model generates new data points consistent with the conditional distribution of the historical values.

Consider the simplest form of the line equation between y (dependent variable) and x (independent variable): y=c*x+m, where c and m are constants for all possible values of x and y. Suppose, for example, that the input dataset for (x,y) is (1,5), (2,8), and (3,11). To fit a line by the linear regression method, you would use the following steps:

1. Plot a straight line through the first point (1,5).
2. Adjust the slope and intercept of the line for the new values (2,8) and (3,11) until all the values fit.
3. Identify the linear regression equation as y=3*x+2.
4. Extrapolate, or predict, that y is 14 when x is 4.
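The steps above can be reproduced with a short least-squares fit; `numpy.polyfit` is used here as one convenient choice:

```python
import numpy as np

# The three sample points from the text
x = np.array([1.0, 2.0, 3.0])
y = np.array([5.0, 8.0, 11.0])

# Fit a degree-1 polynomial; returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)        # slope ≈ 3.0, intercept ≈ 2.0

# Extrapolate to x = 4, as in step 4
y_pred = slope * 4 + intercept
print(y_pred)                  # ≈ 14.0
```

Because the three points are exactly collinear, the least-squares line passes through all of them and the prediction at x = 4 is exact.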
Autoregression:

Autoregressive models apply linear regression to lagged values of their own output from previous steps. Unlike ordinary linear regression, an autoregressive model uses no independent variables other than the previously observed values of the series itself. Consider the following formula for an autoregressive model of order p:

y[t] = c + ϕ₁y[t-1] + ϕ₂y[t-2] + . . . + ϕₚy[t-p] + ε[t]

Here, y[t] is the prediction formed from multiple orders of previous results multiplied by their respective coefficients, ϕ. Each coefficient is a weight, or parameter, reflecting that predictor's importance for the new result. The formula also includes a random noise term, ε[t], representing variation the model cannot explain; its presence indicates that the model is not ideal and that further improvement is possible.

When expressed in probabilistic terms, an autoregressive model factorizes the joint distribution of a sequence over its n steps, assuming that earlier values conditionally influence the outcome of the next one:

p(y[1], y[2], . . . , y[n]) = p(y[1]) p(y[2] | y[1]) · · · p(y[n] | y[1], . . . , y[n-1])

Lag

Data scientists add more lagged values to improve the accuracy of an autoregressive model. They do so by increasing the lag order p, the number of past steps the model uses as input. A higher order allows the model to capture more past values. For example, you can expand an autoregressive temperature model from the past 7 days to the past 14 days to try to get a more accurate forecast. That said, increasing the lag order of an autoregressive model does not always improve accuracy. If a coefficient is close to zero, that particular predictor has little influence on the result of the model. Moreover, indefinitely expanding the lag order results in a more complex model that requires more computing resources to run.
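As a sketch of these ideas (illustrative, not from the original text), the snippet below fits an AR(p) model by ordinary least squares on a matrix of lagged values, and recovers known coefficients from simulated data:

```python
import numpy as np

def fit_ar(series, p):
    """Estimate AR(p) coefficients [intercept, phi_1, ..., phi_p] by OLS."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    # Design matrix: a constant column plus the p lagged values of the series
    X = np.column_stack(
        [np.ones(n - p)] + [series[p - k : n - k] for k in range(1, p + 1)]
    )
    y = series[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

# Simulate an AR(2) process with known parameters, then recover them
rng = np.random.default_rng(0)
true_params = np.array([0.3, 0.5, 0.2])   # intercept, phi_1, phi_2
y = np.zeros(5000)
for t in range(2, len(y)):
    y[t] = (true_params[0] + true_params[1] * y[t - 1]
            + true_params[2] * y[t - 2] + 0.1 * rng.standard_normal())
est = fit_ar(y, 2)   # estimates close to [0.3, 0.5, 0.2]
```

Raising `p` adds columns to the lag matrix; coefficients that come out near zero signal lags with little predictive value, echoing the caveat above.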

Distributed-Lag Model:

In time-series applications, we often use the distributed-lag model to assess the dynamic effects of a predictor x on a response variable y. Dynamic effects here refer to influence that occurs incrementally over time rather than all at once. In the simplest case of one explanatory variable x, the response y in time period t is specified as a linear combination of the x values in the same and previous periods.

y[t] = µ + ϕ₀x[t] + ϕ₁x[t-1] + ϕ₂x[t-2] + . . . + ε[t]

where

y[t] denotes the value of y at time period t

µ denotes the intercept of the model

x[t], x[t-1], x[t-2], . . . denote the x values at time periods t, t-1, t-2, . . . , with ϕ₀, ϕ₁,
ϕ₂, . . . representing the weights of these x values

ε[t] denotes a random variable that represents the unexplainable variation in y[t]; it
is typically assumed to follow a Gaussian distribution with zero-mean and constant variance

A distributed-lag model with infinite lags assumes that y is related to values of x that occurred arbitrarily far in the past. In cases where the influence of x on y diminishes to zero within a finite number of time periods, a finite number of lags in the model is sufficient.

A change in x can be temporary or permanent, and the two cases produce different behaviors in the response y. In the rest of this section, we use simulated data to illustrate these dynamic behaviors. The data are simulated from a simple distributed-lag model with two lag terms, as shown below.

y[t] = 1.2 + 0.5x[t] + 0.8x[t-1] + 0.3x[t-2] + ε[t]

This model has one explanatory variable x measured at t, t-1, and t-2, with parameters fixed at µ = 1.2, ϕ₀ = 0.5, ϕ₁ = 0.8, and ϕ₂ = 0.3. Let us assume that ε[t] = 0 so that we can focus on analyzing the contribution of x to the response y.

Temporary Change in x

Suppose the explanatory variable x increases temporarily from 0.0 in period t-1 to 1.0 in period t, and returns to 0.0 in subsequent periods. Examples of x and y are depicted by the blue and orange lines, respectively, in Figure 1.

From period t-1 to t, the temporary unit increase in x causes y to increase from its baseline of 1.2 to 1.7. This change equals ϕ₀, the rate of change of y with respect to x in the same time period, commonly known as the immediate effect or short-run effect. From period t-1 to t+1, y increases from 1.2 to 2.0, a change equal to ϕ₁. Similarly, from period t-1 to t+2, y increases by ϕ₂. We can also infer from this example that, if x changed by -1.0 from period t-1 to t, the effects on y would have the same magnitudes, given by the lag weights, but the opposite sign.

This example shows that the effect on y is not sustained when the change in x is temporary. In the absence of error, given an arbitrary temporary change in x from period t-1 to t, the resulting change in y from period t-1 to t+j for j ≥ 0 is proportional to the change in x. The proportionality constant is the lag weight associated with x[t-j] in the model, or zero if x[t-j] does not appear in the model.
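Since the figures are not reproduced here, the impulse dynamics can be sketched numerically. With ε[t] = 0 and the model's intercept of 1.2, the baseline level of y is 1.2, and a temporary unit impulse in x shifts y by ϕ₀, ϕ₁, and ϕ₂ in successive periods:

```python
import numpy as np

# Intercept and lag weights from the simulated model in the text
mu, phi = 1.2, np.array([0.5, 0.8, 0.3])

def dl_response(x):
    """Response of the two-lag distributed-lag model with epsilon = 0."""
    y = np.full(len(x), mu)
    for t in range(len(x)):
        for j, w in enumerate(phi):
            if t - j >= 0:
                y[t] += w * x[t - j]
    return y

# Temporary unit change: x is 1.0 in a single period only
x = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
print(dl_response(x))   # y rises to 1.7, then 2.0, then 1.5, then returns to 1.2
```

Each period's deviation from the baseline is exactly one lag weight, matching the short-run effects described above.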
Permanent Change in x

Consider when the explanatory variable x increases from 0.0 in period t-1 to 1.0 in period t and remains at 1.0 permanently. The change in y from period t-1 to t is, again, given by ϕ₀. However, since the change in x is permanent, y does not return to its original level in future periods. Figure 2 illustrates how y evolves due to a permanent unit change in x.

From period t-1 to t+1, the permanent unit increase in x causes y to increase from 1.2 to 2.5. This change equals ϕ₀+ϕ₁. From period t-1 to t+2, y increases from 1.2 to 2.8, a change equal to ϕ₀+ϕ₁+ϕ₂. In periods t+3 and beyond, y persists at 2.8 as x remains at 1.0. The sum of all the lag weights determines the eventual change in y; this is commonly referred to as the long-run effect on y in response to a permanent unit change in x.

This example shows that the effect on y persists when the change in x is permanent. In the absence of error, given an arbitrary permanent change in x beginning in period t, the resulting change in y from period t-1 to t+j for j ≥ 0 is proportional to the change in x. The proportionality constant is the sum of the lag weights associated with x[t], x[t-1], . . . , x[t-j] in the model.
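The permanent-change case can be sketched the same way; with ε[t] = 0 and the intercept of 1.2, y climbs as the partial sums of the lag weights accumulate and settles at the long-run level:

```python
import numpy as np

# Intercept and lag weights from the simulated model in the text
mu, phi = 1.2, np.array([0.5, 0.8, 0.3])

# Permanent unit change: x steps from 0.0 to 1.0 and stays there
x = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
y = np.array([
    mu + sum(w * x[t - j] for j, w in enumerate(phi) if t - j >= 0)
    for t in range(len(x))
])
# y climbs 1.2 -> 1.7 -> 2.5 -> 2.8 and then stays at 2.8
long_run = phi.sum()   # long-run effect: 0.5 + 0.8 + 0.3 = 1.6
```

The eventual level, 1.2 + 1.6 = 2.8, is the baseline plus the long-run effect, i.e., the sum of all the lag weights.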
Simultaneous Equation Model

The problem of identification:

The identification problem is a deductive, logical issue that must be solved before an economic model can be estimated. In a demand-and-supply model, the equilibrium point belongs to both curves, and many presumptive curves can be drawn through such a point. We need prior information on the slopes, intercepts, and error terms to distinguish the true demand and supply curves from the presumptive ones. Such prior information yields a set of structural equations. If the equations are linear and the error terms are normally distributed with zero mean and constant variance, a model is formed for estimation. A typical identification process may fix the demand curve and shift the supply curve, cutting the demand curve at many points so as to trace it out. By the zero-mean assumption on the error term, half the observations are expected above and half below the demand curve. The supply curve can be identified in the same way. This method originated with Ragnar Frisch (1938) and Trygve Haavelmo (1944). Tjalling Koopmans developed the order and rank conditions for identifying linear models (1949). Franklin Fisher's work was the first major textbook on the subject (1966), and Charles Manski extended it to the social sciences (1995).

The rank condition guarantees that the equations can be solved. Econometric texts often set up a spreadsheet to demonstrate it. For a demand-and-supply model in the variables Q, P, Y, and T, the columns are labeled Q, P, Y, T, and each row corresponds to one equation, with a 1 for an included variable and a 0 for an excluded one. The row vector for the demand function is [1, 1, 1, 0] and for the supply function [1, 1, 0, 1]. To check the order condition for the demand curve, first locate the zero in its vector, then pick up the corresponding entry in the supply vector. The picked-up entry, which is 1, should equal M − 1, where M is the number of equations; here M − 1 is also 1. With many equations, the picked-up entries form an array with many rows and columns. The general rank test requires finding M − 1 rows and M − 1 columns of that array whose elements are not all zeros, because such an (M − 1) × (M − 1) arrangement makes the model solvable.
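The spreadsheet check described above can be mechanized. The sketch below uses the 0/1 incidence rows as a stand-in for the structural coefficient matrix that a full rank test would require, so it mirrors the textbook spreadsheet rather than a complete treatment:

```python
import numpy as np

# Incidence spreadsheet from the text: columns Q, P, Y, T;
# 1 marks a variable included in an equation, 0 an excluded one.
equations = {
    "demand": np.array([1, 1, 1, 0]),   # excludes T
    "supply": np.array([1, 1, 0, 1]),   # excludes Y
}
M = len(equations)   # number of equations in the system

checks = {}
for name, row in equations.items():
    excluded = np.where(row == 0)[0]
    # Order condition: at least M - 1 variables excluded from the equation
    order_ok = len(excluded) >= M - 1
    # Rank condition (on the incidence entries): the other equations' entries
    # in the excluded columns must form an array of rank M - 1
    others = np.array([r for n, r in equations.items() if n != name])
    rank_ok = np.linalg.matrix_rank(others[:, excluded]) == M - 1
    checks[name] = (order_ok, rank_ok)

# Both equations satisfy both conditions, so each curve is identified
```

For the two-equation model, each "picked-up" submatrix is the 1 × 1 array [[1]], whose rank is M − 1 = 1, exactly as the text's hand calculation shows.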

Implications of the identification problem

The identification problem poses significant challenges in statistical modeling and causal
inference, with profound implications across various fields such as economics, social
sciences, and public health. At its core, the identification problem refers to the difficulty
in distinguishing between correlation and causation among variables. This challenge
arises when multiple factors influence an outcome, making it hard to determine which
variable directly affects another. As a result, researchers may draw erroneous conclusions
that can misguide theory development and empirical analysis.
One of the primary implications of the identification problem is its impact on causal
inference. In many cases, observed relationships between variables might be coincidental
rather than causal. For instance, a study might find that higher education levels correlate
with better health outcomes. Without proper identification, researchers might mistakenly
conclude that education directly improves health, overlooking potential confounding
factors such as socioeconomic status or access to healthcare. This misinterpretation can
lead to ineffective or misguided policies aimed at improving health through educational
initiatives, ultimately failing to address the root causes of health disparities.

The identification problem also affects data utilization in research. Researchers often
collect vast amounts of data, but if they cannot accurately identify the relationships
among variables, they may miss critical insights. Omitted variable bias, where
unobserved factors influence the dependent variable, can lead to incomplete or misleading
interpretations. This issue is particularly pronounced in observational studies, where
randomization is not feasible, making it essential for researchers to employ robust
methodologies to account for potential confounders.

Lastly, in complex systems characterized by feedback loops and intricate interactions, the identification problem becomes even more pronounced. In fields like ecology or social networks, changes in one variable can have cascading effects throughout the system. Without clear identification of these relationships, efforts to intervene or modify system behaviors can result in unintended consequences, further complicating the challenges faced by researchers and policymakers.

Rules of identification:

All the rules listed here assume that the errors are independent of the exogenous variables, and that all variables have expected value zero.

• Counting Rule: There should be at least as many identifying equations as parameter values. This applies to all models.

• Recursion Rule: The observed-variable model Y = βY + ΓX + ζ is identified if the model is recursive and V(ζ) = Ψ has a particular block-diagonal structure.

The two-indicator and three-indicator rules apply to the factor analysis model X = ΛF + e. These rules assume that all errors are uncorrelated and that each observed variable is caused by only one factor. If a model includes variables that are caused by more than one factor, it may be possible to add them to the model later using the Expansion Rule below.

• Three-Indicator Rule for Standardized Variables: For a factor analysis model with standardized observed variables (the classical case), the model is identified if
– The variance of each factor equals one,
– There are at least 3 variables with non-zero loadings per factor, and
– The sign of one non-zero loading is known.

• Three-Indicator Rule for Unstandardized Variables: If a factor causes at least three observed variables and
– The scale is fixed, meaning one factor loading equals one, and
– At least two additional observed variables have non-zero loadings,
then the variance of the factor, the variances of the error terms, and the factor loadings are all identified.

• Two-Indicator Rule for Unstandardized Variables: If a factor causes two observed variables and
– The scale is fixed, meaning one factor loading equals one,
– The other factor loading is non-zero,
– The model contains at least one other factor having a non-zero correlation with this factor, and
– The scale of that other factor is fixed,
then the variance of the factor, the variances of the error terms, and the one free factor loading are all identified.

• Double Measurement Rule: The double measurement model is identified. Correlated measurement errors are allowed within sets of measurements, but not between sets.

• Combination Rule: Suppose that in a factor analysis model for unstandardized variables, some factors follow the Three-Indicator Rule, others follow the Two-Indicator Rule, and still others follow the Double Measurement Rule. In this case, the model parameters are identified in stages.
– Apply the Double Measurement Rule to all relevant factors, identifying their variances and covariances, as well as the variances and possibly covariances of the sets of error terms.
– Apply the Two- and Three-Indicator Rules to each remaining factor, one at a time. This identifies the factor loadings and the variances of the factors.
– If the scales are fixed for two factors, their covariance is identified provided the error terms of the two measurements are independent.

The last two rules apply to general structural equation models (such as the LISREL model) that have both latent and observed variables.

• Two-Step Rule:
1. Consider the latent-variable model as a model for observed variables, and check its identification (usually using the Counting Rule and the Recursion Rule).
2. Consider the measurement model as a factor analysis model, ignoring the structure of V(F), and check its identification.
If both identification checks succeed, the whole model is identified.

• Expansion Rule: Suppose the measurement component of a structural equation model is identified. A vector of observed variables may be added to the latent component of the model without losing identification of the measurement component, provided
– The additional variables are independent of the error terms in the original measurement model, and
– In the original measurement model, each latent variable has at least one observed variable that is a function of that latent variable and of no other latent variable, apart from error terms. This is automatically true if the rules above are used to establish identification of the measurement model.
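The Three-Indicator Rule for Unstandardized Variables can be checked numerically. The sketch below uses hypothetical parameter values (assumed only for illustration) to build the covariance matrix implied by a one-factor, three-indicator model with the first loading fixed at one, then recovers every parameter from the covariances alone:

```python
import numpy as np

# Hypothetical true parameter values, assumed only for illustration
psi = 2.0                          # factor variance
lam = np.array([1.0, 0.7, 1.3])    # loadings; lam[0] = 1 fixes the scale
theta = np.array([0.5, 0.4, 0.6])  # error variances

# Population covariance matrix implied by the model X = lam * F + e
Sigma = psi * np.outer(lam, lam) + np.diag(theta)

# Recover every parameter from Sigma alone, demonstrating identification:
# Sigma[i, j] = lam[i] * lam[j] * psi for i != j, so ratios isolate each term
psi_hat = Sigma[0, 1] * Sigma[0, 2] / Sigma[1, 2]
lam_hat = np.array([1.0, Sigma[1, 2] / Sigma[0, 2], Sigma[1, 2] / Sigma[0, 1]])
theta_hat = np.diag(Sigma) - lam_hat**2 * psi_hat
```

Because the six free parameters are exactly recoverable from the six distinct entries of the 3 × 3 covariance matrix, the rule's conclusion holds for this configuration.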
