
Specification Error

One form of specification error occurs when an irrelevant variable is included in a regression model. This reduces the model's precision by making the variances of the coefficient estimates larger than necessary. While the coefficient estimates remain unbiased and consistent, confidence intervals will be wider than necessary. Including irrelevant variables does not invalidate inferences, but the resulting estimators are only linear unbiased ("LUE") rather than best linear unbiased ("BLUE"). The classical linear regression assumption that the model is correctly specified is violated if irrelevant variables are included.

Uploaded by

Sarita Arora

SPECIFICATION ERROR: INCLUSION OF IRRELEVANT VARIABLE
ECONOMETRICS
In regression analysis, specification is the process of developing a regression model. This process consists of
selecting an appropriate functional form for the model and choosing which variables to include. Specifying the
model is the first step of any regression analysis. If an estimated model is misspecified, its estimates may be
biased and inconsistent, depending on the kind of error. Many specification errors show up as correlation between
an independent variable and the error term. There are several different causes of specification error:
1. incorrect functional form;
2. a variable omitted from the model may have a relationship with both the dependent variable and one or
more of the independent variables (omitted-variable bias);
3. an irrelevant variable may be included in the model;
4. the dependent variable may be part of a system of simultaneous equations (simultaneity bias);
5. measurement errors may affect the independent variables.

One of the assumptions of the classical linear regression model (CLRM), Assumption 9, is that the regression
model used in the analysis is “correctly” specified. If the model is not “correctly” specified, we encounter
the problem of model specification error or model specification bias.
WHAT IS A SPECIFICATION ERROR?
• The specification of a linear regression model consists of a formulation of the
regression relationships and of statements or assumptions concerning the
explanatory variables and disturbances. If any of these is violated, e.g., through an
incorrect functional form or the improper introduction of the disturbance term in the
model, then a specification error occurs. In a narrower sense, specification error
refers to the choice of explanatory variables.
• The complete regression analysis depends on the explanatory variables present in
the model. It is assumed in regression analysis that only correct and
important explanatory variables appear in the model. In practice, after ensuring the
correct functional form of the model, the analyst usually has a pool of explanatory
variables which possibly influence the process or experiment. Generally, not all such
candidate variables are used in the regression model; instead, a subset of
explanatory variables is chosen from this pool.
When choosing a subset of explanatory variables, there are two possible options:
1. To make the model as realistic as possible, the analyst may include as many
explanatory variables as possible.
2. To make the model as simple as possible, the analyst may include fewer explanatory
variables.
Such selections can lead to two types of incorrect model specification:
1. Omission/exclusion of relevant variables.
2. Inclusion of irrelevant variables.
WHAT IS MEANT BY INCLUSION OF IRRELEVANT VARIABLE?
Sometimes, out of enthusiasm and a wish to make the model more realistic, the analyst may
include some explanatory variables that are not very relevant to the model. Such
variables contribute very little to the explanatory power of the model. Each of them
reduces the degrees of freedom (n-k), so inference becomes less precise, and measures of
fit can mislead. For example, the coefficient of determination (R²) never decreases when a
variable is added, which may suggest that the model is improving when it is not.
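The point about the coefficient of determination can be checked numerically. The following is a minimal sketch on simulated data, not part of the original slides: the helper name `r_squared`, the sample size, and the coefficient values are all assumptions chosen for illustration. It fits the same response with and without an irrelevant regressor and reports both R² and the adjusted R², which penalizes the lost degree of freedom.

```python
import numpy as np

def r_squared(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit of y on X.

    X must already contain the intercept column; k counts all columns.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    rss = float(resid @ resid)                  # residual sum of squares
    tss = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return r2, adj_r2

# Simulated data (assumption): y depends on x2 only, x3 is irrelevant.
rng = np.random.default_rng(0)
n = 50
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
y = 1.0 + 2.0 * x2 + rng.normal(size=n)

X_true = np.column_stack([np.ones(n), x2])
X_over = np.column_stack([np.ones(n), x2, x3])

r2_true, adj_true = r_squared(X_true, y)
r2_over, adj_over = r_squared(X_over, y)

print(f"R^2      true model: {r2_true:.4f}   with x3: {r2_over:.4f}")
print(f"adj. R^2 true model: {adj_true:.4f}   with x3: {adj_over:.4f}")
```

R² for the larger model can never be below that of the smaller one, because the smaller model is a special case of the larger. The adjusted R² may move in either direction, which is why it is often preferred when comparing models with different numbers of regressors.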
What is an irrelevant variable?
There are several reasons a regression variable can be considered irrelevant or
superfluous. Here are some ways to characterize such variables:
• A variable that is unable to explain any of the variance in the response variable (y) of
the model.
• A variable whose regression coefficient (β_m) is statistically insignificant (i.e., not
distinguishable from zero) at some specified α level.
• A variable that is highly correlated with the rest of the regression variables in the
model. Since the other variables are already included in the model, a variable that is
highly correlated with them adds little new information.
Now let us assume that
Y_i = B_1 + B_2 X_2i + u_i
is the truth, but we fit the following model:
Y_i = A_1 + A_2 X_2i + A_3 X_3i + u_i
and thus commit the specification error of including an unnecessary variable (X_3) in the
model.
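One way to see whether the extra variable in the fitted model is pulling its weight is to look at the t-statistic of its coefficient. The sketch below is illustrative only: the helper `ols_with_t_stats`, the simulated data, and the coefficient values are assumptions, not taken from the slides. It fits the three-parameter model to data generated from the two-parameter truth and prints each estimate with its t-statistic.

```python
import numpy as np

def ols_with_t_stats(X, y):
    """OLS coefficients and t-statistics; X must already contain an intercept column."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2_hat = resid @ resid / (n - k)          # unbiased estimate of the error variance
    std_err = np.sqrt(sigma2_hat * np.diag(xtx_inv))
    return coef, coef / std_err

# Simulated data (assumption): the truth uses x2 only, x3 plays no role.
rng = np.random.default_rng(1)
n = 200
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
y = 1.0 + 2.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])
coef, t_stats = ols_with_t_stats(X, y)

for name, c, t in zip(["A1", "A2", "A3"], coef, t_stats):
    print(f"{name}: estimate = {c: .3f}, t = {t: .2f}")
```

As a rough large-sample rule of thumb, |t| below about 2 means the coefficient is not distinguishable from zero at the 5% level. Note that an irrelevant variable will still exceed that threshold by chance in roughly 5% of samples, so a single t-statistic is evidence rather than proof.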
CONSEQUENCES
The consequences of this specification error are as follows:
1. The OLS estimators of the parameters of the “incorrect” model are all unbiased and consistent.
2. The error variance σ² is correctly estimated.
3. The usual confidence interval and hypothesis-testing procedures remain valid.
4. However, the estimated A's will generally be inefficient, that is, their variances will generally be
larger than those of the estimated B's of the true model.
5. Hence the OLS estimators are only linear unbiased estimators (“LUE”) rather than best linear
unbiased estimators (“BLUE”).
The implication of this finding is that the inclusion of the unnecessary variable X_3 makes the
variances of the estimators larger than necessary, thereby making the estimation less precise.
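The consequences listed above can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions, not the author's method: the function `simulate`, the true values (intercept 1, slope 2, unit error variance), and the correlation of 0.8 between X2 and X3 are all hypothetical choices. The regressors are held fixed and only the disturbance is redrawn in each replication.

```python
import numpy as np

def simulate(n=40, reps=2000, rho=0.8, seed=0):
    """Estimate the slope on x2 with and without an irrelevant x3, over many samples."""
    rng = np.random.default_rng(seed)
    # Regressors are drawn once and kept fixed; x3 is correlated with x2 but irrelevant.
    x2 = rng.normal(size=n)
    x3 = rho * x2 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    X_true = np.column_stack([np.ones(n), x2])
    X_over = np.column_stack([np.ones(n), x2, x3])

    b2 = np.empty(reps)   # slope on x2 in the true model
    a2 = np.empty(reps)   # slope on x2 in the over-specified model
    a3 = np.empty(reps)   # slope on the irrelevant x3
    for r in range(reps):
        y = 1.0 + 2.0 * x2 + rng.normal(size=n)   # true model: x3 does not enter
        b = np.linalg.lstsq(X_true, y, rcond=None)[0]
        a = np.linalg.lstsq(X_over, y, rcond=None)[0]
        b2[r], a2[r], a3[r] = b[1], a[1], a[2]
    return b2, a2, a3

b2, a2, a3 = simulate()
print(f"mean of B2 estimates: {b2.mean():.3f}   variance: {b2.var():.4f}")
print(f"mean of A2 estimates: {a2.mean():.3f}   variance: {a2.var():.4f}")
print(f"mean of A3 estimates: {a3.mean():.3f}")
```

Both slope estimates should average close to the true value of 2 and the A3 estimates close to 0 (unbiasedness), while the variance of the A2 estimates should be visibly larger than that of the B2 estimates (inefficiency). The gap grows with the correlation between X2 and X3, so setting `rho=0.0` should shrink it to almost nothing.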
CONCLUSION
The assumption of the CLRM that the econometric model used in the analysis is
correctly specified has two meanings. One, there are no equation specification errors,
and two, there are no model specification errors.
The consequences of including irrelevant variables in the model are fortunately less
serious than those of omitting relevant ones: the estimators of the coefficients of the
relevant as well as the “irrelevant” variables remain unbiased as well as consistent, and
the error variance σ² remains correctly estimated. The only problem is that the estimated
variances tend to be larger than necessary, thereby making for a less precise estimation
of the parameters. That is, the confidence intervals tend to be wider than necessary.
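For the two-regressor case used in these slides, the size of the precision loss can be written down explicitly. The expressions below are the standard textbook results for OLS slope variances. The hats, the lowercase x_2i (X_2 measured as a deviation from its sample mean), and r_23 (the sample correlation between X_2 and X_3) are notation introduced here, not symbols from the original.

```latex
\operatorname{var}(\hat{B}_2) = \frac{\sigma^2}{\sum x_{2i}^2},
\qquad
\operatorname{var}(\hat{A}_2) = \frac{\sigma^2}{\sum x_{2i}^2 \,\bigl(1 - r_{23}^2\bigr)},
\qquad
\frac{\operatorname{var}(\hat{A}_2)}{\operatorname{var}(\hat{B}_2)} = \frac{1}{1 - r_{23}^2} \;\ge\; 1 .
```

The ratio equals 1 only when the irrelevant variable is uncorrelated with X_2 in the sample. With r_23 = 0.8, for instance, the variance is inflated by a factor of 1/0.36, or about 2.8, so the confidence interval for the slope is roughly 1.7 times as wide.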
