CH 4 - Problems
• Inferences made on the basis of OLS estimation results are valid only so long as the assumptions of the classical linear regression model (CLRM) hold.
1. Heteroskedasticity
Heteroscedasticity is likely to be a problem
when the values of the variables in the
regression equation vary substantially in
different observations.
Causes of Heteroscedasticity
• Non-homogeneous sample: when we consider the product sales of firms, the sales of large firms are usually more volatile than those of small firms. Similarly, consumption expenditure studies show that the consumption expenditure of high-income people is more volatile than that of low-income individuals.
• Outliers: observations situated away or detached from the main body of the data.
• Functional specification error (omitted variable): in such a case the residuals obtained from the regression may give a distinct impression that the error variances are not constant. E.g., consider economies with different economic backgrounds. If we want to estimate the effect of GDP on value added by the manufacturing sector, the variation in value added for a small change in GDP tends to be relatively high for big economies and low for small economies. Hence, the economic background of the countries should be included in the regression.
• Skewness: the distribution of some variables, such as income and wealth, is skewed.
• Incorrect data transformation: in Stata, the ladder and gladder commands (e.g. ladder cons, gladder cons) can suggest an appropriate transformation.
What are the consequences of heteroscedasticity?
• Heteroscedasticity by itself does not cause the OLS estimators to be biased or inconsistent, since neither unbiasedness nor consistency depends on the covariance matrix of the error term. However, the usual OLS standard errors are no longer valid, so inference based on them can be misleading.
• A graph of the residuals against the fitted values of the dependent variable gives a rough indication of the existence of heteroscedasticity. The Stata command to detect heteroscedasticity graphically is rvfplot, yline(0).
• If there appears to be a systematic trend in the graph, it may be an indication of heteroscedasticity.
• First run the OLS regression, obtain the fitted values and residuals, then plot the residuals against the fitted values of the dependent variable and examine the spread (i.e., the variance) of the residuals.
If the spread of the residuals shows a systematic trend with the fitted values, heteroskedasticity exists.
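A minimal Stata sketch of this graphical check (the variable names cons and hhsize are assumptions, used only for illustration):
regress cons hhsize
predict yhat, xb          // fitted values
predict resid, residuals  // OLS residuals
scatter resid yhat        // plot the residuals against the fitted values
rvfplot, yline(0)         // equivalent built-in residual-versus-fitted plot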
However, this is not adequate and we need to conduct further tests. There are
many other formal methods of detecting the problem. They are outlined
below.
Breusch-Pagan test
1. Run the regression  yi = β1 + β2X2i + … + βkXki + ui  and obtain the residuals ûi.
2. Then run the auxiliary regression  ûi² = δ1 + δ2X2i + … + δkXki + ei  and obtain the R² of this auxiliary regression (call it R²aux).
3. Form  F = [R²aux/(k - 1)] / [(1 - R²aux)/(n - k)]  or  LM = n·R²aux.
In this case, H0: There is homoscedasticity (there is no problem of
heteroscedasticity).
Compare the calculated value of F (or LM) with the tabulated critical value. If the calculated value exceeds the critical value (or if the p-value < 0.05 at the 5% level of significance), we reject the null hypothesis of homoskedasticity and conclude that there is a problem of heteroscedasticity.
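In Stata the test can be run with the built-in post-estimation command, or by the auxiliary regression described above. A minimal sketch, assuming the regression of cons on ageh, sexh and hhsize used elsewhere in this chapter:
regress cons ageh sexh hhsize
estat hettest, rhs                 // Breusch-Pagan test, H0: constant variance (rhs = use the regressors)
* manual version of the auxiliary regression:
predict uhat, residuals
gen uhat2 = uhat^2
regress uhat2 ageh sexh hhsize     // LM = n times the R-squared of this regression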
a. Assume a model of the following form: Yi = α + β1X1i + β2X2i + β3X3i + ui, which has a heteroskedastic disturbance. Assume that the heteroskedastic disturbance is generated by the following equation: Var(ui) = σi² = kX2i. Perform the transformation necessary to make the model homoscedastic.
b. What if E(εi²) = σi² = a + bXi + cXi²?
Note: Summary of Stata commands
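The summary table itself is not reproduced here; as a rough recap, the Stata commands referred to in this chapter include:
• rvfplot, yline(0): residual-versus-fitted plot (graphical check for heteroscedasticity)
• estat hettest: Breusch-Pagan test for heteroscedasticity (listed here as the standard post-estimation command; it does not appear in the slides above)
• ladder / gladder: diagnostics for choosing a data transformation
• estat dwatson: Durbin-Watson test for autocorrelation
• estat durbinalt: Durbin's alternative test for autocorrelation
• estat bgodfrey: Breusch-Godfrey LM test for autocorrelation
• acprplot: augmented component-plus-residual plot (check for nonlinearity)
• linktest: specification (link) test
• estat ovtest: Ramsey RESET test for omitted variables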
2. Autocorrelation
• One of the assumptions of the OLS is that successive values of
the error terms are independent or are not related.
cov(ui, uj) = E(ui·uj) = 0   for i ≠ j
Sources of Autocorrelation
2. Misspecification of the mathematical form of the
model:
• If we have adopted a mathematical form which differs
from the true form of the relationship, then the
disturbance term may show serial correlation.
• For example, if the true relationship is nonlinear (say, quadratic in X) but we estimate a linear model, the omitted curvature shows up in the residuals as serial correlation.
3. Lags
• If lagged values of the dependent variable or of the independent variables should be included in the model but we overlook them, the regression will suffer from serial correlation of the error terms.
4. Inertia
The momentum built into economic data tends to continue until something happens to change it. Thus successive observations in time series data are likely to be interdependent.
5. Manipulation of Data
• This involves averaging, interpolation or extrapolation of data.
In empirical analysis, the raw data are often "manipulated." For example,
in time series regressions involving quarterly data, such data are usually
derived from the monthly data by simply adding three monthly
observations and dividing the sum by 3. This averaging introduces
smoothness into the data by dampening the fluctuations in the monthly
data.
Tests for detecting autocorrelation:
i) Asymptotic test
ii) Durbin-Watson test
i) Asymptotic test of autocorrelation
• The OLS residuals from the regression Y = XB + u provide useful
information about the possible presence of serial correlation in
the equation’s error term.
• An intuitively appealing starting point is to consider the regression of the OLS residual ût upon its lag ût−1, that is, ût = ρût−1 + vt.
• Under the null hypothesis of no autocorrelation (ρ = 0), the estimate ρ̂ should be close to zero, and an asymptotic test can be based on the result that, in large samples, √n·ρ̂ is approximately standard normal.
• The Durbin-Watson statistic satisfies dw ≈ 2(1 - ρ̂). Consequently, a value of dw close to 2 indicates that the first-order autocorrelation coefficient is close to zero. If dw is much smaller than 2, this is an indication of positive autocorrelation (ρ > 0); if dw is much larger than 2, then ρ < 0.
• Since the distribution of dw depends not only upon the sample size n and the number of regressors k, but also upon the actual values of the Xs, the critical values cannot be tabulated for general use.
• Fortunately, it is possible to compute upper and lower limits for the critical values of dw that depend only upon the sample size n and the number of regressors k. These values, dL and dU, were tabulated by Durbin and Watson (1950) and by Savin and White (1977).
• The Durbin-Watson statistic has a range from 0 to 4, with a midpoint of 2. (In Stata: estat dwatson)
• The ranges for dw, and the corresponding decisions, are:
Decision
• 0 ≤ dw < dL: reject H0; evidence of positive autocorrelation
• dL ≤ dw ≤ dU: inconclusive
• dU < dw < 4 - dU: do not reject H0; no evidence of first-order autocorrelation
• 4 - dU ≤ dw ≤ 4 - dL: inconclusive
• 4 - dL < dw ≤ 4: reject H0; evidence of negative autocorrelation
Other alternative tests of autocorrelation
using STATA
• estat durbinalt
(This is Durbin's alternative test for autocorrelation)
• estat bgodfrey
(Breusch-Godfrey LM test for autocorrelation)
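A minimal sketch of how these tests are run in Stata (the time variable year and the variables y, x1, x2 are assumptions used only for illustration):
tsset year            // declare the data as time series
regress y x1 x2
estat dwatson         // Durbin-Watson d statistic
estat durbinalt       // Durbin's alternative test
estat bgodfrey        // Breusch-Godfrey LM test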
Remedies to correct autocorrelation
- Use of heteroskedasticity-and-autocorrelation-consistent (HAC) standard errors for OLS (regression with robust standard errors).
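A minimal sketch, assuming hypothetical variables y, x1, x2 in tsset time-series data and an arbitrary lag length:
regress y x1 x2, vce(robust)   // heteroskedasticity-robust (White) standard errors
newey y x1 x2, lag(4)          // Newey-West HAC standard errors; the lag length is the user's choice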
Correction of functional specification
• In many cases the finding of autocorrelation is an
indication that the model is mis-specified.
• If this is the case, the most natural route is not to change
your estimator (from OLS to EGLS) but to change your
model.
• Typically, three (interrelated) types of misspecification
may lead to a finding of autocorrelation in your OLS
residuals:
- dynamic misspecification,
- omitted variables, and
- functional form misspecification.
Summary questions
• What is meant by autocorrelation?
• What are the sources of autocorrelation?
• What are the consequences of autocorrelation?
• How can we detect autocorrelation?
• What are the remedy measures to correct autocorrelation?
3. Multicollinearity
One of the assumptions of CLRM says that there is no exact linear
relationship between any of the explanatory variables included in
the regression analysis.
•When this assumption is violated we speak of Multicollinearity.
• In practice, in empirical research we encounter a moderate to high degree of MC. (Read Gujarati, starting from p. 341.)
• If there is perfect multicollinearity, the matrix of regressors will not have full rank (read about the rank of matrices).
• The rank of a matrix is simply the number of linearly independent row/column vectors in that matrix.
• If the number of linearly independent row/column vectors of a matrix is equal to the number of rows/columns, we say the matrix has full rank.
The following is an example of presence of perfect MC.
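The original example is not reproduced here; the following is a minimal Stata sketch, with hypothetical variables y, x1, x2, of how perfect MC shows up:
gen x3 = 2*x1 + 3*x2      // x3 is an exact linear combination of x1 and x2
regress y x1 x2 x3        // Stata drops x3, reporting it as omitted because of collinearity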
Sources of multicollinearity
1. Problem with data collection method: sampling over a limited range of the
values taken by the regressors in the population. (specially on discrete variables
with limited values). Small sample can also limit the range of the values even if
the variables are continuous.
2. Constraints on the model or on the population sampled: physical constraints in the population can cause multicollinearity, in the sense that, by their very nature, income and house size are highly related variables: families with higher incomes generally have larger homes than families with lower incomes.
5. Time series data: In most cases, the regressors used in time series
data share a common trend, that is, they all increase or decrease
over time together.
Consequences of Multicollinearity
• The presence of multicollinearity has a number of potentially
serious effects on the least-squares estimates of the regression
coefficients.
• Some of these effects may be easily demonstrated.
• To illustrate the effect of multicollinearity on the OLS estimator in
more detail, consider the following example.
• Let the following regression model be estimated: yi = β0 + β1x1i + β2x2i + ui
• where it is assumed that the sample means of x1 and x2 are zero. Moreover, assume that the sample variances of x1 and x2 are equal to 1, while their sample covariance (correlation coefficient) is r12.
• Under these assumptions, the variances of the OLS slope estimators are
Var(b1) = Var(b2) = σ² / [n(1 - r12²)]
• This shows that when there is strong multicollinearity between x1 and x2, the correlation coefficient r12 is close to 1, and the variances of the estimated coefficients become very large.
Detection of Multicollinearity
• One method of detecting multicollinearity is to estimate the correlation coefficients among the explanatory variables.
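A minimal sketch, assuming the regressors ageh, sexh and hhsize used elsewhere in this chapter:
correlate ageh sexh hhsize       // pairwise correlation coefficients among the regressors
pwcorr ageh sexh hhsize, sig     // the same correlations with significance levels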
3. By transforming the functional relationship
E.g., estimation with the first differences of the variables (in the case of time series analysis), as sketched below.
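A minimal sketch of the first-difference transformation in Stata (the time variable year and the variables y, x1, x2 are assumptions):
tsset year
regress D.y D.x1 D.x2     // estimate the model in first differences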
4) Problem of Model Specification
Is it a linear model?
Should the dependent or independent variable be transformed?
Is there any important variable omitted?
Is there an irrelevant variable included in the model?
Is the model estimated with errors in the variables?
Is there a structural break in the regression?
[Figure: fitted values and a lowess smoother plotted against household size (0–20); vertical axis 0–4000]
• In this case, the result shows that the fitted values are almost aligned with the lowess smoother. Hence, we see a linear relationship.
• Checking the linearity assumption is not so straightforward in the case
of multiple regression.
• In this case, the most straightforward thing to do is to plot the
standardized residuals against each of the predictor variables in the
regression model.
• If there is a clear nonlinear pattern, there is a problem of nonlinearity.
• To do so; we first run the regression (eg. let us say reg cons ageh sexh
hhsize)
• Then, find the residual for each observation (predict resid, residuals).
• Then use the command for each regressor:
• scatter resid ageh
• scatter resid sexh (for this variable, we do not have a continuous relationship)
• scatter resid hhsize
• scatter resid ageh
[Figure: residuals (about −1000 to 3000) plotted against age of household head (20–100)]
• scatter resid sexh
[Figure: residuals (about −1000 to 3000) plotted against sex of household head (0–1)]
• scatter resid hhsize
[Figure: residuals (about −1000 to 3000) plotted against household size (0–20)]
• Another command for detecting non-linearity is acprplot.
• acprplot graphs an augmented component-plus-residual plot,
a.k.a. augmented partial residual plot.
• It can be used to identify nonlinearities in the data.
acprplot ageh, lowess lsopts(bwidth(1))
[Figure: augmented component-plus-residual plot against age of household head (20–100)]
acprplot hhsize, lowess lsopts(bwidth(1))
[Figure: augmented component-plus-residual plot against household size (0–20)]
B) Test for functional specification problem in the
regressors
• Since any function can be approximated by a polynomial of sufficiently high order, a polynomial can mimic any non-linear function (Taylor series expansion: Y = f(x) = a + bx + cx² + dx³ + …).
• In practice we do not usually go beyond two terms. Also, instead of using powers of X, we use powers of Ŷ, which of course is a function of the Xs.
• Ramsey’s RESET test procedure is:
• 1. Estimate Y = b1 + b2 X2 + ... + bk Xk + u
• 2. Obtain fitted values
• 3. Estimate Y = b'1 + b'2 X2 + ... + b'k Xk + c Ŷ² + u'
• 4. Test for the significance of c. If it is significantly different from 0, reject the linear specification. The alternative hypothesis is not clear-cut, though.
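A minimal Stata sketch of this procedure, assuming the regression of cons on ageh, sexh and hhsize:
regress cons ageh sexh hhsize
predict yhat, xb                      // step 2: fitted values
gen yhat2 = yhat^2
regress cons ageh sexh hhsize yhat2   // step 3: add the squared fitted values
test yhat2                            // step 4: H0: coefficient on yhat2 equals zero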
• The linktest command can also perform test of a model specification
problem.
• linktest creates two new variables: the prediction, _hat, and the squared prediction, _hatsq.
• The model is then refit using these two variables as predictors.
• The linktest procedure is:
1. Estimate Y = b1 + b2 X2 + ... + bk Xk + u
2. Obtain the fitted values and their squares (_hat and _hatsq)
3. Estimate Y = b'1 + b'2 _hat + b'3 _hatsq + u'
4. Test for the significance of b'3 (the coefficient on _hatsq); if it is significant, there is evidence of a specification problem.
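A minimal usage sketch; the output below is of the form linktest produces after a regression with cons as the dependent variable (the exact regressors are not shown in the original, so ageh, sexh and hhsize are an assumption):
regress cons ageh sexh hhsize
linktest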
------------------------------------------------------------------------------
cons | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_hat | 1.842698 .6423948 2.87 0.004 .5825729 3.102824
_hatsq | -.0007843 .0005914 -1.33 0.185 -.0019444 .0003757
_cons | -215.7849 169.9994 -1.27 0.205 -549.2568 117.687
------------------------------------------------------------------------------
Here _hatsq is not statistically significant (p = 0.185), so there is no evidence of a specification error in this model.
• C) Problem of omitted variable or inclusion of
irrelevant variable
• A model specification error can occur when one or more relevant
variables are omitted from the model or one or more irrelevant
variables are included in the model.
i. Problem of Omitted Variable
True model: Yt = β0 + β1X1 + β2X2 + e
• But we estimate the model: Yt = β0 + β1X1 + e*
where: e* = β2X2 + e
• Clearly this model violates our assumption that the error has zero mean: E(e*) = E(β2X2 + e) is not 0 in general.
• Under these circumstances, what does our
estimate of b1 reflect?
• Recall that:
b1 = (X1'X1)⁻¹ X1'y
b1 = (X1'X1)⁻¹ X1'(X1β1 + X2β2 + e)
b1 = β1 + (X1'X1)⁻¹ X1'X2β2 + (X1'X1)⁻¹ X1'e
• We retain the assumption that E(e)=0
• That is, the expectation of the errors is 0 after we
account for the impact of X2
• Thus our estimate of b1 is:
E(b1) = β1 + (X1'X1)⁻¹ X1'X2β2
• This equation indicates that our estimate of b1 will be
biased by two factors.
- The extent of the correlation between X1 and X2, and
- The extent of X2's impact on Y.
• It is often difficult to know the direction of the bias.
• If either of these sources of bias is 0, then the
overall bias is 0
• Omitted variable bias can be detected using Ramsey regression
specification error test (RESET) for omitted variables.
• The syntax for this is estat ovtest or ovtest
• This test amounts to fitting y = xb + zt + u and then testing t = 0, where powers of the fitted values are used for z. The test is based on an F-statistic.
where: z = a matrix of ŷ², ŷ³ and ŷ⁴
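A minimal usage sketch, again assuming the regression of cons on ageh, sexh and hhsize:
regress cons ageh sexh hhsize
estat ovtest                  // Ramsey RESET test using powers of the fitted values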
ii) Problem of Including Irrelevant Variables
• Two possibilities:
1. If Cov(X, ε) = 0, then ε must be correlated with X*; OLS is unbiased, but we have a larger error variance:
V[(u − β1ε) | X] = V(u | X) + β1² V(ε | X) = σ² + β1² σε²