Prac 12-Model Selection

Uploaded by

lucastone325

PRACTICAL EXERCISE 12: MODEL SELECTION

1. Start a log file in your folder (call it prac12.log)

Download the datasets model select 1.dta and model select 2.dta from
the Moodle site for the course and save them to a convenient location. Open Stata
and then open the dataset model select 1.dta.
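
The setup above can be sketched as follows. This is only a sketch: it assumes the working directory is already the folder holding the downloaded files, and that the file name contains spaces exactly as written above (hence the quotes).

```stata
* Close any log left open from an earlier session (no error if none is open)
capture log close
* Start the log for this practical as a plain-text file
log using prac12.log, replace text
* Open the first dataset; quotes are needed because the name contains spaces
use "model select 1.dta", clear
* List the variables to confirm that y, x2 and x3 are present
describe
```

If the files are saved elsewhere, change directory first with cd, or give the full path in the use command.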

2. The dataset gives data on the real gross domestic product (y), labour input (x2), and
real capital input (x3) in the manufacturing sector for a developing country for the
years 1958 to 1972. Suppose that the theoretically correct production function that
we can estimate using this data, is of the Cobb-Douglas type. Our model can be
specified as follows:

$\ln Y_t = B_1 + B_2 \ln X_{2t} + B_3 \ln X_{3t} + u_t$

where ln = the natural log.

3. Generate logged values of y, x2 and x3. Type:


gen lnY = log(y)
gen lnX2 = log(x2)
gen lnX3 = log(x3)
4. Using regression, estimate the Cobb-Douglas production function for this country for
the sample period and interpret the results.
reg lnY lnX2 lnX3
5. Now suppose that capital data (i.e. X3) were not initially available and therefore you
estimated the following production function:

$\ln Y_t = A_1 + A_2 \ln X_{2t} + v_t$

where $v_t$ = error term.

Run the above regression and examine the consequences by referring to the Note
on Omitted Variable Bias uploaded on Moodle.

reg lnY lnX2

What difference(s) do you note with regard to the estimated coefficient values (i.e.
elasticity values), the standard errors and the $R^2$ values?
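
One possible way to make this comparison easier is to store both sets of estimates and tabulate them side by side. The sketch below assumes lnY, lnX2 and lnX3 from step 3 are in memory; the names m_full and m_nocap are arbitrary labels chosen here for illustration.

```stata
* Correctly specified model (labour and capital)
quietly regress lnY lnX2 lnX3
estimates store m_full
* Misspecified model (capital omitted)
quietly regress lnY lnX2
estimates store m_nocap
* Coefficients, standard errors, R-squared and sample size in one table
estimates table m_full m_nocap, b(%9.4f) se(%9.4f) stats(r2 N)
```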

Prac 12 – Model Selection Page 1 of 4


6. To estimate the extent of the omitted-variable bias in the above regression and
assess whether it is upward or downward, regress $\ln X_3$ on $\ln X_2$ (refer to the “Note on
Omitted Variable Bias”). What is the value of the slope coefficient $b_{32}$?

Using this value and the equation $E(a_2) = B_2 + B_3 b_{32}$, calculate the biased estimate of
the output-labour elasticity. Does this estimate concur with that obtained in the
misspecified model (in 5) above? Also, what does the product of $B_3$ and $b_{32}$
indicate?
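
The calculation in this step could be carried out along the following lines. It is a sketch that assumes lnY, lnX2 and lnX3 from step 3 are in memory; the scalar names (B2hat, B3hat, b32, a2) are chosen here for illustration, and the estimated coefficients stand in for the unknown B2 and B3.

```stata
* Correctly specified model: keep the estimates of B2 and B3
quietly regress lnY lnX2 lnX3
scalar B2hat = _b[lnX2]
scalar B3hat = _b[lnX3]
* Auxiliary regression of the omitted variable on the included one
quietly regress lnX3 lnX2
scalar b32 = _b[lnX2]
* Misspecified model: labour elasticity when capital is left out
quietly regress lnY lnX2
scalar a2 = _b[lnX2]
* Compare the implied value with the coefficient actually estimated
display "b32              = " scalar(b32)
display "B2 + B3*b32      = " scalar(B2hat) + scalar(B3hat)*scalar(b32)
display "a2 (short model) = " scalar(a2)
display "B3*b32           = " scalar(B3hat)*scalar(b32)
```

With ordinary least squares and the same sample in all three regressions, the second and third displayed values should agree up to rounding, so any visible gap suggests the regressions were run on different observations.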

7. Now suppose that data on labour (i.e. X2) were not initially available and therefore
you estimated the following production function:

$\ln Y_t = A_1 + A_3 \ln X_{3t} + w_t$

where $w_t$ = error term.
Run the above regression and again examine the consequences. What difference(s)
do you note with regard to the estimated coefficient values (i.e. elasticity values), the
standard errors and the $R^2$ values?

8. Repeat step (6) above, but this time regress $\ln X_2$ on $\ln X_3$ and follow the rest of the
procedure. After calculating $E(a_3) = B_3 + B_2 b_{23}$, are your conclusions similar to those
reached previously? Judging from $B_2 b_{23}$, is the bias upward or downward?
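
Steps 7 and 8 could be run as in the sketch below, which again assumes lnY, lnX2 and lnX3 from step 3 are in memory and uses illustrative scalar names (a3, B2hat, B3hat, b23).

```stata
* Misspecified model of step 7: labour omitted
regress lnY lnX3
scalar a3 = _b[lnX3]
* Correctly specified model: estimates of B2 and B3
quietly regress lnY lnX2 lnX3
scalar B2hat = _b[lnX2]
scalar B3hat = _b[lnX3]
* Auxiliary regression of the omitted variable (labour) on capital
quietly regress lnX2 lnX3
scalar b23 = _b[lnX3]
* Implied coefficient, estimated coefficient and the bias term
display "B3 + B2*b23      = " scalar(B3hat) + scalar(B2hat)*scalar(b23)
display "a3 (short model) = " scalar(a3)
display "B2*b23           = " scalar(B2hat)*scalar(b23)
```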
9. Now assume that you extend the Cobb-Douglas production function model to include
the trend variable X4, which is a measure of time elapsed and we use it here as a
surrogate for technological progress. If you find that X4 turns out to be statistically
significant, what type of error did you commit by not including it previously? And what
if it turned out to be statistically insignificant?

Comment on this by running the regression with the trend variable in the model,
examining and commenting on the statistical significance of all the variables, on why
they may have possibly changed (compared to your original model) and how you
would interpret these changes.
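
A sketch of the extended regression is given below. The variable name x4 is an assumption, since the name of the trend variable in the dataset is not stated here; if no such variable exists, the sketch builds a simple trend from the observation order, which is only valid if the data are sorted by year.

```stata
* Check whether a trend variable called x4 (assumed name) already exists
capture confirm variable x4
if _rc {
    * Hypothetical fallback: trend = 1, 2, ... in the current sort order
    generate x4 = _n
}
* Cobb-Douglas model extended with the trend variable
regress lnY lnX2 lnX3 x4
* Test of the null hypothesis that the trend coefficient is zero
test x4
```

With a single restriction, the F statistic reported by test is the square of the t statistic on x4 in the regression output, so both lead to the same conclusion.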



10. Close the model select 1.dta dataset and open the dataset model select 2.dta.

This dataset contains information on U.S. expenditure on imported goods (y),
personal disposable income (x) and the trend variable (t) for the period 1968
to 1987.

11. Regress expenditure on imports (y) on PDI (x) only.

12. Conduct an examination of the residuals plotted against the period of the study (i.e.
“year” variable).
predict e, resid
twoway connected e year, yline(0)
Do the residuals look randomly distributed or do they reveal any kind of systematic
pattern? If they do not appear to be randomly distributed, provide one or more
possible reasons.

13. Now regress y on x and t. Again, examine the residuals (call the variable for the
residuals e2) to see whether they now appear to be randomly distributed. What do
you conclude?
reg y x t
predict e2, resid
twoway connected e2 year, yline(0)

14. Given the above, let us now test whether a log-linear specification would have
been more appropriate than a linear specification. Since both models may look
equally good in terms of the usual criteria, we can test for the “better” model
using the MWD (MacKinnon-White-Davidson) test as follows:

Step 1: Estimate the linear model (which you have already done) and obtain the
estimated Y values, i.e. $\hat{Y}_i$.
predict yfit

Step 2: Generate the logged value of the estimated $\hat{Y}_i$ (above) to yield $\ln \hat{Y}_i$:
g lnyfitted= log(yfit)



Step 3: After generating logged values of all your other variables (x and y), estimate
the log-linear model and obtain the estimated ln Y values, i.e. $\widehat{\ln Y}_i$.
g lny= log(y)
g lnx= log(x)
reg lny lnx t
predict lnyfit

Step 4: Obtain $Z_{1i} = \ln \hat{Y}_i - \widehat{\ln Y}_i$.

g z1= lnyfitted - lnyfit


Step 5: Regress Y on the X’s and Z1.
Is the coefficient of Z1 (using usual t test) statistically significant at the 5% level? If it
is, then you should reject the null hypothesis that the model is a linear one.
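
Step 5 could be run as sketched below, assuming the linear model is the regression of y on x and t from step 13 and that z1 was generated in Step 4.

```stata
* Linear model augmented with z1
regress y x t z1
* Test of the null hypothesis that the coefficient on z1 is zero
test z1
* 1 if z1 is significant at the 5% level (linear model rejected), 0 otherwise
display "z1 significant at 5%: " (r(p) < 0.05)
```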

15. To test whether the log-linear model is appropriate, continue the MWD test as
explained on pg. 11-12 of your notes.
Step 6: Obtain $Z_{2i} = \text{antilog}(\widehat{\ln Y}_i) - \hat{Y}_i$.

gen z2 = exp(lnyfit)-yfit
Step 7: Regress lnY on the X’s or logs of X’s and Z2.
reg lny lnx t z2

Step 8: Reject H1 (that the log-linear model is preferable) if the coefficient of Z2 is
statistically significant.
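
The decision in Step 8 could be checked as in this sketch, which assumes lny, lnx and z2 were created in the earlier steps.

```stata
* Log-linear model augmented with z2
quietly regress lny lnx t z2
* Test of the null hypothesis that the coefficient on z2 is zero
test z2
* p-value to compare against the chosen significance level
display "p-value for z2 = " r(p)
```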
