Tutorials2016s1 Week7 Answers-3

- The document provides reading assignments and review questions for an introductory econometrics course, including chapters from Wooldridge to read and key terms to know. - It also provides tutorial exercises and problem sets to work on, including questions from Wooldridge and a regression analysis using data on housing prices. - Students are asked to read chapters on asymptotic properties of estimators, inference procedures, and interaction terms among other topics in preparation for the class.


ECON2206 Introductory Econometrics

Week 7 Tutorial Exercises

Readings
Read Chapters 5.1-5.2 (excluding 5.2a) thoroughly. Chapters 5.2a and 5.3 will not be
covered and are not examinable.
Read Chapter 6 (excluding 6.1a). Chapter 6.1a will not be covered and is not examinable.
Make sure that you know the meanings of the Key Terms at the end of Chapters 5 & 6.

Review Questions (these may or may not be discussed in tutorial classes)


Why would you care about the asymptotic properties of the OLS estimators?
Comparing the inference procedures in Chapter 5 with those in Chapter 4, can you list the
similarities and differences?
Under MLR.1-MLR.5, the OLS estimators are consistent, asymptotically normal, and
asymptotically efficient. Try to explain these properties in your own words.
What are the advantages of using the log of a variable in regression?
Be careful when you interpret the coefficients of explanatory variables in a model where
some variables are in logarithm. Do you remember Table 2.3?
How do you compute the change in y caused by x when the model is built for log(y)?
Why do we need interaction terms in regression models?
What is the adjusted R-squared? What is the difference between it and the R-squared?
How do you construct an interval prediction for given x-values?
How do you predict y for given x-values when the model is built for log(y)?
What is involved in residual analysis?
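
On the last two prediction questions: when the model is built for log(y), simply exponentiating the predicted log(y) systematically underpredicts y. Under normally distributed errors, the Chapter 6 adjustment multiplies by exp(σ̂²/2). A minimal Python sketch with illustrative numbers (not taken from any dataset in this tutorial):

```python
import math

# Illustrative values only: a fitted log-wage model gives, at some x-values,
logy_hat = 2.5     # predicted log(wage)
sigma_hat = 0.4    # root MSE from the log regression

naive = math.exp(logy_hat)                     # underpredicts E(y|x)
adjusted = math.exp(sigma_hat**2 / 2) * naive  # normality-based adjustment

print(round(naive, 2), round(adjusted, 2))
```

The adjustment factor exp(σ̂²/2) is always at least 1, so the adjusted prediction is never smaller than the naive one.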

Problem Set (these will be discussed in tutorial classes)

Q1. Wooldridge Ch5 Q2


This is about the inconsistency of the simple regression of pctstck on funds. A higher tolerance of
risk means more willingness to invest in the stock market, so β₂ > 0. By assumption, funds and
risktol are positively correlated.

Now we use equation (5.5), where δ₁ > 0: plim(β̂₁) = β₁ + β₂δ₁ > β₁, so β̂₁ has a positive
inconsistency (asymptotic bias).

This makes sense: if we omit risktol from the regression and it is positively correlated with
funds, some of the estimated effect of funds is actually due to the effect of risktol.
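
The direction of the inconsistency can be illustrated with a small simulation. The coefficient values below are made up for illustration, not estimates from any 401K data; the point is only that omitting risktol pushes the simple-regression slope on funds above β₁.

```python
import random

# Hedged illustration of eq. (5.5) with made-up coefficients: beta1 is the
# true funds effect, beta2 > 0 the omitted risktol effect, and funds is
# positively correlated with risktol by construction.
random.seed(1)
n = 100_000
beta1, beta2 = 0.5, 2.0

risktol = [random.gauss(0, 1) for _ in range(n)]
funds = [0.8 * r + random.gauss(0, 1) for r in risktol]
pctstck = [beta1 * f + beta2 * r + random.gauss(0, 1)
           for f, r in zip(funds, risktol)]

# Simple-regression slope of pctstck on funds, with risktol omitted.
mf, my = sum(funds) / n, sum(pctstck) / n
b1 = (sum((f - mf) * (y - my) for f, y in zip(funds, pctstck))
      / sum((f - mf) ** 2 for f in funds))

# Here delta1 (slope of risktol on funds) is 0.8 / (0.8**2 + 1), so the plim
# of b1 is beta1 + beta2*delta1, roughly 1.48: well above beta1 = 0.5.
print(b1)
```

With a large n the estimate sits close to its plim, making the positive asymptotic bias easy to see.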
Q2. Wooldridge Ch6 Q4 (Also use Stata to confirm the results in part (ii)). (See Ch6_4.do)

(i) Holding all other factors fixed, we have


Δlog(wage) = β₁Δeduc + β₂pareduc·Δeduc = (β₁ + β₂pareduc)Δeduc.
Dividing both sides by Δeduc gives the result. The sign of β₂ is not obvious, although β₂ > 0 if we
think a child gets more out of another year of education the more highly educated are the child's
parents.

(ii) We use the values pareduc = 32 and pareduc = 24 to interpret the coefficient on
educ·pareduc. The difference in the estimated return to education is .00078(32 − 24) = .0062, or
about .62 percentage points. (Percentage points are changes in percentages.)
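
As a quick check on this arithmetic (the .00078 is the rounded ed_ped coefficient from the Stata output below):

```python
# Part (ii) arithmetic: difference in the return to education between
# pareduc = 32 and pareduc = 24, using the (rounded) interaction coefficient.
b_inter = 0.00078
diff = b_inter * (32 - 24)
print(round(diff, 4))  # 0.0062, i.e. about .62 percentage points
```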

(iii) When we add pareduc by itself, the coefficient on the interaction term becomes negative. The t
statistic on educ·pareduc is about −1.33, which is not significant at the 10% level against a two-
sided alternative. Note that the coefficient on pareduc is significant at the 5% level against a
two-sided alternative. This provides a good example of how omitting a level effect (pareduc in
this case) can lead to biased estimation of the interaction effect.

Confirming the results in part (ii) provides an opportunity to illustrate some further Stata
features. In Ch6_4.do we illustrate:
- the tabulate command (the tab abbreviation works)
- the generate command (the gen abbreviation works)
- the if qualifier, used to limit analysis to observations satisfying some condition
- Stata's treatment of missing observations

. * Read data from Stata file


. * Because parents education & its interaction with own education are not
. * in the data they need to be generated.
. use wage2.dta
. gen pareduc=meduc+feduc //Note Stata message about missing values
(213 missing values generated)

. gen ed_ped=educ*pareduc //For some people we don't know parents education


(213 missing values generated)

. tab pareduc //tabulate command useful in summarizing categorical data

pareduc | Freq. Percent Cum.


------------+-----------------------------------
2 | 1 0.14 0.14
5 | 3 0.42 0.55
6 | 2 0.28 0.83
7 | 4 0.55 1.39
8 | 3 0.42 1.80
9 | 1 0.14 1.94
10 | 12 1.66 3.60
11 | 8 1.11 4.71
12 | 12 1.66 6.37
13 | 15 2.08 8.45
14 | 23 3.19 11.63
15 | 17 2.35 13.99
16 | 60 8.31 22.30
17 | 21 2.91 25.21
18 | 31 4.29 29.50
19 | 31 4.29 33.80
20 | 73 10.11 43.91
21 | 39 5.40 49.31
22 | 58 8.03 57.34
23 | 34 4.71 62.05
24 | 138 19.11 81.16
25 | 17 2.35 83.52
26 | 29 4.02 87.53
27 | 6 0.83 88.37
28 | 27 3.74 92.11
29 | 12 1.66 93.77
30 | 13 1.80 95.57
31 | 4 0.55 96.12
32 | 17 2.35 98.48
33 | 4 0.55 99.03
34 | 3 0.42 99.45
35 | 2 0.28 99.72
36 | 2 0.28 100.00
------------+-----------------------------------
Total | 722 100.00
.
. * Run regression from Ch 6 Q4(ii)
. * ed_ped only defined when pareduc available, so let's only use complete cases
. * Regression only uses observations that satisfy the condition pareduc>1
. reg lwage educ ed_ped exper tenure if pareduc>1

Source | SS df MS Number of obs = 722


-------------+---------------------------------- F(4, 717) = 36.44
Model | 21.4253649 4 5.35634121 Prob > F = 0.0000
Residual | 105.386551 717 .146982637 R-squared = 0.1690
-------------+---------------------------------- Adj R-squared = 0.1643
Total | 126.811916 721 .175883378 Root MSE = .38338

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0467522 .0104767 4.46 0.000 .0261835 .067321
ed_ped | .000775 .0002107 3.68 0.000 .0003612 .0011887
exper | .018871 .0039429 4.79 0.000 .0111299 .026612
tenure | .0102166 .0029938 3.41 0.001 .0043391 .0160942
_cons | 5.646519 .1295593 43.58 0.000 5.392158 5.90088
------------------------------------------------------------------------------

.
. * While this if command may be useful it is actually not required here
. * Using complete cases is what Stata does by default
. reg lwage educ ed_ped exper tenure

Source | SS df MS Number of obs = 722


-------------+---------------------------------- F(4, 717) = 36.44
Model | 21.4253649 4 5.35634121 Prob > F = 0.0000
Residual | 105.386551 717 .146982637 R-squared = 0.1690
-------------+---------------------------------- Adj R-squared = 0.1643
Total | 126.811916 721 .175883378 Root MSE = .38338

------------------------------------------------------------------------------
lwage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
educ | .0467522 .0104767 4.46 0.000 .0261835 .067321
ed_ped | .000775 .0002107 3.68 0.000 .0003612 .0011887
exper | .018871 .0039429 4.79 0.000 .0111299 .026612
tenure | .0102166 .0029938 3.41 0.001 .0043391 .0160942
_cons | 5.646519 .1295593 43.58 0.000 5.392158 5.90088
------------------------------------------------------------------------------
Q3. Wooldridge Ch6 QC8 (See Ch6_C8.do)
(i) The estimated equation (where price is in dollars) is

pricê = −21,770.3 + 2.068 lotsize + 122.78 sqrft + 13,852.5 bdrms
        (29,475.0)  (0.642)         (13.24)       (9,010.1)

n = 88, R² = .672, adjusted R² = .661, σ̂ = 59,833.

. * Read data & run regression


. use hprice1.dta

. reg price lotsize sqrft bdrms

Source | SS df MS Number of obs = 88


-------------+---------------------------------- F(3, 84) = 57.46
Model | 617130.701 3 205710.234 Prob > F = 0.0000
Residual | 300723.805 84 3580.0453 R-squared = 0.6724
-------------+---------------------------------- Adj R-squared = 0.6607
Total | 917854.506 87 10550.0518 Root MSE = 59.833

------------------------------------------------------------------------------
price | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lotsize | .0020677 .0006421 3.22 0.002 .0007908 .0033446
sqrft | .1227782 .0132374 9.28 0.000 .0964541 .1491022
bdrms | 13.85252 9.010145 1.54 0.128 -4.065141 31.77018
_cons | -21.77031 29.47504 -0.74 0.462 -80.38466 36.84405
------------------------------------------------------------------------------
.
. * price is in $'000. What happens if price was in $?
. gen pricenew=1000*price

. reg pricenew lotsize sqrft bdrms

Source | SS df MS Number of obs = 88


-------------+---------------------------------- F(3, 84) = 57.46
Model | 6.1713e+11 3 2.0571e+11 Prob > F = 0.0000
Residual | 3.0072e+11 84 3.5800e+09 R-squared = 0.6724
-------------+---------------------------------- Adj R-squared = 0.6607
Total | 9.1785e+11 87 1.0550e+10 Root MSE = 59833

------------------------------------------------------------------------------
pricenew | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lotsize | 2.067707 .6421258 3.22 0.002 .790769 3.344644
sqrft | 122.7782 13.23741 9.28 0.000 96.45415 149.1022
bdrms | 13852.52 9010.145 1.54 0.128 -4065.14 31770.18
_cons | -21770.31 29475.04 -0.74 0.462 -80384.66 36844.04
------------------------------------------------------------------------------

There is simply a rescaling of the coefficients, leaving t statistics, p-values and R-squared values
the same. This is the same regression, just with the effects now interpreted in
dollars rather than thousands of dollars.
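
This scale invariance is easy to reproduce outside Stata. A minimal Python sketch with made-up data (simple regression for brevity; variable names are hypothetical):

```python
import math
import random

# Sketch of why rescaling y only rescales the coefficients: the t statistic
# is invariant to y -> 1000*y. Data below are simulated, not from hprice1.dta.
random.seed(0)
n = 50
x = [random.uniform(500, 5000) for _ in range(n)]         # e.g. lot sizes
y = [100 + 0.002 * xi + random.gauss(0, 20) for xi in x]  # "price" in $'000

def slope_and_t(x, y):
    """OLS slope and its t statistic for a simple regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return b1, b1 / se

b_thousands, t1 = slope_and_t(x, y)
b_dollars, t2 = slope_and_t(x, [1000 * yi for yi in y])

print(round(b_dollars / b_thousands, 6))  # 1000.0: coefficient rescaled
print(abs(t1 - t2) < 1e-9)                # True: t statistic unchanged
```

The same invariance holds for p-values and R-squared, since both are functions of quantities unaffected by the rescaling.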

The predicted price at lotsize = 10,000, sqrft = 2,300, and bdrms = 4 can be simply obtained from
the regression of pricenew on (lotsize − 10,000), (sqrft − 2,300), and (bdrms − 4) to yield about
$336,707.
. * For (i)-(iii) need transformed explanatory variables
. gen lotsize0=lotsize-10000
. gen sqrft0=sqrft-2300
. gen bdrms0=bdrms-4

. regress pricenew lotsize0 sqrft0 bdrms0

Source | SS df MS Number of obs = 88


-------------+---------------------------------- F(3, 84) = 57.46
Model | 6.1713e+11 3 2.0571e+11 Prob > F = 0.0000
Residual | 3.0072e+11 84 3.5800e+09 R-squared = 0.6724
-------------+---------------------------------- Adj R-squared = 0.6607
Total | 9.1785e+11 87 1.0550e+10 Root MSE = 59833

------------------------------------------------------------------------------
pricenew | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
lotsize0 | 2.067707 .6421258 3.22 0.002 .790769 3.344644
sqrft0 | 122.7782 13.23741 9.28 0.000 96.45415 149.1022
bdrms0 | 13852.52 9010.145 1.54 0.128 -4065.14 31770.18
_cons | 336706.7 7374.466 45.66 0.000 322041.7 351371.6
------------------------------------------------------------------------------

(ii) We want the intercept estimate and the associated 95% CI from this regression. The CI is
approximately 336,706.7 ± 14,665, or about $322,042 to $351,372 when rounded to the nearest
dollar.
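
This interval can be reproduced by hand from the reported intercept and its standard error. A quick Python check (1.9886 is an approximate 97.5th percentile of the t(84) distribution, taken from a t table; Stata computes it internally):

```python
# Part (ii) interval arithmetic, using the Stata output above.
b0, se0, tcrit = 336706.7, 7374.466, 1.9886
lo, hi = b0 - tcrit * se0, b0 + tcrit * se0
print(round(lo), round(hi))  # close to Stata's 322041.7 and 351371.6
```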

(iii) We must use equation (6.36) to obtain the standard error of ê⁰ and then use equation (6.37)
(assuming that price is normally distributed). But from the regression in part (ii), se(ŷ⁰) ≈
7,374.5 and σ̂ ≈ 59,833. Therefore, se(ê⁰) ≈ [(7,374.5)² + (59,833)²]¹ᐟ² ≈ 60,285.8. Using 1.99
as the approximate 97.5th percentile of the t₈₄ distribution gives the 95% CI for price⁰, at the
given values of the explanatory variables, as 336,706.7 ± 1.99(60,285.8) or, rounded to the
nearest dollar, $216,738 to $456,675. This is a fairly wide prediction interval. But we have not
used many factors to explain housing price. If we had more, we could, presumably, reduce the
error standard deviation, and therefore σ̂, to obtain a tighter prediction interval.
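
The part (iii) arithmetic, combining the sampling error of the prediction with the error standard deviation, can be sketched as:

```python
import math

# Prediction-interval arithmetic for part (iii), using values from the
# regression output above.
se_yhat0 = 7374.5    # standard error of the predicted mean (the intercept)
sigma_hat = 59833.0  # root MSE of the regression
se_e0 = math.sqrt(se_yhat0**2 + sigma_hat**2)  # about 60,285.8

tcrit = 1.99         # approx. 97.5th percentile of t(84)
yhat0 = 336706.7
lo, hi = yhat0 - tcrit * se_e0, yhat0 + tcrit * se_e0
print(round(se_e0, 1), round(lo), round(hi))
```

Note how σ̂ dominates se(ê⁰): the interval is wide mainly because of unexplained price variation, not estimation error, which is why adding explanatory factors is the route to a tighter interval.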

Q4. Project discussion


a) Complete the discussion started in the previous tutorial about the key research
problem, how the problem might be addressed and what data would be needed.
b) Produce a single tutorial "Request for data" document and, by the end of the day of the
tutorial, email it to [email protected]
