Assignment 1
1. What is the role of the stochastic error term $u_i$ in regression analysis? What is
the difference between the stochastic error term and the residual $\hat{u}_i$?
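2. Are the following models linear regression models? Why or why not?
a. $Y_i = e^{\beta_1 + \beta_2 X_i + u_i}$
b. $Y_i = \dfrac{1}{1 + e^{\beta_1 + \beta_2 X_i + u_i}}$
c. $\ln Y_i = \beta_1 + \beta_2 \left( \dfrac{1}{X_i} \right) + u_i$
d. $Y_i = \beta_1 + (0.75 - \beta_1) e^{-\beta_2 (X_i - 2)} + u_i$
e. $Y_i = \beta_1 + \beta_2^3 X_i + u_i$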
3. Consider the following nonstochastic models (i.e., models without the stochastic
error term). Are they linear regression models? If not, is it possible, by suitable
algebraic manipulations, to convert them into linear models?
a. $Y_i = \dfrac{1}{\beta_1 + \beta_2 X_i}$
b. $Y_i = \dfrac{X_i}{\beta_1 + \beta_2 X_i}$
c. $Y_i = \dfrac{1}{1 + e^{-\beta_1 - \beta_2 X_i}}$
4. Given the assumptions in column 1 of the table, show that the assumptions in
column 2 are equivalent to them.
Assumptions of the Classical Model
(1)                                        (2)
$E(u_i \mid X_i) = 0$                      $E(Y_i \mid X_i) = \beta_1 + \beta_2 X_i$
$\mathrm{cov}(u_i, u_j) = 0,\ i \neq j$    $\mathrm{cov}(Y_i, Y_j) = 0,\ i \neq j$
$\mathrm{var}(u_i \mid X_i) = \sigma^2$    $\mathrm{var}(Y_i \mid X_i) = \sigma^2$
5. Consider the following formulations of the two-variable PRF:
Model I: $Y_i = \beta_1 + \beta_2 X_i + u_i$
Model II: $Y_i = \alpha_1 + \alpha_2 (X_i - \bar{X}) + u_i$
a. Find the estimators of $\beta_1$ and $\alpha_1$. Are they identical? Are their variances
identical?
b. Find the estimators of $\beta_2$ and $\alpha_2$. Are they identical? Are their variances
identical?
c. What is the advantage, if any, of model II over model I?
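As a numerical companion to this question, the following minimal sketch fits both formulations by least squares on a small made-up sample. The data values and the helper name `ols` are assumptions introduced purely for illustration; they are not part of the assignment, and the sketch does not replace the algebraic derivation asked for.

```python
# Illustrative data only; these numbers are made up, not taken from the assignment.
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [2.1, 2.9, 5.2, 7.8, 12.1]


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


x_bar = sum(xs) / len(xs)
centered = [xi - x_bar for xi in xs]  # regressor of Model II: X_i minus its mean

b1, b2 = ols(xs, ys)        # Model I
a1, a2 = ols(centered, ys)  # Model II

print(f"Model I : intercept={b1:.4f}, slope={b2:.4f}")
print(f"Model II: intercept={a1:.4f}, slope={a2:.4f}")
print(f"mean of Y = {sum(ys) / len(ys):.4f}")
```

Comparing the printed intercepts, slopes, and the sample mean of Y gives a quick check on the answers derived by hand.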
yi = β̂1 + β̂2 xi + ui
8. If $X_1$, $X_2$, and $X_3$ are uncorrelated variables each having the same standard
deviation, show that the coefficient of correlation between $X_1 + X_2$ and $X_2 + X_3$ is
equal to $\frac{1}{2}$. Why is the correlation coefficient not zero?
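A simulation can serve as a sanity check on the claimed value before attempting the proof. The sketch below is only an illustration under stated assumptions: the three variables are drawn as independent standard normals (the question requires only that they be uncorrelated with equal standard deviation), and the seed, sample size, and helper name `corr` are arbitrary choices.

```python
import random

random.seed(0)  # arbitrary seed so that the run is repeatable
n = 50_000      # arbitrary sample size

# Assumption: independent standard normal draws stand in for X1, X2, X3.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]


def corr(a, b):
    """Sample coefficient of correlation between two equal-length lists."""
    m = len(a)
    a_bar = sum(a) / m
    b_bar = sum(b) / m
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5


s12 = [u + v for u, v in zip(x1, x2)]  # X1 + X2
s23 = [v + w for v, w in zip(x2, x3)]  # X2 + X3

r = corr(s12, s23)
print(f"sample correlation of X1+X2 and X2+X3: {r:.3f}")
```

Both sums contain X2, which is the feature of the construction to keep in mind when answering the last part of the question.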
9. Regression without any regressor. Suppose you are given the model: $Y_i = \beta_1 + u_i$.
Use OLS to find the estimator of $\beta_1$. What is its variance and the RSS? Does
the estimated $\beta_1$ make intuitive sense? Now consider the two-variable model
$Y_i = \beta_1 + \beta_2 X_i + u_i$. Is it worth adding $X_i$ to the model? If not, why bother with
regression analysis?
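The intercept-only case can be explored numerically. The sketch below, on made-up observations, evaluates the residual sum of squares for a constant predictor and compares it with nearby values; the data and the names `rss`, `beta1_hat`, and `sigma2_hat` are assumptions introduced for illustration. The variance estimate uses the usual n - 1 divisor for the error variance.

```python
# Made-up observations, used only to illustrate the intercept-only model.
ys = [3.0, 5.0, 4.0, 8.0, 10.0]
n = len(ys)


def rss(b):
    """Residual sum of squares when every Y_i is predicted by the constant b."""
    return sum((y - b) ** 2 for y in ys)


beta1_hat = sum(ys) / n          # candidate OLS estimate: the sample mean of Y
rss_hat = rss(beta1_hat)         # RSS evaluated at that estimate
sigma2_hat = rss_hat / (n - 1)   # estimate of the error variance
var_beta1_hat = sigma2_hat / n   # estimated variance of the estimator

print(f"estimate = {beta1_hat}, RSS = {rss_hat}, est. variance = {var_beta1_hat:.3f}")

# Moving the constant away from the sample mean in either direction raises the RSS.
for b in (beta1_hat - 0.5, beta1_hat + 0.5):
    print(f"RSS at b = {b}: {rss(b)}")
```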
10. The relationship between nominal exchange rate and relative prices. From annual
observations from 1985 to 2005, the following regression results were obtained,
where Y = exchange rate of the Canadian dollar to the U.S. dollar (CD/$) and X =
ratio of the U.S. consumer price index to the Canadian consumer price index; that
is, X represents the relative prices in the two countries:
se = 0.096
a. Interpret this regression. How would you interpret $r^2$?
b. Does the positive coefficient of $X_t$ make economic sense? What is the underlying
economic theory?
c. Suppose we were to redefine X as the ratio of the Canadian CPI to the U.S.
CPI. Would that change the sign of the coefficient of X? Why?
11. The following table gives data on nominal and real gross domestic product (GDP) for
the United States for the years 1959–2005.
a. Plot the GDP data in current and constant (i.e., 2000) dollars against time.
b. Letting Y denote GDP and X time (measured chronologically starting with 1
for 1959, 2 for 1960, through 47 for 2005), see if the following model fits the GDP
data:
$Y_t = \beta_1 + \beta_2 X_t + u_t$
c. How would you interpret $\beta_2$?
d. If there is a difference between $\beta_2$ estimated for current-dollar GDP and that
estimated for constant-dollar GDP, what explains the difference?
e. From your results what can you say about the nature of inflation in the United
States over the sample period?
(billions of dollars, except as noted; quarterly data at seasonally adjusted annual
rates; RGDP in billions of chained [2000] dollars)
Source: Economic Report of the President, 2007. Table B-1 and B-2.
12. Let kids denote the number of children ever born to a woman, and let educ denote
years of education for the woman. A simple model relating fertility to years of
education is
kids = β0 + β1 educ + u,
14. The data set BWGHT.RAW contains data on births to women in the United
States. Two variables of interest are the dependent variable, infant birth weight
in ounces (bwght), and an explanatory variable, average number of cigarettes the
mother smoked per day during pregnancy (cigs). The following simple regression
was estimated using data on n = 1,388 births:
$\widehat{bwght} = 119.77 - 0.514\, cigs$
a. What is the predicted birth weight when cigs = 0? What about when cigs = 20
(one pack per day)? Comment on the difference.
b. Does this simple regression necessarily capture a causal relationship between
the child’s birth weight and the mother’s smoking habits? Explain.
c. To predict a birth weight of 125 ounces, what would cigs have to be? Comment.
d. The proportion of women in the sample who do not smoke while pregnant is
about 0.85. Does this help reconcile your finding from part c?
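The arithmetic behind parts a and c can be laid out as a short sketch that evaluates the estimated equation from the question. The function and variable names below are hypothetical, chosen for illustration; the coefficients 119.77 and -0.514 are the ones given in the question.

```python
def predicted_bwght(cigs):
    """Predicted birth weight in ounces from the fitted line in the question."""
    return 119.77 - 0.514 * cigs


at_zero = predicted_bwght(0)    # mother does not smoke
at_pack = predicted_bwght(20)   # one pack per day
difference = at_pack - at_zero  # change in predicted ounces

# Part c: solve 125 = 119.77 - 0.514 * cigs for cigs.
cigs_for_125 = (119.77 - 125) / 0.514

print(f"cigs = 0 : {at_zero:.2f} oz")
print(f"cigs = 20: {at_pack:.2f} oz (difference {difference:.2f} oz)")
print(f"cigs needed for 125 oz: {cigs_for_125:.2f}")
```

The sign of the last printed value is the point to comment on in part c.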
15. Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and
McClain (1995), the following equation relates housing price (price) to the distance
from a recently built garbage incinerator (dist):
$\widehat{\log(price)} = 9.40 + 0.312 \log(dist)$
$n = 135,\ R^2 = 0.162.$
a. Interpret the coefficient on log(dist). Is the sign of this estimate what you
expect it to be?
b. Do you think simple regression provides an unbiased estimator of the ceteris
paribus elasticity of price with respect to dist? (Think about the city’s decision
on where to put the incinerator.)
c. What other factors about a house affect its price? Might these be correlated
with distance from the incinerator?
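For part a, it may help to see what the estimated log-log equation implies for a small proportional change in distance. The sketch below uses the coefficient 0.312 from the question; the 1 percent change and the variable names are assumptions chosen for illustration.

```python
slope = 0.312  # estimated coefficient on log(dist), taken from the question

# In a log-log model, multiplying dist by a factor k multiplies the
# predicted price by k ** slope, whatever the starting distance.
k = 1.01  # a 1 percent increase in distance (illustrative choice)
price_ratio = k ** slope
pct_change_price = (price_ratio - 1) * 100

print(f"1% farther from the incinerator -> predicted price changes by {pct_change_price:.3f}%")
```

The printed percentage can be compared with the coefficient itself when interpreting it as an elasticity.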
where e is a random variable with $E(e) = 0$ and $\mathrm{Var}(e) = \sigma_e^2$. Assume that e is
independent of inc.
a. Show that E(u|inc) = 0, so that the key zero conditional mean assumption
(Assumption SLR.4) is satisfied. [Hint: If e is independent of inc, then E(e|inc) =
E(e).]
b. Show that $\mathrm{Var}(u \mid inc) = \sigma_e^2\, inc$, so that the homoskedasticity Assumption
SLR.5 is violated. In particular, the variance of sav increases with inc. [Hint:
$\mathrm{Var}(e \mid inc) = \mathrm{Var}(e)$, if e and inc are independent.]
c. Provide a discussion that supports the assumption that the variance of savings
increases with family income.
17. Consider the standard simple regression model $y = \beta_0 + \beta_1 x + u$ under the
Gauss-Markov Assumptions SLR.1 through SLR.5. The usual OLS estimators $\hat{\beta}_0$ and $\hat{\beta}_1$
are unbiased for their respective population parameters. Let $\tilde{\beta}_1$ be the estimator
of $\beta_1$ obtained by assuming the intercept is zero (see Section 2.6).
a. Find $E(\tilde{\beta}_1)$ in terms of the $x_i$, $\beta_0$, and $\beta_1$. Verify that $\tilde{\beta}_1$ is unbiased for $\beta_1$ when
the population intercept ($\beta_0$) is zero. Are there other cases where $\tilde{\beta}_1$ is unbiased?
b. Find the variance of $\tilde{\beta}_1$. (Hint: The variance does not depend on $\beta_0$.)
c. Show that $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$. [Hint: For any sample of data,
$\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar{x})^2$.]
d. Comment on the trade-off between bias and variance when choosing between
$\hat{\beta}_1$ and $\tilde{\beta}_1$.
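The two slope estimators in this question can be computed side by side on a small sample. The sketch below is an illustration only: the data are made up, and the variable names are assumptions. It also checks the algebraic identity that the raw sum of squares of the x values equals the sum of squared deviations plus n times the squared sample mean, which is useful background for part c.

```python
# Made-up sample, for illustration only.
xs = [1.0, 2.0, 3.0, 5.0, 9.0]
ys = [2.5, 3.1, 4.2, 5.8, 10.4]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

sum_sq = sum(x ** 2 for x in xs)                # sum of x_i^2
sum_dev_sq = sum((x - x_bar) ** 2 for x in xs)  # sum of (x_i - x_bar)^2

# Slope from regression through the origin (intercept assumed to be zero).
beta1_tilde = sum(x * y for x, y in zip(xs, ys)) / sum_sq

# Usual OLS slope from the model with an intercept.
beta1_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum_dev_sq

print(f"through-origin slope = {beta1_tilde:.4f}")
print(f"OLS slope            = {beta1_hat:.4f}")
print(f"sum x^2 = {sum_sq:.2f}, sum (x - x_bar)^2 = {sum_dev_sq:.2f}, n * x_bar^2 = {n * x_bar ** 2:.2f}")
```

Since the two denominators differ by a nonnegative term, the printout gives a concrete feel for how the variances of the two estimators compare.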