Econ301 Final
Part I has 10 True/False and Fill-in-the-Blank questions. Each question is
worth 3 points.
For Part II, use the space below to write your answers.
Lecture notes, cheat sheets, other course materials and a calculator are allowed.
2) If we were to exclude the variable X and run a regression of Y on a column of ones, we would end
up with the intercept.
4) Including an irrelevant variable in a model has no effect on the unbiasedness of the intercept
and other slope estimators.
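Use this part to answer Q5-Q6. Consider a 2-variable regression model,
$Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{u}_i$.
The variance of $\hat{\beta}_2$ is given by
$\sigma^2_{\hat{\beta}_2} = \frac{\sigma^2}{\sum_i (X_{2i} - \bar{X}_2)^2} \cdot \frac{1}{1 - \rho_{12}^2}$.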
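The variance expression above can be evaluated numerically to see how its two factors are computed. The following is a minimal Python sketch, not part of the exam; it assumes that $\rho_{12}$ denotes the sample correlation between X1 and X2, and the function name `variance_beta2` and the data are hypothetical.

```python
def variance_beta2(sigma2, x1, x2):
    """Return (first factor, second factor, product) of the variance formula.

    first factor  = sigma2 / sum((x2 - mean(x2))^2)
    second factor = 1 / (1 - r12^2), with r12 the sample correlation
                    between x1 and x2 (an assumption about the notation).
    """
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r12_sq = s12 ** 2 / (s11 * s22)
    first = sigma2 / s22
    second = 1.0 / (1.0 - r12_sq)
    return first, second, first * second


# Hypothetical regressors and error variance, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, -1.0, -1.0, 1.0]
first, second, total = variance_beta2(2.0, x1, x2)
```

Calling the function with different hypothetical regressors shows how each factor responds to the data.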
6) The latter term in $\sigma^2_{\hat{\beta}_2}$ is called the variance inflation factor, which is high when
the absolute value of the correlation among regressors is low.
7) To jointly test the restrictions of a regression model, we calculate the t-test values using the R^2
of the restricted and unrestricted models.
8) Heteroscedasticity does not alter the unbiasedness and consistency properties of OLS estimators.
10) ________ offers a plausible way to eliminate the bias from simultaneous causality,
errors-in-variables, and omitted unobserved variables.
PART II - Questions (70 Points)
Question 1 (15 Points) Summarize the cartoon using concepts you learned in Econ 301 (e.g., omitted
variable bias, underfitting and/or overfitting of a model). Respond to the cartoonist by
explaining the benefits of using statistics and econometrics.
Question 2 (15 Points) A researcher would like to run a regression of Y on X; the observations
for {Y, X} pairs are plotted in the following figure. The researcher runs the regression and obtains
the regression line (solid line drawn on the figure). When the researcher conducts hypothesis
tests, he/she notices that the intercept is statistically insignificant, while the slope is significant.
The insignificant intercept disappoints the researcher and he/she runs another regression without
an intercept and obtains the dashed regression line. He/she also notices that the slope is statistically
significant in both models.
a) Consider the regression model with an intercept. In what interval would you expect the
t-value of the intercept to fall? (5 points)
b) We almost always include a constant term when estimating regressions. Why? (5 points)
c) Because the null hypothesis of β0 = 0 is not rejected in the first regression, we can argue
that the intercept of the model is zero. Then, the models with and without intercept are equal.
True/False/Explain. (5 points)
Question 3 (20 Points) The following figure and table provide details of two regression models
of student achievement (district average) on average district income.
             Dependent variable: TestScore
             (1)          (2)
Income       1.879***     3.851***
             (0.091)      (0.304)
Income^2                  -0.042***
                          (0.006)
a) Match the regression functions (in the figure) with the models in the table. (5 points)
b) Do you think both models suffer from a heteroscedasticity problem? What is your solution
to this problem? (5 points)
c) Do you think the absence of “Squared-Income” in the first model causes omitted variable bias?
(5 points)
d) Suppose that while coding model 2 in R, I made a mistake and wrote “Income * 2”
instead of “Income^2”. What is the implication of this mistake? (5 points)
Question 4 (20 Points) The following table provides a simulated dataset of 20 observations.
Table 2: Observations
a) How do you measure the impact of gender on earnings? Write down the regression model.
(5 points)
b) Generate a dummy variable called Male that takes the value one for male individuals and zero
otherwise. List the observations for the variable Male. (5 points)
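As an illustration of the coding step in part b), the following is a minimal Python sketch, not part of the exam. The helper `make_male_dummy` and the gender values are hypothetical; they are not the observations from Table 2.

```python
def make_male_dummy(genders):
    """Return 1 for entries equal to 'male' (case-insensitive), 0 otherwise."""
    return [1 if g.strip().lower() == "male" else 0 for g in genders]


# Hypothetical gender column, for illustration only.
sample_genders = ["Female", "Male", "Male", "Female", "male"]
male = make_male_dummy(sample_genders)
```

The same mapping applies row by row to whatever gender column the dataset contains.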
c) To study regional differences in earnings, we need to construct dummy variables. To
avoid the dummy variable trap, what is the maximum number of dummy variables to be generated?
(5 points)
d) Suppose that a researcher runs the regression log(Wage) = β0 + β1 log(Experience).
Interpret β1. (5 points)