Tutorial Questions

The document contains a series of regression analysis questions related to earnings, labor force participation, and housing prices, involving various demographic and economic variables. It includes tasks such as interpreting coefficients, testing for significance, and addressing omitted variable bias. The questions require the application of statistical concepts and hypothesis testing to analyze the relationships between the variables presented.

Uploaded by

y67374383

Questions for tutorial.

Q1.

A data set consists of information on 7178 full-time, full-year workers. The highest educational
achievement for each worker is either a high school diploma or a bachelor’s degree. The workers’
ages range from 25 to 34 years. The data set also contains information on the region of the country
where the person lives, their marital status and number of children. Let AHE be the average hourly
earnings; College a variable to indicate if the individual went to college; Female a variable to indicate
if the individual is a female; Age represent age in full years; and North, South, East and West,
represent the area of the country that the individual lives in.

Results of regression of average hourly earnings on gender, education, and other variables

Regressor         (1)        (2)        (3)
College (x1)     10.47      10.44      10.42
                 (0.29)     (0.29)     (0.29)
Female (x2)      -4.69      -4.56      -4.57
                 (0.29)     (0.29)     (0.29)
Age (x3)           -         0.61       0.61
                            (0.05)     (0.05)
North (x4)         -          -         0.74
                                       (0.47)
West (x5)          -          -        -1.54
                                       (0.40)
South (x6)         -          -        -0.44
                                       (0.37)
Intercept        18.15       0.11       0.33
                 (0.19)     (1.46)     (1.47)

Summary Statistics
SSR              12.15      12.03      12.01
R2               0.165      0.182      0.185
N                7178       7178       7178

a. Is age an important determinant of earnings? Use an appropriate test to explain your answer.
(2)
b. Amna is a 29-year-old college graduate; Nasreen is a 34-year-old college graduate. Construct
a 95% confidence interval for the expected difference between their earnings. [Hint: what is the
difference in their ages? How would you test if that difference is significant?]
(3)

c. Are regional differences in earnings important? Use an appropriate hypothesis test to explain
your answer. (3)

d. Ahmed is a 28-year-old engineer from the South; Bashir is a 28-year-old engineer from the
North. How would you test if the effect of the region they are from is the same? Specify each
step (you do not have to derive a test value, just explain the test process).
(4)
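
The arithmetic behind parts (a) to (c) can be sketched as follows. This is a minimal illustration, not a model answer: it assumes the values in parentheses in the table are standard errors, uses large-sample critical values, and uses the homoskedasticity-only F-statistic built from the reported R2 values. The function names are hypothetical helpers introduced only for this sketch.

```python
# Coefficients and summary statistics copied from the table above
# (assumption: values in parentheses are standard errors).
AGE_COEF, AGE_SE = 0.61, 0.05      # Age row, columns (2) and (3)
R2_WITH_REGIONS = 0.185            # column (3)
R2_WITHOUT_REGIONS = 0.182         # column (2)
N_OBS = 7178


def t_statistic(coef, se, null_value=0.0):
    """t-statistic for H0: coefficient equals null_value."""
    return (coef - null_value) / se


def scaled_confidence_interval(coef, se, delta_x, critical=1.96):
    """Interval for coef * delta_x, the expected change in Y when X moves by delta_x."""
    point = coef * delta_x
    margin = critical * se * abs(delta_x)
    return point - margin, point + margin


def f_statistic_from_r2(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """Homoskedasticity-only F-statistic for q exclusion restrictions."""
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k_unrestricted - 1)
    return numerator / denominator


# (a) H0: the Age coefficient is zero.
t_age = t_statistic(AGE_COEF, AGE_SE)

# (b) Nasreen is 34 - 29 = 5 years older than Amna; all other regressors are assumed equal.
ci_low, ci_high = scaled_confidence_interval(AGE_COEF, AGE_SE, delta_x=34 - 29)

# (c) H0: the North, West and South coefficients are all zero (q = 3, k = 6 in column (3)).
f_regions = f_statistic_from_r2(
    R2_WITH_REGIONS, R2_WITHOUT_REGIONS, q=3, n=N_OBS, k_unrestricted=6
)

print(f"t(Age) = {t_age:.2f}")
print(f"95% CI for the age effect = ({ci_low:.2f}, {ci_high:.2f})")
print(f"F(regions) = {f_regions:.2f}")
```

The t-statistic would be compared with 1.96 and the F-statistic with the 5% critical value of F(3, infinity), which is about 2.60. Because the R2 values are rounded to three decimals, the F-statistic is only approximate. Part (d) cannot be computed from the table alone, since testing whether the North and South coefficients are equal requires the covariance between the two estimates or a re-estimated regression.
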
Q2. Consider the regression model Yi = β0 + β1X1i + β2X2i + β3(X1i × X2i) + β4X1i^2 + Ui. Derive an expression
for:
a. The value of X1 or X2 when the changes in Y due to X1 and X2 are identical. (4)
b. Assume β4 = 0. Show that the change in Y when X1 changes by ΔX1 and X2 changes by ΔX2 is
(β1 + β3X2)ΔX1 + (β2 + β3X1)ΔX2 + β3ΔX1ΔX2. (3)
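
One way to set up part (b) is to write Y after the change minus Y before the change and collect terms. The sketch below assumes β4 = 0, holds the error term U fixed, and drops the observation subscript i for readability.

```latex
\begin{aligned}
\Delta Y &= \bigl[\beta_0 + \beta_1 (X_1 + \Delta X_1) + \beta_2 (X_2 + \Delta X_2)
            + \beta_3 (X_1 + \Delta X_1)(X_2 + \Delta X_2)\bigr] \\
         &\quad - \bigl[\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2\bigr] \\
         &= \beta_1 \Delta X_1 + \beta_2 \Delta X_2
            + \beta_3 \bigl(X_2 \Delta X_1 + X_1 \Delta X_2 + \Delta X_1 \Delta X_2\bigr) \\
         &= (\beta_1 + \beta_3 X_2)\,\Delta X_1 + (\beta_2 + \beta_3 X_1)\,\Delta X_2
            + \beta_3\,\Delta X_1 \Delta X_2 .
\end{aligned}
```

The only step that needs care is expanding the product (X1 + ΔX1)(X2 + ΔX2), which produces the cross term β3ΔX1ΔX2.
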

Q3. The following equation measures labor force participation for 78 cities (standard errors in
parentheses):

Li = 94.2 - 0.24 Ui + 0.20 Ei - 0.691 Ii - 0.06 Si + 0.002 Ci - 0.80 Di
            (0.08)    (0.06)    (0.16)     (0.18)    (0.03)     (0.53)

N = 78, R2 = 0.51

where
Li = percent labor force participation (males aged 25-54) in the ith city
Ui = percent unemployment rate in the ith city
Ei = average earnings (hundreds of dollars/year) in the ith city
Ii = average other income (hundreds of dollars/year) in the ith city
Si = average schooling completed (years) in the ith city
Ci = percent of the labor force that is non-white in the ith city
Di = a dummy equal to 1 if the city is in the South and 0 otherwise

a. Interpret the estimated coefficients of C and D. What do they mean? (5)


b. How likely is perfect multicollinearity in this equation? Explain what leads to this concern in
the current setup. (5)
c. Suppose that you were told that the data for this regression were from one decade and that
the estimates on the data from another decade yielded a much different coefficient of the
dummy variable. Would this imply that one of the estimates would be biased? If not, why
not? If so, how would you determine which decade's estimate is biased?
(2)
d. Comment on the following statement: “I know that these results are not BLUE because the
average participation rate of 94.2 percent is way too high.” Do you agree or disagree?
(3)

Q4. A researcher plans to study the causal effect of police on crime using data from a random sample
of U.S. counties. He plans to regress the county’s crime rate on the (per capita) size of the
county’s police force.

i. Explain why this regression is likely to suffer from omitted variable bias. Which variables
would you add to the regression to control for important omitted variables? Would the
number of males in the population affect the crime rate? (3)

ii. Use your answer to (i) and the expression for omitted variable bias to determine whether
the regression is likely to over or underestimate the effect of police on crime rate.
(3)
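
For part (ii), a common textbook form of the omitted variable bias expression can be sketched as follows. The omitted variable Z, the symbols β2 and δ1, and the two-regressor setup are assumptions made here for illustration; the question itself does not name the omitted variable.

```latex
\begin{aligned}
\text{True model:} \quad & Crime = \beta_0 + \beta_1\,Police + \beta_2\,Z + u, \\
\text{Auxiliary regression:} \quad & Z = \delta_0 + \delta_1\,Police + v, \\
\text{Short regression slope:} \quad & \tilde\beta_1 = \hat\beta_1 + \hat\beta_2\,\tilde\delta_1, \\
\text{so that} \quad & E(\tilde\beta_1) = \beta_1 + \beta_2\,\delta_1 .
\end{aligned}
```

Under these assumptions the bias is β2δ1, so its sign is the product of the sign of the omitted variable's effect on crime and the sign of its association with police force size.
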
Q5. Consider the following regression of house price on number of bedrooms (BDR), number of
bathrooms (Bath), house size (Hsize), size of the lot (Lsize), age of the seller (Age) and
income status of the seller (Poor), where standard errors are shown below the coefficients in
parentheses:

Price_hat = 119.2 + 0.485 BDR + 23.4 Bath + 0.156 Hsize + 0.002 Lsize
            (23.9)  (2.61)      (8.94)      (0.011)       (0.00048)

            + 0.090 Age - 48.8 Poor
              (0.311)     (10.5)

R-sq = 0.72, SSR = 41.5, n = 450

i. Is the coefficient on BDR statistically different from 0? Typically, 5-bedroom houses sell for
more than 2-bedroom houses. Is this consistent with your finding? (3)

ii. A homeowner purchases an additional 2000 square feet from a neighboring lot. Construct
a 99% confidence interval for the change in the value of her house. (2)
iii. The R-sq from omitting BDR and Age from the regression is 0.71. Are BDR and Age jointly
statistically significant for explaining house prices? (4)
iv. How would you test if the effect of the number of bedrooms is the same as that of size of
the house? Explain what hypothesis you will test and how (Hint: You do not have to find a
test statistic, but describe all steps to explain how you would run this test).
(4)
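
For part (iv), one standard route is to rewrite the regression so that the restriction becomes a test on a single coefficient. The sketch below numbers the BDR and Hsize coefficients β1 and β3 and introduces γ as their difference; this numbering is an assumption made for illustration and does not appear in the question.

```latex
\begin{aligned}
H_0 &: \beta_1 = \beta_3 \quad \text{vs.} \quad H_1 : \beta_1 \neq \beta_3, \\
\beta_1\,BDR + \beta_3\,Hsize &= \gamma\,BDR + \beta_3\,(BDR + Hsize), \qquad \gamma = \beta_1 - \beta_3, \\
t &= \frac{\hat\gamma}{SE(\hat\gamma)}
   = \frac{\hat\beta_1 - \hat\beta_3}
          {\sqrt{\widehat{\operatorname{Var}}(\hat\beta_1) + \widehat{\operatorname{Var}}(\hat\beta_3)
                 - 2\,\widehat{\operatorname{Cov}}(\hat\beta_1, \hat\beta_3)}} .
\end{aligned}
```

Regressing price on BDR, the constructed variable (BDR + Hsize), and the remaining regressors gives the estimate of γ and its standard error directly, so the covariance term does not have to be computed by hand. The reported output alone is not enough to carry out the test, because the covariance between the two estimates is not shown.
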
-----------------------------------

Q6. Assume a simple linear regression in which assumptions MLR.1-4 hold. Show that, given that the
OLS slope estimator is consistent, OLS also provides a consistent estimator of the intercept
term. (5)
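
A possible outline of the argument is sketched below. It assumes random sampling with finite means, so that the law of large numbers applies to the sample averages, and it writes the model as Y = β0 + β1X + u with E(u) = 0.

```latex
\begin{aligned}
\hat\beta_0 &= \bar Y - \hat\beta_1 \bar X, \qquad
\bar Y = \beta_0 + \beta_1 \bar X + \bar u, \\
\hat\beta_0 &= \beta_0 + (\beta_1 - \hat\beta_1)\,\bar X + \bar u, \\
\operatorname{plim} \hat\beta_0 &= \beta_0
   + \bigl(\beta_1 - \operatorname{plim} \hat\beta_1\bigr)\,E(X) + E(u)
   = \beta_0 + 0 \cdot E(X) + 0 = \beta_0 .
\end{aligned}
```

The last line uses the consistency of the slope estimator together with the convergence of the sample means of X and u to their population means.
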
