

THE CHINESE UNIVERSITY OF HONG KONG, SHENZHEN

2022 - 2023 TERM 2

ECO 3121 Introductory Econometrics


ASSIGNMENT 1 ANSWERS

TOPIC: Simple linear regression model.

INSTRUCTIONS:
• Please label clearly each answer with the appropriate question number and letter. Securely
staple all answer sheets together, and make certain that your name(s) and student number(s)
are printed clearly at the top of each answer sheet.
• Please use STATA to do Question 1, and report your STATA commands and results
together with your answers to the questions.
• Hand-written answers must be legible. Illegible assignments will be returned unmarked.
• Please combine your answers with supporting documents into one Adobe PDF file and
submit.

DUE DATE: 5PM Friday February 24, 2023


Please submit your work on Blackboard. Late submissions will receive a 0 with no excuses.

MARKING: Marks for each question are indicated in parentheses. Total marks for the assignment
equal 90. Marks are given for both content and presentation.
Question 1 (25 marks)

Data file: 3121A1.dta (or 3121A1.csv)


Data Description: A random sample of 436 employees drawn from the 1976 U.S. population of
all employed paid workers.
Variable Definitions:
$\mathit{wage}_i$ = average hourly earnings of worker i in 1976, in dollars per hour.
$\mathit{educ}_i$ = years of formal education completed by worker i, in years.
$\mathit{female}_i$ = an indicator variable equal to 1 if worker i is female, and 0 if worker i is male.

(5 marks)
1. Compile a table of descriptive summary statistics for the sample data. The table should include
for each of the variables in the dataset: the sample mean, the sample standard deviation, the
minimum sample value, and the maximum sample value. How many females and how many
males are there in the sample?

(1 mark) per column in table, except Obs.


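. sum wage educ female

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |       436    6.051216    3.795647        .53         25
        educ |       436    12.67202    2.660956          0         18
      female |       436    .4380734    .4967202          0          1

. tab1 female, missing

-> tabulation of female

     female |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        245       56.19       56.19
          1 |        191       43.81      100.00
------------+-----------------------------------
      Total |        436      100.00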

Number of females in the sample = 191 (0.5 mark)


Number of males in the sample = 245 (0.5 mark)
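
The summary row for female can be cross-checked against the tabulation, because the mean of a 0/1 indicator is simply the share of ones. The sketch below is in Python rather than STATA and is only an illustration: it uses the two counts reported above and assumes the standard deviation is computed with the n − 1 divisor. It should reproduce the reported mean and standard deviation of female up to rounding.

```python
import math

# Counts taken from the tabulation of female above.
n_female = 191
n_male = 245
n = n_female + n_male

# The mean of a 0/1 indicator is the share of ones in the sample.
p = n_female / n

# Sample standard deviation of a 0/1 variable, using the n - 1 divisor.
sd = math.sqrt(n / (n - 1) * p * (1 - p))

print(f"n = {n}, mean = {p:.7f}, sd = {sd:.7f}")
```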

(20 marks)
2. Compute and present OLS estimates of the following population regression equation for the
full sample of 436 paid workers:

$\mathit{wage}_i = \beta_0 + \beta_1 \mathit{educ}_i + u_i$ (1)

where $u_i$ is a random error term that is assumed to satisfy all the assumptions of the classical
linear regression model.
(5 marks)
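a) Report the OLS coefficient estimates $\hat{\beta}_0$ and $\hat{\beta}_1$ computed by estimating population
regression equation (1).

. reg wage educ

      Source |       SS       df       MS              Number of obs =     436
-------------+------------------------------           F(  1,   434) =   88.48
       Model |  1061.27825     1  1061.27825           Prob > F      =  0.0000
    Residual |    5205.739   434  11.9947903           R-squared     =  0.1693
-------------+------------------------------           Adj R-squared =  0.1674
       Total |  6267.01726   435  14.4069362           Root MSE      =  3.4633

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5869922   .0624042     9.41   0.000     .4643401    .7096443
       _cons |   -1.38716    .807995    -1.72   0.087     -2.97523    .2009096
------------------------------------------------------------------------------

$\hat{\beta}_0 = -1.38716$ (2.5 marks)
$\hat{\beta}_1 = 0.5869922$ (2.5 marks)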
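
The summary statistics in the regression output are all functions of the sums of squares and the coefficient table, so they can be recomputed by hand. The following Python sketch (an illustration, not part of the required STATA work) copies the reported numbers and rebuilds R-squared, adjusted R-squared, the F statistic, Root MSE, and the two t statistics; the variable names are chosen here for readability only.

```python
import math

# Values copied from the regression output above.
ss_model, ss_resid, ss_total = 1061.27825, 5205.739, 6267.01726
df_model, df_resid = 1, 434
coef_educ, se_educ = 0.5869922, 0.0624042
coef_cons, se_cons = -1.38716, 0.807995

# Mean squares are sums of squares divided by their degrees of freedom.
ms_model = ss_model / df_model
ms_resid = ss_resid / df_resid
ms_total = ss_total / (df_model + df_resid)

r_squared = ss_model / ss_total
adj_r_squared = 1 - ms_resid / ms_total
f_stat = ms_model / ms_resid
root_mse = math.sqrt(ms_resid)

# Each t statistic is the coefficient estimate divided by its standard error.
t_educ = coef_educ / se_educ
t_cons = coef_cons / se_cons

print(f"R-squared = {r_squared:.4f}, Adj R-squared = {adj_r_squared:.4f}")
print(f"F = {f_stat:.2f}, Root MSE = {root_mse:.4f}")
print(f"t(educ) = {t_educ:.2f}, t(_cons) = {t_cons:.2f}")
```

In a simple regression the F statistic equals the square of the slope's t statistic, which is a further quick check on the table.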

(5 marks)
b) Interpret the value of the slope coefficient estimate $\hat{\beta}_1$; i.e., explain in words what the
numerical value of $\hat{\beta}_1$ means.

(Answer must not be just a generic description of the slope coefficient estimate; it must
explicitly account for the units in which wage and educ are measured.)

wage is measured in dollars per hour; educ is measured in years.

Therefore, the estimate $\hat{\beta}_1 = 0.5870$ means that a 1-year increase in education is
associated with an increase in average hourly wages equal to 0.5870 dollars per
hour. (5 marks)

(5 marks)
c) Interpret the value of the intercept coefficient estimate $\hat{\beta}_0$; i.e., explain in words what the
numerical value of $\hat{\beta}_0$ means.

The estimate $\hat{\beta}_0 = -1.3872$ means that the average (mean) hourly wage rate of workers
with zero years of education (educ = 0) equals −1.3872 dollars per hour. (5 marks)

(5 marks)
d) On a set of appropriately labeled coordinate axes, draw the estimated sample regression
function implied by OLS estimation of regression equation (1). That is, draw the graph of
the equation $\widehat{\mathit{wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \mathit{educ}_i$, compute the coordinates of the two points on it that
correspond to the values 12 and 16 of $\mathit{educ}_i$, and label these two points on your graph as A
and B respectively. (Note: you do not need to use STATA, or any software program, to
draw and label this graph.)

The two points have the following coordinates:

Point A: For $\mathit{educ}_i = 12$ years, the estimated mean of average hourly earnings equals:

$\widehat{\mathit{wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \mathit{educ}_i = -1.3872 + 0.5870(12) = 5.6568$ dollars per hour,
or about 5.66 dollars per hour (1 mark)

Point B: For $\mathit{educ}_i = 16$ years, the estimated mean of average hourly earnings equals:

$\widehat{\mathit{wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \mathit{educ}_i = -1.3872 + 0.5870(16) = 8.0048$ dollars per hour,
or about 8.00 dollars per hour (1 mark)

Figure 1: Line graph of $\widehat{\mathit{wage}}_i = \hat{\beta}_0 + \hat{\beta}_1 \mathit{educ}_i = -1.3872 + 0.5870\,\mathit{educ}_i$
(3 marks) total: 2 marks for correct line graph; 1 mark for labeling points A and B
[Figure not reproduced. Horizontal axis: educ = years of education (0 to 20); vertical axis ticks at 0, 5, and 10.]
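
The coordinates of points A and B follow from plugging educ = 12 and educ = 16 into the estimated regression line. A minimal Python sketch of that calculation, using the rounded coefficient estimates from the answer above (the function and constant names are illustrative only):

```python
# Rounded coefficient estimates used in the worked answer above.
B0_HAT = -1.3872
B1_HAT = 0.5870


def fitted_wage(educ: float) -> float:
    """Return the fitted mean hourly wage (dollars per hour) at a given educ."""
    return B0_HAT + B1_HAT * educ


# Points A and B on the sample regression line.
point_a = (12, fitted_wage(12))
point_b = (16, fitted_wage(16))

print(f"A: educ = {point_a[0]}, fitted wage = {point_a[1]:.4f}")
print(f"B: educ = {point_b[0]}, fitted wage = {point_b[1]:.4f}")
```

The rise between A and B divided by the 4-year run gives back the slope estimate, which is a simple way to confirm that both points lie on the same line.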
Question 2 (35 marks)

A researcher is using data for a sample of 88 houses sold in an urban area during a recent year to
investigate the relationship between house prices $y_i$ (measured in thousands of dollars) and house
size $x_i$ (measured in square meters). Preliminary analysis of the sample data produces the
following sample information:

$n = 88$
$\sum_{i=1}^{n} y_i = 25,832.05$
$\sum_{i=1}^{n} x_i = 16,462.34$
$\sum_{i=1}^{n} y_i^2 = 8,500,750.69$
$\sum_{i=1}^{n} x_i^2 = 3,329,789.6$
$\sum_{i=1}^{n} x_i y_i = 5,209,990.7$
$\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 377,534.76$
$\sum_{i=1}^{n} (y_i - \bar{y})^2 = 917,854.51$
$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 250,144.32$
$\sum_{i=1}^{n} \hat{u}_i^2 = 348,053.43$

Use the above sample information to answer all the following questions. Show explicitly all
formulas and calculations.

(12 marks)
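(a) Use the above information to compute OLS estimates of the intercept coefficient $\beta_0$ and the
slope coefficient $\beta_1$.

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{377,534.76}{250,144.32} = 1.509268 \approx 1.5093$$ (6 marks)

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{25,832.05}{88} = 293.546 \quad \text{and} \quad \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{16,462.34}{88} = 187.072$$

Therefore
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 293.546 - 1.509268 \times 187.072 = 293.546 - 282.342 = 11.204$$ (6 marks)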
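
The arithmetic in part (a) can be reproduced directly from the reported sums. The Python sketch below is only a check on the hand calculation; the variable names (`s_xy`, `s_xx`, and so on) are shorthand introduced here, not notation from the assignment.

```python
# Sample information reported in the question.
n = 88
sum_y = 25832.05
sum_x = 16462.34
s_xy = 377534.76  # sum of (x_i - x_bar) * (y_i - y_bar)
s_xx = 250144.32  # sum of (x_i - x_bar) ** 2

# Sample means of house price and house size.
y_bar = sum_y / n
x_bar = sum_x / n

# OLS slope and intercept estimates.
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

print(f"y_bar = {y_bar:.3f}, x_bar = {x_bar:.3f}")
print(f"beta1_hat = {beta1_hat:.4f}, beta0_hat = {beta0_hat:.3f}")
```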

(5 marks)
(b) Interpret the slope coefficient estimate you calculated in part (a) -- i.e., explain what the
numeric value you calculated for $\hat{\beta}_1$ means.

Note: $\hat{\beta}_1 = 1.5093$. $y_i$ is measured in thousands of dollars, and $x_i$ is measured in square
meters.

The estimate 1.5093 of $\hat{\beta}_1$ means that an increase (decrease) in house size of 1 square meter is
associated on average with an increase (decrease) in house price of 1.5093 thousand dollars,
or 1,509.3 dollars.
(6 marks)
(c) Calculate an estimate of $\sigma^2$, the error variance.

$$\hat{\sigma}^2 = \frac{RSS}{n-2} = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n-2} = \frac{348,053.43}{88-2} = 4,047.133$$
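
The error variance estimate divides the residual sum of squares by the degrees of freedom, n minus the two estimated coefficients. A short Python check of the calculation follows; the square root (the standard error of the regression, in thousands of dollars) is not asked for in the question and is included only as an extra illustration.

```python
import math

n = 88            # sample size
k = 2             # estimated coefficients: intercept and slope
rss = 348053.43   # residual sum of squares reported in the question

# Unbiased estimator of the error variance: RSS / (n - 2).
sigma2_hat = rss / (n - k)

# Its square root is the standard error of the regression.
sigma_hat = math.sqrt(sigma2_hat)

print(f"sigma2_hat = {sigma2_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```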

(6 marks)
(d) Compute the value of $R^2$, the coefficient of determination for the estimated OLS sample
regression equation. Briefly explain what the calculated value of $R^2$ means.

$$SSE = SST - SSR = \sum_{i=1}^{n} (y_i - \bar{y})^2 - \sum_{i=1}^{n} \hat{u}_i^2 = 917,854.51 - 348,053.43 = 569,801.08$$

Here SST is the total sum of squares, SSE is the explained sum of squares, and SSR is the residual
sum of squares (written RSS in part (c)).

$$R^2 = \frac{SSE}{SST} = \frac{569,801.08}{917,854.51} = 0.6208$$ (4 marks)

Interpretation of $R^2 = 0.6208$: The value of 0.6208 indicates that 62.08 percent of the total
sample variation in house prices is attributable to, or explained by, the model, i.e., by variation
in house size. (2 marks)
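
The decomposition SST = SSE + SSR gives two equivalent ways to compute the coefficient of determination, and in a simple regression the explained sum of squares can also be rebuilt from the slope estimate. The Python sketch below checks all three using the sums reported in the question; the slope-based figure should agree with the subtraction only up to rounding in the reported sums.

```python
# Sums reported in the question.
sst = 917854.51   # total sum of squares
ssr = 348053.43   # residual sum of squares
s_xy = 377534.76  # sum of (x_i - x_bar) * (y_i - y_bar)
s_xx = 250144.32  # sum of (x_i - x_bar) ** 2

# Explained sum of squares by subtraction.
sse = sst - ssr

# Two equivalent expressions for the coefficient of determination.
r2 = sse / sst
r2_alt = 1 - ssr / sst

# Cross-check: in simple regression, SSE = beta1_hat ** 2 * s_xx.
beta1_hat = s_xy / s_xx
sse_from_slope = beta1_hat ** 2 * s_xx

print(f"SSE = {sse:.2f}, R^2 = {r2:.4f}")
print(f"SSE rebuilt from the slope = {sse_from_slope:.2f}")
```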

(6 marks)
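(e) What are the values of $\sum_{i=1}^{n} \hat{u}_i$ and $\sum_{i=1}^{n} x_i \hat{u}_i$ for the sample regression equation you have
estimated? Explain briefly how you obtained your answer.

$\sum_{i=1}^{n} \hat{u}_i = 0$ (2 marks)

$\sum_{i=1}^{n} x_i \hat{u}_i = 0$ (2 marks)

These computational properties of the OLS sample regression equation follow from the first-order
conditions for the OLS coefficient estimators: setting the partial derivatives of the residual sum
of squares with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ equal to zero yields exactly these two equations, so they
hold for any OLS fit that includes an intercept. (2 marks)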
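
Because the house-price observations themselves are not given, the two properties can only be illustrated on other data. The Python sketch below fits a simple regression to a small made-up dataset (the five x and y values are hypothetical and are not the house sample) and confirms that both sums are zero up to floating-point error.

```python
# Hypothetical toy data, used only to illustrate the two properties.
x = [50.0, 80.0, 120.0, 150.0, 200.0]
y = [90.0, 140.0, 180.0, 260.0, 310.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# OLS residuals and the two sums implied by the first-order conditions.
residuals = [yi - beta0_hat - beta1_hat * xi for xi, yi in zip(x, y)]
sum_u = sum(residuals)
sum_xu = sum(xi * ui for xi, ui in zip(x, residuals))

print(f"sum of residuals = {sum_u:.2e}")
print(f"sum of x * residuals = {sum_xu:.2e}")
```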
Question 3 (30 marks)

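Derive the Ordinary Least Squares (OLS) estimates for the simple linear regression model, i.e., $\hat{\beta}_0$
and $\hat{\beta}_1$. Be very specific.

Deriving the OLS estimates

For the model $y_i = \beta_0 + \beta_1 x_i + u_i$, the OLS estimates minimize the residual sum of squares (RSS):

$$RSS(\hat{\beta}_0, \hat{\beta}_1) = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$

The first-order conditions (FOCs) for a minimum of the RSS function are obtained by setting the partial
derivatives equal to zero:

$$\frac{\partial RSS}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0, \qquad \frac{\partial RSS}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$

Dividing both conditions by $-2n$, we can get:

$$\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$ (1)

$$\frac{1}{n} \sum_{i=1}^{n} x_i (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$ (2)

To solve the equations, pass the summation operator through equation (1):

$$\bar{y} - \hat{\beta}_0 - \hat{\beta}_1 \bar{x} = 0$$

So

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

and plug this into equation (2) (and drop the division by n):

$$\sum_{i=1}^{n} x_i \left[ y_i - (\bar{y} - \hat{\beta}_1 \bar{x}) - \hat{\beta}_1 x_i \right] = 0$$

Simple algebra gives

$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})$$

Because $\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$,
if $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$ we can write

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$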
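
The closed-form solution can be sanity-checked numerically: at the OLS estimates the residual sum of squares should be no larger than at any nearby pair of coefficients. The Python sketch below is an illustration under stated assumptions, not part of the derivation. The helper names `ols_simple` and `rss` and the five data points are made up for the example, and the code uses the standard formulas $\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope for y = b0 + b1 * x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x has no sample variation, so the slope is undefined")
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def rss(b0, b1, x, y):
    """Residual sum of squares for a candidate intercept and slope."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, chosen only so the arithmetic is easy to follow.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_hat, b1_hat = ols_simple(x, y)
rss_min = rss(b0_hat, b1_hat, x, y)

# Moving either coefficient away from the OLS solution should raise the RSS.
perturbed = [
    rss(b0_hat + d0, b1_hat + d1, x, y)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.1, 0.0, 0.1)
    if (d0, d1) != (0.0, 0.0)
]

print(f"b0_hat = {b0_hat:.4f}, b1_hat = {b1_hat:.4f}, RSS = {rss_min:.4f}")
print("every perturbed RSS is larger:", all(r > rss_min for r in perturbed))
```

The guard on `s_xx` mirrors the condition needed in the derivation: the slope formula requires some sample variation in x.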
