Econ321 2017 Tutorial 1
Answer:
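• Both estimators are unbiased:
$$E(\bar{Y}) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \frac{1}{n}\left[E(Y_1) + E(Y_2) + \dots + E(Y_n)\right] = \frac{1}{n}\left[\mu + \mu + \dots + \mu\right] = \frac{1}{n}\, n\mu = \mu$$
$$E(\tilde{Y}) = \frac{1}{2}\sum_{i=1}^{2} E(Y_i) = \frac{1}{2}\left[\mu + \mu\right] = \frac{1}{2}\, 2\mu = \mu$$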
• $\bar{Y}$ is consistent by the Law of Large Numbers (recall the LLN).
$\tilde{Y}$ uses only the first two observations, so it does not change when we increase the sample size. It does not keep improving, so it cannot be a consistent estimator.
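$$\mathrm{Var}(\bar{Y}) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(Y_i) = \frac{1}{n^2}\left[\sigma^2 + \sigma^2 + \dots + \sigma^2\right] = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$
$$\mathrm{Var}(\tilde{Y}) = \frac{1}{4}\sum_{i=1}^{2}\mathrm{Var}(Y_i) = \frac{1}{4}\left[\sigma^2 + \sigma^2\right] = \frac{1}{4}\, 2\sigma^2 = \frac{\sigma^2}{2}$$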
So the variance of $\bar{Y}$ is smaller than the variance of $\tilde{Y}$ as long as n is larger than 2. In
this case the sample mean is more efficient.
$$\frac{\sigma^2}{n} < \frac{\sigma^2}{2} \iff n > 2$$
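To give a sense of the size of the efficiency gain, the ratio of the two variances derived above can be written out. The value n = 92 is only an illustration, borrowed from the sample size of the computer lab data later in this tutorial.

```latex
\[
\frac{\mathrm{Var}(\tilde{Y})}{\mathrm{Var}(\bar{Y})}
  = \frac{\sigma^2/2}{\sigma^2/n}
  = \frac{n}{2},
\qquad
\text{e.g. } n = 92 \;\Rightarrow\; \frac{\mathrm{Var}(\tilde{Y})}{\mathrm{Var}(\bar{Y})} = 46 .
\]
```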
2. Suppose $Y_i \sim NID(\mu, \sigma^2)$ where i = 1, 2, …, n. An estimator for $\sigma^2$ is given by
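$$\hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left((Y_i - \mu) - (\bar{Y} - \mu)\right)^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$$
where $y_i = Y_i - \mu$ and $\bar{y} = \bar{Y} - \mu$. Expanding the square:
$$\hat{\sigma}^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 + \sum_{i=1}^{n} \bar{y}^2 - 2\sum_{i=1}^{n} y_i \bar{y}\right]$$
$$\hat{\sigma}^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 + n\bar{y}^2 - 2n\bar{y}^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right]$$
since $\sum_{i=1}^{n} y_i = n\bar{y}$.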
Take the expectations of both sides and redefine things within the brackets:
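$$E(\hat{\sigma}^2) = \frac{1}{n-1}E\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i^2) - nE(\bar{y}^2)\right] = \frac{1}{n-1}\left[n\sigma^2 - n\frac{\sigma^2}{n}\right]$$
since $E(y_i^2) = E(Y_i - \mu)^2 = \sigma^2$, and:
$$E(\bar{y}^2) = E(\bar{Y} - \mu)^2 = \mathrm{Var}(\bar{Y}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\mathrm{Var}\left(\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}$$
Therefore:
$$E(\hat{\sigma}^2) = \frac{1}{n-1}\left[n\sigma^2 - \sigma^2\right] = \frac{1}{n-1}\left[(n-1)\sigma^2\right] = \sigma^2$$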
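If we start with $\tilde{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$ we end up at the same place as above, but the term
outside of the brackets is slightly different:
$$E(\tilde{\sigma}^2) = \frac{1}{n}\left[n\sigma^2 - n\frac{\sigma^2}{n}\right] = \frac{1}{n}\left[(n-1)\sigma^2\right] = \frac{(n-1)\sigma^2}{n}$$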
Thus, $\tilde{\sigma}^2$ is biased. Its expectation is slightly smaller than the true variance
(e.g., if n = 100, the expectation of $\tilde{\sigma}^2$ is 99% of the true variance). However, it is
consistent: as n goes to infinity the bias disappears and the variance of the estimator shrinks to zero.
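The bias result above, $E(\tilde{\sigma}^2) - \sigma^2 = -\sigma^2/n$, can be checked numerically. The following is a minimal Monte Carlo sketch in Stata, not part of the original tutorial: the program name `varsim`, the choice of N(5, 4) data, the sample size of 10 and the 5000 replications are all illustrative assumptions.

```stata
clear all
set seed 321

* Hypothetical helper: draw n = 10 observations from N(mu = 5, sigma^2 = 4)
* and return both variance estimators
program define varsim, rclass
    drop _all
    set obs 10
    generate y = rnormal(5, 2)          // second argument is the standard deviation
    quietly summarize y
    return scalar s2_unbiased = r(Var)                    // divides by n - 1
    return scalar s2_biased   = r(Var) * (r(N) - 1) / r(N) // divides by n
end

* Repeat the experiment and average each estimator over the replications
simulate s2_unbiased = r(s2_unbiased) s2_biased = r(s2_biased), ///
    reps(5000) nodots: varsim
summarize s2_unbiased s2_biased
```

With these settings the average of `s2_unbiased` should be close to the true variance of 4, while the average of `s2_biased` should be close to (9/10) × 4 = 3.6. Raising the number of observations in `set obs` shrinks the gap, in line with the consistency argument.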
COMPUTER LAB
The purpose of this lab is to review basic OLS using the econometric software package
STATA.
It is expected that you are already familiar with Stata.
You should see in the ‘Variables’ area four variables (date, obs, ur and lfpr). The first
is the quarter and year (ranging from Mar-86 to Dec-08). The second is the observation
number (ranging from 1 to 92). The third is the official Unemployment Rate: the
percentage of the labour force actively seeking and available for work, based on the
quarterly Household Labour Force Survey (HLFS). The fourth is the official Labour Force
Participation Rate: the percentage of the working-age population (15+) who are either
employed or unemployed, also from the HLFS.
At any point you can ‘look’ at these data, either by examining all of the inputted data
or by computing summary statistics. To list every observation, type:
list
(Note: you’ll need to hit the ‘enter key’ several times to scroll through these data.)
To produce summary statistics, type:
summarize
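Variable |  Obs       Mean   Std. Dev.    Min    Max
---------+-------------------------------------------
date     |    0
obs      |   92       46.5   26.70206      1     92
ur       |   92   6.094565     2.0826    3.4   10.9
lfpr     |   92   65.81848   1.575983   63.3   69.3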
The date variable is a string variable, and cannot be summarised in a numerical table.
The Unemployment Rate ranges from a low of 3.4% to a high of 10.9%, and has an
average slightly higher than 6.09% over the sample period. The average Labour
Force Participation Rate is nearly 65.82%.
• These are time series data for the New Zealand economy over this 92-quarter period
(23 years). There are some advantages in telling STATA that these are time series
data (you’ll see one in a minute). To do this we need to create an index for the
quarterly data (call it ‘qtr’) measured from a baseline of qtr=0 indicating 1960q1 (this is set
arbitrarily in STATA), so that the first observation, 1986q1, has qtr=104. Type the following:
generate qtr=obs+103
tsset qtr, quarterly
The variable called ‘qtr’ now has a number and format (%tq – see the Variables area)
that recognises the date. Using the list command you can see that this last column
matches the first column (just a different date format).
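The offset of 103 can be sanity-checked against Stata's own quarterly date functions. The short sketch below is an addition to the tutorial and relies only on the built-in `tq()` function and the `%tq` display format.

```stata
* 1986q1 (Mar-86, obs = 1) should equal 1 + 103 = 104 on Stata's quarterly scale
display tq(1986q1)      // prints 104

* The last observation (obs = 92) has qtr = 92 + 103 = 195, which should be Dec-08
display %tq 195         // prints 2008q4
```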
• Here’s one advantage of this time series format. We can plot the values of the two
labour market variables over our entire sample period by typing:
The data points are the solid dots in the diagram, and the term ‘connected’ connects
the dots. (Note: replacing ‘connected’ with ‘line’ would remove the dots and
replacing it with ‘scatter’ would remove the line in this command.) The UR has
recently increased after a long expansion period, while the LFPR continues to rise.
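The plotting command itself does not appear in this copy of the handout. A plausible form, assuming the variable names used above and the ‘connected’ plot type that the text describes, would be:

```stata
* Assumes the lab dataset is in memory and qtr has been created and tsset as above
graph twoway (connected ur qtr) (connected lfpr qtr)
```

Because the data have been declared as time series, `tsline ur lfpr` should give a similar line plot without naming the time variable explicitly.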
• Time to do some simple regression analysis. Let’s say that we suspect that the
aggregate participation rates depend on the state of the labour market in general and
the unemployment rate in particular. The model we have in mind looks like this:
$$lfpr_t = \beta_1 + \beta_2 ur_t + u_t$$
To estimate this model by OLS, type:
regress lfpr ur
• For two-variable regression models like this, it’s often helpful to show the scatter
diagram. This can be done by typing the following:
[Figure: scatter diagram of lfpr against ur (horizontal axis: ur)]
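The scatter command is missing from this copy. Judging by the fitted-line command given in the next step, it was probably along the following lines; treat the exact form as an assumption.

```stata
* Assumes the lab dataset (with lfpr and ur) is in memory
graph twoway (scatter lfpr ur)
```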
• But we can also easily draw the regression line estimated above by typing:
graph twoway (scatter lfpr ur) (lfit lfpr ur)
where ‘lfit’ stands for line fit.
[Figure: scatter diagram of lfpr against ur with the fitted regression line (horizontal axis: ur)]
• By the way, if you want to produce heteroscedasticity robust standard errors on your
coefficient estimates, you need to modify this regression command slightly.
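The modified command is not reproduced in this copy. In Stata, heteroscedasticity-robust standard errors are requested with the `vce(robust)` option, so a sketch of the modified regression, assuming the same two variables as before, is:

```stata
* Same regression as above, with heteroscedasticity-robust (Huber/White) standard errors
* Assumes the lab dataset is in memory
regress lfpr ur, vce(robust)
```

The shorter option `robust` is accepted as an equivalent. The coefficient estimates are unchanged; only the standard errors, t statistics and confidence intervals differ.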
[Stata output: coefficient table for lfpr with robust standard errors (columns: Coef., Robust Std. Err., t, P>|t|, 95% Conf. Interval)]
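The commands from here on refer to a variable `ur2` whose construction does not appear in this copy. A minimal sketch, assuming `ur2` is simply the square of the unemployment rate (which matches the $2\delta_3 ur_t$ term in the derivative used further on):

```stata
* Assumption: ur2 is the squared unemployment rate
* Assumes the lab dataset is in memory
generate ur2 = ur^2
```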
summarize ur ur2
And you can see below that these regressors will be highly correlated (not
surprisingly), which raises the issue of multicollinearity. However, it’s easy to
confirm that the collinearity isn’t perfect:
corr ur ur2
(obs=92)

     |      ur     ur2
-----+-----------------
ur   |  1.0000
ur2  |  0.9877  1.0000
[Stata output: coefficient table for lfpr on ur and ur2 with robust standard errors (columns: Coef., Robust Std. Err., t, P>|t|, 95% Conf. Interval)]
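The command behind this output is not shown in this copy. Assuming the quadratic specification $lfpr_t = \delta_1 + \delta_2 ur_t + \delta_3 ur_t^2 + u_t$ that the derivative in the next step implies, and robust standard errors as indicated by the output header, it would look like:

```stata
* Quadratic specification with robust standard errors
* Assumes the lab dataset is in memory and ur2 holds the squared unemployment rate
regress lfpr ur ur2, vce(robust)
```

With 92 observations and three estimated coefficients this leaves 89 residual degrees of freedom, which agrees with the F(1, 89) statistics reported in the tests further on.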
• Interpreting the estimated impact of the unemployment rate on the labour force
participation rate is a little tricky. It helps to use calculus. Take the derivative of the
dependent variable with respect to the explanatory variable:
$$\frac{\partial lfpr_t}{\partial ur_t} = \delta_2 + 2\delta_3 ur_t$$
• This ‘marginal effect’ of a one percentage-point change in the unemployment rate on
participation depends on $\delta_2$, $\delta_3$ and the current level of unemployment;
it is a linear function of $ur_t$.
Replace the actual coefficients with their estimates in this sample, and ‘evaluate’ this
derivative at the sample mean for the unemployment rate:
$$\left.\frac{\partial lfpr_t}{\partial ur_t}\right|_{ur_t = \overline{ur}} = d_2 + 2 d_3 \overline{ur} \approx -1.899395 + 2(0.0911841)(6.094565) \approx -0.78794$$
o This says that the marginal effect is larger in absolute value at the mean
unemployment rate than our earlier estimate. More importantly, it says that this
marginal effect depends on the current unemployment rate.
For example, in a very tight labour market (e.g., unemployment at 3.4%), the effect of
a one percentage-point rise in the unemployment rate is substantial:
$$\left.\frac{\partial lfpr_t}{\partial ur_t}\right|_{ur_t = 3.4} \approx -1.899395 + 2(0.0911841)(3.4) \approx -1.2793$$
In a very loose labour market (e.g., unemployment at 10.9%), the effect of a one
percentage-point rise in the unemployment rate is actually positive:
$$\left.\frac{\partial lfpr_t}{\partial ur_t}\right|_{ur_t = 10.9} \approx -1.899395 + 2(0.0911841)(10.9) \approx 0.0884184$$
We can also determine the ‘breakeven point’ at which the direction of this relationship changes. Set
the derivative equal to zero and solve for the unemployment rate:
$$\frac{\partial lfpr_t}{\partial ur_t} \approx -1.899395 + 2(0.0911841)\,ur_t = 0 \quad\Rightarrow\quad ur_t \approx 10.4152$$
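The breakeven point can also be computed directly from the stored estimates, together with a delta-method standard error. This is an optional sketch using Stata's `nlcom` command; the label `breakeven` is an illustrative name, and the command must follow the quadratic regression of lfpr on ur and ur2.

```stata
* Breakeven unemployment rate: the value of ur at which d2 + 2*d3*ur = 0
* Run immediately after the quadratic regression of lfpr on ur and ur2
nlcom (breakeven: -_b[ur] / (2 * _b[ur2]))
```

The point estimate should reproduce the hand calculation above; the confidence interval gives a sense of how precisely the turning point is estimated.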
• You can also test whether or not these marginal effects are equal to zero at different
current unemployment rates. After you estimate the regression above, these Wald
tests can be called with the following command:
test ur+2*ur2*6.094565=0
The terms ‘ur’ and ‘ur2’ refer to coefficient estimates on these variables. The testing
procedure will do the algebra for you. You should get:
( 1) ur + 12.18913 ur2 = 0
F( 1, 89) = 457.61
Prob > F = 0.0000
The test on the marginal effect in the tight labour market is written:
test ur+2*ur2*3.4=0
( 1) ur + 6.8 ur2 = 0
F( 1, 89) = 175.48
Prob > F = 0.0000
The test on the marginal effect in the loose/depressed labour market is written:
test ur+2*ur2*10.9=0
( 1) ur + 21.8 ur2 = 0
F( 1, 89) = 0.94
Prob > F = 0.3343
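The `test` command reports only the F statistic. To see the estimated marginal effect itself along with its standard error and confidence interval, `lincom` can be applied to the same linear combination. This is a supplementary sketch and assumes the quadratic regression is the most recent estimation command.

```stata
* Marginal effect of ur on lfpr evaluated at ur = 10.9, i.e. d2 + 2*d3*10.9
* Run immediately after the quadratic regression of lfpr on ur and ur2
lincom ur + 21.8*ur2
```

For a single linear restriction, the square of the t statistic reported by `lincom` equals the F statistic reported by `test`.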