
ETC2410 Introductory Econometrics

Huize Zhang

27 May 2019
A little bit about me

- Honours | Econometrics
- Summer Research for Tennis Australia
- Worked in the Predictive Analytics Group
What do we do in ETC2410?

- We talk about an estimation technique called Ordinary Least Squares
  (OLS).
- There are also many other techniques, e.g. MLE, GMM, and other
  techniques in statistical learning.
What do we learn about OLS?

- Mechanism
- Assumptions
- Properties
- Inference based on OLS
Mechanism

Regression in matrix notation

y = Xβ + u
Minimising the sum of squared errors gives the normal equations

X′û = 0

and solving for the estimator yields

β̂ = (X′X)⁻¹X′y
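
A minimal NumPy sketch of this closed-form estimator, using simulated
data purely for illustration:

```python
import numpy as np

# Simulated data, purely for illustration
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant + 2 regressors
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

# beta_hat = (X'X)^{-1} X'y, computed via a linear solve for numerical stability
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to beta_true
```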
Assumptions of OLS

- Linearity: linear in parameters
- Randomness: random sample
- No perfect collinearity
- Zero conditional mean:

  E(u|X) = 0

- Homoskedasticity:

  Var(u|X) = σ²Iₙ
Properties (under assumptions)

- Unbiased estimator, under assumptions 1, 2, 3 and 4:

  E(β̂) = β

- BLUE (Best Linear Unbiased Estimator), under all 5 assumptions:

  Var(β̂) = σ²(X′X)⁻¹
Properties: Proof

E(β̂) = β

Proof:

E(β̂) = E[(X′X)⁻¹X′y]
     = E[(X′X)⁻¹X′(Xβ + u)]
     = E[(X′X)⁻¹X′Xβ + (X′X)⁻¹X′u]
     = E[β + (X′X)⁻¹X′u]
     = β + E[(X′X)⁻¹X′u]
     = β + (X′X)⁻¹X′E[u|X]    (conditioning on X)
     = β                       (assumption 4: zero conditional mean)
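
The proof can also be checked by simulation: averaging β̂ over many
samples drawn with E(u|X) = 0 should recover β. A minimal sketch, with
simulated data:

```python
import numpy as np

# Monte Carlo check of unbiasedness: average beta_hat over many samples
rng = np.random.default_rng(1)
n, reps = 50, 5000
beta_true = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # X held fixed across replications

estimates = np.empty((reps, 2))
for r in range(reps):
    u = rng.normal(size=n)                 # E(u|X) = 0 by construction
    y = X @ beta_true + u
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(estimates.mean(axis=0))  # close to [1.0, 2.0]
```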
Inference based on OLS

Note: up until now, we haven't made any distributional assumption
about OLS (you don't need a distributional assumption to obtain the
properties of OLS above).

- Normality:

  u ∼ N(0, σ²Iₙ)

- This gives

  β̂ ∼ N(β, σ²(X′X)⁻¹)

Then we can make inferences (hypothesis tests + confidence intervals)
based on this distribution of β̂.
Hypothesis testing: three scenarios

- single parameter & single restriction: t-test
- multiple parameters & multiple restrictions: F-test
- multiple parameters & single restriction: reparameterisation
Test a hypothesis about a single restriction: t-test

(β̂j − βj) / se(β̂j) ∼ tn−k−1

Five steps in hypothesis testing (put this on your cheat sheet):

- step 1: H0: ...; H1: ...
- step 2: distribution under H0:

  T = (β̂j − βj) / se(β̂j) ∼ tn−k−1

- step 3: t_calc = ...; t_crit = ...
- step 4: reject H0 if |t_calc| > t_crit (for a two-sided test)
- step 5: based on the calculated value and critical value, we
  reject / do not reject the null and conclude that [put in context];
  a worked numerical sketch follows
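
A minimal numerical sketch of steps 2-4, using hypothetical values for
β̂j, its standard error, n and k (all made up, purely for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical numbers, purely for illustration:
# test H0: beta_j = 0 against H1: beta_j != 0 at the 5% level
beta_hat_j, se_j = 0.35, 0.12      # estimated coefficient and its standard error
n, k = 100, 4                      # sample size and number of regressors

t_calc = (beta_hat_j - 0.0) / se_j
t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - k - 1)  # two-sided critical value

print(t_calc, t_crit)
if abs(t_calc) > t_crit:
    print("reject H0")
else:
    print("do not reject H0")
```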
Test a hypothesis about multiple restrictions: F-test

Formula 1:

F = [(SSR_r − SSR_ur)/q] / [SSR_ur/(n − k − 1)] ∼ Fq,n−k−1

- unrestricted model: the original model:

  y = β0 + β1x1 + β2x2 + β3x3 + β4x4

- restricted model: the original model after imposing the restrictions
  in H0
  - e.g. H0: β1 = β2 = 0
  - the restricted model is then y = β0 + β3x3 + β4x4
- q is the number of restrictions, i.e. the number of "=" signs in
  your H0
Test a hypothesis about multiple parameters: F-test

Formula 2: overall significance

If the restrictions cover all the βs (except β0), then the formula
simplifies to

F = [R²/k] / [(1 − R²)/(n − k − 1)] ∼ Fk,n−k−1

since R² = 1 − SSR_ur/SST and the restricted (intercept-only) model
has SSR_r = SST.

- choose the formula based on the information given; a computational
  sketch of both formulas follows
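
A minimal sketch computing both versions of the F statistic, using
hypothetical values for the SSRs, R², n, k and q:

```python
from scipy import stats

# Hypothetical values, purely for illustration
n, k, q = 100, 4, 2
SSR_r, SSR_ur = 120.0, 100.0

# Formula 1: from restricted and unrestricted SSR
F1 = ((SSR_r - SSR_ur) / q) / (SSR_ur / (n - k - 1))

# Formula 2: overall significance, from the R^2 of the full model
R2 = 0.45
F2 = (R2 / k) / ((1 - R2) / (n - k - 1))

F_crit = stats.f.ppf(1 - 0.05, dfn=q, dfd=n - k - 1)
print(F1, F2, F_crit)
```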
A special case: single restriction with multiple parameters

y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + u

H0: β1 = β2

transform to

H0: δ = 0, where δ = β1 − β2

Since δ = β1 − β2, we have β1 = δ + β2, so the original model
y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + u becomes

y = β0 + (δ + β2)x1 + β2x2 + β3x3 + β4x4 + u

y = β0 + δx1 + β2(x1 + x2) + β3x3 + β4x4 + u

Regress y on x1, x1 + x2, x3, x4 and use the estimated coefficient and
standard error of x1 to conduct a t-test.
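
A minimal simulated sketch of this reparameterisation (data and
parameter values are made up for illustration); the coefficient on x1
in the transformed regression is δ̂:

```python
import numpy as np

# Sketch of the reparameterisation for H0: beta1 = beta2 (simulated data)
rng = np.random.default_rng(2)
n = 200
x1, x2, x3, x4 = rng.normal(size=(4, n))
y = 1.0 + 0.8 * x1 + 0.8 * x2 + 0.3 * x3 - 0.2 * x4 + rng.normal(size=n)

# Regress y on a constant, x1, (x1 + x2), x3, x4;
# the coefficient on x1 is then delta = beta1 - beta2
X = np.column_stack([np.ones(n), x1, x1 + x2, x3, x4])
coef = np.linalg.solve(X.T @ X, X.T @ y)
print(coef[1])  # delta_hat, close to 0 since beta1 = beta2 here
```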
Interval

- Confidence interval for a parameter:

  β̂ ± tn−k−1(α) · se(β̂)

- Prediction interval for y:

  ŷ ± tn−k−1(α) · se(ê)

  where

  se(ê) = √(σ̂² + [se(ŷ)]²)
Prediction Interval

Understand why you can't just use se(ŷ):

- the prediction of y includes two sources of uncertainty: the error of
  the regression, σ̂², and the variation of the estimate, se(ŷ)
Functional Form

- Log transformation:

  Form               change of x          change of y
  log-level (y-x)    unit change          percentage change
  level-log (y-x)    percentage change    unit change
  log-log            percentage change    percentage change

  Always remember: "controlling for all other variables"
- Quadratic term
  - what's the turning point? (find the maximum or minimum): for
    y = β0 + β1x + β2x² + u, the turning point is at x* = −β̂1/(2β̂2)
Model Selection Criteria

- R² always increases as the number of parameters increases, so it is
  not a good selection criterion!
- Adjusted R² | AIC | BIC | HQ
- prefer the model with the larger Adjusted R² but the smaller AIC,
  BIC and HQ; the last three add a penalty to SSR, and BIC adds the
  largest penalty of the three
- the criteria may conflict; choose the best model considering all of
  them
Dummy Variables

- Gender dummy
- Name the dummy after the category that takes the value 1: female is
  a good name; gender is a bad one
- Simple dummy interpretation: bring in the context!
  - controlling for all other variables, females on average earn less
    than their male counterparts by [whatever]%; don't say "a unit
    increase in the female dummy"!!!
Dummy Variables: interaction term

lwage = β0 + β1 female + β2 educ + β3 female·educ + u

When female = 0:

lwage = β0 + β2 educ

When female = 1:

lwage = (β0 + β1) + (β2 + β3) educ


Dummy variable trap

- If you have multiple dummy variables, e.g. yr2010, yr2011, yr2012,
  yr2013, yr2014, only include 4 out of the 5 in the regression
- This is because every observation falls into exactly one of the
  categories, so yr2010 + yr2011 + yr2012 + yr2013 + yr2014 = 1,
  which is perfectly collinear with the intercept
- Interpret relative to the base level (usually the smallest)
  - [number] [more or less] than the base level
Heteroskedasticity

- Definition and Consequence
- Detection - BP test / White test
- Correction
Definition and Consequence

Definition of HTSK

- The variances of the errors are not all equal
- In other words, the diagonal elements of the variance-covariance
  matrix are not all the same

Consequence

- still unbiased
- no longer BLUE, thus no longer efficient; the usual OLS standard
  errors are incorrect
- t/F tests are inaccurate
Detection: BP test or White test

Step 1: null and alternative hypotheses

H0: Var(u|x1, x2, ..., xk) = σ²

H1 (BP): Var(u|x1, x2, ..., xk) = δ0 + δ1z1 + δ2z2 + ... + δqzq

H1 (White): Var(u|x1, x2, ..., xk) is a function of x1, x2, ..., xk

where z1, z2, ..., zq is a subset of x1, x2, ..., xk


Detection: BP test or White test

Step 2: original and auxiliary regressions

1. Original regression:
   Regress y on x1, x2, ..., xk to get the residuals; denote them ûi
2. Auxiliary regression:
   (BP) Regress ûi² on a constant and z1, z2, ..., zq; denote the R²
   as R²_û²
   (White) Regress ûi² on a constant, x1, x2, ..., xk, their squares
   x1², x2², ..., xk², and their cross-products; denote the R² as R²_û²
   (White, simplified) Regress ûi² on a constant, ŷi and ŷi²; denote
   the R² as R²_û²
Detection: BP test or White test

Step 3: distribution under the null

n · R²_û² ∼ χ²(q) asymptotically

Step 4: rejection criteria: reject H0 if the calculated value is larger
than the critical value, and conclude there is HTSK. A sketch of the BP
version follows.
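
A minimal NumPy sketch of the BP version of this n·R² test, on
simulated data where the error variance grows with x (the helper
r_squared is ours, not a library function):

```python
import numpy as np
from scipy import stats

def r_squared(y, X):
    """R^2 from regressing y on X (X includes a constant column)."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

# Simulated heteroskedastic data, purely for illustration
rng = np.random.default_rng(3)
n = 300
x = rng.uniform(1, 5, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * x   # error sd grows with x

# Original regression and its residuals
X = np.column_stack([np.ones(n), x])
u_hat = y - X @ np.linalg.solve(X.T @ X, X.T @ y)

# BP auxiliary regression: u_hat^2 on a constant and z (here z = x)
R2_aux = r_squared(u_hat**2, X)
bp_calc = n * R2_aux
bp_crit = stats.chi2.ppf(0.95, df=1)   # q = 1 restriction
print(bp_calc, bp_crit)                # bp_calc should exceed bp_crit
```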
Correction

- Robust standard errors
- Transformation, if the variance is of a particular form:

  Var(ui|xi1, xi2, ..., xik) = σ²hi

  (1/hi) · Var(ui|xi1, xi2, ..., xik) = σ²

  Var(ui/√hi | xi1, xi2, ..., xik) = σ²

  This shows that if ui becomes ui/√hi, then the error is
  homoskedastic.
Correction

y = β0 + β1xi1 + β2xi2 + ... + βkxik + ui

Therefore, we regress yi/√hi on 1/√hi, xi1/√hi, xi2/√hi, ..., xik/√hi;
the model then looks like

yi/√hi = β0(1/√hi) + β1(xi1/√hi) + β2(xi2/√hi) + ... + βk(xik/√hi) + ui/√hi

and your error will be homoskedastic :)
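
A minimal sketch of this transformation (weighted least squares),
assuming the variance function hi = xi is known, on simulated data:

```python
import numpy as np

# WLS by transformation, assuming Var(u_i|x_i) = sigma^2 * h_i with h_i = x_i
rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1, 5, size=n)
h = x                                       # assumed variance function
y = 1.0 + 2.0 * x + rng.normal(size=n) * np.sqrt(h)

# Divide y, the constant, and x by sqrt(h_i), then run OLS
w = 1 / np.sqrt(h)
Xw = np.column_stack([w, x * w])            # transformed constant and regressor
yw = y * w
beta_wls = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)
print(beta_wls)  # close to [1.0, 2.0]; the transformed errors are homoskedastic
```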
Serial Correlation

- Definition and Consequence
- Detection - BG test
- Correction
Definition and Consequence

Definition

- Errors in different periods are correlated with each other

Consequence

- It affects the variance-covariance matrix: the off-diagonal elements
  are not all zeros, which implies that

  Var(u|X) ≠ σ²Iₙ

- OLS estimation remains unbiased, but it is not BLUE (not best, not
  efficient)
Detection - BG test

- Step 1: Set up the structural equation and the auxiliary equation
  (choosing the number of lags):

  yt = β0 + β1xt1 + β2xt2 + ... + βkxtk + ut

  ut = ρ1ut−1 + ρ2ut−2 + ρ3ut−3 + et

- Step 2: null and alternative hypotheses:

  H0: ρ1 = ρ2 = ρ3 = 0

  H1: at least one of the ρj ≠ 0

- Step 3: Estimate the original equation by OLS and obtain the OLS
  residuals ût
Detection - BG test

- Step 4: Estimate the auxiliary regression by OLS, obtain its R², and
  calculate BG_calc = (n − q)R²_û
- Step 5: Distribution of the BG statistic under H0:

  BG = (n − 3)R²_û ∼ χ²(3) asymptotically

- Step 6: Rejection rule: reject the null if BG_calc > BG_crit and
  conclude there is serial correlation in the original regression. If
  no adjustment is made, the estimators will be unbiased but not
  efficient. A sketch of the whole procedure follows.
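
A minimal NumPy sketch of the six steps, following the slide's
auxiliary regression (3 lags of the residual, plus a constant), on
simulated AR(1) errors:

```python
import numpy as np
from scipy import stats

# BG test sketch with q = 3 lags, on data with serially correlated errors
rng = np.random.default_rng(5)
n = 400
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):                 # AR(1) errors: serial correlation present
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

# Step 3: original regression, collect residuals
X = np.column_stack([np.ones(n), x])
u_hat = y - X @ np.linalg.solve(X.T @ X, X.T @ y)

# Step 4: regress u_hat_t on a constant and u_hat_{t-1}, u_hat_{t-2}, u_hat_{t-3}
q = 3
Z = np.column_stack(
    [np.ones(n - q)] + [u_hat[q - j - 1 : n - j - 1] for j in range(q)]
)
v = u_hat[q:]
rho = np.linalg.solve(Z.T @ Z, Z.T @ v)
resid = v - Z @ rho
R2 = 1 - resid @ resid / ((v - v.mean()) @ (v - v.mean()))

# Steps 5-6: compare (n - q) * R^2 with the chi-squared critical value
bg_calc = (n - q) * R2
bg_crit = stats.chi2.ppf(0.95, df=q)
print(bg_calc, bg_crit)               # bg_calc should exceed bg_crit
```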
Correction

- Solution 1: HAC standard errors: the Heteroskedasticity and
  Autocorrelation Consistent estimator
- Solution 2: Estimate by GLS (FGLS)
- Solution 3: Dynamic model
Dynamic Model

- Include lags of y on the RHS of the model
- Example: AR models
Assumptions under time series setting

Assumption 1: Weakly Stationary

If a series is weakly stationary (or simply stationary), it has the
properties:
- E(yt) = µ for all t
- Var(yt) = γ0 for all t
- Cov(yt, yt−j) = γj for all t
Note: the third property means that the covariance between yt and
yt−j depends only on the time interval separating them, not on the
time itself

Assumption 2: White Noise

- E(yt) = 0 for all t
- Var(yt) = σ² for all t
- Cov(yt, yt−j) = 0 for all t and j ≠ 0
AR models

AR(p) model:

yt = φ0 + φ1yt−1 + φ2yt−2 + · · · + φpyt−p + ut

AR(1) model:

yt = φ0 + φ1yt−1 + ut

Stationarity restriction

|φ1| < 1

Repeated substitution shows why:

yt = φ0 + φ1yt−1 + ut
   = φ0 + φ1(φ0 + φ1yt−2 + ut−1) + ut
   = φ0(1 + φ1) + φ1²yt−2 + ut + φ1ut−1
   = φ0(1 + φ1) + φ1²(φ0 + φ1yt−3 + ut−2) + ut + φ1ut−1
   = · · · + φ1³yt−3 + · · ·
   = · · · + φ1ᵖyt−p + · · ·

The coefficient on yt−p is φ1ᵖ, which dies out as p grows only when
|φ1| < 1.
Deduction of E(yt)

Given that yt is a stationary series and ut is white noise, derive the
expressions for E(yt), Var(yt) and Cov(yt, yt−j).

E(yt) = φ0 / (1 − φ1)

because

E(yt) = E(φ0) + E(φ1yt−1) + E(ut)

E(yt) = φ0 + φ1E(yt−1) + E(ut)

Because of stationarity property 1, E(yt) = E(yt−1)
Because of white noise property 1, E(ut) = 0
Therefore,

E(yt) = φ0 + φ1E(yt)

E(yt) = φ0 / (1 − φ1)
Deduction of Var(yt)

Var(yt) = σ² / (1 − φ1²)

because

Var(yt) = Var(φ0) + Var(φ1yt−1) + Var(ut)

(the cross term vanishes since Cov(yt−1, ut) = 0)

Var(yt) = φ1²Var(yt−1) + Var(ut)

Because of stationarity property 2, Var(yt) = Var(yt−1)
Because of white noise property 2, Var(ut) = σ²
Therefore,

Var(yt) = φ1²Var(yt) + σ²

Var(yt) = σ² / (1 − φ1²)
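
These two formulas can be checked by simulating a long stationary
AR(1) path and comparing sample moments with the theoretical values
(the parameter values here are arbitrary):

```python
import numpy as np

# Simulate a stationary AR(1) and compare sample moments with the formulas
rng = np.random.default_rng(6)
phi0, phi1, sigma = 0.5, 0.8, 1.0
n = 100_000

y = np.zeros(n)
y[0] = phi0 / (1 - phi1)                  # start at the stationary mean
for t in range(1, n):
    y[t] = phi0 + phi1 * y[t - 1] + rng.normal(scale=sigma)

print(y.mean(), phi0 / (1 - phi1))        # E(y_t)   = phi0 / (1 - phi1)
print(y.var(), sigma**2 / (1 - phi1**2))  # Var(y_t) = sigma^2 / (1 - phi1^2)
```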
Unit root - what if our series is not stationary: the random walk

yt = yt−1 + ut

Assuming ut is still WN (and y0 = 0), a random walk series has the
properties

E(yt) = 0

Var(yt) = tσ²
Theory w.r.t OLS

- WLLN: as n goes to infinity, the sample mean (1/n) Σᵢ yᵢ, which is ȳ
  in OLS, converges in probability to E(y)
- CLT:

  √n(Ȳn − µ)/σ →d N(0, 1)

  This is a limiting result: the CLT is a theorem about n going to
  infinity. If n is a large but finite number, then we can rearrange
  it to get the approximation

  Ȳn ∼ N(µ, σ²/n) asymptotically
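
A minimal sketch of the CLT approximation: standardised sample means
of a skewed Exponential(1) distribution (mean 1, variance 1) behave
like N(0, 1) draws; all numbers here are illustrative:

```python
import numpy as np

# CLT sketch: standardised sample means of a skewed distribution look N(0, 1)
rng = np.random.default_rng(7)
mu, sigma = 1.0, 1.0                      # mean and sd of an Exponential(1)
n, reps = 500, 10_000

samples = rng.exponential(scale=1.0, size=(reps, n))
z = np.sqrt(n) * (samples.mean(axis=1) - mu) / sigma

print(z.mean(), z.std())                  # close to 0 and 1
```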
Theory w.r.t OLS

- Consistency: plim(θ̂) = θ
  - WLLN
  - Jensen's inequality
  - Slutsky theorem (CMT)
- Asymptotic normality: θ̂ is asymptotically distributed as normal,
  with mean θ0 and variance V
  - Taylor's theorem
  - Consistency
  - CLT
  - CMT
  - WLLN
- Efficiency or asymptotic efficiency
  - If the variance in the (asymptotic) normal distribution attains
    the CRLB (the Cramér-Rao lower bound, based on the Fisher
    information per observation), then the estimator is
    (asymptotically) efficient