LN1 Introduction Ver2 Slides
Introduction
Ping Yu
Tutorial: one tutorial class every other week; the first class starts from week two
(the week starting from Sep. 8). You can suggest some difficult points in my
lectures or some difficult exercises in the lecture notes to the tutor for covering in
the tutorial classes.
Examination: the exams mimic the analytical exercises in the lecture notes; please refer to
the past three years’ exams on Moodle for concrete examples (older exams
seem out of date).
- The midterm is an in-class exam on October 24.
- The final will be held in mid-December (TBA) and will put more (but not all)
weight on the material not covered by the midterm.
- The exams are open-book and open-note (but only the materials on Moodle are
allowed, so a laptop is not allowed).
In Class: (i) turn off your cell phone and keep quiet; (ii) come to class and return
from the break on time; (iii) feel free to ask questions in class, but if your question is
far outside the scope of the course or would take a long time to answer, I will answer it after class.
Policy on Plagiarism: If your work is judged to be “plagiarism”, you are in serious trouble. If several
students are judged to have copied from each other, each gets a zero mark. I will not judge who
copied whom. So DO NOT copy from others and DO NOT let others copy from you.
- This policy applies to HW, midterm and final.
Guest Account (cannot receive announcements):
- Website: http://hkuportal.hku.hk/moodle/guest
- Guest Username: econ6005_1a_2024_guest
- Password: ECON6005@ping
We will concentrate on linear models, i.e., linear regression and linear GMM, in
this course. Nonlinear models are discussed only briefly.
Sections, proofs, exercises, paragraphs or footnotes indexed by * are optional and
will not be covered in this course.
I may skip some material in my notes or add material beyond them (depending on your
backgrounds and time constraints). Just follow my slides and read the corresponding sections.
Suppose we observe $\{y_i, \mathbf{x}_i\}_{i=1}^n$, where $y_i$ is the wage rate, $\mathbf{x}_i$ includes education
and experience, and the target is to study the return to schooling or the
relationship between $y_i$ and $\mathbf{x}_i$.
The most general model is
$$y = m(\mathbf{x}, u), \tag{1}$$
where $\mathbf{x} = (x_1, x_2)'$ with $x_1$ being education and $x_2$ being experience, $u$ is a vector
of unobservable errors (e.g., innate ability, skill, quality of education, work
ethic, interpersonal connections, preferences, and family background), which may be
correlated with $\mathbf{x}$ (why?), and $m(\cdot)$ can be any (nonlinear) function. To simplify our
discussion, suppose $u$ is one-dimensional and represents the ability of individuals.
Notations: real numbers (or scalars) are written using lower case italics. Vectors
are defined as column vectors and represented using lowercase bold.
In this model, the return to schooling is
$$\frac{\partial m(x_1, x_2, u)}{\partial x_1},$$
which depends on the levels of $x_1$ and $x_2$ and also $u$.
In other words, for different levels of education, the returns to schooling are
different; furthermore, for different levels of experience (which is observable) and
ability (which is unobservable), the returns to schooling are also different.
This model is called the NSNM since u is not additively separable.
ASNM:
$$y = m(\mathbf{x}) + u.$$
In this model, the return to schooling is
$$\frac{\partial m(x_1, x_2)}{\partial x_1},$$
which depends only on observables.
A special case of this model is the additive separable model (ASM) where
$$m(\mathbf{x}) = m_1(x_1) + m_2(x_2).$$
In this case, the return to schooling is $\partial m_1(x_1)/\partial x_1$, which depends only on $x_1$.
There is also the case where the return to schooling depends on the unobservable
but not other covariates.
For example, suppose
$$y = \alpha(u) + m_1(x_1)\beta_1(u) + m_2(x_2)\beta_2(u);$$
then the return to schooling is
$$\frac{\partial m_1(x_1)}{\partial x_1}\beta_1(u),$$
which does not depend on $x_2$ but depends on $x_1$ and $u$.
A special case is the RCM where $m_1(x_1) = x_1$ and $m_2(x_2) = x_2$.
In this case, the return to schooling is $\beta_1(u)$, which depends only on $u$.
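Similarly, in the model
$$y = \alpha(x_2) + x_1\beta_1(x_2) + u,$$
the return to schooling is $\beta_1(x_2)$, which depends only on $x_2$.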
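When the return to schooling does not depend on either $(x_1, x_2)$ or $u$, we get the
LRM,
$$y = \alpha + x_1\beta_1 + x_2\beta_2 + u \equiv \mathbf{x}'\boldsymbol{\beta} + u,$$
where $\mathbf{x} \equiv (1, x_1, x_2)'$, $\boldsymbol{\beta} \equiv (\alpha, \beta_1, \beta_2)'$, and the return to schooling is $\beta_1$, which is
constant.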
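The model classes above can be compared numerically. The sketch below is only an illustration: the functional forms and coefficient values are hypothetical (none come from the lecture), and the return to schooling is approximated by a central finite difference in $x_1$.

```python
import math

def return_to_schooling(model, x1, x2, u, h=1e-5):
    """Central finite-difference approximation to the partial derivative w.r.t. x1."""
    return (model(x1 + h, x2, u) - model(x1 - h, x2, u)) / (2 * h)

# Hypothetical functional forms (assumptions for illustration), one per model class.
def nsnm(x1, x2, u):
    return math.exp(0.1 * x1 + 0.02 * x2 + 0.5 * u)   # u enters nonseparably

def asnm(x1, x2, u):
    return 0.01 * x1 ** 2 * math.log(1 + x2) + u       # y = m(x) + u

def asm(x1, x2, u):
    return 0.01 * x1 ** 2 + 0.03 * x2 + u              # m(x) = m1(x1) + m2(x2)

def rcm(x1, x2, u):
    return u + x1 * (0.08 + 0.02 * u) + x2 * 0.03      # coefficient on x1 varies with u

def lrm(x1, x2, u):
    return 1.0 + 0.08 * x1 + 0.03 * x2 + u             # constant coefficients

# Evaluation points (x1, x2, u): vary one argument at a time from the first point.
points = [(12, 5, 0.0), (16, 5, 0.0), (12, 20, 0.0), (12, 5, 1.0)]
models = [("NSNM", nsnm), ("ASNM", asnm), ("ASM", asm), ("RCM", rcm), ("LRM", lrm)]
for name, model in models:
    returns = [round(return_to_schooling(model, *p), 4) for p in points]
    print(name, returns)
```

Reading each printed row from left to right shows which arguments move the return: all three for the NSNM form, none for the LRM form.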
Summary:
Model   NSNM   ASNM   ?      ?      ASM    VCM    RCM    LRM
x1      X      X      X      -      X      -      -      -
x2      X      X      -      X      -      X      -      -
u       X      -      X      X      -      -      X      -
Table 1: Models Based on What the Return to Schooling Depends On (X = depends on, - = does not)
Other Dimensions
$\mathbf{x}$ and $u$ are uncorrelated (or even independent) vs. $\mathbf{x}$ and $u$ are correlated. In the
former case, $\mathbf{x}$ is called exogenous, and in the latter case, $\mathbf{x}$ is called endogenous.
Limited dependent variables (LDV): part of the information about y is missing.
Single equation vs. Multiple equations.
Different characteristics of the conditional distribution of y given x.
- Conditional mean or conditional expectation function (CEF)
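$$m(\mathbf{x}) = E[y|\mathbf{x}] = \int y f(y|\mathbf{x})\,dy = \int m(\mathbf{x}, u) f(u|\mathbf{x})\,du,$$
where $f(y|\mathbf{x})$ is the conditional probability density function (pdf) or the conditional
probability mass function (pmf) of $y$ given $\mathbf{x}$.
- Conditional quantile
- Conditional variance
$$\sigma^2(\mathbf{x}) = \mathrm{Var}(y|\mathbf{x}) = E\left[(y - m(\mathbf{x}))^2 \,\middle|\, \mathbf{x}\right].$$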
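To make the exogenous versus endogenous distinction from the list above concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: a scalar regressor, a linear model with slope 0.08, standard normal draws, and the least-squares slope cov(x, y)/var(x) as the estimator. The parameter `loading` (a hypothetical name) controls how strongly $x$ depends on $u$.

```python
import random

def ols_slope(x, y):
    """Slope from regressing y on x with an intercept: cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_wages(loading, n=20000, beta=0.08, seed=0):
    """Draw (x, y) from y = 1 + beta * x + u, with x = e + loading * u.

    loading = 0 makes x and u independent (exogenous x);
    loading != 0 makes x and u correlated (endogenous x).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                    # unobserved "ability"
        x = rng.gauss(0, 1) + loading * u      # "schooling" on a hypothetical scale
        xs.append(x)
        ys.append(1 + beta * x + u)
    return xs, ys

for loading in (0.0, 1.0):
    xs, ys = simulate_wages(loading)
    print("loading =", loading, "estimated slope =", round(ols_slope(xs, ys), 3))
```

With `loading = 0` the estimated slope should be close to the true 0.08. With `loading = 1` it should be close to 0.08 + cov(x, u)/var(x) = 0.58, so the estimate mixes the return to schooling with the effect of ability.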
A Real Example
Figure: Wage Densities for Male and Female from the 1985 CPS (legend: f(Wage|Female),
f(Wage|Male), and the mean and median of Wage for each group; horizontal axis from 0 to 30,
vertical axis from 0 to 0.14)
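The conditional mean, median, and variance shown or defined above can be computed group by group when the conditioning variable is discrete. The sketch below uses simulated data, not the 1985 CPS: the log-normal wage distribution and the group-specific parameters are assumptions made only so the example runs on its own.

```python
import random
import statistics

rng = random.Random(0)

# Simulated (not CPS) wages: log-normal, with a hypothetical shift between groups.
data = []
for _ in range(5000):
    female = rng.random() < 0.5
    wage = rng.lognormvariate(2.0 if female else 2.2, 0.5)
    data.append((female, wage))

def conditional_summary(data, female):
    """Sample analogs of E[wage | group], median(wage | group), Var(wage | group)."""
    wages = [w for f, w in data if f == female]
    return {
        "mean": statistics.mean(wages),
        "median": statistics.median(wages),
        "variance": statistics.pvariance(wages),
    }

for label, flag in (("Female", True), ("Male", False)):
    summary = conditional_summary(data, flag)
    print(label, {k: round(v, 2) for k, v in summary.items()})
```

Because the simulated wages are right-skewed, the conditional mean exceeds the conditional median in each group, which is one reason to look at more than one characteristic of the conditional distribution.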
Econometrics, Microeconometrics and Economic Theory
This course will concentrate on microeconometrics, i.e., the main data types
analyzed in this course are cross-sectional data and panel data.[2]
One main objective of microeconometrics is to explore causal relationships
between a response variable y and some covariates x.
- the effect of class sizes on test scores
- police expenditures on crime rates
- climate change on economic activity
- years of schooling on wages
- childbearing on the labor force participation of women
- institutional structure on growth
- the effectiveness of rewards on behavior
- the consequences of medical procedures on health outcomes
Caveat: causality is different from correlation.
- Umbrella use can predict rain, but we cannot claim that umbrellas cause rain.
- A rooster's crow can predict sunrise but does not cause it.
- Correlation is used to "predict" y from x, while causality can be used to "explain"
y from x.
Noncausal relationships describe only associations, so they are of less economic
interest.
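The gap between prediction and causation can be seen in a small simulation. This is a sketch under assumed structure, not part of the lecture: a common cause `z` drives both `x` and `y`, and `x` has no causal effect on `y`. The `intervene` flag (a hypothetical name) sets `x` independently of `z`, mimicking an experiment.

```python
import random

def correlation(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

def simulate_confounded(n=20000, intervene=False, seed=0):
    """z causes both x and y; x never enters the equation for y."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                        # common cause
        noise_x = rng.gauss(0, 1)
        x = noise_x if intervene else z + noise_x  # intervention cuts the link from z to x
        y = z + rng.gauss(0, 1)                    # y depends on z only
        xs.append(x)
        ys.append(y)
    return xs, ys

observed = correlation(*simulate_confounded(intervene=False))
experimental = correlation(*simulate_confounded(intervene=True))
print("observational correlation:", round(observed, 3))
print("correlation under intervention:", round(experimental, 3))
```

In the observational draw, `x` predicts `y` (population correlation 0.5 under these assumptions). Once `x` is set independently of `z`, the correlation is close to zero, so changing `x` does not change `y`.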
[2] Maybe only cross-sectional data will be discussed due to time constraints.
Ping Yu (HKU) Introduction 21 / 41
Econometric Approaches
There are two econometric traditions: the frequentist approach and the Bayesian
approach.
- the former treats the parameter as fixed (i.e., there is only one true value) and
the samples as random.
- the latter treats the parameter as random and the samples as fixed.
This course will concentrate on the frequentist approach.
Two main methods in the frequentist approach are the likelihood method and the
method of moments (MoM).
We will concentrate on the MoM and only briefly discuss the likelihood method.
We will use the estimation problem to illustrate these two methods.
Bayes never published what would eventually become his most famous accomplishment; his
notes were edited and published after his death by Richard Price.
where $X$ is a random vector, $f(x)$ is the true pdf or the true pmf, $f(x|\theta)$ is the
specified parametrized pdf or pmf, $\Theta$ is the parameter space, and $F(x)$ is the true
cdf.
Ronald A. Fisher (1890-1962) is an iconic founder of modern statistical theory. The
F-distribution was named by G.W. Snedecor in honor of R.A. Fisher. The p-value is also
credited to him.
Harald Cramér (1893-1985), Stockholm
C.R. Rao (1920-2023), ISI and PSU [3]
[3] A student of R.A. Fisher and recipient of the International Prize in Statistics, often described as the Nobel Prize of statistics.
Econometric Approaches The Method of Moments Estimator
Karl Pearson (1857-1936) is also the inventor of the correlation coefficient, so the correlation
coefficient is also called the Pearson correlation coefficient. He is also the founder of
Biometrika.
The MoM estimator uses only the moment information in $X$, while the MLE uses
"all" information in $X$, so the MLE is more efficient than the MoM estimator when the
full distribution is correctly specified.
However, the MoM estimator is more robust than the MLE since it does not rely on
the correctness of the full distribution but relies only on the correctness of the
moment functions.
The trade-off between efficiency and robustness is common among econometric methods.
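The efficiency and robustness trade-off can be illustrated with a textbook case that is not taken from the lecture: estimating $\theta$ when the specified model is Uniform$(0, \theta)$. The sketch assumes the moment condition $E[X] = \theta/2$ for the MoM estimator and compares mean squared errors by Monte Carlo, first when the uniform model is correct and then when only the moment condition is correct (exponential data with mean $\theta/2$). All function names are hypothetical.

```python
import random

def mom_uniform(xs):
    """MoM: E[X] = theta / 2, so theta_hat = 2 * sample mean."""
    return 2 * sum(xs) / len(xs)

def mle_uniform(xs):
    """MLE under the Uniform(0, theta) specification: the sample maximum."""
    return max(xs)

def draw_uniform(rng, theta):
    """Correctly specified case."""
    return rng.uniform(0, theta)

def draw_exponential(rng, theta):
    """Misspecified case: the mean is still theta / 2, but the data are not uniform."""
    return rng.expovariate(2 / theta)

def mse(estimator, sampler, theta=1.0, n=50, reps=2000, seed=0):
    """Monte Carlo mean squared error of estimator around theta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [sampler(rng, theta) for _ in range(n)]
        total += (estimator(xs) - theta) ** 2
    return total / reps

for label, sampler in (("uniform data", draw_uniform), ("exponential data", draw_exponential)):
    print(label,
          "MSE(MoM) =", round(mse(mom_uniform, sampler), 5),
          "MSE(MLE) =", round(mse(mle_uniform, sampler), 5))
```

Under uniform data the MLE has the smaller mean squared error. Under exponential data the MoM estimator still targets $\theta = 2E[X]$, while the sample maximum does not.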
Moment conditions often originate from the first order conditions (FOCs) in an
optimization problem.
Suppose the firms are maximizing their profits conditional on the information in
hand; then the problem for the firm i is
$$\max_{d_i} E_{\nu|z}\left[\pi(d_i, z_i, \nu_i; \theta)\right] := \max_{d_i} \int \pi(d_i, z_i, \nu_i; \theta) f(\nu_i|z_i)\,d\nu_i. \tag{4}$$
$$\pi(d_i, z_i, \nu_i; \theta) = p_i f(L_i, \nu_i; \theta) - w_i L_i,$$
$$\max_{d_i} E\left[\pi(d_i, z_i, \nu_i; \theta)\right],$$
[4] The difference between $z_i$ and $\nu_i$ is that $z_i$ can be observed ex post while $\nu_i$ cannot. That $z_i$ is random
means that the decision is made before $z_i$ is revealed, or the decision is made ex ante.
$$\max_{\{c_t\}_{t=1}^{\infty}} \sum_{t=1}^{\infty} \rho^t E_0[u(c_t)]$$
$$\text{s.t. } c_{t+1} + k_{t+1} = k_t R_{t+1}, \quad k_0 \text{ is known},$$
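As a hedged sketch of how this problem can deliver a moment condition (a standard derivation under added assumptions, not a reproduction of the slide's own equation): assume $u(\cdot)$ is differentiable, the solution is interior, and $k_t$ is chosen using time-$t$ information. Substituting $c_t = k_{t-1}R_t - k_t$ into the objective and differentiating with respect to $k_t$ gives

```latex
-\rho^{t} u'(c_t) + \rho^{t+1} E_t\left[ u'(c_{t+1}) R_{t+1} \right] = 0
\quad\Longrightarrow\quad
E_t\left[ \rho \, \frac{u'(c_{t+1})}{u'(c_t)} \, R_{t+1} - 1 \right] = 0 .
```

Here $E_t$ denotes the expectation conditional on time-$t$ information, a symbol introduced only for this sketch. Once a parametric form for $u$ is specified, its parameters together with $\rho$ play the role of $\theta$ in a moment condition.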
Equations (3), (5) and (6) are the population version of moment conditions.
Although some econometricians treat "population" as a physical population (e.g.,
all individuals in the US census) in the real world, the term "population" is often
treated abstractly, and is potentially infinitely large.
Since the population distribution is unknown, we cannot solve the population
moment conditions to estimate the parameters.
In practice, we often have a finite set of data points from the population, so we
can replace the population distribution in the moment conditions with the
empirical distribution of the data, which generates the sample version of the
moment conditions.
This is called the analog method.
which is equivalent to
$$\frac{1}{n}\sum_{i=1}^{n} m(X_i|\theta) = 0. \tag{7}$$
The MoM estimator $\hat{\theta}(X_1, \ldots, X_n)$ is the solution to (7).
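A minimal sketch of the sample moment conditions in a case where they can be solved by hand: the choice of moment functions is an assumption for illustration, with $\theta = (\mu, \sigma^2)$ and $m(X|\theta) = (X - \mu, (X - \mu)^2 - \sigma^2)$. Setting the sample averages of both components to zero gives the sample mean and the (divide-by-$n$) sample variance.

```python
import random

def moment_functions(x, theta):
    """m(x | theta) for theta = (mu, sigma2): two moment functions."""
    mu, sigma2 = theta
    return (x - mu, (x - mu) ** 2 - sigma2)

def sample_moments(xs, theta):
    """Sample analog of the moment conditions: (1/n) * sum_i m(X_i | theta)."""
    n = len(xs)
    first, second = 0.0, 0.0
    for x in xs:
        m1, m2 = moment_functions(x, theta)
        first += m1
        second += m2
    return (first / n, second / n)

def mom_estimate(xs):
    """Closed-form solution of the two sample moment conditions."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return (mu_hat, sigma2_hat)

rng = random.Random(0)
xs = [rng.gauss(2.0, 3.0) for _ in range(10000)]   # simulated data: mean 2, variance 9
theta_hat = mom_estimate(xs)
print("theta_hat =", tuple(round(t, 3) for t in theta_hat))
print("sample moments at theta_hat =", sample_moments(xs, theta_hat))
```

The second print line checks that the estimate satisfies the sample moment conditions up to floating-point error. No distributional form is used in the estimator itself, only the two moment functions.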
Econometric Approaches The Analog Method
(Parametric) Likelihood →
- semi-parametric: empirical likelihood
- semi-nonparametric: semi-nonparametric likelihood
- nonparametric: nonparametric likelihood
MoM → GMM
We will cover only the GMM method in this course.
I will teach Econ6086 in the spring semester; it will concentrate on machine
learning, which focuses on nonparametric methods.