04 LDV

The document discusses binary dependent variables and limitations of the linear probability model for modeling such variables. It introduces the logit model as a better alternative that uses a cumulative distribution function to ensure estimated probabilities remain between 0 and 1.

Department of Economics Universitas Padjadjaran | Microeconometrics

Microeconometrics:
Binary Dependent Variable

Department of Economics
Universitas Padjadjaran
2019

Additional References
• Dougherty, Introduction to Econometrics, 4th Ed, 2011
  *best for basics*
• Golder, M., Advanced Quantitative Analysis: Maximum Likelihood Estimation,
  https://files.nyu.edu/mrg217/public/homepage.htm

Estimators we (will) know


• Ordinary Least Squares (OLS) estimator
  – If we have a SLR of 𝑦𝑖 = 𝛽0 + 𝛽1𝑥𝑖 + 𝑢𝑖 and 𝑥𝑖 is exogenous, then we have 𝛽̂1(OLS) = 𝑐𝑜𝑣(𝑥𝑖, 𝑦𝑖) / 𝑣𝑎𝑟(𝑥𝑖)

• Instrumental Variable (IV) estimator
  – If we have a SLR of 𝑦𝑖 = 𝛽0 + 𝛽1𝑥𝑖 + 𝑢𝑖 and 𝑥𝑖 is endogenous, then we have 𝛽̂1(IV) = 𝑐𝑜𝑣(𝑧𝑖, 𝑦𝑖) / 𝑐𝑜𝑣(𝑧𝑖, 𝑥𝑖), where 𝑐𝑜𝑣(𝑧𝑖, 𝑥𝑖) ≠ 0
• Maximum Likelihood (ML) estimator

Why use a binary dependent variable?


• Observed vs unobserved variables

• Suppose we want to analyse the socioeconomic factors that lead some people to:
  – Engage in corruption
  – Smoke
  – Borrow money
  – Get a scholarship
  – Have boy/girl-friend(s)
  – etc.

Why use a binary dependent variable?


• Observed vs unobserved variables

• It would be best to know (observe)


– Utility derived from corruption, smoking, or
borrowing money, having a boy/girl-friend(s)…
– The actual (factual) cash flow of families
– A consistent way of measuring poverty

• And we could just apply OLS



Why use a binary dependent variable?


• Observed vs unobserved variables

• It would be best to know (observe)


– Utility derived from corruption, smoking, or
borrowing money, having a boy/girl-friend(s)…
– The actual (factual) cash flow of families
– A consistent way of measuring poverty

• But.. they are not observed



Why use a binary dependent variable?


• Observed vs unobserved variables

• What we observe is that


– Some people engage in corruption
– Some people smoke
– Some people borrow money
– Some people get scholarships
– Some people have boy/girl-friend(s)

The mechanism
Suppose:
𝑈𝑖ˢ = 𝛽0 + 𝛽1𝑥𝑖 + 𝑢𝑖

But 𝑈𝑖ˢ, the utility of smoking, is unobserved.

We, however, observe

𝑦𝑖 = 1 if 𝑈𝑖ˢ > 0 and
𝑦𝑖 = 0 if 𝑈𝑖ˢ ≤ 0

The mechanism
So we estimate
𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝑢𝑖

We know the value of 𝑦𝑖 , either 0 or 1


• Because of this, we may think of 𝑦𝑖 as an event whose outcome is 0 or 1
• Therefore, essentially what we want to know is 𝐸(𝑦𝑖)

The Linear Probability Model


Using the formula for expected value:

𝐸(𝑦𝑖) = Σ 𝑦𝑖 × 𝑃𝑟𝑜𝑏(𝑦𝑖)
      = [1 × 𝑃𝑟𝑜𝑏(𝑦𝑖 = 1)] + [0 × 𝑃𝑟𝑜𝑏(𝑦𝑖 = 0)]
      = 𝑃𝑟𝑜𝑏(𝑦𝑖 = 1)

The Linear Probability Model

If we estimate
𝑦𝑖 = 𝛽0 + 𝛽1 𝑥𝑖 + 𝑢𝑖

with 𝑦𝑖 either 0 or 1

using OLS, we have a Linear Probability Model

𝑃𝑟𝑜𝑏(𝑦𝑖 = 1) = 𝛽0 + 𝛽1 𝑥𝑖 + 𝑢𝑖

The Linear Probability Model


We know from previous lectures about OLS that:

• We assume 𝐸(𝑢𝑖|𝑥𝑖) = 0

• so we can model 𝐸(𝑦𝑖|𝑥𝑖)

Therefore we can write our LPM as

𝐸(𝑦𝑖|𝑥𝑖) = 𝑝𝑟𝑜𝑏(𝑦𝑖 = 1|𝑥𝑖) = 𝛽0 + 𝛽1𝑥𝑖
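As a sketch, the LPM can be estimated with the usual OLS closed form from the earlier slide (𝛽̂1 = cov(𝑥, 𝑦)/var(𝑥)). The toy data below are hypothetical, chosen only for illustration; the fitted values also preview a limitation discussed later, namely that fitted "probabilities" can fall outside [0, 1].

```python
# Linear Probability Model via the OLS closed form: b1 = cov(x, y) / var(x).
# Hypothetical toy data: x is a continuous regressor, y is binary.
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 0, 1, 1, 1]
n = len(x)

mx = sum(x) / n
my = sum(y) / n
cov_xy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
var_x = sum((xi - mx) ** 2 for xi in x) / n

b1 = cov_xy / var_x    # slope: change in P(y = 1) per unit of x
b0 = my - b1 * mx      # intercept

fitted = [b0 + b1 * xi for xi in x]   # interpreted as P(y = 1 | x)
print(b0, b1, fitted)
# Note: the fitted "probability" at x = 0 is negative and at x = 5 exceeds 1,
# illustrating the boundedness problem of the LPM.
```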



LPM Interpretation
Suppose we have a more complete set of
independent variables:

𝐸(𝑦𝑖|𝑥𝑖) = 𝑝𝑟𝑜𝑏(𝑦𝑖 = 1|𝑥𝑖) = 𝛽0 + Σ 𝛽𝑖𝑥𝑖

• We cannot interpret our 𝛽𝑖's as usual, because 𝑦𝑖 changes ONLY from 0 to 1 (and vice versa)

LPM Interpretation
Suppose we have a more complete set of
independent variables:

𝐸(𝑦𝑖|𝑥) = 𝑝𝑟𝑜𝑏(𝑦𝑖 = 1|𝑥) = 𝛽0 + Σ 𝛽𝑖𝑥𝑖

• If 𝑥𝑖 is continuous:
  – “If 𝑥𝑖 increases/decreases by 1 (unit), the probability that 𝑦 = 1 increases/decreases by 𝛽𝑖 (i.e., 100 × 𝛽𝑖 percentage points)”

LPM Interpretation
Suppose we have a more complete set of
independent variables:

𝐸(𝑦𝑖|𝑥) = 𝑝𝑟𝑜𝑏(𝑦𝑖 = 1|𝑥) = 𝛽0 + Σ 𝛽𝑖𝑥𝑖

• If 𝑥𝑖 is a dummy variable (e.g. 1 = male):
  – “Suppose there are two individuals who are identical in every respect except that one is male and the other female; the probability that 𝑦 = 1 is 𝛽𝑖 (100 × 𝛽𝑖 percentage points) higher or lower for the male than for the female”


Limitations of LPM
• The error term does not follow a normal distribution, so test statistics are not robust

• Suppose
𝑦𝑖 = 𝛽0 + 𝛽1 𝑥1 + 𝑢𝑖
𝑢𝑖 = 𝑦𝑖 − 𝛽0 − 𝛽1 𝑥1

When 𝑦𝑖 = 1, 𝑢𝑖 = 1 − 𝛽0 − 𝛽1 𝑥1
When 𝑦𝑖 = 0, 𝑢𝑖 = −𝛽0 − 𝛽1 𝑥1

Limitations of LPM
• The error term does not follow a normal distribution, so test statistics are not robust

• Suppose:
  – The probability that 𝑢𝑖 = 1 − 𝛽0 − 𝛽1𝑥1 is 𝑃𝑖
  – and that 𝑢𝑖 = −𝛽0 − 𝛽1𝑥1 is (1 − 𝑃𝑖)

• The error term therefore follows a Bernoulli distribution


Limitations of LPM
• Heteroskedasticity

Since the error term follows a Bernoulli distribution, the variance of the error term is

𝜎𝑖² = 𝑃𝑖(1 − 𝑃𝑖) = (𝛽0 + 𝛽1𝑥1)(1 − 𝛽0 − 𝛽1𝑥1)

which depends on 𝑥1, so the errors are heteroskedastic.

Limitations of LPM
• Nonfulfillment of 0 ≤ 𝐸(𝑦𝑖|𝑥𝑖) ≤ 1: does it make sense to assume this probability is a linear function of 𝑥𝑖?

Having said that…


• The LPM is still widely used in empirical research, as long as we make sure its limitations are addressed

What is a “better” model for estimating E(yi)?

• Since the probability of an event has to be between 0 and 1, a good model would be a nonlinear function of x whose result never goes below 0 or above 1!

• A class of functions that we have already seen in statistics and that satisfies this requirement is the Cumulative Distribution Function (CDF)

What is a better model for estimating E(yi)?

[Figure: a probability density function (PDF) and the corresponding cumulative distribution function (CDF)]

What is a better model for E(yi)?


• We denote CDFs using the letter F:

𝐸(𝑦𝑖|𝑥) = 𝑝𝑟𝑜𝑏(𝑦𝑖 = 1|𝑥) = 𝐹(𝛽0 + Σ 𝛽𝑖𝑥𝑖)

where F is a CDF

• Therefore, to model a binary dependent variable we need to choose a CDF and an estimation method appropriate for estimating 𝛽0 and 𝛽1

Solution
• We need a function for 𝑦, or 𝐸(𝑦|𝑥), or 𝑃𝑟𝑜𝑏(𝑦 = 1), that always results in values between 0 and 1

• Whatever the values of the independent variables are (they can range from −∞ to +∞), the value of the dependent variable will be between 0 and 1

• In general:
𝑃(𝑦 = 1|𝑥) = 𝐹(𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘)

Solution 1: Logit Model


𝑃(𝑦 = 1|𝑥) = 𝐹(𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘) = 𝐹(𝑧𝑖)

F can be in the form of

𝑃𝑖 = 𝑃(𝑦 = 1|𝑥) = 1 / (1 + 𝑒^(−𝑧𝑖))

or, equivalently:

𝑃𝑖 / (1 − 𝑃𝑖) = 𝑒^(𝑧𝑖)

Solution 1: Logit Model


Logit Model:

Λ(𝑧𝑖) = 𝑃(𝑦 = 1|𝑥) = 1 / (1 + 𝑒^(−𝑧𝑖)) = 𝑒^(𝑧𝑖) / (1 + 𝑒^(𝑧𝑖))

Solution 1: Logit Model


Taking the log of both sides:

𝑙𝑜𝑔[𝑃𝑖 / (1 − 𝑃𝑖)] = 𝑧𝑖

Hence

𝐿𝑖 = 𝑙𝑜𝑔[𝑃𝑖 / (1 − 𝑃𝑖)] = 𝑧𝑖
• We call 𝐿𝑖 the logit; hence, the logit model
• We estimate the logit model using the Maximum Likelihood method
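The two identities above can be checked numerically with nothing beyond the standard library: the logistic CDF maps any 𝑧𝑖 into (0, 1), and the log of the odds recovers 𝑧𝑖. A minimal sketch:

```python
import math

def logistic(z):
    """Logit CDF: P_i = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

for z in [-10.0, -1.0, 0.0, 2.5, 10.0]:
    p = logistic(z)
    assert 0.0 < p < 1.0                    # probabilities stay inside (0, 1)
    odds = p / (1.0 - p)
    assert abs(math.log(odds) - z) < 1e-9   # the log-odds L_i equals z_i
print(logistic(0.0))
```

Note that z = 0 gives a probability of exactly 0.5: zero log-odds means even odds.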

Logit Model: Coefficients &


Marginal Effects
• Coefficients are not marginal effects (not directly interpretable)
  – Because of the non-linearity of the model

• Recall

𝐸(𝑦𝑖|𝑥) = 𝑃(𝑦𝑖 = 1|𝑥) = 1 / (1 + 𝑒^(−𝑧𝑖)) = 𝑒^(𝑧𝑖) / (1 + 𝑒^(𝑧𝑖)) = Λ(𝑧𝑖)

Logit Model: Coefficients &


Marginal Effects
𝐸(𝑦𝑖|𝑥) = 𝑃(𝑦𝑖 = 1|𝑥) = 𝑒^(𝑧𝑖) / (1 + 𝑒^(𝑧𝑖))

To get the marginal effect, we need to differentiate:

𝑑𝐸(𝑦𝑖|𝑥)/𝑑𝑥 = 𝑑Λ(𝑧𝑖)/𝑑𝑥 = [𝑒^(𝑧𝑖) / (1 + 𝑒^(𝑧𝑖))²] ∙ 𝛽𝑖
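The derivative above can be verified numerically. This sketch uses made-up coefficient values (b0, b1 and the evaluation point x are illustrative assumptions) and compares the analytic marginal effect e^z/(1 + e^z)² · β with a central finite difference:

```python
import math

b0, b1 = -1.0, 0.8   # hypothetical logit coefficients
x = 2.0              # point at which the marginal effect is evaluated

def prob(xv):
    """Logit probability P(y = 1 | x)."""
    z = b0 + b1 * xv
    return 1.0 / (1.0 + math.exp(-z))

z = b0 + b1 * x
# Analytic marginal effect: dLambda/dx = [e^z / (1 + e^z)^2] * beta
analytic = math.exp(z) / (1.0 + math.exp(z)) ** 2 * b1

h = 1e-6
numeric = (prob(x + h) - prob(x - h)) / (2 * h)   # central difference

assert abs(analytic - numeric) < 1e-6
print(analytic)
```

Because Λ(z)(1 − Λ(z)) peaks at z = 0, the marginal effect is largest where the predicted probability is 0.5 and shrinks in the tails.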

Solution 2: Probit Model


Suppose we have an equation:

𝑦* = 𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘 + 𝑢𝑖

But 𝑦* is unobservable.

What we observe is actually 𝑦, which takes the value of 1 if 𝑦* > 0 and 0 otherwise

We assume that 𝑢𝑖 ~ 𝑁(0, 𝜎²)



Solution 2: Probit Model


Hence

𝑃(𝑦 = 1) = 𝑃(𝑦* > 0)
         = 𝑃(𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘 + 𝑢𝑖 > 0)
         = 𝑃(𝑢𝑖 > −𝛽0 − 𝛽1𝑥1 − 𝛽2𝑥2 − ⋯ − 𝛽𝑘𝑥𝑘)
         = 𝑃(𝑢𝑖/𝜎 > (−𝛽0 − 𝛽1𝑥1 − 𝛽2𝑥2 − ⋯ − 𝛽𝑘𝑥𝑘)/𝜎)

The distribution of 𝑢𝑖/𝜎 is standard normal

Solution 2: Probit Model


Since the normal distribution is symmetric, we can write

𝑃(𝑦 = 1) = 𝑃(𝑢𝑖/𝜎 > (−𝛽0 − 𝛽1𝑥1 − 𝛽2𝑥2 − ⋯ − 𝛽𝑘𝑥𝑘)/𝜎)
         = 𝑃(𝑢𝑖/𝜎 < (𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘)/𝜎)
         = Φ((𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘)/𝜎)

and this may be estimated using ML

Probit Model: Coefficients &


Marginal Effects
• Coefficients are not marginal effects (not directly interpretable)
  – Because of the non-linearity of the model

• Recall

𝐸(𝑦𝑖|𝑥) = 𝑃(𝑦𝑖 = 1|𝑥) = Φ((𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘)/𝜎)

Probit Model: Coefficients &


Marginal Effects
𝐸(𝑦𝑖|𝑥) = 𝑃(𝑦𝑖 = 1|𝑥) = Φ((𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘)/𝜎)

To get the marginal effect, we need to differentiate:

𝑑𝐸(𝑦𝑖|𝑥)/𝑑𝑥𝑖 = 𝜙((𝛽0 + 𝛽1𝑥1 + 𝛽2𝑥2 + ⋯ + 𝛽𝑘𝑥𝑘)/𝜎) ∙ 𝛽𝑖/𝜎

where 𝜙 is the standard normal density (the derivative of the CDF Φ)
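Φ and φ can be evaluated with only the standard library (Φ via the error function). The coefficients and σ below are hypothetical, chosen for illustration; the check confirms the marginal-effect formula against a numerical derivative:

```python
import math

def Phi(t):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

b0, b1, sigma = -0.5, 0.4, 1.0   # hypothetical probit parameters
x = 1.5                          # evaluation point

def prob(xv):
    return Phi((b0 + b1 * xv) / sigma)

z = (b0 + b1 * x) / sigma
analytic = phi(z) * b1 / sigma                      # marginal effect formula
numeric = (prob(x + 1e-6) - prob(x - 1e-6)) / 2e-6  # central difference

assert abs(analytic - numeric) < 1e-6
print(prob(x), analytic)
```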

Logit or Probit?

Gender Inequality and Poverty in Indonesia:


Evidence from Household Data
Kinanti Z. Patria

Jeffrey Aron Natan, 2019



Estimation of Logit and Probit Models


• We do not use OLS; rather, we use the Maximum Likelihood method

• The maximum likelihood estimates (MLE) of the unknown parameters are the values of the parameters that maximize the likelihood function

• Likelihood function: the joint probability of the


observed sample implied by the model
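As a sketch of what "maximizing the likelihood" means for the logit model, the snippet below fits 𝛽0 and 𝛽1 by simple gradient ascent on the log-likelihood of a small made-up sample. The data, learning rate, and iteration count are all illustrative assumptions; statistical packages typically use Newton-type methods instead, but the maximized object is the same.

```python
import math

# Hypothetical, non-separable sample
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]

def p(b0, b1, x):
    """Logit probability for a single observation."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

b0, b1 = 0.0, 0.0
lr = 0.05
for _ in range(50000):
    # Gradient of the log-likelihood: sum_i (y_i - p_i) * (1, x_i)
    g0 = sum(y - p(b0, b1, x) for x, y in zip(xs, ys))
    g1 = sum((y - p(b0, b1, x)) * x for x, y in zip(xs, ys))
    b0 += lr * g0
    b1 += lr * g1

# At the maximum, the gradient is (approximately) zero
g0 = sum(y - p(b0, b1, x) for x, y in zip(xs, ys))
g1 = sum((y - p(b0, b1, x)) * x for x, y in zip(xs, ys))
print(b0, b1, g0, g1)
```

Because the logit log-likelihood is globally concave, this slow-but-simple ascent reaches the same optimum a Newton step would.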

Extension

MAXIMUM LIKELIHOOD
ESTIMATOR

Maximum Likelihood Estimator


• Remember that our data are random variables
  – They follow a certain probability density function (pdf) or probability distribution
• Suppose we have 5 observations of variable Y
  𝑌 = {15, 21, 30, 45, 50}
  – What are the odds that we would obtain these observations from a normal distribution with 𝜇 = 100?

Maximum Likelihood Estimator


• Remember that our data are random variables
  – They follow a certain probability density function (pdf) or probability distribution
• Suppose we have 5 observations of variable Y
  𝑌 = {15, 21, 30, 45, 50}
  – What are the odds that we would obtain these observations from a normal distribution with 𝜇 = 100? With 𝜇 = 35?

Maximum Likelihood Estimator


• “Maximum Likelihood is just a systematic way
of searching for the parameter values of our
chosen distribution that maximize the
probability of observing the data we observe”
(Matt Golder, 2013)
Copyright Christopher Dougherty 2012.

These slideshows may be downloaded by anyone, anywhere for personal use.


Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for
teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section R.2 of C. Dougherty, Introduction to Econometrics,
fourth edition 2011, Oxford University Press.
Additional (free) resources for both students and instructors may be downloaded from the OUP Online
Resource Centre
http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit from participation in a
formal course should consider the London School of Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.

Method of ML
• The method of maximum likelihood is
intuitively appealing, because we attempt to
find the values of the true parameters that
would have most likely produced the data that
we in fact observed.
• For most cases of practical interest, the
performance of maximum likelihood
estimators is optimal for large enough data.
• This sequence introduces the principle of maximum likelihood estimation and illustrates it with some simple examples.

• Suppose that you have a normally-distributed random variable X with unknown population mean μ and standard deviation σ, and that you have a sample of two observations, 4 and 6. For the time being, we will assume that σ is equal to 1.

Normal Distribution

f(x) = (1/(σ√(2π))) e^(−½((x − μ)/σ)²)

This is a bell-shaped curve with different centers and spreads depending on μ and σ. Note the constants: π = 3.14159 and e = 2.71828.

Suppose initially you consider the hypothesis μ = 3.5. Under this hypothesis the probability density at 4 would be 0.3521 and that at 6 would be 0.0175. The joint probability density, the product of these, is 0.0062.

Next consider the hypothesis μ = 4.0. Under this hypothesis the probability densities associated with the two observations are 0.3989 and 0.0540, and the joint probability density is 0.0215.

Next, under the hypothesis μ = 4.5, the probability densities are 0.3521 and 0.1295, and the joint probability density is 0.0456.

Under the hypothesis μ = 5.0, the probability densities are both 0.2420 and the joint probability density is 0.0585.

Under the hypothesis μ = 5.5, the probability densities are 0.1295 and 0.3521, and the joint probability density is 0.0456.

    μ     p(4)    p(6)    L
    3.5   0.3521  0.0175  0.0062
    4.0   0.3989  0.0540  0.0215
    4.5   0.3521  0.1295  0.0456
    5.0   0.2420  0.2420  0.0585
    5.5   0.1295  0.3521  0.0456

The complete joint density function L, plotted against all values of μ, peaks at μ = 5.
Now we will look at the mathematics of the example. If X is normally distributed with mean μ and standard deviation σ, its density function is

f(X) = (1/(σ√(2π))) e^(−½((X − μ)/σ)²)

For the time being, we are assuming σ is equal to 1, so the density function simplifies to

f(X) = (1/√(2π)) e^(−½(X − μ)²)

Hence we obtain the probability densities for the observations where X = 4 and X = 6:

f(4) = (1/√(2π)) e^(−½(4 − μ)²)        f(6) = (1/√(2π)) e^(−½(6 − μ)²)

The joint probability density for the two observations in the sample is just the product of their individual densities:

joint density = [(1/√(2π)) e^(−½(4 − μ)²)] × [(1/√(2π)) e^(−½(6 − μ)²)]

In maximum likelihood estimation we choose as our estimate of μ the value that gives us the greatest joint density for the observations in our sample. This value is associated with the greatest probability, or maximum likelihood, of obtaining the observations in the sample.
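The example can be reproduced numerically with a simple grid search: for each candidate μ (with σ = 1) compute the joint density f(4)·f(6) and pick the μ with the greatest likelihood. The grid bounds and step size are arbitrary choices for illustration.

```python
import math

def density(x, mu):
    """Normal pdf with sigma = 1."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

best_mu, best_L = None, -1.0
for i in range(300, 701):   # mu from 3.00 to 7.00 in steps of 0.01
    mu = i / 100.0
    L = density(4.0, mu) * density(6.0, mu)   # joint density of the sample
    if L > best_L:
        best_mu, best_L = mu, L

print(best_mu, best_L)   # the likelihood peaks at mu = 5, the sample mean
```

This matches the table above: the maximized joint density is about 0.0585, attained at μ = 5.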

MAXIMUM LIKELIHOOD ESTIMATION OF REGRESSION COEFFICIENTS

MLE AND REGRESSION ANALYSIS

[Figure: the conditional distribution of Y around the regression line β1 + β2Xi]

Potential values of Y close to β1 + β2Xi will have relatively large densities, while potential values of Y relatively far from β1 + β2Xi will have small ones.

The mean value of the distribution of Yi is β1 + β2Xi. Its standard deviation is σ, the standard deviation of the disturbance term.

Hence the density function for the ex ante distribution of Yi is

f(Yi) = (1/(σ√(2π))) e^(−½((Yi − β1 − β2Xi)/σ)²)
The joint density function for the observations on Y is the product of their individual densities:

f(Y1) × … × f(Yn) = (1/(σ√(2π))) e^(−½((Y1 − β1 − β2X1)/σ)²) × … × (1/(σ√(2π))) e^(−½((Yn − β1 − β2Xn)/σ)²)

Now, taking β1, β2 and σ as our choice variables, and taking the data on Y and X as given, we can re-interpret this function as the likelihood function for β1, β2, and σ:

L(β1, β2, σ | Y1, …, Yn) = (1/(σ√(2π))) e^(−½((Y1 − β1 − β2X1)/σ)²) × … × (1/(σ√(2π))) e^(−½((Yn − β1 − β2Xn)/σ)²)

REMEMBER THIS
We will choose β1, β2, and σ so as to maximize the likelihood, given the data on Y and X. As usual, it is easier to do this indirectly, maximizing the log-likelihood instead:

log L = log[ (1/(σ√(2π))) e^(−½((Y1 − β1 − β2X1)/σ)²) × … × (1/(σ√(2π))) e^(−½((Yn − β1 − β2Xn)/σ)²) ]

As usual, the first step is to decompose the expression as the sum of the logarithms of the factors:

log L = log[(1/(σ√(2π))) e^(−½((Y1 − β1 − β2X1)/σ)²)] + … + log[(1/(σ√(2π))) e^(−½((Yn − β1 − β2Xn)/σ)²)]

Then we split the logarithm of each factor into two components. The first component is the same in each case, so the log-likelihood simplifies to

log L = n log(1/(σ√(2π))) − (1/(2σ²)) Z

where Z = (Y1 − β1 − β2X1)² + … + (Yn − β1 − β2Xn)²
To maximize the log-likelihood, we need to minimize Z. But choosing estimators of β1 and β2 to minimize Z is exactly what we did when we derived the least squares regression coefficients. Thus, for this regression model, the maximum likelihood estimators of β1 and β2 are identical to the least squares estimators.
log L = n log(1/(σ√(2π))) − (1/(2σ²)) Z

where Z = (Y1 − β1 − β2X1)² + … + (Yn − β1 − β2Xn)² = Σ ei², with ei = Yi − β̂1 − β̂2Xi

As a consequence, Z will be the sum of the squares of the least squares residuals.
 1  s
2
log L  n log  Z
 s 2  2
 
1  1  s 2
 n log   n log  Z
s   2  2
 1  s
2
  n log s  n log  Z
 2  2

To obtain the maximum likelihood estimator of s, it is convenient to rearrange the log-likelihood function
as shown.

19
 1  s
2
log L  n log  Z
 s 2  2
 
1  1  s 2
 n log   n log  Z
s   2  2
 1  s
2
  n log s  n log  Z
 2  2
 log L
   s  3 Z  s  3 Z  ns 2 
n
s s

Differentiating it with respect to s, we obtain the expression shown.

20
 1  s
2
log L  n log  Z
 s 2  2
 
1  1  s 2
 n log   n log  Z
s   2  2
 1  s
2
  n log s  n log  Z
 2  2
 log L
   s  3 Z  s  3 Z  ns 2 
n
s s

ŝ 2    i
2
Z e
n n

The first order condition for a maximum requires this to be equal to zero. Hence the maximum likelihood
estimator of the variance is the sum of the squares of the residuals divided by n.

21
 1  s
2
log L  n log  Z
 s 2  2
 
1  1  s 2
 n log   n log  Z
s   2  2
 1  s
2
  n log s  n log  Z
 2  2
 log L
   s  3 Z  s  3 Z  ns 2 
n
s s

ŝ 2    i
2
Z e
n n

Note that this is biased for finite samples. To obtain an unbiased estimator, we should divide by n–k,
where k is the number of parameters, in this case 2. However, the bias disappears as the sample size
becomes large.
22
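A small numeric check of these conclusions, on made-up data: the (β1, β2) that maximize the log-likelihood are the OLS estimates, and the ML variance estimator divides the residual sum of squares by n (versus n − 2 for the unbiased estimator). All data values are hypothetical.

```python
import math

# Hypothetical data
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 2.9, 4.2, 4.8, 6.1]
n = len(X)

# OLS estimates (which the derivation shows are also the ML estimates)
mx, my = sum(X) / n, sum(Y) / n
b2 = sum((x - mx) * (y - my) for x, y in zip(X, Y)) / sum((x - mx) ** 2 for x in X)
b1 = my - b2 * mx
Z = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))   # residual sum of squares

sigma2_ml = Z / n          # ML estimator of the variance (biased)
sigma2_unb = Z / (n - 2)   # unbiased estimator divides by n - k, with k = 2

def loglik(a, b, s2):
    """Normal-regression log-likelihood at intercept a, slope b, variance s2."""
    return sum(-0.5 * math.log(2 * math.pi * s2) - (y - a - b * x) ** 2 / (2 * s2)
               for x, y in zip(X, Y))

base = loglik(b1, b2, sigma2_ml)
# Perturbing the coefficients away from OLS can only lower the log-likelihood...
assert base >= loglik(b1 + 0.1, b2, sigma2_ml)
assert base >= loglik(b1, b2 - 0.1, sigma2_ml)
# ...and so can perturbing the variance away from Z/n
assert base >= loglik(b1, b2, sigma2_ml * 1.2)
print(b1, b2, sigma2_ml, sigma2_unb)
```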
