Lecture 15: Binary Dependent Variables
Lecture Outline
Introduction
What if Y is binary?
Y = get into college, or not; X = parental income.
Y = person smokes, or not; X = cigarette tax rate, income.
Y = mortgage application is accepted, or not; X = race, income, …
Mortgage applications
Example:
Most individuals who want to buy a house apply for a mortgage at a bank.
Not all mortgage applications are approved.
What determines whether a mortgage application is approved or denied?
During this lecture we use a subset of the Boston HMDA data (N = 2380),
a data set on mortgage applications collected by the Federal Reserve Bank of Boston.
Variable    Description                           Mean    SD
deny        = 1 if mortgage application denied    0.120   0.325
pi_ratio    monthly debt-to-income ratio          0.331   0.107
black       = 1 if applicant is black             0.142   0.350
Mortgage applications

Linear probability model (OLS with robust standard errors), Stata output of Tuesday February 18 2014:

                                        Number of obs = 2380
                                        F(1, 2378)    = 37.56
                                        Prob > F      = 0.0000
                                        R-squared     = 0.0397
                                        Root MSE      = .31828

                        Robust
deny        Coef.       Std. Err.    t       P>|t|   95% CI (upper)
pi_ratio    .6035349    .0984826     6.13    0.000    .7966555
_cons      -.0799096    .0319666    -2.50    0.012   -.0172243
No perfect multicollinearity.

Yi = β0 + β1 X1i + ⋯ + βk Xki + ui

The variance of a Bernoulli random variable (Ch. 2, S&W):

Var(Y) = Pr(Y = 1) · (1 − Pr(Y = 1))

We can use this to find the conditional variance of the error term:

Var(ui | X1i, …, Xki) ≠ σu²

so the errors of the linear probability model are heteroskedastic.
Model the probability as Pr(Y = 1) = G(Z), where

Z = β0 + β1 X1i + ⋯ + βk Xki

and

0 ≤ G(Z) ≤ 1

Probit: G(Z) = Φ(Z)

Logit:  G(Z) = 1 / (1 + e^(−Z))
Probit

Probit regression models the probability that Y = 1
using the cumulative standard normal distribution function Φ(Z),
evaluated at Z = β0 + β1 X1i + ⋯ + βk Xki.
Since Φ(z) = Pr(Z ≤ z), the predicted probabilities of the
probit model are between 0 and 1.

Example
Suppose we have only 1 regressor and Z = −2 + 3X1.
We want to know the probability that Y = 1 when X1 = 0.4:
z = −2 + 3 × 0.4 = −0.8
Pr(Y = 1) = Pr(Z ≤ −0.8) = Φ(−0.8)
Probit

Page 791 in the book: Pr(z ≤ −0.8) = .2119, so

Pr(Y = 1) = Pr(Z ≤ −0.8) = Φ(−0.8) = 0.2119
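The table lookup can be reproduced with SciPy's standard normal CDF. A minimal sketch (SciPy is an assumption here, not part of the lecture):

```python
# Evaluating the probit predicted probability from the example
# (Z = -2 + 3*X1 at X1 = 0.4) with the standard normal CDF.
from scipy.stats import norm

z = -2 + 3 * 0.4          # = -0.8
p = norm.cdf(z)           # Phi(-0.8)
print(round(p, 4))        # -> 0.2119
```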
Logit

Logit regression models the probability that Y = 1
using the cumulative standard logistic distribution function:

F(Z) = 1 / (1 + e^(−Z))
Logit

[Figure: density of the standard logistic distribution]

Pr(Y = 1) = Pr(Z ≤ −0.8) = 1 / (1 + e^(0.8)) = 0.31
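The logit counterpart of the same example needs only the standard library, since the logistic CDF has a closed form (a small sketch, not from the lecture):

```python
# The logit model's predicted probability at z = -0.8,
# using F(z) = 1 / (1 + exp(-z)).
import math

z = -0.8
p = 1 / (1 + math.exp(-z))   # = 1 / (1 + e^0.8)
print(round(p, 2))           # -> 0.31
```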
[Figure: the logistic and normal CDFs compared]
The likelihood function is the joint probability of the data:

f(p; Y1, …, Yn) = Pr(Y1 = y1) · … · Pr(Yn = yn)
               = p^y1 (1 − p)^(1−y1) · … · p^yn (1 − p)^(1−yn)
               = p^(Σ yi) · (1 − p)^(n − Σ yi)

Take logs and maximize with respect to p:

ln f(p) = (Σ yi) ln(p) + (n − Σ yi) ln(1 − p)

d ln f / dp = (Σ yi)/p − (n − Σ yi)/(1 − p) = 0

⇒ (1 − p) Σ yi = p (n − Σ yi)
⇒ Σ yi = n p
⇒ p̂ = (1/n) Σ yi = Ȳ

(all sums run over i = 1, …, n)
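The derivation above can be checked numerically. The sketch below (simulated data, not part of the lecture) confirms that the Bernoulli log likelihood peaks at the sample fraction of 1s:

```python
# Grid search over p, confirming that the Bernoulli log likelihood
# is maximized at p-hat = Ybar.
import math
import random

random.seed(0)
y = [1 if random.random() < 0.3 else 0 for _ in range(1000)]
n, s = len(y), sum(y)

def loglik(p):
    return s * math.log(p) + (n - s) * math.log(1 - p)

# search over a fine grid of candidate values of p
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=loglik)
print(p_hat, s / n)   # the maximizer equals the sample mean Ybar
```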
Step 1: write down the likelihood function

f(β0, …, βk; Y1, …, Yn)
  = Pr(Y1 = y1) · … · Pr(Yn = yn)
  = p1^y1 (1 − p1)^(1−y1) · … · pn^yn (1 − pn)^(1−yn)
  = [Φ(β0 + β1 X11 + ⋯ + βk Xk1)]^y1 · [1 − Φ(β0 + β1 X11 + ⋯ + βk Xk1)]^(1−y1) · … ·
    [Φ(β0 + β1 X1n + ⋯ + βk Xkn)]^yn · [1 − Φ(β0 + β1 X1n + ⋯ + βk Xkn)]^(1−yn)
As when obtaining the MLE of p, for the probit model it is easier to take the
logarithm of the likelihood function.

Step 2: maximize the log likelihood function

ln[f_probit(β0, …, βk; Y1, …, Yn | X1i, …, Xki, i = 1, …, n)]
  = Σ Yi ln[Φ(β0 + β1 X1i + ⋯ + βk Xki)]
  + Σ (1 − Yi) ln[1 − Φ(β0 + β1 X1i + ⋯ + βk Xki)]

with respect to β0, …, βk (sums over i = 1, …, n).

There is no simple formula for the probit MLE; the maximization must be
done using a numerical algorithm on a computer.
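To illustrate what "numerical algorithm" means here, the sketch below fits a one-regressor probit by maximizing the log likelihood with `scipy.optimize.minimize` on simulated data; the lecture's actual estimates come from Stata, not this code:

```python
# Fitting a one-regressor probit by numerical maximum likelihood
# on simulated data (illustrative sketch only).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0, 1, n)
b0, b1 = -2.0, 3.0                                    # true coefficients
y = (rng.normal(size=n) < b0 + b1 * x).astype(float)  # Pr(Y=1|x) = Phi(b0 + b1*x)

def neg_loglik(beta):
    p = np.clip(norm.cdf(beta[0] + beta[1] * x), 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=[0.0, 0.0], method="BFGS")
print(res.x)   # should be close to (-2, 3)
```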
The likelihood of the logit model:

f(β0, …, βk; Y1, …, Yn) = p1^y1 (1 − p1)^(1−y1) · … · pn^yn (1 − pn)^(1−yn)

very similar to the probit model, but with a different function for pi:

pi = 1 / [1 + e^(−(β0 + β1 X1i + ⋯ + βk Xki))]

The log likelihood is

  Σ Yi ln[ 1 / (1 + e^(−(β0 + β1 X1i + ⋯ + βk Xki))) ]
+ Σ (1 − Yi) ln[ 1 − 1 / (1 + e^(−(β0 + β1 X1i + ⋯ + βk Xki))) ]

There is no simple formula for the logit MLE; the maximization must be
done using a numerical algorithm on a computer.
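The same numerical approach works for the logit model; only the link function changes. Again a simulated-data sketch, not the lecture's Stata estimation:

```python
# Fitting a one-regressor logit by numerical maximum likelihood
# on simulated data; note only the CDF differs from the probit sketch.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(0, 1, n)
b0, b1 = -4.0, 5.0                                  # true coefficients
p_true = 1 / (1 + np.exp(-(b0 + b1 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)    # Pr(Y=1|x) = logistic CDF

def neg_loglik(beta):
    p = np.clip(1 / (1 + np.exp(-(beta[0] + beta[1] * x))), 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=[0.0, 0.0], method="BFGS")
print(res.x)   # should be close to (-4, 5)
```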
Iteration 0:  log likelihood = -872.0853
Iteration 1:  log likelihood = -832.02975
Iteration 2:  log likelihood = -831.79239
Iteration 3:  log likelihood = -831.79234

Probit regression                       Number of obs =   2380
                                        LR chi2(1)    =  80.59
                                        Prob > chi2   = 0.0000
Log likelihood = -831.79234             Pseudo R2     = 0.0462

deny        Coef.       Std. Err.    z        P>|z|   95% CI (upper)
pi_ratio    2.967907    .3591054     8.26     0.000    3.67174
_cons      -2.194159    .12899     -17.01     0.000   -1.941343
The probit model satisfies these conditions:
all predicted probabilities are between 0 and 1!
Iteration 0:  log likelihood = -872.0853
Iteration 1:  log likelihood = -830.96071
Iteration 2:  log likelihood = -830.09497
Iteration 3:  log likelihood = -830.09403
Iteration 4:  log likelihood = -830.09403

Logistic regression                     Number of obs =   2380
                                        LR chi2(1)    =  83.98
                                        Prob > chi2   = 0.0000
Log likelihood = -830.09403             Pseudo R2     = 0.0482

deny        Coef.       Std. Err.    z        P>|z|   95% CI (upper)
pi_ratio    5.884498    .7336006     8.02     0.000    7.322328
_cons      -4.028432    .2685763   -15.00     0.000   -3.502032
Iteration 0:  log likelihood = -872.0853
Iteration 1:  log likelihood = -800.88504
Iteration 2:  log likelihood = -797.1478
Iteration 3:  log likelihood = -797.13604
Iteration 4:  log likelihood = -797.13604

Probit regression                       Number of obs =   2380
                                        LR chi2(2)    = 149.90
                                        Prob > chi2   = 0.0000
Log likelihood = -797.13604             Pseudo R2     = 0.0859

deny        Coef.       Std. Err.    z        P>|z|   95% CI (upper)
black       .7081579    .0834327     8.49     0.000    .8716831
pi_ratio    2.741637    .3595888     7.62     0.000    3.446418
_cons      -2.258738    .129882    -17.39     0.000   -2.004174
Iteration 0:  log likelihood = -872.0853
Iteration 1:  log likelihood = -806.3571
Iteration 2:  log likelihood = -795.72934
Iteration 3:  log likelihood = -795.69521
Iteration 4:  log likelihood = -795.69521

Logistic regression                     Number of obs =   2380
                                        LR chi2(2)    = 152.78
                                        Prob > chi2   = 0.0000
Log likelihood = -795.69521             Pseudo R2     = 0.0876

deny        Coef.       Std. Err.    z        P>|z|   95% CI (upper)
black       1.272782    .1461983     8.71     0.000    1.559325
pi_ratio    5.370362    .7283192     7.37     0.000    6.797841
_cons      -4.125558    .2684161   -15.37     0.000   -3.599472
                           LPM          Probit       Logit
black                      0.177***     0.71***      1.27***
                          (0.025)      (0.083)      (0.15)
P/I ratio                  0.559***     2.74***      5.37***
                          (0.089)      (0.44)       (0.96)
constant                  -0.091***    -2.26***     -4.13***
                          (0.029)      (0.16)       (0.35)

difference in Pr(deny = 1),
black vs. white applicant,
at P/I ratio = 0.3:        17.7%        15.8%        14.8%
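The bottom row of the table can be reproduced from the reported coefficients. The sketch below plugs P/I ratio = 0.3 into each fitted model and takes the black-white difference in predicted denial probability:

```python
# Difference in predicted denial probability between a black and a
# white applicant with P/I ratio = 0.3, from the table's coefficients.
import math
from scipy.stats import norm

pi_ratio = 0.3

# LPM: black enters linearly, so the difference is its coefficient
lpm_diff = 0.177

# Probit: Phi(const + b_pi*ratio + b_black) - Phi(const + b_pi*ratio)
probit_diff = norm.cdf(-2.26 + 2.74 * pi_ratio + 0.71) - norm.cdf(-2.26 + 2.74 * pi_ratio)

# Logit: the same comparison with the logistic CDF
def F(z):
    return 1 / (1 + math.exp(-z))

logit_diff = F(-4.13 + 5.37 * pi_ratio + 1.27) - F(-4.13 + 5.37 * pi_ratio)

print(round(lpm_diff, 3), round(probit_diff, 3), round(logit_diff, 3))
# -> 0.177 0.158 0.148  (the 17.7% / 15.8% / 14.8% row)
```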
Internal validity
Is there omitted variable bias?
Is the functional form correct?
  Probit model: is the assumption of a normal distribution correct?
  Logit model: is the assumption of a logistic distribution correct?
Is there measurement error?
Is there sample selection bias?
Is there a problem of simultaneous causality?

External validity
These data are from Boston in 1990-91.
Do you think the results also apply today, where you live?
Linear probability model (OLS with robust standard errors), Stata output of Sunday March 23 2014:

                                        Number of obs = 3796
                                        F(1, 3794)    = 15.77
                                        Prob > F      = 0.0001
                                        R-squared     = 0.0036
                                        Root MSE      = .44302

                        Robust
college     Coef.       Std. Err.    t        P>|t|
dist       -.012471     .0031403    -3.97     0.000
_cons       .2910057    .0093045    31.28     0.000

Probit regression                       Number of obs =   3796
                                        LR chi2(1)    =  14.48
                                        Prob > chi2   = 0.0001
Log likelihood = -2204.8977             Pseudo R2     = 0.0033

college     Coef.       Std. Err.    z        P>|z|   [95% Conf. Interval]
dist       -.0407873    .0109263    -3.73     0.000   -.0622025   -.0193721
_cons      -.5464198    .028192    -19.38     0.000   -.6016752   -.4911645

Logistic regression                     Number of obs =   3796
                                        LR chi2(1)    =  14.68
                                        Prob > chi2   = 0.0001
Log likelihood = -2204.8006             Pseudo R2     = 0.0033

college     Coef.       Std. Err.    z        P>|z|   95% CI (upper)
dist       -.0709896    .0193593    -3.67     0.000   -.033046
_cons      -.8801555    .0476434   -18.47     0.000   -.786776
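As a worked check, the fitted models above imply the following predicted probabilities of college attendance at various distances; the helper functions below are illustrative, not from the lecture:

```python
# Predicted Pr(college = 1) at distance d (in tens of miles), using
# the coefficient estimates reported in the Stata output above.
import math
from scipy.stats import norm

def lpm(d):
    return 0.2910057 - 0.012471 * d

def probit(d):
    return norm.cdf(-0.5464198 - 0.0407873 * d)

def logit(d):
    return 1 / (1 + math.exp(0.8801555 + 0.0709896 * d))

for d in (0, 5, 10):
    print(d, round(lpm(d), 3), round(probit(d), 3), round(logit(d), 3))
```

All three models agree closely at d = 0 (about a 0.29 probability) and all predict a declining probability as distance grows.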
[Figure: predicted Pr(college = 1) against distance to college (× 10 miles)
for the linear probability, probit, and logit models]
Summary

If Yi is binary, then E(Yi | Xi) = Pr(Yi = 1 | Xi)

Three models:
1. linear probability model
2. probit
3. logit