Logit
$$
L(\alpha, \beta \mid y) = \prod_{i=1}^{n} \pi(x_i)^{y_i}\,\bigl(1 - \pi(x_i)\bigr)^{1 - y_i} \tag{1}
$$
Step 1: Take the log of the likelihood function
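$$
\begin{aligned}
\ln L(\alpha, \beta \mid y) &= \sum_{i=1}^{n} \ln\left[\pi(x_i)^{y_i}\,\bigl(1 - \pi(x_i)\bigr)^{1 - y_i}\right] \\
&= \sum_{i=1}^{n} \left\{\ln\left[\pi(x_i)^{y_i}\right] + \ln\left[\bigl(1 - \pi(x_i)\bigr)^{1 - y_i}\right]\right\} \\
&= \sum_{i=1}^{n} \left[y_i \ln\bigl(\pi(x_i)\bigr) + (1 - y_i)\ln\bigl(1 - \pi(x_i)\bigr)\right]
\end{aligned}
$$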
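Step 2: Distribute $(1 - y_i)$ and rearrange
$$
\begin{aligned}
\ln L &= \sum_{i=1}^{n} \left[y_i \ln\bigl(\pi(x_i)\bigr) + \ln\bigl(1 - \pi(x_i)\bigr) - y_i \ln\bigl(1 - \pi(x_i)\bigr)\right] \\
&= \sum_{i=1}^{n} \left[y_i \ln\bigl(\pi(x_i)\bigr) - y_i \ln\bigl(1 - \pi(x_i)\bigr) + \ln\bigl(1 - \pi(x_i)\bigr)\right]
\end{aligned}
$$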
Step 3: Factor out $y_i$
$$
\ln L = \sum_{i=1}^{n} \left[y_i \Bigl(\ln\bigl(\pi(x_i)\bigr) - \ln\bigl(1 - \pi(x_i)\bigr)\Bigr) + \ln\bigl(1 - \pi(x_i)\bigr)\right]
$$
Step 4: Use Log Properties and Substitute
$$
\ln L = \sum_{i=1}^{n} \left[y_i \ln\left(\frac{\pi(x_i)}{1 - \pi(x_i)}\right) + \ln\bigl(1 - \pi(x_i)\bigr)\right]
$$
Where, per Casella and Berger, pp. 591 and 592, respectively:
$$
\ln\left(\frac{\pi(x_i)}{1 - \pi(x_i)}\right) = \alpha + \beta x_i
$$
$$
\pi(x_i) = \frac{e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}
$$
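Step 5: Multiply the 1 by $\frac{1 + e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}$ to put the terms over a common denominator
$$
\begin{aligned}
\ln L &= \sum_{i=1}^{n} \left[y_i(\alpha + \beta x_i) + \ln\left(\frac{1 + e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}} - \frac{e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}\right)\right] \\
&= \sum_{i=1}^{n} \left[y_i(\alpha + \beta x_i) + \ln\left(\frac{1}{1 + e^{\alpha + \beta x_i}}\right)\right]
\end{aligned}
$$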
Step 6: Use log properties
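$$
\begin{aligned}
\ln L &= \sum_{i=1}^{n} \left[y_i(\alpha + \beta x_i) + \ln(1) - \ln\bigl(1 + e^{\alpha + \beta x_i}\bigr)\right] \\
&= \sum_{i=1}^{n} \left[y_i(\alpha + \beta x_i) - \ln\bigl(1 + e^{\alpha + \beta x_i}\bigr)\right]
\end{aligned}
$$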
The log-likelihood function above agrees with Gujarati, page 590.
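As a quick numerical sanity check on Steps 1 through 6, the sketch below evaluates the log likelihood in both its starting form (Step 1) and its simplified form (Step 6) and confirms they agree. The data and parameter values are hypothetical toy numbers chosen for illustration, not the Challenger data, and the function names are assumptions of this sketch.

```python
import math

def pi(alpha, beta, x):
    """Logistic probability pi(x) = e^(alpha + beta*x) / (1 + e^(alpha + beta*x))."""
    z = alpha + beta * x
    return math.exp(z) / (1.0 + math.exp(z))

def loglik_bernoulli(alpha, beta, xs, ys):
    """Step 1 form: sum of y*ln(pi) + (1 - y)*ln(1 - pi)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = pi(alpha, beta, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def loglik_simplified(alpha, beta, xs, ys):
    """Step 6 form: sum of y*(alpha + beta*x) - ln(1 + e^(alpha + beta*x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = alpha + beta * x
        total += y * z - math.log(1.0 + math.exp(z))
    return total

# Hypothetical toy data (NOT the Challenger data) and arbitrary parameter values.
toy_x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
toy_y = [0, 0, 1, 0, 1, 0, 1, 1]
a, b = -2.0, 1.0

difference = abs(loglik_bernoulli(a, b, toy_x, toy_y)
                 - loglik_simplified(a, b, toy_x, toy_y))
print(difference)  # should be at the level of floating-point rounding
```

Any parameter values should give the same agreement, since the two forms are algebraically identical.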
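Step 7: Take first derivatives with respect to $\alpha$ and $\beta$, using the rules
$$
\frac{\partial}{\partial u}\ln u = \frac{1}{u}\,du, \qquad \frac{\partial}{\partial x} e^{x} = e^{x}\,dx
$$
Each expression below is the contribution of observation $i$; the full derivative is the sum of these contributions over the observations.
$$
\frac{\partial \ln L}{\partial \alpha} = y_i(1) + 0 - \frac{(1)e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}} = y_i - \frac{e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}
$$
$$
\frac{\partial \ln L}{\partial \beta} = 0 + y_i(1)(x_i) - \frac{x_i e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}} = y_i x_i - \frac{x_i e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}
$$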
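One way to check the first derivatives is to compare them with central finite differences of the log likelihood. The sketch below does this on hypothetical toy data (not the Challenger data); the function names, the step size `h`, and the evaluation point are assumptions made for illustration.

```python
import math

def loglik(alpha, beta, xs, ys):
    """Simplified log likelihood: sum of y*(alpha + beta*x) - ln(1 + e^(alpha + beta*x))."""
    return sum(y * (alpha + beta * x) - math.log(1.0 + math.exp(alpha + beta * x))
               for x, y in zip(xs, ys))

def score(alpha, beta, xs, ys):
    """Analytic first derivatives, summed over observations."""
    d_alpha = 0.0
    d_beta = 0.0
    for x, y in zip(xs, ys):
        e = math.exp(alpha + beta * x)
        d_alpha += y - e / (1.0 + e)          # derivative with respect to alpha
        d_beta += y * x - x * e / (1.0 + e)   # derivative with respect to beta
    return d_alpha, d_beta

def numeric_score(alpha, beta, xs, ys, h=1e-6):
    """Central finite-difference approximation to the same two derivatives."""
    d_alpha = (loglik(alpha + h, beta, xs, ys) - loglik(alpha - h, beta, xs, ys)) / (2 * h)
    d_beta = (loglik(alpha, beta + h, xs, ys) - loglik(alpha, beta - h, xs, ys)) / (2 * h)
    return d_alpha, d_beta

# Hypothetical toy data (NOT the Challenger data).
toy_x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
toy_y = [0, 0, 1, 0, 1, 0, 1, 1]

analytic = score(-1.0, 0.5, toy_x, toy_y)
numeric = numeric_score(-1.0, 0.5, toy_x, toy_y)
print(analytic, numeric)  # the two pairs should agree to several decimal places
```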
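Step 8: Take second derivatives with respect to $\alpha$ and $\beta$. For the second derivative with respect to $\alpha$, use the quotient rule. For the second derivative with respect to $\beta$, use both the product and quotient rules.
$$
u = e^{\alpha + \beta x_i}, \qquad v = 1 + e^{\alpha + \beta x_i}
$$
$$
u' = (1)e^{\alpha + \beta x_i} = e^{\alpha + \beta x_i}, \qquad v' = (1)e^{\alpha + \beta x_i} = e^{\alpha + \beta x_i}
$$
$$
\frac{\partial^2 \ln L}{\partial \alpha^2} = 0 - \frac{vu' - uv'}{v^2}
= -\frac{\bigl(1 + e^{\alpha + \beta x_i}\bigr)\bigl(e^{\alpha + \beta x_i}\bigr) - \bigl(e^{\alpha + \beta x_i}\bigr)\bigl(e^{\alpha + \beta x_i}\bigr)}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
= -\frac{e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
$$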
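For the second derivative with respect to $\beta$, first use the product rule for the derivative of the numerator term, $x_i e^{\alpha + \beta x_i}$, with respect to $\beta$.
$$
u = x_i, \qquad v = e^{\alpha + \beta x_i}, \qquad u' = 0, \qquad v' = (1)x_i e^{\alpha + \beta x_i}
$$
$$
uv' + vu' = x_i x_i e^{\alpha + \beta x_i} + e^{\alpha + \beta x_i}(0) = x_i^2 e^{\alpha + \beta x_i}
$$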
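The result above is used as $u'$ when applying the quotient rule to $\dfrac{x_i e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}$:
$$
u = x_i e^{\alpha + \beta x_i}, \qquad v = 1 + e^{\alpha + \beta x_i}
$$
$$
u' = x_i^2 e^{\alpha + \beta x_i}, \qquad v' = 0 + x_i e^{\alpha + \beta x_i}
$$
$$
\frac{\partial^2 \ln L}{\partial \beta^2} = 0 - \frac{vu' - uv'}{v^2}
= -\frac{\bigl(1 + e^{\alpha + \beta x_i}\bigr)\bigl(x_i^2 e^{\alpha + \beta x_i}\bigr) - \bigl(x_i e^{\alpha + \beta x_i}\bigr)\bigl(x_i e^{\alpha + \beta x_i}\bigr)}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
= -\frac{x_i^2 e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
$$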
Step 9: Calculate the last term needed for the Hessian using the quotient
rule.
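$$
\frac{\partial^2 \ln L}{\partial \alpha\,\partial \beta}
= \frac{\partial}{\partial \beta}\left[y_i - \frac{e^{\alpha + \beta x_i}}{1 + e^{\alpha + \beta x_i}}\right]
= -\frac{x_i e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
$$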
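The cross-partial in Step 9 is stated without its intermediate algebra. A sketch of the quotient-rule working, written in the same $u$, $v$ notation as Step 8 but with the primes now denoting derivatives with respect to $\beta$:

$$
\begin{aligned}
u &= e^{\alpha + \beta x_i}, & v &= 1 + e^{\alpha + \beta x_i}, \\
u' &= x_i e^{\alpha + \beta x_i}, & v' &= x_i e^{\alpha + \beta x_i},
\end{aligned}
$$
$$
-\frac{vu' - uv'}{v^2}
= -\frac{\bigl(1 + e^{\alpha + \beta x_i}\bigr)\bigl(x_i e^{\alpha + \beta x_i}\bigr) - \bigl(e^{\alpha + \beta x_i}\bigr)\bigl(x_i e^{\alpha + \beta x_i}\bigr)}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
= -\frac{x_i e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
$$

The $y_i$ term drops out because it does not depend on $\beta$.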
Step 10: Collect the second derivatives for the Hessian.
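$$
\nabla^2 f(\theta) =
\begin{bmatrix}
-\dfrac{e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2} & -\dfrac{x_i e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2} \\[2ex]
-\dfrac{x_i e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2} & -\dfrac{x_i^2 e^{\alpha + \beta x_i}}{\bigl(1 + e^{\alpha + \beta x_i}\bigr)^2}
\end{bmatrix}
$$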
$$
\theta_{m+1} = \theta_m - \left[\nabla^2 f(\theta)\big|_{\theta_m}\right]^{-1} \nabla f(\theta)\big|_{\theta_m}
$$
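The update rule can be sketched as a short loop. The code below is a minimal illustration under stated assumptions, not the procedure used in this document: it sums the per-observation derivative and Hessian terms, inverts the 2x2 Hessian by hand, and iterates from a starting guess of zero. The data are hypothetical toy values (not the Challenger data), and the function names, tolerance, and iteration cap are choices made for this sketch.

```python
import math

def gradient_and_hessian(alpha, beta, xs, ys):
    """Sum the per-observation first and second derivatives over the data."""
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        e = math.exp(alpha + beta * x)
        p = e / (1.0 + e)           # pi(x_i)
        w = e / (1.0 + e) ** 2      # common factor in every Hessian element
        g[0] += y - p
        g[1] += y * x - x * p
        h[0][0] -= w
        h[0][1] -= x * w
        h[1][1] -= x * x * w
    h[1][0] = h[0][1]               # the Hessian is symmetric
    return g, h

def newton_raphson(xs, ys, alpha=0.0, beta=0.0, tol=1e-10, max_iter=50):
    """Iterate theta_{m+1} = theta_m - inverse(Hessian) * gradient."""
    for _ in range(max_iter):
        g, h = gradient_and_hessian(alpha, beta, xs, ys)
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        # inverse(Hessian) * gradient, using the closed-form 2x2 inverse
        step_alpha = (h[1][1] * g[0] - h[0][1] * g[1]) / det
        step_beta = (-h[1][0] * g[0] + h[0][0] * g[1]) / det
        alpha -= step_alpha
        beta -= step_beta
        if max(abs(step_alpha), abs(step_beta)) < tol:
            break
    return alpha, beta

# Hypothetical toy data (NOT the Challenger data).
toy_x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
toy_y = [0, 0, 1, 0, 1, 0, 1, 1]

alpha_hat, beta_hat = newton_raphson(toy_x, toy_y)
final_gradient, _ = gradient_and_hessian(alpha_hat, beta_hat, toy_x, toy_y)
print(alpha_hat, beta_hat, final_gradient)  # gradient should be essentially zero
```

At convergence both first derivatives are approximately zero, which is the first-order condition for the maximum likelihood estimates.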
At this point, an Excel spreadsheet is needed with the following in each column:
(A) values for the dependent variable from Casella and Berger's Challenger data in Example 12.3.1
(B) values for the independent variable from Casella and Berger's Challenger data in Example 12.3.1
(C) the guess for $\alpha$
(D) the guess for $\beta$
(E) column intentionally left blank
(F) element 1,1 from the Hessian
(G) element 1,2 from the Hessian
(H) column intentionally left blank, but note that the Hessian is symmetric
(I) element 2,2 from the Hessian
(J) column intentionally left blank
(K) the first derivative with respect to $\alpha$, evaluated at the guess for each observation
(L) column intentionally left blank
(M) the first derivative with respect to $\beta$, evaluated at the guess for each observation
Using the first observation as an example, the formulas in each relevant column of the spreadsheet are:
Col F = -(EXP(C2 + (D2*B2))/((1 + EXP(C2 + D2*B2)))^2)
Col G = -B2*(EXP(C2 + (D2*B2))/((1 + EXP(C2 + D2*B2)))^2)
Col H = Symmetric
Col I = -(B2^2)*(EXP(C2 + (D2*B2))/((1 + EXP(C2 + D2*B2)))^2)
Col K = A2 - (EXP(C2 + (D2*B2))/((1 + EXP(C2 + D2*B2))))
Col M = (A2*B2) - (B2*(EXP(C2 + (D2*B2))/((1 + EXP(C2 + D2*B2)))))
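For readers without Excel, the column layout can be mirrored in a few lines of Python. This is only an illustrative sketch: the dictionary keys follow the column letters used in the spreadsheet description, the function names are hypothetical, and the rows are toy values rather than the Challenger data.

```python
import math

def spreadsheet_row(y, x, alpha, beta):
    """Columns F, G, I, K, M for one observation (A = y, B = x, C = alpha, D = beta)."""
    e = math.exp(alpha + beta * x)
    w = e / (1.0 + e) ** 2   # shared factor in the Hessian columns
    p = e / (1.0 + e)        # fitted probability used in the derivative columns
    return {
        "F": -w,              # Hessian element 1,1
        "G": -x * w,          # Hessian element 1,2 (and 2,1 by symmetry)
        "I": -(x ** 2) * w,   # Hessian element 2,2
        "K": y - p,           # first derivative with respect to alpha
        "M": y * x - x * p,   # first derivative with respect to beta
    }

def column_sums(rows, alpha, beta):
    """Sum each column over all observations, as done at the bottom of the sheet."""
    totals = {key: 0.0 for key in "FGIKM"}
    for y, x in rows:
        for key, value in spreadsheet_row(y, x, alpha, beta).items():
            totals[key] += value
    return totals

# Hypothetical toy rows of (y, x); NOT the Challenger data.
toy_rows = [(0, 0.5), (0, 1.0), (1, 1.5), (0, 2.0),
            (1, 2.5), (0, 3.0), (1, 3.5), (1, 4.0)]

sums = column_sums(toy_rows, 0.0, 0.0)
print(sums)
```

The sums of F, G, and I fill the Hessian matrix, and the sums of K and M form the vector of first derivatives.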
The results from columns F through I, K, and M are summed over the 23 observations, and the sums are used in the Stata code below to implement Newton-Raphson manually. The initial guesses are $\alpha = 20$ and $\beta = -.3$.
Here is the manual implementation of the Newton-Raphson method in Stata:
clear
******* Casella Data *************
/* First guess */
* Summed Hessian (columns F, G, I) evaluated at the first guess
matrix casella_c= (-3.13, -214.77\ -214.77, -14786.82)
matrix invcasella_c= inv(casella_c)
matrix list invcasella_c
* Summed first derivatives (columns K, M), transposed to a column vector
matrix casella_d= (-1.030, -65.869)
matrix transcasella_d= casella_d'
matrix list casella_d
matrix list transcasella_d
matrix casella_guess= (20, -.30)
matrix transcasella_guess= casella_guess'
* Newton-Raphson update: guess - inv(Hessian)*gradient
matrix logit_coeff= transcasella_guess-(invcasella_c*transcasella_d)
matrix list logit_coeff
matrix translogit_coeff= logit_coeff'
matrix list translogit_coeff
/* Second guess */
matrix casella_a_second= (-3.428, -232.656\ -232.656, -15893.986)
matrix invcasella_a_second= inv(casella_a_second)
matrix list invcasella_a_second
matrix casella_d_two= (.155047958, 7.852477079)
matrix transcasella_d_two= casella_d_two'
matrix list casella_d_two
matrix list transcasella_d_two
matrix casella_guess_two= (13.07929, -.2039352)
matrix transcasella_guess_two= casella_guess_two'
matrix logit_coeff_two= transcasella_guess_two- ///
(invcasella_a_second*transcasella_d_two)
matrix list logit_coeff_two
/* Third guess */
matrix casella_1= (-3.27827, -222.83347\ -222.83347, -15223.56817)
matrix invcasella_1= inv(casella_1)
matrix casella_2= (.007626928, .303416161)
matrix transcasella_2= casella_2'
matrix transcasella_3= casella_2'
matrix guess_3= (14.87091, -.2296669)
matrix transguess_3= guess_3'
matrix logit_coeff_3= transguess_3- (invcasella_1*transcasella_2)
matrix list logit_coeff_3
/* Fourth guess */
matrix casella_4= (-3.26061, -221.65395\ -221.65395, -15153.07005)
matrix invcasella_4= inv(casella_4)
matrix casella_5= (.000058, .029397859)
matrix transcasella_5= casella_5'
matrix guess_6= (15.0632, -.2324616)
matrix transguess_6= guess_6'
matrix logit_coeff_4= transguess_6 -(invcasella_4*transcasella_5)
matrix list logit_coeff_4
* Compare the manual result with Stata's built-in logit estimator
import excel using "C:\econometrics\logit\casella_data.xlsx", firstrow
logit y x
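The matrix arithmetic in the Stata listing can be cross-checked outside Stata. The sketch below, with a hypothetical helper named `newton_update`, applies the same update to the summed Hessians, first derivatives, and guesses quoted in the first two blocks of the listing. If the arithmetic is right, each result should land close to the guess used in the following block, up to the rounding of the quoted inputs.

```python
def newton_update(guess, hessian, gradient):
    """Return guess - inverse(hessian) * gradient for a 2x2 Hessian."""
    (a, b), (c, d) = hessian
    det = a * d - b * c
    # Closed-form 2x2 inverse applied to the gradient vector
    step_alpha = (d * gradient[0] - b * gradient[1]) / det
    step_beta = (-c * gradient[0] + a * gradient[1]) / det
    return (guess[0] - step_alpha, guess[1] - step_beta)

# Values copied from the "First guess" block of the Stata listing.
first = newton_update(
    (20.0, -0.30),
    ((-3.13, -214.77), (-214.77, -14786.82)),
    (-1.030, -65.869),
)

# Values copied from the "Second guess" block of the Stata listing.
second = newton_update(
    (13.07929, -0.2039352),
    ((-3.428, -232.656), (-232.656, -15893.986)),
    (0.155047958, 7.852477079),
)

print(first)   # compare with the second guess in the listing: (13.07929, -.2039352)
print(second)  # compare with the third guess in the listing: (14.87091, -.2296669)
```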