Marginal Effects For Binary Response Models Nonlinear
Abstract
Marginal Effects for binary response variables are nonlinear functions of the
parameter estimates. Many of the applications of these models also contain nonlinear
errors for these models. We also use this approach to present an empirical example in
which marginal effects and the corresponding confidence intervals are graphically
illustrated.
*
Corresponding author, [email protected]
1. Introduction
estimates of marginal effects. For such models, Anderson and Newell (2003) show
variables, such as quadratic, higher order polynomial and interaction terms (see for
example, Li and Zahniser 2002, Auld and Sihu 2005, Azfar and Danninger 2001,
the same results of Anderson and Newell (2003) can be applied to these models as
well.
In addition, it has been suggested (Brambor et al. 2006) that for such models
the marginal effects and corresponding confidence intervals are graphically illustrated
an empirical example, using the methods in this paper to calculate the marginal effects
$$P(y = 1 \mid x_1, \ldots, x_K) = F\Big(\beta_0 + \sum_{j=1}^{K} \beta_j x_j\Big) = F(z) \tag{1}$$
where $F$ is a function such that $0 < F(z) < 1$ and $x_1, \ldots, x_K$ are explanatory variables.
Following Anderson and Newell (2003) the explanatory variables are then normalized
so that they equal 0 at a desired reference point. In this paper we normalize them at
the value at which the marginal effect is to be calculated so that (1) is rewritten as:
$$P(y = 1 \mid x_1, \ldots, x_K) = F\Big(\gamma_0 + \sum_{j=1}^{K} \gamma_j (x_j - a_j)\Big) \tag{2}$$
where $a_1, \ldots, a_K$ are the values at which the explanatory variables are normalized.
Usually, this will imply that $a_j$ is the mean value of $x_j$ or a particular observation
value of the reference group for which the marginal effect is to be calculated.
β is defined by
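$$\hat{\gamma} = A\hat{\beta} \tag{3}$$
where
$$A = \begin{pmatrix} 1 & a_1 & \cdots & a_K \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \tag{4}$$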
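The transformation in (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the covariance step assumes that (5), which is not reproduced in this text, is the usual rule for a linear transformation, $A V A'$, where $V$ is the estimated covariance matrix of $\hat{\beta}$.

```python
def build_A(a):
    """Matrix A of (4): first row (1, a_1, ..., a_K), identity elsewhere."""
    K = len(a)
    A = [[1.0 if r == c else 0.0 for c in range(K + 1)] for r in range(K + 1)]
    for j, aj in enumerate(a, start=1):
        A[0][j] = float(aj)
    return A


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y)))
             for c in range(len(Y[0]))] for r in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def recentre(beta, V, a):
    """Return (gamma, V_gamma) with gamma = A beta.

    V_gamma = A V A' is an assumption about what (5) states.
    """
    A = build_A(a)
    gamma = [sum(A[r][k] * beta[k] for k in range(len(beta)))
             for r in range(len(beta))]
    V_gamma = matmul(matmul(A, V), transpose(A))
    return gamma, V_gamma
```

Only the intercept changes under this transformation; the slope estimates carry over unchanged, while the variance of the intercept and its covariances with the slopes absorb the chosen values of $a_j$.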
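$$\widehat{ME}_i = \frac{\partial P(y = 1 \mid x_1, \ldots, x_K)}{\partial x_i} = \hat{\gamma}_i \, f\Big(\hat{\gamma}_0 + \sum_{j=1}^{K} \hat{\gamma}_j (x_j - a_j)\Big) \tag{6}$$
where
$$f(z) = \frac{\partial F(z)}{\partial z} \tag{7}$$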
$$\widehat{ME}_i = \frac{\partial P(y = 1 \mid x_1, \ldots, x_K)}{\partial x_i} = \hat{\gamma}_i \, f(\hat{\gamma}_0) \tag{8}$$
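where $\hat{\gamma}_0$ and $\hat{\gamma}_i$ are calculated from the original parameter estimates using the linear
relationship (3):
$$\hat{\gamma}_0 = \hat{\beta}_0 + \sum_{j=1}^{K} a_j \hat{\beta}_j \quad \text{and} \quad \hat{\gamma}_i = \hat{\beta}_i$$
where the $a_j$'s are set to the values at which the marginal effect is to be calculated.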
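$$\operatorname{var}\big(\widehat{ME}_i\big) = \left(\frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_0}\right)^{2} \hat{\sigma}_{00} + 2 \, \frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_0} \frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_i} \, \hat{\sigma}_{0i} + \left(\frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_i}\right)^{2} \hat{\sigma}_{ii} \tag{9}$$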
between $\hat{\gamma}_0$ and $\hat{\gamma}_j$. These are obtained from the original estimated standard errors
using (5).
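As an illustration of (8) and (9), the sketch below evaluates the marginal effect and its delta-method variance for any density $f$ with a known derivative. The function names are hypothetical. The two partial derivatives follow directly from (8): $\partial \widehat{ME}_i / \partial \hat{\gamma}_0 = \hat{\gamma}_i f'(\hat{\gamma}_0)$ and $\partial \widehat{ME}_i / \partial \hat{\gamma}_i = f(\hat{\gamma}_0)$.

```python
import math


def marginal_effect(g0, gi, f):
    """Marginal effect (8): gamma_i * f(gamma_0)."""
    return gi * f(g0)


def marginal_effect_var(g0, gi, s00, s0i, sii, f, f_prime):
    """Delta-method variance (9) for ME_i = gamma_i * f(gamma_0)."""
    d0 = gi * f_prime(g0)  # derivative with respect to gamma_0
    di = f(g0)             # derivative with respect to gamma_i
    return d0 * d0 * s00 + 2.0 * d0 * di * s0i + di * di * sii


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_pdf_prime(z):
    """Derivative of the standard normal density: -z * phi(z)."""
    return -z * norm_pdf(z)
```

Passing `norm_pdf` and `norm_pdf_prime` gives the Probit case; any other pair of density and derivative can be substituted in the same way.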
For the Logit model,
$$\widehat{ME}_i = \hat{\gamma}_i \, f(\hat{\gamma}_0) = \hat{\gamma}_i \, \Lambda(\hat{\gamma}_0)\big(1 - \Lambda(\hat{\gamma}_0)\big) \tag{11}$$
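$$\operatorname{var}\big(\widehat{ME}_i\big) = \widehat{ME}_i^{\,2} \left[ \big(1 - 2\Lambda(\hat{\gamma}_0)\big)^{2} \hat{\sigma}_{00} + 2 \, \frac{1 - 2\Lambda(\hat{\gamma}_0)}{\hat{\gamma}_i} \, \hat{\sigma}_{0i} + \frac{1}{\hat{\gamma}_i^{2}} \, \hat{\sigma}_{ii} \right] \tag{12}$$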
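A minimal Python sketch of (11) and (12) is given below, with hypothetical function names. It assumes $\Lambda(z) = 1/(1 + e^{-z})$, the standard logistic distribution function, and requires $\hat{\gamma}_i \neq 0$ because (12) divides by $\hat{\gamma}_i$.

```python
import math


def logistic_cdf(z):
    """Standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_me(g0, gi):
    """Logit marginal effect (11)."""
    p = logistic_cdf(g0)
    return gi * p * (1.0 - p)


def logit_me_var(g0, gi, s00, s0i, sii):
    """Logit variance (12); gi must be non-zero."""
    p = logistic_cdf(g0)
    me = gi * p * (1.0 - p)
    w = 1.0 - 2.0 * p
    return me * me * (w * w * s00 + 2.0 * w / gi * s0i + sii / (gi * gi))
```

At $\hat{\gamma}_0 = 0$ the term $1 - 2\Lambda(\hat{\gamma}_0)$ vanishes, so only $\hat{\sigma}_{ii}$ contributes to the variance.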
$$F(z) = \Phi(z) \tag{13}$$
where the function $\Phi(\cdot)$ is the standard normal cumulative distribution function. The marginal effect for a
$$\widehat{ME}_i = \hat{\gamma}_i \, f(\hat{\gamma}_0) = \hat{\gamma}_i \, \phi(\hat{\gamma}_0) \tag{14}$$
where $\phi(\cdot)$ is the standard normal density function and the corresponding estimated
variance is:
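$$\operatorname{var}\big(\widehat{ME}_i\big) = \widehat{ME}_i^{\,2} \left[ \hat{\gamma}_0^{2} \, \hat{\sigma}_{00} - 2 \, \frac{\hat{\gamma}_0}{\hat{\gamma}_i} \, \hat{\sigma}_{0i} + \frac{1}{\hat{\gamma}_i^{2}} \, \hat{\sigma}_{ii} \right] \tag{15}$$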
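The Probit expressions (14) and (15) can be sketched in the same way. The function names are hypothetical, and $\hat{\gamma}_i \neq 0$ is required. The negative cross term arises because $\phi'(z) = -z\,\phi(z)$.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_me(g0, gi):
    """Probit marginal effect (14)."""
    return gi * norm_pdf(g0)


def probit_me_var(g0, gi, s00, s0i, sii):
    """Probit variance (15); gi must be non-zero."""
    me = probit_me(g0, gi)
    return me * me * (g0 * g0 * s00 - 2.0 * g0 / gi * s0i + sii / (gi * gi))
```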
Table 1 in Anderson and Newell (2003, p. 323) can be used to calculate these
expressions for the Logit and Probit models by substituting $\hat{\gamma}_0 = c$. Note that as
$\hat{\gamma}_0$ becomes large in absolute value, the marginal effects for the Logit and Probit models,
(11) and (14), tend to 0. As $\hat{\gamma}_0$ approaches 0, the marginal effect for the Logit
model (11) tends to $0.250 \times \hat{\gamma}_i$ and that for the Probit model (14) tends to
$0.399 \times \hat{\gamma}_i$.
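If $x_i$ is a dummy variable then set $a_i = 0$ and define the partial effect from
changing $x_i$ from 0 to 1, holding all other variables fixed, as
$$\begin{aligned} \widehat{PE}_i = \frac{\Delta P(y = 1 \mid x_1, \ldots, x_K)}{\Delta x_i} ={}& F\Big(\hat{\gamma}_0 + \hat{\gamma}_i + \sum_{j=1}^{i-1} \hat{\gamma}_j (x_j - a_j) + \sum_{j=i+1}^{K} \hat{\gamma}_j (x_j - a_j)\Big) \\ &- F\Big(\hat{\gamma}_0 + \sum_{j=1}^{i-1} \hat{\gamma}_j (x_j - a_j) + \sum_{j=i+1}^{K} \hat{\gamma}_j (x_j - a_j)\Big) \end{aligned} \tag{16}$$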
$$\widehat{PE}_i = \frac{\Delta P(y = 1 \mid x_1, \ldots, x_K)}{\Delta x_i} = F(\hat{\gamma}_0 + \hat{\gamma}_i) - F(\hat{\gamma}_0) \tag{17}$$
and the asymptotic variances are calculated using the delta method as follows:
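$$\begin{aligned} \operatorname{var}\big(\widehat{PE}_i\big) ={}& \{ f(\hat{\gamma}_0 + \hat{\gamma}_i) \}^{2} \big( \hat{\sigma}_{00} + 2\hat{\sigma}_{0i} + \hat{\sigma}_{ii} \big) \\ &- 2 f(\hat{\gamma}_0) f(\hat{\gamma}_0 + \hat{\gamma}_i) \big( \hat{\sigma}_{00} + \hat{\sigma}_{0i} \big) + f(\hat{\gamma}_0)^{2} \, \hat{\sigma}_{00} \end{aligned} \tag{18}$$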
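The partial effect (17) and its variance (18) can be sketched as follows. The function names are hypothetical, and the logistic functions are included only as one possible choice of $F$ and $f$. Expression (18) is the expanded form of the delta-method variance with gradient $\big(f(\hat{\gamma}_0 + \hat{\gamma}_i) - f(\hat{\gamma}_0),\; f(\hat{\gamma}_0 + \hat{\gamma}_i)\big)$.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_pdf(z):
    p = logistic_cdf(z)
    return p * (1.0 - p)


def partial_effect(g0, gi, F):
    """Partial effect (17) of switching a dummy from 0 to 1."""
    return F(g0 + gi) - F(g0)


def partial_effect_var(g0, gi, s00, s0i, sii, f):
    """Delta-method variance (18) of the partial effect."""
    f1 = f(g0 + gi)  # density at gamma_0 + gamma_i
    f0 = f(g0)       # density at gamma_0
    return (f1 * f1 * (s00 + 2.0 * s0i + sii)
            - 2.0 * f0 * f1 * (s00 + s0i)
            + f0 * f0 * s00)
```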
For both the Logit and Probit models (17) and (18) are easily calculated using the
appropriate expressions for $F$ and $f$. As above, Table 1 in Anderson and Newell
(2003, p. 323) can be used to calculate these expressions for the Logit and Probit
models by substituting $\hat{\gamma}_0 + \hat{\gamma}_i = c$ to find the value of $F(\hat{\gamma}_0 + \hat{\gamma}_i)$ and $\hat{\gamma}_0 = c$ to find
the value of $F(\hat{\gamma}_0)$. Note that as $\hat{\gamma}_0$ becomes large in absolute value, the partial effects (17) for the
Logit and Probit models tend to 0. As $\hat{\gamma}_0$ approaches 0, the partial
effects (17) for the Logit and Probit models tend to $F(\hat{\gamma}_i) - \tfrac{1}{2}$.
The expressions obtained in this section have wide applicability, as marginal and
partial effects can be generated for any point by simply specifying the values of $a_j$ in
the A matrix (4), since $\hat{\gamma}_0$ and $\hat{\gamma}_i$ and the corresponding standard errors can be calculated
from the relationships in (3) and (5) with the original parameter estimates and
standard errors. In the next section we show how they can also be applied to models
variables, such as quadratic, higher order polynomial and interaction terms (see for
example, Li and Zahniser 2002, Auld and Sihu 2005, Azfar and Danninger 2001,
Powers 2005). In this section, we show how the results of Section 3 can be applied to
models containing these variables by using a redefinition of the A matrix in the linear
If any of the explanatory variables are expressed in log form, say $x_i$ for
example, then define the corresponding $a_i$ in (5) to also be in log form and all the
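$$P(y = 1 \mid x_1, \ldots, x_K) = F\Big(\beta_0 + \beta_1 x_1 + \beta_2 x_1^{2} + \sum_{j=3}^{K} \beta_j x_j\Big) \tag{19}$$
$$A = \begin{pmatrix} 1 & a_1 & a_1^{2} & a_3 & \cdots & a_K \\ 0 & 1 & 2a_1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \tag{20}$$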
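The effect of the A matrix in (20) can be checked numerically. The sketch below, with hypothetical function names, builds the matrix for a model with a quadratic term in $x_1$, with coefficients ordered as $(\beta_0, \beta_1, \beta_2, \beta_3, \ldots, \beta_K)$ and $\beta_2$ attached to $x_1^{2}$. The second row gives $\hat{\gamma}_1 = \hat{\beta}_1 + 2 a_1 \hat{\beta}_2$, which is the derivative of the index in (19) with respect to $x_1$ at $x_1 = a_1$.

```python
def quadratic_A(a1, a_rest):
    """Matrix A of (20); a_rest holds (a_3, ..., a_K)."""
    n = 3 + len(a_rest)
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    A[0][1] = a1
    A[0][2] = a1 * a1
    for k, ak in enumerate(a_rest):
        A[0][3 + k] = ak
    A[1][2] = 2.0 * a1
    return A


def apply_A(A, beta):
    """gamma = A beta, as in (3)."""
    return [sum(row[k] * beta[k] for k in range(len(beta))) for row in A]
```

With $\hat{\gamma}_0$ and $\hat{\gamma}_1$ computed this way, the marginal effect of $x_1$ at the reference point is $\hat{\gamma}_1 f(\hat{\gamma}_0)$, that is, (8) with $i = 1$.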
assume the explanatory variable x1 appears with many powers. That is, assume,
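$$P(y = 1 \mid x) = F\Big(\beta_0 + \sum_{j=1}^{K} \beta_j x^{j}\Big) \tag{21}$$
$$A = \begin{pmatrix} 1 & a & a^{2} & a^{3} & a^{4} & \cdots & a^{K} \\ 0 & 1 & 2a & 3a^{2} & 4a^{3} & \cdots & K a^{K-1} \\ 0 & 0 & 1 & 3a & 6a^{2} & \cdots & \frac{K(K-1)}{2} a^{K-2} \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \end{pmatrix} \tag{22}$$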
Then, as above, to apply the results of Section 3 set i = 1 in (8) and (9).
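The visible rows of the polynomial A matrix are consistent with entries of the form $\binom{c}{r} a^{c-r}$ in row $r$ and column $c$ (both counted from 0). The sketch below builds the matrix under that assumption, which is an inference from the first three rows rather than a statement taken from the text. The function names are hypothetical.

```python
from math import comb


def polynomial_A(a, K):
    """(K+1)x(K+1) matrix with entry C(c, r) * a**(c - r) for c >= r, else 0.

    The binomial pattern is assumed from the first three rows of the matrix.
    """
    return [[comb(c, r) * a ** (c - r) if c >= r else 0
             for c in range(K + 1)] for r in range(K + 1)]


def apply_A(A, beta):
    """gamma = A beta, as in (3)."""
    return [sum(row[k] * beta[k] for k in range(len(beta))) for row in A]
```

Under this pattern, $\hat{\gamma} = A\hat{\beta}$ holds the coefficients of the same polynomial rewritten in powers of $(x - a)$, so $\hat{\gamma}_1$ is the derivative of the index with respect to $x$ at $x = a$.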
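$$P(y = 1 \mid x_1, \ldots, x_K) = F\Big(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 \{ x_1 \times x_2 \} + \sum_{j=4}^{K} \beta_j x_j\Big) \tag{23}$$
$$A = \begin{pmatrix} 1 & a_1 & a_2 & a_1 \times a_2 & a_4 & \cdots & a_K \\ 0 & 1 & 0 & a_2 & 0 & \cdots & 0 \\ 0 & 0 & 1 & a_1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & 0 & \cdots & 1 \end{pmatrix} \tag{24}$$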
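The A matrix in (24) can be built as in the sketch below, which uses hypothetical function names and assumes the coefficient ordering $(\beta_0, \beta_1, \beta_2, \beta_3, \beta_4, \ldots, \beta_K)$ with $\beta_3$ attached to the interaction term. The second and third rows give $\hat{\gamma}_1 = \hat{\beta}_1 + a_2 \hat{\beta}_3$ and $\hat{\gamma}_2 = \hat{\beta}_2 + a_1 \hat{\beta}_3$.

```python
def interaction_A(a1, a2, a_rest):
    """Matrix A of (24); a_rest holds (a_4, ..., a_K)."""
    n = 4 + len(a_rest)
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    A[0][1] = a1
    A[0][2] = a2
    A[0][3] = a1 * a2
    for k, ak in enumerate(a_rest):
        A[0][4 + k] = ak
    A[1][3] = a2  # interaction adds a2 * beta_3 to the x1 coefficient
    A[2][3] = a1  # interaction adds a1 * beta_3 to the x2 coefficient
    return A


def apply_A(A, beta):
    """gamma = A beta, as in (3)."""
    return [sum(row[k] * beta[k] for k in range(len(beta))) for row in A]
```

If $x_1$ is a dummy variable, setting `a1 = 0` and evaluating $F(\hat{\gamma}_0 + \hat{\gamma}_1) - F(\hat{\gamma}_0)$ over a range of values of `a2` traces out how the partial effect of $x_1$ changes with $x_2$.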
If $x_1$ and $x_2$ are both continuous variables, then to find the marginal effect for $x_1$ for
variable, then to find the partial effect of a change in $x_1$ for taking the value 0 to 1 set
5. Empirical Example
We use data from Fair (1978), who extracted a sample of 601 observations on
men and women from a 1969 survey conducted by the magazine Psychology Today. The sample
consisted of men and women currently married for the first time, and Fair analysed their
responses to a question about extramarital affairs. In the example we use data on the
variables: affair = 1 if the respondent had at least one affair; yrsmarr = years married (sample mean
8.18); kids = 1 if the respondent has kids; an interaction between yrsmarr and kids; and age in years
(sample mean 32.5)†. Estimation results are presented in Table 1, which in
particular indicates that the interaction term between yrsmarr and kids is highly
significant. In Figure 1 the partial effect of a change in kids from no kids to kids on
yrsmarr and the sample mean value for age. From this figure we see that it is only
significantly different from 0 for those who have not been married very long (less
than 5 years) or those married for many years (over 18). In the sample 41% of
observations correspond to individuals who have been married less than 5 years.
†
A dummy variable for gender was not included as it was not significant.
However, there are no observations in the sample corresponding to individuals who
have been married more than 18 years. Similar results were obtained when different
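[Figure 1: partial effect of kids (me) with lower (low) and upper (up) confidence bounds; horizontal axis runs from 0 to 25 (years married), vertical axis from -0.7 to 0.2.]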
References