
Marginal Effects for Binary Response Models with Nonlinear Regressors

Joseph G. Hirschberg$^{a}$ and Jenny N. Lye$^{a,*}$


$^{a}$ Department of Economics, University of Melbourne, Victoria 3010, Australia.

Abstract

Marginal effects for binary response models are nonlinear functions of the parameter estimates. Many of the applications of these models also contain nonlinear regressors. We adopt a simple linear transformation of the parameter estimates that dramatically simplifies the calculation of marginal effects and corresponding standard errors for these models. We also use this approach to present an empirical example in which marginal effects and the corresponding confidence intervals are graphically illustrated and interpreted.

Keywords: Binary response models; nonlinear regressors; marginal effects; confidence intervals.

$^{*}$ Corresponding author, [email protected]
1. Introduction

Parameter estimates in binary response models must be transformed to yield

estimates of marginal effects. For such models, Anderson and Newell (2003) show

that the use of a simple normalisation of explanatory variables dramatically simplifies

the calculation of marginal effects and corresponding standard errors.

However, many of the applications of these models contain nonlinear

variables, such as quadratic, higher order polynomial and interaction terms (see for

example, Li and Zahniser 2002, Auld and Sidhu 2005, Azfar and Danninger 2001, Powers 2005). We show that, by adopting a simple linear transformation of the parameters, the same results of Anderson and Newell (2003) can be applied to these models as well.

In addition, Brambor et al. (2006) suggest that for such models the marginal effects and corresponding confidence intervals be graphically illustrated across a substantively meaningful range of the variables. We illustrate this approach in

an empirical example, using the methods in this paper to calculate the marginal effects

and corresponding standard errors.

2. A Linear Transformation of the Parameters

Consider a class of binary response models of the form

$$P(y=1 \mid x_1, \ldots, x_K) = F\!\left(\beta_0 + \sum_{j=1}^{K} \beta_j x_j\right) = F(z) \qquad (1)$$

where $F$ is a function such that $0 < F(z) < 1$ and $x_1, \ldots, x_K$ are explanatory variables.

Following Anderson and Newell (2003) the explanatory variables are then normalized

so that they equal 0 at a desired reference point. In this paper we normalize them at

the value at which the marginal effect is to be calculated so that (1) is rewritten as:

$$P(y=1 \mid x_1, \ldots, x_K) = F\!\left(\gamma_0 + \sum_{j=1}^{K} \gamma_j (x_j - a_j)\right) \qquad (2)$$

where $a_1, \ldots, a_K$ are the values at which the explanatory variables are normalized.

Usually, this will imply that $a_j$ is the mean value of $x_j$ or a particular observation of $x_j$. If $x_j$ is a dummy variable then the corresponding value of $a_j$ is set equal to the value of the reference group for which the marginal effect is to be calculated.

Let $\gamma$ be the $(K+1) \times 1$ vector $(\gamma_0, \gamma_1, \ldots, \gamma_K)'$ and $\beta$ be the $(K+1) \times 1$ vector $(\beta_0, \beta_1, \ldots, \beta_K)'$. The relationship between $\hat{\gamma}$, the estimate of $\gamma$, and $\hat{\beta}$, the estimate of $\beta$, is defined by

$$\hat{\gamma} = A\hat{\beta} \qquad (3)$$

where

$$A = \begin{pmatrix}
1 & a_1 & \cdots & a_K \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} \qquad (4)$$

The variance-covariance matrix of $\hat{\gamma}$ is defined by

$$\operatorname{var}(\hat{\gamma}) = A\, \operatorname{var}(\hat{\beta})\, A' \qquad (5)$$
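A minimal Python sketch of the transformation in (3)-(5) may help fix ideas; the function and variable names below are our own illustration, not part of the paper.

```python
import numpy as np

def transform_to_reference(beta_hat, cov_beta, a):
    """Build the A matrix of (4) for a reference point a = (a_1, ..., a_K)
    and apply (3) and (5): gamma_hat = A beta_hat, var(gamma_hat) = A V A'."""
    K = len(a)
    A = np.eye(K + 1)
    A[0, 1:] = a                       # first row of (4): (1, a_1, ..., a_K)
    gamma_hat = A @ beta_hat           # (3)
    cov_gamma = A @ cov_beta @ A.T     # (5)
    return gamma_hat, cov_gamma
```

Here beta_hat is the $(K+1)$-vector of estimated coefficients (intercept first) and cov_beta its estimated covariance matrix, both taken from the originally fitted model.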

3. Calculation of Marginal and Partial Effects

When $x_i$ is a continuous variable, the marginal effect of $x_i$ is

$$\widehat{ME}_i = \frac{\partial P(y=1 \mid x_1, \ldots, x_K)}{\partial x_i} = \hat{\gamma}_i\, f\!\left(\hat{\gamma}_0 + \sum_{j=1}^{K} \hat{\gamma}_j (x_j - a_j)\right) \qquad (6)$$

where

$$f(z) = \frac{\partial F(z)}{\partial z} \qquad (7)$$

and if (6) is calculated at $x_j = a_j$ for all $j = 1, \ldots, K$ then

$$\widehat{ME}_i = \frac{\partial P(y=1 \mid x_1, \ldots, x_K)}{\partial x_i} = \hat{\gamma}_i\, f(\hat{\gamma}_0) \qquad (8)$$

where $\hat{\gamma}_0$ and $\hat{\gamma}_i$ are calculated from the original parameter estimates using the linear transformation in (3). In this example we have

$$\hat{\gamma}_0 = \hat{\beta}_0 + \sum_{j=1}^{K} a_j \hat{\beta}_j \quad \text{and} \quad \hat{\gamma}_i = \hat{\beta}_i$$

where the $a_j$'s are set to the values at which the marginal effect is to be calculated.

The asymptotic variance of $\widehat{ME}_i$ is calculated using the delta method as follows:

$$\operatorname{var}\!\left(\widehat{ME}_i\right) = \left(\frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_0}\right)^{2} \hat{\sigma}_{00} + 2\left(\frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_0}\right)\left(\frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_i}\right)\hat{\sigma}_{0i} + \left(\frac{\partial \widehat{ME}_i}{\partial \hat{\gamma}_i}\right)^{2} \hat{\sigma}_{ii} \qquad (9)$$

where $\hat{\sigma}_{jj}$ is the estimated variance of $\hat{\gamma}_j$ and $\hat{\sigma}_{0j}$ is the estimated covariance between $\hat{\gamma}_0$ and $\hat{\gamma}_j$. These are obtained from the original estimated covariance matrix using (5).
For the Logit model,

$$F(z) = \frac{\exp(z)}{1 + \exp(z)} = \Lambda(z) \qquad (10)$$

and the marginal effect for a continuous variable is equal to:

$$\widehat{ME}_i = \hat{\gamma}_i\, f(\hat{\gamma}_0) = \hat{\gamma}_i \left[\Lambda(\hat{\gamma}_0)\left(1 - \Lambda(\hat{\gamma}_0)\right)\right] \qquad (11)$$

with corresponding estimated variance

$$\operatorname{var}\!\left(\widehat{ME}_i\right) = \widehat{ME}_i^{\,2}\left[\left(1 - 2\Lambda(\hat{\gamma}_0)\right)\left(\left(1 - 2\Lambda(\hat{\gamma}_0)\right)\hat{\sigma}_{00} + \frac{2}{\hat{\gamma}_i}\hat{\sigma}_{0i}\right) + \frac{1}{\hat{\gamma}_i^{2}}\hat{\sigma}_{ii}\right] \qquad (12)$$
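As an illustration (not the authors' code), the Logit expressions (11) and (12) can be evaluated directly; the sketch below assumes $\hat{\gamma}_0$, $\hat{\gamma}_i$ and the relevant elements of $\operatorname{var}(\hat{\gamma})$ have already been obtained from (3) and (5).

```python
import numpy as np

def logit_marginal_effect(gamma0, gamma_i, s00, s0i, sii):
    """Marginal effect (11) and delta-method variance for the Logit model,
    evaluated at the reference point x_j = a_j for all j."""
    Lam = 1.0 / (1.0 + np.exp(-gamma0))        # Lambda(gamma_0)
    me = gamma_i * Lam * (1.0 - Lam)           # (11)
    d0 = me * (1.0 - 2.0 * Lam)                # dME/dgamma_0
    di = Lam * (1.0 - Lam)                     # dME/dgamma_i
    var_me = d0**2 * s00 + 2.0 * d0 * di * s0i + di**2 * sii   # (9), which reduces to (12)
    return me, var_me
```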

For the Probit model

$$F(z) = \Phi(z) \qquad (13)$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function. The marginal effect for a continuous variable is equal to:

$$\widehat{ME}_i = \hat{\gamma}_i\, f(\hat{\gamma}_0) = \hat{\gamma}_i\, \phi(\hat{\gamma}_0) \qquad (14)$$

where $\phi(\cdot)$ is the standard normal density function and the corresponding estimated variance is:

$$\operatorname{var}\!\left(\widehat{ME}_i\right) = \widehat{ME}_i^{\,2}\left[\hat{\gamma}_0^{2}\,\hat{\sigma}_{00} - \frac{2\hat{\gamma}_0}{\hat{\gamma}_i}\hat{\sigma}_{0i} + \frac{1}{\hat{\gamma}_i^{2}}\hat{\sigma}_{ii}\right] \qquad (15)$$
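The analogous Probit calculation, again as an illustrative sketch rather than the authors' code, uses the standard normal density in (14) and the delta-method variance (15).

```python
from scipy.stats import norm

def probit_marginal_effect(gamma0, gamma_i, s00, s0i, sii):
    """Marginal effect (14) and delta-method variance (15) for the Probit model."""
    phi0 = norm.pdf(gamma0)
    me = gamma_i * phi0                        # (14)
    d0 = -gamma0 * me                          # dME/dgamma_0 = -gamma_0 * gamma_i * phi(gamma_0)
    di = phi0                                  # dME/dgamma_i
    var_me = d0**2 * s00 + 2.0 * d0 * di * s0i + di**2 * sii   # (9), which reduces to (15)
    return me, var_me
```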

Table 1 in Anderson and Newell (2003, p. 323) can be used to calculate these expressions for the Logit and Probit models by substituting $\hat{\gamma}_0 = c$. Note that as $\hat{\gamma}_0$ becomes large the marginal effects for the Logit and Probit models, (11) and (14), tend to 0. As $\hat{\gamma}_0$ approaches 0 the marginal effect for the Logit model (11) tends to $0.250 \times \hat{\gamma}_i$ and the marginal effect for the Probit model (14) tends to $0.399 \times \hat{\gamma}_i$.
If $x_i$ is a dummy variable then set $a_i = 0$ and define the partial effect from changing $x_i$ from 0 to 1, holding all other variables fixed, as

$$\widehat{PE}_i = \frac{\Delta P(y=1 \mid x_1, \ldots, x_K)}{\Delta x_i} = F\!\left(\gamma_0 + \gamma_i + \sum_{j=1}^{i-1} \gamma_j (x_j - a_j) + \sum_{j=i+1}^{K} \gamma_j (x_j - a_j)\right) - F\!\left(\gamma_0 + \sum_{j=1}^{i-1} \gamma_j (x_j - a_j) + \sum_{j=i+1}^{K} \gamma_j (x_j - a_j)\right) \qquad (16)$$

If (16) is calculated at $x_j = a_j$ for all $j = 1, \ldots, K$ then

$$\widehat{PE}_i = \frac{\Delta P(y=1 \mid x_1, \ldots, x_K)}{\Delta x_i} = F(\hat{\gamma}_0 + \hat{\gamma}_i) - F(\hat{\gamma}_0) \qquad (17)$$

and the asymptotic variance is calculated using the delta method as follows:

$$\operatorname{var}\!\left(\widehat{PE}_i\right) = \left\{ f(\hat{\gamma}_0 + \hat{\gamma}_i) \right\}^{2}\left(\hat{\sigma}_{00} + 2\hat{\sigma}_{0i} + \hat{\sigma}_{ii}\right) - 2 f(\hat{\gamma}_0)\, f(\hat{\gamma}_0 + \hat{\gamma}_i)\left(\hat{\sigma}_{00} + \hat{\sigma}_{0i}\right) + f(\hat{\gamma}_0)^{2}\,\hat{\sigma}_{00} \qquad (18)$$
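A hedged sketch of (17) and (18) for either link is given below; the function F and density f are supplied by the user, and the names are our own rather than the paper's.

```python
def partial_effect(gamma0, gamma_i, s00, s0i, sii, F, f):
    """Partial effect (17) of a 0/1 dummy and its delta-method variance (18)."""
    pe = F(gamma0 + gamma_i) - F(gamma0)                       # (17)
    f1, f0 = f(gamma0 + gamma_i), f(gamma0)
    var_pe = (f1**2 * (s00 + 2.0 * s0i + sii)
              - 2.0 * f0 * f1 * (s00 + s0i)
              + f0**2 * s00)                                   # (18)
    return pe, var_pe
```

For the Probit model, for example, one would pass F=scipy.stats.norm.cdf and f=scipy.stats.norm.pdf; for the Logit model, the logistic CDF and its density.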

For both the Logit and Probit models, (17) and (18) are easily calculated using the appropriate expressions for $F$ and $f$. As above, Table 1 in Anderson and Newell (2003, p. 323) can be used to calculate these expressions for the Logit and Probit models by substituting $\hat{\gamma}_0 + \hat{\gamma}_i = c$ to find the value of $F(\hat{\gamma}_0 + \hat{\gamma}_i)$ and $\hat{\gamma}_0 = c$ to find the value of $F(\hat{\gamma}_0)$. Note that as $\hat{\gamma}_0$ becomes large the partial effect (17) for the Logit and Probit models tends to 0. As $\hat{\gamma}_0$ approaches 0 the partial effect (17) for the Logit and Probit models tends to $F(\hat{\gamma}_i) - \tfrac{1}{2}$.

The expressions obtained in this section have wide applicability, as marginal and partial effects can be generated at any point simply by specifying the values of $a_j$ in the $A$ matrix (4), since $\hat{\gamma}_0$, $\hat{\gamma}_i$ and their corresponding standard errors can be calculated from the relationships in (3) and (5) using the original parameter estimates and standard errors. In the next section we show how they can also be applied to models that contain nonlinear regressors.

4. Extensions to Nonlinear Explanatory Variables

Many of the applications of discrete choice models contain nonlinear

variables, such as quadratic, higher order polynomial and interaction terms (see for

example, Li and Zahniser 2002, Auld and Sidhu 2005, Azfar and Danninger 2001,

Powers 2005). In this section, we show how the results of Section 3 can be applied to

models containing these variables by using a redefinition of the A matrix in the linear

transformation of the parameters.

4.1 Models with Log Terms

If any of the explanatory variables are expressed in log form, say $x_i$ for example, then define the corresponding $a_i$ in (5) to also be in log form and all the results of Section 3 apply.

4.2 Models with Quadratics

Suppose the model contains both $x_1$ and $x_1^2$ as explanatory variables, so that

$$P(y=1 \mid x_1, \ldots, x_K) = F\!\left(\beta_0 + \beta_1 x_1 + \beta_2 x_1^{2} + \sum_{j=3}^{K} \beta_j x_j\right) \qquad (19)$$

Now define the $A$ matrix in (3) and (5) as:

$$A = \begin{pmatrix}
1 & a_1 & a_1^{2} & a_3 & \cdots & a_K \\
0 & 1 & 2a_1 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1
\end{pmatrix} \qquad (20)$$

and to apply the results of Section 3 set $i = 1$ in (8) and (9).
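For concreteness, one possible construction of the $A$ matrix (20) in code is sketched below (our own illustration, not the authors' implementation); a1 is the reference value of $x_1$ and a_rest holds $a_3, \ldots, a_K$.

```python
import numpy as np

def A_quadratic(a1, a_rest):
    """A matrix (20) for a model with regressors x_1, x_1^2, x_3, ..., x_K."""
    K = 2 + len(a_rest)                                  # number of slope coefficients
    A = np.eye(K + 1)
    A[0, 1:] = np.concatenate(([a1, a1**2], a_rest))     # first row of (20)
    A[1, 2] = 2.0 * a1                                   # gamma_1 = beta_1 + 2 a_1 beta_2
    return A
```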

4.3 Higher Order Polynomials

An extension of the above model, which contains a quadratic term, is to assume the explanatory variable $x$ appears with many powers. That is, assume

$$P(y=1 \mid x) = F\!\left(\beta_0 + \sum_{j=1}^{K} \beta_j x^{j}\right) \qquad (21)$$

In this case define the matrix $A$ in (3) and (5) as:

$$A = \begin{pmatrix}
1 & a & a^{2} & a^{3} & a^{4} & \cdots & a^{K} \\
0 & 1 & 2a & 3a^{2} & 4a^{3} & \cdots & K a^{K-1} \\
0 & 0 & 1 & 3a & 6a^{2} & \cdots & \binom{K}{2} a^{K-2} \\
0 & 0 & 0 & 1 & 4a & \cdots & \binom{K}{3} a^{K-3} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & K a \\
0 & 0 & 0 & 0 & 0 & \cdots & 1
\end{pmatrix} \qquad (22)$$

Then, as above, to apply the results of Section 3 set $i = 1$ in (8) and (9).
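The binomial structure of (22) makes the matrix straightforward to generate for any order; the sketch below is our own illustration of how it might be coded, filling entry $(m, j)$ with $\binom{j}{m} a^{j-m}$.

```python
import numpy as np
from math import comb

def A_polynomial(a, K):
    """A matrix (22) for a model whose regressors are x, x^2, ..., x^K."""
    A = np.zeros((K + 1, K + 1))
    for m in range(K + 1):
        for j in range(m, K + 1):
            A[m, j] = comb(j, m) * a ** (j - m)   # coefficient of (x - a)^m in x^j
    return A
```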

4.4 Models with Interaction Terms

Regression specifications frequently employ explanatory variables that are

defined as the product of two other explanatory variables to form an interaction.

Suppose the explanatory variables include $x_1$, $x_2$, and $\{x_1 \times x_2\}$; then

$$P(y=1 \mid x_1, \ldots, x_K) = F\!\left(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 \{x_1 \times x_2\} + \sum_{j=4}^{K} \beta_j x_j\right) \qquad (23)$$

In this case, define the $A$ matrix in (3) and (5) as:

$$A = \begin{pmatrix}
1 & a_1 & a_2 & a_1 \times a_2 & a_4 & \cdots & a_K \\
0 & 1 & 0 & a_2 & 0 & \cdots & 0 \\
0 & 0 & 1 & a_1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & 0 & \cdots & 1
\end{pmatrix} \qquad (24)$$

If $x_1$ and $x_2$ are both continuous variables, then to find the marginal effect for $x_1$, for example, set $i = 1$ in (8) and (9) in Section 3 above. If, however, $x_1$ is a dummy variable, then to find the partial effect of a change in $x_1$ from 0 to 1 set $i = 1$ in (17) and (18).
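The interaction matrix (24) can be constructed in the same spirit; the sketch below is again our own illustration, with a1 and a2 the reference values of $x_1$ and $x_2$ and a_rest holding $a_4, \ldots, a_K$.

```python
import numpy as np

def A_interaction(a1, a2, a_rest):
    """A matrix (24) for a model with regressors x_1, x_2, x_1*x_2, x_4, ..., x_K."""
    K = 3 + len(a_rest)
    A = np.eye(K + 1)
    A[0, 1:] = np.concatenate(([a1, a2, a1 * a2], a_rest))   # first row of (24)
    A[1, 3] = a2     # gamma_1 = beta_1 + a_2 * beta_3
    A[2, 3] = a1     # gamma_2 = beta_2 + a_1 * beta_3
    return A
```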

5. Empirical Example

We use data from Fair (1978), who extracted a sample of 601 observations on men and women from a 1969 survey conducted by the magazine Psychology Today. The sample consisted of men and women currently married for the first time, and their responses to a question about extramarital affairs were analysed. In the example we use data on the variables: affair = 1 if the respondent had at least one affair; yrsmarr = years married (sample mean 8.18); kids = 1 if the respondent has kids; an interaction between yrsmarr and kids; and age in years (sample mean 32.5)†. Estimation results are presented in Table 1, which indicates, in particular, that the interaction term between yrsmarr and kids is highly significant. Figure 1 illustrates the partial effect of a change in kids from no kids to kids on the probability of having an affair over a range of values for yrsmarr, with age fixed at its sample mean. From this figure we see that the effect is significantly different from 0 only for those who have not been married very long (less than 5 years) or those married for many years (over 18). In the sample, 41% of observations correspond to individuals who have been married less than 5 years. However, there are no observations in the sample corresponding to individuals who have been married more than 18 years. Similar results were obtained when different values of age were assumed.

† A dummy variable for gender was not included as it was not significant.

Table 1: Estimation Results for Fair (1978) Data

Dependent Variable: AFFAIR

Variable           Coefficient   Std. Error   z-Statistic   Prob.
C                     -1.558       0.470        -3.312      0.001
YRSMARR                0.197       0.055         3.571      0.000
KIDS                   1.141       0.398         2.870      0.004
YRSMARR*KIDS          -0.137       0.055        -2.505      0.012
AGE                   -0.032       0.017        -1.921      0.0545

Log likelihood       -325.234     Restr. log likelihood   -337.689
McFadden R-squared      0.037     LR statistic (4 df)       24.909
Obs with Dep=0            451     Obs with Dep=1               150

Figure 1: Partial Effect and corresponding Confidence Interval

[Figure: the estimated partial effect of kids ("me") with its lower ("low") and upper ("up") confidence bounds, plotted against years married from 0 to 25; the vertical axis runs from -0.7 to 0.3.]
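A brief sketch of how the point estimates underlying Figure 1 could be reproduced from Table 1 is given below. It assumes the reported coefficients come from a Logit model; the confidence bands in Figure 1 additionally require the full estimated covariance matrix of the coefficients (not reported in Table 1), transformed via (5) and used in (18).

```python
import numpy as np

# Table 1 coefficients: intercept, yrsmarr, kids, yrsmarr*kids, age
b0, b_yrs, b_kids, b_int, b_age = -1.558, 0.197, 1.141, -0.137, -0.032
age_mean = 32.5

def logit_cdf(z):
    return 1.0 / (1.0 + np.exp(-z))

for yrsmarr in range(0, 26, 5):
    gamma0 = b0 + b_yrs * yrsmarr + b_age * age_mean          # reference point: kids = 0
    gamma_kids = b_kids + b_int * yrsmarr                     # effect of kids, including the interaction
    pe = logit_cdf(gamma0 + gamma_kids) - logit_cdf(gamma0)   # partial effect, as in (17)
    print(f"yrsmarr={yrsmarr:2d}: partial effect of kids = {pe:+.3f}")
```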
References

Anderson, S. and R. Newell (2003), "Simplified Marginal Effects in Discrete Choice Models", Economics Letters, 81, 321-326.

Auld, M.C. and N. Sidhu (2005), "Schooling, Cognitive Ability and Health", Health Economics, 14, 1019-1034.

Azfar, O. and S. Danninger (2001), "Profit-Sharing, Employment Stability and Wage Growth", Industrial and Labor Relations Review, 54, 619-630.

Brambor, T., W. Clark and M. Golder (2006), "Understanding Interaction Models: Improving Empirical Analyses", Political Analysis, 14, 63-82.

Fair, R. (1978), "A Theory of Extramarital Affairs", Journal of Political Economy, 86, 45-61.

Li, H. and S. Zahniser (2002), "The Determinants of Temporary Rural-to-Urban Migration in China", Urban Studies, 39, 2219-2235.

Powers, E. (2005), "Interpreting Logit Regressions with Interaction Terms: An Application to the Management Turnover Literature", Journal of Corporate Finance, 11, 504-522.
