Limited Dependent Variables Models PDF

The document discusses limited dependent variable models and how to calculate marginal effects and predictions from these models. Specifically: 1) Marginal effects in limited dependent variable models like probit and logit models are nonlinear and vary depending on the values of the independent variables. 2) The margins command in Stata can be used to calculate marginal effects at particular points or on average after estimating a probit or logit model. 3) The predict command can also be used to generate predictions like predicted probabilities or the index function from an estimated limited dependent variable model.


Limited dependent variables Marginal effects and predictions

One of the major challenges in working with limited dependent variable
models is the complexity of explanatory factors' marginal effects on the
result of interest. That complexity arises from the nonlinearity of the
relationship. In Equation (4), the latent measure is translated by Ψ(yi∗)
to a probability that yi = 1. While Equation (2) is a linear relationship in
the β parameters, Equation (4) is not. Therefore, although Xj has a
linear effect on yi∗, it will not have a linear effect on the resulting
probability that y = 1:

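\[
\frac{\partial \Pr[y = 1 \mid X]}{\partial X_j}
  = \frac{\partial \Pr[y = 1 \mid X]}{\partial X\beta} \cdot \frac{\partial X\beta}{\partial X_j}
  = \Psi'(X\beta)\,\beta_j
  = \psi(X\beta)\,\beta_j .
\]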

The probability that yi = 1 is not constant over the data. Via the chain
rule, we see that the effect of an increase in Xj on the probability is the
product of two factors: the effect of Xj on the latent variable and the
derivative of the CDF evaluated at yi∗. The latter term, ψ(·), is the
probability density function (PDF) of the distribution.
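
The chain-rule result can be checked numerically. The Python sketch below assumes a probit model, so the CDF is the standard Normal CDF and ψ is its density. The index value and coefficient are made-up illustrative numbers, not estimates from these slides.

```python
import math

def normal_cdf(z):
    """Standard Normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard Normal density, psi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def marginal_effect(index, beta_j):
    """Analytic marginal effect: psi(X*beta) * beta_j."""
    return normal_pdf(index) * beta_j

def finite_difference(index, beta_j, h=1e-6):
    """Central difference of the CDF when X_j moves by h,
    which shifts the index by beta_j * h."""
    up = normal_cdf(index + beta_j * h)
    down = normal_cdf(index - beta_j * h)
    return (up - down) / (2.0 * h)

# Hypothetical values chosen for illustration only.
index, beta_j = 0.4, 0.25
analytic = marginal_effect(index, beta_j)
numeric = finite_difference(index, beta_j)
print(f"analytic = {analytic:.6f}, numeric = {numeric:.6f}")
```

The two numbers agree to many decimal places, which is what the derivation predicts.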
Christopher F Baum (BC / DIW) Limited Dependent Variables BBS 2013 12 / 47

In a binary choice model, an increase in factor Xj cannot have a
constant marginal effect on the conditional probability that
(y = 1|X), since the density ψ(·) varies through the range of X values.
In a linear regression model, the coefficient βj and its estimate bj
measure the marginal effect ∂y/∂Xj, and that effect is constant for all
values of X. In a binary choice model, where the probability that
yi = 1 is bounded by the [0, 1] interval, the marginal effect must vary.

For instance, the marginal effect of a one dollar increase in disposable
income on the conditional probability that (y = 1|X) must approach
zero as Xj increases. Therefore, the marginal effect in such a model
varies continuously throughout the range of Xj, and must approach
zero for both very low and very high levels of Xj.
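
A small Python sketch of this behaviour, assuming a probit model and a hypothetical coefficient of 0.5: the marginal effect ψ(Xβ)βj is largest where the index is zero and shrinks toward zero in both tails.

```python
import math

def normal_pdf(z):
    """Standard Normal density, psi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

beta_j = 0.5  # hypothetical coefficient, for illustration only

# Marginal effect psi(index) * beta_j at several values of the index X*beta.
effects = {z: normal_pdf(z) * beta_j for z in (-4, -2, 0, 2, 4)}

for z, effect in effects.items():
    print(f"index = {z:+d}  marginal effect = {effect:.5f}")
```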

When using Stata's probit (or logit) command, the reported
coefficients (computed via maximum likelihood) are b, corresponding
to β. You can use margins to compute the marginal effects. If a
probit estimation is followed by the command margins, dydx(_all),
the dF/dx values will be calculated.

The margins command's at() option can be used to compute the
effects at a particular point in the sample space. The margins
command may also be used to calculate elasticities and
semi-elasticities.

After estimating a probit model, the predict command may be used,
with a default option p, the predicted probability of a positive outcome.
The xb option may be used to calculate the index function for each
observation: that is, the predicted value of yi∗ from Equation (4), which
is in z-units (those of a standard Normal variable). For instance, an
index function value of 1.69 will be associated with a predicted
probability of about 0.95 in a large sample.
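
The mapping from index value to predicted probability is just the standard Normal CDF. A minimal Python sketch (an illustration outside Stata, not the predict command itself):

```python
import math

def normal_cdf(z):
    """Standard Normal CDF: converts a probit index (z-units) to a probability."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Index values to convert; 1.69 is the value quoted in the text.
for xb in (0.0, 1.0, 1.69):
    print(f"index = {xb:5.2f}  predicted probability = {normal_cdf(xb):.4f}")
```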

We use a modified version of the womenwk Reference Manual dataset,
which contains information on 2,000 women, 657 of whom are not
recorded as wage earners. The indicator variable work is set to zero
for the non-working and to one for those reporting positive wages.

. summarize work age married children education

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        work |      2000       .6715    .4697852          0          1
         age |      2000      36.208     8.28656         20         59
     married |      2000       .6705    .4701492          0          1
    children |      2000      1.6445    1.398963          0          5
   education |      2000      13.084    3.045912         10         20

We estimate a probit model of the decision to work depending on the
woman's age, marital status, number of children and level of education.

. probit work age married children education, nolog

Probit regression                                 Number of obs   =       2000
                                                  LR chi2(4)      =     478.32
                                                  Prob > chi2     =     0.0000
Log likelihood = -1027.0616                       Pseudo R2       =     0.1889

------------------------------------------------------------------------------
        work |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0347211   .0042293     8.21   0.000     .0264318    .0430105
     married |   .4308575    .074208     5.81   0.000     .2854125    .5763025
    children |   .4473249   .0287417    15.56   0.000     .3909922    .5036576
   education |   .0583645   .0109742     5.32   0.000     .0368555    .0798735
       _cons |  -2.467365   .1925635   -12.81   0.000    -2.844782   -2.089948
------------------------------------------------------------------------------

Surprisingly, additional children in the household increase the
likelihood that the woman will work.

Average marginal effects (AMEs) are computed via margins.

. margins, dydx(_all)

Average marginal effects                          Number of obs   =       2000
Model VCE    : OIM

Expression   : Pr(work), predict()
dy/dx w.r.t. : age married children education

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0100768   .0011647     8.65   0.000     .0077941    .0123595
     married |   .1250441   .0210541     5.94   0.000     .0837788    .1663094
    children |   .1298233   .0068418    18.98   0.000     .1164137    .1432329
   education |   .0169386   .0031183     5.43   0.000     .0108269    .0230504
------------------------------------------------------------------------------

The marginal effects imply that married women have a 12.5 percentage
point higher probability of labor force participation, while the addition
of a child is associated with a 13 percentage point increase in
participation.
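
To see how average marginal effects relate to effects at a single point, the Python sketch below re-uses the probit coefficients, the sample means from the summarize output, and the AMEs reported above. It evaluates ψ(Xb)·bj at the sample means and compares it with the AMEs. This is an illustration built from the printed numbers, not a reproduction of what margins does internally, and it treats married as a continuous variable, as the dydx() output above does.

```python
import math

def normal_pdf(z):
    """Standard Normal density, psi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Probit coefficients, sample means and AMEs copied from the output above.
coef = {"age": 0.0347211, "married": 0.4308575,
        "children": 0.4473249, "education": 0.0583645}
cons = -2.467365
means = {"age": 36.208, "married": 0.6705,
         "children": 1.6445, "education": 13.084}
ame = {"age": 0.0100768, "married": 0.1250441,
       "children": 0.1298233, "education": 0.0169386}

# Index function evaluated at the sample means, and the density there.
index_at_means = cons + sum(coef[k] * means[k] for k in coef)
density_at_means = normal_pdf(index_at_means)

# Marginal effect at the means: psi(index) * b_j.
mem = {k: density_at_means * coef[k] for k in coef}

# Each AME divided by its coefficient recovers the sample-average density.
avg_density = {k: ame[k] / coef[k] for k in coef}

for k in coef:
    print(f"{k:10s} at means = {mem[k]:.4f}  AME = {ame[k]:.4f}  "
          f"AME/coef = {avg_density[k]:.4f}")
```

Every AME is the same multiple (roughly 0.29) of its coefficient, because each is the coefficient times the sample average of ψ(Xb). The density at the sample means is somewhat higher, so the effects evaluated at the means come out larger than the AMEs.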

Limited dependent variables Estimation with proportions data

When the Logistic CDF is employed, the probability (πi) of y = 1,
conditioned on X, is exp(Xβ)/(1 + exp(Xβ)). Unlike the CDF of the
Normal distribution, which lacks an inverse in closed form, this function
may be inverted to yield

\[
\log\left(\frac{\pi_i}{1 - \pi_i}\right) = X_i \beta. \tag{7}
\]

This expression is termed the logit of πi, with that term being a
contraction of the log of the odds ratio. The odds ratio reexpresses the
probability in terms of the odds of y = 1.
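
The inversion in Equation (7) can be checked with a short Python sketch. The index values are arbitrary illustrative numbers.

```python
import math

def logistic_cdf(index):
    """Pr(y = 1 | X) under the Logistic CDF: exp(Xb) / (1 + exp(Xb))."""
    return math.exp(index) / (1.0 + math.exp(index))

def logit(p):
    """Log of the odds, log(p / (1 - p)): the inverse of logistic_cdf."""
    return math.log(p / (1.0 - p))

# Applying logit to the probability recovers the original index value.
for index in (-2.0, 0.0, 0.5, 3.0):
    p = logistic_cdf(index)
    print(f"index = {index:5.2f}  probability = {p:.4f}  logit = {logit(p):.4f}")
```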

As the logit of πi = Xi β, it follows that the odds ratio for a one-unit
change in the j th X, holding other X constant, is merely exp(βj). When
we estimate a logit model, the or option specifies that odds ratios
are to be displayed rather than coefficients.

If the odds ratio exceeds unity, an increase in that X increases the
likelihood that y = 1, and vice versa. Estimated standard errors for the
odds ratios are calculated via the delta method.
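
The following Python sketch illustrates why the odds ratio is exp(βj) wherever it is evaluated. The coefficient and the baseline index values are hypothetical.

```python
import math

def logistic_cdf(index):
    """Pr(y = 1 | X) under the Logistic CDF."""
    return math.exp(index) / (1.0 + math.exp(index))

def odds(p):
    """Odds of y = 1 implied by probability p."""
    return p / (1.0 - p)

beta_j = 0.7  # hypothetical logit coefficient on X_j

# Ratio of odds after and before a one-unit increase in X_j,
# computed at several hypothetical starting values of the index.
ratios = []
for base_index in (-1.5, 0.0, 2.0):
    before = odds(logistic_cdf(base_index))
    after = odds(logistic_cdf(base_index + beta_j))
    ratios.append(after / before)
    print(f"base index = {base_index:5.2f}  odds ratio = {after / before:.6f}")

print(f"exp(beta_j) = {math.exp(beta_j):.6f}")
```

Unlike the marginal effect on the probability, the odds ratio does not depend on the starting point.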

We can define the logit, or log of the odds ratio, in terms of grouped
data (averages of microdata). For instance, in the 2004 U.S.
presidential election, the ex post probability of a Massachusetts
resident voting for John Kerry was 0.62, with a logit of
log(0.62/(1 − 0.62)) = 0.4895. The probability of that person voting
for George Bush was 0.37, with a logit of −0.5322. Say that we had
such data for all 50 states. It would be inappropriate to use linear
regression on the probabilities voteKerry and voteBush, just as it would
be inappropriate to run a regression on individual voters' voteKerry and
voteBush indicator variables.
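
The two logits quoted above can be reproduced directly. A minimal Python check, using only the vote shares given in the text:

```python
import math

def logit(p):
    """Log of the odds, log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Massachusetts 2004 vote shares as quoted in the text.
logit_kerry = logit(0.62)
logit_bush = logit(0.37)

print(f"logit(0.62) = {logit_kerry:.4f}")
print(f"logit(0.37) = {logit_bush:.4f}")
```

A share above one half gives a positive logit and a share below one half gives a negative one.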

In this case, Stata's glogit (grouped logit) command may be used to
produce weighted least squares estimates for the model on state-level
data. Alternatively, the blogit command may be used to produce
maximum-likelihood estimates of that model on grouped (or "blocked")
data.

The equivalent commands gprobit and bprobit may be used to fit
a probit model to grouped data.

Limited dependent variables Truncated regression

Truncation

We turn now to a context where the response variable is neither binary
nor necessarily integer, but subject to truncation. This is a bit trickier,
since a truncated or censored response variable may not be obviously so.
We must fully understand the context in which the data were
generated. Nevertheless, it is quite important that we identify situations
of truncated or censored response variables. Using these variables
as the dependent variable in a regression equation without
consideration of these qualities will be misleading.

Limited dependent variables Censoring

An increase in an explanatory variable with a positive coefficient will
imply that a left-censored individual is less likely to be censored. Their
predicted probability of a nonzero value will increase. For a
non-censored individual, an increase in Xj will imply that E[y | y > 0]
will increase. So, for instance, a decrease in the mortgage interest rate
will allow more people to be homebuyers (since many borrowers'
income will qualify them for a mortgage at lower interest rates), and
allow prequalified homebuyers to purchase a more expensive home.
The marginal effect captures the combination of those effects. Since
the newly-qualified homebuyers will be purchasing the cheapest
homes, the effect of the lower interest rate on the average price at
which homes are sold will incorporate both effects. We expect that it
will increase the average transactions price, but due to attenuation, by
a smaller amount than the regression function component of the model
would indicate.

We return to the womenwk data set used to illustrate binomial probit.
We generate the log of the wage (lw) for working women and set lwf
equal to lw for working women and zero for non-working women. This
could be problematic if recorded wages below $1.00 were present in
the data, but in these data the minimum wage recorded is $5.88. We
first estimate the model with OLS ignoring the censored nature of the
response variable.
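
A Python sketch of how such a variable is built, and of why wages below $1.00 would be a problem. The wage list is hypothetical apart from the $5.88 minimum mentioned in the text, None stands for a woman with no recorded wage, and the helper name censored_log_wage is invented for this illustration.

```python
import math

def censored_log_wage(wage):
    """Return log(wage) when a wage is recorded and 0.0 otherwise."""
    if wage is None:
        return 0.0
    return math.log(wage)

# Hypothetical wages; None marks a non-working woman.
wages = [5.88, 12.50, None, 23.10, None]
lwf = [censored_log_wage(w) for w in wages]
print([round(v, 4) for v in lwf])

# A wage below $1.00 has a negative log, which would fall below
# the censoring point of zero used for non-working women.
print(round(math.log(0.90), 4))
```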

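. use womenwk, clear
. regress lwf age married children education

      Source |       SS       df       MS              Number of obs =    2000
-------------+------------------------------           F(  4,  1995) =  134.21
       Model |  937.873188     4  234.468297           Prob > F      =  0.0000
    Residual |  3485.34135  1995  1.74703827           R-squared     =  0.2120
-------------+------------------------------           Adj R-squared =  0.2105
       Total |  4423.21454  1999  2.21271363           Root MSE      =  1.3218

------------------------------------------------------------------------------
         lwf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0363624    .003862     9.42   0.000     .0287885    .0439362
     married |   .3188214   .0690834     4.62   0.000     .1833381    .4543046
    children |   .3305009   .0213143    15.51   0.000     .2887004    .3723015
   education |   .0843345   .0102295     8.24   0.000     .0642729    .1043961
       _cons |  -1.077738   .1703218    -6.33   0.000    -1.411765   -.7437105
------------------------------------------------------------------------------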

Reestimating the model as a tobit and indicating that lwf is
left-censored at zero with the ll option yields:

. tobit lwf age married children education, ll(0)

Tobit regression                                  Number of obs   =       2000
                                                  LR chi2(4)      =     461.85
                                                  Prob > chi2     =     0.0000
Log likelihood = -3349.9685                       Pseudo R2       =     0.0645

------------------------------------------------------------------------------
         lwf |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |    .052157   .0057457     9.08   0.000     .0408888    .0634252
     married |   .4841801   .1035188     4.68   0.000     .2811639    .6871964
    children |   .4860021   .0317054    15.33   0.000     .4238229    .5481812
   education |   .1149492   .0150913     7.62   0.000     .0853529    .1445454
       _cons |  -2.807696   .2632565   -10.67   0.000    -3.323982   -2.291409
-------------+----------------------------------------------------------------
      /sigma |   1.872811    .040014                      1.794337    1.951285
------------------------------------------------------------------------------
  Obs. summary:        657  left-censored observations at lwf<=0
                      1343     uncensored observations
                         0 right-censored observations

The tobit estimates of lwf show positive, significant effects for age,
marital status, the number of children and the number of years of
education. Each of these factors is expected to increase both the
probability that a woman will work and her wage conditional on
employed status.

Following tobit estimation, we first generate the marginal effects of
each explanatory variable on the probability that an individual will have
a positive log(wage), using the pr(a,b) option of predict.

. margins, predict(pr(0,.)) dydx(_all)

Average marginal effects                          Number of obs   =       2000
Model VCE    : OIM

Expression   : Pr(lwf>0), predict(pr(0,.))
dy/dx w.r.t. : age married children education

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0071483   .0007873     9.08   0.000     .0056052    .0086914
     married |   .0663585   .0142009     4.67   0.000     .0385254    .0941917
    children |   .0666082   .0044677    14.91   0.000     .0578516    .0753649
   education |   .0157542   .0020695     7.61   0.000     .0116981    .0198103
------------------------------------------------------------------------------

We then calculate the marginal effect of each explanatory variable on
the expected log wage, given that the individual has not been censored
(i.e., was working). These effects, unlike the estimated coefficients
from regress, properly take into account the censored nature of the
response variable.

. margins, predict(e(0,.)) dydx(_all)

Average marginal effects                          Number of obs   =       2000
Model VCE    : OIM

Expression   : E(lwf|lwf>0), predict(e(0,.))
dy/dx w.r.t. : age married children education

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0315183     .00347     9.08   0.000     .0247172    .0383194
     married |   .2925884   .0625056     4.68   0.000     .1700797    .4150971
    children |   .2936894   .0189659    15.49   0.000     .2565169    .3308619
   education |   .0694634   .0091252     7.61   0.000     .0515784    .0873484
------------------------------------------------------------------------------

Note, for instance, the much smaller marginal effects associated with
number of children and level of education in tobit vs. regress.
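
Both sets of tobit marginal effects can be approximated by hand from the standard formulas for a tobit left-censored at zero: with z = Xβ/σ and λ(z) = ψ(z)/Ψ(z) for the standard Normal density and CDF, the effect of Xj on Pr(y > 0) is βj ψ(z)/σ, and the effect on E[y | y > 0] is βj [1 − λ(z)(z + λ(z))]. The Python sketch below evaluates these at the sample means from the earlier summarize output, using the tobit coefficients and /sigma reported above. It is an at-the-means illustration, so it will not match the average marginal effects from margins exactly.

```python
import math

def normal_pdf(z):
    """Standard Normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard Normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Tobit estimates and sample means copied from the output in these slides.
coef = {"age": 0.052157, "married": 0.4841801,
        "children": 0.4860021, "education": 0.1149492}
cons = -2.807696
sigma = 1.872811
means = {"age": 36.208, "married": 0.6705,
         "children": 1.6445, "education": 13.084}

# Standardized index at the sample means and the inverse Mills ratio there.
xb = cons + sum(coef[k] * means[k] for k in coef)
z = xb / sigma
mills = normal_pdf(z) / normal_cdf(z)

# Effect on Pr(lwf > 0) and on E[lwf | lwf > 0], both at the means.
prob_effect = {k: coef[k] * normal_pdf(z) / sigma for k in coef}
cond_effect = {k: coef[k] * (1.0 - mills * (z + mills)) for k in coef}

for k in coef:
    print(f"{k:10s} dPr/dx = {prob_effect[k]:.4f}  "
          f"dE[y|y>0]/dx = {cond_effect[k]:.4f}")
```

The values at the means are close to, though not identical to, the average marginal effects in the two margins tables, and each conditional effect is smaller than the corresponding tobit coefficient.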
