
The Tobit Model

Econ 674

Purdue University

Justin L. Tobias


Estimation

In this lecture, we address estimation and application of the tobit model.

The tobit model is a useful specification to account for mass points in a dependent variable that is otherwise continuous.

For example, our outcome may be characterized by lots of zeros, and we want our model to speak to this incidence of zeros.


The tobit

Like the probit and ordered probit, the tobit model can be given a latent
variable interpretation. We write this as follows:

z_i = x_i β + u_i,   u_i | x_i ~ iid N(0, σ²),
y_i = max{0, z_i}.

We observe data on (x_i, y_i) but not on z_i directly. Note that z_i is
partially observed: whenever y_i > 0, we know that z_i = y_i.

Note that, unlike the probit and ordered probit, the scale parameter σ is not
fixed at unity. (Why?)

In some cases, application of the tobit is, perhaps, not ideal, while in
others it can be applied more credibly. Two examples illustrate.



The tobit

Case #1:
Suppose we seek to model expenditures on automobiles during the
calendar year. We apply a tobit to model this data. How would you
interpret your model in terms of this specific application?

Many would give z_i an interpretation like desired expenditure. If this is
positive, then the person buys a car and spends the desired amount. If this is
negative or zero, then we simply see that the person did not buy a car.

Are there any problems here?




The tobit

Case #2:
Suppose that you seek to model expenditures on tobacco products during
the calendar year. The observed variable yi represents the fraction of
income spent on such products during the calendar year. The data is likely
characterized by lots of zeros.

In this case:

1. It is quite likely to see y_i values very close to zero, given its
construction.

2. Perhaps negative values of z_i make more sense in the context of this
application. Specifically, people may contribute to anti-smoking campaigns,
which we might interpret as a type of negative expenditure.



The tobit

An important and often overlooked point is that, although it might seem
natural to assert that the "censoring" point is at zero, it may, in fact, be
something different from zero [Zuehlke (2003)].

That is, there may be some minimum level of expenditure that is possible.

For this reason, we might consider a variant of the tobit in which the
censoring point is not zero but some constant c, where c is to be estimated
from the data.



Estimation in the (Standard) Tobit
z_i = x_i β + u_i,   u_i | x_i ~ iid N(0, σ²),
y_i = max{0, z_i}.

To derive the log likelihood in the tobit (though it is not necessary to do
so), we first consider the c.d.f.:

Pr(Y_i ≤ c | x_i).

It is convenient to express this probability in the following way:

Pr(Y_i ≤ c | x_i) = Pr(Y_i ≤ c | x_i, D_i = 1) Pr(D_i = 1 | x_i)
                  + Pr(Y_i ≤ c | x_i, D_i = 0) Pr(D_i = 0 | x_i),

where D_i can be any binary variable, yet it is convenient to define it here as

D_i = I(z_i > 0) = I(y_i > 0).


Estimation in the (Standard) Tobit

Pr(Y_i ≤ c | x_i) = Pr(Y_i ≤ c | x_i, D_i = 1) Pr(D_i = 1 | x_i)
                  + Pr(Y_i ≤ c | x_i, D_i = 0) Pr(D_i = 0 | x_i).

With respect to the components of the above, some of these are
straightforward:

Pr(D_i = 1 | x_i) = Pr(z_i > 0 | x_i) = Pr(u_i > −x_i β | x_i) = Φ(x_i β/σ),

and hence, Pr(D_i = 0 | x_i) = 1 − Φ(x_i β/σ). What about
Pr(Y_i ≤ c | x_i, D_i = 0)? Intuitively, since Y_i = 0 whenever D_i = 0,

Pr(Y_i ≤ c | x_i, D_i = 0) = 1 for any c ≥ 0.



Estimation in the (Standard) Tobit

As for the remaining conditional density, note for c > 0:

Pr(Y_i ≤ c | x_i, D_i = 1) = Pr(0 < z_i ≤ c | x_i) / Pr(z_i > 0 | x_i)
                           = [Φ([c − x_i β]/σ) − Φ(−x_i β/σ)] / Φ(x_i β/σ).

Combining the pieces above, Pr(Y_i ≤ c | x_i) = Φ([c − x_i β]/σ) for c > 0;
differentiating with respect to c gives the density of Y_i over its positive
support.



Estimation in the (Standard) Tobit

Thus, we obtain the following "density" function for Y_i:

f(y_i | x_i) = (1/σ) φ([y_i − x_i β]/σ) I(y_i > 0) + I(y_i = 0) [1 − Φ(x_i β/σ)].

From here, it is not hard to get to the log likelihood:

log L(β, σ; y) = −(n_1/2) log(2πσ²) − [1/(2σ²)] Σ_{i: y_i > 0} (y_i − x_i β)²
               + Σ_{i: y_i = 0} log[1 − Φ(x_i β/σ)].

In the above, n_1 = Σ_{i=1}^n D_i, or the number of uncensored observations.
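For concreteness, a minimal sketch of this log likelihood in code (Python/SciPy here; the function name and interface are illustrative, not from the slides):

```python
import numpy as np
from scipy.stats import norm

def tobit_loglik(beta, sigma, y, X):
    """Tobit log likelihood in the (beta, sigma) parameterization."""
    # y: (n,) outcomes censored below at zero; X: (n, k) covariates (with constant)
    xb = X @ beta
    pos = y > 0                               # uncensored observations (D_i = 1)
    ll_pos = norm.logpdf((y[pos] - xb[pos]) / sigma) - np.log(sigma)
    ll_zero = norm.logcdf(-xb[~pos] / sigma)  # log[1 - Phi(x_i beta / sigma)]
    return ll_pos.sum() + ll_zero.sum()
```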



Estimation in the (Standard) Tobit

From here, a standard tobit analysis can be carried out.

That is, the score vector can be obtained, as can the Hessian matrix.

However, these are quite messy, particularly the Hessian.

Moreover, it turns out that a reparameterization of the problem simplifies
these expressions considerably and, furthermore, that we can prove global
concavity for the reparameterized model.



Estimation in the (Standard) Tobit
We employ the reparameterization suggested by Olsen (1978). Specifically,
we let

δ = 1/σ,   θ = β/σ.

Then we obtain

L(δ, θ; y) = −(n_1/2) log(2π) + n_1 log δ − (1/2) Σ_{i: y_i > 0} (δ y_i − x_i θ)²
           + Σ_{i: y_i = 0} log[1 − Φ(x_i θ)].

From this, we obtain the score:

∂L/∂θ = Σ_{i: y_i > 0} (δ y_i − x_i θ) x_i′ − Σ_{i: y_i = 0} [φ(x_i θ)/(1 − Φ(x_i θ))] x_i′,

∂L/∂δ = n_1/δ − Σ_{i: y_i > 0} (δ y_i − x_i θ) y_i.



Estimation in the (Standard) Tobit

With a bit of work, the components of the Hessian matrix can also be
obtained:

L_θθ′ = Σ_{i: y_i = 0} [φ(x_i θ)/(1 − Φ(x_i θ))] [x_i θ − φ(x_i θ)/(1 − Φ(x_i θ))] x_i′ x_i
      − Σ_{i: y_i > 0} x_i′ x_i.
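The slide displays only the θθ′ block; for completeness, differentiating the score above once more gives the remaining blocks. This is a routine calculation written here as a supplement, not taken from the slides:

$$
L_{\theta\delta} = \sum_{i:\,y_i>0} y_i\, x_i', \qquad
L_{\delta\delta} = -\frac{n_1}{\delta^2} - \sum_{i:\,y_i>0} y_i^2 .
$$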



Estimation in the (Standard) Tobit

Let

γ = [θ′  δ]′,   X = [X₀′  X₁′]′,   y = [y₀′  y₁′]′,

where X₀ consists of the rows of X with y_i = 0 (and similarly for X₁, etc.).
That is, we first arrange the data with the y_i = 0 outcomes appearing first,
followed by those with y_i > 0.

With this notation in hand, one can show that the Hessian can be written as:

H(γ) = [ −X₀′ D X₀ − X₁′ X₁        X₁′ y₁
              y₁′ X₁           −n₁/δ² − y₁′ y₁ ]

     = −[X₁  −y₁]′ [X₁  −y₁] − [ X₀′ D X₀     0
                                     0       n₁/δ² ].



Estimation in the (Standard) Tobit

In the last slide, D is an n₀ × n₀ diagonal matrix with typical diagonal
element

− [φ(x_i θ)/(1 − Φ(x_i θ))] [x_i θ − φ(x_i θ)/(1 − Φ(x_i θ))].

Furthermore, one can show that the Hessian is always negative semidefinite
(and thus the log likelihood is globally concave) provided the elements of D
are positive. (Why?)

Given the form of these elements above, this is true iff

x_i θ − φ(x_i θ)/(1 − Φ(x_i θ)) < 0.

This is indeed true, but in order to prove it, we must digress a little bit.
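Before the formal argument, a quick numerical sanity check of this inequality is easy to run (illustrative only, not a proof; the grid endpoints are arbitrary):

```python
import numpy as np
from scipy.stats import norm

# Check that x - phi(x) / (1 - Phi(x)) < 0 over a wide grid of x values.
x = np.linspace(-10, 10, 2001)
vals = x - norm.pdf(x) / norm.sf(x)   # norm.sf(x) = 1 - Phi(x)
print(vals.max() < 0)                 # True on this grid
```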



Mean of Truncated Normal
To obtain the density function for any truncated random variable w, we apply
the formula:

f(w | w > c) = f(w) / Pr(w > c),   for w > c.

That is, we keep the shape of the marginal density, chop off the tail, and
scale it up to make sure it integrates to unity. Thus,

E(w | w > c) = ∫_c^∞ w f(w) dw / Pr(w > c).

For the case of a standard normal random variable w, with c = x_i θ, we get:

E(w | w > c) = ∫_c^∞ w φ(w) dw / [1 − Φ(c)] = φ(c)/[1 − Φ(c)] = φ(x_i θ)/[1 − Φ(x_i θ)].
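The integral in the last line has a closed form because w φ(w) is an exact derivative; a one-line verification:

$$
\int_c^{\infty} w\,\phi(w)\,dw
= \int_c^{\infty} \frac{w}{\sqrt{2\pi}}\, e^{-w^{2}/2}\,dw
= \left[-\frac{1}{\sqrt{2\pi}}\, e^{-w^{2}/2}\right]_{c}^{\infty}
= \phi(c).
$$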



Mean of Truncated Normal

Now, clearly, it must be the case that

E(w | w > c) > c.

In the case of a standard normal random variable w, then, we have

φ(c)/[1 − Φ(c)] > c,

or, setting c = x_i θ,

x_i θ − φ(x_i θ)/[1 − Φ(x_i θ)] < 0.

Note that this is exactly the term we needed to prove was negative in order to
verify that the Hessian is negative semidefinite.



Mean of Truncated Normal

This result motivates use of the reparameterization in practice.

An iterative maximization routine should converge quickly to the maximum
given the uniqueness of this maximum.

Invariance can be applied to estimate β and σ. Specifically,

σ̂ = δ̂⁻¹,   β̂ = θ̂/δ̂.

The Delta method can be used to obtain large sample standard errors.
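As a sketch of that delta-method step (the stacking order γ = [θ′ δ]′ follows the earlier slide; the use of the inverse of the negative Hessian as the large-sample variance of γ̂ is the usual maximum likelihood result):

$$
(\beta, \sigma) = g(\theta, \delta) = (\theta/\delta,\; 1/\delta), \qquad
J = \frac{\partial g}{\partial \gamma'} =
\begin{pmatrix}
\delta^{-1} I_k & -\theta/\delta^{2} \\
\mathbf{0}' & -\delta^{-2}
\end{pmatrix},
$$
$$
\widehat{\mathrm{Var}}(\hat\beta, \hat\sigma) \;\approx\; \hat{J}\,\bigl[-H(\hat\gamma)\bigr]^{-1}\hat{J}' .
$$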



A note on discarding the zeros

It is somewhat common, though unfortunate, practice in the applied literature
to simply discard the zero responses when estimating the tobit. Of course,
this is not a valid procedure, since

E(y_i | x_i, y_i > 0) = x_i β + E(u_i | u_i > −x_i β, x_i)
                      = x_i β + σ φ(x_i β/σ)/Φ(x_i β/σ).

Thus, the conditional mean function, given that positive values occur, is not
simply the population conditional mean x_i β. As such, OLS applied to the
positive observations will be biased and inconsistent.
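A small simulation sketch makes the point concrete (the design and parameter values below are arbitrary assumptions, not the course data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, sigma = 100_000, np.array([1.0, 2.0]), 3.0

X = np.column_stack([np.ones(n), rng.normal(size=n)])
z = X @ beta + sigma * rng.normal(size=n)   # latent z_i
y = np.maximum(0.0, z)                      # observed y_i = max{0, z_i}

keep = y > 0                                # "discard the zeros"
ols = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
print(ols)   # differs noticeably from (1.0, 2.0); the slope is attenuated
```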



Marginal Effects

We now describe a method for calculating marginal effects in the tobit.
Though several of these have been discussed, we focus our attention on effects
with respect to the mean of the observed y outcome. First, note (similar to
our previous discussion):

E(y | x) = E(y | x, z > 0) Pr(z > 0 | x) + E(y | x, z ≤ 0) Pr(z ≤ 0 | x)
         = E(y | x, z > 0) Pr(z > 0 | x).

(Why?) Hence, we have:

∂E(y | x)/∂x_j = [∂E(y | x, z > 0)/∂x_j] Pr(z > 0 | x)
               + E(y | x, z > 0) [∂Pr(z > 0 | x)/∂x_j].



Marginal Effects
To make things a bit simpler notationally, let φ ≡ φ(xi β/σ) and define Φ
analogously.
To put together all of the pieces of the marginal effect expression, we first
note:

E(y | x, z > 0) = E(z | x, z > 0)
                = xβ + E(u | x, z > 0)
                = xβ + E(u | u > −xβ, x).

The last term, again, is the mean of a truncated normal random variable,
though in this case the variance of u is σ² rather than unity. It follows by
similar reasoning that

E(y | x, z > 0) = xβ + σ φ/Φ.



Marginal Effects

In order to completely characterize the marginal effect, we must
differentiate the normal density function. That is, we seek:

∂φ/∂x_j = ∂[(2π)^{−1/2} exp(−[1/2](xβ/σ)²)]/∂x_j
        = (2π)^{−1/2} exp(−[1/2](xβ/σ)²) (−xβ/σ)(β_j/σ)
        = φ [−xβ/σ](β_j/σ).

Therefore, since ∂Φ/∂x_j = φ (β_j/σ),

∂E(y | x, z > 0)/∂x_j = β_j + σ [−φ Φ (xβ/σ)(β_j/σ) − φ² (β_j/σ)] / Φ².



Marginal Effects
Putting this together with the other pieces comprising our marginal effect,
we obtain:

∂E(y | x)/∂x_j = { β_j + σ [−φ Φ (xβ/σ)(β_j/σ) − φ² (β_j/σ)] / Φ² } Φ
               + [xβ + σ φ/Φ] φ (β_j/σ).

Rather conveniently, terms cancel to produce:

∂E(y | x)/∂x_j = Φ β_j = Φ(xβ/σ) β_j.

Any intuition here?

As Φ → 1, the probability associated with the mass point at zero approaches
zero. In this limiting case, we are essentially back in the linear regression
framework, whence the marginal effect reduces to β_j.
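A short code sketch of this final expression. The evaluation point is an assumption: the slides do not state where the marginal effects reported in the results table are evaluated, though the sample means of x are a common choice.

```python
import numpy as np
from scipy.stats import norm

def tobit_marginal_effects(beta, sigma, x):
    """Marginal effects dE(y|x)/dx_j = Phi(x'beta/sigma) * beta_j at a chosen x."""
    return norm.cdf(x @ beta / sigma) * beta
```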



Tobit: Application

Using the female labor supply data on the course website, we fit a
tobit model to account for the censoring at zero weeks of work.

We work in the (δ, θ) parameterization.

The fsolve command is used in MATLAB, so the score vector is programmed into
the maximization routine.

The following slide gives results from this exercise.
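A minimal sketch of the analogous approach in Python, with scipy.optimize.root playing the role of MATLAB's fsolve; the variable names and starting values are illustrative, and the course data set is not reproduced here:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import root

def tobit_score(params, y, X):
    """Score of Olsen's reparameterized log likelihood; params = [theta, delta]."""
    theta, delta = params[:-1], params[-1]
    pos = y > 0
    r = delta * y[pos] - X[pos] @ theta                   # delta*y_i - x_i*theta, uncensored obs
    mills = norm.pdf(X[~pos] @ theta) / norm.sf(X[~pos] @ theta)
    d_theta = X[pos].T @ r - X[~pos].T @ mills            # dL/dtheta
    d_delta = pos.sum() / delta - r @ y[pos]              # dL/ddelta
    return np.append(d_theta, d_delta)

# y, X = ...  # load the labor supply data (weeks worked; covariates with a constant)
# start = np.append(np.zeros(X.shape[1]), 1.0)
# sol = root(tobit_score, start, args=(y, X))
# theta_hat, delta_hat = sol.x[:-1], sol.x[-1]
# beta_hat, sigma_hat = theta_hat / delta_hat, 1.0 / delta_hat   # invariance
```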



Tobit: Application

                    ---------- MATLAB ----------       ----- STATA -----
Variable        Pt. Est   Std. Err   Marg. Eff       Pt. Est   Std. Err
Constant          31.93     3.83        --             31.93     3.66
Ability            .061     .0221       .056            .061     .022
Spouse Inc.       -.123     .0254      -.114           -.123     .025
Kids            -13.52      1.22     -12.55           -13.52     1.16
Education          .932     .292        .865            .932     .291
σ                 23.24     .383        --              23.24     .383



References

Tobin, J. (1958). "Estimation of Relationships for Limited Dependent
Variables." Econometrica.

Olsen, R. J. (1978). "A Note on the Uniqueness of the Maximum Likelihood
Estimator for the Tobit Model." Econometrica.

Zuehlke, T. (2003). "Estimation of a Tobit Model with Unknown Censoring
Threshold." Applied Economics.
