PS6 Sol

Econometrics

by Jin-Young Choi

1. The dependent variable is a dummy variable for whether an individual i is working or not.
Let educ and exper be schooling years and working experience at current (or previous) work
place. Suppose that, using data on 12,166 individuals, the following results are obtained:

(1)        LPM                  Probit               Logit
           Coef.     SE         Coef.     SE         Coef.     SE
const.    -0.048   0.0231      -1.651   0.0778      -2.783   0.1338
educ       0.026   0.0018       0.081   0.0060       0.138   0.0103
exper      0.279   0.0053       0.800   0.0185       1.337   0.0324
exper^2   -0.027   0.0009      -0.074   0.0031      -0.121   0.0055

(a) For each estimator, compute the estimated working probability for someone who has
5 years of working experience and 15 years of education. Do you think the estimated
probability is reliable?

For LPM, the estimated working probability is

$$P(Y = 1 \mid educ = 15, exper = 5) = -0.048 + 0.026 \cdot 15 + 0.279 \cdot 5 - 0.027 \cdot 5^2 = 1.062.$$

For Probit, the same probability is

$$\Phi(-1.651 + 0.081 \cdot 15 + 0.800 \cdot 5 - 0.074 \cdot 5^2) = \Phi(1.714) = P(Z < 1.714) \approx 0.9564.$$

For Logit, the probability is

$$\frac{1}{1 + \exp\{-(-2.783 + 0.138 \cdot 15 + 1.337 \cdot 5 - 0.121 \cdot 5^2)\}} = \frac{1}{1 + \exp\{-2.947\}} = 0.950.$$

The estimated working probability using LPM is not reliable, because a probability larger than one is meaningless. The Probit and Logit estimates, by contrast, lie between zero and one by construction.
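The three fitted probabilities in part (a) can be checked numerically. The following is a minimal sketch using only the coefficients of Table (1); the function and variable names are illustrative and not part of the original solution.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Logistic CDF used by the Logit model."""
    return 1.0 / (1.0 + math.exp(-z))

# Coefficients from Table (1): const, educ, exper, exper^2
COEFS = {
    "LPM":    (-0.048, 0.026, 0.279, -0.027),
    "Probit": (-1.651, 0.081, 0.800, -0.074),
    "Logit":  (-2.783, 0.138, 1.337, -0.121),
}
# The LPM uses the linear index itself as the probability.
LINKS = {"LPM": lambda z: z, "Probit": norm_cdf, "Logit": logistic_cdf}

def index(model, educ, exper):
    """Linear index b0 + b1*educ + b2*exper + b3*exper^2."""
    b0, b1, b2, b3 = COEFS[model]
    return b0 + b1 * educ + b2 * exper + b3 * exper ** 2

def work_prob(model, educ, exper):
    """Fitted working probability for the given model."""
    return LINKS[model](index(model, educ, exper))

for m in COEFS:
    print(m, round(index(m, 15, 5), 3), round(work_prob(m, 15, 5), 4))
```

The Probit value computed this way is about 0.957; the 0.9564 in the text corresponds to a normal-table lookup at z = 1.71.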

(b) Using Probit, compute the effect of an additional year of experience on the working probability for someone who has 2 years of experience and 11 years of education. Compute the same effect using Logit, and compare these two results.

The effect of an additional year of experience can be calculated by comparing two fitted values. For Probit,

$$P(Y = 1 \mid educ = 11, exper = 3) - P(Y = 1 \mid educ = 11, exper = 2)$$

$$= \Phi(-1.651 + 0.081 \cdot 11 + 0.800 \cdot 3 - 0.074 \cdot 3^2) - \Phi(-1.651 + 0.081 \cdot 11 + 0.800 \cdot 2 - 0.074 \cdot 2^2)$$

$$= \Phi(0.974) - \Phi(0.544) \approx 0.834 - 0.705 = 0.129.$$

Thus, for someone who has 2 years of experience and 11 years of education, an additional year of experience raises the working probability by 12.9 percentage points. For Logit, the effect is

$$P(Y = 1 \mid educ = 11, exper = 3) - P(Y = 1 \mid educ = 11, exper = 2)$$

$$= \frac{1}{1 + \exp\{-(-2.783 + 0.138 \cdot 11 + 1.337 \cdot 3 - 0.121 \cdot 3^2)\}} - \frac{1}{1 + \exp\{-(-2.783 + 0.138 \cdot 11 + 1.337 \cdot 2 - 0.121 \cdot 2^2)\}}$$

$$= 0.840 - 0.716 = 0.124.$$

Thus, for the same person, an additional year of experience raises the working probability by 12.4 percentage points. The effect from Logit is close to that from Probit.
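The discrete change in part (b) is just a difference of two fitted probabilities, which can be sketched as follows. The helper names are illustrative; only the coefficients come from Table (1).

```python
import math

def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Logistic CDF used by the Logit model."""
    return 1.0 / (1.0 + math.exp(-z))

# Coefficients from Table (1): const, educ, exper, exper^2
PROBIT = (-1.651, 0.081, 0.800, -0.074)
LOGIT = (-2.783, 0.138, 1.337, -0.121)

def prob(coefs, cdf, educ, exper):
    """Fitted probability at the given education and experience."""
    b0, b1, b2, b3 = coefs
    return cdf(b0 + b1 * educ + b2 * exper + b3 * exper ** 2)

def discrete_effect(coefs, cdf, educ, exper):
    """Change in the fitted probability when exper rises by one year."""
    return prob(coefs, cdf, educ, exper + 1) - prob(coefs, cdf, educ, exper)

probit_effect = discrete_effect(PROBIT, norm_cdf, educ=11, exper=2)
logit_effect = discrete_effect(LOGIT, logistic_cdf, educ=11, exper=2)
print(round(probit_effect, 3), round(logit_effect, 3))
```

Because the exact normal CDF is used instead of a table, the Probit effect comes out near 0.128 rather than 0.129; the comparison with Logit is unchanged.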

(c) Now, we set up a different model including only two binary variables, hischool and black, where hischool is an indicator for whether individual i has a high school degree, and black is an indicator for blacks. Suppose the following results are obtained:

(2)         LPM                  Probit
            Coef.     SE         Coef.     SE
const.     0.5115   0.0089      0.030    0.0229
hischool   0.108    0.0095      0.274    0.0244
black     -0.089    0.0093     -0.225    0.0238

Using Probit in Table (2), compute the estimated difference in the working probability for blacks and non-blacks where both have a high school degree. Is this difference significantly different from zero?

The estimated working probability for blacks is

$$P(Y = 1 \mid hischool = 1, black = 1) = \Phi(0.030 + 0.274 - 0.225) = \Phi(0.079) = P(Z < 0.079) \approx 0.5319.$$

And the same probability for non-blacks is

$$P(Y = 1 \mid hischool = 1, black = 0) = \Phi(0.030 + 0.274) = \Phi(0.304) = P(Z < 0.304) \approx 0.6179.$$

Thus, the estimated difference in the working probability between blacks and non-blacks is $0.5319 - 0.6179 = -0.086$, which implies that the working probability is lower for blacks by 8.6 percentage points. This difference is significantly different from zero, because the difference in the working probability comes from the coefficient of black, and this coefficient is significantly different from zero with t-value

$$\frac{-0.225 - 0}{0.0238} \approx -9.45.$$
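The calculation in part (c) can be reproduced in a few lines. This is a sketch using the Probit column of Table (2); the variable names are illustrative.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probit estimates from Table (2)
b_const, b_hischool, b_black = 0.030, 0.274, -0.225
se_black = 0.0238

# Fitted probabilities for high school graduates
p_black = norm_cdf(b_const + b_hischool + b_black)
p_nonblack = norm_cdf(b_const + b_hischool)
diff = p_black - p_nonblack

# t-statistic for H0: coefficient on black equals zero
t_stat = (b_black - 0.0) / se_black

print(round(p_black, 4), round(p_nonblack, 4), round(diff, 3), round(t_stat, 2))
```

With the exact normal CDF the difference is about -0.088; the -0.086 in the text reflects table values at z = 0.08 and z = 0.30.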

(d) Using LPM and Probit in Table (2) above, compute the estimated working probability for blacks with a high school degree. Are these two results different? Explain why they are different or why they are not.

For LPM, the estimated working probability is

$$P(Y = 1 \mid hischool = 1, black = 1) = 0.5115 + 0.108 - 0.089 = 0.5305.$$

For Probit, the same probability is

$$\Phi(0.030 + 0.274 - 0.225) \approx 0.5319.$$

These two results are similar because the model includes only binary variables. If a regressor $X$ is binary, $P(Y = 1 \mid X)$ takes only two values, $P(Y = 1 \mid X = 1)$ and $P(Y = 1 \mid X = 0)$. With only two points there is little scope for a non-linear relation between $X$ and $P(Y = 1 \mid X)$, so Probit has no advantage from allowing non-linearity and gives results similar to LPM.
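The point in part (d) can be seen by comparing the two models in every cell of (hischool, black), not only the one asked about. The sketch below uses the estimates of Table (2); the names are illustrative.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Estimates from Table (2): const, hischool, black
LPM = (0.5115, 0.108, -0.089)
PROBIT = (0.030, 0.274, -0.225)

def fitted(hischool, black):
    """Return the (LPM, Probit) fitted probabilities for one cell."""
    lpm = LPM[0] + LPM[1] * hischool + LPM[2] * black
    probit = norm_cdf(PROBIT[0] + PROBIT[1] * hischool + PROBIT[2] * black)
    return lpm, probit

gaps = {}
for hs in (0, 1):
    for bl in (0, 1):
        lpm, probit = fitted(hs, bl)
        gaps[(hs, bl)] = abs(lpm - probit)
        print(f"hischool={hs} black={bl} LPM={lpm:.4f} Probit={probit:.4f}")
```

In all four cells the two fitted probabilities differ by well under one percentage point.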

2. Suppose we have data on rental prices and other variables for 120 (college) towns for the years 2005 and 2010 ($i = 1, \ldots, 120$ and $t = 2005$ and $2010$). Using these data, we want to see whether a stronger presence of students affects rental rates in a town. Suppose that the true equation with the fixed effects is as follows

$$\log(rent)_{it} = \beta_0 + \beta_1 \log(pop)_{it} + \beta_2 \log(avginc)_{it} + \beta_3\, pctstu_{it} + \delta_t + \alpha_i + U_{it} \qquad (4)$$

where pop is city population, avginc is average income, and pctstu is student population as a percentage of city population (during the school year). The unobserved city fixed effect is denoted by $\alpha_i$, and the unobserved time fixed effect is denoted by $\delta_t$. How would you estimate the coefficient $\beta_3$ consistently? Explain briefly the way to estimate $\beta_3$.

Since all regressors may be correlated with the two types of unobserved fixed effects, pooled OLS applied to this equation would be inconsistent. The easiest way to estimate $\beta_3$ is to use FD (first differencing) in order to eliminate the city fixed effect from the equation. Since there are only two time periods, 2005 and 2010, by taking the difference between the two periods we obtain

$$\log(rent)_{i10} - \log(rent)_{i05} = \beta_1 \{\log(pop)_{i10} - \log(pop)_{i05}\} + \beta_2 \{\log(avginc)_{i10} - \log(avginc)_{i05}\} + \beta_3 (pctstu_{i10} - pctstu_{i05}) + \delta_{10} - \delta_{05} + U_{i10} - U_{i05}$$

$$= \delta_0 + \beta_1 \Delta\log(pop)_i + \beta_2 \Delta\log(avginc)_i + \beta_3 \Delta pctstu_i + \Delta U_i,$$

where $\Delta X_i \equiv X_{it} - X_{it-1}$ and $\delta_0 \equiv \delta_{10} - \delta_{05}$. Then $\beta_1$, $\beta_2$, $\beta_3$, and the time effect of 2010 relative to 2005 ($\delta_0$) are estimated consistently by applying OLS to the final equation, provided $\Delta U_i$ is uncorrelated with the differenced regressors.
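The contrast between pooled OLS and FD can be illustrated with simulated data. The sketch below is not based on the actual rental data: all parameter values are hypothetical, and for brevity it keeps only pctstu as a regressor and drops log(pop) and log(avginc). The city effect is built to be correlated with pctstu, which is what makes pooled OLS inconsistent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120                      # number of towns, as in the question
beta3 = 0.01                 # hypothetical true coefficient on pctstu
delta05, delta10 = 0.0, 0.3  # hypothetical time effects

alpha = rng.normal(size=n)   # city fixed effect
# pctstu is correlated with the city effect in both years
pctstu05 = 20 + 5 * alpha + rng.normal(size=n)
pctstu10 = pctstu05 + rng.normal(scale=3, size=n)

y05 = 1 + beta3 * pctstu05 + delta05 + alpha + rng.normal(scale=0.1, size=n)
y10 = 1 + beta3 * pctstu10 + delta10 + alpha + rng.normal(scale=0.1, size=n)

# Pooled OLS with a 2010 dummy: the city effect stays in the error term
y_pool = np.concatenate([y05, y10])
X_pool = np.column_stack([
    np.ones(2 * n),
    np.concatenate([np.zeros(n), np.ones(n)]),  # y10 dummy
    np.concatenate([pctstu05, pctstu10]),
])
b_pool = np.linalg.lstsq(X_pool, y_pool, rcond=None)[0]

# First differences: the city effect cancels, the intercept is delta10 - delta05
X_fd = np.column_stack([np.ones(n), pctstu10 - pctstu05])
b_fd = np.linalg.lstsq(X_fd, y10 - y05, rcond=None)[0]

print("pooled OLS beta3:", b_pool[2])
print("FD beta3:", b_fd[1], "FD intercept:", b_fd[0])
```

In this design the pooled estimate is pushed well above the true value by the omitted city effect, while the FD estimate stays close to it.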

3. In 1985, neither Florida nor Georgia had laws banning open alcohol containers in vehicle
passenger compartments. By 1990, Florida had passed such a law, but Georgia had not.

(a) Suppose you can collect random samples of the driving-age population in both states, for
1985 and 1990. Let arrest be a binary variable equal to unity if a person was arrested for
drunk driving during the year. Without controlling for any other factors, write down a
linear probability model that allows you to test whether the open container law reduced
the probability of being arrested for drunk driving. Which coefficient in your model
measures the effect of the law?

Let FL be a binary variable equal to one if a person lives in Florida, and zero otherwise. Let y90 be a year dummy variable for 1990. Then,

$$E(arrest \mid FL, y90) = \beta_0 + \delta_0\, y90 + \beta_1 FL + \delta_1\, y90 \cdot FL.$$

Because arrest is binary, this conditional expectation is the probability of arrest, so the equation is a linear probability model. The effect of the law is measured by $\delta_1$, which is the change in the probability of drunk driving arrest due to the new law in Florida. Including y90 allows for aggregate trends in drunk driving arrests that would affect both states; including FL allows for systematic differences between Florida and Georgia in either drunk driving behavior or law enforcement.
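Since the model above is saturated in FL and y90, OLS reproduces the four cell means, and the coefficient on the interaction equals the difference-in-differences of arrest rates. The sketch below checks this on made-up counts; the cell sizes and arrest numbers are purely hypothetical.

```python
import numpy as np

# Hypothetical cells keyed by (FL, y90): (number of people, number arrested)
cells = {
    (0, 0): (1000, 50),  # Georgia 1985
    (0, 1): (1000, 45),  # Georgia 1990
    (1, 0): (1000, 60),  # Florida 1985
    (1, 1): (1000, 40),  # Florida 1990
}

rows, arrests = [], []
for (fl, y90), (n, k) in cells.items():
    # Regressors: constant, y90, FL, y90*FL
    rows += [[1.0, y90, fl, y90 * fl]] * n
    arrests += [1.0] * k + [0.0] * (n - k)

X = np.array(rows, dtype=float)
y = np.array(arrests)
b0, d0, b1, d1 = np.linalg.lstsq(X, y, rcond=None)[0]

# Difference-in-differences computed directly from cell means
means = {c: k / n for c, (n, k) in cells.items()}
did = (means[(1, 1)] - means[(1, 0)]) - (means[(0, 1)] - means[(0, 0)])

print("interaction coefficient:", d1)
print("difference-in-differences:", did)
```

With these invented numbers both quantities equal -0.015: Florida's arrest rate fell by 2 percentage points while Georgia's fell by 0.5.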

(b) Why might you want to control for other factors in the model? What might some of
these factors be?

It could be that the populations of drivers in the two states change in different ways over time. For example, age, race, or gender distributions may have changed. The levels of education across the two states may have changed. As these factors might affect whether someone is arrested for drunk driving, it could be important to control for them. At a minimum, there is the possibility of obtaining a more precise estimator of $\delta_1$ by reducing the error variance. Essentially, any explanatory variable that affects arrest can be used for this purpose.

(c) Now, suppose that you can only collect data for 1985 and 1990 at the county level
for the two states. The dependent variable would be the fraction of licensed drivers
arrested for drunk driving during the year. How does this data structure differ from the
individual-level data described in part (a)? What econometric method would you use?

In (a), individuals are randomly sampled in each year, so the probability that the same person is observed more than once is close to zero. Thus, each unit is observed only once and the data are repeated cross-sections, not a panel. Now, the dependent variable is the fraction of licensed drivers arrested for drunk driving, and the unit of observation is a county observed twice, so we can consider a linear panel model, such as

$$Farrest_{it} = \alpha_i + \delta_t + \beta_1 FL_i + \delta_1\, y90 \cdot FL_{it} + \beta_x X_{it} + U_{it}$$

$$Farrest_{i85} = \alpha_i + \delta_{85} + \beta_1 FL_i + \beta_x X_{i85} + U_{i85}$$

$$Farrest_{i90} = \alpha_i + \delta_{90} + \beta_1 FL_i + \delta_1 FL_{i90} + \beta_x X_{i90} + U_{i90}$$

where $i$ denotes each county, $t$ denotes the year, and $X_{it}$ denotes any county-level time-varying characteristics. Given this model, we can use ED or FD to control for the two types of fixed effects, the county-level fixed effect $\alpha_i$ and the time fixed effect $\delta_t$. If we use FD, then the final regression model is

$$\Delta Farrest_{i90} = \Delta\delta_{90} + \delta_1 FL_{i90} + \beta_x \Delta X_{i90} + \Delta U_{i90}.$$

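For ED, the final regression model is

$$\widetilde{Farrest}_{it} = \beta_0 + \delta_0\, y90 + \delta_1\, \widetilde{y90 \cdot FL}_{it} + \beta_x \widetilde{X}_{it} + \widetilde{U}_{it}.$$

Under Assumptions ED1-4 (or FD1-4), we can estimate $\delta_1$ consistently.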
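With only two periods, first differencing and demeaning each county's data give numerically identical slope estimates. The sketch below illustrates this on simulated county data. It assumes that "ED" refers to the entity-demeaned (within) transformation, which the source does not spell out, and every parameter value is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200                                     # hypothetical number of counties
fl = (np.arange(n) < n // 2).astype(float)  # first half are Florida counties
delta1 = -0.015                             # hypothetical effect of the law

alpha = rng.normal(scale=0.01, size=n)      # county fixed effect
x85 = rng.normal(size=n)                    # one time-varying control
x90 = x85 + rng.normal(size=n)

y85 = 0.05 + alpha + 0.002 * x85 + rng.normal(scale=0.005, size=n)
y90 = 0.045 + alpha + delta1 * fl + 0.002 * x90 + rng.normal(scale=0.005, size=n)

# FD: regress the change on a constant, the Florida indicator, and the change in x
X_fd = np.column_stack([np.ones(n), fl, x90 - x85])
b_fd = np.linalg.lstsq(X_fd, y90 - y85, rcond=None)[0]

def demean(a85, a90):
    """Subtract each county's two-period mean and stack the two years."""
    m = (a85 + a90) / 2
    return np.concatenate([a85 - m, a90 - m])

# Demeaned regression on y90, y90*FL, and x (no constant is needed after demeaning)
y_dm = demean(y85, y90)
X_dm = np.column_stack([
    demean(np.zeros(n), np.ones(n)),  # demeaned y90 dummy
    demean(np.zeros(n), fl),          # demeaned y90*FL interaction
    demean(x85, x90),                 # demeaned control
])
b_dm = np.linalg.lstsq(X_dm, y_dm, rcond=None)[0]

print("FD estimates:      ", b_fd)
print("demeaned estimates:", b_dm)
```

The two coefficient vectors agree to machine precision, and the estimate of the interaction coefficient is close to the value used in the simulation.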
