
A Partial Replication of T. Mroz's Paper (1987)
on the Sensitivity of an Empirical Model
Seong H. Moon

December 18, 2003

1 Introduction
This paper is a partial replication of the analysis in T. Mroz's well-known
article (1987, hereafter "Mroz's paper") on the sensitivity of an empirical
model of female labor supply.[1] It is not a complete replication, in the
sense that it does not reproduce all of the results in Mroz's paper. There
are two main reasons:

(1) Mroz's paper is very long and reports a large number of results
obtained with various estimation methods. This is because its goal
was to undertake a systematic analysis of several theoretical and
statistical assumptions used in many empirical models. He attempted
to address several issues at once using a single data set, and thus
to provide a useful baseline for future studies of female labor
supply. I therefore take up only a part of his results, since
replicating all of them would be an extremely hard job and some of
his analyses are beyond my ability and knowledge.

(2) The data set can be obtained very easily but only incompletely.[2]
Mroz used a sub-sample of the University of Michigan Panel Study of
Income Dynamics (PSID) for the year 1975. The data set consists of
753 married white women between the ages of 30 and 60 in 1975, with
428 working at some time during the year. It contains measures of
hours of work, hourly wage rate, family income, and some information
about their own and their husbands' demographic characteristics,
numbers of children, etc. A main difference between the data set
that Mroz actually used and the publicly released one that I employ
is that the original version includes the number of years of
schooling of the husband's mother and father, but the publicly
released one does not. This is critical because he used these two
variables as instruments in some parts of his paper. Without them, I
cannot replicate some of the analyses in his paper. In this
situation, the most plausible solution would be to replicate his
sampling rule with the original PSID data set, but this did not work
because the characteristics of the PSID sample for the year 1975 did
not correspond to those presented in Mroz's paper. Thus, I can only
guess that he must have used another version of the PSID that
differs from the publicly released one.

In Ph.D. program, Dept. of Economics, Univ. of Chicago. E-mail: moon@uchicago.edu.
[1] Thomas A. Mroz, "The Sensitivity of an Empirical Model of Married Women's
Hours of Work to Economic and Statistical Assumptions," Econometrica, Vol. 55,
No. 4 (Jul., 1987), pp. 765-799.
[2] The Mroz data can be downloaded from http://www.stata.com or
http://www.dataset.org. The PSID data set is available at
http://psidonline.isr.umich.edu.

For these two reasons, I present replications of only some of the results in
Mroz's paper. This is why I call this paper a partial replication.

This paper contains three more sections. Section 2 briefly describes the
main idea, the methodology, and the results of Mroz's paper. Section 3
presents replications of Mroz's analysis and some hypothesis tests. In
Section 4, I give a summary and some suggestions for improvement.

2 A Brief Description of Mroz's Paper

In Section 1 of his paper, Mroz analyzed the sensitivity of an empirical
model of female labor supply by focusing on a simple model. In this model,
the husband's behavior is assumed to be exogenous, and a woman's labor
supply is given by

$$ h_i = \beta_0 + \beta_1 \ln(w_i) + \beta_2 Y_i + \beta_3' X_i + \varepsilon_i \qquad (1) $$

where $h_i$ is the $i$-th woman's hours of work during a given year, $w_i$
is a measure of her wage rate, $Y_i$ is a measure of other income received
by the household, $X_i$ is a set of control variables, $\varepsilon_i$ is a
stochastic disturbance, and $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$
are the parameters of the labor supply function. The vector $X_i$ includes
the wife's age, her years of schooling, the number of children less than
six years old in the household, the number of children between the ages of
six and nineteen, and the marginal tax rate, etc.
His main issue is whether this model is sensitive to three sets of
statistical assumptions: exogeneity assumptions, statistical control for
self-selection into the labor force, and the impact of controlling for
taxes. He addressed all three, but in this paper I touch only on the first,
that is, exogeneity. He tested for the exogeneity of wage rates, the wife's
labor market experience, non-wife income, and children in the household,
without controlling for self-selection bias at this stage.

Exogeneity of Wage Rates:

In (1), the OLS estimator of the coefficients is consistent and efficient
if all regressors are exogenous and the disturbances are i.i.d. normally
distributed. To test the null hypothesis of an exogenous wage rate, we can
use IV estimation in which the wage rate is instrumented. Under the null
hypothesis, the difference between the OLS and IV estimators converges to
zero, and a quadratic form in this difference is asymptotically Chi-squared
distributed; under the alternative hypothesis (i.e. an endogenous wage
rate), the difference does not converge to zero. An important issue here is
which variables should be used as instruments. Since we are considering
labor supply in 1975, the wage rate in 1976 or 1974 might be considered an
ideal candidate. Mroz used the wage rate in 1976 as an instrument. This
measure, however, is available only for women who were at work in 1976.
Without considering self-selection, he compared the OLS and IV estimates
for a sub-sample of women who were at work in both 1975 and 1976.
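The logic of this OLS-versus-IV comparison can be sketched on simulated data. The example below is only an illustration under strong simplifying assumptions that are not in Mroz's paper: a single regressor plus a constant, a single instrument, homoskedastic errors, and made-up parameter values. The function name `hausman_scalar` is hypothetical.

```python
import random
from statistics import fmean

def hausman_scalar(y, x, z):
    """Hausman statistic for H0: x is exogenous (one regressor plus a constant).

    Compares the OLS slope with the IV slope that uses z as the instrument.
    Assumes homoskedastic errors, so that OLS is efficient under H0.
    """
    n = len(y)
    yb, xb, zb = fmean(y), fmean(x), fmean(z)
    sxx = sum((a - xb) ** 2 for a in x)
    szz = sum((c - zb) ** 2 for c in z)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    szx = sum((c - zb) * (a - xb) for c, a in zip(z, x))
    szy = sum((c - zb) * (b - yb) for c, b in zip(z, y))

    b_ols = sxy / sxx
    b_iv = szy / szx

    # Error variance from the IV residuals (IV is consistent under H0 and H1).
    a_iv = yb - b_iv * xb
    s2 = sum((b - a_iv - b_iv * a) ** 2 for a, b in zip(x, y)) / (n - 2)

    var_ols = s2 / sxx
    var_iv = s2 * szz / szx ** 2      # never below var_ols (Cauchy-Schwarz)
    stat = (b_iv - b_ols) ** 2 / (var_iv - var_ols)
    return b_ols, b_iv, stat

# Simulated data in which x is endogenous: x and the error share the shock v.
rng = random.Random(0)
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + vi for zi, vi in zip(z, v)]
e = [0.8 * vi + rng.gauss(0, 0.6) for vi in v]
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, e)]

b_ols, b_iv, stat = hausman_scalar(y, x, z)
print(f"OLS slope {b_ols:.3f}, IV slope {b_iv:.3f}, Hausman statistic {stat:.1f}")
```

With one coefficient under test, the statistic is compared with a Chi-squared distribution with 1 degree of freedom. In this design the OLS slope is biased away from the true value of 2 while the IV slope is not, so the statistic should be far above conventional critical values.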

Exogeneity of Instruments:
If we want to take instruments from the 1975 data, we can consider various
candidates. But another issue then arises: are they valid instruments? To
test the exogeneity of a particular instrument, say $z_k$, we need to
compare two IV estimators: one from an estimation with a set of instruments
that includes $z_k$, the other from a set that excludes it. Under the null
hypothesis of exogenous $z_k$, both estimators are consistent; otherwise,
only the second one is (assuming that all other instruments are exogenous).
Thus, we can test the exogeneity of $z_k$ by checking whether the
difference between the two estimators converges to zero, based on the
following test statistic:

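$$ (\hat{\beta}_{IV} - \hat{\beta}^*_{IV})' \, [\mathrm{var}(\hat{\beta}_{IV} - \hat{\beta}^*_{IV})]^{-1} \, (\hat{\beta}_{IV} - \hat{\beta}^*_{IV}) \Rightarrow \chi^2(k^*) \qquad (2) $$

where only $\hat{\beta}_{IV}$ includes $z_k$ among its instruments.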

Here, two issues arise. We need to know (i) the variance matrix in this
formula, and especially the covariance matrix of the two estimators, and
(ii) the degrees of freedom of this distribution, $k^*$. For the first
issue, we cannot use Hausman's (1978) formula for the covariance matrix of
the difference between the two estimators, because neither is necessarily
efficient. Instead, we can derive it as follows:
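$$ \mathrm{var}(\hat{\beta}_{IV} - \hat{\beta}^*_{IV}) = \mathrm{var}(\hat{\beta}_{IV}) + \mathrm{var}(\hat{\beta}^*_{IV}) - 2\,\mathrm{cov}(\hat{\beta}_{IV}, \hat{\beta}^*_{IV}) \qquad (3) $$

$$ \mathrm{cov}(\hat{\beta}_{IV}, \hat{\beta}^*_{IV}) = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z' \, E(\varepsilon\varepsilon') \, Z_*(Z_*'Z_*)^{-1}Z_*'X \, [X'Z_*(Z_*'Z_*)^{-1}Z_*'X]^{-1} \qquad (4) $$

where $Z$ and $Z_*$ are the instrument matrices used for $\hat{\beta}_{IV}$
and $\hat{\beta}^*_{IV}$, respectively.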


For the second issue, it is tempting to invoke the results for a full-rank
quadratic form in a normal vector and conclude that the degrees of freedom
for this Chi-squared statistic are K, i.e. the number of regressors. But
that method will usually be incorrect and, worse yet, unless X and Z have
no variables in common, the rank of the matrix in this statistic is less
than K, and the ordinary inverse might not even exist. In this case, since
we are assuming that all other instruments are exogenous and testing the
exogeneity of only one instrument, the degrees of freedom are actually
just 1.[3]
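The comparison of two IV estimators can be sketched in the same spirit. The example below is a minimal illustration and not Mroz's procedure: it assumes one regressor and no constant, one maintained instrument `z1` and one instrument under test `z2`, and it replaces the unknown E(εε') in the covariance formula by a diagonal matrix of squared residuals, which is my own plug-in choice. All names and parameter values are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def iv_difference_test(y, x, z1, z2):
    """Test H0: z2 is exogenous, maintaining that z1 is a valid instrument.

    Model: y = beta * x + eps (one regressor, no constant, for brevity).
    Estimator A uses instruments (z1, z2); estimator B uses z1 only.
    """
    # First stage for A: project x on (z1, z2) via the 2x2 normal equations.
    s11, s12, s22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    t1, t2 = dot(z1, x), dot(z2, x)
    det = s11 * s22 - s12 ** 2
    g1 = (s22 * t1 - s12 * t2) / det
    g2 = (s11 * t2 - s12 * t1) / det
    xa = [g1 * a + g2 * b for a, b in zip(z1, z2)]
    # First stage for B: project x on z1 alone.
    xb = [(t1 / s11) * a for a in z1]

    a, b = dot(xa, x), dot(xb, x)
    beta_a, beta_b = dot(xa, y) / a, dot(xb, y) / b

    # Residuals from B, which stays consistent whether or not z2 is valid.
    e = [yi - beta_b * xi for yi, xi in zip(y, x)]

    # beta_a - beta_b = sum_i w_i * eps_i, so var(A) + var(B) - 2 cov(A, B)
    # collapses to sum_i w_i^2 * e_i^2 once E(eps eps') is set to diag(e_i^2).
    w = [p / a - q / b for p, q in zip(xa, xb)]
    var_diff = sum(wi ** 2 * ei ** 2 for wi, ei in zip(w, e))
    stat = (beta_a - beta_b) ** 2 / var_diff
    return beta_a, beta_b, stat

rng = random.Random(1)
n = 2000
z1 = [rng.gauss(0, 1) for _ in range(n)]
q = [rng.gauss(0, 1) for _ in range(n)]
v = [rng.gauss(0, 1) for _ in range(n)]
eps = [0.8 * vi + rng.gauss(0, 0.6) for vi in v]   # correlated with x through v
x = [p + r + s for p, r, s in zip(z1, q, v)]
y = [2.0 * xi + ei for xi, ei in zip(x, eps)]

z2_valid = q                                        # uncorrelated with eps
z2_invalid = [qi + ei for qi, ei in zip(q, eps)]    # correlated with eps

print("valid z2:  ", iv_difference_test(y, x, z1, z2_valid))
beta_a, beta_b, stat = iv_difference_test(y, x, z1, z2_invalid)
print("invalid z2:", (beta_a, beta_b, stat))
```

Because only one instrument is under test, the statistic is referred to a Chi-squared distribution with 1 degree of freedom. With the invalid instrument, estimator A drifts away from the true value of 2 while estimator B does not, and the statistic should be large.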

With these results in hand, Mroz first specified the model and then tested
the exogeneity assumptions for the four variables above. In what follows,
I replicate and re-evaluate his procedures.

3 Replications and Hypothesis Tests


Basic Model Specifications:
At the beginning of his paper, Mroz suggested 10 baseline specifications.
[Table I] compares his calculations with my own. These specifications are
all based on the sub-sample of women who were at work in 1975, and the
number of observations is 428. [Table I] corresponds to [Table IV] in
Mroz's paper.[4] As shown in [Table I], the figures are slightly
different.[5] There are two reasons: one is the difference in data sets,
and the other is a variety of minor sources. First, for Spec. (5) and (9),
Mroz's estimates and my own look very different. This results mainly from
the difference in the data sets used: Mroz included the number of years of
schooling of the husband's mother and father, but I could not, because the
publicly released data set does not contain them, as stated before.
Second, the other minor differences seem to come just from the calculation
process, except for Spec. (2), where I believe the discrepancy must be due
to a typo.

[3] For details, see W. Greene (2003), pp. 81-82.
[4] Mroz (1987), p. 770.
[5] The standard errors, which I do not present here, seem very different.
I guess that this is because Mroz corrected all standard errors for
arbitrary forms of heteroscedasticity, but I did not. He stated that he did
so, but not how, so I could not replicate it.

It is worth noting that Mroz derives two interpretations from this table:
(i) the wide range of estimated wage effects suggests that assumptions
concerning the sets of instruments used to estimate the model can have a
considerable impact upon the estimated structural parameters, and (ii)
since estimates using the sets of instrumental variables that include the
wife's market experience (rows 2-6) yield larger wage responses than those
without it (rows 7-10), we can suspect a possible specification error.[6]
Following him, I interpret (i) and (ii) as implying endogeneity of wage
rates and of the wife's market experience, respectively.
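To give a sense of scale for this range of wage effects, the following sketch converts a coefficient on ln(wage) into the implied change in annual hours for a given proportional wage increase. It assumes the semi-log form of equation (1); the function name is hypothetical, and the coefficients are the rounded Mroz values for Spec. (1), (2), and (7) in [Table I].

```python
import math

def hours_change(ln_wage_coef, wage_increase):
    """Implied change in annual hours when hours are linear in ln(wage).

    Raising the wage by a proportion p changes ln(w) by ln(1 + p),
    so hours change by ln_wage_coef * ln(1 + p).
    """
    return ln_wage_coef * math.log(1.0 + wage_increase)

# Rounded ln(wage) coefficients reported by Mroz in [Table I].
coefs = {"Spec. (1), OLS": -17, "Spec. (2)": 1282, "Spec. (7)": -182}

for label, b1 in coefs.items():
    print(f"{label}: {hours_change(b1, 0.10):+.1f} hours for a 10% wage rise")
```

Under this reading, a 10% wage increase implies roughly 122 more hours per year in Spec. (2) but a small decrease in Spec. (1) and (7), which is the sensitivity to the instrument set that interpretation (i) refers to.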

Endogeneity of Wage Rates:

To test the endogeneity of the wage rate, as stated in Section 2, we need
two different specifications: OLS and IV estimation. Comparing Spec. (1),
i.e. the OLS estimation, with the other specifications in [Table I] above,
we observe that they look very different, which suggests that the wage rate
might be endogenous. But since we also suspect that the instruments
themselves might be endogenous, this guess could be misleading at this
point. We need a more trustworthy instrument and a robust test based on it.
For this reason, Mroz took the wage rate in 1976 and ran an IV estimation
for another sub-sample of women who were at work in both 1975 and 1976.
Results from this estimation are presented in [Table II], which corresponds
to [Table VI] in Mroz's paper.[7] As shown, the two specifications are very
different, and this implies that the wage rate is endogenous in the labor
supply equation, at least for working women. Of course, to state this more
formally, we need to calculate a test statistic that has a Chi-squared
distribution in large samples, using Hausman's formula. In my own
calculation, the Chi-squared test statistic has a value of 6,747,008, which
is larger than the corresponding critical value at any conventional level
of significance. Thus, the null hypothesis of an exogenous wage rate is
rejected.

[6] Mroz (1987), pp. 772-773.
[7] Mroz (1987), p. 775.

Endogeneity of Instruments:
[Table III], which again corresponds to [Table VI] in Mroz's paper, shows
results from IV estimations using various samples and sets of instruments.
In the third row, the difference between the two IV estimates (one using
ln(wage1976) as an instrument, the other excluding it) seems very large,
and indeed the null hypothesis of exogenous ln(wage1976) is rejected. In my
own calculation, the Chi-squared test statistic is 2,344,001, which is
larger than the corresponding critical value at any conventional level of
significance. Thus, we conclude that the 1976 wage rate is not exogenous,
that is, it is correlated with unobservables in the labor supply equation
of working women.
The sixth row can be interpreted in a similar way. That is, the wife's
experience is not exogenous in the working women's labor supply equation. I
obtained a Chi-squared statistic of 9,737,228, and the null hypothesis of
exogenous labor market experience was again rejected at any conventional
level of significance.
How about the other instruments? Mroz tested the endogeneity of non-wife
income and the number of children, and failed to reject the exogeneity
assumption for both of them. [Table IV], which corresponds to
[Table VII]-[Table VIII] in Mroz's paper, presents the results.[8] Since we
already know that the wife's labor market experience is not exogenous, I
excluded it from the set of instruments. Notice that my own results are
very different in most cells. This is caused by the difference in the data
sets used. In testing the exogeneity of non-wife income and children, Mroz
included the number of years of schooling of the husband's mother and
father, but I could not do so in the absence of the relevant data, and thus
obtained different estimates. The main conclusion is, however, the same, as
will be shown.
In [Table IV], the 5th and 6th rows concern the exogeneity of the number of
children, while the 7th and 8th rows concern that of non-wife income. The
test statistics indicate that, both for children and for non-wife income,
the null hypothesis of exogeneity is not rejected. That is, both can be
considered exogenous. The test statistics lie between 2.655 (for the 6th
row) and 4.981 (for the 7th row), and they are all less than 5.02, the
critical value at a significance level of 2.5% for a Chi-squared
distribution with 1 degree of freedom (the corresponding 5% critical value
is 3.84). Thus, we reach the conclusion that the null hypotheses of
exogeneity are not rejected, which is consistent with Mroz's outcome.
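Critical values for a Chi-squared distribution with 1 degree of freedom can be checked without tables, because such a variable is the square of a standard normal. The sketch below uses only the Python standard library; the helper name is hypothetical, and the two statistics are the end points of the range reported above.

```python
from statistics import NormalDist

def chi2_df1_critical(alpha):
    """Upper-alpha critical value of a Chi-squared distribution with 1 d.f.

    If Z is standard normal, Z^2 is Chi-squared(1), so
    P(Z^2 > c) = alpha  <=>  c = (z_{1 - alpha/2})^2.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2

for alpha in (0.05, 0.025, 0.01):
    print(f"alpha = {alpha}: critical value = {chi2_df1_critical(alpha):.3f}")

# End points of the range of test statistics reported for [Table IV].
for stat in (2.655, 4.981):
    rejected = [stat > chi2_df1_critical(a) for a in (0.05, 0.025, 0.01)]
    print(f"statistic {stat}: rejected at 5%, 2.5%, 1%? {rejected}")
```

This gives about 3.84 at the 5% level, 5.02 at the 2.5% level, and 6.63 at the 1% level.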
[8] Mroz (1987), p. 777.

4 Summary and Some Suggestions
From the above, I obtained results consistent with Mroz's paper: (i) the
wage rate is not exogenous in the female labor supply equation; (ii) the
wife's labor market experience is also not exogenous, which implies that
women who have worked many years in the past tend to have higher wages and
to work more in the present, reflecting a systematic difference in the
unobservables influencing their labor supply (e.g. tastes for work); and
(iii) non-wife income of a household and the number of children seem to be
exogenous. However, it is worth noting that these results are only
preliminary, since I did not control for self-selection bias, which has
been addressed in Mroz (1987) and many other studies. I skipped it simply
because it is beyond the scope of this paper, but it must be taken into
account.

In replicating Mroz's paper, I found some puzzling aspects of his analyses,
and some of them look like either simple mistakes or serious failures:
First, he reported slightly different figures from table to table for
exactly the same estimation, and sometimes this divergence seems too large
to be ignored. For example, in his paper, row 9 in [Table IV], row 2 in
[Table VII], and row 1 in [Table VIII] are all from a single estimation,
but the numbers that he reported are different. For the coefficient on the
number of children less than 6 years old, he reported -338 in [Table IV],
-334 in [Table VII], and -344 in [Table VIII]. In my opinion, there is no
reason for them to have different values. I found similar divergences for
other variables.
Second, as stated in a footnote in Section 3, he said that he corrected
standard errors for arbitrary forms of heteroscedasticity, but did not
explain at all how he actually did it, and this is why I could not
reproduce his standard errors exactly. I think he should explain how he did
it and consider whether such a correction has a significant impact upon the
test statistics.
Third, he did not address why it is reasonable that the exogeneity of
non-wife income and children is not rejected, but this must be explained.
Similarly, the validity of the assumption of exogeneity of the husband's
behavior must also be re-evaluated. Non-wife income, the number of
children, and the husband's behavior are all related to a woman's past
decisions, and they might well be correlated with unobservables influencing
her current labor supply.

References

Greene, W. (2003): Econometric Analysis, 5th edition, New York: Prentice
Hall.

Hausman, J. (1978): "Specification Tests in Econometrics," Econometrica,
Vol. 46, 1251-1272.

Mroz, T. (1987): "The Sensitivity of an Empirical Model of Married Women's
Hours of Work to Economic and Statistical Assumptions," Econometrica,
Vol. 55, 765-799.

Appendix: More Tables To Be Attached

[Table I] Basic Specifications: Women at work in 1975 (obs=428)

      ------------- Mroz (1987) --------------   ------------- Moon (2003) --------------
Spec.   ln(wage)  Non-wife      Kids      Kids     ln(wage)  Non-wife      Kids      Kids
                    income     (less     (6 to                 income     (less     (6 to
                             than 6)       18)                          than 6)       18)
1            -17      -4.2      -342      -115        -17.4     -4.25    -342.5    -115.0
2           1282      -8.3      -235       -60       1261.6     -8.34    -234.7     -59.8
3            831      -7.0      -271       -78        831.3     -6.96    -271.0     -78.4
4            672      -6.4      -283       -85        672.3     -6.45    -284.4     -85.2
5            482      -5.8      -300       -93        495.4     -5.89    -299.3     -92.9
6            638      -6.3      -287       -87        638.6     -6.35    -287.2     -86.7
7           -182      -3.7      -356      -122       -182.5     -3.72    -356.4    -122.1
8             46      -4.4      -337      -112         45.7     -4.45    -337.2    -112.3
9            -30      -4.2      -338      -113        -18.0     -4.20    -343.0    -115.0
10           129      -4.7      -330      -108        129.3     -4.72    -330.1    -108.7

1. Spec. (1) comes from an OLS estimation where the regressors are a
constant, the wife's age and education, and the above 4 variables.

2. Spec. (2)-(10) come from 2SLS estimations which employ different sets of
instruments for ln(wage). B, C, and I are common instruments, and the
additional instruments are (2) E; (3) E, F2; (4) E, F3; (5) E, F3, H3;
(6) E, F4; (7) F2; (8) F3; (9) F3, H3; (10) F4.

3. B: county unemployment rate and the numbers of years of schooling of the
wife's mother and father; C: numbers of children less than 6 and between 6
and 18; I: non-wife income; E: wife's experience and its square; F2:
quadratic terms in wife's age and education; F3: cubic terms in wife's age
and education; F4: quartic terms in wife's age and education.

4. In my own specifications, H2-H4 are constructed in the same way as
F2-F4, for the husband's age and education. In Mroz's specifications,
however, they include the numbers of years of schooling of the husband's
mother and father.
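The polynomial instrument sets in note 3 can be built mechanically. The sketch below is only one possible reading of them: it assumes that "quadratic", "cubic", and "quartic" terms mean all monomials in age and education, including cross-products, of total degree 2 up to the stated degree. Neither this definition nor the function and variable names come from Mroz's paper.

```python
from itertools import combinations_with_replacement

def polynomial_terms(row, names, max_degree):
    """Return {label: value} for every monomial of total degree 2..max_degree."""
    terms = {}
    for degree in range(2, max_degree + 1):
        for combo in combinations_with_replacement(names, degree):
            value = 1.0
            for name in combo:
                value *= row[name]
            terms["*".join(combo)] = value
    return terms

wife = {"age": 40.0, "educ": 12.0}   # hypothetical observation

f2 = polynomial_terms(wife, ("age", "educ"), 2)   # candidate F2 terms
f3 = polynomial_terms(wife, ("age", "educ"), 3)   # candidate F3 terms
print(f2)
print(sorted(f3))
```

Under this assumption F2 has three terms and F3 has seven. The same helper applied to the husband's age and education would give H2-H4 as described in note 4.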
