Is Neglected Heterogeneity Really An Issue in Binary and Fractional Regression Models? A Simulation Exercise For Logit, Probit and Loglog Models

Neglected heterogeneity in binary and fractional regression models can affect parameter estimates, partial effects calculations, outcome predictions, and test statistics. Through simulations, the authors examine these issues for logit, probit, and loglog models with binary and fractional dependent variables. They find that unobserved heterogeneity generally causes attenuation bias in parameter estimates but has varying effects on partial effects, prediction, and test statistics depending on the model. The paper aims to provide a more comprehensive analysis of these issues than previous research.

CEFAGE-UE Working Paper

2009/10

Is neglected heterogeneity really an issue in binary and fractional regression models?
A simulation exercise for logit, probit and loglog models

Esmeralda A. Ramalho¹ and Joaquim J. S. Ramalho²

¹ Departamento de Economia, Universidade de Évora and CEFAGE-UE
² Departamento de Economia, Universidade de Évora and CEFAGE-UE

CEFAGE-UE, Universidade de Évora, Largo dos Colegiais 2, 7000-803 Évora - Portugal

Tel.: (+351) 266 740 869, E-mail: [email protected], Web page: http://www.cefage.uevora.pt

Is neglected heterogeneity really an issue in binary and
fractional regression models? A simulation exercise for
logit, probit and loglog models∗

Esmeralda A. Ramalho and Joaquim J.S. Ramalho

Department of Economics, Universidade de Évora and CEFAGE-UE

This draft: May 2009

Abstract

In this paper we examine theoretically and by simulation whether or not unobserved
heterogeneity independent of the included regressors is really an issue in logit, probit
and loglog models with both binary and fractional data. We find that unobserved
heterogeneity: (i) produces an attenuation bias in the estimation of regression
coefficients; (ii) is innocuous for logit estimation of average sample partial effects, while in
the probit and loglog cases there may be important biases in the estimation of those
quantities; (iii) has much more destructive effects on the estimation of population
partial effects; (iv) does not substantially affect the prediction of outcomes only in
logit models; and (v) is innocuous for the size and consistency of Wald tests for the
significance of observed regressors but, in small samples, reduces their power substantially.
Keywords: binary models, fractional models, neglected heterogeneity, partial effects,
prediction, Wald tests.
JEL Classification: C12, C13, C15, C25.

∗
The authors thank João M.C. Santos Silva for very helpful comments. Financial support from Fundação
para a Ciência e a Tecnologia is also gratefully acknowledged (grant PTDC/ECO/64693/2006). Address
for correspondence: Joaquim J.S. Ramalho, Department of Economics, Universidade de Évora, Largo dos
Colegiais, 7000-803 ÉVORA, Portugal (e-mail: [email protected]).

1 Introduction
In economics, researchers are often interested in explaining a limited dependent variable, $Y$,
as a function of a set of explanatory variables, $X$. Due to the bounded nature of the variable
of interest, linear specifications often provide an inadequate description of the conditional
mean of $Y$, $E(Y|X)$, since no restriction is imposed on the range of values taken by the
predicted outcome. Moreover, when interest lies in the conditional probability of $Y$, $\Pr(Y|X)$,
nonlinear models are typically used. While the omission of relevant explanatory variables that
are independent of the included regressors is relatively innocuous in linear models, it generally
causes inconsistency in the estimation of the parameters of interest in nonlinear models (see
inter alia Gourieroux 2000, pp. 32-33). In this paper we examine the consequences of the
presence of that type of unobserved heterogeneity in logit, probit and loglog models for binary
and fractional or proportionate data.
To the best of our knowledge, there are very few studies examining the consequences
of unobserved heterogeneity in binary and fractional regression models. Moreover, the few
studies undertaken have assumed very restrictive conditions or were only concerned with
the effects of neglected heterogeneity on particular aspects of those models. For example,
Lee (1982) derived conditions under which omission of an orthogonal explanatory variable
would not cause bias in the estimation of the remaining parameters of a binary logit model.
However, those conditions are too stringent to be of practical use. Yatchew and Griliches
(1985) showed that for a binary probit model with a normally distributed omitted variable,
the estimators for the parameters of the included variables suffer from attenuation bias.
Wooldridge (2002, 2005), under similar assumptions, demonstrated that this bias does not
affect the consistent estimation of the partial effect of the observed regressors on the outcome.
Finally, Cramer (2003, 2007) considered the binary logit model and proved formally that
the same attenuation bias would occur in this context if the distribution of the omitted
variables is such that their relegation to the disturbance term of the latent regression equation
that originates the logit model does not change its logistic distribution, which is also a
very strong assumption. However, this last author also presents a small simulation study
which reveals that a particular partial effect, the average sample effect, is quite insensitive
to the inconsistency of the parameters of interest, even in cases where the logit shape of the
conditional distribution is severely affected.¹
Given that calculation of partial effects is often the main aim of empirical work and that
in nonlinear models the analysis of the magnitude of regression coefficients is not relevant
per se, both Wooldridge (2002) and Cramer (2007) suggest that, similarly to what happens
in linear models, unobserved heterogeneity is not an important issue in, respectively, binary
probit and logit models. However, it is not clear whether the robustness of the binary logit
model revealed by the simulation study of Cramer (2007) extends to the binary probit model
(or, in fact, to any other binary or fractional model) since no similar analysis has been carried
out for the latter model. Moreover, there are other quantities of interest in empirical work
that have not been considered by those authors. One example is outcome prediction, which
is relevant not only for the analysis of binary and fractional data but also in the estimation
of multi-part models which require binary outcome prediction in the first stage. Testing the
significance of the observed covariates is clearly another relevant issue for practitioners.
In order to examine these questions, we consider the theoretical framework of Wooldridge
(2002) and Cramer (2007) and extend their results to other quantities of interest and models.
However, given that a more general theoretical approach does not seem to be feasible, in this
paper we also conduct an extensive Monte Carlo study that extends the findings of the cited
papers in several directions. On the one hand, in addition to the binary logit and probit
models, we also consider an alternative asymmetric specification, the loglog model, and, in
each case, both binary outcomes, where interest lies in modelling $\Pr(Y|X)$, and fractional
responses, where the main purpose is modelling $E(Y|X)$.² On the other hand, we examine
the consequences of neglected heterogeneity for the performance of standard estimators
for those models at various levels: (i) the magnitude and direction of the parameters of
interest; (ii) the two common forms of calculating partial effects considered separately by
Wooldridge (2002) and Cramer (2007); (iii) the prediction of outcomes; and (iv) the size and
power of Wald tests for the significance of the included regressors. In all cases, we consider
several patterns of neglected heterogeneity by assuming various alternative distributions for
the omitted variables and assigning different weights to their relative importance.

¹ See also the work by Neuhaus and Jewell (1993) in the area of generalized linear models, which include
the models analyzed in this paper as particular cases. However, their analysis was restricted to the case of a
single observed covariate.
² See Papke and Wooldridge (1996) for a seminal paper on the so-called fractional regression model and
Ramalho, Ramalho and Murteira (2009) for a comprehensive survey on this subject.

The paper is organized as follows. In Section 2 we establish the framework of the paper,
discussing analytically the consequences of neglected heterogeneity in binary regression
models. The Monte Carlo simulation study to assess the performance of naive estimators in both
binary and fractional regression models is carried out in Section 3. Section 4 concludes.

2 Framework
Consider a random sample of $i = 1, \ldots, N$ individuals and let $Y$ be the binary or fractional
variable of interest, defined, respectively, as $Y \in \{0, 1\}$ and $Y \in [0, 1]$, and $X_1$ and $X_2$ be,
respectively, $k_1$- and $k_2$-vectors of explanatory variables. Denote by $\beta_1$ and $\beta_2$ the $k_1$- and
$k_2$-vectors of parameters associated with $X_1$ and $X_2$, respectively, and assume that there are
no relevant explanatory variables other than those included in $X_1$ and $X_2$. Assume also that
$X_1$ contains an intercept term, $X_2$ is not observed and $X_1$ and $X_2$ are independent. Finally,
assume that

$$E(Y | X_1 = x_1, X_2 = x_2) = G(x_1\beta_1 + x_2\beta_2), \tag{1}$$

where $G(z)$ is defined as $e^z/(1 + e^z)$, $\Phi(z)$, and $e^{-e^{-z}}$ for, respectively, logit, probit, and
loglog models. Note that in the binary case $G(\cdot)$ also equals $\Pr(Y = 1 | X_1 = x_1, X_2 = x_2)$.
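
The three specifications of $G(\cdot)$ in (1) can be written down directly. The following is a minimal Python sketch using only the standard library; the function names and the `LINKS` dictionary are hypothetical labels introduced here for illustration and are not part of the paper.

```python
import math
from statistics import NormalDist


def logit_link(z: float) -> float:
    """Logistic cdf, G(z) = e^z / (1 + e^z), arranged to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def probit_link(z: float) -> float:
    """Standard normal cdf, G(z) = Phi(z)."""
    return NormalDist().cdf(z)


def loglog_link(z: float) -> float:
    """Loglog specification, G(z) = exp(-exp(-z))."""
    if z < -700.0:  # exp(-z) would overflow; the limit of G is 0
        return 0.0
    return math.exp(-math.exp(-z))


# Hypothetical lookup table used only in this sketch.
LINKS = {"logit": logit_link, "probit": probit_link, "loglog": loglog_link}

for name, G in LINKS.items():
    print(name, round(G(-1.0), 4), round(G(0.0), 4), round(G(1.0), 4))
```

Note that the logit and probit functions are symmetric about $z = 0$, where both equal 0.5, whereas the loglog function is asymmetric and equals $e^{-1}$ at $z = 0$.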

2.1 Effects of neglected heterogeneity on parameter estimation

By a simple application of the law of iterated expectations, it follows that

$$E(Y | x_1) = E_{x_2}[G(x_1\beta_1 + x_2\beta_2)] = \int_{\mathcal{X}_2} G(x_1\beta_1 + x_2\beta_2) f_{x_2}(x_2)\, dx_2, \tag{2}$$

where $\mathcal{X}_2$ and $f_{x_2}(x_2)$ denote, respectively, the sample space and the marginal distribution of
$x_2$. As, in general, $E(Y | x_1) \neq G(x_1\beta_1)$, naive estimation based on $G(x_1\beta_1)$ will not produce
consistent estimators for $\beta_1$. In fact, it seems that omission of $x_2$ will bias $\beta_1$ towards zero, as
shown by Yatchew and Griliches (1985) and Wooldridge (2002) for a particular binary probit
model, by Cramer (2007) for a peculiar binary logit model, and by Neuhaus and Jewell
(1993) for any generalized linear model based on a log concave density function (which is
the case of the binary and fractional logit, probit and loglog models) with a single observed
covariate. However, as we show next, retracing the arguments of Yatchew and Griliches
(1985), Wooldridge (2002) and Cramer (2007), it is not possible to prove formally that this
attenuation effect will be the consequence of neglected heterogeneity under any circumstances.
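
The inequality $E(Y | x_1) \neq G(x_1\beta_1)$ behind (2) can be checked numerically. The sketch below assumes, purely for illustration, a logit $G$ and a scalar omitted variable $x_2 \sim N(0, 1)$, and approximates the integral in (2) with a trapezoid rule; the function names, integration limits and step count are arbitrary choices made here, not taken from the paper.

```python
import math


def logit_link(z: float) -> float:
    """Numerically stable logistic cdf."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def normal_pdf(v: float) -> float:
    """Standard normal density, playing the role of f_{x2} in (2)."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def mean_over_x2(index1: float, beta2: float,
                 lo: float = -8.0, hi: float = 8.0, steps: int = 4000) -> float:
    """Trapezoid approximation to E_{x2}[G(index1 + beta2 * x2)], x2 ~ N(0, 1).

    index1 stands for the observed part of the index, x1 * beta1.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * logit_link(index1 + beta2 * v) * normal_pdf(v)
    return total * h


index1 = 1.0
print("G(x1*b1)             :", round(logit_link(index1), 4))
for beta2 in (0.0, 1.0, 2.0, 4.0):
    print("E(Y|x1) with beta2 =", beta2, ":", round(mean_over_x2(index1, beta2), 4))
```

With `beta2 = 0` the two quantities coincide; as `beta2` grows, the averaged response is pulled towards 0.5, which is the flattening that naive estimation picks up as a smaller coefficient.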
For simplicity, consider the following latent regression equation:

$$Y^* = x_1\alpha_1 + x_2\alpha_2 + u, \tag{3}$$

where $Y^*$ is not observed, $x_1$ includes a unit variable, $x_2$ contains a single explanatory variable
that is uncorrelated with $x_1$ and $u$ is a random disturbance that is uncorrelated with the
regressors. Instead of $Y^*$, we observe the binary variable $Y$, which takes the value 1 if $Y^* > 0$
and the value 0 otherwise. Assume that $u$ has mean zero and variance $\sigma_u^2$ and denote its
standardized distribution by $F$. When $x_2$ is observed, it follows that:

$$\begin{aligned}
E(Y | x_1, x_2) &= \Pr(Y = 1 | x_1, x_2) \\
&= \Pr(u > -x_1\alpha_1 - x_2\alpha_2 | x_1, x_2) \\
&= 1 - \Pr(u \leq -x_1\alpha_1 - x_2\alpha_2 | x_1, x_2) \\
&= 1 - F\left(-x_1\frac{\alpha_1}{\sigma_u} - x_2\frac{\alpha_2}{\sigma_u}\right) \\
&= G\left(x_1\frac{\alpha_1}{\sigma_u} + x_2\frac{\alpha_2}{\sigma_u}\right),
\end{aligned} \tag{4}$$

where $G(\cdot)$ is the complementary function of $F(\cdot)$. When $u$ has a symmetric distribution,
$G(\cdot) \equiv F(\cdot)$. As is well known, the parameters $\alpha_1$ and $\alpha_2$ are not separately identified
from $\sigma_u$. Let $\beta_1 = \alpha_1/\sigma_u$.
Assume now that $x_2$ is not observed and has mean zero and variance $\tau^2$. Then, the
composite error $u^* = x_2\alpha_2 + u$ is independent of $x_1$ and has variance $\sigma_{u^*}^2 = \alpha_2^2\tau^2 + \sigma_u^2$.
Denote the standardized distribution of $u^*$ by $F^*$. In this setting, it follows that:

$$\begin{aligned}
E(Y | x_1) &= \Pr(Y = 1 | x_1) \\
&= \Pr(u^* > -x_1\alpha_1 | x_1) \\
&= 1 - \Pr(u^* \leq -x_1\alpha_1 | x_1) \\
&= 1 - F^*\left(-x_1\frac{\alpha_1}{\sigma_{u^*}}\right) \\
&= G^*\left(x_1\frac{\alpha_1}{\sigma_{u^*}}\right).
\end{aligned} \tag{5}$$

Let $\beta_1^* = \alpha_1/\sigma_{u^*}$. Clearly, we cannot evaluate the effects of omitting $x_2$ on parameter
estimation unless we assume that $F = F^*$, i.e. the distribution of $x_2$ must be such that its
inclusion in the error term does not change the distribution of the disturbance. If we make
this assumption, then $G = G^*$ and, comparing (4) and (5), we find that

$$\beta_1^* = \frac{\sigma_u}{\sigma_{u^*}}\beta_1. \tag{6}$$

As $\sigma_{u^*} > \sigma_u$ (unless $\alpha_2 = 0$ or $\tau^2 = 0$), in general $|\beta_1^*| < |\beta_1|$, which implies that, under the
assumptions made, omission of an explanatory variable produces an attenuation bias in the
estimation of the parameters of the observed covariates.
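
The attenuation factor $\sigma_u/\sigma_{u^*}$ in (6) is simple to tabulate. A minimal sketch follows, with a hypothetical helper name and illustrative parameter values ($\tau = 1$, $\sigma_u = 1$) that are assumptions made here rather than settings taken from the paper.

```python
import math


def attenuation_factor(alpha2: float, tau: float, sigma_u: float = 1.0) -> float:
    """Return sigma_u / sigma_u*, where sigma_u*^2 = alpha2^2 * tau^2 + sigma_u^2.

    Under F = F*, equation (6) says beta1* equals this factor times beta1.
    """
    sigma_star = math.sqrt(alpha2 ** 2 * tau ** 2 + sigma_u ** 2)
    return sigma_u / sigma_star


# Illustrative values only: unit-variance omitted variable and unit error scale.
for alpha2 in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(alpha2, round(attenuation_factor(alpha2, tau=1.0), 4))
```

The factor equals 1 when either $\alpha_2 = 0$ or $\tau = 0$ and falls towards zero as the omitted term $x_2\alpha_2$ accounts for a larger share of the variance of the composite error.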
In this proof, the crucial assumption is that $F = F^*$. Actually, most of the papers cited
above made this assumption.³ Indeed, both Yatchew and Griliches (1985) and Wooldridge
(2002, 2005) assumed that both $u$ and $x_2$ are normally distributed, which implies that $u^*$ also
has a normal distribution. On the other hand, in his proof of the existence of an attenuation
bias in the logit model, Cramer (2007) did not specify the distribution of $x_2$ but assumed
that both $u$ and $u^*$ had a logistic distribution. However, in practice, it is extremely unlikely
that $F = F^*$. Moreover, for fractional regression models, which cannot be written in latent
form, no similar proof seems to be feasible. Therefore, in the Monte Carlo simulation study
carried out in the next section, we investigate whether equation (6), which applies only to
very specific binary regression models, also holds approximately for cases where $F \neq F^*$ and
for fractional regression models.

2.2 Effects of neglected heterogeneity on partial effects

For empirical analysis based on nonlinear models, the focus is not so much the analysis
of the magnitude of the regression coefficients, but consistent estimation of partial effects.
The two most usual forms of measuring partial effects in nonlinear models in applied work
are the average sample effect ($ASE$), which is the mean of the partial effects calculated
independently for each individual in the sample, and the population partial effect ($PPE$),
which is calculated for specific values of the covariates. As discussed in detail by Wooldridge
(2002), in the presence of neglected heterogeneity we are usually interested in calculating partial
effects averaged across the population distribution of the omitted variables.

³ The exception is Neuhaus and Jewell (1993). However, their geometric approach applies only to models
with a single observed covariate.
Consider again the model described by (1) and assume that $x_2$ is not observed. In this
setting, for the covariate $x_1$, those partial effects are defined by

$$ASE = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial E(Y | x_{1i})}{\partial x_1} = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial E_{x_2}[G(x_{1i}\beta_1 + x_2\beta_2)]}{\partial x_1} \tag{7}$$

and, considering evaluation at a given point $x_1 = \bar{x}_1$ (e.g. the mean of the observed
regressors), by:

$$PPE = \frac{\partial E(Y | x_1 = \bar{x}_1)}{\partial x_1} = \frac{\partial E_{x_2}[G(\bar{x}_1\beta_1 + x_2\beta_2)]}{\partial x_1}. \tag{8}$$

As both effects depend on $x_2$, the naive estimators

$$\widehat{ASE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial G(x_{1i}\hat{\beta}_1^n)}{\partial x_1} \tag{9}$$

and

$$\widehat{PPE} = \frac{\partial G(\bar{x}_1\hat{\beta}_1^n)}{\partial x_1}, \tag{10}$$

where $\hat{\beta}_1^n$ denotes the naive estimator of $\beta_1$, should be inconsistent, since $\hat{\beta}_1^n$ is inconsistent
and $G(\cdot)$ is in general misspecified. However, when $F = F^*$ both $\widehat{ASE}$ and $\widehat{PPE}$ provide
consistent estimates for $ASE$ and $PPE$, respectively. Indeed, consider again the example
discussed in the previous section. Using (2) and (5), we know that for binary regression
models:

$$E(Y | x_1) = E_{x_2}[G(x_1\beta_1 + x_2\beta_2)] = G^*\left(x_1\frac{\alpha_1}{\sigma_{u^*}}\right). \tag{11}$$

Hence,

$$PPE = \frac{\partial E_{x_2}[G(\bar{x}_1\beta_1 + x_2\beta_2)]}{\partial x_1} = \frac{\partial G^*(\bar{x}_1\beta_1^*)}{\partial x_1}. \tag{12}$$

Therefore, since $F = F^*$ implies $G = G^*$ and $\hat{\beta}_1^n$ converges to $\beta_1^* = \alpha_1/\sigma_{u^*}$, it follows that under
this assumption $\widehat{PPE}$ is a consistent estimator for $PPE$. A similar proof may be performed
for $ASE$.
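
To make the distinction between the two quantities concrete, the sketch below evaluates the plug-in formulas (9) and (10) for a logit model with an intercept and a single scalar covariate, where the derivative of $G(\beta_0 + \beta_1 x_1)$ with respect to $x_1$ is $\beta_1 G(\cdot)[1 - G(\cdot)]$. The function names and the toy data are hypothetical; in practice the coefficients passed in would be estimates from a fitted model.

```python
import math


def logit_link(z: float) -> float:
    """Numerically stable logistic cdf."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit_density(z: float) -> float:
    """Derivative of the logistic cdf: g(z) = G(z) * (1 - G(z))."""
    p = logit_link(z)
    return p * (1.0 - p)


def ase(x1: list, beta0: float, beta1: float) -> float:
    """Plug-in average sample effect: mean over the sample of beta1 * g(index_i)."""
    return sum(beta1 * logit_density(beta0 + beta1 * x) for x in x1) / len(x1)


def ppe(x_bar: float, beta0: float, beta1: float) -> float:
    """Plug-in population partial effect evaluated at the point x1 = x_bar."""
    return beta1 * logit_density(beta0 + beta1 * x_bar)


# Toy sample, chosen only to show that the two measures differ.
sample = [-1.0, 0.0, 1.0]
mean_x = sum(sample) / len(sample)
print("ASE:", round(ase(sample, 0.0, 1.0), 6))
print("PPE at mean:", round(ppe(mean_x, 0.0, 1.0), 6))
```

Because $g(\cdot)$ is nonlinear, the average of the individual effects generally differs from the effect evaluated at the average covariate value, so the two measures can react differently to the same misspecification.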
Wooldridge (2002), using similar arguments, was the first to demonstrate that in the
binary probit model with a normally distributed omitted variable the bias in the estimation
of $\beta_1$ does not carry over to the estimation of the $PPE$. Cramer (2007) showed that the same
conclusion holds for logit models in the particular case where the logit shape of $E(Y | x_1, x_2)$
of (1) is preserved in $E(Y | x_1)$ of (2).⁴ This last author also shows by simulation that, for
logit models, even in cases where $E(Y | x_1)$ deviates significantly from the logit functional
form assumed for $E(Y | x_1, x_2)$, the $ASE$ is relatively robust to neglected heterogeneity. In
Section 3 we investigate whether this robustness of naive partial effects may be extended to
other models and more general settings.

2.3 Effects of neglected heterogeneity on predicted outcomes

In this paper we also examine whether naive predictions of $E(Y | x_1)$ or $\Pr(Y | x_1)$, based
on the misspecified functional form $G(x_1\beta_1)$ evaluated at the inconsistent estimator $\hat{\beta}_1^n$, are
reliable. So far, the literature has been silent on this issue. However, outcome prediction,
besides being a relevant matter per se, is also the basis for the estimation of partial effects
in multi-part models, where the first stage usually requires the estimation of a binary model.
Because $x_2$ is not observed, the main interest is outcome prediction averaged across the
population distribution of the omitted variables, just as discussed above for partial effects.
From (11), it is clear that the same assumptions required above for consistent estimation of
partial effects are still needed: only if $F = F^*$ does $G(\bar{x}_1\hat{\beta}_1^n)$ consistently predict $E(Y | x_1 = \bar{x}_1)$.
Therefore, in a probit model with normally distributed heterogeneity or in the very special logit
model considered by Cramer (2007), neglected heterogeneity is not a problem for outcome
prediction either. In our Monte Carlo study we focus on cases where $F \neq F^*$.
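
The probit case with normal heterogeneity can be verified numerically. The sketch below assumes $\sigma_u = 1$ and a standard normal $x_2$, so that (11) reduces to the well-known identity $E_{x_2}[\Phi(a + \beta_2 x_2)] = \Phi(a/\sqrt{1 + \beta_2^2})$, with $a$ standing for $x_1\beta_1$. The function names and the trapezoid settings are choices made here for illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def averaged_probit(index1: float, beta2: float,
                    lo: float = -8.0, hi: float = 8.0, steps: int = 4000) -> float:
    """Trapezoid approximation to E_{x2}[Phi(index1 + beta2 * x2)], x2 ~ N(0, 1)."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * STD_NORMAL.cdf(index1 + beta2 * v) * STD_NORMAL.pdf(v)
    return total * h


def rescaled_probit(index1: float, beta2: float) -> float:
    """Probit evaluated at the attenuated index, as in (11) with sigma_u = 1."""
    return STD_NORMAL.cdf(index1 / math.sqrt(1.0 + beta2 ** 2))


for beta2 in (0.5, 2.0, 4.0):
    a = 1.0
    print(beta2, round(averaged_probit(a, beta2), 6), round(rescaled_probit(a, beta2), 6))
```

The two columns agree up to numerical integration error, which is why a naive probit that converges to the attenuated coefficient still predicts $E(Y | x_1)$ correctly in this special case.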

2.4 Effects of neglected heterogeneity on Wald tests

Finally, as testing the significance of the impact of a particular covariate on the outcome
variable is one of the main aims of any empirical study, we next evaluate the effects of
neglected heterogeneity on significance tests. In particular, we examine the application of
the widely used Wald test to assess the individual significance of the parameters associated
with the observed regressors in the presence of unobserved heterogeneity.

⁴ These findings are supported by earlier work by Stoker (1986), who showed that misspecification of
the functional form in single index models does not affect the estimation of average behavioral derivatives.

When there are no omitted variables, the Wald statistic for assessing $H_0: \beta_1 = 0$ is
given by $W = \hat{\beta}_1 \big/ \sqrt{\hat{V}(\hat{\beta}_1)}$, where $\hat{V}(\hat{\beta}_1)$ denotes an estimate of the variance of $\hat{\beta}_1$,
and converges to a standard normal distribution. For binary data, considering again model
(1), it follows that

$$\hat{V}(\hat{\beta}_1) = \left[\frac{1}{N}\sum_{i=1}^{N}\frac{g(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)^2\,\omega_i}{G(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)\left[1 - G(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)\right]}\right]^{-1}, \tag{13}$$

where $g(z) = \partial G(z)/\partial z$ and $\omega_i$ is the relevant element of $x_i'x_i$. Hence,

$$W = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\frac{\hat{\beta}_1^2\, g(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)^2\,\omega_i}{G(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)\left[1 - G(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)\right]}}. \tag{14}$$

When $x_2$ is omitted, the naive significance test of no effect of $x_1$ is given by

$$W^n = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\frac{(\hat{\beta}_1^n)^2\, g(x_{1i}\hat{\beta}_1^n)^2\,\omega_i}{G(x_{1i}\hat{\beta}_1^n)\left[1 - G(x_{1i}\hat{\beta}_1^n)\right]}}, \tag{15}$$

since we are assuming that $x_1$ and $x_2$ are independent.

Under the assumptions made previously, i.e. the distribution of the neglected heterogeneity
is such that $F = F^*$, there is a case, $\beta_1 = 0$, where neglected heterogeneity does not
originate any bias. Indeed, in such a case the existence of an attenuation bias implies that
both $\hat{\beta}_1$ and $\hat{\beta}_1^n$ are consistent estimators of $\beta_1$ and, therefore, the size of any significance
test should remain unaffected by unobserved heterogeneity; see also Lagakos and Schoenfeld
(1984), who discuss this issue in the context of score tests in proportional-hazards regression
models where the included variable for which the significance is tested is binary. Later on,
we will examine by simulation the consequences of neglected heterogeneity for the size of
Wald tests when $F \neq F^*$.
Lagakos and Schoenfeld (1984) also showed that the power of a score significance test for
a binary included variable may be substantially reduced in the presence of omitted covariates.
In our framework, we may also suspect that neglected heterogeneity causes some power
loss in the application of the Wald test. In fact, although no general power comparison
between $W$ and $W^n$ seems to be feasible, there is a special case, the logit model, where such a
comparison is straightforward, provided that we assume again that $F = F^*$. Indeed, for this
model it is well known that $g(z) = G(z)[1 - G(z)]$, which implies that statistics (14) and
(15) may be simplified to

$$W = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\hat{\beta}_1^2\, g(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)\,\omega_i} = \sqrt{\hat{\beta}_1}\sqrt{\frac{1}{N}\sum_{i=1}^{N}\frac{\partial G(x_{1i}\hat{\beta}_1 + x_{2i}\hat{\beta}_2)}{\partial x_1}\,\omega_i} \tag{16}$$

and

$$W^n = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\hat{\beta}_1^n)^2\, g(x_{1i}\hat{\beta}_1^n)\,\omega_i} = \sqrt{\hat{\beta}_1^n}\sqrt{\frac{1}{N}\sum_{i=1}^{N}\frac{\partial G(x_{1i}\hat{\beta}_1^n)}{\partial x_1}\,\omega_i}, \tag{17}$$

respectively, since $\partial G(z)/\partial x_1 = \beta_1 g(z)$. Thus, as both $\partial G(x_1\hat{\beta}_1 + x_2\hat{\beta}_2)/\partial x_1$ and
$\partial G(x_1\hat{\beta}_1^n)/\partial x_1$ converge to the same quantity, $\partial E_{x_2}[G(x_1\beta_1 + x_2\beta_2)]/\partial x_1$, see (12),
and $\hat{\beta}_1$ and $\hat{\beta}_1^n$ converge to $\beta_1$ and $\beta_1^*$, respectively, it follows from (6), (16) and (17) that

$$\frac{W^n}{W} = \sqrt{\frac{\hat{\beta}_1^n}{\hat{\beta}_1}} \to \sqrt{\frac{\sigma_u}{\sigma_{u^*}}}. \tag{18}$$

Hence, assuming $F = F^*$, in a logit model the naive Wald test $W^n$ is depressed relative to $W$
by the square root of the attenuation factor that relates $\hat{\beta}_1^n$ to $\hat{\beta}_1$.⁵ This implies that, in fact,
in small samples unobserved heterogeneity may reduce the power of Wald tests. However,
from (18) it is also evident that under neglected heterogeneity the Wald test retains its
consistency.
In the Monte Carlo study that follows we investigate the size and power properties of
naive Wald statistics under general patterns of heterogeneity.
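
As a worked illustration of how (6) and (18) relate, take the hypothetical values $\alpha_2 = 2$, $\tau^2 = 1$ and $\sigma_u = 1$. These numbers are assumptions chosen here for arithmetic convenience and are not part of the simulation design.

```latex
\sigma_{u^*} = \sqrt{\alpha_2^2\tau^2 + \sigma_u^2} = \sqrt{5} \approx 2.236, \qquad
\frac{\beta_1^*}{\beta_1} = \frac{\sigma_u}{\sigma_{u^*}} \approx 0.447, \qquad
\frac{W^n}{W} \to \sqrt{\frac{\sigma_u}{\sigma_{u^*}}} \approx 0.669 .
```

Under these values the coefficient shrinks by more than half, while the limit of the ratio of Wald statistics implies a reduction of about one third, so the test statistic is less affected than the coefficient itself.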

3 A Monte Carlo simulation study

In this section we present an extensive Monte Carlo simulation study for binary and fractional
logit, probit, and loglog models. All experiments bear on a simple two-variable equation,

$$E(Y | x_1, x_2) = G(\beta_0 + \beta_1 x_1 + \beta_2 x_2), \tag{19}$$

where $\beta_0 = 0$, $\beta_2$ ranges from 0 to 4 in steps of 0.25 and $\beta_1$ takes different values across the
different experiments. Our aim is to analyze the effects of omitting $x_2$ on the estimation
of $\beta_1$ and related statistics. Note that $\beta_2 = 0$ corresponds to the case where there is no
neglected heterogeneity and that larger values of $\beta_2$ imply a larger amount of heterogeneity.

⁵ Note that this implies that the same relationship holds for the ratio of the standard errors of $\hat{\beta}_1^n$ and $\hat{\beta}_1$.

In all experiments, $x_1$ is generated from a mixture of normal distributions, where the
variate is $N(-1, 1)$ with probability 0.7 and $N(2.333, 1)$ with probability 0.3, and $x_2$ is
generated from the $N(0, 1)$, $t_5$, exponential(1) and $\chi^2(1)$ distributions. Both variables are
scaled to have mean zero and variance one. The choice of an asymmetric distribution for $x_1$
was made to avoid the reflection property about the origin that would affect the sampling
distribution of the estimators of $\beta_1$; see Chesher and Peters (1994) and Chesher (1995) for a
discussion on the design of Monte Carlo simulation studies.
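
One way to generate covariates of this kind is sketched below with Python's standard library. The sketch standardizes each variate with its population mean and variance, which is an assumption made here; the text does not say whether population or sample moments were used. All function and constant names are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance

# Population moments of the raw mixture 0.7 * N(-1, 1) + 0.3 * N(2.333, 1).
MIX_MEAN = 0.7 * (-1.0) + 0.3 * 2.333
MIX_VAR = 0.7 * (1.0 + 1.0) + 0.3 * (1.0 + 2.333 ** 2) - MIX_MEAN ** 2


def draw_x1(rng: random.Random) -> float:
    """One draw of the asymmetric observed covariate, scaled to mean 0, variance 1."""
    raw = rng.gauss(-1.0, 1.0) if rng.random() < 0.7 else rng.gauss(2.333, 1.0)
    return (raw - MIX_MEAN) / math.sqrt(MIX_VAR)


def draw_x2(rng: random.Random, dist: str) -> float:
    """One draw of the omitted variable, scaled to mean 0 and variance 1."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    if dist == "t5":
        # Student t with 5 df: Z / sqrt(chi2_5 / 5); its variance is 5/3.
        chi2_5 = rng.gammavariate(2.5, 2.0)
        return rng.gauss(0.0, 1.0) / math.sqrt(chi2_5 / 5.0) / math.sqrt(5.0 / 3.0)
    if dist == "exponential":
        # Exponential(1) has mean 1 and variance 1.
        return rng.expovariate(1.0) - 1.0
    if dist == "chi2":
        # Chi-square(1) is a squared standard normal: mean 1, variance 2.
        return (rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0)
    raise ValueError(f"unknown distribution: {dist}")


rng = random.Random(12345)
x1 = [draw_x1(rng) for _ in range(20000)]
print("x1", round(fmean(x1), 3), round(pvariance(x1), 3))
for dist in ("normal", "t5", "exponential", "chi2"):
    x2 = [draw_x2(rng, dist) for _ in range(20000)]
    print(dist, round(fmean(x2), 3), round(pvariance(x2), 3))
```

The mixture weights and component means are those given above; the mixture mean is zero up to rounding of 2.333, so the centering step mainly matters for the exponential and chi-square variates.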
We generate $Y$ as a Bernoulli (binary case) or a beta (fractional case) variate with mean
given by the logit, probit, or loglog functional form and the shape parameter of the beta
distribution fixed at 1.⁶ In the former case, the parameters of interest are estimated by
maximum likelihood (ML), while in the latter we use the quasi-maximum likelihood (QML)
method, which are the standard ways of dealing with each type of data. In both cases, we
estimate full and curtailed versions of the models, i.e. models with and without $x_2$. The full
version of the model yields consistent estimators for all the quantities of interest and, hence,
it will be used as a reference to evaluate the consequences of neglected heterogeneity.
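
The two kinds of outcome can be drawn as in the sketch below. It assumes that the mean-dispersion parametrization of the beta distribution takes the common form with shape parameters $\mu\phi$ and $(1 - \mu)\phi$, where $\mu$ is the conditional mean and $\phi$ is the parameter fixed at 1; that reading, and the function names, are assumptions made here rather than details confirmed by the text.

```python
import random
from statistics import fmean


def draw_binary(mu: float, rng: random.Random) -> int:
    """Bernoulli draw with success probability mu."""
    return 1 if rng.random() < mu else 0


def draw_fractional(mu: float, rng: random.Random, phi: float = 1.0) -> float:
    """Beta draw on (0, 1) with mean mu, using shapes (mu * phi, (1 - mu) * phi).

    Assumed parametrization: the variance is then mu * (1 - mu) / (1 + phi).
    Requires 0 < mu < 1 so that both shape parameters are positive.
    """
    return rng.betavariate(mu * phi, (1.0 - mu) * phi)


rng = random.Random(2009)
mu = 0.3  # stands for G(beta0 + beta1 * x1 + beta2 * x2) at some covariate values
binary = [draw_binary(mu, rng) for _ in range(20000)]
fractional = [draw_fractional(mu, rng) for _ in range(20000)]
print("binary mean    :", round(fmean(binary), 3))
print("fractional mean:", round(fmean(fractional), 3))
```

Both generators share the same conditional mean, which is why the same functional form $G(\cdot)$ can be fitted by ML in the binary case and by QML in the fractional case.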
All experiments were repeated 5000 times using the statistical package R and, given the
substantial amount of results produced in each experiment, we summarized them in figures.
In most cases (the only exceptions are the experiments regarding the Wald tests), given the
similarity of the results obtained, only those relative to binary models are reported.⁷ Apart
from the last experiment, where several sample sizes were considered, in all the remaining
cases the sample size is $N = 200$.

3.1 Attenuation bias in the parameter estimates

Under some special conditions, we proved above that neglected heterogeneity imposes an
attenuation bias on naive estimation of the parameters of the observed regressors.
As in this Monte Carlo study we consider only one observed covariate, according to the
findings of Neuhaus and Jewell (1993), we know for sure that an attenuation bias will be
present in all the models simulated. However, this bias may differ substantially from that
predicted by (6), since the assumptions made in its derivation are not met in 11 out of the
12 models simulated. Therefore, the main aim of our first set of experiments is to examine
whether equation (6) measures appropriately the extent of the bias caused by neglected
heterogeneity when $F \neq F^*$. Figure 1 displays the values of the ratio $\hat{\beta}_1^n/\beta_1$ for two different
values of $\beta_1$ (-1 and 1) for each one of the 17 values of $\beta_2$ simulated. In this figure we also
display (solid line) the value of the ratio $\beta_1^*/\beta_1$, obtained from (6).

⁶ See inter alia Ramalho, Ramalho and Murteira (2009) for the mean-dispersion parametrization of the
beta distribution used in the generation of data.
⁷ Full results are available from the authors upon request.

Figure 1 about here

Clearly, in all cases, $\hat{\beta}_1^n$ is depressed towards zero, its absolute bias increasing as $\beta_2$ (i.e.
the extent of heterogeneity) increases. Equation (6) often gives a very good approximation
to the attenuation bias (e.g. loglog and, obviously, probit models with normal-distributed
heterogeneity and logit model with $t_5$-distributed heterogeneity) but in some cases there
are some important deviations. For example, when $x_2$ has an exponential or chi-square
distribution $\hat{\beta}_1^n$ is not, in general, as biased as predicted by (6) in the logit and probit
models, while for the loglog model the attenuation effect is amplified relative to (6). Note
also that in some cases the actual bias depends on the value of $\beta_1$, while (6) is not a function
of that parameter. Therefore, as the extent of that bias is not perfectly approximated by
(6) in many cases, next we investigate the consequences of this fact for the calculation of
marginal effects and prediction of outcomes when $F \neq F^*$.
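
A scaled-down version of this kind of experiment can be sketched in pure Python for one configuration: a binary logit with normal heterogeneity, where the curtailed model is fitted by Newton-Raphson maximum likelihood. This is an illustrative sketch and not the authors' code; the number of replications, the seeds and all function names are assumptions, and the population-moment standardization of $x_1$ is also assumed.

```python
import math
import random

# Population moments of the raw mixture 0.7 * N(-1, 1) + 0.3 * N(2.333, 1).
MIX_MEAN = 0.7 * (-1.0) + 0.3 * 2.333
MIX_SD = math.sqrt(0.7 * 2.0 + 0.3 * (1.0 + 2.333 ** 2) - MIX_MEAN ** 2)


def logit_link(z):
    """Numerically stable logistic cdf."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def draw_x1(rng):
    """Standardized draw from the normal mixture used for the observed covariate."""
    raw = rng.gauss(-1.0, 1.0) if rng.random() < 0.7 else rng.gauss(2.333, 1.0)
    return (raw - MIX_MEAN) / MIX_SD


def fit_logit(x, y, iters=25, tol=1e-10):
    """Newton-Raphson ML for Pr(y = 1 | x) = G(b0 + b1 * x); returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logit_link(b0 + b1 * xi)
            resid, w = yi - p, p * (1.0 - p)
            g0 += resid            # score with respect to b0
            g1 += resid * xi       # score with respect to b1
            h00 += w               # elements of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def mean_naive_ratio(beta1, beta2, reps=100, n=200, seed=0):
    """Average over replications of (naive slope estimate) / beta1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        x1 = [draw_x1(rng) for _ in range(n)]
        y = []
        for xi in x1:
            x2 = rng.gauss(0.0, 1.0)  # omitted variable, used only to generate y
            p = logit_link(beta1 * xi + beta2 * x2)
            y.append(1 if rng.random() < p else 0)
        total += fit_logit(x1, y)[1]  # curtailed model: x2 is left out
    return total / reps / beta1


for beta2 in (0.0, 1.0, 2.0):
    print("beta2 =", beta2, "mean ratio:", round(mean_naive_ratio(1.0, beta2), 3))
```

With `beta2 = 0` the mean ratio should be close to one, and it should fall as `beta2` increases, which is the attenuation pattern described above for the curtailed model.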

3.2 Partial effects

Using the same setup as in the previous section, in Figure 2 we display the mean across the
replications of the $ASE$ estimated for the case $\beta_1 = 1$. For the curtailed model we estimate
the $ASE$ as in (9), while for the full model we use (7), where the expectation $E(Y | x_1)$ is
calculated by integration as in (2) with $f_{x_2}(x_2)$ replaced by the density used to generate $x_2$.
This figure shows clearly that in the logit case ML estimation based on the full (MLf) or the
curtailed (MLc) equations leads to very similar results (the largest bias is 3.6% for $\beta_2 = 2.75$
in the chi-square case). Thus, as already noted by Cramer (2007), logit analysis of the $ASE$
is very robust to neglected heterogeneity.

Figure 2 about here

In the probit model, considering a symmetrically distributed omitted variable, the $ASE$s
estimated for each equation are also almost identical, while for asymmetric $x_2$ the deviations
between them are no longer insignificant, achieving a maximum of 7.7% ($\beta_2 = 2.25$,
chi-square case). With regard to the loglog model, the consequences of neglected heterogeneity
are somewhat similar to those found for the probit model: while for symmetric $x_2$ the bias is
minimal (always below 3%), for asymmetric unobserved heterogeneity the $ASE$ is often
somewhat overestimated (maximum bias: 8.3% for $\beta_2 = 1.75$ in the chi-square case).
Finally, note that the bias increases with the level of unobserved heterogeneity but only
up to a certain point, which may be explained by the little importance of $x_1$ in the variation
of $E(Y | x_1)$ when $\beta_2$ is very large (the marginal effect of $x_1$ tends to zero as $\beta_2$ increases).
For example, when $\beta_2 = 4$ the weight of the variance of the term $\beta_2 x_2$ in the total variance
of the index $(\beta_0 + \beta_1 x_1 + \beta_2 x_2)$ is 94%.
As regards the $PPE$s, we calculated them as in (10), for the curtailed equation,
and (8), for the full model. In both cases, the $PPE$ was evaluated at the mean and the
$\{0, 0.02, 0.04, \ldots, 0.98, 1\}$ quantiles of $x_1$. Figure 3 shows the results obtained for $\beta_1 = 1$ and
$\beta_2 = 0.5$, 1, 2 and 4 when $x_2$ is generated according to a normal and a chi-square distribution.
The dotted line indicates the mean of $x_1$. For cases where $x_2$ is normal-distributed, both the
logit and the probit estimators are clearly unaffected by neglected heterogeneity. However,
in the chi-square case, while for small amounts of heterogeneity ($\beta_2 = 0.5$) the bias in the
estimation of $PPE$s is not that relevant (maximum bias of 2.0% for the logit and 6.5% for the
probit), for large amounts of heterogeneity ($\beta_2 = 4$) the bias may reach a maximum value
of 28.9% (logit model) or 50.0% (probit model), even when the analysis is restricted to the
0.05-0.95 quantile range. For the loglog model, the bias is in general substantial, reaching a
maximum of 17.4% for normal-distributed heterogeneity and 82.6% for the chi-square case,
in both cases for $\beta_2 = 2$ and again restricting the analysis to the 0.05-0.95 quantile range.

Figure 3 about here

When we consider the evaluation of the $PPE$ at the mean of $x_1$, the bias of the various
estimators is much smaller. For example, for the symmetric $x_2$ case, the maximum bias in
the loglog model is now 4.2%. Nevertheless, for the chi-square case the bias may still be
substantial: the maximum bias for the logit, probit and loglog models is, respectively, 9.8%,
21.4% and 25.1%.
Overall, the results obtained in this section allow us to draw three main conclusions.
First, the logit model produces more robust estimates of partial effects than probit or loglog
models. Second, when our interest is the calculation of average partial effects, which is usually
the case in empirical work (in most cases, practitioners report only average partial effects), it
is preferable to compute $ASE$s instead of $PPE$s evaluated at the mean of the regressors, since
the former appear to be much more robust to neglected heterogeneity.⁸ Finally, under
neglected heterogeneity, computation of $PPE$s for an individual with specific characteristics
may be very unreliable.

3.3 Predicted outcomes

Figure 4 illustrates the effects of the omission of $x_2$ on the prediction of $E(y \mid x_1)$ through
a simulation design similar to that used for the PPEs. For the full model the prediction is
based on (19), while for the curtailed equation we used the naive estimator $G(\hat{\alpha}_0 + \hat{\alpha}_1 x_1)$.
Clearly, unobserved heterogeneity is relatively harmless in logit models: the maximum
bias in the 0.05-0.95 quantile range is 5.0% ($\alpha_2 = 2$). The probit model is also robust
to the omission of variables when the distribution of $x_2$ is symmetric, but displays more
important distortions in cases where $x_2$ is asymmetric (maximum bias: 15.8% for $\alpha_2 = 2$).
Finally, the loglog model is relatively robust to unobserved heterogeneity when $x_2$ has a
normal distribution but displays some bias in the other case, reaching a maximum bias of
23.7% ($\alpha_2 = 2$). Hence, for outcome prediction, the logit is the only model for which
unobserved heterogeneity resulting from the omission of independent explanatory variables
does not seem to be a relevant issue. Nevertheless, note that our results suggest that when
$G \neq G^*$, the consequences of using a misspecified model $G^*$ are much more serious for the
calculation of PPEs (which requires the computation of derivatives of $G^*$) than for outcome prediction.
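
The comparison behind Figure 4 can be mimicked in a few lines of Python. The sketch below is only an illustration under assumed settings: a binary logit, standard normal $x_1$ and $x_2$, $\alpha_0 = 0$, $\alpha_1 = 1$, $\alpha_2 = 2$, and the hypothetical helper names `fit_curtailed_logit`, `true_mean` and `compare_predictions`. It also replaces the estimated full-model prediction with the true conditional mean $E(y \mid x_1)$, obtained by numerically integrating $x_2$ out, so it is not the estimator based on (19).

```python
import math
import random
from statistics import NormalDist


def logistic(z):
    """Logistic cdf, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_curtailed_logit(x, y, max_iter=25, tol=1e-10):
    """Newton-Raphson ML for a logit of y on a constant and x only."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        s0 = (h11 * g0 - h01 * g1) / det  # Newton step: information^{-1} * score
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1


def true_mean(x1, a0, a1, a2, grid=2001, width=8.0):
    """E(y | x1) with x2 ~ N(0, 1) integrated out by the midpoint rule."""
    h = 2.0 * width / grid
    total = 0.0
    for i in range(grid):
        z = -width + (i + 0.5) * h
        total += logistic(a0 + a1 * x1 + a2 * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def compare_predictions(n=20000, a0=0.0, a1=1.0, a2=2.0, seed=7):
    """Return (quantile, naive prediction, true conditional mean) triples."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # never shown to the estimator
    y = [1.0 if rng.random() < logistic(a0 + a1 * u + a2 * v) else 0.0
         for u, v in zip(x1, x2)]
    b0, b1 = fit_curtailed_logit(x1, y)
    rows = []
    for q in (0.05, 0.25, 0.50, 0.75, 0.95):
        xq = NormalDist().inv_cdf(q)  # population quantile of x1
        rows.append((q, logistic(b0 + b1 * xq), true_mean(xq, a0, a1, a2)))
    return rows


if __name__ == "__main__":
    for q, naive, truth in compare_predictions():
        print(f"quantile {q:.2f}: naive {naive:.3f}  true {truth:.3f}")
```

With normal heterogeneity the two columns should be close over the 0.05-0.95 quantile range, in line with the robustness of logit predictions reported above; other link functions or skewed heterogeneity would require changing `logistic` and the draw for `x2`.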

⁸ A similar finding was reported by Ramalho, Ramalho and Murteira (2009), who found that computation
of ASEs is relatively robust to functional form misspecification in the framework of fractional regression
models, while estimation of PPEs evaluated at the mean of the covariates may be severely biased.

Figure 4 about here

3.4 Size and power of Wald tests for the significance of observed
regressors

In our final set of experiments we investigate the size and power of naive (Q)ML-based Wald
tests for assessing the statistical significance of observed regressors, i.e. we examine their
ability to test the null hypothesis $H_0: \alpha_1 = 0$ both when it is true and when it is false. Figures
5-6 display the percentage of rejections of $H_0$ for a nominal level of 5% when this hypothesis
is indeed true (the horizontal lines represent the limits of a 95% confidence interval for the
nominal size). This percentage is very similar for the curtailed and full models in the binary
case, being always very close to the nominal level of 5%. For fractional data, where we use
robust estimation of standard errors since we are performing QML estimation, the empirical
size of the Wald test based on the naive estimator is even closer to the nominal size than
that based on the full equation. Therefore, these results show clearly that the size properties
of the Wald test for $\alpha_1 = 0$ are very robust to the presence of neglected heterogeneity.

Figure 5 about here

Figure 6 about here

With regard to the power properties of the Wald test, Figures 7-8 illustrate a very different
scenario. In this case, we observe an important decline in the percentage of rejections of the
false $H_0$ as the level of heterogeneity increases. This decline seems to be more substantial, in
relative terms, in the probit and loglog models, in cases where $\alpha_1$ is larger, and with fractional
data.
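
The size and power experiments can be sketched with a small Monte Carlo loop. The Python code below is an illustration only: the binary logit design, the standard normal $x_1$ and $x_2$, the number of replications and the function names (`wald_stat`, `rejection_rate`) are assumptions chosen for this sketch and not the simulation design of this paper, and the Wald test uses the usual ML information-matrix variance of the curtailed logit.

```python
import math
import random


def logistic(z):
    """Logistic cdf, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def wald_stat(x, y, max_iter=25, tol=1e-10):
    """Naive Wald t-ratio for the slope in a logit of y on a constant and x."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if max(abs(s0), abs(s1)) < tol:
            break
    # Slope variance = (2,2) element of the inverse information matrix,
    # taken from the last evaluation (negligibly different at convergence).
    return b1 / math.sqrt(h00 / det)


def rejection_rate(a1, a2, n=200, reps=200, seed=0, crit=1.96):
    """Share of replications in which the naive Wald test rejects a1 = 0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]  # omitted regressor
        y = [1.0 if rng.random() < logistic(a1 * u + a2 * v) else 0.0
             for u, v in zip(x1, x2)]
        if abs(wald_stat(x1, y)) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print("size,  a2 = 4:", rejection_rate(0.0, 4.0))
    print("power, a2 = 0:", rejection_rate(0.3, 0.0))
    print("power, a2 = 4:", rejection_rate(0.3, 4.0))
```

Calling `rejection_rate` with `a1=0.0` estimates the empirical size, while a nonzero `a1` estimates power; comparing the last two calls shows how much power is lost when the omitted regressor has a large coefficient.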

Figure 7 about here

Figure 8 about here

In order to check whether equation (18), which was derived for binary logit models under
the assumption $G = G^*$, also provides a good approximation for other models, in Figures
9-10 we represent three $W^n/W$ ratios: that given by (18) (solid line) and two others that are
given by the mean across replications of that ratio for the two values of $\alpha_1$ simulated.

Figure 9 about here
Figure 10 about here

For binary models, see Figure 9, equation (18) seems to be a reasonable approximation.
In fact, comparing Figures 1 and 9, a very similar pattern was obtained. In contrast, for
fractional regression models the attenuation bias in the estimation of the Wald statistic is
much larger, which explains why the loss of power detected in Figure 8 is more substantial for
these models. Clearly, equation (18) is not a good approximation when robust sandwich-type
variance estimators are used.
A further investigation of the power of naive Wald tests was conducted. Only for the
chi-square distribution and for $\alpha_1 = 0.15$, which led to the poorest power
performance of all the cases illustrated in the previous figures, we ran experiments for
$N \in \{200, 500, 1000, 2500, 5000\}$. Figure 11 shows that in all cases the power of the test increases
substantially as $N$ increases, which confirms that the Wald test is still consistent in the presence
of omitted variables, as discussed in Section 2. Given these results, it seems that we can trust
the outcome of a naive Wald test that reveals that a given explanatory variable is significant.
The opposite conclusion may be simply the consequence of the omission of relevant variables,
unless the sample size is large and/or the amount of heterogeneity is small.
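
The way power recovers with the sample size can be explored with a sketch of the same kind. The Python code below rests on assumptions that are not taken from this paper: a binary logit, a standard normal $x_1$, heterogeneity drawn as a chi-square variate with one degree of freedom standardised to mean 0 and variance 1, $\alpha_2 = 2$, a small number of replications, a reduced set of sample sizes, and the hypothetical function names `naive_wald` and `power_by_n`.

```python
import math
import random


def logistic(z):
    """Logistic cdf, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def naive_wald(x, y, max_iter=25, tol=1e-10):
    """Wald t-ratio for the slope in a logit of y on a constant and x only."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b1 / math.sqrt(h00 / det)  # slope over its ML standard error


def power_by_n(sizes=(200, 1000, 4000), a1=0.15, a2=2.0,
               reps=60, seed=3, crit=1.96):
    """Empirical rejection rate of a1 = 0 for each sample size in `sizes`."""
    rng = random.Random(seed)
    power = {}
    for n in sizes:
        hits = 0
        for _ in range(reps):
            x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
            # Assumed heterogeneity: chi-square(1), centred and scaled.
            x2 = [(rng.gauss(0.0, 1.0) ** 2 - 1.0) / math.sqrt(2.0)
                  for _ in range(n)]
            y = [1.0 if rng.random() < logistic(a1 * u + a2 * v) else 0.0
                 for u, v in zip(x1, x2)]
            if abs(naive_wald(x1, y)) > crit:
                hits += 1
        power[n] = hits / reps
    return power


if __name__ == "__main__":
    for n, p in power_by_n().items():
        print(f"N = {n:5d}: empirical power {p:.2f}")
```

With so few replications the individual rejection rates are noisy, but the increase from the smallest to the largest sample size should be clearly visible; raising `reps` and extending `sizes` gives a smoother picture at a higher computational cost.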

Figure 11 about here

4 Conclusion
It is well known that the omission of orthogonal relevant variables in nonlinear models causes
inconsistency in the estimation of the parameters of interest associated with the included
regressors. However, some recent work on the probit and logit models by Wooldridge (2002,
2005) and Cramer (2003, 2007), respectively, shows that, in some cases, the bias does not
carry over to the marginal effect of those regressors on the outcome and that, hence, neglected
heterogeneity may not really be an issue in, at least, binary logit and probit models. In this
paper, we demonstrated analytically that, under similar assumptions to those imposed by
those authors, their results can be extended to any other model for binary data. Moreover,
we showed that, while other features like outcome prediction are also robust to neglected
heterogeneity, Wald tests for the individual significance of an included covariate are biased
towards the non-rejection of the null hypothesis of non-significance.
Given that the theoretical analysis undertaken in this paper requires strong assumptions,
we also performed an extensive Monte Carlo simulation study considering more general forms
of heterogeneity. We found that, in general, unobserved heterogeneity independent of the
included covariates: (i) produces an attenuation bias in the estimation of regression
coefficients; (ii) is relatively innocuous for logit estimation of the ASE, while in the probit and
loglog cases there may be important biases in its estimation; (iii) has much more destructive
effects on the estimation of PPEs than of ASEs; (iv) leaves the prediction of outcomes
substantially unaffected only in logit models; and (v) is innocuous for the size and the consistency
of Wald tests for the significance of the observed regressors but, in small samples, reduces
their power substantially.
Overall, our results imply that unobserved heterogeneity is not a relevant problem in
any of the nonlinear models considered in this paper if the aim of the analysis is simply
obtaining the direction of the partial effects of the covariates. In addition, in the logit
case, neglected heterogeneity is also relatively innocuous for outcome prediction and the
calculation of ASEs.⁹ These are, we think, very comforting and useful results for practitioners
since the usual ways of dealing with unobserved heterogeneity are not entirely satisfactory,
requiring strong distributional assumptions for the unobservables which often give rise to
a model that does not describe properly the data, or are too complex to be widely used
by applied economists, often requiring the utilization of nonparametric techniques which
frequently cannot be computed without substantial programming experience.
Another important implication of our results is that it is extremely important to test
the general specification of the functional form adopted for the model.¹⁰ Indeed, if the test
indicates that the functional form of our binary regression model is correctly specified (which
means that $G = G^*$), then we know that calculation of partial effects and outcome prediction
are not affected by the presence of neglected heterogeneity. In such a case, the only relevant
problem that remains is the poor power of the Wald test in small samples. However, if all
variables are statistically significant or the sample is very large, then even that is not really
a problem.

⁹ Note that this unique property of robustness of the logit model is not totally unexpected. In fact, this
model is also robust to other problems, like endogenous stratification and nonignorable missing data, that,
in general, cause the inconsistency of the estimators based on other models; see, respectively, Hsieh, Manski
and McFadden (1985) and Ramalho and Smith (2003).

¹⁰ For a comparison of various functional form tests for binary and fractional regression models see Ramalho,
Ramalho and Murteira (2009) and Ramalho and Ramalho (2009).

References
Chesher, A. (1995), “A mirror image invariance for m-estimators”, Econometrica, 63(1),
207-211.

Chesher, A. and Peters, S. (1994), “Symmetry, regression design, and sampling distributions”,
Econometric Theory, 10, 116-129.

Cramer, J.S. (2003), Logit Models from Economics and Other Fields, Cambridge, Cambridge
University Press.

Cramer, J.S. (2007), “Robustness of logit analysis: unobserved heterogeneity and misspecified
disturbances”, Oxford Bulletin of Economics and Statistics, 69(4), 545-555.

Gourieroux, C. (2000), Econometrics of Qualitative Dependent Variables, Cambridge,
Cambridge University Press.

Hsieh, D.A., Manski, C.F. and McFadden, D. (1985), “Estimation of response probabilities
from augmented retrospective observations”, Journal of the American Statistical
Association, 80, 651-662.

Lagakos, S.W. and Schoenfeld, D.A. (1984), “Properties of proportional-hazards score tests
under misspecified regression models”, Biometrics, 40, 1037-1048.

Lee, L.F. (1982), “Specification error in multinomial logit models”, Journal of Econometrics,
20, 197-209.

Neuhaus, J.M. and Jewell, N.P. (1993), “A geometric approach to assess bias due to omitted
covariates in generalized linear models”, Biometrika, 80, 807-815.

Papke, L.E. and Wooldridge, J.M. (1996), “Econometric methods for fractional response
variables with an application to 401(k) plan participation rates”, Journal of Applied
Econometrics, 11(6), 619-632.

Ramalho, E.A. and Ramalho, J.J.S. (2009), “Alternative versions of the RESET test for
binary response index models: a comparative study”, mimeo.

Ramalho, E.A., Ramalho, J.J.S. and Murteira, J. (2009), “Alternative estimating and testing
empirical strategies for fractional regression models”, Journal of Economic Surveys,
forthcoming.

Ramalho, E.A. and Smith, R.J. (2003), “Discrete Choice Nonresponse”, Centre for Microdata
Methods and Practice, I.F.S. and U.C.L. http://cemmap.ifs.org.uk/wps/cwp0307.pdf

Stoker, T.M. (1986), “Consistent estimation of scaled coefficients”, Econometrica, 54(6),
1461-1481.

Wooldridge, J.M. (2002), Econometric Analysis of Cross Section and Panel Data, Cambridge,
MIT Press.

Wooldridge, J.M. (2005), “Unobserved heterogeneity and estimation of average partial effects”,
in D.W.K. Andrews and J.H. Stock (eds.) Identification and Inference for Econometric
Models, Cambridge, Cambridge University Press, 27-55.

Yatchew, A. and Griliches, Z. (1985), “Specification error in probit models”, Review of
Economics and Statistics, 67(1), 134-139.

Figure 1: Attenuation bias of parameter estimates in binary regression models
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: $\hat{\alpha}_1^n / \alpha_1$; horizontal axis: $\alpha_2$ (0 to 4). Lines: theoretical, $\alpha_1 = -1$, $\alpha_1 = 1$.]

Figure 2: Average sample effects for binary regression models ($\alpha_1 = 1$)
[Panels: normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: ASE; horizontal axis: $\alpha_2$ (0 to 4). Lines: MLc and MLf for the probit, loglog and logit models.]

Figure 3: Population partial effects for binary regression models ($\alpha_1 = 1$)
[Panels: logit, probit and loglog models, each for normal and chi-square distributed heterogeneity. Vertical axis: PPE; horizontal axis: X1 quantiles (0 to 1). Lines: MLc and MLf for $\alpha_2 = 0.5, 1, 2, 4$.]

Figure 4: Predicted outcomes for binary regression models ($\alpha_1 = 1$)
[Panels: logit, probit and loglog models, each for normal and chi-square distributed heterogeneity. Vertical axis: predicted outcomes; horizontal axis: X1 quantiles (0 to 1). Lines: MLc and MLf for $\alpha_2 = 0.5, 2, 4$.]

Figure 5: Empirical size for binary regression models (N = 200)
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: empirical size; horizontal axis: $\alpha_2$ (0 to 4). Lines: MLc, MLf.]

Figure 6: Empirical size for fractional regression models (N = 200)
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: empirical size; horizontal axis: $\alpha_2$ (0 to 4). Lines: QMLc, QMLf.]

Figure 7: Empirical power for binary regression models (N = 200)
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: empirical power; horizontal axis: $\alpha_2$ (0 to 4). Lines: MLc and MLf for $\alpha_1 = 0.15$ and $\alpha_1 = 0.3$.]

Figure 8: Empirical power for fractional regression models (N = 200)
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: empirical power; horizontal axis: $\alpha_2$ (0 to 4). Lines: QMLc and QMLf for $\alpha_1 = 0.15$ and $\alpha_1 = 0.3$.]

Figure 9: Attenuation bias of Wald statistics in binary regression models
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: $W^n / W$; horizontal axis: $\alpha_2$ (0 to 4). Lines: theoretical, $\alpha_1 = 0.15$, $\alpha_1 = 0.3$.]

Figure 10: Attenuation bias of Wald statistics in fractional regression models
[Panels: logit, probit and loglog models, each for normal, t(5), exponential and chi-square distributed heterogeneity. Vertical axis: $W^n / W$; horizontal axis: $\alpha_2$ (0 to 4). Lines: theoretical, $\alpha_1 = 0.15$, $\alpha_1 = 0.3$.]

Figure 11: Empirical power - different sample sizes (chi-square-distributed heterogeneity; $\alpha_1 = 0.15$)
[Panels: binary and fractional regression models, each for the logit, probit and loglog models. Vertical axis: empirical power; horizontal axis: $\alpha_2$ (0 to 4). Lines: N = 200, 500, 1000, 2500, 5000.]
