ZEW
Zentrum für Europäische
Wirtschaftsforschung GmbH
Centre for European
Economic Research
Discussion Paper No. 02-41
ftp://ftp.zew.de/pub/zew-docs/dp/dp0241.pdf
Discussion Papers are intended to make results of ZEW research promptly available to other
economists in order to encourage discussion and suggestions for revisions. The authors are solely
responsible for the contents which do not necessarily represent the opinion of the ZEW.
Nontechnical Summary
Many new inventions in the field of engineering sciences are based on the
knowledge of structures in nature. These highly efficient structures are
the results of an optimization process called evolution. Evolution is the
strategy used by nature to optimize the adaptation of life to its environment.
July 2002
Abstract
This paper discusses a tool for the optimization of econometric models
based on genetic algorithms. First, we briefly describe the concept of this
optimization technique. Then, we explain the design of a specifically
developed algorithm and apply it to a difficult econometric problem, the
semiparametric estimation of a censored regression model. We carry out
some Monte Carlo simulations and compare the genetic algorithm with
another technique, the iterative linear programming algorithm, to run the
censored least absolute deviation estimator. It turns out that both
algorithms lead to similar results in this case, but that the proposed
method is computationally more stable than its competitor.
Helpful comments by François Laisney and James L. Powell are gratefully
acknowledged. Moreover, we would like to thank the participants of the
Econometrics Lunch of the University of California at Berkeley, where an
earlier version of this paper was presented.
1 Introduction
Many new inventions in the field of engineering sciences are based on the
knowledge of structures in nature. These highly efficient structures are the
results of an optimization process called evolution. Evolution is the strategy
used by nature to optimize the adaptation of life to its environment.
The basic principles are crossover, mutation, and selection. Evolution
theory explains these principles on the basis of whole populations. Genetic
science takes a much closer look at the individual aspects of evolution: the
genes. It explains the meaning of crossover and mutation at a molecular
level. The knowledge of both worlds is combined in genetic algorithms to use
the problem solving capabilities of evolution for a large number of scientific
and engineering problems or models. Genetic algorithms (GAs) simulate
evolution for a population of candidate solutions in an artificial environment
representing a specific problem. Examples are the optimization of movement
patterns for artificial lifeforms, the determination of weights for neural
networks or, not surprisingly, the emergence of markets in the economy.
These by no means exhaustive examples demonstrate the versatility of GAs.
insolvency risk of 3,840 industrial Italian companies. Varetto compares
linear discriminant analysis, as a traditional statistical methodology for
bankruptcy classification and prediction, with a genetic algorithm. He
concludes that genetic algorithms are a very effective instrument for
insolvency diagnosis, although the results obtained with linear discriminant
analysis were superior to those obtained with the GA. Varetto notes that
the results of the GA were obtained in less time and with more limited
contributions from the financial analyst than the linear discriminant analysis.
The next section explains the concept of GAs and introduces basic
terminology used in evolutionary programming. Section 3 describes the
design of a genetic algorithm which we have developed for the estimation
of econometric models. In Section 4, we apply this GA to a specific
econometric model and carry out simulations to study its performance.
2 Genetic algorithms
The concept of a genetic algorithm is very appealingly described by Cooper
(2000). He calls it a concept of partial imitation and refers to an approach
which is familiar to every economist: "[...] an effective method for creating
innovative new models is to combine the successful features of two or
more existing models" (Cooper, 2000: 403). That is exactly what Nature
does and what is known as evolution. Learning from Nature and finding
improved elements of a complex space is incorporated into a formal method
of optimization called the genetic algorithm.
2.1 Terminology
The terminology of GAs is mainly borrowed from biology and evolution
theory to underline the analogies. Each term represents an artificial
implementation of a biological or evolutionary concept, though on a much
simpler level. Because there are always different views on the same item,
a multitude of realizations exists, some inspired more by biology and some
more by evolution theory.
Reproduction is another basic principle of evolution. In GAs the fertility
of a candidate solution is determined by its relative fitness. A fitter
solution will reproduce more often than a less fit entity.
Mutation is a very important evolutionary aspect for GAs. While crossover
can produce many new variants of existing solutions, mutation has the power
to produce completely new solutions. It is randomly applied after crossover
to mutate one or more genes in an offspring. GAs using binary encoding
just apply the logical NOT operator at randomly chosen bit positions in
the string. The mutation of randomly selected real-valued traits is carried
out by multiplication with a random factor within a specific interval,
termed for further reference the radiation level.
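These two mutation schemes can be sketched as follows (a minimal illustration; the function names and the NumPy representation are ours, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def mutate_binary(bits, mu):
    """Flip each bit (logical NOT) independently with probability mu."""
    flip = rng.random(bits.shape) < mu
    return np.where(flip, 1 - bits, bits)

def mutate_real(traits, mu, radiation):
    """Multiply randomly selected traits by a random factor drawn from
    [-radiation, +radiation], the interval defined by the radiation level."""
    hit = rng.random(traits.shape) < mu
    factor = rng.uniform(-radiation, radiation, traits.shape)
    return np.where(hit, traits * factor, traits)
```

With a radiation level of 1, as in the simulations below, a mutated trait is rescaled by a factor between -1 and 1.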
2. Main loop
The main loop runs the artificial evolution. It repeats steps 3 to 5 until
a maximum number of generations T is reached or the GA stagnates.
Stagnation occurs when the current generation equals the previous
generation over a given number ν of subsequent generations (with ν < T).
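A schematic of this main loop with the stagnation criterion might look as follows (a sketch under our own naming; `evolve_one_generation` stands in for steps 3 to 5):

```python
def run_ga(population, evolve_one_generation, best_fitness, T, nu):
    """Repeat steps 3 to 5 until T generations are reached or the GA
    stagnates: the best fitness is unchanged for nu generations in a row."""
    stagnant = 0
    last_best = best_fitness(population)
    for t in range(T):
        population = evolve_one_generation(population, t)
        best = best_fitness(population)
        stagnant = stagnant + 1 if best == last_best else 0
        last_best = best
        if stagnant >= nu:
            break
    return population
```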
where μ0 and ρ0 are the initial values and λμ and λρ are the half-life
durations. Since mutation is a probability, the initial value μ0 must be
in the interval [0, 1]. The absolute value of the radiation level and its
negative counterpart define the interval limits for the random mutation
factor.
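The half-life parameterization means a value halves every λ generations; this can be sketched in one line (the function name is ours):

```python
def decayed(initial, halflife, t):
    """Exponential decay with the given half-life: after `halflife`
    generations the value has halved, after twice that, quartered."""
    return initial * 0.5 ** (t / halflife)

# With mu0 = 0.5 and a half-life of 15 (Table 1 below), the mutation
# probability falls to 0.25 after 15 generations.
```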
offset = (max(f1 , . . . , fs ) − min(f1 , . . . , fs )) / s        (4)

πi = hi / Σ_{i=1}^{s} hi .        (6)
crossovers is fully determined by the fitness of the candidate solutions.
If W is set to values smaller than 1, the importance of the individual
fitness decreases. If W = 0, the selection probability is independent
of the fitness, so that the chance of being chosen for crossover would
be equal for every candidate solution. If W is specified, the selection
probability is calculated as the convex combination

π̃i = (1 − W) · (1/s) + W · πi .        (7)
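Equation (7) is straightforward to compute once the scaled fitness values hi are in hand (a sketch; hi are taken as given):

```python
import numpy as np

def selection_probabilities(h, W):
    """Convex combination of the uniform distribution 1/s and the
    fitness-proportional probabilities pi_i = h_i / sum(h_i), eq. (7)."""
    h = np.asarray(h, dtype=float)
    s = h.size
    pi = h / h.sum()
    return (1 - W) / s + W * pi
```

At W = 0 every candidate has probability 1/s; at W = 1 selection is purely fitness-proportional.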
5. Evolution
The GA can be applied to any numerical criterion function. However, as we
encountered problems using the CLAD estimator in empirical studies, we
thought it might be useful to consider the GA as another option for
practical use, or at least as a supplementary method.
yi∗ = β′xi + εi ,        (8)
A special case is the Tobit model which is a fully parametric model and
can be estimated with the common maximum likelihood (ML) techniques.
It is derived from the additional assumption that yi∗ ∼ N(β′xi , σ²). If the
assumptions of homoscedasticity or normality are violated, the ML estimates
may be inconsistent. In case of heteroscedasticity, researchers can attempt
to model heteroscedasticity as a function of some observable variables.
However, the true functional form is usually unknown and the choice
of variables determining the heteroscedasticity function is arbitrary. To-
bit estimates are sensitive to different choices of the heteroscedasticity term.
1. Estimate a median regression (least absolute deviation: LAD) with
the entire sample and generate the estimated ŷi for this initial step.
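The median regression in this step can be posed as a linear program: split each residual into a positive and a negative part and minimize their sum. A sketch using `scipy.optimize.linprog` (our own formulation, not the authors' implementation):

```python
import numpy as np
from scipy.optimize import linprog

def lad(X, y):
    """Least absolute deviation (median) regression as an LP:
    min sum(u + v)  s.t.  X beta + u - v = y,  u >= 0, v >= 0."""
    n, k = X.shape
    # Variables: [beta (k, unbounded), u (n, >= 0), v (n, >= 0)]
    c = np.concatenate([np.zeros(k), np.ones(2 * n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds)
    return res.x[:k]
```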
out Monte Carlo simulations. Consider the true model

yi∗ = β1 x1i + β2 x2i + β3 x3i ,   i = 1, . . . , N,        (11)

with the constant term x3i = 1 for all i. The explanatory variable x1 is
standard normally distributed and x2 is uniformly distributed on the interval
[0, 1]. For simplicity, we set β1 = β2 = β3 = 1. The observed dependent
variable is

yi = yi∗ + εi   if yi∗ + εi > 0
yi = 0          if yi∗ + εi ≤ 0        (12)

where εi is a normally distributed heteroscedastic error term. The
heteroscedasticity is modeled as

σi = exp(0.2 x4i ),   with x4i ∼ N(0, 1).        (13)
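This data-generating process is easy to replicate; a sketch of equations (11) to (13) in our own code:

```python
import numpy as np

def simulate(N, rng):
    """Censored regression DGP: y* = x1 + x2 + 1, heteroscedastic
    normal errors with sigma_i = exp(0.2 x4), left-censoring at 0."""
    x1 = rng.standard_normal(N)
    x2 = rng.uniform(0.0, 1.0, N)
    x3 = np.ones(N)                  # constant term, eq. (11)
    y_star = x1 + x2 + x3            # beta1 = beta2 = beta3 = 1
    x4 = rng.standard_normal(N)
    eps = np.exp(0.2 * x4) * rng.standard_normal(N)   # eq. (13)
    y = np.where(y_star + eps > 0, y_star + eps, 0.0)  # eq. (12)
    X = np.column_stack([x1, x2, x3])
    return X, y
```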
Table 1 displays the required tuning parameters for running the GA to
estimate β1 , β2 and β3 . The tuning parameters have to be set by the user.
The right column shows their values for the following Monte Carlo study.
The initialization interval for all three parameters was set to [−2; 2].
Table 1:
Set of tuning parameters for running the genetic algorithm

Parameter   Description                             Parameter value
s           Population size                         30
o           Number of offspring                     60
μ0          Mutation probability                    0.5
ρ0          Radiation level                         1
λμ          Half-life duration of mutation          15
λρ          Half-life duration of radiation         15
T           Maximum number of generations           250
ν           Number of subsequent stagnations        10
W           Weight for the selection probability    1.0
Initially, we start with a sample of 400 observations and carry out 200
replications of each regression (Buchinsky and GA) to obtain distributions
of the estimated coefficients. Afterwards, we repeat the simulation with
samples of 800, 1,600 and 3,200 observations to investigate the convergence
behavior of the algorithms.
The results suggest that both the Buchinsky algorithm and the genetic
algorithm produce consistent estimates, although the findings are not
exactly numerically identical. Table 2 displays the mean squared errors of
both methods for all simulations.
Table 2:
Mean squared errors of estimated coefficients (multiplied with 100)
Regressor number of observations in sample
400 800 1,600 3,200
x1 GA 0.4244 0.2885 0.1468 0.0686
Buchinsky 0.4207 0.2870 0.1468 0.0689
x2 GA 5.8185 2.7720 1.2521 0.6561
Buchinsky 5.9725 2.2818 1.3054 0.6497
x3 GA 1.7721 1.0356 0.4634 0.2209
Buchinsky 1.8326 0.9096 0.4833 0.2158
2) Test results are not presented here, but are available upon request.
Figure 1: Gauss kernel density estimates of β̂1
Figure 2: Gauss kernel density estimates of β̂2
We have dropped one data point from the results of the genetic
algorithm. For the sample with 800 observations, one estimated
value was −.0027. As this outlier would distort the graphical
presentation of all other results, we have decided to exclude it
from this figure.
Figure 3: Gauss kernel density estimates of β̂3
Figure 4: Box plots of estimated coefficient distributions (coefficients of x1, x2, and x3)
As the figures above show, both algorithms produce very similar
distributions of results. With growing sample size, the estimates seem to
converge to the true parameter values. From econometric theory, it is known
that the CLAD model should exhibit a rate of convergence of N^(1/2)
(Powell, 1984). However, for the genetic algorithm this behavior is unclear
a priori, as no theoretical econometric foundation exists. To investigate
the convergence behavior in more detail, we calculate the variances σ̂N,k
of the estimates and relate them to the sample sizes. Searching for αk
such that

σ̂N,k / N^(−αk) = constant,   k = 1, . . . , 3,   N = 400, 800, 1600, 3200,

we can estimate the rate of convergence with the following regression model:

ln(σ̂N,k) = −αk ln(N) + constant + ηk .

For all three k, the estimate of the rate of convergence αk should be about
0.5. To analyze the rate of convergence empirically, we compare the findings
of both methods. The regressions yield the following coefficients:
Table 3:
Empirical rate of convergence of both methods

Convergence rate αk    Genetic algorithm    Buchinsky
α1                     .4466 (.0442)        .4425 (.0454)
α2                     .5286 (.0115)        .5210 (.0331)
α3                     .5068 (.0257)        .5065 (.0147)
Standard errors in parentheses.
For all six cases, tests show that the hypothesis that the slope coefficient
is 0.5 cannot be rejected.3) Hence, we conclude that both algorithms
exhibit the same convergence behavior.
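The convergence-rate regression itself amounts to a least-squares fit of log variance on log sample size; a sketch with hypothetical variance values (our own code):

```python
import numpy as np

def convergence_rate(N_values, sigma_values):
    """Estimate alpha_k from ln(sigma_N,k) = -alpha_k ln(N) + constant
    by an OLS fit of log variance on log sample size."""
    slope, _ = np.polyfit(np.log(N_values), np.log(sigma_values), 1)
    return -slope
```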
of 100 candidate solutions and chose a smaller number of offspring,
e.g. 50. This makes the process of evolution sluggish, as only a few
crossovers take place, which means that it may take several generations
until the fitness of the best candidate solution improves significantly.
We recommend using a smaller population size, e.g. 30, and creating
a larger number of crossovers, e.g. 60, as done in the presented simulation.
This may require more generations until the evolution stagnates,
but it is rapidly computed and the best fitness is found quickly due to
the large number of trials (offspring) used to find improvements.
5 Conclusions and Perspectives
In this paper, we present the concept of a genetic algorithm. This method
of optimization is inspired by Nature and simulates the evolution of
artificial lifeforms. We propose using genetic algorithms as an alternative
tool for optimizing criterion functions, especially those which are not
continuously differentiable, since many alternative techniques run into
numerical problems in such cases.
References
Arifovic, J. (1994), Genetic Algorithm learning and the cobweb model,
Journal of Economic Dynamics and Control 18, 3–28.
Caruana, R.A. and J.D. Schaffer (1988), Representation and hidden bias:
Gray versus binary coding for genetic algorithms, Proceedings of the
Fourth International Conference on Machine Learning.
Dorsey, R.E. and W.J. Mayer (1995), Genetic algorithms for estimation
problems with multiple optima, nondifferentiability, and other irregular
features, Journal of Business & Economic Statistics 13(1), 53–66.
Manski, C.F. and T.S. Thompson (1986), Operational characteristics of
maximum score estimation, Journal of Econometrics 32, 85–108.
Powell, J.L. (1984), Least absolute deviations estimation for the censored
regression model, Journal of Econometrics 25, 303–325.