Evolutionary Computation: 131: Ajith Abraham
Evolutionary Computation
Ajith Abraham
Oklahoma State University, Stillwater, OK, USA
1 Introduction and Biological Motivation
2 Genetic Algorithms
3 Schema Theorem
4 Selection and Reproduction
5 GA Demonstrations
6 Evolution Strategies
7 Evolutionary Programming
8 Genetic Programming
9 Genetic Programming Basics
10 Summary
References
Further Reading
1 INTRODUCTION AND BIOLOGICAL MOTIVATION

In biological evolution, offspring are produced by the exchange of genetic material between parent chromosomes. This is the recombination operation, which is often referred to as crossover because of the way strands of chromosomes cross over during the exchange. Diversity in the population is achieved by the mutation operation. Evolutionary algorithms are ubiquitous nowadays, having been successfully applied to numerous problems from different domains, including optimization, automatic programming, signal processing, bioinformatics, social systems, and so on. In many cases, the mathematical function that describes the problem is not known, and the values at certain parameters are obtained from simulations. In contrast to many other optimization techniques, an important advantage of evolutionary algorithms is that they can cope with multimodal functions. Usually grouped under the term evolutionary computation or evolutionary algorithms (Bäck, 1996) are the domains of genetic algorithms (GA) (Holland, 1975), evolution strategies (Rechenberg, 1973; Schwefel, 1977), evolutionary programming (Fogel, Owens and Walsh, 1966), and genetic programming (Koza, 1992). These all share a common conceptual base of simulating the evolution of individual structures via processes of selection, recombination, and mutation (reproduction), thereby producing better solutions. The processes depend on the perceived performance of the individual structures as defined by the problem. A population of candidate solutions (for the optimization task to be solved) is initialized. New solutions are created by applying reproduction operators (crossover and/or mutation). The fitness (how good the solutions are) of the resulting solutions is evaluated, and a suitable selection strategy is then applied to determine which solutions will be maintained into the next generation. The procedure is then iterated, as illustrated in Figure 1.
Handbook of Measuring System Design, edited by Peter H. Sydenham and Richard Thorn. 2005 John Wiley & Sons, Ltd. ISBN: 0-470-02143-8.
[Figure 1. Flowchart of an evolutionary algorithm: population → selection → reproduction → replacement by offspring.]
A primary advantage of evolutionary computation is that it is conceptually simple. The procedure may be written as the difference equation: x(t + 1) = s {v [x(t)]} (1)
where x(t) is the population at time t under representation x, v is a random variation (reproduction) operator, and s is the selection operator (Fogel, 1999).
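In code, the difference equation amounts to alternating variation and selection over a list of candidates. A minimal Python sketch for a real-valued minimization task follows; all names and parameter choices here are illustrative, not part of the original formulation:

```python
import random

def evolve(fitness, init_pop, generations=100, sigma=0.1):
    """Minimal evolutionary loop implementing x(t+1) = s(v(x(t))).

    fitness: objective to minimize; init_pop: list of real-valued vectors.
    """
    pop = list(init_pop)
    n = len(pop)
    for _ in range(generations):
        # v: random variation -- Gaussian mutation of every individual
        offspring = [[g + random.gauss(0.0, sigma) for g in ind] for ind in pop]
        # s: selection -- keep the n best of parents plus offspring
        pop = sorted(pop + offspring, key=fitness)[:n]
    return pop[0]

# Example: minimize a 2-D sphere function from random starting points
best = evolve(lambda x: sum(g * g for g in x),
              [[random.uniform(-5, 5) for _ in range(2)] for _ in range(20)])
```

Any concrete variation operator v and selection operator s can be substituted; the structure of the loop does not change.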
Evolutionary algorithms can also be combined with more traditional optimization techniques. This may be as simple as the use of gradient minimization after the primary search with an evolutionary algorithm (e.g. fine-tuning of the weights of an evolutionary neural network), or it may involve the simultaneous application of other algorithms (e.g. hybridizing with simulated annealing or tabu search to improve the efficiency of the basic evolutionary search). The evaluation of each solution can be handled in parallel, and only selection (which requires at least pairwise competition) requires some serial processing. Such implicit parallelism is not possible in many global optimization algorithms such as simulated annealing and tabu search. Traditional methods of optimization are not robust to dynamic changes in the problem or the environment and often require a complete restart in order to provide a solution (e.g. dynamic programming). In contrast, evolutionary algorithms can be used to adapt solutions to changing circumstances. Perhaps the greatest advantage of evolutionary algorithms comes from their ability to address problems for which there are no human experts. Although human expertise should be used when it is available, it often proves less than adequate for automating problem-solving routines.
2 GENETIC ALGORITHMS
A typical flowchart of a genetic algorithm is depicted in Figure 2. One iteration of the algorithm is referred to as a generation. The basic GA is very generic, and there are many aspects that can be implemented differently according to the problem (e.g. representation of solutions (chromosomes), type of encoding, selection strategy, type of crossover and mutation operators, etc.). In practice, GAs are implemented by having arrays of bits or characters to represent the chromosomes. The individuals in the population then go through a process of simulated evolution. Simple bit-manipulation operations allow the implementation of crossover, mutation, and other operations. The number of bits for every gene (parameter) and the decimal range into which they decode are usually the same, but nothing precludes the use of a different number of bits or a different range for every gene. Compared with other evolutionary algorithms, one of the most important features of the GA is its focus on fixed-length character strings, although variable-length strings and other structures have been used.

[Figure 2. Flowchart of a genetic algorithm: selection and reproduction (crossover/mutation) are repeated until a solution is found.]

The parameters of the search are identified as x1 and x2, which are called the phenotypes in evolutionary algorithms. In genetic algorithms, the phenotypes (parameters) are usually converted to genotypes by using a coding procedure. Knowing the ranges of x1 and x2, each variable is represented using a suitable binary string. This representation using binary coding makes the parametric space independent of the type of variables used. The genotype (chromosome) should in some way contain information about the solution; this is known as encoding. GAs use a binary string encoding, as shown below.

Chromosome A: 110110111110100110110
Chromosome B: 110111101010100011110

Each bit in the chromosome string can represent some characteristic of the solution. There are several other types of encoding (e.g. direct integer or real-number encoding), the choice of which depends directly on the problem. Permutation encoding can be used in ordering problems, such as the traveling salesman problem (TSP) or task-ordering problems. In permutation encoding, every chromosome is a string of numbers that represents a position in a sequence. A chromosome using permutation encoding for a 9-city TSP will appear as follows:

Chromosome A: 4 5 3 2 6 1 7 8 9
Chromosome B: 8 5 6 7 2 3 1 4 9

The chromosome represents the order in which the salesman will visit the cities. Special care is taken to ensure that the strings represent valid sequences after crossover and mutation. Floating-point representation is very useful for numeric optimization (e.g. for encoding the weights of a neural network). It should be noted that in many recent applications more sophisticated genotypes are appearing (e.g. a chromosome can be a tree of symbols or a combination of a string and a tree, some parts of the chromosome may not be allowed to evolve, etc.).

3 SCHEMA THEOREM

Theoretical foundations of evolutionary algorithms can be partially explained by the schema theorem (Holland, 1975), which relies on the concept of schemata. Schemata are templates that partially specify a solution (more strictly, a solution in the genotype space). If genotypes are strings built from symbols of an alphabet A, schemata are strings whose symbols belong to A ∪ {*}. The extra symbol * must be interpreted as a wildcard, and the loci occupied by it are called undefined. A chromosome is said to match a schema if they agree in the defined positions. For example, the string 10011010 matches the schemata 1******* and **011*** among others, but does not match *1*11*** because they differ in the second gene (the first defined gene in the schema). A schema can be viewed as a hyperplane in a k-dimensional space, representing a set of solutions with common properties. Obviously, the number of solutions that match a schema H depends on the number of defined positions in it. Another related concept is the defining length of a schema, defined as the distance between the first and the last defined positions in it. The GA works by allocating strings to the best schemata exponentially through successive generations, the selection mechanism being mainly responsible for this behavior. On the other hand, the crossover operator is responsible for exploring new combinations of the present schemata in order to obtain the fittest individuals. Finally, the purpose of the mutation operator is to introduce fresh genotypic material into the population.
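The schema relations described above are straightforward to express in code. A small sketch (function names are ours, not from the original text):

```python
def matches(schema, s):
    """True if string s agrees with the schema at every defined (non-*) locus."""
    return len(schema) == len(s) and all(
        c == '*' or c == g for c, g in zip(schema, s))

def order(schema):
    """Number of defined positions in the schema."""
    return sum(c != '*' for c in schema)

def defining_length(schema):
    """Distance between the first and the last defined positions."""
    defined = [i for i, c in enumerate(schema) if c != '*']
    return defined[-1] - defined[0] if defined else 0

# The worked example: 10011010 matches the first two schemata but not the third
assert matches('1*******', '10011010')
assert matches('**011***', '10011010')
assert not matches('*1*11***', '10011010')   # differs in the second gene
assert defining_length('**011***') == 2
```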
4 SELECTION AND REPRODUCTION

Each individual in the selection pool is assigned a fitness depending on its own fitness value and the fitness values of all other individuals in the selection pool. This fitness is used for the actual selection step afterwards. Some of the popular selection schemes are discussed below. (In tournament selection, for example, the tournament size takes values ranging from two to the total number of individuals in the population.)
4.4 Elitism
When creating a new population by crossover and mutation, there is a big chance of losing the best chromosome. Elitism is the method that first copies the best chromosome (or a few of the best chromosomes) to the new population; the rest of the population is built in the classical way. Elitism can rapidly increase the performance of a GA because it prevents the loss of the best-found solution.
4.6 Crossover
Crossover selects genes from parent chromosomes and creates new offspring. The simplest way to do this is to choose a crossover point at random: everything before this point is copied from the first parent, and everything after it is copied from the second parent. A single-point crossover is illustrated as follows (| marks the crossover point):

Chromosome A: 11011|00100110110
Chromosome B: 11011|11000011110
Offspring A: 11011|11000011110
Offspring B: 11011|00100110110
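A single-point crossover on bit strings can be sketched as follows (in practice the cut point is chosen at random; here it can be fixed to reproduce the example above):

```python
import random

def one_point_crossover(a, b, point=None):
    """Exchange the tails of two equal-length bit strings at a cut point."""
    assert len(a) == len(b)
    if point is None:
        point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

# Reproduce the illustration: cut after the fifth bit
child_a, child_b = one_point_crossover('1101100100110110',
                                       '1101111000011110', point=5)
# child_a takes the head of the first parent and the tail of the second
```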
For many optimization problems, there may be multiple optimal solutions, equal or unequal. Sometimes a simple GA cannot maintain stable populations at the different optima of such functions; in the case of unequal optimal solutions, the population invariably converges to the global optimum. A niche is viewed as an organism's environment and a species as a collection of organisms with similar features. Niching helps to maintain subpopulations near global and local optima by introducing a controlled competition among different solutions near every locally optimal region. Niching is achieved by a sharing function, which creates subdivisions of the environment by degrading an organism's fitness proportionally to the number of other members in its neighborhood. The amount of sharing contributed by individuals to their neighbors is determined by their proximity in the decoded parameter space (phenotypic sharing) based on a distance measure (Goldberg, 1989).
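A sharing function of this kind can be sketched with the commonly used power-law kernel; σ_share and α are user-chosen parameters, and this is one standard formulation rather than necessarily the one used in the original experiments:

```python
def shared_fitness(fitnesses, distances, sigma_share=1.0, alpha=1.0):
    """Degrade each raw fitness by its niche count (fitness sharing).

    fitnesses: raw fitness values (higher is better).
    distances: distances[i][j] is the phenotypic distance between i and j.
    """
    def sh(d):
        # Sharing kernel: full sharing at distance 0, none beyond sigma_share
        return 1.0 - (d / sigma_share) ** alpha if d < sigma_share else 0.0

    return [f / sum(sh(d) for d in row)      # niche count >= 1 (self-distance 0)
            for f, row in zip(fitnesses, distances)]

# Two individuals in the same niche split their fitness; a loner keeps its own
out = shared_fitness([4.0, 4.0, 4.0],
                     [[0.0, 0.0, 9.0], [0.0, 0.0, 9.0], [9.0, 9.0, 0.0]])
# out == [2.0, 2.0, 4.0]
```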
As illustrated in Figure 4, there are several crossover techniques. In uniform crossover, bits are randomly copied from the first or the second parent. Crossover operators designed for a specific problem can further improve GA performance.

[Figure 4. Parents and offspring under several crossover techniques, including uniform crossover.]

4.7 Mutation

After the crossover operation, mutation takes place. Mutation randomly changes the new offspring. For binary encoding, mutation is performed by flipping a few randomly chosen bits from 1 to 0 or from 0 to 1. Mutation depends on the encoding as well as on the crossover: for example, when encoding permutations, mutation could exchange two genes. A simple mutation operation is illustrated as follows:

Chromosome A: 1101111000011110
Chromosome B: 1101100100110110
Offspring A: 1100111000011110
Offspring B: 1101101100110110

5 GA DEMONSTRATIONS

The first demonstration uses Rastrigin's function:

F(x) = 10n + Σ_{i=1}^{n} [x_i^2 − 10 cos(2π x_i)]   (3)

The function has just one global minimum, which occurs at the origin, where the value of the function is 0. At any local minimum other than [0, 0], the value of Rastrigin's function is greater than 0; the farther a local minimum is from the origin, the larger the value of the function at that point. Figure 5 illustrates the surface of the function for two input variables. A real-valued representation was used to encode the two input variables, and the following parameters were used for the GA experiment: mutation probability 0.05, crossover probability 0.90, population size 20, number of iterations 50, and roulette-wheel selection. Figure 6 illustrates how the best fitness value evolved during the 50 generations. As is evident, after 30 generations the GA succeeded in finding the optimal solution.
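Rastrigin's function translates directly into code; a short sketch, with its two key properties checked:

```python
import math

def rastrigin(x):
    """F(x) = 10n + sum(x_i^2 - 10 cos(2 pi x_i)); global minimum 0 at the origin."""
    return 10 * len(x) + sum(g * g - 10 * math.cos(2 * math.pi * g) for g in x)

print(rastrigin([0.0, 0.0]))   # 0.0 -- the global minimum
print(rastrigin([1.0, 1.0]))   # 2.0 -- a local minimum with a larger value
```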
[Figure 5. Surface of Rastrigin's function for two input variables.]
[Figure 6. Best objective value versus the number of generations (50 generations).]
The Peaks function surface is plotted in Figure 7, and the task is to find the optimum value (maximum) of

F(x, y) = 3(1 − x)^2 exp(−x^2 − (y + 1)^2) − 10(x/5 − x^3 − y^5) exp(−x^2 − y^2) − (1/3) exp(−(x + 1)^2 − y^2)   (4)

for −3 ≤ x ≤ 3 and −3 ≤ y ≤ 3. Using a population size of 30, the genetic algorithm was run for 25 iterations. Each input variable was represented using 8 bits. Crossover and mutation rates were set at 0.9 and 0.1, respectively. Figure 8(a), (b), and (c) illustrate the convergence of the solutions on a contour plot of the surface. After 10 iterations, almost all the solutions were near the optimal point.

6 EVOLUTION STRATEGIES

Evolution strategy (ES) was developed by Rechenberg (1973) and Schwefel (1977). ES tends to be used for
[Figure 7. Surface plot of the Peaks function.]
empirical experiments that are difficult to model mathematically. The system to be optimized is actually constructed, and ES is used to find the optimal parameter settings. Evolution strategies concentrate on translating the fundamental mechanisms of biological evolution to technical optimization problems. The parameters to be optimized are often represented by a vector of real numbers (object parameters op). Another vector of real numbers defines the strategy parameters (sp), which control the mutation of the object parameters. Both object and strategy parameters form the data structure for a single individual. A population P of n individuals can be described as follows:

P = (c1, c2, . . . , cn−1, cn)   (5)
where the ith chromosome ci is defined in terms of its object and strategy parameters as:

ci = (op, sp), op = (o1, o2, . . . , on−1, on), sp = (s1, s2, . . . , sn−1, sn)   (6)

Mutation perturbs each object parameter by adding normally distributed noise:

op(mut) = (o1 + N0(s1), o2 + N0(s2), . . . , on−1 + N0(sn−1), on + N0(sn))   (7)

where N0(si) is a Gaussian random value with mean 0 and standard deviation si. Usually, the mutation step size is adapted by mutating the standard deviations si themselves. This may be done (for example) as follows:

sp(mut) = (s1 A1, s2 A2, . . . , sn−1 An−1, sn An)   (8)

where Ai is randomly chosen as α or 1/α, depending on the value of an equally distributed random variable E on [0, 1]:

Ai = α if E < 0.5, Ai = 1/α if E ≥ 0.5   (9)
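Mutation rules of this kind can be sketched as follows; α = 1.3 is a typical but arbitrary choice, and the function name is ours:

```python
import random

ALPHA = 1.3  # step-size adaptation factor (a typical, arbitrary choice)

def mutate(op, sp):
    """One ES mutation: adapt the step sizes, then perturb the parameters."""
    # Multiply each step size by ALPHA or 1/ALPHA with equal probability
    new_sp = [s * (ALPHA if random.random() < 0.5 else 1.0 / ALPHA) for s in sp]
    # Add Gaussian noise N(0, s_i) to each object parameter
    new_op = [o + random.gauss(0.0, s) for o, s in zip(op, new_sp)]
    return new_op, new_sp

new_op, new_sp = mutate([1.0, 2.0], [0.1, 0.1])
```

Here the step sizes are adapted before being used for the perturbation; other orderings also appear in the literature.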
Figure 8. Convergence of solutions (a) generation 0; (b) after 5 generations; (c) after 20 generations (solution points are marked with *).
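The Peaks surface used in this demonstration can be evaluated directly; a sketch, assuming equation (4) is the standard three-term Peaks formulation (the surface popularized by MATLAB's peaks):

```python
import math

def peaks(x, y):
    """Three-term Peaks surface; its maximum is about 8.1, near (0, 1.58)."""
    return (3 * (1 - x) ** 2 * math.exp(-x ** 2 - (y + 1) ** 2)
            - 10 * (x / 5 - x ** 3 - y ** 5) * math.exp(-x ** 2 - y ** 2)
            - (1 / 3) * math.exp(-(x + 1) ** 2 - y ** 2))

# Crude grid search over the domain -3 <= x, y <= 3 as a baseline
best = max((peaks(i / 50, j / 50), i / 50, j / 50)
           for i in range(-150, 151) for j in range(-150, 151))
```

The grid search gives a reference value against which the GA's 25-iteration result can be compared.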
6.3.2 P + C strategy
The P parents produce C children using mutation, and fitness values are calculated for each of the C children. Children and parents are then sorted by their fitness values, and the best P individuals of the combined pool become the next-generation parents.
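The P + C step can be sketched as follows (function names are illustrative; lower fitness is taken as better here):

```python
import random

def plus_selection(parents, n_children, fitness, mutate):
    """(P + C) strategy: P parents create C children by mutation;
    the best P of the combined pool survive."""
    children = [mutate(random.choice(parents)) for _ in range(n_children)]
    # Sort parents and children together and keep the best P
    return sorted(parents + children, key=fitness)[:len(parents)]

# Toy run: each mutation moves a 1-D point one unit toward 0
survivors = plus_selection([[5.0], [4.0]], 4,
                           fitness=lambda ind: abs(ind[0]),
                           mutate=lambda ind: [g - 1.0 for g in ind])
```

Because parents compete with children, the best solution can never be lost; this is the same elitist property discussed for GAs.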
7 EVOLUTIONARY PROGRAMMING
The book Artificial Intelligence Through Simulated Evolution by Fogel, Owens and Walsh (1966) is the landmark publication for evolutionary programming (EP). In this book, finite-state automata are evolved to predict symbol strings generated from Markov processes and nonstationary time series (AI FAQ–Genetic). The basic evolutionary programming method involves the following steps:

1. Choose an initial population (possible solutions at random). The number of solutions in a population is highly relevant to the speed of optimization, but no definite answers are available as to how many solutions are appropriate (other than > 1).
2. New offspring are created by mutation.
3. Each offspring solution is assessed by computing its fitness.
4. Typically, a stochastic tournament is held to determine the N solutions to be retained for the population of solutions.

It should be noted that evolutionary programming typically does not use any crossover as a genetic operator.
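The stochastic tournament used for selection can be sketched as follows; the number of opponents q and the function name are our choices, not details from the original text:

```python
import random

def ep_tournament_select(pool, fitness, n_survivors, q=10):
    """Score each solution by its wins against q random opponents
    (lower fitness wins), then keep the n_survivors highest scorers."""
    scores = []
    for ind in pool:
        opponents = random.sample(pool, q)
        wins = sum(fitness(ind) <= fitness(opp) for opp in opponents)
        scores.append((wins, ind))
    # Stable sort: ties keep their original pool order
    scores.sort(key=lambda t: t[0], reverse=True)
    return [ind for _, ind in scores[:n_survivors]]

pool = [[i] for i in range(20)]
survivors = ep_tournament_select(pool, lambda ind: ind[0], 5)
```

The best solution always wins all q encounters, so it survives with certainty, while mediocre solutions survive only probabilistically.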
8 GENETIC PROGRAMMING
The genetic programming (GP) technique provides a framework for automatically creating a working computer program from a high-level statement of the problem (Koza, 1992). Genetic programming achieves this goal of automatic programming by genetically breeding a population of computer programs, using the principles of Darwinian natural selection and biologically inspired operations. The operations include most of the techniques discussed in the previous sections. The main difference between GP and GA is the representation of the solution: GP creates computer programs in the LISP or Scheme languages as the solution. LISP is an acronym for LISt Processor and was developed in the late 1950s (History of LISP, 2004). Unlike most languages, LISP is usually used as an interpreted language; this means that, unlike compiled languages, an interpreter can process and respond directly to programs written in LISP.
The main reason for choosing LISP to implement GP is that programs and data have the same structure, which provides easy means for manipulation and evaluation. In GP, the individual population members are not fixed-length character strings that encode possible solutions to the problem at hand; they are programs that, when executed, are the candidate solutions to the problem. These programs are expressed in genetic programming as parse trees rather than as lines of code. For example, the simple program a + b c would be represented as shown in Figure 9. The terminal and function sets are also important components of genetic programming; they are the alphabets of the programs to be made. The terminal set consists of the variables and constants of the programs (e.g. A, B, and C in Figure 9). The most common way of writing down a function with two arguments is the infix notation, in which the two arguments are connected with the operator symbol between them, as follows:

A + B

A different method is the prefix notation, in which the operator symbol is written down first, followed by its required arguments:

+ A B

While this may be a bit more difficult or just unusual for human eyes, it opens some advantages for computational uses. The computer language LISP uses symbolic expressions (or S-expressions) composed in prefix notation. A simple S-expression has the form (operator argument), where operator is the name of a function and argument can be a constant, a variable, or another symbolic expression, for example (operator argument (operator argument) (operator argument)).
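Prefix S-expressions are easy to evaluate recursively. A sketch using Python tuples in place of LISP lists (names and operator set are ours):

```python
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(expr, env=None):
    """Evaluate a prefix S-expression given as a number, a variable name,
    or a tuple (operator, argument, ..., argument)."""
    env = env or {}
    if isinstance(expr, tuple):
        op, *args = expr
        # Evaluate the arguments recursively, then apply the operator
        return OPS[op](*(evaluate(a, env) for a in args))
    return env.get(expr, expr)     # variable lookup; constants pass through

print(evaluate(('+', 5, ('-', 3, 1))))   # 7
```

The recursion mirrors the tree structure directly, which is exactly why tree-shaped programs are convenient to manipulate and evaluate.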
[Figure: parse tree of the S-expression (+ 5 (− 3 1)), with a subtree marked.]
[Figures 10–12. Two parent parse trees, the offspring produced by exchanging subtrees (crossover), and mutation of a subtree.]
Crossover in GP is implemented by taking randomly selected subtrees in the individuals and exchanging them. Mutation is another important feature of genetic programming. Two types of mutation are commonly used: the simplest type replaces a function or a terminal by another function or terminal, respectively; in the second kind, an entire subtree replaces another subtree. Figure 12 explains the concept of subtree mutation.

GP requires data structures that are easy to handle and evaluate and that are robust to structural manipulations; these are among the reasons why the class of S-expressions was chosen to implement GP. The set of functions and terminals that will be used in a specific problem has to be chosen carefully. If the set of functions is not powerful enough, a solution may be very complex or may not be found at all. As in any evolutionary computation technique, the generation of the first population of individuals is important for a successful implementation of GP. Other factors that influence the performance of the algorithm are the size of the population, the percentage of individuals that participate in crossover/mutation, the maximum depth of the initial individuals, the maximum allowed depth of the generated offspring, and so on. A specific advantage of genetic programming is that no analytical knowledge is needed, yet accurate results can be obtained; the GP approach also scales with the problem size. GP does impose restrictions on how the structure of solutions should be formulated.
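Subtree crossover on such parse trees can be sketched with nested tuples; here the crossover points are passed explicitly as paths of child indices, whereas in GP they are chosen at random (all names are illustrative):

```python
def subtree(tree, path):
    """Return the subtree at `path`, a sequence of child indices."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of `tree` with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i, rest = path[0], path[1:]
    return tuple(replace(c, rest, new) if j == i else c
                 for j, c in enumerate(tree))

def crossover(a, b, path_a, path_b):
    """Exchange the subtrees at the chosen points of two parent programs."""
    return (replace(a, path_a, subtree(b, path_b)),
            replace(b, path_b, subtree(a, path_a)))

p1 = ('+', 'a', ('*', 'b', 'c'))
p2 = ('-', ('*', 'a', 'a'), 4)
o1, o2 = crossover(p1, p2, (2,), (1,))
# o1 == ('+', 'a', ('*', 'a', 'a')); o2 == ('-', ('*', 'b', 'c'), 4)
```

Subtree mutation is the special case where the grafted subtree is freshly generated rather than taken from a second parent.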
10 SUMMARY

This article presents the biological motivation and fundamental aspects of evolutionary algorithms and their constituents, namely, genetic algorithms, evolution strategies, evolutionary programming, and genetic programming. The performance of genetic algorithms is demonstrated using two function optimization problems. Important advantages of evolutionary computation as compared to classical optimization techniques are also discussed.

REFERENCES

AI FAQ–Genetic. http://www.faqs.org/faqs/ai-faq/genetic/, accessed on September 10, 2004.

Bäck, T. (1996) Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms, Oxford University Press, New York.

Fogel, D.B. (1999) Evolutionary Computation: Toward a New Philosophy of Machine Intelligence, 2nd edn, IEEE Press, Piscataway, NJ.

Fogel, L.J., Owens, A.J. and Walsh, M.J. (1966) Artificial Intelligence Through Simulated Evolution, John Wiley & Sons, New York.

Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA.

History of LISP. (2004) http://www-formal.stanford.edu/jmc/history/lisp/lisp.html.

Holland, J. (1975) Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, MI.

Jang, J.S.R., Sun, C.T. and Mizutani, E. (1997) Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice Hall, USA.

Koza, J.R. (1992) Genetic Programming, MIT Press, Cambridge, MA.

Rechenberg, I. (1973) Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution, Frommann-Holzboog, Stuttgart.

Schwefel, H.P. (1977) Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie, Birkhäuser, Basel.

Törn, A. and Žilinskas, A. (1989) Global Optimization, Lecture Notes in Computer Science, Vol. 350, Springer-Verlag, Berlin.
FURTHER READING
Michalewicz, Z. (1992) Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, Berlin.