Holland GeneticAlgorithms 1992
Scientific American, a division of Nature America, Inc. is collaborating with JSTOR to digitize,
preserve and extend access to Scientific American
hurdles in software design: specifying in advance all the features of a problem and the actions a program should take to deal with them. By harnessing the mechanisms of evolution, researchers may be able to "breed" programs that solve problems even when no person can fully understand their structure. Indeed, these so-called genetic algorithms have already demonstrated the ability to make breakthroughs in the design of such complex systems as jet engines.

Genetic algorithms make it possible to explore a far greater range of potential solutions to a problem than do conventional programs. Furthermore, as researchers probe the natural selection of programs under controlled and well-un-

JOHN H. HOLLAND has been investigating the theory and practice of algorithmic evolution for nearly 40 years. He is a professor of psychology and of electrical engineering and computer science at the University of Michigan. Holland received a B.S. in physics from the Massachusetts Institute of Technology in 1950 and served on the Logical Planning Group for IBM's first programmed electronic computer (the 701) from 1950 until 1952. He received an M.A. in mathematics and a Ph.D. in communication sciences from the University of Michigan. Holland has been a member of the Steering Committee of the Santa Fe Institute since its inception in 1987 and is an external professor there.

This mixing allows creatures to evolve much more rapidly than they would if each offspring simply contained a copy of the genes of a single parent, modified occasionally by mutation. (Although unicellular organisms do not engage in mating as humans like to think of it, they do exchange genetic material, and their evolution can be described in analogous terms.)

Selection is simple: if an organism fails some test of fitness, such as recognizing a predator and fleeing, it dies. Similarly, computer scientists have little trouble weeding out poorly performing algorithms. If a program is supposed to sort numbers in ascending order, for example, one need merely check whether each entry of the program's output is larger than the preceding one.

People have employed a combination of crossbreeding and selection for millennia to breed better crops, racehorses or ornamental roses. It is not as easy, however, to translate these procedures for use on computer programs. The chief problem is the construction of a "genetic code" that can represent the structure of different programs, just as DNA represents the structure of a person or a mouse. Mating or mutating the text of a FORTRAN program, for example, would in most cases not produce a better or worse FORTRAN program but rather no program at all.

The first attempts to mesh computer science and evolution, in the late 1950s

During this time, I had been investigating mathematical analyses of adaptation and had become convinced that the recombination of groups of genes by means of mating was a critical part of evolution. By the mid-1960s I had developed a programming technique, the genetic algorithm, that is well suited to evolution by both mating and mutation. During the next decade, I worked to extend the scope of genetic algorithms by creating a genetic code that could represent the structure of any computer program.

The result was the classifier system, consisting of a set of rules, each of which performs particular actions every time its conditions are satisfied by some piece of information. The conditions and actions are represented by strings of bits corresponding to the presence or absence of specific characteristics in the rules' input and output. For each characteristic that was present, the string would contain a 1 in the appropriate position, and for each that was absent, it would contain a 0. For example, a classifier rule that recognized dogs might be encoded as a string containing 1's for the bits corresponding to "hairy," "slobbers," "barks," "loyal" and "chases sticks" and 0's for the bits corresponding to "metallic," "speaks Urdu" and "possesses credit cards." More realistically, the programmer should choose the simplest, most primitive characteristics so that they can be combined, as
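The bit-string encoding of a classifier rule's condition can be sketched in a few lines of Python. This is only an illustration, not the author's implementation: the order of the characteristics, the names `FEATURES`, `encode` and `fires`, and the choice of exact bit-for-bit matching are all assumptions made for the example.

```python
# Hypothetical feature order: the article names the characteristics
# but does not say which bit position each one occupies.
FEATURES = [
    "hairy", "slobbers", "barks", "loyal", "chases sticks",
    "metallic", "speaks Urdu", "possesses credit cards",
]


def encode(present):
    """Return one bit per feature: '1' if it is present, '0' if absent."""
    unknown = set(present) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown characteristics: {sorted(unknown)}")
    return "".join("1" if feature in present else "0" for feature in FEATURES)


# Condition string of a rule that recognizes dogs.
DOG_CONDITION = encode({"hairy", "slobbers", "barks", "loyal", "chases sticks"})


def fires(condition, observation):
    """Assumption: a rule fires only when the observation matches bit for bit."""
    return condition == observation
```

With this ordering `DOG_CONDITION` is the string `11111000`, and `fires(DOG_CONDITION, encode({"metallic"}))` is false because the observation sets a bit the dog rule requires to be 0.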
be rewritten as a classifier system.

To evolve classifier rules that solve a particular problem, one simply starts with a population of random strings of 1's and 0's and rates each string according to the quality of its result. Depending on the problem, the measure of fitness could be business profitability, game payoff, error rate or any number of other criteria. High-quality strings mate; low-quality ones perish. As generations pass, strings associated with improved solutions will predominate.

Recast in the language of genetic algorithms, the search for a good solution to a problem is a search for particular binary strings. The universe of all possible strings can be considered as an imaginary landscape; valleys mark the location of strings that encode poor solutions, and the landscape's highest point corresponds to the best possible string.

Regions in the solution space can also be defined by looking at strings that have 1's or 0's in specified places, a

BEE ORCHID demonstrates the specificity with which natural genetic selection can match an organism to a particular niche. The flower, which resembles a female bumblebee, is fertilized by male bees that attempt to mate with it. Mechanisms similar to natural selection, the author says, can produce computer programs (so-called genetic algorithms) capable of solving such complex problems as the design of jet turbines or communications networks.

ing: start at some random point, and if a slight modification improves the quality of your solution, continue in that direction; otherwise, go in the opposite direction. Complex problems, however, make landscapes with many high points. As the number of dimensions of the problem space increases, the countryside may contain tunnels, bridges and even more convoluted topological features. Finding the right hill or even determining which way is up becomes increasingly difficult. In addition, such
search spaces are usually enormous. If each move in a chess game, for example, has an average of 10 alternatives, and a typical game lasts for 30 moves on each side, then there are about 10^60 strategies for playing chess (most of them bad).

Genetic algorithms cast a net over this landscape. The multitude of strings in an evolving population samples it in many regions simultaneously. Notably, the rate at which the genetic algorithm samples different regions corresponds directly to the regions' average "elevation" (that is, the probability of finding a good solution in that vicinity).

This remarkable ability of genetic algorithms to focus their attention on the most promising parts of a solution space is a direct outcome of their ability to combine strings containing partial solutions. First, each string in the population is evaluated to determine the performance of the strategy that it encodes. Second, the higher-ranking strings mate. Two strings line up, a point along the strings is selected at random and the portions to the left of that point are exchanged to produce two offspring: one containing the symbols of the first string up to the crossover point and those of the second beyond it, and the other containing the complementary cross [see illustration above]. Biological chromosomes cross over one another when two gametes meet to form a zygote, and so the process of crossover in genetic algorithms does in fact closely mimic its biological model. The offspring do not replace the parent strings; instead they replace low-fitness strings, which are discarded at each generation so that the total population remains the same size.

Third, mutations modify a small fraction of the strings: roughly one in every 10,000 symbols flips from 0 to 1, or vice versa. Mutation alone does not generally advance the search for a solution, but it does provide insurance against the development of a uniform population incapable of further evolution.

The genetic algorithm exploits the higher-payoff, or "target," regions of the solution space, because successive generations of reproduction and crossover produce increasing numbers of strings in those regions. The algorithm favors the fittest strings as parents, and so above-average strings (which fall in target regions) will have more offspring in the next generation.

Indeed, the number of strings in a given region increases at a rate proportional to the statistical estimate of that region's fitness. A statistician would need to evaluate dozens of samples from thousands or millions of regions to estimate the average fitness of each region. The genetic algorithm manages to achieve the same result with far fewer strings and virtually no computation.

The key to this rather surprising behavior is the fact that a single string belongs to all the regions in which any of its bits appear. For example, the string 11011001 is a member of regions 11****** (where the * indicates that a bit's value is unspecified), 1******1, **0**00* and so forth. The largest regions (those containing many unspecified bits) will typically be sampled by a large fraction of all the strings in a population. Thus, a genetic algorithm that manipulates a population of a few thousand strings actually samples a vastly larger number of regions. This implicit parallelism gives the genetic algorithm its central advantage over other problem-solving processes.

Crossover complicates the effects of implicit parallelism. The purpose of crossing strings in the genetic algorithm is to test new parts of target regions rather than testing the same string over and over again in successive generations. But the process can also "move" an offspring out of one region into another, causing the sampling rate of different regions to depart from a strict proportionality to average fitness. That departure will slow the rate of evolution.

The probability that the offspring of two strings will leave its parents' region depends on the distance between the 1's and 0's that define the region. The offspring of a string that samples 10****, for example, can be outside that region only if crossover begins at the second position in the string: one chance in five for a string containing six genes. (The same building block would run a risk of only one in 999 if contained in a 1,000-gene string.) The offspring of a six-gene string that samples region 1****1 runs the risk of leaving its parents' region no matter where crossover occurs.

Closely adjacent 1's or 0's that define a region are called compact building blocks. They are most likely to survive crossover intact and so be propagated into future generations at a rate proportional to the average fitness of strings that carry them. Although a reproduction mechanism that includes crossover does not manage to sample all regions at a rate proportional to their fitness, it does succeed in doing so for all regions defined by compact building blocks. The number of compactly defined building blocks in a population of strings still vastly exceeds the number of strings, and so the genetic algorithm still exhibits implicit parallelism.

Curiously, an operation in natural genetics called inversion occasionally rear-
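The region arithmetic can be checked with a short Python sketch. The function names are illustrative assumptions, regions are written with `*` for an unspecified bit, and the figure computed is the risk the text describes (the fraction of single-point crossover cuts that separate a region's defining bits), not the exact probability that an offspring actually leaves the region, which also depends on the mate.

```python
from fractions import Fraction


def in_region(string, region):
    """True if `string` agrees with `region` at every specified position."""
    return len(string) == len(region) and all(
        r == "*" or r == s for s, r in zip(string, region)
    )


def disruption_risk(region):
    """Fraction of single-point crossover cuts that fall between the
    outermost specified bits of `region` and could therefore split it."""
    defined = [i for i, r in enumerate(region) if r != "*"]
    if len(defined) < 2:
        return Fraction(0)  # a single bit (or none) cannot be split by one cut
    span = defined[-1] - defined[0]  # cuts lying between first and last bit
    return Fraction(span, len(region) - 1)  # a string of n genes has n - 1 cuts
```

Under these assumptions `disruption_risk("10****")` is 1/5, the same block padded to 1,000 genes gives 1/999, and `disruption_risk("1****1")` is 1, matching the figures quoted for compact and spread-out building blocks.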
GENE POOL of algorithms consists of strings of 1's and 0's. Each string is evaluated for fitness, and the best strings mate (second column) and produce offspring by means of crossover (indicated by a vertical black line). Strings of intermediate fitness simply survive to the next generation, and the least fit perish. If particular patterns of bits (shown here by colored areas) improve the fitness of strings that carry them, repeated cycles of evaluation and mating (succeeding columns) will cause the proportion of these high-quality "building blocks" to increase. The pattern corresponding to each building block appears in the rightmost column; asterisks represent bits whose values are unspecified.
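One cycle of evaluation, mating, replacement and mutation can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the author's program: pairing the ranked strings two by two, the `pairs` parameter, and the toy fitness function `count_ones` are all invented for illustration. The 1-in-10,000 mutation rate is the one quoted in the article.

```python
import random


def crossover(first, second, point):
    """Single-point crossover: exchange the portions left of `point`."""
    return first[:point] + second[point:], second[:point] + first[point:]


def mutate(string, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[bit] if rng.random() < rate else bit for bit in string)


def next_generation(population, fitness, rng, pairs, mutation_rate=1e-4):
    """Evaluate, mate the top strings, replace the least fit, then mutate."""
    if 2 * pairs > len(population):
        raise ValueError("not enough strings to form that many pairs")
    ranked = sorted(population, key=fitness, reverse=True)
    offspring = []
    # Assumption: the best strings mate pairwise in rank order.
    for i in range(0, 2 * pairs, 2):
        point = rng.randrange(1, len(ranked[i]))  # cut strictly inside the string
        offspring.extend(crossover(ranked[i], ranked[i + 1], point))
    # Offspring take the place of the lowest-ranked strings,
    # so the population size stays constant.
    survivors = ranked[: len(ranked) - len(offspring)]
    return [mutate(s, mutation_rate, rng) for s in survivors + offspring]


def count_ones(string):
    """Toy fitness for demonstration only: the number of 1 bits."""
    return string.count("1")
```

A run might start from `rng = random.Random(0)` and 20 random 16-bit strings, then call `next_generation(population, count_ones, rng, pairs=5)` repeatedly. Parents survive alongside their offspring here, so with mutation switched off the best fitness in the population can never decrease from one generation to the next.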
example of the difficulties of cooperation. Game theory predicts that each player should minimize the maximum damage the other player can inflict: that is, both players should defect. Yet when two people play the game together repeatedly, they typically learn to cooperate with each other to raise their joint payoff. One of the most effective known strategies for the Prisoner's Dilemma is "tit for tat," which begins by cooperating but thereafter mimics the last play of the other player. That is, it "punishes" a defection by defecting the next time, and it rewards cooperation by cooperating the next time.

Robert Axelrod of the University of Michigan, working with Stephanie Forrest, now at the University of New Mexico, decided to find out if the genetic algorithm could discover the tit-for-tat strategy. Applying the genetic algorithm first requires translating possible strategies into strings. One simple way is to base the next response on the outcome of the last three plays. Each iteration has four possible outcomes, and so a sequence of three plays yields 4 x 4 x 4 = 64 possibilities. A 64-bit string contains one gene (or bit position) for each. The first gene, for instance, would be allocated to the case of three consecutive mutual cooperations and the last to three mutual defections. The value of each gene would be either 1 or 0 depending on whether the preferred response to its corresponding history was cooperation or defection. For example, the 64-bit string consisting of all 0's would designate the strategy that defects in all cases. Even for such a simple game, there are 2^64 (approximately 18 quintillion) different strategies.

Axelrod and Forrest supplied the genetic algorithm with a small random collection of strings representing strategies. The fitness of each string was simply the average of the payoffs its strategy received under repeated play. All these strings had low fitnesses because most strategies for playing the Prisoner's Dilemma are not very good. Quickly the genetic algorithm discovered and exploited tit for tat, but further evolution introduced an additional improvement. The new strategy, discovered while the genetic algorithm was already playing at a high level, exploited players that could be "bluffed" (lured into cooperating repeatedly in the face of defection). It reverted to tit for tat, however, when the history indicated the player could not be bluffed.

Biological evolution operates, of course, not to produce a single superindividual but rather to produce interacting species well adapted to one another. (Indeed, in the biological realm there is no such thing as a best individual.) Similarly, the genetic algorithm can be used, with modifications, to govern the evolution not merely of individual rules or strategies but of classifier-system "organisms" composed of many rules. Instead of selecting the fittest rules in isolation, competitive pressures can lead to the evolution of larger systems whose abilities are encoded in the strings that make them up.

Re-creating evolution at this higher level requires several modifications to the original genetic algorithm. Strings still represent condition-action rules, and each rule whose conditions are met generates an action as before. Rating each rule by the number of correct actions it generates, however, will favor the evolution of individual "superrules" instead of finding clusters of rules that interact usefully. To redirect the search toward interacting rules, the procedure is modified by forcing rules to compete for control of the system's actions. Each rule whose conditions are met competes with all other rules whose conditions are met, and the strongest rules determine what the system will do in that given situation. If the system's actions lead to a successful outcome, all the winning rules are strengthened; otherwise they are weakened.

Another way of looking at this method is to consider each rule string as a
hypothesis about the classifier's world. A rule enters the competition only when it "claims" to be relevant to the current situation. Its ability to compete depends on how much of a contribution it has made to solving similar problems. As the genetic algorithm proceeds, strong rules mate and form offspring rules that combine their parents' building blocks. These offspring, which replace the weakest rules, amount to plausible but untried hypotheses.

Competition among rules provides the system with a graceful way of handling perpetual novelty. When a system has strong rules that respond to a particular situation, that is the equivalent of saying that it has certain well-validated hypotheses. Offspring rules, which begin life weaker than do their parents, can win the competition and influence the system's behavior only when there are no strong rules whose conditions are satisfied (in other words, when the system does not know what to do). If their actions help, they survive; if not, they are soon replaced. Thus, the offspring do not interfere with the system's action in well-practiced situations but wait gracefully in the wings as hypotheses about what to do under novel circumstances.

Adding competition in this way strongly affects the evolution of a classifier system. Shortly after the system starts running, it evolves rules with simple conditions, treating a broad range of situations as if they were identical. The system exploits such rules as defaults that specify something to be done in the absence of more detailed information. Because the default rules make only coarse discriminations, however, they are often wrong and so do not grow in strength. As the system gains experience, reproduction and crossover lead to the development of more complex, specific rules that rapidly become strong enough to dictate behavior in particular cases.

Eventually the system develops a hierarchy: layers of exception rules at the lower levels handle most cases, but the default rules at the top level of the hierarchy come into play when none of the detailed rules has enough information to satisfy its conditions. Such default hierarchies bring relevant experience to bear on novel situations while preventing the system from becoming bogged down in overly detailed options.

The same characteristics that make evolving classifier systems adept at handling perpetual novelty also do a good job of handling situations where the payoff for a given action may come only long after the action is taken. The earliest moves of a chess game, for example, may set the stage for later victory or defeat.

To train a classifier system for such long-term goals, a programmer gives the system a payoff each time it completes a task. The credit for success (or the blame for failure) can propagate through the hierarchy to strengthen (or weaken) individual rules even if their actions had only a distant effect on the outcome. Over the course of many generations the system develops rules that act ever earlier to set the stage for later payoffs. It therefore becomes increasingly able to anticipate the consequences of its actions.

Genetic algorithms have now been tested in a wide variety of contexts. David E. Goldberg of the University of Illinois, for example, has developed algorithms that learn to control a gas pipeline system modeled on the one that carries natural gas from the Southwest to the Northeast. The pipeline complex consists of many branches, all carrying various amounts of gas; the only controls available are compressors that increase pressure in a particular branch of the pipeline and valves to regulate the flow of gas to and from storage tanks. Because of the tremendous lag between manipulating valves or compressors and the actual pressure changes in the lines, there is no analytic solution to the problem, and human controllers, like Goldberg's algorithm, must learn by apprenticeship.

Goldberg's system not only met gas demand at costs comparable to those achieved in practice, but it also developed a hierarchy of default rules capable of responding properly to holes punched in the pipeline (as happens all too often in reality at the blade of an errant bulldozer). Lawrence Davis of Tica Associates in Cambridge, Mass., has used similar techniques to design communications networks; his software's goal is to carry the maximum possible amount of data with the minimum number of transmission lines and switches interconnecting them.

A group of researchers at General Electric and Rensselaer Polytechnic In-

The Prisoner's Dilemma

                        PLAYER (B) COOPERATE    PLAYER (B) DEFECT
PLAYER (A) COOPERATE    3/3                     5/0
PLAYER (A) DEFECT       0/5                     0/0

IN PRISONER'S DILEMMA each player can either cooperate or defect and receives a payoff based on the other's choice. If both cooperate, for example, both receive three points. Mutual defection is the safest strategy, but repeated play often leads to cooperation instead.
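The translation of Prisoner's Dilemma strategies into bit strings, with one gene for each possible three-round history (4 x 4 x 4 = 64 histories), can be sketched in Python. Several details are assumptions made for this illustration and are not given in the article: the order of the four outcomes inside a round, reading the history oldest round first, and starting every game as if three rounds of mutual cooperation had already been played. The article fixes only the first gene (three mutual cooperations), the last gene (three mutual defections) and the meaning of 1 and 0.

```python
from itertools import product

COOPERATE, DEFECT = 1, 0  # gene values as in the article: 1 cooperates, 0 defects

# One round seen from a player's own side: (my move, other player's move).
# Assumed order, chosen so that gene 0 is three mutual cooperations
# and gene 63 is three mutual defections.
OUTCOMES = [
    (COOPERATE, COOPERATE), (COOPERATE, DEFECT),
    (DEFECT, COOPERATE), (DEFECT, DEFECT),
]


def history_index(history):
    """Map the last three outcomes (oldest first) to a gene position 0..63."""
    index = 0
    for outcome in history:
        index = index * 4 + OUTCOMES.index(outcome)
    return index


def tit_for_tat():
    """64 genes that repeat whatever the other player did in the latest round."""
    return [history[-1][1] for history in product(OUTCOMES, repeat=3)]


ALWAYS_DEFECT = [DEFECT] * 64  # the all-0's string mentioned in the text


def play(genes_a, genes_b, rounds):
    """Return the list of (move of A, move of B) over repeated play."""
    start = (OUTCOMES[0],) * 3  # assumed prehistory: three mutual cooperations
    hist_a, hist_b = start, start
    moves = []
    for _ in range(rounds):
        move_a = genes_a[history_index(hist_a)]
        move_b = genes_b[history_index(hist_b)]
        moves.append((move_a, move_b))
        hist_a = hist_a[1:] + ((move_a, move_b),)
        hist_b = hist_b[1:] + ((move_b, move_a),)
    return moves
```

With these conventions, tit for tat cooperates once against `ALWAYS_DEFECT` and then defects for the rest of the game, while two tit-for-tat players cooperate throughout. A fitness function for the genetic algorithm would average the payoffs of such games; payoffs are left out of the sketch.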
engineer working alone took about eight weeks to reach a satisfactory design. So-called expert systems, which use inference rules based on experience to predict the effects of a change of one or two variables, can help direct the designer in seeking out useful changes. An engineer using such an expert system took less than a day to design an engine with twice the improvements of the eight-week manual design.

Such expert systems, however, soon get stuck at points where further improvements can be made only by changing many variables simultaneous-

more standard methods.

Although genetic algorithms mimic the effects of natural selection, until now they have operated on a much smaller scale than does biological evolution. My colleagues and I have run classifier systems containing as many as 8,000 rules, but this size is at the low end of viability for natural populations. Large animals that are not endangered may number in the millions, insect populations in the trillions and bacteria in the quintillions or more. These large numbers greatly enhance

ANNEALING. Edited by Lawrence Davis. Morgan Kaufmann, 1987.
GENETIC ALGORITHMS IN SEARCH, OPTIMIZATION, AND MACHINE LEARNING. D. E. Goldberg. Addison-Wesley, 1989.
GENETIC ALGORITHMS: PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE. Edited by Richard Belew and Lashon Booker. Morgan Kaufmann, 1991.
ADAPTATION IN NATURAL AND ARTIFICIAL SYSTEMS. J. H. Holland. MIT Press, 1992.
COMPLEX ADAPTIVE SYSTEMS. J. H. Holland in Daedalus, Vol. 121, No. 1, pages 17-30; Winter 1992.