A Field Guide to Genetic Programming
Riccardo Poli
Department of Computing and Electronic Systems
University of Essex UK
[email protected]
William B. Langdon
Departments of Biological and Mathematical Sciences
University of Essex UK
[email protected]
Nicholas F. McPhee
Division of Science and Mathematics
University of Minnesota, Morris USA
[email protected]
with contributions by
John R. Koza
Stanford University USA
[email protected]
March 2008
© Riccardo Poli, William B. Langdon, and Nicholas F. McPhee, 2008
This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 UK: England & Wales License
(see http://creativecommons.org/licenses/by-nc-nd/2.0/uk/). That
is:
You are free:
to copy, distribute, display, and perform the work
Under the following conditions:
Attribution. You must give the original authors credit.
Non-Commercial. You may not use this work for commercial
purposes.
No Derivative Works. You may not alter, transform, or build
upon this work.
For any reuse or distribution, you must make clear to others the licence
terms of this work. Any of these conditions can be waived if you get
permission from the copyright holders. Nothing in this license impairs
or restricts the authors' rights.
Non-commercial uses are thus permitted without any further authorisation
from the copyright owners. The book may be freely downloaded in electronic
form at http://www.gp-field-guide.org.uk. Printed copies can also
be purchased inexpensively from http://lulu.com. For more information
about Creative Commons licenses, go to http://creativecommons.org
or send a letter to Creative Commons, 171 Second Street, Suite 300, San
Francisco, California, 94105, USA.
To cite this book, please see the entry for (Poli, Langdon, and McPhee,
2008) in the bibliography.
Preface
Genetic programming (GP) is a collection of evolutionary computation techniques that allow computers to solve problems automatically. Since its inception twenty years ago, GP has been used to solve a wide range of practical problems, producing a number of human-competitive results and even
patentable new inventions. Like many other areas of computer science, GP
is evolving rapidly, with new ideas, techniques and applications being constantly proposed. While this shows how wonderfully prolific GP is, it also
makes it difficult for newcomers to become acquainted with the main ideas
in the field, and form a mental map of its different branches. Even for people
who have been interested in GP for a while, it is difficult to keep up with
the pace of new developments.
Many books have been written which describe aspects of GP. Some
provide general introductions to the field as a whole. However, no new
introductory book on GP has been produced in the last decade, and anyone
wanting to learn about GP is forced to map the terrain painfully on their
own. This book attempts to fill that gap, by providing a modern field guide
to GP for both newcomers and old-timers.
It would have been straightforward to find a traditional publisher for such
a book. However, we want our book to be as accessible as possible to everyone interested in learning about GP. Therefore, we have chosen to make it
freely available on-line, while also allowing printed copies to be ordered inexpensively from http://lulu.com. Visit http://www.gp-field-guide.org.uk for the details.
The book has undergone numerous iterations and revisions. It began as
a book-chapter overview of GP (more on this below), which quickly grew
to almost 100 pages. A technical report version of it was circulated on the
GP mailing list. People responded very positively, and some encouraged us
to continue and expand that survey into a book. We took their advice and
this field guide is the result.
Acknowledgements
We would like to thank the University of Essex and the University of Minnesota, Morris, for their support.
Many thanks to Tyler Hutchison for the use of his cool drawing on the
cover (and elsewhere!), and for finding those scary pinks and greens.
We had the invaluable assistance of many people, and we are very grateful
for their individual and collective efforts, often on very short timelines. Rick
Riolo, Matthew Walker, Christian Gagné, Bob McKay, Giovanni Pazienza,
and Lee Spector all provided useful suggestions based on an early technical report version. Yossi Borenstein, Caterina Cinel, Ellery Crane, Cecilia
Di Chio, Stephen Dignum, Edgar Galván-López, Keisha Harriott, David
Hunter, Lonny Johnson, Ahmed Kattan, Robert Keller, Andy Korth, Yevgeniya Kovalchuk, Simon Lucas, Wayne Manselle, Alberto Moraglio, Oliver
Oechsle, Francisco Sepulveda, Elias Tawil, Edward Tsang, William Tozier
and Christian Wagner all contributed to the final proofreading festival.
Their sharp eyes and hard work did much to make the book better; any
remaining errors or omissions are obviously the sole responsibility of the
authors.
We would also like to thank Prof. Xin Yao and the School of Computer
Science of The University of Birmingham and Prof. Bernard Buxton of University College, London, for continuing support, particularly of the genetic
programming bibliography. We also thank Schloss Dagstuhl, where some of
the integration of this book took place.
Most of the tools used in the construction of this book are open source,
and we are very grateful to all the developers whose efforts have gone into
building those tools over the years.
As mentioned above, this book started life as a chapter. This was
for a forthcoming handbook on computational intelligence edited by John
Fulcher and Lakhmi C. Jain. We are grateful to John Fulcher for his useful
comments and edits on that book chapter. We would also like to thank most
warmly John Koza, who co-authored the aforementioned chapter with us and allowed us to reuse some of his original material in this book.
This book is a summary of nearly two decades of intensive research in
the field of genetic programming, and we obviously owe a great debt to all
the researchers whose hard work, ideas, and interactions ultimately made
this book possible. Their work runs through every page, from an idea made
somewhat clearer by a conversation at a conference, to a specific concept
or diagram. It has been a pleasure to be part of the GP community over
the years, and we greatly appreciate having so much interesting work to
summarise!
March 2008

Riccardo Poli
William B. Langdon
Nicholas Freitag McPhee

To Caterina, Ludovico, Rachele and Leonardo (R.P.)
To Susan and Thomas (N.F.M.)
Contents

1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide

I Basics

2 Representation, Initialisation and Operators in Tree-based GP

4.2.2 Fitness Evaluation
4.2.3 Selection, Crossover and Mutation
4.2.4 Termination and Solution Designation

II Advanced Genetic Programming

12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results: the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers: Hyper-heuristics

14 Conclusions

A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources

B TinyGP
B.1 Overview of TinyGP
B.2 Input Data Files for TinyGP
B.3 Source Code
B.4 Compiling and Running TinyGP

Bibliography

Index
Chapter 1
Introduction
The goal of having computers automatically solve problems is central to
artificial intelligence, machine learning, and the broad area encompassed by
what Turing called machine intelligence (Turing, 1948). Machine learning
pioneer Arthur Samuel, in his 1983 talk entitled "AI: Where It Has Been and Where It Is Going" (Samuel, 1983), stated that the main goal of the
fields of machine learning and artificial intelligence is:
to get machines to exhibit behaviour, which if done by humans,
would be assumed to involve the use of intelligence.
Genetic programming (GP) is an evolutionary computation (EC) technique that automatically solves problems without requiring the user to know
or specify the form or structure of the solution in advance. At the most
abstract level GP is a systematic, domain-independent method for getting
computers to solve problems automatically starting from a high-level statement of what needs to be done.
Since its inception, GP has attracted the interest of myriads of people
around the globe. This book gives an overview of the basics of GP, summarises important work that gave direction and impetus to the field, and
discusses some interesting new directions and applications. Things continue
to change rapidly in genetic programming as investigators and practitioners
discover new methods and applications. This makes it impossible to cover
all aspects of GP, and this book should be seen as a snapshot of a particular
moment in the history of the field.
Figure 1.1: The basic control flow for genetic programming, where survival
of the fittest is used to find solutions.
1.1 Genetic Programming in a Nutshell

1.2 Getting Started
The best way to begin is obviously by reading this book, so you're off to
a good start. We included a wide variety of references to help guide people
through at least some of the literature. No single work, however, could claim
to be completely comprehensive. Thus Appendix A reviews a whole host of
books, videos, journals, conferences, and on-line sources (including several
freely available GP systems) that should be of assistance.
We strongly encourage doing GP as well as reading about it; the dynamics of evolutionary algorithms are complex, and the experience of tracing through runs is invaluable. In Appendix B we provide the full Java
implementation of Riccardo's TinyGP system.
1.3 Prerequisites
Although this book has been written with beginners in mind, unavoidably
we had to make some assumptions about the typical background of our
readers. The book assumes some working knowledge of computer science
and computer programming; this is probably an essential prerequisite to get
the most from the book.
We don't expect that readers will have been exposed to other flavours of
evolutionary algorithms before, although a little background might be useful.
The interested novice can easily find additional information on evolutionary
computation thanks to the plethora of tutorials available on the Internet.
Articles from Wikipedia and the genetic algorithm tutorial produced by
Whitley (1994) should suffice.
1.4 Overview of this Field Guide

As we indicated in the section entitled "What's in this book" (page v), the book is divided up into four parts. In this section, we will have a closer look at their content.
Part I is mainly for the benefit of beginners, so notions are introduced
at a relaxed pace. In the next chapter we provide a description of the key
elements in GP. These include how programs are stored (Section 2.1), the
initialisation of the population (Section 2.2), the selection of individuals
(Section 2.3) and the genetic operations of crossover and mutation (Section 2.4). A discussion of the decisions that are needed before running GP
is given in Chapter 3. These preparatory steps include the specification of
the set of instructions that GP can use to construct programs (Sections 3.1
and 3.2), the definition of a fitness measure that can guide GP towards
good solutions (Section 3.3), setting GP parameters (Section 3.4) and, finally, the rule used to decide when to stop a GP run (Section 3.5). To help
the reader understand these, Chapter 4 presents a step-by-step application
of the preparatory steps (Section 4.1) and a detailed explanation of a sample
GP run (Section 4.2).
After these introductory chapters, we go up a gear in Part II where
we describe a variety of more advanced GP techniques. Chapter 5 considers additional initialisation strategies and genetic operators for the main GP representation: syntax trees. In Chapter 6 we look at techniques for the evolution of structured and grammatically-constrained programs. In particular,
we consider: modular and hierarchical structures including automatically defined functions and architecture-altering operations (Section 6.1), systems
that constrain the syntax of evolved programs using grammars or type systems (Section 6.2), and developmental GP (Section 6.3). In Chapter 7 we
discuss alternative program representations, namely linear GP (Section 7.1)
and graph-based GP (Section 7.2).
In Chapter 8 we review systems where, instead of using mutation and
recombination to create new programs, they are simply generated randomly
according to a probability distribution which itself evolves. These are known
as estimation of distribution algorithms, cf. Sections 8.1 and 8.2. Section 8.3
reviews hybrids between GP and probabilistic grammars, where probability
distributions are associated with the elements of a grammar.
Many, if not most, real-world problems are multi-objective, in the sense
that their solutions are required to satisfy more than one criterion at the
same time. In Chapter 9, we review different techniques that allow GP to
solve multi-objective problems. These include the aggregation of multiple
objectives into a scalar fitness measure (Section 9.1), the use of the notion of
Pareto dominance (Section 9.2), the definition of dynamic or staged fitness
functions (Section 9.3), and the reliance on special biases on the genetic
operators to aid the optimisation of multiple objectives (Section 9.4).
A variety of methods to speed up, parallelise and distribute genetic programming runs are described in Chapter 10. We start by looking at ways
to reduce the number of fitness evaluations or increase their effectiveness
(Section 10.1) and ways to speed up their execution (Section 10.2). We
then point out (Section 10.3) that faster evaluation is not the only reason
for running GP in parallel, as geographic distribution has advantages in
its own right. In Section 10.4, we consider the first approach and describe
master-slave parallel architectures (Section 10.4.1), running GP on graphics
hardware (Section 10.4.2) and FPGAs (Section 10.4.3), and a fast method to
exploit the parallelism available on every computer (Section 10.4.4). Finally,
Section 10.5 looks at the second approach, discussing the geographically distributed evolution of programs. We then give an overview of some of the considerable work that has been done on GP's theory and its practical uses
(Chapter 11).
After this review of techniques, Part III provides information for people interested in using GP in practical applications. We survey the enormous variety of applications of GP in Chapter 12. We start with a discussion of the general kinds of problems where GP has proved successful
(Section 12.1) and then describe a variety of GP applications, including:
curve fitting, data modelling and symbolic regression (Section 12.2); human
competitive results (Section 12.3); image analysis and signal processing (Section 12.4); financial trading, time series prediction and economic modelling
(Section 12.5); industrial process control (Section 12.6); medicine, biology
and bioinformatics (Section 12.7); the evolution of search algorithms and
optimisers (Section 12.8); computer games and entertainment applications
(Section 12.9); artistic applications (12.10); and GP-based data compression
(Section 12.11). This is followed by a chapter providing a collection of troubleshooting techniques used by experienced GP practitioners (Chapter 13)
and by our conclusions (Chapter 14).
In Part IV, we provide a resources appendix that reviews the many
sources of further information on GP, on its applications, and on related
problem solving systems (Appendix A). This is followed by a description
and the source code for a simple GP system in Java (Appendix B). The
results of a sample run with the system are also described in the appendix
and further illustrated via a Flip-O-Rama animation2 (see Section B.4).
The book ends with a large bibliography containing around 650 references. Of these, around 420 contain pointers to on-line versions of the corresponding papers. While this is very useful on its own, the users of the PDF
version of this book will be able to do more if they use a PDF viewer that
supports hyperlinks: they will be able to click on the URLs and retrieve the
cited articles. Around 550 of the papers in the bibliography are included in the genetic programming bibliography, available at http://www.cs.bham.ac.uk/~wbl/biblio/.

2 This is in the footer of the odd-numbered pages in the bibliography and in the index.
Part I
Basics
Chapter 2
Representation, Initialisation and Operators in Tree-based GP
This chapter introduces the basic tools and terminology used in genetic
programming. In particular, it looks at how trial solutions are represented in
most GP systems (Section 2.1), how one might construct the initial random
population (Section 2.2), and how selection (Section 2.3) as well as crossover
and mutation (Section 2.4) are used to construct new programs.
2.1 Representation
In GP, programs are usually expressed as syntax trees rather than as lines of
code. For example Figure 2.1 shows the tree representation of the program
max(x+x,x+3*y). The variables and constants in the program (x, y and 3)
are leaves of the tree. In GP they are called terminals, whilst the arithmetic
operations (+, * and max) are internal nodes called functions. The sets of
allowed functions and terminals together form the primitive set of a GP
system.
In more advanced forms of GP, programs can be composed of multiple
components (e.g., subroutines). In this case the representation used in GP
is a set of trees (one for each component) grouped together under a special
root node that acts as glue, as illustrated in Figure 2.2. We will call these
(sub)trees branches. The number and type of the branches in a program,
together with certain other features of their structure, form the architecture
of the program. This is discussed in more detail in Section 6.1.
It is common in the GP literature to represent expressions in a prefix notation similar to that used in Lisp or Scheme. For example, max(x+x,x+3*y)
becomes (max (+ x x) (+ x (* 3 y))). This notation often makes it easier to see the relationship between (sub)expressions and their corresponding
(sub)trees. Therefore, in the following, we will use trees and their corresponding prefix-notation expressions interchangeably.
How one implements GP trees will obviously depend a great deal on
the programming languages and libraries being used. Languages that provide automatic garbage collection and dynamic lists as fundamental data
types make it easier to implement expression trees and the necessary GP
operations. Most traditional languages used in AI research (e.g., Lisp and
Prolog), many recent languages (e.g., Ruby and Python), and the languages
associated with several scientific programming tools (e.g., MATLAB1 and
Mathematica2 ) have these facilities. In other languages, one may have to
implement lists/trees or use libraries that provide such data structures.
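To make this concrete, the following is a minimal sketch (in Java, the language used for TinyGP in Appendix B) of one way a syntax tree might be stored and printed in the prefix notation used above; the class and method names are ours, not part of any particular GP system.

    // Minimal sketch of a GP syntax tree node. A node holds a primitive name
    // (a function such as "+" or a terminal such as "x") and its children.
    class Node {
        final String primitive;
        final Node[] children;            // empty for terminals

        Node(String primitive, Node... children) {
            this.primitive = primitive;
            this.children = children;
        }

        // Render the tree in the Lisp-like prefix notation used in the text,
        // e.g. (max (+ x x) (+ x (* 3 y))).
        String toPrefix() {
            if (children.length == 0) return primitive;
            StringBuilder sb = new StringBuilder("(" + primitive);
            for (Node c : children) sb.append(" ").append(c.toPrefix());
            return sb.append(")").toString();
        }

        public static void main(String[] args) {
            Node tree = new Node("max",
                    new Node("+", new Node("x"), new Node("x")),
                    new Node("+", new Node("x"),
                            new Node("*", new Node("3"), new Node("y"))));
            System.out.println(tree.toPrefix());
        }
    }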
In high performance environments, the tree-based representation of programs may be too inefficient since it requires the storage and management
of numerous pointers. In some cases, it may be desirable to use GP primitives which accept a variable number of arguments (a quantity we will call
arity). An example is the sequencing instruction progn, which accepts any
number of arguments, executes them one at a time and then returns the value of its last argument.
Figure 2.1: GP syntax tree representing max(x+x, x+3*y).
Figure 2.2: A multi-component program representation: several branches (one per component) grouped under a special root node.
2.2 Initialising the Population
Figure 2.3: Creation of a full tree having maximum depth 2 using the full
initialisation method (t = time).
Here we will describe two of the simplest (and earliest) methods (the full and grow methods), and a widely used combination of the two known as ramped half-and-half.
In both the full and grow methods, the initial individuals are generated
so that they do not exceed a user specified maximum depth. The depth of
a node is the number of edges that need to be traversed to reach the node
starting from the trees root node (which is assumed to be at depth 0). The
depth of a tree is the depth of its deepest leaf (e.g., the tree in Figure 2.1 has
a depth of 3). In the full method (so named because it generates full trees,
i.e. all leaves are at the same depth) nodes are taken at random from the
function set until the maximum tree depth is reached. (Beyond that depth,
only terminals can be chosen.) Figure 2.3 shows a series of snapshots of the
construction of a full tree of depth 2. The children of the * and / nodes
must be leaves or otherwise the tree would be too deep. Thus, at steps
t = 3, t = 4, t = 6 and t = 7 a terminal must be chosen (x, y, 1 and 0,
respectively).
Although the full method generates trees where all the leaves are at
the same depth, this does not necessarily mean that all initial trees will
have an identical number of nodes (often referred to as the size of a tree)
or the same shape. This only happens, in fact, when all the functions in
the primitive set have an equal arity. Nonetheless, even when mixed-arity
primitive sets are used, the range of program sizes and shapes produced by
the full method may be rather limited. The grow method, on the contrary,
allows for the creation of trees of more varied sizes and shapes. Nodes are
selected from the whole primitive set (i.e., functions and terminals) until
the depth limit is reached. Once the depth limit is reached only terminals
Figure 2.4: Creation of a five node tree using the grow initialisation method
with a maximum depth of 2 (t = time). A terminal is chosen at t = 2,
causing the left branch of the root to be closed at that point even though
the maximum depth had not been reached.
may be chosen (just as in the full method). Figure 2.4 illustrates this
process for the construction of a tree with depth limit 2. Here the first
argument of the + root node happens to be a terminal. This closes off that
branch preventing it from growing any more before it reached the depth
limit. The other argument is a function (-), but its arguments are forced
to be terminals to ensure that the resulting tree does not exceed the depth
limit. Pseudocode for a recursive implementation of both the full and grow
methods is given in Algorithm 2.1.
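Algorithm 2.1 is not reproduced here, but the following sketch illustrates how the full and grow methods (and the ramped half-and-half combination described next) might be coded for binary functions; the primitive sets and depth range are illustrative assumptions, not values prescribed by the book.

    import java.util.List;
    import java.util.Random;

    // Sketch of the full and grow initialisation methods for prefix-notation
    // trees. Functions are assumed to be binary for simplicity.
    public class TreeInit {
        static final List<String> FUNCTIONS = List.of("+", "-", "*", "/");
        static final List<String> TERMINALS = List.of("x", "y", "1", "0");
        static final Random RNG = new Random();

        static String pick(List<String> set) {
            return set.get(RNG.nextInt(set.size()));
        }

        // Full method: choose functions until the maximum depth is reached,
        // then choose only terminals, so every leaf sits at maxDepth.
        static String full(int depth, int maxDepth) {
            if (depth == maxDepth) return pick(TERMINALS);
            return "(" + pick(FUNCTIONS) + " "
                    + full(depth + 1, maxDepth) + " "
                    + full(depth + 1, maxDepth) + ")";
        }

        // Grow method: choose from the whole primitive set until the depth
        // limit, so branches may close early when a terminal is picked.
        static String grow(int depth, int maxDepth) {
            boolean chooseTerminal = depth == maxDepth
                    || RNG.nextInt(FUNCTIONS.size() + TERMINALS.size()) < TERMINALS.size();
            if (chooseTerminal) return pick(TERMINALS);
            return "(" + pick(FUNCTIONS) + " "
                    + grow(depth + 1, maxDepth) + " "
                    + grow(depth + 1, maxDepth) + ")";
        }

        // Ramped half-and-half: half the trees via full, half via grow,
        // with depth limits ramped over a range (here 2 to 6).
        public static void main(String[] args) {
            int popSize = 10;
            for (int i = 0; i < popSize; i++) {
                int maxDepth = 2 + i % 5;                  // ramped depths 2..6
                String tree = (i % 2 == 0) ? full(0, maxDepth) : grow(0, maxDepth);
                System.out.println(tree);
            }
        }
    }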
Because neither the grow nor the full method provides a very wide array of
sizes or shapes on their own, Koza (1992) proposed a combination called
ramped half-and-half. Half the initial population is constructed using full
and half is constructed using grow. This is done using a range of depth limits
(hence the term ramped) to help ensure that we generate trees having a
variety of sizes and shapes.
While these methods are easy to implement and use, they often make it
difficult to control the statistical distributions of important properties such
as the sizes and shapes of the generated trees. For example, the sizes and
shapes of the trees generated via the grow method are highly sensitive to the
sizes of the function and terminal sets. If, for example, one has significantly
more terminals than functions, the grow method will almost always generate
very short trees regardless of the depth limit. Similarly, if the number of
functions is considerably greater than the number of terminals, then the
grow method will behave quite similarly to the full method. The arities
of the functions in the primitive set also influence the size and shape of the
2.3 Selection

In tournament selection, a number of individuals are chosen at random from the population. These are compared with each other and the best of
them is chosen to be the parent. When doing crossover, two parents are
needed and, so, two selection tournaments are made. Note that tournament selection only looks at which program is better than another. It does
not need to know how much better. This effectively automatically rescales
fitness, so that the selection pressure4 on the population remains constant.
Thus, a single extraordinarily good program cannot immediately swamp the
next generation with its children; if it did, this would lead to a rapid loss
of diversity with potentially disastrous consequences for a run. Conversely,
tournament selection amplifies small differences in fitness to prefer the better program even if it is only marginally superior to the other individuals in
a tournament.
An element of noise is inherent in tournament selection due to the random selection of candidates for tournaments. So, while preferring the best,
tournament selection does ensure that even average-quality programs have
some chance of having children. Since tournament selection is easy to implement and provides automatic fitness rescaling, it is commonly used in GP.
Considering that selection has been described many times in the evolutionary algorithms literature, we will not provide details of the numerous
other mechanisms that have been proposed. Goldberg (1989), for example,
describes fitness-proportionate selection, stochastic universal sampling and
several others.
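As an illustration, a minimal sketch of tournament selection over an array of fitness values is given below; the tournament size of two and the convention that lower (error-based) fitness is better are assumptions of this example, not requirements.

    import java.util.Random;

    // Sketch of tournament selection: pick `tournamentSize` individuals at
    // random and return the index of the best (here, lowest-error) one.
    public class TournamentSelection {
        static final Random RNG = new Random();

        static int select(double[] fitness, int tournamentSize) {
            int best = RNG.nextInt(fitness.length);
            for (int i = 1; i < tournamentSize; i++) {
                int candidate = RNG.nextInt(fitness.length);
                if (fitness[candidate] < fitness[best]) best = candidate; // lower error wins
            }
            return best;
        }

        public static void main(String[] args) {
            double[] errors = {7.7, 11.0, 17.98, 28.7};   // example error values for a population of four
            int parent = select(errors, 2);
            System.out.println("Selected parent index: " + parent);
        }
    }

Note that only comparisons between candidates are made, which is what gives tournament selection the automatic rescaling of fitness discussed above.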
2.4
GP departs significantly from other evolutionary algorithms in the implementation of the operators of crossover and mutation. The most commonly
used form of crossover is subtree crossover. Given two parents, subtree
crossover randomly (and independently) selects a crossover point (a node)
in each parent tree. Then, it creates the offspring by replacing the subtree
rooted at the crossover point in a copy of the first parent with a copy of
the subtree rooted at the crossover point in the second parent, as illustrated
in Figure 2.5. Copies are used to avoid disrupting the original individuals.
This way, if selected multiple times, they can take part in the creation of
multiple offspring programs. Note that it is also possible to define a version
of crossover that returns two offspring, but this is not commonly used.
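A sketch of subtree crossover along these lines is shown below; it selects crossover points uniformly at random (whereas, as discussed next, practical systems often bias this choice), and all names are illustrative.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Sketch of subtree crossover: copy the first parent, then replace the
    // subtree at a random crossover point with a copy of a random subtree
    // from the second parent.
    public class SubtreeCrossover {
        static final Random RNG = new Random();

        static class Node {
            String primitive;
            List<Node> children = new ArrayList<>();
            Node(String primitive, Node... kids) {
                this.primitive = primitive;
                for (Node k : kids) children.add(k);
            }
            Node copy() {
                Node n = new Node(primitive);
                for (Node c : children) n.children.add(c.copy());
                return n;
            }
            void collect(List<Node> out) {            // gather all nodes in the tree
                out.add(this);
                for (Node c : children) c.collect(out);
            }
            public String toString() {
                if (children.isEmpty()) return primitive;
                StringBuilder sb = new StringBuilder("(" + primitive);
                for (Node c : children) sb.append(" ").append(c);
                return sb.append(")").toString();
            }
        }

        static Node randomNode(Node tree) {
            List<Node> all = new ArrayList<>();
            tree.collect(all);
            return all.get(RNG.nextInt(all.size()));
        }

        static Node crossover(Node parent1, Node parent2) {
            Node offspring = parent1.copy();            // work on copies, parents untouched
            Node point = randomNode(offspring);
            Node donor = randomNode(parent2).copy();
            point.primitive = donor.primitive;          // splice the donated subtree in place
            point.children = donor.children;
            return offspring;
        }

        public static void main(String[] args) {
            Node p1 = new Node("+", new Node("+", new Node("x"), new Node("y")), new Node("3"));
            Node p2 = new Node("*", new Node("+", new Node("y"), new Node("1")),
                                    new Node("/", new Node("x"), new Node("2")));
            System.out.println(crossover(p1, p2));      // one possible result: (+ (/ x 2) 3)
        }
    }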
Often crossover points are not selected with uniform probability. Typical
GP primitive sets lead to trees with an average branching factor (the number of children of each node) of at least two, so the majority of the nodes
will be leaves. Consequently, the uniform selection of crossover points leads to crossovers that frequently exchange only very small amounts of genetic material, since many of the selected points are leaves.
4 A key property of any selection mechanism is selection pressure. A system with a
strong selection pressure very highly favours the more fit individuals, while a system with
a weak selection pressure isn't so discriminating.
Figure 2.5: Example of subtree crossover. Note that the trees on the left
are actually copies of the parents. So, their genetic material can freely be
used without altering the original individuals.
Figure: Example of subtree mutation, in which a randomly generated subtree replaces the subtree rooted at a randomly chosen mutation point in the parent to produce the offspring.
Chapter 3
Getting Ready to Run Genetic Programming

3.1 Step 1: Terminal Set
While it is common to describe GP as evolving programs, GP is not typically used to evolve programs in the familiar Turing-complete languages
humans normally use for software development. It is instead more common to evolve programs (or expressions or formulae) in a more constrained
and often domain-specific language. The first two preparatory steps, the
definition of the terminal and function sets, specify such a language. That
is, together they define the ingredients that are available to GP to create
computer programs.
3.2 Step 2: Function Set
The function set used in GP is typically driven by the nature of the problem
domain. In a simple numeric problem, for example, the function set may
consist of merely the arithmetic functions (+, -, *, /). However, all sorts
of other functions and constructs typically encountered in computer programs can be used. Table 3.1 shows a sample of some of the functions one
sees in the GP literature. Sometimes the primitive set includes specialised
functions and terminals which are designed to solve problems in a specific
problem domain. For example, if the goal is to program a robot to mop the
floor, then the function set might include such actions as move, turn, and
swish-the-mop.
3.2.1 Closure
For GP to work effectively, most function sets are required to have an important property known as closure (Koza, 1992), which can in turn be broken
down into the properties of type consistency and evaluation safety.
Type consistency is required because subtree crossover (as described in
Section 2.4) can mix and join nodes arbitrarily. As a result it is necessary
that any subtree can be used in any of the argument positions for every function in the function set, because it is always possible that subtree crossover
will generate that combination. It is thus common to require that all the
functions be type consistent, i.e., they all return values of the same type,
and that each of their arguments also have this type. For example +, -, *,
and / can be defined so that they each take two integer arguments and
return an integer. Sometimes type consistency can be weakened somewhat
by providing an automatic conversion mechanism between types. We can,
for example, convert numbers to Booleans by treating all negative values as
false, and non-negative values as true. However, conversion mechanisms can
introduce unexpected biases into the search process, so they should be used
with care.
The type consistency requirement can seem quite limiting but often simple restructuring of the functions can resolve apparent problems. For example, an if function is often defined as taking three arguments: the test, the
value to return if the test evaluates to true and the value to return if the
test evaluates to false. The first of these three arguments is clearly Boolean,
which would suggest that if can't be used with numeric functions like +.
1 The decision to return the value 1 provides the GP system with a simple way to
generate the constant 1, via an expression of the form (% x x). This combined with a
similar mechanism for generating 0 via (- x x) ensures that GP can easily construct
these two important constants.
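The footnote's (% x x) expression relies on protected division, a common way of ensuring evaluation safety: when the divisor is zero, a fixed value (here 1, matching the footnote) is returned instead of raising an error. A minimal sketch of this idea, with illustrative names:

    // Sketch of an evaluation-safe ("protected") primitive. Protected division
    // returns 1 when the divisor is zero, so (% x x) always evaluates to 1 and
    // (- x x) to 0, as noted in the footnote above.
    public class ProtectedOps {
        static double protectedDiv(double a, double b) {
            return b == 0.0 ? 1.0 : a / b;
        }

        public static void main(String[] args) {
            double x = 0.0;
            System.out.println(protectedDiv(x, x));   // 1.0, even though x is 0
            System.out.println(x - x);                // 0.0
        }
    }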
3.2.2 Sufficiency
There is one more property that primitive sets should have: sufficiency.
Sufficiency means it is possible to express a solution to the problem at hand
using the elements of the primitive set.2 Unfortunately, sufficiency can be
guaranteed only for those problems where theory, or experience with other
methods, tells us that a solution can be obtained by combining the elements
of the primitive set.
As an example of a sufficient primitive set consider {AND, OR, NOT, x1, x2,
..., xN}. It is always sufficient for Boolean induction problems, since it can
produce all Boolean functions of the variables x1, x2, ..., xN. An example
of an insufficient set is {+, -, *, /, x, 0, 1, 2}, which is unable to represent
transcendental functions. The function exp(x), for example, is transcendental and therefore cannot be expressed as a rational function (basically, a
ratio of polynomials), and so cannot be represented exactly by any combination of {+, -, *, /, x, 0, 1, 2}. When a primitive set is insufficient, GP
can only develop programs that approximate the desired one. However, in
many cases such an approximation can be very close and good enough for
the users purpose. Adding a few unnecessary primitives in an attempt to
ensure sufficiency does not tend to slow down GP overmuch, although there
are cases where it can bias the system in unexpected ways.
3.2.3 Evolving Structures Other Than Programs
There are many problems where solutions cannot be directly cast as computer programs. For example, in many design problems the solution is an
artifact of some type: a bridge, a circuit, an antenna, a lens, etc. GP has
been applied to problems of this kind by using a trick: the primitive set is set
up so that the evolved programs construct solutions to the problem. This is
analogous to the process by which an egg grows into a chicken. For example,
if the goal is the automatic creation of an electronic controller for a plant,
the function set might include common components such as integrator,
differentiator, lead, lag, and gain, and the terminal set might contain
reference, signal, and plant output. Each of these primitives, when
executed, inserts the corresponding device into the controller being built.
If, on the other hand, the goal is to synthesise analogue electrical circuits,
the function set might include components such as transistors, capacitors,
resistors, etc. See Section 6.3 for more information on developmental GP
systems.
2 More formally, the primitive set is sufficient if the set of all the possible recursive
compositions of primitives includes at least one solution.
3.3 Step 3: Fitness Function
The first two preparatory steps define the primitive set for GP, and therefore
indirectly define the search space GP will explore. This includes all the
programs that can be constructed by composing the primitives in all possible
ways. However, at this stage, we still do not know which elements or regions
of this search space are good, i.e., which regions of the search space include
programs that solve, or approximately solve, the problem. This is the task
of the fitness measure, which is our primary (and often sole) mechanism
for giving a high-level statement of the problems requirements to the GP
system. For example, suppose the goal is to get GP to synthesise an amplifier
automatically. Then the fitness function is the mechanism which tells GP
to synthesise a circuit that amplifies an incoming signal. (As opposed to
evolving a circuit that suppresses the low frequencies of an incoming signal,
or computes its square root, etc.)
Fitness can be measured in many ways. For example, in terms of: the
amount of error between its output and the desired output; the amount
of time (fuel, money, etc.) required to bring a system to a desired target
state; the accuracy of the program in recognising patterns or classifying
objects; the payoff that a game-playing program produces; the compliance
of a structure with user-specified design criteria.
There is something unusual about the fitness functions used in GP that
differentiates them from those used in most other evolutionary algorithms.
Because the structures being evolved in GP are computer programs, fitness
evaluation normally requires executing all the programs in the population,
typically multiple times. While one can compile the GP programs that make
up the population, the overhead of building a compiler is usually substantial,
so it is much more common to use an interpreter to evaluate the evolved
programs.
Interpreting a program tree means executing the nodes in the tree in
an order that guarantees that nodes are not executed before the value of
their arguments (if any) is known. This is usually done by traversing the
tree recursively starting from the root node, and postponing the evaluation
of each node until the values of its children (arguments) are known. Other
orders, such as going from the leaves to the root, are possible. If none
of the primitives have side effects, the two orders are equivalent.3 This
depth-first recursive process is illustrated in Figure 3.1. Algorithm 3.1 gives
a pseudocode implementation of the interpretation procedure. The code
assumes that programs are represented as prefix-notation expressions and
that such expressions can be treated as lists of components.
3 Functional operations like addition don't depend on the order in which their arguments are evaluated. The order of side-effecting operations such as moving or turning a
robot, however, is obviously crucial.
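Algorithm 3.1 is not reproduced here; the sketch below shows a depth-first interpreter in the spirit just described, treating a prefix-notation expression as a flat list of tokens. The token format and the use of protected division are assumptions of this example.

    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.Map;

    // Sketch of a depth-first interpreter for prefix-notation expressions such
    // as (+ (* x x) x). Each function is evaluated only after its arguments
    // have been evaluated recursively.
    public class Interpreter {
        static double eval(Iterator<String> tokens, Map<String, Double> vars) {
            String tok = tokens.next();
            if (!tok.equals("(")) {                       // a terminal: variable or constant
                Double v = vars.get(tok);
                return v != null ? v : Double.parseDouble(tok);
            }
            String op = tokens.next();                    // a function application
            double a = eval(tokens, vars);                // evaluate the arguments first
            double b = eval(tokens, vars);
            tokens.next();                                // consume the closing ")"
            switch (op) {
                case "+": return a + b;
                case "-": return a - b;
                case "*": return a * b;
                case "/": return b == 0.0 ? 1.0 : a / b;  // protected division
                default: throw new IllegalArgumentException("unknown function " + op);
            }
        }

        public static void main(String[] args) {
            // x*x + x + 1 evaluated at x = -1, as in the example run of Chapter 4.
            String expr = "( + ( + ( * x x ) x ) 1 )";
            Iterator<String> tokens = Arrays.asList(expr.split("\\s+")).iterator();
            System.out.println(eval(tokens, Map.of("x", -1.0)));   // prints 1.0
        }
    }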
Figure 3.1: Example of the depth-first interpretation of a syntax tree, evaluated for x = -1.
3.4 Step 4: GP Parameters
The fourth preparatory step specifies the control parameters for the run.
The most important control parameter is the population size. Other control
parameters include the probabilities of performing the genetic operations, the
maximum size for programs and other details of the run.
It is impossible to make general recommendations for setting optimal
parameter values, as these depend too much on the details of the application.
However, genetic programming is in practice robust, and it is likely that
many different parameter values will work. As a consequence, one need not
typically spend a long time tuning GP for it to work adequately.
It is common to create the initial population randomly using ramped
half-and-half (Section 2.2) with a depth range of 2 to 6. The initial tree sizes
will depend upon the number of the functions, the number of terminals
and the arities of the functions. However, evolution will quickly move the
population away from its initial distribution.
Traditionally, 90% of children are created by subtree crossover. However, the use of a 50-50 mixture of crossover and a variety of mutations (cf.
Chapter 5) also appears to work well.
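By way of illustration, the rules of thumb above might be collected into a simple parameter record like the following; apart from the values quoted in the text (ramped half-and-half over depths 2 to 6, 90% crossover, a population of at least 500), the fields and numbers are assumptions, not recommendations from this book.

    // Illustrative container for typical GP control parameters, following the
    // rules of thumb in this section. Values are examples, not prescriptions.
    public class GpParameters {
        int populationSize = 500;        // "at least 500"; often much larger
        int minInitialDepth = 2;         // ramped half-and-half depth range 2..6
        int maxInitialDepth = 6;
        double crossoverRate = 0.9;      // traditionally 90% of children by subtree crossover
        double mutationRate = 0.1;       // remainder for mutation/reproduction (assumption)
        int maxGenerations = 50;         // example run-length limit (assumption)

        public static void main(String[] args) {
            GpParameters p = new GpParameters();
            System.out.println("population=" + p.populationSize + ", crossover=" + p.crossoverRate);
        }
    }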
In many cases, the main limitation on the population size is the time taken to evaluate the fitnesses, not the space required to store the individuals. As a rule one prefers to have the largest population size that your system can handle gracefully; normally, the population size should be at least 500, and people often use much larger populations.4

4 There are, however, GP systems that frequently use much smaller populations. These typically rely more on mutation than crossover for their primary search mechanism.

3.5 Step 5: Termination and Solution Designation

5 Training data refers to the test cases used to evaluate the fitness of the evolved individuals.
Chapter 4
Example Genetic Programming Run
This chapter provides an illustrative run of GP in which the goal is to
automatically create a program with a target input/output behaviour. In
particular, we want to evolve an expression whose values match those of
the quadratic polynomial x^2 + x + 1 in the range [-1, +1]. The process of mechanically creating a computer program that fits certain numerical data
is sometimes called system identification or symbolic regression (see Section 12.2 for more).
We begin with the five preparatory steps from the previous chapter and
then describe in detail the events in one run.
4.1 Preparatory Steps
The purpose of the first two preparatory steps is to specify the ingredients
the evolutionary process can use to construct potential solutions. Because
the problem is to find a mathematical function of one independent variable,
x, the terminal set (the inputs of the to-be-evolved programs) must include
this variable. The terminal set also includes ephemeral random constants
drawn from some reasonable range, say from -5.0 to +5.0, as described in Section 3.1.
The crossover operation will be used twice (each time generating one individual), which corresponds to a crossover rate of 50%, while the mutation
and reproduction operations will each be used to generate one individual.
These are therefore applied with a rate of 25% each. For simplicity, the
architecture-altering operations are not used for this problem.
In the fifth and final step we need to specify a termination condition. A
reasonable termination criterion for this problem is that the run will continue
from generation to generation until the fitness (or error) of some individual
is less than 0.1. In this contrived example, our example run will (atypically)
yield an algebraically perfect solution with a fitness of zero after just one
generation.
4.2 Step-by-Step Sample Run

Now that we have performed the five preparatory steps, the run of GP can be launched. The GP setup is summarised in Table 4.1.
4.2.1 Initialisation
Figure 4.1: The initial population of four randomly created individuals of generation 0.
4.2.2 Fitness Evaluation

4.2.3 Selection, Crossover and Mutation
Figure 4.2: Graphs of the evolved functions from generation 0. The solid
line in each plot is the target function x^2 + x + 1, with the dashed line
being the evolved functions from the first generation (see Figure 4.1). The
fitness of each of the four randomly created individuals of generation 0 is
approximately proportional to the area between two curves, with the actual
fitness values being 7.7, 11.0, 17.98 and 28.7 for individuals (a) through (d),
respectively.
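A common concrete realisation of this error-based fitness, sketched below, is the sum of absolute differences between the evolved expression and the target x^2 + x + 1 over a fixed set of equidistant fitness cases in [-1, +1]; the number of fitness cases used here (21) is an assumption of the sketch, not a value taken from the run described above.

    import java.util.function.DoubleUnaryOperator;

    // Sketch of an error-based fitness measure for the symbolic regression
    // example: sum of absolute differences between an evolved expression and
    // the target x^2 + x + 1, sampled at equidistant points in [-1, +1].
    public class RegressionFitness {
        static double fitness(DoubleUnaryOperator candidate, int fitnessCases) {
            double sum = 0.0;
            for (int i = 0; i < fitnessCases; i++) {
                double x = -1.0 + 2.0 * i / (fitnessCases - 1);
                double target = x * x + x + 1;
                sum += Math.abs(candidate.applyAsDouble(x) - target);
            }
            return sum;                  // lower is better; 0 means a perfect fit
        }

        public static void main(String[] args) {
            System.out.println(fitness(x -> x * x + x + 1, 21));  // 0.0: the exact solution
            System.out.println(fitness(x -> x + 1, 21));          // non-zero error for x + 1
        }
    }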
Figure 4.3: Population of generation 1 (after one reproduction, one mutation, and two one-offspring crossover operations).
4.2.4 Termination and Solution Designation
Because the fitness of the individual in Figure 4.3d is below 0.1, the termination criterion for the run is satisfied and the run is automatically terminated.
This best-so-far individual (Figure 4.3d) is then designated as the result of
the run.
Note that the best-of-run individual (Figure 4.3d) incorporates a good
trait (the quadratic term x^2) from the first parent (Figure 4.1b) with two
other good traits (the linear term x and constant term of 1) from the second
parent (Figure 4.1a). The crossover operation thus produced a solution to
this problem by recombining good traits from these two relatively fit parents
into a superior (indeed, perfect) offspring.
This is, obviously, a highly simplified example, and the dynamics of a
real GP run are typically far more complex than what is presented here.
Also, in general, there is no guarantee that an exact solution like this will
be found by GP.
Part II
Advanced Genetic Programming
Chapter 5
Alternative Initialisations and Operators in Tree-based GP
The genetic programming system described in the preceding chapters is just
the beginning; in many ways it is the simplest thing that could possibly
work. Most of the techniques described in Part I date back to the late
1980s and early 1990s; a wide array of alternatives and extensions have
been explored since. A full catalogue of these would be far beyond the
scope of this book. The chapters in Part II survey a number of the more
prominent or historically important extensions to GP, particularly (but not
exclusively) in relation to the tree-based representation for programs.
We start, in this chapter, by reviewing a variety of initialisation strategies
(Section 5.1) and genetic operators (Sections 5.2 and 5.3) for tree-based GP
not covered in Part I. We also briefly look at some hybridisations of GP
with other techniques (Section 5.4).
5.1 Initialisation

Koza's ramped half-and-half method is the most common way of creating the
initial GP population (cf. Section 2.2, page 11). However, there are several
other ways of constructing a collection of random trees. In Section 5.1.2
we will briefly consider an unexpected impact of population initialisation.
There has also been some work with non-random or informed starting points
(cf. Section 5.1.3).
5.1.1 Uniform Initialisation
The shape of the initial trees can be lost within a few generations (more on
this below). However, a good start given by the initial population can still be
crucial to the success of a GP run. In general, there are an infinite number of
possible computer programs. This means that it is impossible to search them
uniformly. Therefore, any method used to create the initial population will
have a bias. For example, ramped half-and-half tends to create bushy trees.
Such trees have a higher proportion of solutions to symmetric problems,
such as parity. Conversely, the smallest solution to the Santa Fe ant trail-following problem is more randomly shaped (Langdon and Poli, 1998a). This is partly why ramped half-and-half is very poor at finding programs which can navigate the Santa Fe trail. Another reason is that many of the programs
generated by ramped half-and-half (with standard parameters) are simply
too small. Chellapilla (1997a) claims good results when the size of the initial
trees was more tightly controlled.
Iba (1996a) and Böhm and Geyer-Schulz (1996) report methods to precisely sample trees uniformly based on Alonso's bijective algorithm (Alonso and Schott, 1995). Although this algorithm has been criticised (Luke, 2000) for being computationally expensive, it can be readily used in practice. Langdon (2000) introduced the ramped uniform initialisation which extends Alonso's bijective algorithm by allowing the user to specify a range of initial tree sizes. It then generates equal numbers of random trees for each length in the chosen range. (C++ code can be obtained from ftp://cs.ucl.ac.uk/genetic/gp-code/rand_tree.cc.)
With these more uniform initialisations, most trees are asymmetric
with some leaves very close to the root of the tree. This is quite different
from the trees generated by ramped half-and-half, whose leaves are on average some
distance from the root. Uniform sampling may be better in problems where
the desired solutions are asymmetric with some leaves being much more
important than others. For example, in data mining it is common to look
for solutions with a few dominant variables (which may be close to the root
node) whilst other variables are of little or no interest and may be some
distance from the root (or indeed not present in the tree). On the other
hand, problems like multiplexer or parity require all the inputs to be used
and are of similar importance. Bushier trees may be better at solving such
problems.
5.1.2

Crossover has a strong preference for creating very non-uniform distributions of tree sizes (Poli, Langdon, and Dignum, 2007). Crossover generates very short programs much more often than longer ones. Selection can only partially combat this tendency. Typically, crossover will totally rearrange the size and shape of the initial trees within a few generations. As
discussed in Section 11.3.1 (page 101), the excessive sampling of short programs appears to be an important cause of bloat (the uncontrolled growth
of programs during GP runs, which will be described in more detail in Section 11.3, page 101 onwards). It has been shown (Dignum and Poli, 2007)
that when the initial population is created with the size distribution preferred by crossover (see Section 11.3.1), bloat is more marked. The distribution has a known mathematical formula (it is a Lagrange distribution of
the second kind), but in practice it can be created by simply performing
multiple rounds of crossover on a population created in the traditional way
before the GP run starts. This is known as Lagrange initialisation. These
findings suggest that initialisation methods which tend to produce many
short programs may in fact induce bloat sooner than methods that produce
distributions more skewed towards larger programs.
5.1.3 Seeding
5.2 GP Mutation

5.2.1 Is Mutation Necessary?

5.2.2 Mutation Cookbook
With linear bit string GAs, mutation usually consists of random changes in
bit values. In contrast, in GP there are many mutation operators in use.
Often multiple types of mutation are beneficially used simultaneously (e.g.,
see (Kraft, Petry, Buckles, and Sadasivan, 1994) and (Angeline, 1996)). We
describe a selection of mutation operators below:
Subtree mutation replaces a randomly selected subtree with another randomly created subtree (Koza, 1992, page 106). Kinnear (1993) defined
a similar mutation operator, but with a restriction that prevents the
offspring from being more than 15% deeper than its parent.
Size-fair subtree mutation was proposed in two forms by Langdon
(1998). In both cases, the new random subtree is, on average, the
same size as the code it replaces. The size of the random code is given
either by the size of another random subtree in the program or chosen
at random in the range [l/2, 3l/2] (where l is the size of the subtree
being replaced). The first of these methods samples uniformly in the
space of possible programs, whereas the second samples uniformly in the space of possible program lengths.
5.3 GP Crossover
Figure: Example of crossover using the common region of two parent trees: the parents are aligned, a common crossover point is selected, and the corresponding subtrees are swapped to create two offspring.
As we shall see in Chapter 7, specific crossover operators exist for linear GP (Section 7.1) and graph based GP systems (Section 7.2), such as
PDGP (page 65), PADO (page 67) and Cartesian GP (page 67).
5.4 Other Techniques
GP can be hybridised with other techniques. For example, Iba, de Garis, and
Sato (1994), Nikolaev and Iba (2006), and Zhang and Mühlenbein (1995)
have incorporated information theoretic and minimum description length
ideas into GP fitness functions to provide a degree of regularisation and
so avoid over-fitting (and bloat, see Section 11.3). As mentioned in Section 6.2.3, computer language grammars can be incorporated into GP.
Whereas genetic programming typically uses an evolutionary algorithm
to search the space of computer programs, various other heuristic search
methods can also be applied to program search, including: enumeration
(Olsson, 1995), hill climbing (O'Reilly and Oppacher, 1994a), and simulated annealing (O'Reilly, 1996; Tsoulos and Lagaris, 2006). As discussed
in Chapter 8, it is also possible to extend Estimation of Distribution Algorithms (EDAs) to the variable size representations used in GP.
Another alternative is to use co-evolution with multiple populations,
where the fitness of individuals in one population depends on the behaviour
of individuals in other populations. There have been many successful applications of co-evolution in GP, including (Azaria and Sipper, 2005a; Brameier,
Haan, Krings, and MacCallum, 2006; Buason, Bergfeldt, and Ziemke, 2005;
Channon, 2006; Dolinsky, Jenkinson, and Colquhoun, 2007; Funes, Sklar,
Juillé, and Pollack, 1998a; Gagné and Parizeau, 2007; Hillis, 1992; Hornby
and Pollack, 2001; Mendes, de B. Voznika, Nievola, and Freitas, 2001; Piaseczny, Suzuki, and Sawai, 2004; Schmidt and Lipson, 2006; Sharabi and
Sipper, 2006; Soule, 2003; Soule and Komireddy, 2006; Spector, 2002; Spector and Klein, 2006; Spector, Klein, Perry, and Feinstein, 2005b; Wilson and
Heywood, 2007; Zhang and Cho, 1999).
Finally, it is worth mentioning that program trees can be manipulated
with editing operations (Koza, 1992). For example, if the root node of
a subtree is * but one of its arguments is always guaranteed to evaluate
to 0, then we can replace the subtree rooted there with the terminal 0.
If the root node of a subtree is + and one argument evaluates to 0, we
can replace the subtree with the other argument of the +. Editing can
reduce the complexity of evolved solutions and can make them easier to
understand. However, it may also lead to GP getting stuck in local optima,
so editing operations should probably be used sparingly at run time. Other
reorganisation operations of various types are also possible. For example,
after trees are generated by GP, Garcia-Almanza and Tsang (2006, 2007)
prune branches and combine branches from different trees.
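The two editing rules just mentioned can be applied as a simple recursive rewrite over the tree; the sketch below detects only literal zero terminals (proving that an arbitrary subtree always evaluates to 0 is much harder in general), and all names are illustrative.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of simple editing operations: simplify (* e 0) to 0 and (+ e 0) to e.
    public class TreeEditor {
        static class Node {
            String primitive;
            List<Node> children = new ArrayList<>();
            Node(String primitive, Node... kids) {
                this.primitive = primitive;
                for (Node k : kids) children.add(k);
            }
            public String toString() {
                if (children.isEmpty()) return primitive;
                StringBuilder sb = new StringBuilder("(" + primitive);
                for (Node c : children) sb.append(" ").append(c);
                return sb.append(")").toString();
            }
        }

        static boolean isZero(Node n) {
            return n.children.isEmpty() && n.primitive.equals("0");
        }

        static Node edit(Node n) {
            for (int i = 0; i < n.children.size(); i++)
                n.children.set(i, edit(n.children.get(i)));              // simplify bottom-up
            if (n.primitive.equals("*") && n.children.stream().anyMatch(TreeEditor::isZero))
                return new Node("0");                                     // (* e 0) -> 0
            if (n.primitive.equals("+") && n.children.size() == 2) {
                if (isZero(n.children.get(0))) return n.children.get(1);  // (+ 0 e) -> e
                if (isZero(n.children.get(1))) return n.children.get(0);  // (+ e 0) -> e
            }
            return n;
        }

        public static void main(String[] args) {
            Node t = new Node("+", new Node("*", new Node("y"), new Node("0")), new Node("x"));
            System.out.println(edit(t));   // (+ (* y 0) x) simplifies to x
        }
    }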
Chapter 6
Modular, Grammatical and Developmental Tree-based GP
This chapter discusses advanced techniques that are primarily focused on
two important issues in genetic programming: modularity and constraint.
In Section 6.1 we explore the evolution of modular, hierarchical structures,
and in Section 6.2 we look at ways of constraining the evolutionary process,
typically based on some sort of domain knowledge. We also look at using
GP to evolve programs which themselves develop solutions (Section 6.3) or
even construct other programs (Section 6.4).
6.1 Evolving Modular and Hierarchical Structures
taken from parts of fit GP trees. Special mutation operations allowed the
GP population to share code by referring to the same code within the library. Subsequently, Angeline suggested that the scheme's advantages lay
in allowing GP individuals to access far more code than they actually held
within themselves, rather than principally in developing more modular code.
Rosca and Ballard (1996a) used a similar scheme, but were able to use much
more information from the fitness function to guide the selection of the code
to be inserted into the library and its subsequent use by members of the GP
population. Olsson (1999, 1995) later developed an abstraction operator for
use in his ADATE system, where sub-functions (anonymous lambda expressions) were automatically extracted. Unlike Angeline's library approach, Olsson's modules remained attached to the individual they were extracted
from.
Koza's automatically defined functions (ADFs) (Koza, 1994) remain the
most widely used method of evolving reusable components and have been
used successfully in a variety of settings. Basic ADFs (covered in Section 6.1.1) use a fixed architecture specified in advance by the user. Koza
later extended this using architecture altering operations (Section 6.1.2),
which allow the architecture to evolve along with the programs.
6.1.1 Automatically Defined Functions

Human programmers organise sequences of repeated steps into reusable components such as subroutines, functions and classes. They then repeatedly
invoke these components, typically with different inputs. Reuse eliminates
the need to reinvent the wheel every time a particular sequence of steps
is needed. Reuse also makes it possible to exploit a problem's modularities, symmetries and regularities (thereby potentially accelerating the problem-solving process). This can be taken further, as programmers typically organise these components into hierarchies in which top-level components call lower-level ones, which call still lower levels, etc. Koza's ADFs provide a
mechanism by which the evolutionary process can evolve these kinds of potentially reusable components. We will review the basic concepts here, but
ADFs are discussed in great detail in (Koza, 1994).
When ADFs are used, a program consists of multiple components. These
typically consist of one or more function-defining branches (i.e., ADFs), as
well as one or more main result-producing branches (the RPB), as illustrated
in the example in Figure 6.1. The RPB is the main program that is
executed when the individual is evaluated. It can, however, call the ADFs,
which can in turn potentially call each other. A single ADF may be called
multiple times by the same RPB, or by a combination of the RPB and other
ADFs, allowing the logic that evolution has assembled in that ADF to be
re-used in different contexts.
Figure 6.1: Example of a program with two ADF branches (ADF1, ADF2) and a result-producing branch (RPB) attached to a common ROOT node.
RPB: ADF(ADF(ADF(x)))   (6.1)
ADF(arg0): arg0 * arg0   (6.2)
The ADF (Equation 6.2) is simply the squaring function, but by combining
this multiple times in the RPB (Equation 6.1) this individual computes x^8
in a highly compact fashion.
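The reuse described above can be mimicked in ordinary hand-written code: the ADF is a squaring function and the result-producing branch reaches x^8 by reusing it. The triple nesting shown is our illustration of one way the calls could be combined; it is not claimed to be the exact shape of the evolved program.

    // Hand-written analogue of the ADF example described above: the ADF is the
    // squaring function, and the result-producing branch reuses it to reach x^8.
    public class AdfExample {
        static double adf(double arg0) {        // function-defining branch: squaring
            return arg0 * arg0;
        }

        static double rpb(double x) {           // result-producing branch: ((x^2)^2)^2 = x^8
            return adf(adf(adf(x)));
        }

        public static void main(String[] args) {
            System.out.println(rpb(2.0));        // 256.0 = 2^8
        }
    }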
It is important to not be fooled by a tidy example like this. ADFs
evolved in real applications are typically complex and can be very difficult
to understand. Further, simply including ADFs provides no guarantee of
modular re-use. As is discussed in Chapter 13, there are no silver bullets.
It may be that the RPB never calls an ADF or only calls it once. It is also
common for an ADF to not actually encapsulate any significant logic. For
example, an ADF might be as simple as a single terminal, in which case it
is essentially just providing a new name for that terminal.
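
To make the division of labour between an RPB and its ADF concrete, here is a minimal sketch in Python. The node representation, the function names and the particular RPB are our own illustrative choices, not Koza's implementation; the ADF squares its argument, and nesting three calls to it is one way of computing x^8 in the spirit of the example above (the original figure may combine the calls differently).

    # Illustrative sketch only: evaluate an individual made of a result-producing
    # branch (RPB) plus one automatically defined function (ADF).
    def eval_tree(node, env, adfs):
        """Recursively evaluate a prefix tree given variable bindings and ADFs."""
        if isinstance(node, str):                  # terminal: a variable or an ADF argument
            return env[node]
        op, *args = node
        vals = [eval_tree(a, env, adfs) for a in args]
        if op in adfs:                             # call into the ADF's own branch
            body, params = adfs[op]
            return eval_tree(body, dict(zip(params, vals)), adfs)
        if op == '+':
            return vals[0] + vals[1]
        if op == '*':
            return vals[0] * vals[1]
        raise ValueError('unknown primitive: %s' % op)

    # ADF0(arg0) = arg0 * arg0, i.e. a squaring function.
    adfs = {'ADF0': (('*', 'arg0', 'arg0'), ('arg0',))}
    # One possible RPB that reuses ADF0 three times to compute x**8.
    rpb = ('ADF0', ('ADF0', ('ADF0', 'x')))
    print(eval_tree(rpb, {'x': 2}, adfs))          # prints 256, i.e. 2**8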
In Koza's approach, each ADF is attached (as a branch) to a specific individual in the population. This is in contrast to both Angeline's and Rosca's
systems mentioned above, both of which have general pools of modules or
components which are shared across the population. Sometimes recursion
is allowed in ADFs, but this frequently leads to infinite computations. Typically, recursion is prevented by imposing an order on the ADFs within an
individual and by restricting calls so that ADFi can only call ADFj if i < j.
In the presence of ADFs, recombination operators are typically constrained to respect the larger structure. That is, during crossover, a subtree
from ADFi can only be swapped with a subtree from another individual's ADFi.
The program's result-producing branch and its ADFs typically have different function and terminal sets. For example, the terminal set for ADFs usually includes arguments, such as arg0 and arg1. Typically the user must
decide in advance the primitive sets, the number of ADFs and any call restrictions to prevent recursion. However, these choices can be evolved using
the architecture-altering operations described in Section 6.1.2.
Koza also proposed other types of automatically evolved program components (Koza, Andre, Bennett, and Keane, 1999). Automatically defined
iterations (ADIs), automatically defined loops (ADLs) and automatically
defined recursions (ADRs) provide means to reuse code. Automatically defined stores (ADSs) provide means to reuse the result of executing code.
6.1.2 Architecture-Altering Operations
Koza and his colleagues have used these architecture altering operations
quite widely in their genetic design work, where they evolve GP trees that
encode a collection of developmental operations that, when interpreted, generate a complex structure like a circuit or an optical system (see, for example,
Section 12.3, page 118).
The idea of architecture altering operations was extended to the extremely general Genetic Programming Problem Solver (GPPS), which is
described in detail in (Koza et al., 1999, part 4). This is an open ended
system which combines a small set of basic vector-based primitives with the
architecture altering operations in a way that can, in theory, solve a wide
range of problems with almost no input required from the user other than
the fitness function. The problem is that this open-ended system needs a
very carefully constructed fitness function to guide it to a viable solution, an
enormous amount of computational effort, or both. As a result it is currently
an idea of more conceptual than practical value.
6.2
Constraining Structures
a type system, since these often generate results that are more comprehensible (Haynes, Wainwright, Sen, and Schoenefeld, 1995), (Langdon, 1998,
page 126). Similarly, if there is domain knowledge that strongly suggests a
particular syntactic constraint on the solution, then ignoring that constraint
may make it much harder to find a solution.
We will focus here on three different approaches to constraining the syntax of the evolved expression trees in GP: simple structure enforcement
(Section 6.2.1), strongly typed GP (Section 6.2.2) and grammar-based constraints (Section 6.2.3). Finally, we consider the advantages and disadvantages of syntactic and type constraints and their biases (Section 6.2.4).
6.2.1 Enforcing Particular Structures
6.2.2
Strongly Typed GP
and all the genetic operators are implemented so as to ensure that they do
not violate the type system's constraints.
Returning to the if example from Section 3.2.1 (page 21), we might have
an application with both numeric and Boolean terminals (e.g., get speed
and is food ahead). We might then have an if function that takes three
arguments: a test (Boolean), the value to return if the test is true, and
the value to return if the test is false. Assuming that the second and third
values are numbers, then the output of the if is also going to be numeric.
If we choose the test argument as a crossover point in the first parent, then
the subtree (excised from the second parent) to insert must have a Boolean
output. That is, we must find either a function which returns a Boolean or
a Boolean terminal in the other parent tree to be the root of the subtree
which we will insert into the new child. Conversely if we choose either the
second or third argument as a crossover point in the first parent, then the
inserted subtree must be numeric. In all three cases, given that both parents
are type correct, restricting the second crossover point in this way ensures
the child will also be type correct.
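
The following sketch shows one way this restriction might be implemented; the Node class and its fields are assumptions made for illustration rather than the machinery of any particular strongly typed GP system. The second crossover point is drawn only from subtrees whose return type matches that of the first.

    # Illustrative sketch of choosing type-compatible crossover points.
    import random

    class Node:
        def __init__(self, name, rtype, children=()):
            self.name, self.rtype, self.children = name, rtype, list(children)

    def all_subtrees(tree):
        yield tree
        for child in tree.children:
            yield from all_subtrees(child)

    def typed_crossover_points(parent1, parent2, rng=random):
        """Pick a subtree in parent1, then a subtree with the same return type in
        parent2 to replace it with; returns None if no compatible subtree exists."""
        point = rng.choice(list(all_subtrees(parent1)))
        candidates = [s for s in all_subtrees(parent2) if s.rtype == point.rtype]
        return point, (rng.choice(candidates) if candidates else None)

If no compatible subtree exists in the second parent, a system might pick a new point in the first parent or simply fall back on copying a parent.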
This basic approach to types can be extended to more complex type systems including simple generics (Montana, 1995), multi-level type systems
(Haynes, Schoenefeld, and Wainwright, 1996), fully polymorphic types (Olsson, 1994), and polymorphic higher-order type systems (Yu, 2001).
6.2.3
Grammar-based Constraints
Another natural way to express constraints is via grammars, and these have
been used in GP in a variety of ways (Gruau, 1996; Hoai, McKay, and
Abbass, 2003; O'Neill and Ryan, 2003; Whigham, 1996; Wong and Leung,
1996). Many of these simply use a grammar as a means of expressing the
kinds of constraints discussed above in Section 6.2.1. For example, one could
enforce the structure for the period function using a grammar such as the
following:
tree ::= E + sin(E * t)
E    ::= var | (E op E)
op   ::= + | - | * | /
var  ::= x | y | z
                                                                    (6.3)
[Figure: the tree structure of a program generated by the grammar in Equation (6.3).]

(39, 7, 2, 83, 66, . . . , 94)                                      (6.4)
then we start with 39 and the first syntax rule, tree. However tree has no
alternatives, so we move to 7 and rule E. Now E has two alternatives and
7 is used (via modulus) to choose between them. More of the translation
process is given in Figure 6.3.
In this example we did not need to use all the numbers in the sequence
to generate a complete program. Indeed the last integer, 94, was not used.
In general, extra genetic material is simply ignored. More problematic is
when a sequence is too short in the sense that the end of the sequence
is reached before the translation process is complete. There are a variety
of options in this case, including failure (assigning this individual the worst
possible fitness) and wrapping (continuing the translation process, moving
back to the front of the numeric sequence). Grammatical evolution has been
very successful and is widely used.
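
A minimal sketch of this genotype-to-phenotype mapping is given below. The grammar follows the reconstruction in Equation (6.3), the encoding and helper names are our own, and the codon sequence is only partly taken from the running example: its first five integers and the final, unused 94 appear in the text, while the middle values were made up here so that the translation completes.

    # Illustrative grammatical evolution mapping (the modulus rule of Figure 6.3).
    GRAMMAR = {
        'tree': [['E', '+', 'sin(', 'E', '*', 't', ')']],
        'E':    [['var'], ['(', 'E', 'op', 'E', ')']],
        'op':   [['+'], ['-'], ['*'], ['/']],
        'var':  [['x'], ['y'], ['z']],
    }

    def ge_map(codons, start='tree', max_wraps=2):
        """Translate a sequence of integers into a sentence of the grammar.
        If the integers run out, wrap to the front of the sequence; give up
        (worst possible fitness) after max_wraps wraps."""
        symbols, out, i, wraps = [start], [], 0, 0
        while symbols:
            sym = symbols.pop(0)
            if sym not in GRAMMAR:              # terminal symbol: emit it
                out.append(sym)
                continue
            if i == len(codons):                # sequence too short: wrap around
                i, wraps = 0, wraps + 1
                if wraps > max_wraps:
                    return None                 # translation failed
            alts = GRAMMAR[sym]
            choice = codons[i] % len(alts)      # the modulus rule
            i += 1
            symbols = alts[choice] + symbols
        return ' '.join(out)

    # 39, 7, 2, 83, 66 and the unused 94 come from the running example;
    # 18, 45, 22 and 53 are made-up codons that complete the derivation.
    print(ge_map([39, 7, 2, 83, 66, 18, 45, 22, 53, 94]))
    # ( z * x ) + sin( z * t )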
6.2.4 Constraints and Bias
tree
    ⟨ 39 mod 1 = 0, i.e., there is only one option ⟩
E + sin(E * t)
    ⟨ 7 mod 2 = 1, i.e., choose the second option ⟩
(E op E) + sin(E * t)
    ⟨ 2 mod 2 = 0, i.e., take the first option ⟩
(var op E) + sin(E * t)
    ⟨ 83 mod 3 = 2, i.e., pick the third variable ⟩
(z op E) + sin(E * t)
    ⟨ 66 mod 4 = 2, i.e., take the third operator ⟩
(z * E) + sin(E * t)
...
(z * x) + sin(z * t)
Figure 6.3: Sample grammatical evolution derivation using the grammar in
Equation (6.3) and the integer sequence in Equation (6.4). The non-terminal
to be rewritten is underlined in each case.
6.3 Developmental Genetic Programming
Structures resulting from developmental processes often have some regularity, which other
methods obtain through the use of ADFs, constraints, types, etc. A disadvantage is that, with cellular encoding, individuals require an additional
genotype-to-phenotype decoding step. However, when the fitness function
involves complex calculations with many fitness cases, the relative cost of the
decoding step is often small compared with the rest of the fitness function.
6.4 Strongly Typed Autoconstructive GP with PushGP
While types are often used to constrain evolution, Spector's PushGP (Klein
and Spector, 2007; Robinson and Spector, 2002; Spector, 2001; Spector,
Klein, and Keijzer, 2005a) is a move away from constraining evolution.
Essentially PushGP uses genetic programming to automatically create
programs written in the Push programming language. Push is a strongly
typed tree based language which does not enforce syntactic constraints.
Each of Push's types has its own stack. In addition to stacks for integers, floats, Booleans and so on, there is a stack for objects of type program.
Using this code stack, Push naturally supports both recursion and program
modules (see Section 6.1.1) without human pre-specification. The code stack
allows an evolved program to push itself or fragments of itself onto the stack
for subsequent manipulation.
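
The sketch below gives a flavour of this stack-per-type execution model. It is not Spector's implementation, and the instruction names only loosely follow Push conventions; real Push has many more types and instructions.

    # Illustrative Push-like interpreter: one stack per type, plus a code stack.
    def run_push(program):
        stacks = {'int': [], 'bool': [], 'code': []}
        for token in program:
            if isinstance(token, bool):                      # literals go to their type's stack
                stacks['bool'].append(token)
            elif isinstance(token, int):
                stacks['int'].append(token)
            elif token == 'int.+' and len(stacks['int']) >= 2:
                b, a = stacks['int'].pop(), stacks['int'].pop()
                stacks['int'].append(a + b)
            elif token == 'int.>' and len(stacks['int']) >= 2:
                b, a = stacks['int'].pop(), stacks['int'].pop()
                stacks['bool'].append(a > b)
            elif token == 'code.quote':
                stacks['code'].append(list(program))         # a program can push (a copy of) itself
            # under-supplied or unknown instructions are simply ignored,
            # so any sequence of tokens is a valid program
        return stacks

    print(run_push([2, 3, 'int.+', 4, 'int.>', 'code.quote']))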
PushGP can use the code stack and other operations to allow programs to
construct their own crossover and other genetic operations and create their
own offspring. Programs are prevented from simply duplicating themselves
in order to avoid a catastrophic loss of population diversity.
Chapter 7
Linear and Graph Genetic Programming
7.1 Linear Genetic Programming
7.1.1 Motivations
Motivations
There are two different reasons for trying linear GP. Firstly, almost all
computer architectures represent computer programs in a linear fashion, as a sequence of instructions (Instruction 1, Instruction 2, . . . , Instruction N).

[Figure 7.2: the format of a linear GP instruction: an output register (R0..R7), a first argument register (R0..R7), an opcode (such as +, * or /), and a second argument which is either a constant (0...127) or another register (R0..R7).]
7.1.2
Linear GP Representations
interpreted linear GP, where each instruction is executable by some higher-level virtual machine (typically written in an efficient language such as C
or C++). When the instructions are actual machine code, then the order
of the elements of the representation shown in Figure 7.2 is determined by
the particular computer architecture used, and the corresponding data must
be packed into bit fields of appropriate sizes. The overhead of packing and
unpacking data can be avoided, however, when one is using virtual machine
instructions since then the designer of a GP system has complete freedom
as to how the virtual machine will interpret its instructions.
If the goal is execution speed, then the evolved code should be machine
code for a real computer rather than some higher-level language or virtual-machine code. This is why Nordin (1994) started by evolving machine code
for SUN computers and Crepeau (1995) targeted the Z80. The linear GP
of Leung, Lee, and Cheang (2002) was designed for novel hardware, but much
of the GP development had to be run in simulation whilst the hardware itself
was under development.
The Sun SPARC has a simple 32-bit RISC architecture which eases
designing genetic operations which manipulate its machine code. Nordin
(1997) wrapped each machine code GP individual (which was a sequence of
machine instructions) inside a C function. Each of the GP program's inputs was copied from one of the C function's arguments into one of the machine
registers. As well as the registers used for inputs,2 a small number (e.g., 2–4) of other registers are used for scratch memory to store partial results of
intermediate calculations. Finally, the GP simply leaves its answer in one of
the registers. The external framework uses this as the C function's return
value.
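
As an illustration of this register-machine style of execution, here is a small interpreter for a toy linear representation. The instruction layout, the register count and the protected division are our own choices for the sketch, not Nordin's machine code.

    # Illustrative interpreter for a linear GP program on a virtual register machine.
    def run_linear(program, inputs, n_regs=8):
        regs = [0.0] * n_regs
        regs[:len(inputs)] = inputs                  # inputs are copied into the first registers
        for dest, src, opcode, arg in program:       # r[dest] = r[src] <op> arg
            x = regs[src]
            y = regs[arg[1]] if arg[0] == 'r' else float(arg[1])
            if opcode == '+':
                regs[dest] = x + y
            elif opcode == '-':
                regs[dest] = x - y
            elif opcode == '*':
                regs[dest] = x * y
            elif opcode == '/':
                regs[dest] = x / y if y != 0 else 1.0   # protected division
        return regs[0]                               # the answer is left in register 0

    # r0 = r0 * r0;  r1 = r0 + 3;  r0 = r1 * r0   computes (x*x + 3) * x*x
    prog = [(0, 0, '*', ('r', 0)), (1, 0, '+', ('c', 3)), (0, 1, '*', ('r', 0))]
    print(run_linear(prog, [2.0]))                   # prints 28.0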
Since Unix was ported onto the x86, Intel's complex instruction set, which was already standard with Windows-based PCs, has had almost complete dominance. Seeing this, Nordin ported his Sun RISC linear GP system onto Intel's CISC. Various changes were made to the genetic operations to ensure that the initial random programs are made only of legal Intel machine code and that mutation operations, which act inside the x86's 32-bit word, respect the x86's complex sub-fields. Since the x86 has instructions of different lengths, special care has to be taken when altering them.
Typically, several short instructions are packed into each 4-byte word. If
there are any bytes left over, they are filled with no-operation codes. In
this way, best use is made of the available space, without instructions crossing 32-bit boundaries. Nordin's work led to Discipulus (Foster, 2001),
which has been used in applications ranging from bioinformatics (Vukusic,
Grellscheid, and Wiehe, 2007) to robotics (Langdon and Nordin, 2001) and
2 Anyone using a register-based GP (linear or tree-based) should consider write-protecting the input registers to prevent the inputs from being overwritten. Otherwise
evolved programs (especially in the early generations) are prone to writing over their
inputs before they've had a chance to use them in any constructive way.
7.1.3
Linear GP Operators
The typical crossover and mutation operators for linear GP ignore the details
of the machine code of the computer being used. For example, crossover may randomly choose two crossover points in each parent and swap the code between them. Since the crossed-over fragments are typically of different lengths, such a crossover may change the programs' lengths, cf. Figure 7.3.
Since computer machine code is organised into 32- or 64-bit words, the
crossover points occur only at the boundaries between words. Therefore,
a whole number of words, containing a whole number of instructions, are
typically swapped over. Similarly, mutation operations normally respect
word boundaries and generate legal machine code. However, linear GP lends
itself to a variety of other genetic operations. For example, Figure 7.4 shows
homologous crossover. Many other crossover and mutation operations are
possible (Langdon and Banzhaf, 2005).
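
Treating a linear genome simply as a list of instructions, the two forms of crossover mentioned here can be sketched as follows. This is a simplification: real systems also enforce maximum lengths and operate on packed machine words.

    # Illustrative linear GP crossovers on lists of instructions.
    import random

    def two_point_crossover(parent1, parent2, rng=random):
        """Figure 7.3 style: the excised fragments may differ in length, so the
        offspring's length can differ from its parents'."""
        i1, j1 = sorted(rng.sample(range(len(parent1) + 1), 2))
        i2, j2 = sorted(rng.sample(range(len(parent2) + 1), 2))
        return parent1[:i1] + parent2[i2:j2] + parent1[j1:]

    def homologous_crossover(parent1, parent2, rng=random):
        """Figure 7.4 style: the same cut points are used in both parents, so the
        swapped code keeps its position and the offspring keep their parents' lengths."""
        n = min(len(parent1), len(parent2))
        i, j = sorted(rng.sample(range(n + 1), 2))
        child1 = parent1[:i] + parent2[i:j] + parent1[j:]
        child2 = parent2[:i] + parent1[i:j] + parent2[j:]
        return child1, child2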
In a compiling genetic programming system (Banzhaf, Francone, and
Nordin, 1996) the mutation operator acts on machine code instructions
and is constrained to ensure that only instructions in the function set are
generated and that the register and constant values are within predefined
ranges allowed in the experimental set up. On some classification problems Banzhaf et al. (1996) reported that performance was best when using
crossover and mutation in equal proportions. They suggested that this was
due to the GP population creating introns (blocks of code that do not
affect fitness) in response to the crossover operator, and that these were subsequently converted into useful genetic material by their mutation operator.
Parent 1
Parent 2
Offspring
Figure 7.3: Typical linear GP crossover. Two instructions are randomly
chosen in each parent (top two genomes) as cut points. The code fragment
excised from the first parent is then replaced with the code fragment excised
from the second to generate the child (lower chromosome).
Parent 1
Parent 2
Offspring 1
Offspring 2
Figure 7.4: Discipulus's homologous crossover (Foster, 2001; Francone
et al., 1999; Nordin et al., 1999). Crossover is performed on two parents (top
two programs) to yield two offspring (bottom). The two crossover points are
the same in both parents, so the excised code does not change its position
relative to the start of the program (left edge), and the child programs have
the same lengths as their parents. Homologous crossover is often combined
with a small amount of normal two point crossover (Figure 7.3) to introduce
length changes into the GP population.
7.2 Graph-Based Genetic Programming
Trees are special types of graphs. So it is natural to ask what would happen
if one extended GP so as to be able to evolve graph-like programs. Starting
from the mid 1990s, researchers have proposed several extensions of GP that
do just that, albeit in different ways.
7.2.1
Parallel Distributed GP
Figure 7.5: A sample tree where the same subtree is used twice (a) and
the corresponding graph-based representation of the same program (b). The
graph representation may be more efficient since it makes it possible to avoid
the repeated evaluation of the same subtree.
with nodes representing functions and terminals. Edges represent both control flow and data flow. The possible efficiency gains obtained by a graph
representation are illustrated in Figure 7.5.
In the simplest form of PDGP edges are directed and unlabelled, in
which case PDGP is a generalisation of standard GP. However, more complex representations can be used, which allow the evolution of a variety of programs, including standard tree-like programs, logic networks, neural networks, recurrent transition networks and finite state automata. This can be achieved
by extending the representation by associating labels with the edges of the
program graph. In addition to the function and terminal sets, this form of
PDGP requires the definition of a link set. The labels on the links depend
on what is to be evolved. For example, in neural networks, the link labels
are numerical constants for the neural network weights. In a finite state automaton, the edges are labelled with the input symbols that determine the
FSA's state transitions. It is even possible for the labels to be automatically
defined edges, which play a role similar to ADFs (Section 6.1.1) by invoking
other PDGP graphs.
In PDGP, programs are manipulated by special crossover and mutation
operators which guarantee the syntactic correctness of the offspring. Each
node occupies a position in a regular grid. The genetic operators act by
moving, copying or randomly generating sub-regions of the grid. For this
reason PDGP search operators are very efficient.
PDGP programs can be executed according to different policies depending on whether instructions with side effects are used or not. If there are
no side effects, running a PDGP program can be seen as a propagation of
the input values from the bottom to the top of the program's graph (as in
a feed-forward artificial neural network or data flow parallel computer).
7.2.2 PADO
7.2.3
Cartesian GP
7.2.4
Chapter 8
Probabilistic Genetic Programming
Genetic programming typically uses an evolutionary algorithm as its main
search engine. However, this is not the only option. The use of simulated
annealing and hill climbing to search the space of computer programs was
mentioned in Section 5.4. This chapter considers recent work where the exploration is performed by population-based search algorithms which adapt
and sample probability distributions instead of using traditional genetic operators.
Sampling from a probability distribution means generating random values whose statistical properties match those of the given distribution. For
example, if one sampled a univariate Gaussian distribution, one would expect the resulting values to tend to have mean and standard deviation similar to the mean and standard deviation of the Gaussian. The notion of
sampling can be extended to much more complex distributions involving
multiple variables. Furthermore, discrete as well as continuous variables are
possible.
8.1 Estimation of Distribution Algorithms
Different EDAs use different models for the probability distribution that controls the sampling (see (Larrañaga, 2002; Larrañaga and Lozano, 2002) for more information). For example, population-based incremental learning (PBIL) (Baluja and Caruana, 1995) and the univariate marginal distribution algorithm (UMDA) (Mühlenbein and Mahnig, 1999a,b) assume that
each variable is independent of the other variables. Consequently, these algorithms need to store and adjust only a linear array of probabilities, one for
each variable. This works well for problems with weak interactions between
variables. Since no relationship between the variables is stored or learned,
however, PBIL and UMDA may have difficulties solving problems where the
interactions between variables are significant.
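
The sketch below shows a PBIL-style update on a vector of independent bit probabilities; the population size, learning rate and onemax fitness are arbitrary illustrative choices rather than the published algorithm's settings.

    # Illustrative univariate EDA step in the spirit of PBIL.
    import random

    def pbil_step(probs, fitness, pop_size=50, learning_rate=0.1, rng=random):
        """Sample bit strings from independent per-bit probabilities, then shift
        the probabilities towards the best string sampled."""
        pop = [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]
        best = max(pop, key=fitness)
        return [(1 - learning_rate) * p + learning_rate * b for p, b in zip(probs, best)]

    probs = [0.5] * 5
    for _ in range(30):
        probs = pbil_step(probs, fitness=sum)        # onemax: count the ones
    print([round(p, 2) for p in probs])              # probabilities drift towards 1.0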
Naturally, higher order models are possible. For example, the MIMIC
algorithm of de Bonet, Isbell, and Viola (1997) uses second-order statistics. It is also possible to use flexible models where interactions of different orders are captured. The Bayesian optimisation algorithm (BOA)
(Pelikan, Goldberg, and Cantú-Paz, 1999) uses Bayesian networks to represent generic sampling distributions, while the extended compact genetic
algorithm (eCGA) (Harik, 1999) clusters genes into groups where the genes
in each group are assumed to be linked but groups are considered independent. The sampling distribution is then taken to be the product of the
distributions modelling the groups.
EDAs have been very successful. However, they are often unable to represent both the overall structure of the distribution and its local details,
typically being more successful at the former. This is because EDAs represent the sampling distribution using models with an, inevitably, limited
number of degrees of freedom. For example, suppose the optimal sampling
distribution has multiple peaks, corresponding to different local optima, separated by large unfit areas. Then, an EDA can either decide to represent
only one peak, or to represent all of them together with the unfit areas. If
the EDA chooses the wrong local peak this may lead to it getting stuck and
not finding the global optimum. Conversely if it takes a wider view, this
leads to wasting many trials sampling irrelevant poor solutions.
Consider, for example, a scenario where there are five binary variables,
x1, x2, x3, x4 and x5, and two promising regions: one near the string of all zeros, i.e., (x1, x2, x3, x4, x5) = (0, 0, 0, 0, 0), and the other near the string of all ones, i.e., (x1, x2, x3, x4, x5) = (1, 1, 1, 1, 1). One option for a (simple)
EDA is to focus on one of the two regions, e.g., setting the variables xi
to 0 with high probability (say, 90%). This, however, fails to explore the
other region, and risks missing the global optimum. The other option is to
maintain both regions as possibilities by setting all the probabilities to 50%,
i.e., each of the variables xi is as likely to be 0 as 1. These probabilities will
generate samples in both of the promising regions. For example, the strings
(0, 0, 0, 0, 0) and (1, 1, 1, 1, 1) will each be generated with a 3.125% probability.
8.2
Pure EDA GP
1 Hamming distance between two strings (whether binary or not) is the number
of positions where the two strings differ.
2 There is a weak form of dependency, in that there can be a primitive in a particular
position only if the primitive just above it is a function. The choice of this parent primitive
does not, however, influence the choice of the child primitive.
Figure 8.1: Example of probability tree used for the generation of programs
in PIPE. New program trees are created starting from the root node at the
top and moving through the hierarchy. Each node in an offspring tree is
selected from the left hand side of the corresponding table with probability
given by the right hand side. Each branch of the tree continues to expand
until either the tree of probability tables is exhausted or a leaf (e.g., R) is
selected.
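
A toy version of this sampling process is sketched below; the tree of probability tables, the primitives and their arities are invented for illustration and do not reproduce PIPE's actual data structures or parameters.

    # Illustrative sampling of a program from a tree of probability tables.
    import random

    def sample_program(prob_node, rng=random):
        """Each node holds (primitives, probabilities, child nodes); a primitive is
        drawn from the table, and children are sampled for each of its arguments."""
        primitives, probs, children = prob_node
        sym = rng.choices(primitives, weights=probs)[0]
        arity = 2 if sym in ('+', '-', '*', '/') else 0
        return (sym,) + tuple(sample_program(c, rng) for c in children[:arity])

    leaf = (['x', 'R'], [0.7, 0.3], [])
    root = (['+', '*', 'x'], [0.4, 0.4, 0.2], [leaf, leaf])
    print(sample_program(root))      # e.g. ('+', ('x',), ('R',)) or ('x',)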
8.3 Mixing Grammars and Probabilities
A variety of other systems have been proposed which combine the use of
grammars and probabilities. We mention only a few here; a more extended
review of these is available in (Shan, McKay, Essam, and Abbass, 2006).
Ratle and Sebag (2001) used a stochastic context-free grammar to generate program trees. The probability of applying each rewrite rule was
adapted using a standard EDA approach so as to increase the likelihood of
using successful rules. The system could also be run in a mode where rule
probabilities depended upon the depth of the non-terminal symbol to which
a rewrite rule was applied, thereby providing a higher degree of flexibility.
The approach taken in program evolution with explicit learning (PEEL)
(Shan, McKay, Abbass, and Essam, 2003) was slightly more general. PEEL
used a probabilistic L-system where rewrite rules were both depth- and
location-dependent. The probabilities with which rules were applied were
adapted by an ant colony optimisation (ACO) algorithm (Dorigo and Stützle, 2004). Another feature of PEEL was that the L-system's rules
could be automatically refined via splitting and specialisation.
Other programming systems based on probabilistic grammars which are
optimised via ant systems include ant-TAG (Abbass, Hoai, and McKay,
2002; Shan, Abbass, McKay, and Essam, 2002), which uses a tree-adjunct
grammar as its main representation, and generalised ant programming
(GAP) (Keber and Schuster, 2002), which is based on a context-free grammar. Other systems which learn and use probabilistic grammars include
grammar model based program evolution (GMPE) (Shan, McKay, Baxter,
Abbass, Essam, and Hoai, 2004), the system described in (Bosman and de
Jong, 2004a,b) and Bayesian automatic programming (BAP) (Regolin and
Pozo, 2005).
Chapter 9
Multi-objective Genetic Programming
The area of multi-objective GP (MO GP) has been very active in the last
decade. In a multi-objective optimisation (MOO) problem, one optimises
with respect to multiple goals or fitness functions f1 , f2 , .... The task of a
MOO algorithm is to find solutions that are optimal, or at least acceptable,
according to all the criteria simultaneously.
In most cases changing an algorithm from single-objective to multiobjective requires some alteration in the way selection is performed. This is
how many MO GP systems deal with multiple objectives. However, there
are other options. We review the main techniques in the following sections.
The complexity of evolved solutions is one of the most difficult things
to control in evolutionary systems such as GP, where the size and shape of
the evolved solutions is under the control of evolution. In some cases, for
example, the size of the evolved solutions may grow rapidly, as if evolution
was actively promoting it, without any clear benefit in terms of fitness. We
will provide a detailed discussion of this phenomenon, which is known as bloat,
and a variety of counter measures for it in Section 11.3. However, in this
chapter we will review work where the size of evolved solutions has been
used as an additional objective in multi-objective GP systems. Of course,
we will also describe work where other objectives were used.
9.1 Combining Multiple Objectives into a Scalar Fitness Function
For example, one could use a linear combination of the form f = Σ_i w_i f_i, where the parameters w_1, w_2, . . . are suitable constants. A MOO problem can then
be solved by using any single-objective optimisation technique with f as a
fitness function. This method has been used frequently in GP to control
bloat. By combining program fitness and program size to form a parsimonious fitness function one can evolve solutions that satisfy both objectives
(see Koza (1992); Zhang and Mühlenbein (1993, 1995); Zhang, Ohm, and Mühlenbein (1997) and Section 11.3.2).
A semi-linear aggregation of fitness and speed was used in (Langdon
and Poli, 1998b) to improve the performance of GP on the Santa Fe Trail
Ant problem. There, a threshold was used to limit the impact of speed to
avoid providing an excessive bias towards ants that were fast but could not
complete the trail.
A fitness measure which linearly combines two related objectives, the
sum of squared errors and the number of hits (a hit is a fitness case in which
the error falls below a pre-defined threshold), was used in (Langdon, Barrett,
and Buxton, 2003) to predict biochemical interactions in drug discovery.
Zhang and Bhowan (2004) used a MO GP approach for object detection.
Their fitness function was a linear combination of the detection rate (the
percentage of small objects correctly reported), the false alarm rate (the
percentage of non-objects incorrectly reported as objects), and the false
alarm area (the number of false alarm pixels which were not object centres
but were incorrectly reported as object centres).
O'Reilly and Hemberg (2007) used six objectives for the evolution of
L-systems which developed into 3-D surfaces in response to a simulated
environment. The objectives included the size of the surface, its smoothness,
its symmetry, its undulation, the degree of subdivision of the surface, and
the softness of its boundaries.
Koza, Jones, Keane, and Streeter (2004) used 16 different objectives
in the process of designing analogue electrical circuits. In the case of an
amplifier circuit these included: the 10dB initial gain, the supply current, the
offset voltage, the gain ratio, the output swing, the variable load resistance
signal output, etc. These objectives were combined in a complex heuristic
way into a scalar fitness measure. In particular, objectives were divided
into groups and many objectives were treated as penalties that were applied
to the main fitness components only if they were outside certain acceptable
tolerances.
9.2 Keeping the Objectives Separate
Since selection does not depend upon how the members of the population
are represented, the MOO techniques developed for other evolutionary algorithms can be easily adapted to GP.
[Figure 9.1: points in a two-objective (x, y) space; individuals A, B, 1 and 2 are marked, and the non-dominated points form the Pareto front.]
The main idea in MOO is the notion of Pareto dominance. Given a set
of objectives, a solution is said to Pareto dominate another if the first is not
inferior to the second in all objectives, and, additionally, there is at least one
objective where it is better. This notion can lead to a partial order, where
there is no longer a strict linear ordering of solutions. In Figure 9.1, for
example, individual A dominates (is better than) individual B along the y
axis, but B dominates A along the x axis. Thus there is no simple ordering between them. The individual marked 2, however, dominates B on both
axes and would thus be considered strictly better than B.
In this case the goal of the search algorithm becomes the identification of
a set of solutions which are non-dominated by any others. Ideally, one would
want to find the Pareto front, i.e., the set of all non-dominated solutions in
the search space. However, this is often unrealistic, as the size of the Pareto
front is often limited only by the precision of the problem representation. If
x and y in Figure 9.1 are real-valued, for example, and the Pareto front is
a continuous curve, then it contains an infinite number of points, making a
complete enumeration impossible.
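
In code, the dominance test and a naive non-dominated filter look as follows; objectives are assumed to be maximised, and the quadratic-time filter is fine for illustration although real MOO algorithms use cleverer bookkeeping.

    # Illustrative Pareto dominance check and non-dominated filter.
    def dominates(a, b):
        """a dominates b if it is no worse on every objective and better on at least one."""
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    def non_dominated(points):
        return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

    points = [(1, 5), (2, 4), (3, 3), (2, 2), (4, 1)]
    print(non_dominated(points))     # [(1, 5), (2, 4), (3, 3), (4, 1)]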
9.2.1 Multi-objective Bloat and Complexity Control
Rodriguez-Vazquez, Fonseca, and Fleming (1997) performed non-linear system identification using a MO GP system, where individuals were selected
based on the Pareto dominance idea. The two objectives used were fitness
and model complexity. In each generation individuals were ranked based on
how many other individuals dominated them, and fitness was based on their
rank. To better cover the Pareto front, niching via fitness sharing (Goldberg, 1989) was also performed. Preference information was also included
to focus the selection procedure towards specific regions of the Pareto front.
Hinchliffe, Willis, and Tham (1998) applied similar ideas to evolve parsimonious and accurate models of chemical processes using MO GP. Langdon
and Nordin (2000) applied Pareto tournaments to obtain compact solutions
in programmatic image compression, two machine learning benchmark problems and a consumer profiling task. Nicolotti, Gillet, Fleming, and Green
(2002) used multi-objective GP to evolve quantitative structure–activity relationship models in chemistry; objectives included model fitting, the total
number of terms and the occurrence of non-linear terms.
Ekart and Nemeth (2001) tried to control bloat using a variant of Pareto
tournament selection where an individual is selected if it is not dominated
by a set of randomly chosen individuals. If the test fails, another individual
is picked from the population, until one that is non-dominated is found.
In order to prevent very small individuals from taking over the population
in the early generations of runs, the Pareto criterion was modified so as
to consider as non-dominated solutions also those that were only slightly
bigger, provided their fitness was not worse.
Bleuler, Brack, Thiele, and Zitzler (2001) suggested using the well-known
multi-objective optimiser SPEA2 (Zitzler, Laumanns, and Thiele, 2001) to
reduce bloat. de Jong, Watson, and Pollack (2001) and de Jong and Pollack
(2003) proposed using a multi-objective approach to promote diversity and
reduce bloat, stressing that without diversity enhancement (given by modern
MOO methods) searches can easily converge to solutions that are too small
to solve a problem. Tests with even parity and other problems were very
encouraging. Badran and Rockett (2007) argued in favour of using mutation
to prevent the population from collapsing onto single-node individuals when
using a multi-objective GP.
As well as directly fighting bloat, MO GP can also be used to simplify
solution trees. After GP has found a suitable (but large) model, for example,
one can continue the evolutionary process, changing the fitness function to
include a second objective that the model be as small as possible (Langdon,
1998). GP can then trim the trees while ensuring that the simplified program
still fits the training data.
9.2.2
Other Objectives
round of comparisons with the rest of the population was used as a tie
breaker. The method successfully evolved queues, lists, and circular lists.
Langdon and Poli (1998b) used Pareto selection with two objectives,
fitness and speed, to improve the performance of GP on the Santa Fe Trail
Ant problem. Ross and Zhu (2004) used MO GP with different variants of
Pareto selection to evolve 2-D textures. The objectives were feature tests
that were used during fitness evaluation to rate how closely a candidate
texture matched visual characteristics of a target texture image. Dimopoulos
(2005) used MO GP to identify the Pareto set for a cell-formation problem
related to the design of a cellular manufacturing production system. The
objectives included the minimisation of total intercell part movement, and
the minimisation of within-cell load variation.
Rossi, Liberali, and Tettamanzi (2001) used MO GP in electronic design
automation to evolve VHDL code. The objectives used were the suitability
of the filter transfer function and the transition activity of digital blocks.
Cordon, Herrera-Viedma, and Luque (2002) used Pareto-dominance-based
GP to learn Boolean queries in information retrieval systems. They used two
objectives: precision (the ratio between the relevant documents retrieved in
response to a query and the total number of documents retrieved) and recall
(the ratio between the relevant documents retrieved and the total number
of documents relevant to the query in the database).
Barlow (2004) used a GP extension of the well-known NSGA-II MOO
algorithm (Deb, Agrawal, Pratap, and Meyarivan, 2000) for the evolution of
autonomous navigation controllers for unmanned aerial vehicles. Their task
was locating radar stations, and all work was done using simulators. Four
objectives were used: the normalised distance from the emitter, the circling
distance from the emitter, the stability of the flight, and the efficiency of
the flight.
Araujo (2006) used MO GP for the joint solution of the tasks of statistical
parsing and tagging of natural language. Their results suggest that solving
these tasks jointly led to better results than approaching them individually.
Han, Zhou, and Wang (2006) used a MO GP approach for the identification of chaotic systems, where the objectives included chaotic invariants obtained by chaotic time series analysis, as well as the complexity and performance of the models.
Khan (2006) used MO GP to evolve digital watermarking programs. The
objectives were robustness in the decoding stage, and imperceptibility by the
human visual system. Khan and Mirza (2007) added a third objective aimed
at increasing the strength of the watermark in relation to attacks.
Kotanchek, Smits, and Vladislavleva (2006) compared different flavours
of Pareto-based GP systems in the symbolic regression of industrial data.
Weise and Geihs (2006) used MO GP to evolve protocols in sensor networks.
The goal was to identify one node on a network to act as a communication
relay. The following objectives were used: the number of nodes that know
the designated node after a given amount of time, the size of the protocol
code, its memory requirements, and a transmission count.
Agapitos, Togelius, and Lucas (2007) used MO GP to encourage the
effective use of state variables in the evolution of controllers for toy car
racing. Three different objectives were used: the ratio of the number of
variables used within a program to the number of variables offered for use by
the primitive language, the ratio of the number of variables being set within
the program to the number of variables being accessed, and the average
positional distance between memory setting instructions and corresponding
memory reading instructions.
When two or three objectives need to be simultaneously optimised, the
Pareto front produced by an algorithm is often easy to visualise. When
more than three objectives are optimised, however, it becomes difficult to
directly visualise the set of non-dominated solutions. Valdes and Barton
(2006) proposed using GP to identify similarity mappings between high-dimensional Pareto fronts and 3-D space, and then using virtual reality to visualise the result.
9.2.3
Non-Pareto Criteria
Pareto dominance is not the only way to deal with multiple objectives without aggregating them into a scalar fitness function.
Schmiedle, Drechsler, Grosse, and Drechsler (2001) compared GP with
four different MOO selection methods on the identification of binary decision diagrams. Linear weighting of the objectives was compared against: a)
Pareto dominance; b) a weaker form of Pareto dominance where a solution
is preferred to another if the number of objectives where the first is superior
to the second is bigger than the number of objectives where the opposite is
true; c) lexicographic ordering (where objectives are ordered based on the
user's preference); and d) a new method based on priorities. The lexicographic parsimony pressure method proposed in (Luke and Panait, 2002;
Ryan, 1994) is in fact a form of MOO with lexicographic ordering (in which
shorter programs are preferred to longer ones whenever their fitness is the
same or sufficiently similar). An approach which combines Pareto dominance and lexicographic ordering was proposed in (Panait and Luke, 2004).
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions
which initially guide GP towards solutions that maximise the main objective. When enough of the population has reached reasonable levels in that
objective, the fitness function is modified so as to guide the population towards the optimisation of a second objective. In principle this process can
be iterated for multiple objectives. Of course, care needs to be taken to
ensure that the functionality reached with a set of previous fitness measures
is not wiped out by the search for the optima of a later fitness function. This
can be avoided by making sure each new fitness function somehow includes
all the previous ones. For example, the fitness based on the new objectives
can be added to the pre-existing objectives with some appropriate scaling
factors.
A similar effect can be achieved via static, but staged, fitness functions.
These are staged in the sense that certain levels of fitness are only made
available to an individual once it has reached a minimum acceptable performance on all objectives at the previous level. If each level represents one of
the objectives, individuals are then encouraged to evolve in directions that
ensure that good performance is achieved and retained on all objectives.
Koza et al. (1999) used this strategy when using GP for the evolution of
electronic circuits where many criteria, such as input-output performance,
power consumption, size, etc., must all be taken into account to produce
good circuits. Kalganova and Miller (1999) used Cartesian GP (see Section 7.2.3) to design combinational logic circuits. A circuits fitness was
given by a value between 0 and 100 representing the percentage of output
bits that were correct. If the circuit was 100% functional, then a further
component was added which represented the number of gates in the graph
that were not involved in the circuit. Since all individuals had the same
number of gates available in the Cartesian GP grid, this could be used to
minimise the number of gates actually used to solve the problem at hand.
9.4 Multi-objective Optimisation via Operator Bias
The pygmies and civil servants approach proposed in (Ryan, 1994, 1996)
combines the separation typical of Pareto-based approaches with biased
search operators. In this system two lists are built, one where individuals are ranked based on fitness and the other where individuals are ranked
based on a linear combination of fitness and size (i.e., a parsimonious fitness function). During crossover, the algorithm draws one parent from the
first list and the other from the second list. This can be seen as a form of
disassortative mating aimed at maintaining diversity in the population. Another example of this kind is (Zhang and Rockett, 2005) where crossover
was modified so that an offspring is retained only if it dominates either of
its parents.
Furthermore, as discussed in Sections 5.2 and 11.3.2, there are several
mutation operators with a direct or indirect bias towards smaller programs.
This provides a pressure towards the evolution of more parsimonious solutions throughout a run.
As with the staged fitness functions discussed in the previous section,
it is also possible to activate operators with a known bias towards smaller
programs only when the main objective, say a 100% correct solution, has
been achieved. This was tested in (Pujol, 1999; Pujol and Poli, 1997), where
GP was used to evolve neural networks. After a 100% correct solution was
found, one hidden node of each network in the population was replaced by
a terminal, and the evolution process was resumed. This pruning procedure
was repeated until the specified number of generations had been reached.
Chapter 10
Fast and Distributed Genetic Programming
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness
While admirers of linear GP will suggest that machine code GP is the ultimate in speed, all forms of GP can be made faster in a number of ways.
The first is to reduce the number of times a program is evaluated.
Many applications find the fitness of programs by running them on multiple fitness cases.
used over time, the evolving population saw more of the training data and so
was less liable to over fit a fraction of them. Thirdly, by randomly changing
the fitness function, it became more difficult for evolution to produce an
overspecialised individual which took over the population at the expense of
solutions which were viable on other parts of the training data. Dynamic
subset selection (DSS) appears to have been the most successful of Gathercole's suggested algorithms. It has been incorporated into Discipulus (see
page 63), and was recently used in a large data mining application (Curry,
Lichodzijewski, and Heywood, 2007).
Where each fitness evaluation may take a long time, it may be attractive to interrupt a long-running program in order to let others run. In GP
systems which allow recursion or contain iterative elements (Brave, 1996;
Langdon, 1998; Wilson and Heywood, 2007; Wong and Leung, 1996) it is
common to enforce a time limit, a limit on the number of instructions executed, or a bound on the number of times a loop is executed. Maxwell
(1994) proposed a solution to the question of what fitness to give to a program that has been interrupted. He allowed each program in the population
a quantum of CPU time. When the program used up its quantum it was
check-pointed.4 In Maxwells system, programs gained fitness as they ran,
i.e., each time a program correctly processed a fitness case, its fitness was
incremented. Tournament selection was then performed. If all members of
the tournament had used the same number of CPU quanta, then the fitter
program was the winner. If, however, one program had used less CPU than
the others (and had a lower fitness) then it was restarted and run until it
had used as much CPU as the others. Then fitnesses were compared in the
normal way.
Teller (1994) had a similar but slightly simpler approach: every individual in the population was run for the same amount of time. When the
allotted time elapsed a program was aborted and an answer extracted from
it, regardless of whether it had terminated or not. Teller called this an any-time approach. This suits graph systems like Teller's PADO (Section 7.2.2)
or linear GP (Chapter 7.1) where it is easy to designate a register as the
output register. The answer can then be extracted from this register or from
an indexed memory cell at any point (including whilst the program is
running). Other any time approaches include (Spector and Alpern, 1995)
and (Langdon and Poli, 2008).
A simple technique to speed up the evaluation of complex fitness functions is to organise the fitness function into stages of progressively increasing
computational cost. Individuals are evaluated stage by stage. Each stage
contributes to the overall fitness of a program. However, individuals need
4 When a program is check-pointed, sufficient information (principally the program
counter and stack) is saved so that it can later be restarted from where it was stopped.
Many multi-tasking operating systems do something similar.
10.2 Reducing Cost of Fitness with Caches
only must the answer be stored, but the interpreter needs to know that the
subtree's inputs are the same too. The common practices of GP come to our
aid here. Usually every tree in the population is run on exactly the same
inputs for each of the fitness cases. Thus, for a cache to work, the interpreter
does not need to know a trees inputs in detail, it need only know which of
the fixed set of test cases was used.
A simple means of implementing this type of cache is to store a vector of
values returned by each subtree for each of the test cases. Whenever a subtree is created (i.e., in the initial generation, by crossover or by mutations)
the interpreter is run and the cache of values for its root node is set. Note
this is recursive, so caches can also be calculated for subtrees within it at
the same time. Now, when the interpreter is run and comes to a subtrees
root node, it will simply retrieve the value it calculated earlier, using the
test case's number as an index into the cache vector.
If a subtree is created by mutation, then its cache of values will be
initially empty and will have to be calculated. However, this costs no more
than it would without caches.
When code is inserted into an existing tree, be it by mutation or
crossover, the chance that the new code behaves identically to the old code
is normally very small. This means that the caches of every node between
the new code and the root node may be invalid. The simplest solution is
to re-evaluate them all. This may sound expensive, but the caches in all
the other parts of the individual remain valid and can be used when the
cache above them is re-evaluated. Thus, in effect, if the crossed over code is
inserted at depth d, only d nodes need to be evaluated.
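
A minimal sketch of such a cache is given below; the node layout and primitive set are assumptions for illustration. Each cached node stores one value per fitness case, and the interpreter consults the cache before descending into a subtree.

    # Illustrative per-subtree caching of values across fitness cases.
    class Node:
        def __init__(self, op, children=()):
            self.op, self.children = op, list(children)
            self.cache = None                        # one value per fitness case, or None

    def eval_node(node, case_idx, case_inputs):
        if node.cache is not None:                   # cache hit: value computed earlier
            return node.cache[case_idx]
        if node.op == 'x':
            return case_inputs[case_idx]
        args = [eval_node(c, case_idx, case_inputs) for c in node.children]
        return args[0] + args[1] if node.op == '+' else args[0] * args[1]

    def fill_cache(node, cases):
        """Run a (sub)tree on every fitness case and store the resulting vector."""
        node.cache = [eval_node(node, i, cases) for i in range(len(cases))]

    cases = [1.0, 2.0, 3.0]
    sub = Node('*', [Node('x'), Node('x')])
    fill_cache(sub, cases)                           # sub.cache == [1.0, 4.0, 9.0]
    root = Node('+', [sub, Node('x')])
    print([eval_node(root, i, cases) for i in range(len(cases))])   # [2.0, 6.0, 12.0]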
The whole question of monitoring how effective individual caches are,
what their hit-rates are, etc. has been little explored. In practice, impressive
savings have been achieved by simple implementations, with little monitoring and rudimentary garbage collection. Recent analysis (Ciesielski and Li,
2004; Dignum and Poli, 2007; Langdon and Poli, 2002; Poli et al., 2007)
has shown that GP trees tend not to have symmetric shapes, and many
leaves are very close to the root. This provides a theoretical explanation for
why considerable computational saving can be made by using fitness caches.
While it is possible to use hashing schemes to efficiently find common code,
in practice assuming that common code only arises because it was inherited
from the same location (e.g., by crossing over) is sufficient.
As well as the original directed acyclic graph (DAG) implementation (Handley, 1994), other work includes (Ciesielski and Li, 2004; Keijzer, 1996;
McPhee, Hopper, and Reierson, 1998; Yangiya, 1995). While so far we have
only considered programs where no side effects take place, there are cases
where caching can be extended outside this domain. For example, Langdon
(1998) used fitness caches in evolved trees with side effects by exploiting
syntax rules about where in the code the side-effects could lie.
10.3
10.4 Running GP on Parallel Hardware
10.4.1
Master-slave GP
of computers. Each GP individual and its fitness cases are sent across the
network to a different compute node. The central node waits for the compute
nodes to return their individuals' fitnesses. Since individuals and fitness
values are typically stored in small data structures, this can be quite efficient
since transmission overheads are limited.
The central node is an obvious bottleneck. Also, a slow compute node
or a lengthy fitness case will slow down the whole GP population, since
eventually its result will be needed before moving onto the next generation.
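
The structure of master-slave evaluation can be sketched with a local process pool standing in for the network of compute nodes; the fitness function here is a trivial placeholder.

    # Illustrative master-slave fitness evaluation using a local process pool.
    from multiprocessing import Pool

    def fitness(individual):
        return sum(individual)                       # placeholder fitness function

    if __name__ == '__main__':
        population = [[1, 2, 3], [4, 5], [6]]
        with Pool() as workers:                      # the workers play the compute nodes
            fitnesses = workers.map(fitness, population)
        print(fitnesses)                             # [6, 9, 6]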
10.4.2
GP Running on GPUs
Figure 10.1: nVidia 8800 block diagram. The 128 1360 MHz Stream Processors are arranged in 16 blocks of 8. Blocks share 16 KB memory (not shown), an 8/1 KB L1 cache, 4 Texture Address units and 8 Texture Filters. The 6 × 64-bit bus (dashed) links off-chip RAM at 900 MHz. (Since there are two chips for each of the six off-chip memory banks, the bus is effectively running at up to 1800 MHz per bank.) There are 6 Raster Operation Partitions. (nVidia, 2007).
that could be compiled for the GPU on the host PC. The compiled programs were transferred one at a time to a GPU for fitness evaluation. Both
groups obtained impressive speedups by running many test cases in parallel.
Langdon and Banzhaf (2008) and Langdon and Harrison (2008) created
a SIMD interpreter (Juille and Pollack, 1996) using RapidMind's GNU C++
OpenGL framework to simultaneously run up to a quarter of a million GP
trees on an NVIDIA GPU (see Figure 10.1).5 As discussed in Section 7.1.2,
GP trees can be linearised. This avoids pointers and yields a very compact
data structure; reducing the amount of memory needed in turn facilitates
the use of large populations. To avoid recursive calls in the interpreter,
Langdon used reverse polish notation (RPN), i.e., a post-fix rather than
a pre-fix notation. Only small modifications are needed to crossover and
mutation so that they act directly on the RPN expressions. This means the
same representation is used on both the host and the GPU. Almost a billion
GP primitives can be interpreted by a single graphics card per second. In
both Cartesian and tree-based GP the genetic operations are done by the
host CPU. Wong, Wong, and Fok (2005) showed, for a genetic algorithm,
these too can be done by the GPU.
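
The essence of such a reverse polish interpreter is shown below in plain Python; the real systems run the equivalent loop on the GPU over many programs and fitness cases at once.

    # Illustrative reverse polish (postfix) interpreter for a linearised GP tree.
    def eval_rpn(program, x):
        """program is a flat postfix token list; the tree (x + 3) * x becomes
        ['x', 3, '+', 'x', '*']."""
        stack = []
        for token in program:
            if token == 'x':
                stack.append(x)
            elif isinstance(token, (int, float)):
                stack.append(token)
            else:                                    # binary operator: pop two, push result
                b, a = stack.pop(), stack.pop()
                stack.append(a + b if token == '+' else a * b)
        return stack[-1]

    print([eval_rpn(['x', 3, '+', 'x', '*'], x) for x in (1.0, 2.0)])   # [4.0, 10.0]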
Although each of the GPU's processors may be individually quite fast
and the manufacturers claim huge aggregate FLOPS ratings, the GPUs are
optimised for graphics work. In practice, it is hard to keep all the processors
fully loaded. Nevertheless 30 GFLOPS has been achieved (Langdon and
Harrison, 2008). Given the differences in CPU and GPU architectures and
clock speeds, often the speedup from using a GPU rather than the host
CPU is the most useful statistic. This is obviously determined by many
factors, including the relative importance of amount of computation and
size of data. The measured RPN tree speedups were 7.6-fold (Langdon and
Harrison, 2008) and 12.6-fold (Langdon and Banzhaf, 2008).
10.4.3
GP on FPGAs
Field programmable gate arrays (FPGAs) are chips which contain large arrays of simple logic processing units whose functionality and connectivity
can be changed via software in microseconds by simply writing a configuration into a static memory. Once an FPGA is configured it can update
all of its thousands of logic elements in parallel at the clock speed of the
circuit. Although an FPGA's clock speed is often an order of magnitude
slower than that of a modern CPU, its massive parallelism makes it a very
powerful computational device. Because of this and of their flexibility there
has been significant interest in using FPGAs in GP.
Work has ranged from the use of FPGAs to speed up fitness evaluation
5 Bigger populations, e.g. five million programs (Langdon and Harrison, 2008), are
possible by loading them onto the GPU in 256k units.
(Koza, Bennett, Hutchings, Bade, Keane, and Andre, 1997; Seok, Lee, and
Zhang, 2000) to the definition of specialised operators (Martin and Poli,
2002). It is even possible to implement a complete GP on FPGAs, as suggested in (Heywood and Zincir-Heywood, 2000; Martin, 2001, 2002; Sidhu,
Mei, and Prasanna, 1998). A massively parallel GP implementation has also
been proposed by Eklund (2001, 2004) although to date all tests with that
architecture have only been performed in simulation.
10.4.4
Sub-machine-code GP
10.5
Geographically Distributed GP
Chapter 11
GP Theory and its Practical Consequences
11.1
Mathematical Models
Schema theories are among the oldest and the best known models of evolutionary algorithms (Holland, 1992; Whitley, 1994). Schema theories are
based on the idea of partitioning the search space into subsets, called
schemata. They are concerned with modelling and explaining the dynamics
of the distribution of the population over the schemata. Modern genetic
algorithm schema theory (Stephens and Waelbroeck, 1997, 1999) provides
exact information about the distribution of the population at the next generation in terms of quantities measured at the current generation, without
having to actually run the algorithm.
The theory of schemata in GP has had a difficult childhood. Some excellent early efforts led to different worst-case-scenario schema theorems (Altenberg, 1994; Koza, 1992; O'Reilly and Oppacher, 1994b; Poli and Langdon,
1997; Rosca, 1997; Whigham, 1995). Only very recently have the first exact schema theories become available (Poli, 2000a,b, 2001a) which give exact
formulations (rather than lower bounds) for the expected number of individuals sampling a schema at the next generation. Initially (Poli, 2000b, 2001a),
these exact theories were only applicable to GP with one-point crossover (see
Section 5.3). However, more recently they have been extended to the class of
homologous crossovers (Poli, McPhee, and Rowe, 2004) and to virtually all
types of crossovers that swap subtrees (Poli and McPhee, 2003a,b), including
standard GP crossover with and without uniform selection of the crossover
points (Section 2.4), one-point crossover, context-preserving crossover and
size-fair crossover (which have been described in Section 5.3), as well as
more constrained forms of crossover such as strongly-typed GP crossover
(see Section 6.2.2), and many others.
Other models of evolutionary algorithms include models based on
Markov chain theory (e.g. (Davis and Principe, 1993; Nix and Vose, 1992))
and on statistical mechanics (e.g. (Prügel-Bennett and Shapiro, 1994)).
Markov models have been applied to GP (Mitavskiy and Rowe, 2006; Poli
et al., 2004; Poli, Rowe, and McPhee, 2001), but so far they have not been
11.2
Search Spaces
Figure 11.1: Proportion of NAND trees that yield each three-input function. As circuit size increases, the distribution approaches a limit.
e) CCNOT (Toffoli gate) computer, f) quantum computers, g) the average
computer and h) AND, NAND, OR, NOR expressions.
Recently, Langdon and Poli (2006) and Poli and Langdon (2006b) started extending these results to Turing-complete machine code programs. For this purpose, a simple, but realistic, Turing-complete machine code language,
T7, was considered. It includes: directly accessed bit addressable memory,
an addition operator, an unconditional jump, a conditional branch and four
copy instructions. A mathematical analysis of the halting process based on
a Markov chain model of program execution and halting was performed.
The model can be used to estimate, for any given program length, important quantities, such as the halting probability and the run time of halting
programs. This showed a scaling law indicating
that the halting probabil
number
ity for programs of length L is of order 1/ L, while the expected
11.3 Bloat
11.3.1 Bloat in Theory
E[μ(t + 1)] = Σ_ℓ ℓ p(ℓ, t),        (11.1)

where μ(t + 1) is the mean size of the programs in the population at generation t + 1, E is the expectation operator, ℓ is a program size, and p(ℓ, t) is the probability of selecting programs of size ℓ from the population in generation t.
This equation can be rewritten in terms of the expected change in average program size as:

E[μ(t + 1) − μ(t)] = Σ_ℓ ℓ × (p(ℓ, t) − Φ(ℓ, t)),        (11.2)

where Φ(ℓ, t) is the proportion of programs of size ℓ in the current generation. Both equations apply to a GP system with selection and any form of symmetric subtree crossover.1
Note that Equations (11.1) and (11.2) do not directly explain bloat. They
are, however, important because they constrain what can and cannot happen size-wise in GP populations. Any explanation for bloat (including the
theories summarised above) has to agree with Equations (11.1) and (11.2).
In particular, Equation (11.1) predicts that, for symmetric subtree-swapping crossover operators, the mean program size evolves as if selection alone were acting on the population. This means that if there is a change in mean size (bloat, for example) it must be the result of some form of positive or negative selective pressure on some or all of the length classes ℓ. Equation (11.2) shows that there can be bloat only if the selection probability p(ℓ, t) is different from the proportion Φ(ℓ, t) for at least some ℓ. In particular, for bloat to happen there will have to be some small ℓ's for which p(ℓ, t) < Φ(ℓ, t) and also some bigger ℓ's for which p(ℓ, t) > Φ(ℓ, t) (at least on average).
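To make the role of these quantities concrete, here is a minimal Python sketch (with a purely hypothetical population and selection sample, not taken from any experiment) that computes the expected change in mean program size predicted by Equation (11.2):

```python
from collections import Counter

def expected_size_change(sizes, selected_sizes):
    """Expected change in mean program size, as in Equation (11.2).

    sizes          -- sizes of all programs in the current population
    selected_sizes -- sizes of the programs picked by selection
    Returns sum over l of l * (p(l, t) - Phi(l, t)).
    """
    n, m = len(sizes), len(selected_sizes)
    phi = Counter(sizes)           # occurrences of each size class in the population
    p = Counter(selected_sizes)    # how often each size class was selected
    return sum(l * (p[l] / m - phi[l] / n) for l in set(phi) | set(p))

# Hypothetical example: selection favours larger programs, so the expected
# change in mean size is positive, i.e. a bloat-like pressure.
population = [5, 5, 7, 9, 11, 11, 13, 15]
selected = [9, 11, 11, 13, 13, 15, 15, 15]
print(expected_size_change(population, selected))   # prints 3.25
```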
11.3.2
Numerous empirical techniques have been proposed to control bloat (Langdon et al., 1999; Soule and Foster, 1998b). We cannot look at them all.
However, we briefly review some of the most important.
Size and Depth Limits
Rather naturally, the first and simplest method to control code growth is the
use of hard limits on the size or depth of the offspring programs generated
by the genetic operators.
Many implementations of this idea (e.g., (Koza, 1992)) apply a genetic
operator and then check whether the offspring is beyond the size or depth
limit. If it isn't, the offspring enters the population. If, instead, the offspring exceeds the limit, one of the parents is returned. Obviously, this
implementation does not allow programs to grow too large. However, there
is a serious problem with this way of applying size limits, or more generally,
constraints to programs: parent programs that are more likely to violate a
constraint will tend to be copied (unaltered) more often than programs that
don't. That is, the population will tend to be filled up with programs that
nearly infringe the constraint, which is typically not what is desired.
It is well known, for example, that depth thresholds lead to the population filling up with very bushy programs where most branches reach the
depth limit (being effectively full trees). Conversely, size limits produce populations of stringy programs which all tend to approach the size limit.
See (Crane and McPhee, 2005; McPhee, Jarvis, and Crane, 2004) for more
on the impact of size and depth limits, and the differences between them.
The problem can be fixed by not returning parents if the offspring violates
a constraint. This can be realised with two different strategies. Firstly, one
can just return the oversize offspring, but give it a fitness of 0, so that
selection will get rid of it at the next generation. Secondly, one can simply
declare the genetic operation failed, and try again. This can be done in two
alternative ways: a) the same parent or parents are used again, but new
mutation or crossover points are randomly chosen (which can be done up
to a certain number of times before giving up on those parents), or b) new
parents are selected and the genetic operation is attempted again.
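As a rough illustration of these options, the sketch below applies a hard size limit by retrying the operation a few times with the same parents and, if that fails, accepting the oversize offspring with fitness 0. The `operator` and `size` helpers are assumptions standing in for whatever a particular GP system provides.

```python
def constrained_offspring(parents, operator, size, max_size, max_retries=3):
    """Apply a genetic operator under a hard size limit without cloning a parent.

    operator -- crossover/mutation; picks fresh random points on every call
    size     -- returns the size (node count) of a program
    Returns (offspring, forced_fitness); forced_fitness is 0.0 when the
    offspring exceeds the limit and should be removed by selection later.
    """
    child = operator(*parents)
    for _ in range(max_retries - 1):
        if size(child) <= max_size:
            return child, None          # within the limit: evaluate fitness normally
        child = operator(*parents)      # retry: same parents, new random points
    if size(child) <= max_size:
        return child, None
    return child, 0.0                   # give up: keep offspring but zero its fitness
```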
If a limit is used, programs must not be so tightly constrained that they
cannot express any solution to the problem. As a rule of thumb, one should
try to estimate the size of the minimum possible solution (using the terminals
and functions given to GP) and add some percentage (e.g., 50-200%) as a
safety margin. In general, however, it may be hard to heuristically come up
with good limits, so some trial and error may be required. Alternatively,
one can use one of the many techniques that have been proposed to adjust
size limits during runs. These can be both at the level of individuals and the
population. See for example the work by Silva and Almeida (2003); Silva
and Costa (2004, 2005a,b); Silva, Silva, and Costa (2005).
Anti-bloat Genetic Operators
One can control bloat by using genetic operators which directly or indirectly
have an anti-bloat effect.
Among the most recent bloat-control methods are size fair crossover
and size fair mutation (Crawford-Marks and Spector, 2002; Langdon, 2000).
These work by constraining the choices made during the execution of a
genetic operation so as to actively prevent growth. In size-fair crossover, for
example, the crossover point in the first parent is selected randomly, as in
standard crossover. Then the size of the subtree to be excised is calculated.
This is used to constrain the choice of the second crossover point so as
to guarantee that the subtree chosen from the second parent will not be
unfairly big.
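The following sketch conveys the idea; the tree helpers (`all_nodes`, `subtree_size`, `swap_subtrees`) and the exact bound on the donated subtree are assumptions, not the precise rule used by Langdon (2000).

```python
import random

def size_fair_crossover(p1, p2, all_nodes, subtree_size, swap_subtrees):
    """Sketch of size-fair subtree crossover.

    The crossover point in the first parent is chosen uniformly at random; the
    point in the second parent is restricted so the donated subtree cannot be
    unfairly big relative to the subtree being excised from the first parent.
    """
    point1 = random.choice(all_nodes(p1))
    excised = subtree_size(p1, point1)

    # One plausible constraint: donor subtrees at most twice the excised size.
    candidates = [n for n in all_nodes(p2) if subtree_size(p2, n) <= 2 * excised]
    if not candidates:
        return p1                                   # no fair swap available
    point2 = random.choice(candidates)
    return swap_subtrees(p1, point1, p2, point2)    # offspring with subtrees exchanged
```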
Older methods include several mutation operators that may help control
the average tree size in the population while still introducing new genetic
material. Kinnear (1993) proposes a mutation operator which prevents the
offspring's depth being more than 15% larger than that of its parent. Langdon
(1998) proposes two mutation operators in which the new random subtree is
on average the same size as the code it replaces. In Hoist mutation (Kinnear,
1994a) the new subtree is selected from the subtree being removed from the
parent, guaranteeing that the new program will be smaller than its parent.
Shrink mutation (Angeline, 1996) is a special case of subtree mutation where
the randomly chosen subtree is replaced by a randomly chosen terminal.
McPhee and Poli (2002) provides theoretical analysis and empirical evidence
that combinations of subtree crossover and subtree mutation operators can
control bloat in linear GP systems.
Other methods which control bloat by exploiting the bias of the operators
were discussed in Section 9.4.
Anti-Bloat Selection
As clarified by the size evolution equation discussed in the previous section,
in systems with symmetric operators, bloat can only happen if there are
some longer-than-average programs that are fitter than average or some
shorter-than-average programs that are less fit than average, or both. So,
it stands to reason that in order to control bloat one needs to somehow
modulate the selection probabilities of programs based on their size.
As we have discussed in Section 9.2.1, recent methods also include the
use of multi-objective optimisation to control bloat. This typically involves
the use of a modified selection based on the Pareto criterion.
A recent technique, the Tarpeian method (Poli, 2003), controls bloat
by acting directly on the selection probabilities in Equation (11.2). This is
done by setting the fitness of randomly chosen longer-than-average programs
to 0. This prevents them being parents. By changing how frequently this
is done the anti-bloat intensity of Tarpeian control can be modulated. An
advantage of the method is that the programs whose fitness is zeroed are
never executed, thereby speeding up runs.
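A minimal sketch of the idea follows; the kill probability and the helper names are illustrative assumptions rather than the exact formulation of Poli (2003).

```python
import random

def tarpeian(fitness_fn, sizes, size, p_kill=0.3):
    """Wrap a fitness function with Tarpeian bloat control.

    sizes  -- sizes of the programs in the current population
    size   -- returns the size of a single program
    Randomly chosen longer-than-average programs get fitness 0 *before* being
    run, so they are never executed and cannot become parents; p_kill sets the
    intensity of the anti-bloat pressure.
    """
    avg = sum(sizes) / len(sizes)

    def evaluate(program):
        if size(program) > avg and random.random() < p_kill:
            return 0.0               # zeroed without executing the program
        return fitness_fn(program)   # normal, possibly expensive, evaluation
    return evaluate
```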
The well-known parsimony pressure method (Koza, 1992; Zhang and Mühlenbein, 1993, 1995; Zhang et al., 1997) changes the selection probabilities by subtracting a value based on the size of each program from its fitness. Bigger programs have more subtracted and, so, have lower fitness and tend to have fewer children. That is, the new fitness function is f(x) − c × ℓ(x), where ℓ(x) is the size of program x, f(x) is its original fitness and c is a constant known as the parsimony coefficient.2 Zhang and Mühlenbein (1995) showed some benefits of adaptively adjusting the coefficient c at each generation, but most implementations actually keep the parsimony coefficient constant.
2 While the new fitness is used to guide evolution, one still needs to use the original
fitness function to recognise solutions and stop runs.
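The calculation is simple enough to sketch directly; the covariant option below corresponds to the Local coefficient c = Cov(ℓ, f)/Var(ℓ) used in Figure 11.2, and the code is an illustrative assumption rather than any particular published implementation.

```python
def parsimony_adjusted(fitnesses, sizes, c=None):
    """Parsimony pressure: adjusted fitness f(x) - c * l(x).

    If c is None, a covariant coefficient c = Cov(l, f) / Var(l) is computed
    from the current population; otherwise the given constant is used. The
    original fitnesses must still be kept to recognise solutions and stop runs.
    """
    n = len(sizes)
    if c is None:
        mean_l = sum(sizes) / n
        mean_f = sum(fitnesses) / n
        cov = sum((l - mean_l) * (f - mean_f) for l, f in zip(sizes, fitnesses)) / n
        var = sum((l - mean_l) ** 2 for l in sizes) / n
        c = cov / var if var > 0 else 0.0
    return [f - c * l for f, l in zip(fitnesses, sizes)]
```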
Figure 11.2: Plots of the evolution of average size over 500 generations for multiple runs of the 6-MUX problem with various forms of covariant parsimony pressure. The Constant runs had a constant target size of 150. In the Sin runs the target size was sin((generation + 1)/50.0) × 50.0 + 150. For the Linear runs the target size was 150 + generation. The Limited runs used no size control until the size reached 250, then the target was held at 250. Finally, the Local runs used c = Cov(ℓ, f)/Var(ℓ), which allowed a certain amount of drift but still avoided runaway bloat (see text).
Part III
Practical Genetic Programming
Chapter 12
Applications
Since its early beginnings, GP has produced a cornucopia of results. The
literature, which covers more than 5000 recorded uses of GP, reports an
enormous number of applications where GP has been successfully used as
an automatic programming tool, a machine learning tool or an automatic
problem-solving engine. It is impossible to list all such applications here.
In the following sections we start with a discussion of the general kinds
of problems where GP has proved successful (Section 12.1) and then review a representative subset for each of the main application areas of GP
(Sections 12.212.11), devoting particular attention to the important areas
of symbolic regression (Section 12.2) and human-competitive results (Section 12.3).
12.1
to individual problems; unveil unexpected relationships among variables; and, sometimes GP can discover new concepts that can then be
applied in a wide variety of circumstances.
12.2
In principle, there are as many possible applications of GP as there are applications for programs; in other words, virtually infinite. However, before
one can try to solve a new problem with GP, one needs to define an appropriate fitness function. In problems where only the side effects of a program are
of interest, the fitness function usually compares the effects of the execution
of a program in some suitable environments with a desired behaviour, often
in a very application-dependent manner. However, in many problems the
goal is to find a function whose output has some desired property, e.g., the
function matches some target values (as in the example given in Section 4.1).
This is generally known as a symbolic regression problem.
Many people are familiar with the notion of regression. Regression means
finding the coefficients of a predefined function such that the function best
fits some data. A problem with regression analysis is that, if the fit is not
good, the experimenter has to keep trying different functions by hand until
a good model for the data is found. Not only is this laborious, but also
the results of the analysis depend very much on the skills and inventiveness
of the experimenter. Furthermore, even expert users tend to have strong
mental biases when choosing functions to fit. For example, in many application areas there is a considerable tradition of using only linear or quadratic
models, even when the data might be better fit by a more complex model.
Symbolic regression attempts to go beyond this. It consists of finding
a function that fits the given data points without making any assumptions
about the structure of that function. Since GP makes no such assumption,
it is well suited to this sort of discovery task. Symbolic regression was one
of the earliest applications of GP (Koza, 1992), and continues to be widely
studied (Cai, Pacheco-Vega, Sen, and Yang, 2006; Gustafson, Burke, and
Krasnogor, 2005; Keijzer, 2004; Lew, Spencer, Scarpa, Worden, Rutherford,
and Hemez, 2006).
The steps necessary to solve symbolic regression problems include the five
preparatory steps mentioned in Chapter 2. We practiced them in the example in Chapter 4, which was an instance of a symbolic regression problem.
There is an important difference here, however: the data points provided in
Chapter 4 were computed using a simple formula, while in most realistic situations each point represents the measured values taken by some variables
at a certain time in some dynamic process, in a repetition of an experiment,
and so on. So, the collection of an appropriate set of data points for symbolic
regression is an important and sometimes complex task.
For instance, consider the case of using GP to evolve a soft sensor (Jordaan, Kordon, Chiang, and Smits, 2004). The intent is to evolve a function
that will provide a reasonable estimate of what a sensor (in an industrial
production facility) would report, based on data from other actual sensors
in the system. This is typically done in cases where placing an actual sensor
in that location would be difficult or expensive. However, it is necessary to
place at least one instance of such a sensor in a working system in order to
collect the data needed to train and test the GP system. Once the sensor
is placed, one would collect the values reported by that sensor and by all
the other real sensors that are available to the evolved function, at various
times, covering the various conditions under which the evolved system will
be expected to act.
Such experimental data typically come in large tables where numerous
Table 12.1: Samples showing the size and location of Elvis's finger tip as seen by its two eyes, given various right arm actuator set points (4 degrees of freedom). Cf. Figure 12.1. When the data are used for training, GP is asked to invert the mapping and evolve functions from data collected by both cameras showing a target location to instructions to give to Elvis's four arm motors so that its arm moves to the target.
Arm actuator (4 degrees of freedom)      Left eye            Right eye
                                          x    y  size        x    y  size
-376  -626  1000  -360                   44   10   29        -9   12   25
-372  -622  1000  -380                   43    7   29        -9   12   29
-377  -627   899  -359                   43    9   33       -20   14   26
-385  -635   799  -319                   38   16   27       -17   22   30
-393  -643   699  -279                   36   24   26       -21   25   20
-401  -651   599  -239                   32   32   25       -26   28   18
-409  -659   500  -200                   32   35   24       -27   31   19
-417  -667   399  -159                   31   41   17       -28   36   13
-425  -675   299  -119                   30   45   25       -27   39    8
-433  -683   199   -79                   31   47   20       -27   43    9
-441  -691    99   -39                   31   49   16       -26   45   13
 ...   ...   ...   ...                   ...  ...  ...       ...  ...  ...
Most symbolic regression fitness functions sum the errors measured for each record in the data set, as we did in Section 4.2.2.
Usually either the absolute difference or the square of the error is used.
The fourth preparatory step typically involves choosing a size for the
population (which is often done initially based on the perceived difficulty of
the problem, and is then refined based on the actual results of preliminary
runs). The user also needs to set the balance between the selection strength
(normally tuned via the tournament size) and the intensity of variation
(which can be varied by modifying the mutation and crossover rates, although many researchers tend to fix these at standard values).
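For concreteness, a fitness function of the kind described above might look like the following sketch, where the evolved program is treated as a callable and the data set is a placeholder (sum of absolute errors is used here, with squared error being an equally common alternative):

```python
def regression_fitness(program, records):
    """Sum of absolute errors of an evolved expression over a data set.

    program -- any callable mapping an input tuple to a predicted value
    records -- list of (inputs, target) pairs; lower fitness is better (0 = exact fit)
    """
    return sum(abs(program(*inputs) - target) for inputs, target in records)

# Hypothetical usage with a toy target function and an imperfect candidate.
data = [((x,), x * x + x + 1) for x in range(-5, 6)]
candidate = lambda x: x * x + x              # misses the constant term
print(regression_fitness(candidate, data))   # 11 records, each off by 1 -> 11
```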
Figure 12.1: Elvis sitting with its right hand outstretched. The apparent
position and size of a bright red laser attached to its finger tip is recorded
(see Table 12.1). The data are then used to train a GP to move the robot's
arm to a spot in three dimensions using only its eyes.
12.3 Human-Competitive Results
Getting machines to produce human-like results is the very reason for the
existence of the fields of artificial intelligence and machine learning. However, it has always been very difficult to assess how much progress these
fields have made towards their ultimate goal. Alan Turing understood that
in order to avoid human biases when assessing machine intelligence, machine
behaviour must be evaluated objectively. This led him to propose an imitation game, now known as the Turing test (Turing, 1950). Unfortunately,
the Turing test is not usable in practice, and so, there is a need for more
workable objective tests of machine intelligence.
Koza, Bennett, and Stiffelman (1999) suggested shifting attention from
the notion of intelligence to the notion of human competitiveness. A result
cannot acquire the rating of human competitive merely because it is endorsed by researchers inside the specialised fields that are attempting to
create machine intelligence. A result produced by an automated method
must earn the rating of human competitive independently of the fact that
it was generated by an automated method.
Koza proposed that an automatically-created result should be considered
human-competitive if it satisfies at least one of these eight criteria:
1. The result was patented as an invention in the past, is an improvement
over a patented invention or would qualify today as a patentable new
invention.
2. The result is equal to or better than a result that was accepted as a new
scientific result at the time when it was published in a peer-reviewed
scientific journal.
3. The result is equal to or better than a result that was placed into a
database or archive of results maintained by an internationally recognised panel of scientific experts.
4. The result is publishable in its own right as a new scientific result,
independent of the fact that the result was mechanically created.
5. The result is equal to or better than the most recent human-created
solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions.
6. The result is equal to or better than a result that was considered an
achievement in its field at the time it was first discovered.
7. The result solves a problem of indisputable difficulty in its field.
8. The result holds its own or wins a regulated competition involving
human contestants (in the form of either live human players or human-written computer programs).
These criteria are independent of, and at arm's length from, the fields of
artificial intelligence, machine learning, and GP.
Over the years, dozens of results have passed the human-competitiveness
test. Some pre-2004 human-competitive results include:
Creation of quantum algorithms, including a better-than-classical algorithm for a database search problem and a solution to an AND/OR
query problem (Spector et al., 1998, 1999).
Creation of a competitive soccer-playing program for the RoboCup 1997
competition (Luke, 1998).
Creation of algorithms for the transmembrane segment identification
problem for proteins (Koza, 1994, Sections 18.8 and 18.10) and (Koza
et al., 1999, Sections 16.5 and 17.2).
Creation of a sorting network for seven items using only 16 steps (Koza
et al., 1999, Sections 21.4.4, 23.6, and 57.8.1).
Synthesis of analogue circuits (with placement and routing, in some
cases), including: 60- and 96-decibel amplifiers (Koza et al., 1999,
Section 45.3); circuits for squaring, cubing, square root, cube root,
logarithm, and Gaussian functions (Koza et al., 1999, Section 47.5.3);
a circuit for time-optimal control of a robot (Koza et al., 1999, Section
48.3); an electronic thermometer (Koza et al., 1999, Section 49.3); a
voltage-current conversion circuit (Koza, Keane, Streeter, Mydlowec,
Yu, and Lanza, 2003, Section 15.4.4).
Creation of a cellular automaton rule for the majority classification
problem that is better than all known rules written by humans (Andre
et al., 1996).
Synthesis of topology for controllers, including: a PID (proportional,
integrative, and derivative) controller (Koza et al., 2003, Section 9.2)
and a PID-D2 (proportional, integrative, derivative, and second derivative) controller (Koza et al., 2003, Section 3.7); PID tuning rules that
outperform the Ziegler-Nichols and Astrom-Hagglund tuning rules
(Koza et al., 2003, Chapter 12); three non-PID controllers that outperform a PID controller that uses the Ziegler-Nichols or Astrom-Hagglund tuning rules (Koza et al., 2003, Chapter 13).
In total (Koza and Poli, 2005) lists 36 human-competitive results. These
include 23 cases where GP has duplicated the functionality of a previously
patented invention, infringed a previously patented invention, or created a
patentable new invention. Specifically, there are fifteen examples where GP
has created an entity that either infringes or duplicates the functionality of
a previously patented 20th-century invention, six instances where GP has
done the same with respect to an invention patented after 1 January 2000,
and two cases where GP has created a patentable new invention. The two
new inventions are general-purpose controllers that outperform controllers
employing tuning rules that have been in widespread use in industry for
most of the 20th century.
Many of the pre-2004 results were obtained by Koza. However, since
2004, a competition has been held annually at ACM's Genetic and Evolutionary Computation Conference (termed the Human-Competitive Awards, or the Humies). The $10,000 prize is awarded to projects that have produced automatically-created results which equal or better those produced
by humans.
The Humies Prizes have typically been awarded to applications of evolutionary computation to high-tech fields. Many used GP. For example,
the 2004 gold medals were given for the design, via GP, of an antenna for
12.4 Image and Signal Processing
Hampo and Marko (1992) were among the first people from industry to
consider using GP for signal processing. They evolved algorithms for preprocessing electronic motor vehicle signals for possible use in engine monitoring and control.
Several applications of GP for image processing have been for military
uses. For example, Tackett (1993) evolved algorithms to find tanks in infrared images. Howard, Roberts, and Brankin (1999); Howard, Roberts, and
Ryan (2006) evolved programs to pick out ships from SAR radar mounted
on satellites in space and to locate ground vehicles from airborne photo reconnaissance. They also used GP to process surveillance data for civilian
purposes, such as predicting motorway traffic jams from subsurface traffic
speed measurements (Howard and Roberts, 2004).
Using satellite SAR radar, Daida, Hommes, Bersano-Begey, Ross, and
Vesecky (1996) evolved algorithms to find features in polar sea ice. Optical satellite images can also be used for environmental studies (Chami and
Robilliard, 2002) and for prospecting for valuable minerals (Ross, Gualtieri,
Fueten, and Budkewitsch, 2005).
Esparcia-Alcazar used GP to find recurrent filters (including artificial neural networks (Esparcia-Alcazar and Sharman, 1996)) for one-dimensional electronic
signals (Sharman and Esparcia-Alcazar, 1993). Local search (simulated annealing or gradient descent) can be used to adjust or fine-tune constant
values within the structure created by genetic search (Smart and Zhang,
2004).
Yu and Bhanu (2006) have used GP to preprocess images, particularly
of human faces, to find regions of interest for subsequent analysis. See also
(Trujillo and Olague, 2006a).
Zhang has been particularly active at evolving programs with GP to
visually classify objects (typically coins) (Zhang and Smart, 2006). He has
also applied GP to human speech (Xie, Zhang, and Andreae, 2006).
Parisian GP is a system in which the image processing task is split
across a swarm of evolving agents (flies). In (Louchet, 2001; Louchet,
Guyon, Lesot, and Boumaza, 2002) the flies reconstruct three dimensions
from pairs of stereo images. For example, in (Louchet, 2001), as the flies
buzz around in three dimensions their position is projected onto the left and
right of a pair of stereo images. The fitness function tries to minimise the
discrepancy between the two images, thus encouraging the flies to settle on
visible surfaces in the 3-D space. So, the true 3-D space is inferred from
pairs of 2-D images taken from slightly different positions.
While the likes of Google have effectively indexed the written word, for
speech and pictures indexing has been much less effective. One area where
GP might be applied is in the automatic indexing of images. Some initial
steps in this direction are given in (Theiler, Harvey, Brumby, Szymanski,
Alferink, Perkins, Porter, and Bloch, 1999).
To some extent, extracting text from images (OCR) can be done fairly
reliably, and the accuracy rate on well formed letters and digits is close
to 100%. However, many interesting cases remain (Cilibrasi and Vitanyi,
2005) such as Arabic (Klassen and Heywood, 2002) and oriental languages,
handwriting (De Stefano, Cioppa, and Marcelli, 2002; Gagne and Parizeau,
2006; Krawiec, 2004; Teredesai and Govindaraju, 2005) (such as the MNIST
examples), other texts (Rivero, Rabuñal, Dorado, and Pazos, 2004) and musical
scores (Quintana, Poli, and Claridge, 2006).
The scope for applications of GP to image and signal processing is almost
unbounded. A promising area is medical imaging (Poli, 1996b). GP image
techniques can also be used with sonar signals (Martin, 2006). Off-line work
on images includes security and verification. For example, Usman, Khan,
Chamlawi, and Majid (2007) have used GP to detect image watermarks
which have been tampered with. Recent work by Zhang has incorporated
multi-objective fitness into GP image processing (Zhang and Rockett, 2006).
In 1999 Poli, Cagnoni and others founded the annual European Workshop on Evolutionary Computation in Image Analysis and Signal Processing
(EvoIASP). EvoIASP is held every year alongside EuroGP. Whilst not solely
dedicated to GP, many GP applications have been presented at EvoIASP.
12.5 Financial Trading, Time Series Prediction and Economic Modelling
GP is very widely used in the areas of financial trading, time series prediction
and economic modelling, and it is impossible to describe all its applications. In this section we will hint at just a few areas.
Chen has written more than 60 papers on using GP in finance and economics. Recent papers have looked at the modelling of agents in stock
markets (Chen and Liao, 2005), game theory (Chen, Duffy, and Yeh, 2002),
evolving trading rules for the S&P 500 (Yu and Chen, 2004) and forecasting
the Hang Seng index (Chen, Wang, and Zhang, 1999).
The efficient markets hypothesis is a tenet of economics. It is founded
on the idea that everyone in a market has perfect information and acts
rationally. If the efficient markets hypothesis held, then everyone would
see the same value for items in the market and so agree the same price.
Without price differentials, there would be no money to be made from the
market itself. Whether it is trading potatoes in northern France or dollars
for yen, it is clear that traders are not all equal and considerable doubt has
been cast on the efficient markets hypothesis. So, people continue to play the
stock market. Game theory has been a standard tool used by economists to
try to understand markets but is increasingly supplemented by simulations
with both human and computerised agents. GP is increasingly being used
as part of these simulations of social systems.
Neely, Weller, and Dittmar (1997), Neely and Weller (1999, 2001) and
Neely (2003) of the US Federal Reserve Bank used GP to study intra-day
technical trading on the foreign exchange markets; their results suggest the market is efficient, as they found no evidence of excess returns. This negative result
was criticised by Marney, Miller, Fyfe, and Tarbert (2001). Later work by
Neely, Weller, and Ulrich (2006) suggested that data after 1995 are consistent with Lo's adaptive markets hypothesis rather than the efficient markets
hypothesis. Note that here GP and computer tools are being used in a
novel data-driven approach to try and resolve issues which were previously
a matter of dogma.
From a more pragmatic viewpoint, Kaboudan shows GP can forecast international currency exchange rates (Kaboudan, 2005), stocks (Kaboudan,
2000) and stock returns (Kaboudan, 1999). Tsang and his co-workers continue to apply GP to a variety of financial arenas, including: betting (Tsang,
Li, and Butler, 1998), forecasting stock prices (Li and Tsang, 1999; Tsang
and Li, 2002; Tsang, Yung, and Li, 2004), studying markets (Martinez-Jaramillo and Tsang, 2007), approximating Nash equilibrium in game theory (Jin, 2005; Jin and Tsang, 2006; Tsang and Jin, 2006) and arbitrage
(Tsang, Markose, and Er, 2005). Dempster and HSBC also use GP in foreign exchange trading (Austin, Bates, Dempster, Leemans, and Williams,
2004; Dempster and Jones, 2000; Dempster, Payne, Romahi, and Thompson,
2001). Pillay has used GP in social studies and teaching aids in education,
e.g. (Pillay, 2003). As well as trees (Koza, 1990), other types of GP have
been used in finance, e.g. (Nikolaev and Iba, 2002).
Since 1995 the International Conference on Computing in Economics
and Finance (CEF) has been held every year. It regularly attracts GP papers, many of which are on-line. In 2007 Brabazon and O'Neill established
the European Workshop on Evolutionary Computation in Finance and Economics (EvoFIN). EvoFIN is held with EuroGP.
12.6
control laws to apply). For example, Fleming's group in Sheffield used multi-objective GP (Hinchliffe and Willis, 2003; Rodriguez-Vazquez, Fonseca, and
Fleming, 2004) to reduce the cost of running aircraft jet engines (Arkov,
Evans, Fleming, Hill, Norton, Pratt, Rees, and Rodriguez-Vazquez, 2000;
Evans, Fleming, Hill, Norton, Pratt, Rees, and Rodriguez-Vazquez, 2001).
Alves da Silva and Abrao (2002) surveyed GP and other AI techniques
applied in the electrical power industry.
12.7
Kell and his colleagues in Aberystwyth have had great success in applying
GP widely in bioinformatics (see infrared spectra above and (Allen, Davey,
Broadhurst, Heald, Rowland, Oliver, and Kell, 2003; Day, Kell, and Griffith,
2002; Gilbert, Goodacre, Woodward, and Kell, 1997; Goodacre and Gilbert,
1999; Jones, Young, Taylor, Kell, and Rowland, 1998; Kell, 2002a,b,c; Kell,
Darby, and Draper, 2001; Shaw, Winson, Woodward, McGovern, Davey,
Kaderbhai, Broadhurst, Gilbert, Taylor, Timmins, Goodacre, Kell, Alsberg,
and Rowland, 2000; Woodward, Gilbert, and Kell, 1999)). Another very
active group is that of Moore and his colleagues (Moore, Parker, Olsen, and
Aune, 2002; Motsinger, Lee, Mellick, and Ritchie, 2006; Ritchie, Motsinger,
Bush, Coffey, and Moore, 2007; Ritchie, White, Parker, Hahn, and Moore,
2003).
Computational chemistry is widely used in the drug industry. The properties of simple molecules can be calculated. However, the interactions between chemicals which might be used as drugs and medicinal targets within
the body are beyond exact calculation. Therefore, there is great interest in
the pharmaceutical industry in approximate in silico models which attempt
to predict either favourable or adverse interactions between proto-drugs and
biochemical molecules. Since these are computational models, they can be
applied very cheaply in advance of the manufacturing of chemicals, to decide
which of the myriad of chemicals might be worth further study. Potentially,
such models can make a huge impact both in terms of money and time
without being anywhere near 100% correct. Machine learning and GP have
both been tried. GP approaches include (Bains, Gilbert, Sviridenko, Gascon, Scoffin, Birchall, Harvey, and Caldwell, 2002; Barrett and Langdon,
2006; Buxton, Langdon, and Barrett, 2001; Felton, 2000; Globus, Lawton,
and Wipke, 1998; Goodacre, Vaidyanathan, Dunn, Harrigan, and Kell, 2004;
Harrigan et al., 2004; Hasan, Daugelat, Rao, and Schreiber, 2006; Krasnogor, 2004; Si, Wang, Zhang, Hu, and Fan, 2006; Venkatraman, Dalby, and
Yang, 2004; Weaver, 2004).
12.8
12.10 The Arts
Computers have long been used to create purely aesthetic artifacts. Much
of today's computer art tends to ape traditional drawing and painting, producing static pictures on a computer monitor. However, an immediate advantage of the computer screen, namely movement, can also be exploited. In both cases evolutionary computation can be, and has been, exploited. Indeed, with evolution's capacity for unlimited variation, evolutionary computation
offers the artist the scope to produce ever changing works. Some artists
have also worked with sound.
The use of GP in computer art can be traced back at least to the work
of Sims (Sims, 1991) and Latham.1 Jacob's work (Jacob, 2000, 2001) provides many examples. McCormack (2006) considers the recent state of play
in evolutionary art and music. Many recent techniques are described in
(Machado and Romero, 2008).
Evolutionary music (Todd and Werner, 1999) has been dominated by
Jazz (Spector and Alpern, 1994). An exception is Bach (Federman, Sparkman, and Watt, 1999). Most approaches to evolving music have made at
least some use of interactive evolution (Takagi, 2001) in which the fitness
of programs is provided by users, often via the Internet (Ando, Dahlsted,
Nordahl, and Iba, 2007; Chao and Forrest, 2003). The limitation is almost always finding enough people willing to participate (Langdon, 2004).
Costelloe and Ryan (2007) tried to reduce the human burden. Algorithmic
approaches are also possible (Cilibrasi, Vitanyi, and de Wolf, 2004; Inagaki,
2002).
One of the sorrows of AI is that as soon as it works it stops being AI (and
celebrated as such) and becomes computer engineering. For example, the
use of computer generated images has recently become cost effective and is
widely used in Hollywood. One of the standard state-of-the-art techniques
is the use of Reynolds' swarming boids (Reynolds, 1987) to create animations of large numbers of rapidly moving animals. This was first used in
Cliffhanger (1993) to animate a cloud of bats. Its use is now commonplace
(herds of wildebeest, schooling fish, and even large crowds of people). In
1997 Reynolds was awarded an Oscar.
Since 2003, EvoMUSART (the European Workshop on Evolutionary Music and Art) has been held every year along with the EuroGP conference as
part of the EvoStar event.
12.11 Compression
Koza (1992) was the first to use genetic programming to perform compression. He considered, in particular, the lossy compression of images. The idea
was to treat an image as a function of two variables (the row and column
of each pixel) and to use GP to evolve a function that matches as closely as
possible the original. One can then use the evolved GP tree as a lossy compressed version of the image, since it is possible to obtain the original image
by evaluating the program at each row-column pair of interest. The technique, which was termed programmatic compression, was tested on one small
synthetic image with good success. Programmatic compression was further
developed and applied to realistic data (images and sounds) by Nordin and
Banzhaf (1996).
1 http://www.williamlatham1.com/
Iterated Function Systems (IFSs) are important in the domain of fractals and fractal compression algorithms. Lutton, Levy-Vehel, Cretin,
Glevarec, and Roll (1995a,b) used genetic programming to solve the inverse
problem of identifying a mixed IFS whose attractor is a specific binary (black
and white) image of interest. The evolved program can then be taken to represent the original image. In principle, this can then be further compressed.
The technique is lossy, since the inverse problem can rarely be solved exactly. No practical application or compression ratio results were reported
in (Lutton et al., 1995a,b). Using similar principles, Sarafopoulos (1999)
used GP to evolve affine IFSs whose attractors represent a binary image
containing a square (which was compressed exactly) and one containing a
fern (which was achieved with some error in the finer details).
Wavelets are frequently used in lossy image and signal compression.
Klappenecker and May (1995) used GP to evolve wavelet compression algorithms (internal nodes represented conjugate quadrature filters, leaves represented quantisers). Results on a small set of real-world images were impressive, with the GP compression outperforming JPEG at all compression
ratios.
The first GP-based lossless compression technique (Fukunaga and Stechert, 1998) used GP to evolve non-linear predictors for images. These were used to
predict the gray level a pixel will take based on the gray values of a subset
of its neighbours (those that have already been computed in a row-by-row
and column-by-column scan of the image array). The prediction errors together with the model's description represent a compressed version of the image. These were compressed using Huffman encoding. Results on five
images from the NASA Galileo Mission database were very promising with
GP compression outperforming some of the best human-designed lossless
compression algorithms.
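As a rough illustration of the predictor-plus-residuals idea (not Fukunaga and Stechert's actual system), the sketch below applies a predictor in raster-scan order and collects the prediction errors, which together with a description of the predictor would then be entropy coded (e.g., with Huffman coding):

```python
def prediction_residuals(image, predictor):
    """Compute prediction errors for a grayscale image in raster-scan order.

    image     -- list of rows of integer gray levels
    predictor -- any function (e.g. an evolved GP expression) mapping the left,
                 upper and upper-left neighbours to a predicted gray level.
    Border pixels use 0 for missing neighbours. The residuals plus the
    predictor's description form the compressed representation.
    """
    residuals = []
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            left = row[c - 1] if c > 0 else 0
            up = image[r - 1][c] if r > 0 else 0
            up_left = image[r - 1][c - 1] if r > 0 and c > 0 else 0
            residuals.append(pixel - int(predictor(left, up, up_left)))
    return residuals

# Hypothetical predictor (average of two neighbours) on a tiny image.
img = [[10, 12, 13], [11, 12, 14], [12, 13, 15]]
print(prediction_residuals(img, lambda l, u, ul: (l + u) / 2))
```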
In many compression algorithms some form of pre-processing or transformation of the original data is performed before compression. This often improves compression ratios. Parent and Nowe (2002) evolved pre-processors
for image compression using GP. The objective of the pre-processor was to
losslessly reduce the entropy of the original image. In tests with five images
from the Canterbury corpus, GP was successful in significantly reducing the
image entropy. As verified via the application of bzip2, the resulting images
were markedly easier to compress.
In (Krantz, Lindberg, Thorburn, and Nordin, 2002) the use of programmatic compression was extended from images to natural videos. A program
was evolved that generates intermediate frames of a video sequence, where each frame is composed of a series of transformed regions from the adjacent frames. The results were encouraging in the sense that a good approximation to the frames was achieved. While a significant improvement in compression was achieved, programmatic compression was very slow in comparison
with standard compression methods, the time needed for compression being
measured in hours or even days. Acceleration in GP image compression was
achieved in (He, Wang, Zhang, Wang, and Fang, 2005), where an optimal
linear predictive technique was proposed which used a less complex fitness
function.
Recently Kattan and Poli (2008) proposed a GP system called GP-zip
for lossless data compression based on the idea of optimally combining wellknown lossless compression algorithms. The file to be compressed was divided into chunks of a predefined length, and GP was asked to find the best
possible compression algorithm for each chunk in such a way as to minimise
the total length of the compressed file. The compression algorithms available to GP-ZIP included arithmetic coding, Lempel-Ziv-Welch, unbounded
prediction by partial matching, and run length encoding among others. Experimentation showed that when the file to be compressed is composed of
heterogeneous data fragments (as is the case, for example, in archive files),
GP-zip is capable of achieving compression ratios that are significantly superior to those obtained with other compression algorithms.
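The underlying idea can be sketched with standard-library compressors standing in for GP-zip's algorithm set; the greedy per-chunk choice below is only a simplified stand-in for the search that GP performs:

```python
import bz2
import zlib

COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress}

def greedy_chunk_compress(data, chunk_size=4096):
    """Pick, for each fixed-size chunk, the algorithm giving the shortest output.

    A greedy stand-in for the GP-zip idea of optimally combining several
    compressors over chunks of a file; GP-zip itself uses GP to search for the
    per-chunk assignment and works with a different set of algorithms.
    """
    plan, total = [], 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        name, best = min(((n, f(chunk)) for n, f in COMPRESSORS.items()),
                         key=lambda item: len(item[1]))
        plan.append((name, best))
        total += len(best)
    return plan, total

# Hypothetical heterogeneous file: repetitive text followed by noisy bytes.
blob = b"abc" * 3000 + bytes(range(256)) * 20
print(greedy_chunk_compress(blob)[1])   # total size of the compressed chunks
```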
Chapter 13
Troubleshooting GP
The dynamics of evolutionary algorithms (including GP) are often very complex, and the behaviour of an EA is typically challenging to predict or understand. As a result it is often difficult to troubleshoot such systems when
they are not performing as expected. While we obviously cannot provide
troubleshooting suggestions that are specific to every GP implementation
and application, we can suggest some general issues to keep in mind. To a
large extent the advice in (Kinnear, 1994b; Koza, 1992; Langdon, 1998) also
remains sound.
13.1
13.2
13.3
When working on real problems there are not likely to be any silver bullets.
No technique (including GP) is likely to solve all instances of an NP-hard
problem in an amount of time that grows linearly with the size of the problem. GP has proven extremely successful in a wide variety of domains (e.g.,
Chapter 12) but that doesn't mean that it will work immediately or easily
in every domain, or even that it is the best tool for a specific domain.
While some of the successes in the field have been easy, most were the
13.4
Don't assume a little fiddling with parameters, operators, fitness functions, etc., is harmless. One of the awkward realities of many widely applicable tools is that they typically have numerous tunable parameters. Evolutionary algorithms such as GP are no exception. Often changing a parameter or two can have a fairly minimal impact, and averaging over many
runs is required to reliably detect those effects. Some parameter changes,
however, can produce more dramatic effects. Changing the function set, for
example, can significantly change the distribution of the sizes and shapes of
trees, especially in the early generations, and potentially bias the system in
unexpected ways.
Another source of change can be the problem domain. A common mistake is to hope that parameter settings that worked well for one problem
will also work well for what appears to be a very similar problem. Problems
that appear similar to humans, however, may have quite different search
characteristics.
In addition, there are many small differences in GP implementations that
are rarely considered important or even reported. However, our experience is
that they may produce significant changes in the behaviour of a GP system.
Differences as small as a > in place of a ≥ in an if statement can have an important effect. For example, substituting a ≥ with a > may influence the winners of tournaments, the designation of the best-of-run individual, the choice of which elements are cloned when elitism is used, or the offspring produced by operators which accept the offspring only if it is better or not worse than a parent.
13.5
When big changes appear to make little difference, this can sometimes be
used to identify problems with the domain representation and fitness measure. Alternatively it may be that the problem is simply too difficult, and
no change is likely to make a significant difference.
Suppose that you're not making much progress during a set of runs. One
might react by sweeping the parameter space, doing runs with a variety of
13.6
If you're not getting your desired results, it is important to take the time to dig around in the populations and see what is actually being evolved.2
For example, if you're using ADFs because you think that your problem would benefit from a modular solution, examine the individuals that you're evolving. Are they using ADFs? (Sometimes the result-producing branch simply will not refer to the ADFs at all.) Are they using them in a modular way? Are ADFs being used multiple times? Do the ADFs encapsulate some interesting logic, or are they just re-naming an input variable? If you're using grammatical evolution, on the other hand, are your evolved individuals using your grammar as you expected? Or is the grammar in fact biasing
2 If the system you're using doesn't allow you to dump individuals from a run, add that functionality or use a different system.
Figure 13.1: Visualisation of the size and shape of the entire population of
1,000 individuals in the final generation of runs using a depth limit of 50 (on
the left) and a size limit of 600 (on the right). The inner circle is at depth
50, and the outer circle is at depth 100. These plots are from (Crane and
McPhee, 2005) and were drawn using the techniques described in (Daida
et al., 2005).
way is to use GP as a multi-objective evolutionary algorithm (cf. Chapter 9).
In some cases the details of the trees are less important than their general
size and shape. Daida et al. (2005) presented a particularly useful set of
visualisation techniques for this situation.5 These techniques allow one to
see the size and shape of both individual trees as well as an aggregate view
of entire populations. Figure 13.1, for example, shows the impact of size and
depth limits on the size and shape of trees in two different runs with very
similar average sizes and depths. The plots make it clear, however, that the
shapes of the resulting trees were quite different.
13.7 Encourage Diversity
13.8 Embrace Approximation
There is a widespread belief that computer programs are fragile and that
any change to any bit in them will cause them to stop working. This is
fostered by the common knowledge that a small typing mistake by a human
programmer can sometimes introduce a troublesome bug into a program.
6 In a panmictic population no mating restrictions are imposed as to which individual
mates with which.
7 Doing this means that the selection scheme is no longer elitist, and it may be worthwhile to protect the best individual(s) to preserve the elitism.
8 What is meant by a large population has changed over time. In the early days
of GP, populations of 1,000 or more could be considered large. However, CPU speeds
and computer memory have increased exponentially over time. So, at the time of writing
it is not unusual to see populations of hundreds of thousands or millions of individuals
being used in the solution of hard problems. Research indicates that there are benefits in
splitting populations into demes even for much smaller populations. See Section 10.5.
Programmers know from painful experience, however, that far from proving
immediately fatal, errors can lay hidden for years. Further, not all errors
are created equal. Some are indeed critical and must be dealt with immediately, while others are rare or largely inconsequential and so never become a
major priority. The worst are arguably the severe bugs that rarely express
themselves, as they can be extremely difficult to pin down yet still have dire
consequences when they appear.
In summary, there is no such thing as a perfect (non-trivial) human-written program, and all such programs include a variety of errors of different
severity and with a different frequency of manifestation.9
This sort of variability is also very common in GP work. It provides the
sort of toehold that evolution can exploit in the early generations of GP
runs. The population of programs just needs to contain a few which move
vaguely in the right direction. Many of their offspring may be totally blind or
have no legs, just so long as a few continue to slime towards the light. Over
generations evolution may hopefully cobble together some useful features
from this initially unpromising ooze. The results, however, are unlikely
to be perfect or pretty. If you as a GP engineer insist on only accepting
solutions that are beautifully symmetric and walk on two legs on day one,
you are likely to be disappointed. As we have argued above, even human-written programs often only approximate their intended functionality. So,
why should we not accept the same from GP?
If you accept this notion, then it is important to provide your system with
some sort of gradient upon which to act, allowing it to evolve ever better
approximations. It is also important to ensure that your test environment
(usually encapsulated in the fitness function) places appropriate emphasis on
the most important features of the space from a user perspective. Consider a
problem with five test cases, four of which are fairly easy and consequently
less important, with the fifth being crucial and quite difficult. A likely
outcome in such a setting is that GP will evolve individuals that can do the four easier tasks but are unable to make the jump to the fifth. There are several
things you could try: 1) weighting the hard task more heavily, 2) dividing
it up in some way into additional sub-tasks, or 3) changing it from being a
binary condition (meaning that an individual does or does not succeed on the
fifth task) to a continuous condition, so that an individual GP program can
partially succeed on the fifth task. The first of these options is the simplest
to implement. The second two, however, create a smoother gradient for the
evolutionary process to follow, and so may yield better results.
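A fitness function along these lines might look like the sketch below, where the weighting, the error tolerance and the helper names are all illustrative assumptions:

```python
def graded_fitness(program, easy_cases, hard_case, hard_weight=3.0, tolerance=10.0):
    """Fitness with weighting and partial credit instead of a binary pass/fail.

    easy_cases -- list of (inputs, target) pairs scored 1 point each when matched
    hard_case  -- a single (inputs, target) pair scored continuously in [0, hard_weight]
    Higher is better. All names and constants here are illustrative.
    """
    score = sum(1.0 for inputs, target in easy_cases if program(*inputs) == target)
    inputs, target = hard_case
    error = abs(program(*inputs) - target)
    # Continuous credit: full weight for an exact answer, shading to 0 at 'tolerance'.
    score += hard_weight * max(0.0, 1.0 - error / tolerance)
    return score
```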
9 This
13.9 Control Bloat
If you are running out of memory or your execution times seem inordinately
high, look at how your average program size is changing over time. If programs are growing extremely fast, you may want to implement some form
of bloat control (see Section 11.3). Naturally, long runs may simply be the
result of the population being very large or the fitness evaluation being slow.
In these cases, you may find the techniques described in Chapter 10 helpful.
Controlling bloat is also important if your goal is to find a comprehensible
model, since in practice smaller models are easier to understand. A large
model will not only be difficult to understand but also may over-fit the
training data (Gelly, Teytaud, Bredeche, and Schoenauer, 2006).
13.10 Checkpoint Results
13.11 Report Well
There are many potential reasons why work may be poorly received. Here
are a few: insufficient explanation of methods and algorithms, insufficient
experimental evidence, insufficient analysis, lack of statistical significance,
lack of replicability, reading too much into one's results, insufficient novelty,
poor presentation and poor English. In scientific, rather than commercial,
work it is vital to report enough details so that someone else can reproduce
your results. One very useful idea is to publish a table summarising your
GP run. Table 4.1 (page 31) contains an example tableau.
As explained in Section 13.2, it is essential to ensure that results are
statistically significant so that nobody can dismiss them as the consequence
of a lucky fluke. Complex ideas are often best explained by diagrams. When
possible, descriptions of non-trivial algorithms should be accompanied by
pseudocode, along with text describing the most important components of
the algorithm.
In addition to reporting your results, make sure you also discuss their
implications. If, for example, what GP has evolved means the customer can
save money or could improve their process in some way, then this should
be highlighted. Also be careful to not construct excessively complex explanations for the observations. It is very tempting to say X is probably due
to Y, but for this to be believable one should at least have made some
attempt to check if Y is indeed taking place, and whether modulations or
suppression of Y in fact produce modulations and/or suppression of X.
Finally, the most likely outcomes of a text that is badly written or badly
presented are: 1) your readers will misunderstand you, and 2) you will have
fewer readers. Spell checkers can help with typos, but whenever possible
one should ensure a native English speaker has proofread the text.
13.12
For any work in science, engineering, industry or commerce to make an impact it must be presented in a form that can convince others of the validity
of its results and conclusions. This might include: a pitch within a corporation seeking continued financial support for a project, the submission of
a research paper to a journal or the presentation of a GP-based product to
potential customers.
The burden of proof is on the users of GP, and it is important to use the
customers language. If the fact that GP discovers a particular chemical is
important in a reaction or drug design, for example, one should make this
stand out during the presentation. A great advantage of GP over many AI
techniques is that its results are often simple equations. Ensure these are
intelligible to your customer, e.g., by simplification. Also make an effort to
present your results using your customers terminology. Your GP system
may produce answers as trees, but if the customers use spreadsheets, consider translating the tree into a spreadsheet formula. Alternatively, your
customer may not be particularly interested in the details of the solution,
but instead care a great deal about which inputs the evolutionary process
tended to use.
Also, one should try to discover how the customers intend to validate
GPs answer. Do not let them invent some totally new data which has
nothing to do with the data they supplied for training (just to see how well
it does...). Avoid customers with contrived data: GP is not omnipotent
and knows nothing about things it has not seen. At the same time you
should be scrupulous about your own use of holdout data. GP is a very
powerful machine learning technique, and with this comes the ever present
danger of over-fitting. One should never allow performance on data reserved
for validation to be used to choose which answer to present to the customer.
Chapter 14
Conclusions
In his seminal paper entitled "Intelligent Machinery", Turing (1948) identified three ways by which human-competitive machine intelligence might be
achieved. In connection with one of those ways, Turing said:
There is the genetical or evolutionary search by which a combination of genes is looked for, the criterion being the survival
value. (Turing, 1948)
Turing did not specify how to conduct the genetical or evolutionary
search for machine intelligence. In particular, he did not mention the idea of
a population-based parallel search in conjunction with sexual recombination
(crossover) as described in Hollands 1975 book Adaptation in Natural and
Artificial Systems (Holland, 1992, second edition). However, in Turing's paper "Computing Machinery and Intelligence" (Turing, 1950), he did point
out:
We cannot expect to find a good child-machine at the first attempt. One must experiment with teaching one such machine
and see how well it learns. One can then try another and see
if it is better or worse. There is an obvious connection between
this process and evolution:
Structure of the child machine = Hereditary material
Changes of the child machine = Mutations
Natural selection = Judgement of the experimenter
In other words, Turing perceived that one possibly productive approach
to machine intelligence would involve an evolutionary process in which a
description of a computer program (the hereditary material) undergoes progressive modification (mutation) under the guidance of natural selection
(that is, selective pressure in the form of what we now call fitness).
Today, decades later, we can see that indeed Turing was right. GP has
started fulfilling his dream by providing us with a systematic method, based
on Darwinian evolution, for getting computers to automatically solve hard
real-life problems. To do so, it simply requires a high-level statement of
what needs to be done and enough computing power.
Turing also understood the need to evaluate objectively the behaviour exhibited by machines, to avoid human biases when assessing their intelligence.
This led him to propose an imitation game, now known as the Turing test for
machine intelligence, whose goals are wonderfully summarised by Samuel's
position statement quoted in the introduction of this book (page 1). The
eight criteria for human competitiveness we discussed in Section 12.3 are
essentially motivated by the same goals.
At present GP is unable to produce computer programs that would pass
the full Turing test for machine intelligence, and it might not be ready
for this immense task for centuries. Nonetheless, thanks to the constant
improvements in GP technology, in its theoretical foundations and in computing power, GP has been able to solve dozens of difficult problems with
human-competitive results and to provide valuable solutions to many other
problems (see Chapter 12). These are a small step towards fulfilling Turing
and Samuel's dreams, but they are also early signs of things to come. It is reasonable to predict that in a few years' time GP will be able to routinely
and competently solve important problems for us, in a variety of application
domains with human-competitive performance. Genetic programming will
then become an essential collaborator for many human activities. This will
be a remarkable step forward towards achieving true human-competitive
machine intelligence.
This field guide is an attempt to chart the terrain of techniques and
applications we have encountered in our journey in the world of genetic
programming. Much is still unmapped and undiscovered. We hope this
book will make it easier for other travellers to start many long and profitable
journeys in this exciting world.
If you have found this book to be useful, please feel free to redistribute it
(see page ii). Should you want to cite this book, please refer to the entry for
(Poli et al., 2008) in the bibliography.
Part IV
In the end we find that Mary does indeed have a little GP. . .
and the wolf is shown to have a very large bibliography.
Appendix A
Resources
The field of GP took off in the early 1990s, driven in significant part by
the publication of (Koza, 1992). Those early days were characterised by the
exponential growth common in the initial stages of successful technologies.
Many influential papers from that period can be found in the proceedings
of the International Conference on Genetic Algorithms (ICGA-93, ICGA-95), the IEEE conferences on Evolutionary Computation (EC-1994), and
the Evolutionary Programming conferences. A surprisingly large number
of these are now available on-line, and we've included as many URLs as
we could in the bibliography.1 After almost twenty years, GP has matured
and is used in a wondrous array of applications from banking to betting,
from bomb detection to architectural design, from the steel industry to the
environment, from space to biology, and many others (as we have seen in
Section 12).
In 1996 it was possible to list almost all the studies and applications of
GP (Langdon, 1996), but today the range is far too great. In this appendix
we will review some of the wide variety of available sources on GP which
should assist readers who wish to explore further. Consulting information
available on the Web is certainly a good way to get quick answers for someone
who wants to know what GP is. These answers, however, will often be too
shallow for someone who really wants to then apply GP to solve practical
problems. People in this position should probably invest some time going
through more detailed accounts; some of the key books in the field include
(Banzhaf, Nordin, Keller, and Francone, 1998a; Koza, 1992; Langdon and
Poli, 2002), and others are listed in Section A.1. Technical papers in the
extensive GP literature may be the next stage. Although this literature is
easily accessible thanks to the complete on-line bibliography (Langdon et al.,
1995-2008), newcomers will often need to be selective in what they read.
1 Each included URL was tested and was operational at the time of writing.
A.1 Key Books
A.2 Key Journals
In addition to GP's own Genetic Programming and Evolvable Machines journal, Evolutionary Computation, the IEEE Transactions on Evolutionary Computation, Complex Systems (Complex Systems Publication, Inc.), the new
Journal on Artificial Evolution and Applications and many others publish
GP articles. The GP bibliography (Langdon et al., 1995-2008) lists a further
375 different journals worldwide that have published articles related to GP.
A.3
A.4 GP Implementations
A.5 On-Line Resources
On-line resources appear, disappear, and move with great speed, so all the
addresses here (and elsewhere in the book), which were correct at the time
of writing, are obviously subject to change without notice after publication.
Hopefully, the most valuable resources should be readily findable using standard search tools.
One of the key on-line resources is the GP bibliography (Langdon et al.,
1995-2008).3 At the time of writing, this bibliography contains about 5,000
GP entries, roughly half of which can be downloaded immediately.4
3 http://www.cs.bham.ac.uk/~wbl/biblio/
4 The GP bibliography is a volunteer effort and depends crucially on submissions from
users. Authors are encouraged to check that their GP publications are listed, and send
missing entries to the bibliography's maintainers.
Figure A.1: Co-authorship connections within GP. Each of the 1,141 dots
indicates an author, and edges link people who have co-authored one or
more GP papers. (To reduce clutter only links to first authors are shown.)
The size of each dot indicates the number of entries. The on-line version is
annotated using JavaScript and contains hyperlinks to authors and their
GP papers. The graph was created by GraphViz twopi, which tries to
place strongly connected people close together. This diagram displays just
the centrally connected component (Tomassini et al., 2007) and contains
approximately half of all GP papers. The remaining papers are not linked
by co-authorship to this graph. Several other large components are also
available on-line via the GP Bibliography (Langdon et al., 1995-2008).
Appendix B
TinyGP
TinyGP1 is a highly optimised GP system that was originally developed to
meet the specifications set out in the TinyGP competition of the Genetic and
Evolutionary Computation Conference (GECCO) 2004. We include it as a
working example of a real GP system, to show that GP software tools are
not necessarily big, complex and difficult to understand. The system can be
used as is or can be modified or extended for a user's specific applications.
Furthermore, TinyGP may serve as a guide to other implementations of
genetic programming.
The following section provides a description of the main characteristics
of TinyGP. Section B.2 describes the format for the input files for TinyGP.
Section B.3 provides further details on the implementation and the source
code for a Java version of TinyGP. Finally, Section B.4 describes a sample
run of the system.
There are numerous other GP systems available on-line. See Section A.4
for a discussion of some of the options.
B.1 Overview of TinyGP
B.2 Input Data Files for TinyGP
The input files for TinyGP have the following plain ASCII format:
HEADER          // see below
FITNESSCASE1    // the fitness cases (one per line)
FITNESSCASE2
FITNESSCASE3
....
Each fitness case is of the form
X1 . . . XN TARGET
where X1 to XN represent a set of input values for a program, while
TARGET represents the desired output for the given inputs.
The header has the following entries
NVAR NRAND MINRAND MAXRAND NFITCASES
where NVAR is an integer representing the number of variables the system
should use, NRAND is an integer representing the number of random constants to be provided in the primitive set, MINRAND is a float representing
the lower limit of the range used to generate random constants, MAXRAND
is the corresponding upper limit, and NFITCASES is an integer representing the number of fitness cases. NRAND can be set to 0, in which case
MINRAND and MAXRAND are ignored. For example:
1 100 -5 5 63
0.0 0
0.1 0.0998334166468282
0.2 0.198669330795061
0.3 0.29552020666134
....
55 LINES OMITTED
....
5.9 -0.373876664830236
6.0 -0.279415498198926
6.1 -0.182162504272095
6.2 -0.0830894028174964
These fitness cases are sin(x) for x ∈ {0.0, 0.1, 0.2, . . ., 6.2}.
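Such a data file can be produced with a few lines of code. The sketch below is not part of TinyGP (the class name make_sin_data and the output file name sin-data.txt are our own choices for this illustration): it writes the header shown above followed by the 63 sin(x) fitness cases. Because of floating-point rounding, the printed values may differ very slightly from those listed above.

import java.io.IOException;
import java.io.PrintWriter;

public class make_sin_data {
  public static void main( String[] args ) throws IOException {
    PrintWriter out = new PrintWriter("sin-data.txt");
    int nfitcases = 63;
    out.println("1 100 -5 5 " + nfitcases);   // NVAR NRAND MINRAND MAXRAND NFITCASES
    for ( int i = 0; i < nfitcases; i ++ ) {
      double x = 0.1 * i;                     // inputs 0.0, 0.1, ..., 6.2
      out.println( x + " " + Math.sin(x) );   // X1 TARGET
    }
    out.close();
  }
}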
B.3 Source Code
The original TinyGP system was implemented, in the C programming language, to maximise efficiency and minimise the size of the executable.2 The
version presented here is a Java re-implementation of TinyGP. The original
version did not allow the use of random numerical constants.
How does TinyGP work? The system is based on the standard flattened
(linear) representation for trees, which effectively corresponds to listing the
primitives in prefix notation but without any brackets. Each primitive occupies one byte. A program is simply a vector of characters. The parameters
of the system are as specified in Section B.1. They are fixed at compile time
through a series of static class variable assignments. The operators used
are subtree crossover and point mutation. The selection of the crossover
points is performed at random with uniform probability. The primitive set
and fitness function are as indicated above. The code uses recursion for the
creation of the initial population (grow), for the identification of the subtree
rooted at a particular crossover point (traverse), for program execution
(run), and for printing programs (print_indiv). A small number of global variables have been used. For example, the variable PC is a program counter used during the recursive interpretation of programs; it is automatically incremented every time a primitive is evaluated. Although using global variables is normally considered bad programming practice, this was done purposely, after extensive experimentation, to reduce the executable's size.
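To illustrate the flattened representation (with a fragment that is separate from the TinyGP source below; the expression and the numerical values in it are invented for the example), the program (X1 + 0.5) * X1 is stored as the character array {MUL, ADD, 0, 1, 0}: codes below FSET_START (110) denote terminals, code 0 being the variable X1 and code 1 an index into the array x[] holding a constant. A recursive walk driven by a single program counter is enough to evaluate it:

public class prefix_demo {
  static final char ADD = 110, SUB = 111, MUL = 112, DIV = 113; // opcodes as in tiny_gp.java
  static double [] x = { 2.0, 0.5 };              // x[0] plays X1, x[1] holds a constant
  static char [] program = { MUL, ADD, 0, 1, 0 }; // prefix encoding of (X1 + 0.5) * X1
  static int PC = 0;

  static double run() {                           // mirrors TinyGP's recursive interpreter
    char primitive = program[PC++];
    if ( primitive < ADD )                        // terminal: variable or constant
      return x[primitive];
    switch ( primitive ) {
      case ADD: return run() + run();
      case SUB: return run() - run();
      case MUL: return run() * run();
      default:  double num = run(), den = run();  // DIV, protected against small divisors
                return Math.abs(den) <= 0.001 ? num : num / den;
    }
  }

  public static void main( String[] args ) {
    System.out.println( run() );                  // prints 5.0, i.e. (2.0 + 0.5) * 2.0
  }
}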
2 The C version of TinyGP is probably the world's smallest tree-based symbolic-regression GP system. The source code, in C, is 5,906 bytes. The original version included
a compilation script which, with a variety of tricks, created a self-extracting executable
occupying 2,871 bytes (while the actual size of the executable after self-extraction was
4,540 bytes). All optimisations in the code were aimed at bringing the executable size
(as opposed to the source code size) down, the main purpose being to show that, against
popular belief, it is possible to have really tiny and efficient GP systems.
The code reads command line arguments using the standard args array.
Generally the code is quite standard and should be self-explanatory for
anyone who can program in Java, whether or not they have implemented a
GP system before. Therefore very few comments have been provided in the
source code.
The source is provided below.
/*
 * Program:   tiny_gp.java
 *
 * Author:    Riccardo Poli (email: [email protected])
 *
 */

import java.util.*;
import java.io.*;
import java.text.DecimalFormat;

public class tiny_gp {
  double [] fitness;
  char [][] pop;
  static Random rd = new Random();
  static final int
    ADD = 110,
    SUB = 111,
    MUL = 112,
    DIV = 113,
    FSET_START = ADD,
    FSET_END = DIV;
  static double [] x = new double[FSET_START];
  static double minrandom, maxrandom;
  static char [] program;
  static int PC;
  static int varnumber, fitnesscases, randomnumber;
  static double fbestpop = 0.0, favgpop = 0.0;
  static long seed;
  static double avg_len;
  static final int
    MAX_LEN = 10000,
    POPSIZE = 100000,
    DEPTH = 5,
    GENERATIONS = 100,
    TSIZE = 2;
  public static final double
    PMUT_PER_NODE = 0.05,
    CROSSOVER_PROB = 0.9;
  static double [][] targets;

  double run() { /* interpreter: evaluates the program starting at PC */
    char primitive = program[PC++];
    if ( primitive < FSET_START )
      return( x[primitive] );
    switch ( primitive ) {
      case ADD : return( run() + run() );
      case SUB : return( run() - run() );
      case MUL : return( run() * run() );
      case DIV : {
        double num = run(), den = run();
        if ( Math.abs( den ) <= 0.001 )   // protected division
          return( num );
        else
          return( num / den );
      }
    }
    return( 0.0 ); // should never get here
  }

  int traverse( char [] buffer, int buffercount ) {
    if ( buffer[buffercount] < FSET_START )
      return( ++buffercount );

    switch ( buffer[buffercount] ) {
      case ADD:
      case SUB:
      case MUL:
      case DIV:
        return( traverse( buffer, traverse( buffer, ++buffercount ) ) );
    }
    return( 0 ); // should never get here
  }

  void setup_fitness( String fname ) {
    try {
      int i, j;
      String line;

      BufferedReader in =
        new BufferedReader( new FileReader( fname ) );
      line = in.readLine();
      StringTokenizer tokens = new StringTokenizer(line);
      varnumber = Integer.parseInt(tokens.nextToken().trim());
      randomnumber = Integer.parseInt(tokens.nextToken().trim());
      minrandom = Double.parseDouble(tokens.nextToken().trim());
      maxrandom = Double.parseDouble(tokens.nextToken().trim());
      fitnesscases = Integer.parseInt(tokens.nextToken().trim());
      targets = new double[fitnesscases][varnumber + 1];
      if ( varnumber + randomnumber >= FSET_START )
        System.out.println("too many variables and constants");

      for ( i = 0; i < fitnesscases; i ++ ) {
        line = in.readLine();
        tokens = new StringTokenizer(line);
        for ( j = 0; j <= varnumber; j ++ ) {
          targets[i][j] = Double.parseDouble(tokens.nextToken().trim());
        }
      }
      in.close();
    }
    catch ( FileNotFoundException e ) {
      System.out.println("ERROR: Please provide a data file");
      System.exit(0);
    }
    catch ( Exception e ) {
      System.out.println("ERROR: Incorrect data format");
      System.exit(0);
    }
  }

  double fitness_function( char [] Prog ) {
    int i, len;
    double result, fit = 0.0;

    len = traverse( Prog, 0 );
    for ( i = 0; i < fitnesscases; i ++ ) {
      for ( int j = 0; j < varnumber; j ++ )
        x[j] = targets[i][j];
      program = Prog;
      PC = 0;
      result = run();
      fit += Math.abs( result - targets[i][varnumber] );
    }
    return( -fit );   // fitness is minus the total error, so bigger is better
  }

  int grow( char [] buffer, int pos, int max, int depth ) {
    char prim = (char) rd.nextInt(2);

    if ( pos >= max )
      return( -1 );

    if ( pos == 0 )
      prim = 1;

    if ( prim == 0 || depth == 0 ) {
      prim = (char) rd.nextInt(varnumber + randomnumber);
      buffer[pos] = prim;
      return( pos + 1 );
    }
    else {
      prim = (char) ( rd.nextInt(FSET_END - FSET_START + 1) + FSET_START );
      switch ( prim ) {
        case ADD:
        case SUB:
        case MUL:
        case DIV:
          buffer[pos] = prim;
          return( grow( buffer, grow( buffer, pos + 1, max, depth - 1 ),
                        max, depth - 1 ) );
      }
    }
    return( 0 ); // should never get here
  }

  int print_indiv( char [] buffer, int buffercounter ) {
    int a1 = 0, a2;
    if ( buffer[buffercounter] < FSET_START ) {
      if ( buffer[buffercounter] < varnumber )
        System.out.print( "X" + (buffer[buffercounter] + 1) + " " );
      else
        System.out.print( x[buffer[buffercounter]] );
      return( ++buffercounter );
    }
    switch ( buffer[buffercounter] ) {
      case ADD: System.out.print( "(" );
        a1 = print_indiv( buffer, ++buffercounter );
        System.out.print( " + " );
        break;
      case SUB: System.out.print( "(" );
        a1 = print_indiv( buffer, ++buffercounter );
        System.out.print( " - " );
        break;
      case MUL: System.out.print( "(" );
        a1 = print_indiv( buffer, ++buffercounter );
        System.out.print( " * " );
        break;
      case DIV: System.out.print( "(" );
        a1 = print_indiv( buffer, ++buffercounter );
        System.out.print( " / " );
        break;
    }
    a2 = print_indiv( buffer, a1 );
    System.out.print( ")" );
    return( a2 );
  }

  static char [] buffer = new char[MAX_LEN];

  char [] create_random_indiv( int depth ) {
    char [] ind;
    int len;

    len = grow( buffer, 0, MAX_LEN, depth );
    while ( len < 0 )                 // -1 means the buffer overflowed: retry
      len = grow( buffer, 0, MAX_LEN, depth );

    ind = new char[len];

    System.arraycopy( buffer, 0, ind, 0, len );
    return( ind );
  }

  char [][] create_random_pop( int n, int depth, double [] fitness ) {
    char [][] pop = new char[n][];
    int i;

    for ( i = 0; i < n; i ++ ) {
      pop[i] = create_random_indiv( depth );
      fitness[i] = fitness_function( pop[i] );
    }
    return( pop );
  }

  void stats( double [] fitness, char [][] pop, int gen ) {
    int i, best = rd.nextInt(POPSIZE);
    int node_count = 0;
    fbestpop = fitness[best];
    favgpop = 0.0;

    for ( i = 0; i < POPSIZE; i ++ ) {
      node_count += traverse( pop[i], 0 );
      favgpop += fitness[i];
      if ( fitness[i] > fbestpop ) {
        best = i;
        fbestpop = fitness[i];
      }
    }
    avg_len = (double) node_count / POPSIZE;
    favgpop /= POPSIZE;
    System.out.print("Generation=" + gen + " Avg Fitness=" + (-favgpop) +
                     " Best Fitness=" + (-fbestpop) + " Avg Size=" + avg_len +
                     "\nBest Individual: ");
    print_indiv( pop[best], 0 );
    System.out.print( "\n" );
    System.out.flush();
  }

  int tournament( double [] fitness, int tsize ) {
    int best = rd.nextInt(POPSIZE), i, competitor;
    double fbest = -1.0e34;

    for ( i = 0; i < tsize; i ++ ) {
      competitor = rd.nextInt(POPSIZE);
      if ( fitness[competitor] > fbest ) {
        fbest = fitness[competitor];
        best = competitor;
      }
    }
    return( best );
  }

  int negative_tournament( double [] fitness, int tsize ) {
    int worst = rd.nextInt(POPSIZE), i, competitor;
    double fworst = 1e34;

    for ( i = 0; i < tsize; i ++ ) {
      competitor = rd.nextInt(POPSIZE);
      if ( fitness[competitor] < fworst ) {
        fworst = fitness[competitor];
        worst = competitor;
      }
    }
    return( worst );
  }

  char [] crossover( char [] parent1, char [] parent2 ) { // subtree crossover
    int xo1start, xo1end, xo2start, xo2end;
    char [] offspring;
    int len1 = traverse( parent1, 0 );
    int len2 = traverse( parent2, 0 );
    int lenoff;

    xo1start = rd.nextInt( len1 );
    xo1end = traverse( parent1, xo1start );

    xo2start = rd.nextInt( len2 );
    xo2end = traverse( parent2, xo2start );

    lenoff = xo1start + (xo2end - xo2start) + (len1 - xo1end);

    offspring = new char[lenoff];

    System.arraycopy( parent1, 0, offspring, 0, xo1start );
    System.arraycopy( parent2, xo2start, offspring, xo1start,
                      (xo2end - xo2start) );
    System.arraycopy( parent1, xo1end, offspring,
                      xo1start + (xo2end - xo2start),
                      (len1 - xo1end) );

    return( offspring );
  }

  char [] mutation( char [] parent, double pmut ) { // point mutation
    int len = traverse( parent, 0 ), i;
    int mutsite;
    char [] parentcopy = new char[len];

    System.arraycopy( parent, 0, parentcopy, 0, len );
    for ( i = 0; i < len; i ++ ) {
      if ( rd.nextDouble() < pmut ) {
        mutsite = i;
        if ( parentcopy[mutsite] < FSET_START )     // terminal: pick a new terminal
          parentcopy[mutsite] = (char) rd.nextInt(varnumber + randomnumber);
        else                                        // function: pick a new function
          switch ( parentcopy[mutsite] ) {
            case ADD:
            case SUB:
            case MUL:
            case DIV:
              parentcopy[mutsite] =
                (char) ( rd.nextInt(FSET_END - FSET_START + 1) + FSET_START );
          }
      }
    }
    return( parentcopy );
  }

  void print_parms() {
    System.out.print("TINY GP (Java version)\n");
    System.out.print("SEED=" + seed + "\nMAX_LEN=" + MAX_LEN +
                     "\nPOPSIZE=" + POPSIZE + "\nDEPTH=" + DEPTH +
                     "\nCROSSOVER_PROB=" + CROSSOVER_PROB +
                     "\nPMUT_PER_NODE=" + PMUT_PER_NODE +
                     "\nMIN_RANDOM=" + minrandom +
                     "\nMAX_RANDOM=" + maxrandom +
                     "\nGENERATIONS=" + GENERATIONS +
                     "\nTSIZE=" + TSIZE +
                     "\n\n");
  }

  public tiny_gp( String fname, long s ) {
    fitness = new double[POPSIZE];
    seed = s;
    if ( seed >= 0 )
      rd.setSeed( seed );
    setup_fitness( fname );
    for ( int i = 0; i < FSET_START; i ++ )       // create the random constants
      x[i] = (maxrandom - minrandom) * rd.nextDouble() + minrandom;
    pop = create_random_pop( POPSIZE, DEPTH, fitness );
  }

  void evolve() {
    int gen = 0, indivs, offspring, parent1, parent2, parent;
    double newfit;
    char [] newind;

    print_parms();
    stats( fitness, pop, 0 );
    for ( gen = 1; gen < GENERATIONS; gen ++ ) {
      if ( fbestpop > -1e-5 ) {
        System.out.print("PROBLEM SOLVED\n");
        System.exit( 0 );
      }
      for ( indivs = 0; indivs < POPSIZE; indivs ++ ) {
        if ( rd.nextDouble() > CROSSOVER_PROB ) {
          parent1 = tournament( fitness, TSIZE );
          parent2 = tournament( fitness, TSIZE );
          newind = crossover( pop[parent1], pop[parent2] );
        }
        else {
          parent = tournament( fitness, TSIZE );
          newind = mutation( pop[parent], PMUT_PER_NODE );
        }
        newfit = fitness_function( newind );
        offspring = negative_tournament( fitness, TSIZE );
        pop[offspring] = newind;
        fitness[offspring] = newfit;
      }
      stats( fitness, pop, gen );
    }
    System.out.print("PROBLEM NOT SOLVED\n");
    System.exit( 1 );
  }

  public static void main( String[] args ) {
    String fname = "problem.dat";
    long s = -1;

    if ( args.length == 2 ) {
      s = Integer.valueOf(args[0]).intValue();
      fname = args[1];
    }
    if ( args.length == 1 ) {
      fname = args[0];
    }

    tiny_gp gp = new tiny_gp( fname, s );
    gp.evolve();
  }
};
B.4 Compiling and Running TinyGP
It is very common, nowadays, for people to write and execute code within
some development environment. Each has its own way of doing these operations, but the process is typically very straightforward.
If one wants to compile TinyGP from the operating system's shell, this can be done by issuing the command javac -O tiny_gp.java. This applies to both Unix and Windows users. Windows users will have to click on Start→Run and then issue the command cmd to launch a shell. Of course, if the javac Java compiler and/or the tiny_gp.java source file are not in the current directory/folder, then full path names must be provided when issuing the compilation command.
If the dataset is stored in a file problem.dat, the program can then simply be launched with the command java tiny_gp. Otherwise, the user can specify a different datafile on the command line, by giving the command java tiny_gp FILE, where FILE is the dataset file name (which can include the full path to the file). Finally, the user can specify both the datafile and a seed for the random number generator on the command line, by giving the command java tiny_gp SEED FILE, where SEED is an integer.
As an example, we ran TinyGP on the sin(x) dataset described in Section B.2 (which is available at http://cswww.essex.ac.uk/staff/rpoli/TinyGP/sin-data.txt). The output produced by the program was something like the following:
TINY GP (Java version)
SEED=1
MAX_LEN=10000
POPSIZE=100000
DEPTH=5
CROSSOVER_PROB=0.9
PMUT_PER_NODE=0.05
MIN_RANDOM=-5.0
MAX_RANDOM=5.0
GENERATIONS=100
TSIZE=2
/ (( 2.766097899954383 (X1 / (((X1 / ((((X1 / (X1
3.2001163763204445)) X1) 3.2001163763204445)
3.2001163763204445)) + X1) + (X1 (X1
3.9532436938954376))))) (((X1 X1) / (((X1 /
(3.9532436938954376 3.2001163763204445)) X1)
3.2001163763204445)) / (((X1 + X1) / (X1 X1)) + X1))
))
Figure B.1: Final generation of a TinyGP sample run: best and mean
fitness (top), mean program size (middle) and behaviour of the best-so-far
individual (bottom).
The best individual can be written more compactly as a ratio involving x^2, (x^2 + x) and the constants a = 2.76609789995, b = 10.240744822, c = 3.9532436939, d = 3.20011637632 and e = 12.6508398844.
Hardly an obvious approximation for the sine function, but still a very accurate one, at least over the test range.
Bibliography
H. Abbass, N. Hoai, and R. McKay. Anttag: A new method to compose computer
programs using colonies of ants. In IEEE Congress on Evolutionary Computation,
2002., 2002. URL http://citeseer.ist.psu.edu/abbass02anttag.html.
A.-C. Achilles and P. Ortyl. The Collection of Computer Science Bibliographies, 19952008. URL http://liinwww.ira.uka.de/bibliography/.
G. Adorni, S. Cagnoni, and M. Mordonini. Efficient low-level vision program design using sub-machine-code genetic programming.
In M. Gori, editor, AIIA
2002, Workshop sulla Percezione e Visione nelle Macchine, Siena, Italy, 10-13
September 2002. URL http://www-dii.ing.unisi.it/aiia2002/paper/PERCEVISIO/
adorni-aiia02.pdf.
GPBiB
A. Agapitos, J. Togelius, and S. M. Lucas. Multiobjective techniques for the use of state
in genetic programming applied to simulated car racing. In D. Srinivasan and L. Wang,
editors, 2007 IEEE Congress on Evolutionary Computation, pages 15621569, Singapore, 25-28 September 2007. IEEE Computational Intelligence Society, IEEE Press.
ISBN 1-4244-1340-0.
GPBiB
S. H. Al-Sakran, J. R. Koza, and L. W. Jones. Automated re-invention of a previously
patented optical lens system using genetic programming. In M. Keijzer, et al., editors,
Proceedings of the 8th European Conference on Genetic Programming, volume 3447 of
Lecture Notes in Computer Science, pages 2537, Lausanne, Switzerland, 30 March 1 April 2005. Springer. ISBN 3-540-25436-6. URL http://springerlink.metapress.
com/openurl.asp?genre=article&issn=0302-9743&volume=3447&spage=25.
GPBiB
R. Aler, D. Borrajo, and P. Isasi. Using genetic programming to learn and improve
control knowledge. Artificial Intelligence, 141(1-2):2956, October 2002. URL http:
//scalab.uc3m.es/~dborrajo/papers/aij-evock.ps.gz.
GPBiB
J. Allen, H. M. Davey, D. Broadhurst, J. K. Heald, J. J. Rowland, S. G. Oliver, and
D. B. Kell. High-throughput classification of yeast mutants for functional genomics
using metabolic footprinting. Nature Biotechnology, 21(6):692696, June 2003. URL
http://dbkgroup.org/Papers/NatureBiotechnology21(692-696).pdf.
GPBiB
L. Alonso and R. Schott. Random Generation of Trees. Kluwer Academic Publishers,
Boston, MA, USA, 1995. ISBN 0-7923-9528-X.
L. Altenberg. Emergent phenomena in genetic programming. In A. V. Sebald and L. J.
Fogel, editors, Evolutionary Programming Proceedings of the Third Annual Conference, pages 233241, San Diego, CA, USA, 24-26 February 1994. World Scientific Publishing. ISBN 981-02-1810-9. URL http://dynamics.org/~altenber/PAPERS/EPIGP/.
GPBiB
W. Banzhaf and W. B. Langdon. Some considerations on the reason for bloat. Genetic
Programming and Evolvable Machines, 3(1):8191, March 2002. ISSN 1389-2576. URL
http://web.cs.mun.ca/~banzhaf/papers/genp_bloat.pdf.
GPBiB
W. Banzhaf. Genetic programming for pedestrians. In S. Forrest, editor, Proceedings
of the 5th International Conference on Genetic Algorithms, ICGA-93, page 628,
University of Illinois at Urbana-Champaign, 17-21 July 1993. Morgan Kaufmann.
URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/ftp.io.com/papers/GenProg_
forPed.ps.Z.
GPBiB
W. Banzhaf, F. D. Francone, and P. Nordin. The effect of extensive use of the mutation
operator on generalization in genetic programming using sparse data sets. In H.-M.
Voigt, et al., editors, Parallel Problem Solving from Nature IV, Proceedings of the
International Conference on Evolutionary Computation, volume 1141 of LNCS, pages
300309, Berlin, Germany, 22-26 September 1996. Springer Verlag. ISBN 3-540-61723X.
GPBiB
W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone. Genetic Programming
An Introduction; On the Automatic Evolution of Computer Programs and its Applications. Morgan Kaufmann, San Francisco, CA, USA, January 1998a. ISBN 155860-510-X. URL http://www.elsevier.com/wps/find/bookdescription.cws_home/
677869/description#description.
GPBiB
W. Banzhaf, R. Poli, M. Schoenauer, and T. C. Fogarty, editors. Genetic Programming,
volume 1391 of LNCS, Paris, 14-15 April 1998b. Springer-Verlag. ISBN 3-540-643605.
URL http://www.springer.de/cgi-bin/search_book.pl?isbn=3-540-64360-5.
GPBiB
W. Banzhaf, J. Daida, A. E. Eiben, M. H. Garzon, V. Honavar, M. Jakiela, and R. E.
Smith, editors. GECCO-99: Proceedings of the Genetic and Evolutionary Computation Conference, Orlando, Florida, USA, 13-17 July 1999. Morgan Kaufmann.
ISBN 1-55860-611-4. URL http://www.amazon.com/exec/obidos/ASIN/1558606114/
qid%3D977054373/105-7666192-3217523.
GPBiB
G. J. Barlow. Design of autonomous navigation controllers for unmanned aerial vehicles
using multi-objective genetic programming. Masters thesis, North Carolina State University, Raleigh, NC, USA, March 2004. URL http://www.andrew.cmu.edu/user/gjb/
includes/publications/thesis/barlow2004-thesis/barlow2004-thesis.pdf. GPBiB
S. J. Barrett. Recurring analytical problems within drug discovery and development. In
T. Scheffer and U. Leser, editors, Data Mining and Text Mining for Bioinformatics:
Proceedings of the European Workshop, pages 67, Dubrovnik, Croatia, 22 September 2003. URL http://www2.informatik.hu-berlin.de/~scheffer/publications/
ProceedingsWS2003.pdf. Invited talk.
GPBiB
S. J. Barrett and W. B. Langdon. Advances in the application of machine learning
techniques in drug discovery, design and development. In A. Tiwari, et al., editors,
Applications of Soft Computing: Recent Trends, Advances in Soft Computing, pages
99110, On the World Wide Web, 19 September - 7 October 2005 2006. Springer. ISBN
ISBN 3-540-29123-7. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/
barrett_2005_WSC.pdf.
GPBiB
T. Bartz-Beielstein. Experimental research in evolutionary computation : the new experimentalism. Springer, 2006.
B. F. Buxton, W. B. Langdon, and S. J. Barrett. Data fusion by intelligent classifier
combination. Measurement and Control, 34(8):229234, October 2001. URL http:
//www.cs.ucl.ac.uk/staff/W.Langdon/mc/.
GPBiB
W. Cai, A. Pacheco-Vega, M. Sen, and K. T. Yang. Heat transfer correlations by symbolic
regression. International Journal of Heat and Mass Transfer, 49(23-24):43524359,
November 2006.
GPBiB
E. Cantú-Paz, J. A. Foster, K. Deb, L. Davis, R. Roy, U.-M. O'Reilly, H.-G. Beyer, R. K.
Standish, G. Kendall, S. W. Wilson, M. Harman, J. Wegener, D. Dasgupta, M. A.
Potter, A. C. Schultz, K. A. Dowsland, N. Jonoska, and J. F. Miller, editors. Genetic
and Evolutionary Computation GECCO 2003, Part I, volume 2723 of Lecture Notes
in Computer Science, Chicago, IL, USA, 12-16 July 2003. Springer. ISBN 3-540-406026.
GPBiB
F. Castillo, A. Kordon, and G. Smits. Robust pareto front genetic programming parameter selection based on design of experiments and industrial data. In R. L. Riolo, et al.,
editors, Genetic Programming Theory and Practice IV, volume 5 of Genetic and Evolutionary Computation, chapter 2, pages . Springer, Ann Arbor, 11-13 May 2006a.
ISBN 0-387-33375-4.
GPBiB
F. Castillo, A. Kordon, G. Smits, B. Christenson, and D. Dickerson. Pareto front genetic programming parameter selection based on design of experiments and industrial
data. In M. Keijzer, et al., editors, GECCO 2006: Proceedings of the 8th annual
conference on Genetic and evolutionary computation, volume 2, pages 16131620,
Seattle, Washington, USA, 8-12 July 2006b. ACM Press. ISBN 1-59593-186-4. URL
http://www.cs.bham.ac.uk/~wbl/biblio/gecco2006/docs/p1613.pdf.
GPBiB
M. Chami and D. Robilliard. Inversion of oceanic constituents in case I and II waters with
genetic programming algorithms. Applied Optics, 41(30):62606275, October 2002.
URL http://ao.osa.org/ViewMedia.cfm?id=70258&seq=0.
GPBiB
A. Channon. Unbounded evolutionary dynamics in a system of agents that actively process and transform their environment. Genetic Programming and Evolvable Machines,
7(3):253281, October 2006. ISSN 1389-2576.
D. L. Chao and S. Forrest. Information immune systems. Genetic Programming and
Evolvable Machines, 4(4):311331, December 2003. ISSN 1389-2576.
S. M. Cheang, K. S. Leung, and K. H. Lee. Genetic parallel programming: Design and
implementation. Evolutionary Computation, 14(2):129156, Summer 2006. ISSN 10636560.
GPBiB
K. Chellapilla. Evolving computer programs without subtree crossover. IEEE Transactions on Evolutionary Computation, 1(3):209216, September 1997a.
GPBiB
K. Chellapilla. Evolutionary programming with tree mutations: Evolving computer programs without crossover. In J. R. Koza, et al., editors, Genetic Programming 1997:
Proceedings of the Second Annual Conference, pages 431438, Stanford University, CA,
USA, 13-16 July 1997b. Morgan Kaufmann.
C. Dimopoulos. A genetic programming methodology for the solution of the multiobjective cell-formation problem. In H.-D. Cheng, editor, Proceedings of the 8th Joint
Conference in Information Systems (JCIS 2005), pages 14871494, Salt Lake City,
USA, 21-25 July 2005.
GPBiB
J. U. Dolinsky, I. D. Jenkinson, and G. J. Colquhoun. Application of genetic programming
to the calibration of industrial robots. Computers in Industry, 58(3):255264, April
2007. ISSN 0166-3615.
GPBiB
R. P. Domingos, R. Schirru, and A. S. Martinez. Soft computing systems applied to
PWRs xenon. Progress in Nuclear Energy, 46(3-4):297308, 2005.
GPBiB
M. Dorigo and T. Stützle. Ant Colony Optimization. MIT Press (Bradford Books), 2004.
D. C. Dracopoulos. Evolutionary Learning Algorithms for Neural Adaptive Control. Perspectives in Neural Computing. Springer Verlag, P.O. Box 31 13 40, D-10643 Berlin,
Germany, August 1997. ISBN 3-540-76161-6. URL http://www.springer.de/catalog/
html-files/deutsch/comp/3540761616.html.
GPBiB
S. Draves. The electric sheep. SIGEVOlution, 1(2):1016, 2006. ISSN 1931-8499.
S. Droste, T. Jansen, G. Rudolph, H.-P. Schwefel, K. Tinnefeld, and I. Wegener. Theory
of evolutionary algorithms and genetic programming. In H.-P. Schwefel, et al., editors,
Advances in Computational Intelligence: Theory and Practice, Natural Computing
Series, chapter 5, pages 107144. Springer, 2003. ISBN 3-540-43269-8.
GPBiB
M. Ebner, M. Reinhardt, and J. Albert. Evolution of vertex and pixel shaders. In
M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic
Programming, volume 3447 of Lecture Notes in Computer Science, pages 261270,
Lausanne, Switzerland, 30 March - 1 April 2005. Springer. ISBN 3-540-254366.
URL http://springerlink.metapress.com/openurl.asp?genre=article&issn=
0302-9743&volume=3447&spage=261.
GPBiB
M. Ebner, M. O'Neill, A. Ekárt, L. Vanneschi, and A. I. Esparcia-Alcázar, editors. Proceedings of the 10th European Conference on Genetic Programming, volume 4445 of
Lecture Notes in Computer Science, Valencia, Spain, 11 - 13 April 2007. Springer. ISBN
3-540-71602-5. URL http://www.springerlink.com/content/978-3-540-71602-0/.
GPBiB
EC-Digest, 1985-2008. URL http://ec-digest.research.ucf.edu/.
A. E. Eiben and J. E. Smith. Introduction to Evolutionary Computing. Springer, 2003.
ISBN 3-540-40184-9.
GPBiB
A. Ekárt and S. Z. Németh. Selection based on the Pareto nondomination criterion for
controlling code growth in genetic programming. Genetic Programming and Evolvable
Machines, 2(1):6173, March 2001. ISSN 1389-2576.
GPBiB
S. E. Eklund. A massively parallel architecture for linear machine code genetic programming. In Y. Liu, et al., editors, Evolvable Systems: From Biology to Hardware: Proceedings of 4th International Conference, ICES 2001, volume 2210 of
Lecture Notes in Computer Science, pages 216224, Tokyo, Japan, October 35 2001. Springer-Verlag. URL http://www.springerlink.com/openurl.asp?genre=
article&issn=0302-9743&volume=2210&spage=216.
GPBiB
S. E. Eklund. A massively parallel GP engine in VLSI. In D. B. Fogel, et al., editors,
Proceedings of the 2002 Congress on Evolutionary Computation CEC2002, pages 629
633. IEEE Press, 2002. ISBN 0-7803-7278-6.
GPBiB
D. E. Goldberg. Genetic Algorithms in Search Optimization and Machine Learning.
Addison-Wesley, 1989.
R. Goodacre and R. J. Gilbert. The detection of caffeine in a variety of beverages using
curie-point pyrolysis mass spectrometry and genetic programming. The Analyst, 124:
10691074, 1999.
GPBiB
R. Goodacre. Explanatory analysis of spectroscopic data using machine learning of simple, interpretable rules. Vibrational Spectroscopy, 32(1):3345, 5 August 2003. URL
http://www.biospec.net/learning/Metab06/Goodacre-FTIRmaps.pdf. A collection of
Papers Presented at Shedding New Light on Disease: Optical Diagnostics for the New
Millennium (SPEC 2002) Reims, France 23-27 June 2002.
GPBiB
R. Goodacre, B. Shann, R. J. Gilbert, E. M. Timmins, A. C. McGovern, B. K.
Alsberg, D. B. Kell, and N. A. Logan. The detection of the dipicolinic acid
biomarker in bacillus spores using curie-point pyrolysis mass spectrometry and fouriertransform infrared spectroscopy. Analytical Chemistry, 72(1):119127, 1 January
2000. URL http://pubs.acs.org/cgi-bin/article.cgi/ancham/2000/72/i01/html/
ac990661i.html.
GPBiB
R. Goodacre, S. Vaidyanathan, W. B. Dunn, G. G. Harrigan, and D. B. Kell.
Metabolomics by numbers: acquiring and understanding global metabolite data.
Trends in Biotechnology, 22(5):245252, 1 May 2004. URL http://dbkgroup.org/
Papers/trends%20in%20biotechnology_22_(245).pdf.
GPBiB
F. Gruau. Neural Network Synthesis using Cellular Encoding and the Genetic Algorithm.
PhD thesis, Laboratoire de l'Informatique du Parallélisme, École Normale Supérieure de
Lyon, France, 1994. URL ftp://ftp.ens-lyon.fr/pub/LIP/Rapports/PhD/PhD1994/
PhD1994-01-E.ps.Z.
GPBiB
F. Gruau and D. Whitley. Adding learning to the cellular development process: a comparative study. Evolutionary Computation, 1(3):213233, 1993.
GPBiB
F. Gruau. Genetic micro programming of neural networks. In K. E. Kinnear, Jr., editor,
Advances in Genetic Programming, chapter 24, pages 495518. MIT Press, 1994. URL
http://cognet.mit.edu/library/books/view?isbn=0262111888.
GPBiB
F. Gruau. On using syntactic constraints with genetic programming. In P. J. Angeline
and K. E. Kinnear, Jr., editors, Advances in Genetic Programming 2, chapter 19, pages
377394. MIT Press, Cambridge, MA, USA, 1996. ISBN 0-262-01158-1.
GPBiB
S. M. Gustafson. An Analysis of Diversity in Genetic Programming. PhD thesis, School
of Computer Science and Information Technology, University of Nottingham, Nottingham, England, February 2004. URL http://www.cs.nott.ac.uk/~smg/research/
publications/phdthesis-gustafson.pdf.
GPBiB
S. M. Gustafson and E. K. Burke. The speciating island model: An alternative parallel
evolutionary algorithm. Journal of Parallel and Distributed Computing, 66(8):1025
1036, August 2006. Parallel Bioinspired Algorithms.
GPBiB
S. M. Gustafson, E. K. Burke, and N. Krasnogor. On improving genetic programming
for symbolic regression. In D. Corne, et al., editors, Proceedings of the 2005 IEEE
Congress on Evolutionary Computation, volume 1, pages 912919, Edinburgh, UK,
2-5 September 2005. IEEE Press. ISBN 0-7803-9363-5.
GPBiB
M. Hinchliffe, M. Willis, and M. Tham. Chemical process sytems modelling using multiobjective genetic programming. In J. R. Koza, et al., editors, Genetic Programming
1998: Proceedings of the Third Annual Conference, pages 134139, University of Wisconsin, Madison, Wisconsin, USA, 22-25 July 1998. Morgan Kaufmann. ISBN 1-55860548-7.
GPBiB
M. P. Hinchliffe and M. J. Willis.
Dynamic systems modelling using genetic programming.
Computers & Chemical Engineering, 27(12):18411854,
2003. URL http://www.sciencedirect.com/science/article/B6TFT-49MDYGW-2/2/
742bcc7f22240c7a0381027aa5ff7e73.
GPBiB
S.-Y. Ho, C.-H. Hsieh, H.-M. Chen, and H.-L. Huang. Interpretable gene expression
classifier with an accurate and compact fuzzy rule base for microarray data analysis.
Biosystems, 85(3):165176, September 2006.
GPBiB
N. X. Hoai and R. I. McKay. Softening the structural difficulty in genetic programming
with TAG-based representation and insertion/deletion operators. In K. Deb, et al.,
editors, Genetic and Evolutionary Computation GECCO-2004, Part II, volume 3103
of Lecture Notes in Computer Science, pages 605616, Seattle, WA, USA, 26-30 June
2004. Springer-Verlag. ISBN 3-540-22343-6. URL http://link.springer.de/link/
service/series/0558/bibs/3103/31030605.htm.
GPBiB
N. X. Hoai, R. I. McKay, and H. A. Abbass. Tree adjoining grammars, language bias,
and genetic programming. In C. Ryan, et al., editors, Genetic Programming, Proceedings of EuroGP2003, volume 2610 of LNCS, pages 335344, Essex, 14-16 April 2003.
Springer-Verlag. ISBN 3-540-00971-X. URL http://www.cs.adfa.edu.au/~abbass/
publications/hardcopies/TAG3P-EuroGp-03.pdf.
GPBiB
N. X. Hoai, R. I. B. McKay, and D. Essam. Representation and structural difficulty in
genetic programming. IEEE Transactions on Evolutionary Computation, 10(2):157
166, April 2006. URL http://sc.snu.ac.kr/courses/2006/fall/pg/aai/GP/nguyen/
Structdiff.pdf.
GPBiB
N. X. Hoai, R. I. McKay, D. Essam, and H. T. Hao. Genetic transposition in treeadjoining grammar guided genetic programming: The duplication operator. In
M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic
Programming, volume 3447 of Lecture Notes in Computer Science, pages 108119,
Lausanne, Switzerland, 30 March - 1 April 2005. Springer. ISBN 3-540-254366.
URL http://springerlink.metapress.com/openurl.asp?genre=article&issn=
0302-9743&volume=3447&spage=108.
GPBiB
T.-H. Hoang, D. Essam, R. I. B. McKay, and X. H. Nguyen. Building on success in genetic
programming:adaptive variation & developmental evaluation. In Proceedings of the
2007 International Symposium on Intelligent Computation and Applications (ISICA),
Wuhan, China, September 21-23 2007. China University of Geosciences Press. URL
http://sc.snu.ac.kr/PAPERS/dtag.pdf.
GPBiB
J. H. Holland. Adaptation in Natural and Artificial Systems: An Introductory Analysis
with Applications to Biology, Control and Artificial Intelligence. MIT Press, 1992.
First Published by University of Michigan Press 1975.
P. Holmes. The odin genetic programming system. Tech Report RR-95-3, Computer
Studies, Napier University, Craiglockhart, 216 Colinton Road, Edinburgh, EH14 1DJ,
1995. URL http://citeseer.ist.psu.edu/holmes95odin.html.
GPBiB
H. Iba.
Complexity-based fitness evaluation for variable length representation.
Position paper at the Workshop on Evolutionary Computation
with Variable Size Representation at ICGA-97, 20 July 1997.
URL http:
//coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/16452/http:
zSzzSzwww.miv.t.u-tokyo.ac.jpzSz~ibazSztmpzSzagp94.pdf/iba94genetic.pdf.
GPBiB
H. Iba, T. Sato, and H. de Garis. Recombination guidance for numerical genetic programming. In 1995 IEEE Conference on Evolutionary Computation, volume 1, pages
97102, Perth, Australia, 29 November - 1 December 1995b. IEEE Press.
GPBiB
Y. Inagaki. On synchronized evolution of the network of automata. IEEE Transactions on Evolutionary Computation, 6(2):147158, April 2002. ISSN 1089-778X.
URL http://ieeexplore.ieee.org/iel5/4235/21497/00996014.pdf?tp=&arnumber=
996014&isnumber=21497&arSt=147&ared=158&arAuthor=Inagaki%2C+Y.%3B.
GPBiB
C. Jacob. Principia Evolvica Simulierte Evolution mit Mathematica. dpunkt.verlag,
Heidelberg, Germany, August 1997. ISBN 3-920993-48-9.
GPBiB
C. Jacob. The art of genetic programming. IEEE Intelligent Systems, 15(3):8384, MayJune 2000. ISSN 1094-7167. URL http://ieeexplore.ieee.org/iel5/5254/18363/
00846288.pdf.
GPBiB
C. Jacob. Illustrating Evolutionary Computation with Mathematica. Morgan Kaufmann,
2001. ISBN 1-55860-637-8. URL http://www.mkp.com/books_catalog/catalog.asp?
ISBN=1-55860-637-8.
GPBiB
N. Jin. Equilibrium selection by co-evolution for bargaining problems under incomplete
information about time preferences. In Proceedings of the IEEE Congress on Evolutionary Computation (CEC), pages 26612668, Edinburgh, 25 September 2005.
N. Jin and E. P. K. Tsang. Co-adaptive strategies for sequential bargaining problems with
discount factors and outside options. In Proceedings of the 2006 IEEE Congress on
Evolutionary Computation, pages 79137920, Vancouver, 6-21 July 2006. IEEE Press.
ISBN 0-7803-9487-9.
GPBiB
H. E. Johnson, R. J. Gilbert, M. K. Winson, R. Goodacre, A. R. Smith, J. J. Rowland,
M. A. Hall, and D. B. Kell. Explanatory analysis of the metabolome using genetic
programming of simple, interpretable rules. Genetic Programming and Evolvable Machines, 1(3):243258, July 2000. ISSN 1389-2576.
GPBiB
A. Jones, D. Young, J. Taylor, D. B. Kell, and J. J. Rowland. Quantification of microbial
productivity via multi-angle light scattering and supervised learning. Biotechnology
and Bioengineering, 59(2):131143, 20 July 1998. ISSN 0006-3592.
GPBiB
Jong-Wan Kim. Proceedings of the 2001 Congress on Evolutionary Computation
CEC2001, COEX, World Trade Center, 159 Samseong-dong, Gangnam-gu, Seoul, Korea, 27-30 May 2001. IEEE Press. ISBN 0-7803-6658-1.
GPBiB
E. Jordaan, J. den Doelder, and G. Smits. Novel approach to develop structure-property
relationships using genetic programming. In T. P. Runarsson, et al., editors, Parallel Problem Solving from Nature - PPSN IX, volume 4193 of LNCS, pages 322331,
Reykjavik, Iceland, 9-13 September 2006. Springer-Verlag. ISBN 3-540-38990-3. GPBiB
E. Jordaan, A. Kordon, L. Chiang, and G. Smits. Robust inferential sensors based on
ensemble of predictors generated by genetic programming. In X. Yao, et al., editors,
Parallel Problem Solving from Nature - PPSN VIII, volume 3242 of LNCS, pages
522531, Birmingham, UK, 18-22 September 2004. Springer-Verlag. ISBN 3-54023092-0.
URL http://www.springerlink.com/openurl.asp?genre=article&issn=
0302-9743&volume=3242&spage=522.
GPBiB
A. K. Joshi and Y. Schabes. Tree adjoining grammars. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, pages 69–123. Springer-Verlag, 1997.
J. R. Koza, D. E. Goldberg, D. B. Fogel, and R. L. Riolo, editors. Genetic Programming 1996: Proceedings of the First Annual Conference, Stanford University, CA,
USA, 2831 July 1996. MIT Press. URL http://www.genetic-programming.org/
gp96proceedings.html.
GPBiB
J. R. Koza, K. Deb, M. Dorigo, D. B. Fogel, M. Garzon, H. Iba, and R. L. Riolo, editors.
Genetic Programming 1997: Proceedings of the Second Annual Conference, Stanford
University, CA, USA, 13-16 July 1997. Morgan Kaufmann. URL http://www.mkp.com/
books_catalog/1-55860-483-9.asp.
GPBiB
J. R. Koza, W. Banzhaf, K. Chellapilla, K. Deb, M. Dorigo, D. B. Fogel, M. H. Garzon,
D. E. Goldberg, H. Iba, and R. Riolo, editors. Genetic Programming 1998: Proceedings
of the Third Annual Conference, University of Wisconsin, Madison, WI, USA, 22-25
July 1998. Morgan Kaufmann. ISBN 1-55860-548-7.
GPBiB
D. H. Kraft, F. E. Petry, W. P. Buckles, and T. Sadasivan. The use of genetic programming
to build queries for information retrieval. In Proceedings of the 1994 IEEE World
Congress on Computational Intelligence, pages 468473, Orlando, Florida, USA, 2729 June 1994. IEEE Press.
GPBiB
T. Krantz, O. Lindberg, G. Thorburn, and P. Nordin. Programmatic compression of
natural video. In E. Cantú-Paz, editor, Late Breaking Papers at the Genetic and Evolutionary Computation Conference (GECCO-2002), pages 301–307, New York, NY,
July 2002. AAAI. URL http://thomas.krantz.com/paper.pdf.
GPBiB
N. Krasnogor. Self generating metaheuristics in bioinformatics: The proteins structure
comparison case. Genetic Programming and Evolvable Machines, 5(2):181201, June
2004. ISSN 1389-2576.
K. Krawiec. Evolutionary Feature Programming: Cooperative learning for knowledge
discovery and computer vision. Number 385. Wydawnictwo Politechniki Poznanskiej,
Poznan University of Technology, Poznan, Poland, 2004. URL http://idss.cs.put.
poznan.pl/~krawiec/pubs/hab/krawiec_hab.pdf.
GPBiB
W. B. Langdon. The evolution of size in variable length representations. In 1998 IEEE
International Conference on Evolutionary Computation, pages 633638, Anchorage,
Alaska, USA, 5-9 May 1998. IEEE Press. URL http://www.cs.bham.ac.uk/~wbl/ftp/
papers/WBL.wcci98_bloat.pdf.
GPBiB
W. B. Langdon. Size fair and homologous tree genetic programming crossovers. In
W. Banzhaf, et al., editors, Proceedings of the Genetic and Evolutionary Computation
Conference, volume 2, pages 10921097, Orlando, Florida, USA, 13-17 July 1999a.
Morgan Kaufmann. ISBN 1-55860-611-4. URL http://www.cs.ucl.ac.uk/staff/W.
Langdon/ftp/papers/WBL.gecco99.fairxo.ps.gz.
GPBiB
W. B. Langdon. Scaling of program tree fitness spaces. Evolutionary Computation, 7(4):
399428, Winter 1999b. ISSN 1063-6560. URL http://www.mitpressjournals.org/
doi/pdf/10.1162/evco.1999.7.4.399.
GPBiB
W. B. Langdon. Convergence rates for the distribution of program outputs. In W. B.
Langdon, et al., editors, GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, pages 812819, New York, 9-13 July 2002a. Morgan
Kaufmann Publishers. ISBN 1-55860-878-8. URL http://www.cs.ucl.ac.uk/staff/
W.Langdon/ftp/papers/wbl_gecco2002.pdf.
GPBiB
W. B. Langdon. How many good programs are there? How long are they? In K. A.
De Jong, et al., editors, Foundations of Genetic Algorithms VII, pages 183202,
Torremolinos, Spain, 4-6 September 2002b. Morgan Kaufmann. ISBN 0-12-2081552. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/wbl_foga2002.pdf.
Published 2003.
GPBiB
W. B. Langdon. Convergence of program fitness landscapes. In E. Cantú-Paz, et al., editors, Genetic and Evolutionary Computation – GECCO-2003, volume 2724 of LNCS,
pages 17021714, Chicago, 12-16 July 2003a. Springer-Verlag. ISBN 3-540-406034.
URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/wbl_gecco2003.
pdf.
GPBiB
W. B. Langdon. The distribution of reversible functions is Normal. In R. L. Riolo and
B. Worzel, editors, Genetic Programming Theory and Practise, chapter 11, pages 173
188. Kluwer, 2003b. ISBN 1-4020-7581-2. URL http://www.cs.ucl.ac.uk/staff/W.
Langdon/ftp/papers/wbl_reversible.pdf.
GPBiB
W. B. Langdon. Global distributed evolution of L-systems fractals. In M. Keijzer, et al.,
editors, Genetic Programming, Proceedings of EuroGP2004, volume 3003 of LNCS,
pages 349358, Coimbra, Portugal, 5-7 April 2004. Springer-Verlag. ISBN 3-540-213465. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/egp2004_pfeiffer.
pdf.
GPBiB
W. B. Langdon. Pfeiffer A distributed open-ended evolutionary system. In B. Edmonds,
et al., editors, AISB05: Proceedings of the Joint Symposium on Socially Inspired
Computing (METAS 2005), pages 713, University of Hertfordshire, Hatfield, UK, 1215 April 2005a. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/wbl_
metas2005.pdf. SSAISB 2005 Convention.
GPBiB
W. B. Langdon. The distribution of amorphous computer outputs. In S. Stepney and
S. Emmott, editors, The Grand Challenge in Non-Classical Computation: International Workshop, York, UK, 18-19 April 2005b. URL http://www.cs.ucl.ac.uk/
staff/W.Langdon/ftp/papers/grand_2005.pdf.
GPBiB
W. B. Langdon. Mapping non-conventional extensions of genetic programming. In C. S.
Calude, et al., editors, Unconventional Computing 2006, volume 4135 of LNCS, pages
166180, York, 4-8 September 2006. Springer-Verlag. ISBN 3-540-38593-2. URL http:
//www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/wbl_uc2002.pdf.
GPBiB
W. B. Langdon and W. Banzhaf. A SIMD interpreter for genetic programming on GPU
graphics cards. In EuroGP, LNCS, Naples, 26-28 March 2008. Springer. URL http://
www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/langdon_2008_eurogp.pdf. Forthcoming.
W. B. Langdon, S. J. Barrett, and B. F. Buxton. Predicting biochemical interactions
human P450 2D6 enzyme inhibition. In R. Sarker, et al., editors, Proceedings of
the 2003 Congress on Evolutionary Computation CEC2003, pages 807814, Canberra,
8-12 December 2003. IEEE Press. ISBN 0-7803-7804-0. URL http://www.cs.ucl.ac.
uk/staff/W.Langdon/ftp/papers/wbl_cec2003.pdf.
GPBiB
W. B. Langdon and B. F. Buxton. Genetic programming for mining DNA chip data
from cancer patients. Genetic Programming and Evolvable Machines, 5(3):251257,
September 2004. ISSN 1389-2576. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/
ftp/papers/wbl_dnachip.pdf.
GPBiB
W. B. Langdon and A. P. Harrison. GP on SPMD parallel graphics hardware for mega
bioinformatics data mining. 2008. In preparation.
W. B. Langdon. Size fair and homologous tree genetic programming crossovers. Genetic Programming and Evolvable Machines, 1(1/2):95119, April 2000. ISSN 13892576.
URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/WBL_fairxo.
pdf.
GPBiB
W. B. Langdon and W. Banzhaf. Repeated sequences in linear genetic programming
genomes. Complex Systems, 15(4):285306, 2005. ISSN 0891-2513. URL http://www.
cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/wbl_repeat_linear.pdf.
GPBiB
W. B. Langdon, S. M. Gustafson, and J. Koza. The Genetic Programming Bibliography,
1995-2008. URL http://www.cs.bham.ac.uk/~wbl/biblio/.
W. B. Langdon and P. Nordin. Evolving hand-eye coordination for a humanoid robot
with machine code genetic programming. In J. F. Miller, et al., editors, Genetic
Programming, Proceedings of EuroGP2001, volume 2038 of LNCS, pages 313–324,
Lake Como, Italy, 18-20 April 2001. Springer-Verlag. ISBN 3-540-41899-7. URL http:
//www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/wbl_handeye.ps.gz.
GPBiB
W. B. Langdon and R. Poli. Evolutionary solo pong players. In D. Corne, et al., editors,
Proceedings of the 2005 IEEE Congress on Evolutionary Computation, volume 3, pages
2621–2628, Edinburgh, UK, 2-5 September 2005. IEEE Press. ISBN 0-7803-9363-5. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/pong_cec2005.pdf.
GPBiB
W. B. Langdon and R. Poli. On Turing complete T7 and MISC F4 program fitness
landscapes. In D. V. Arnold, et al., editors, Theory of Evolutionary Algorithms, number 06061 in Dagstuhl Seminar Proceedings, Dagstuhl, Germany, 5-10 February 2006.
Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI), Schloss
Dagstuhl, Germany. URL http://drops.dagstuhl.de/opus/volltexte/2006/595.
[date of citation: 2006-01-01].
GPBiB
W. B. Langdon and R. Poli. Mapping non-conventional extensions of genetic programming. Natural Computing, 7:21–43, March 2008. Invited contribution to special issue
on Unconventional computing.
GPBiB
W. B. Langdon, T. Soule, R. Poli, and J. A. Foster. The evolution of size and shape.
In L. Spector, et al., editors, Advances in Genetic Programming 3, chapter 8, pages
163–190. MIT Press, Cambridge, MA, USA, June 1999. ISBN 0-262-19423-6. URL
http://www.cs.bham.ac.uk/~wbl/aigp3/ch08.pdf.
GPBiB
P. Larrañaga. A review on estimation of distribution algorithms, chapter 3, pages 57–100.
Kluwer Academic Publishers, 2002.
P. Larrañaga and J. A. Lozano. Estimation of Distribution Algorithms, A New Tool for
Evolutionary Computation. Kluwer Academic Publishers, 2002.
S. Lavington, N. Dewhurst, E. Wilkins, and A. Freitas. Interfacing knowledge discovery
algorithms to large database management systems. Information and Software Technology, 41(9):605–617, 25 June 1999. URL http://www.sciencedirect.com/science/
article/B6V0B-3WN7DYN-8/1/cdabdda09c085c6a4536aa5e116366ee. special issue on
data mining.
GPBiB
K. S. Leung, K. H. Lee, and S. M. Cheang. Genetic parallel programming - evolving linear
machine codes on a multiple-ALU processor. In S. Yaacob, et al., editors, Proceedings
of International Conference on Artificial Intelligence in Engineering and Technology ICAIET 2002, pages 207–213. Universiti Malaysia Sabah, June 2002. ISBN 983-218892-X.
GPBiB
T. L. Lew, A. B. Spencer, F. Scarpa, K. Worden, A. Rutherford, and F. Hemez. Identification of response surface models using genetic programming. Mechanical Systems
and Signal Processing, 20(8):1819–1831, November 2006.
GPBiB
D. R. Lewin, S. Lachman-Shalem, and B. Grosman. The role of process system engineering
(PSE) in integrated circuit (IC) manufacturing. Control Engineering Practice, 15(7):
793–802, July 2006. Special Issue on Award Winning Applications, 2005 IFAC World
Congress.
GPBiB
J. Li and E. P. K. Tsang. Investment decision making using FGP: A case study. In P. J.
Angeline, et al., editors, Proceedings of the Congress on Evolutionary Computation,
volume 2, pages 1253–1259, Mayflower Hotel, Washington D.C., USA, 6-9 July 1999.
IEEE Press. ISBN 0-7803-5536-9 (softbound). URL http://www.cs.bham.ac.uk/~jxl/
cercialink/web/publication/CEC99.pdf.
GPBiB
L. Li, W. Jiang, X. Li, K. L. Moser, Z. Guo, L. Du, Q. Wang, E. J. Topol, Q. Wang,
and S. Rao. A robust hybrid between genetic algorithm and support vector machine
for extracting an optimal feature gene subset. Genomics, 85(1):16–23, January 2005.
GPBiB
R. Linden and A. Bhaya. Evolving fuzzy rules to model gene expression. Biosystems, 88
(1-2):76–91, March 2007.
GPBiB
A. Lindenmayer. Mathematical models for cellular interaction in development, parts I and II. Journal of Theoretical Biology, 18:280–299 and 300–315, 1968.
H. Lipson. How to draw a straight line using a GP: Benchmarking evolutionary design
against 19th century kinematic synthesis. In M. Keijzer, editor, Late Breaking Papers
at the 2004 Genetic and Evolutionary Computation Conference, Seattle, Washington, USA, 26 July 2004. URL http://www.cs.bham.ac.uk/~wbl/biblio/gecco2004/
LBP063.pdf.
GPBiB
J. Lohn, G. Hornby, and D. Linden. Evolutionary antenna design for a NASA spacecraft.
In U.-M. O'Reilly, et al., editors, Genetic Programming Theory and Practice II, chapter 18, pages 301–315. Springer, Ann Arbor, 13-15 May 2004. ISBN 0-387-23253-2.
GPBiB
J. Lohn, A. Stoica, and D. Keymeulen, editors. The Second NASA/DoD Workshop on
Evolvable Hardware, Palo Alto, California, 13-15 July 2000. IEEE Computer Society.
ISBN 0-7695-0762-X.
M. A. Lones. Enzyme Genetic Programming: Modelling Biological Evolvability in Genetic
Programming. PhD thesis, The University of York, Heslington, York, YO10 5DD, UK,
September 2003. URL http://www-users.york.ac.uk/~mal503/common/thesis/main.
html.
GPBiB
M. Looks. Scalable estimation-of-distribution program evolution. In H. Lipson, editor,
GECCO, pages 539–546. ACM, 2007. ISBN 978-1-59593-697-4.
M. Looks, B. Goertzel, and C. Pennachin. Learning computer programs with the Bayesian
optimization algorithm. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of
the 2005 conference on Genetic and evolutionary computation, volume 1, pages 747–748, Washington DC, USA, 25-29 June 2005. ACM Press. ISBN 1-59593-010-8. URL
http://www.cs.bham.ac.uk/~wbl/biblio/gecco2005/docs/p747.pdf.
GPBiB
J. Louchet. Using an individual evolution strategy for stereovision. Genetic Programming
and Evolvable Machines, 2(2):101–109, June 2001. ISSN 1389-2576.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 19; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
J. Louchet, M. Guyon, M.-J. Lesot, and A. Boumaza. Dynamic flies: a new pattern
recognition tool applied to stereo sequence processing. Pattern Recognition Letters, 23
(1-3):335–345, January 2002.
GPBiB
J. Loviscach and J. Meyer-Spradow. Genetic programming of vertex shaders. In
M. Chover, et al., editors, Proceedings of EuroMedia 2003, pages 29–31, 2003. GPBiB
S. Luke. Evolving soccerbots: A retrospective. In Proceedings of the 12th Annual
Conference of the Japanese Society for Artificial Intelligence, 1998. URL http:
//www.cs.gmu.edu/~sean/papers/robocupShort.pdf.
GPBiB
S. Luke. Two fast tree-creation algorithms for genetic programming. IEEE Transactions on Evolutionary Computation, 4(3):274–283, September 2000. URL http:
//ieeexplore.ieee.org/iel5/4235/18897/00873237.pdf.
GPBiB
S. Luke and L. Panait. Lexicographic parsimony pressure. In W. B. Langdon, et al., editors, GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, pages 829–836, New York, 9-13 July 2002. Morgan Kaufmann Publishers. ISBN
1-55860-878-8. URL http://cs.gmu.edu/~sean/papers/lexicographic.pdf. GPBiB
S. Luke, L. Panait, G. Balan, S. Paus, Z. Skolicki, E. Popovici, J. Harrison, J. Bassett,
R. Hubley, and A. Chircop. ECJ: A Java-based Evolutionary Computation Research
System , 2000-2007. URL http://www.cs.gmu.edu/~eclab/projects/ecj/.
S. Luke and L. Spector. A comparison of crossover and mutation in genetic programming. In J. R. Koza, et al., editors, Genetic Programming 1997: Proceedings of the
Second Annual Conference, pages 240–248, Stanford University, CA, USA, 13-16 July
1997. Morgan Kaufmann. URL http://www.cs.gmu.edu/~sean/papers/comparison/
comparison.pdf.
GPBiB
E. Lukschandl, H. Borgvall, L. Nohle, M. Nordahl, and P. Nordin. Distributed Java bytecode genetic programming. In R. Poli, et al., editors, Genetic Programming, Proceedings of EuroGP2000, volume 1802 of LNCS, pages 316–325, Edinburgh, 15-16
April 2000. Springer-Verlag. ISBN 3-540-67339-3. URL http://www.springerlink.
com/openurl.asp?genre=article&issn=0302-9743&volume=1802&spage=316. GPBiB
E. Lutton, J. Levy-Vehel, G. Cretin, P. Glevarec, and C. Roll. Mixed IFS: Resolution of
the inverse problem using genetic programming. Complex Systems, 9:375–398, 1995a.
GPBiB
E. Lutton, J. Levy-Vehel, G. Cretin, P. Glevarec, and C. Roll. Mixed IFS: Resolution
of the inverse problem using genetic programming. Research Report No 2631, Inria,
1995b. URL http://citeseer.ist.psu.edu/cretin95mixed.html.
GPBiB
R. M. MacCallum. Introducing a Perl genetic programming system: and can meta-evolution solve the bloat problem? In C. Ryan, et al., editors, Genetic Programming, Proceedings of EuroGP2003, volume 2610 of LNCS, pages 364–373, Essex, 14-16 April
2003. Springer-Verlag. ISBN 3-540-00971-X. URL http://www.sbc.su.se/~maccallr/
publications/perlgp_eurogp2003.pdf.
GPBiB
P. Machado and J. Romero, editors. The Art of Artificial Evolution. Springer, 2008.
A. J. Marek, W. D. Smart, and M. C. Martin. Learning visual feature detectors for
obstacle avoidance using genetic programming. In E. Cantú-Paz, editor, Late Breaking Papers at the Genetic and Evolutionary Computation Conference (GECCO-2002), pages 330–336, New York, NY, July 2002. AAAI. URL http://www.martincmartin.
com/papers/LearingVisualFeatureDetectorsForObstAvoidGP_GECCO2002Marek.pdf.
GPBiB
P. Marenbach. Using prior knowledge and obtaining process insight in data based modelling of bioprocesses. System Analysis Modelling Simulation, 31:39–59, 1998. GPBiB
J. P. Marney, D. Miller, C. Fyfe, and H. F. E. Tarbert. Risk adjusted returns to technical
trading rules: a genetic programming approach. In 7th International Conference of
Society of Computational Economics, Yale, 28-29 June 2001.
GPBiB
M. C. Martin. Evolving visual sonar: Depth from monocular images. Pattern Recognition
Letters, 27(11):1174–1180, August 2006. URL http://martincmartin.com/papers/
EvolvingVisualSonarPatternRecognitionLetters2006.pdf. Evolutionary Computer
Vision and Image Understanding.
GPBiB
P. Martin. A hardware implementation of a genetic programming system using FPGAs
and Handel-C. Genetic Programming and Evolvable Machines, 2(4):317–343, December 2001. ISSN 1389-2576. URL http://www.naiadhome.com/gpem-d.pdf.
GPBiB
P. Martin. A pipelined hardware implementation of genetic programming using FPGAs
and Handel-C. In J. A. Foster, et al., editors, Genetic Programming, Proceedings of the
5th European Conference, EuroGP 2002, volume 2278 of LNCS, pages 1–12, Kinsale,
Ireland, 3-5 April 2002. Springer-Verlag. ISBN 3-540-43378-3.
GPBiB
P. Martin and R. Poli. Crossover operators for a hardware implementation of GP using
FPGAs and Handel-C. In W. B. Langdon, et al., editors, GECCO 2002: Proceedings
of the Genetic and Evolutionary Computation Conference, pages 845–852, New York,
9-13 July 2002. Morgan Kaufmann Publishers. ISBN 1-55860-878-8. URL http://
www.cs.bham.ac.uk/~wbl/biblio/gecco2002/gp284.ps.
GPBiB
S. Martinez-Jaramillo and E. P. K. Tsang. An heterogeneous, endogenous and coevolutionary GP-based financial market. IEEE Transactions on Evolutionary Computation, 2007. accepted for publication.
P. Massey, J. A. Clark, and S. Stepney. Evolution of a human-competitive quantum
Fourier transform algorithm using genetic programming. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of the 2005 conference on Genetic and evolutionary computation, volume 2, pages 1657–1663, Washington DC, USA, 25-29 June 2005.
ACM Press. ISBN 1-59593-010-8. URL http://www.cs.bham.ac.uk/~wbl/biblio/
gecco2005/docs/p1657.pdf.
GPBiB
S. R. Maxwell, III. Why might some problems be difficult for genetic programming to find
solutions? In J. R. Koza, editor, Late Breaking Papers at the Genetic Programming
1996 Conference Stanford University July 28-31, 1996, pages 125–128, Stanford University, CA, USA, 28-31 July 1996. Stanford Bookstore. ISBN 0-18-201031-7. GPBiB
S. R. Maxwell, III. Experiments with a coroutine model for genetic programming. In
Proceedings of the 1994 IEEE World Congress on Computational Intelligence, volume 1, pages 413–417a, Orlando, Florida, USA, 27-29 June 1994. IEEE Press. ISBN
0-7803-1899-4. URL http://ieeexplore.ieee.org/iel2/1125/8059/00349915.pdf?
isNumber=8059.
GPBiB
J. McCormack. New challenges for evolutionary music and art. SIGEvolution, 1(1):5–11,
April 2006. URL http://www.sigevolution.org/2006/01/issue.pdf.
GPBiB
A. C. McGovern, D. Broadhurst, J. Taylor, N. Kaderbhai, M. K. Winson, D. A. Small,
J. J. Rowland, D. B. Kell, and R. Goodacre. Monitoring of complex industrial bioprocesses for metabolite concentrations using modern spectroscopies and machine learning:
Application to gibberellic acid production. Biotechnology and Bioengineering, 78(5):
527–538, 5 June 2002. URL http://dbkgroup.org/Papers/biotechnol_bioeng_78_
(527).pdf.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 21; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 23; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
H. Mühlenbein and T. Mahnig. FDA - a scalable evolutionary algorithm for the optimization of additively decomposed functions. Evolutionary Computation, 7(4):353–376,
1999b.
C. J. Neely. Risk-adjusted, ex ante, optimal technical trading rules in equity markets.
International Review of Economics and Finance, 12(1):69–87, Spring 2003. URL http:
//research.stlouisfed.org/wp/1999/1999-015.pdf.
GPBiB
C. J. Neely and P. A. Weller. Technical trading rules in the European monetary system. Journal of International Money and Finance, 18(3):429–458, 1999. URL http://
research.stlouisfed.org/wp/1997/97-015.pdf.
GPBiB
C. J. Neely and P. A. Weller. Technical analysis and central bank intervention. Journal
of International Money and Finance, 20(7):949–970, December 2001. URL http:
//research.stlouisfed.org/wp/1997/97-002.pdf.
GPBiB
C. J. Neely, P. A. Weller, and R. Dittmar.
Is technical analysis in the foreign exchange market profitable? A genetic programming approach. The Journal of Financial and Quantitative Analysis, 32(4):405–426, December 1997. ISSN
00221090. URL http://links.jstor.org/sici?sici=0022-1090%28199712%2932%3A4%
3C405%3AITAITF%3E2.0.CO%3B2-T.
GPBiB
C. J. Neely, P. A. Weller, and J. M. Ulrich. The adaptive markets hypothesis: evidence
from the foreign exchange market. Working Paper 2006-046B, Federal Reserve Bank of
St. Louis, Research Division, P.O. Box 442, St. Louis, MO 63166, USA, August 2006.
URL http://research.stlouisfed.org/wp/2006/2006-046.pdf. Revised March 2007.
GPBiB
O. Nicolotti, V. J. Gillet, P. J. Fleming, and D. V. S. Green. Multiobjective optimization
in quantitative structure-activity relationships: Deriving accurate and interpretable
QSARs. Journal of Medicinal Chemistry, 45(23):5069–5080, November 7 2002. ISSN
0022-2623. URL http://pubs3.acs.org/acs/journals/doilookup?in_doi=10.1021/
jm020919o.
GPBiB
N. Nikolaev and H. Iba. Adaptive Learning of Polynomial Networks: Genetic Programming, Backpropagation and Bayesian Methods. Number 4 in Genetic and Evolutionary Computation. Springer, June 2006. ISBN 0-387-31239-0.
GPBiB
N. Y. Nikolaev and H. Iba. Genetic programming of polynomial models for financial
forecasting. In S.-H. Chen, editor, Genetic Algorithms and Genetic Programming in
Computational Finance, chapter 5, pages 103–123. Kluwer Academic Press, 2002. ISBN
0-7923-7601-3.
GPBiB
A. E. Nix and M. D. Vose. Modeling genetic algorithms with Markov chains. Annals of
Mathematics and Artificial Intelligence, 5:79–88, 1992.
P. Nordin. A compiling genetic programming system that directly manipulates the machine code. In K. E. Kinnear, Jr., editor, Advances in Genetic Programming, chapter 14, pages 311–331. MIT Press, 1994. URL http://cognet.mit.edu/library/books/
view?isbn=0262111888.
GPBiB
P. Nordin. Evolutionary Program Induction of Binary Machine Code and its Applications.
PhD thesis, der Universität Dortmund am Fachbereich Informatik, 1997.
GPBiB
P. Nordin and W. Banzhaf. Programmatic compression of images and sound. In J. R.
Koza, et al., editors, Genetic Programming 1996: Proceedings of the First Annual
Conference, pages 345–350, Stanford University, CA, USA, 28-31 July 1996. MIT
Press. URL http://www.cs.mun.ca/~banzhaf/papers/gp96.pdf.
GPBiB
R. Olsson. Inductive functional programming using incremental program transformation. Artificial Intelligence, 74(1):55–81, March 1995. URL http://www.sciencedirect.com/science?_ob=MImg&_imagekey=B6TYF-4002FJH-9-1&_cdi=5617&_orig=browse&_coverDate=03%2F31%2F1995&_sk=999259998&wchp=dGLbVlb-lSzBV&_acct=C000010182&_version=1&_userid=125795&md5=ba5db57b3fa83d990440da8dfd8afcd7&ie=f.pdf.
GPBiB
M. Oltean. Evolving evolutionary algorithms using linear genetic programming. Evolutionary Computation, 13(3):387–410, Fall 2005. ISSN 1063-6560.
GPBiB
M. Oltean and D. Dumitrescu. Evolving TSP heuristics using multi expression programming. In M. Bubak, et al., editors, Computational Science - ICCS 2004: 4th International Conference, Part II, volume 3037 of Lecture Notes in Computer Science,
pages 670–673, Krakow, Poland, 6-9 June 2004. Springer-Verlag. ISBN 3-540-22115-8.
URL http://springerlink.metapress.com/openurl.asp?genre=article&issn=
0302-9743&volume=3037&spage=670.
GPBiB
R. Ondas, M. Pelikan, and K. Sastry. Genetic programming, probabilistic incremental program evolution, and scalability. In J. Knowles, editor, WSC10: 10th Online
World Conference on Soft Computing in Industrial Applications, pages 363–372, On
the World Wide Web, 19 September - 7 October 2005. ISBN 3-540-29123-7. URL
http://isxp1010c.sims.cranfield.ac.uk/Papers/paper122.pdf.
GPBiB
M. O'Neill and C. Ryan. Grammatical Evolution: Evolutionary Automatic Programming in an Arbitrary Language, volume 4 of Genetic programming. Kluwer Academic Publishers, 2003. ISBN 1-4020-7444-1. URL http://www.wkap.nl/prod/b/1-4020-7444-1.
GPBiB
M. O'Neill, C. Ryan, M. Keijzer, and M. Cattolico. Crossover in grammatical evolution. Genetic Programming and Evolvable Machines, 4(1):67–93, March 2003. ISSN 1389-2576.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 25; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
S. Openshaw and I. Turton. Building new spatial interaction models using genetic
programming. In T. C. Fogarty, editor, Evolutionary Computing, Lecture Notes
in Computer Science, Leeds, UK, 11-13 April 1994. Springer-Verlag. URL http:
//www.geog.leeds.ac.uk/papers/94-1/94-1.pdf.
GPBiB
U.-M. O'Reilly. An Analysis of Genetic Programming. PhD thesis, Carleton University, Ottawa-Carleton Institute for Computer Science, Ottawa, Ontario, Canada,
22 September 1995. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/
oreilly/abstract.ps.gz.
GPBiB
U.-M. O'Reilly. Investigating the generality of automatically defined functions. In J. R. Koza, et al., editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 351–356, Stanford University, CA, USA, 28-31 July 1996. MIT
Press. URL http://citeseer.ist.psu.edu/24128.html.
GPBiB
U.-M. O'Reilly and M. Hemberg. Integrating generative growth and evolutionary computation for form exploration. Genetic Programming and Evolvable Machines, 8(2):163–186, June 2007. ISSN 1389-2576. Special issue on developmental systems. GPBiB
U.-M. O'Reilly and F. Oppacher. Program search with a hierarchical variable length
representation: Genetic programming, simulated annealing and hill climbing. In
Y. Davidor, et al., editors, Parallel Problem Solving from Nature PPSN III, number
866 in Lecture Notes in Computer Science, pages 397–406, Jerusalem, 9-14 October
1994a. Springer-Verlag. ISBN 3-540-58484-6. URL http://www.cs.ucl.ac.uk/staff/
W.Langdon/ftp/papers/ppsn-94.ps.gz.
GPBiB
U.-M. O'Reilly and F. Oppacher. The troubling aspects of a building block hypothesis for genetic programming. In L. D. Whitley and M. D. Vose, editors, Foundations of Genetic Algorithms 3, pages 73–88, Estes Park, Colorado, USA, 31 July-2 August 1994b. Morgan Kaufmann. ISBN 1-55860-356-5.
URL http://citeseer.ist.psu.edu/cache/papers/cs/163/http:zSzzSzwww.ai.mit.
eduzSzpeoplezSzunamayzSzpaperszSzfoga.pdf/oreilly92troubling.pdf. Published
1995.
GPBiB
U.-M. O'Reilly, T. Yu, R. L. Riolo, and B. Worzel, editors. Genetic Programming Theory
and Practice II, volume 8 of Genetic Programming, Ann Arbor, MI, USA, 13-15 May
2004. Springer. ISBN 0-387-23253-2. URL http://www.springeronline.com/sgw/cda/
frontpage/0,11855,5-40356-22-34954683-0,00.html.
GPBiB
M. Oussaidène, B. Chopard, O. V. Pictet, and M. Tomassini. Parallel genetic programming and its application to trading model induction. Parallel Computing, 23(8):1183–1198, August 1997. URL http://citeseer.ist.psu.edu/cache/papers/cs/166/http:
zSzzSzlslwww.epfl.chzSz~marcozSzparcomp.pdf/oussaidene97parallel.pdf. GPBiB
J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Kruger, A. E. Lefohn, and T. J.
Purcell. A survey of general-purpose computation on graphics hardware. Computer
Graphics Forum, 26(1):80–113, March 2007.
L. Panait and S. Luke. Alternative bloat control methods. In K. Deb, et al., editors,
Genetic and Evolutionary Computation - GECCO-2004, Part II, volume 3103 of Lecture Notes in Computer Science, pages 630–641, Seattle, WA, USA, 26-30 June 2004.
Springer-Verlag. ISBN 3-540-22343-6. URL http://cs.gmu.edu/~lpanait/papers/
panait04alternative.pdf.
GPBiB
J. Parent and A. Nowe. Evolving compression preprocessors with genetic programming.
In W. B. Langdon, et al., editors, GECCO 2002: Proceedings of the Genetic and
Evolutionary Computation Conference, pages 861–867, New York, 9-13 July 2002.
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 27; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
R. Poli. Hyperschema theory for GP with one-point crossover, building blocks, and some
new results in GA theory. In R. Poli, et al., editors, Genetic Programming, Proceedings
of EuroGP2000, volume 1802 of LNCS, pages 163–180, Edinburgh, 15-16 April 2000a.
Springer-Verlag. ISBN 3-540-67339-3. URL http://www.springerlink.com/openurl.
asp?genre=article&issn=0302-9743&volume=1802&spage=163.
GPBiB
R. Poli. Exact schema theorem and effective fitness for GP with one-point crossover. In
D. Whitley, et al., editors, Proceedings of the Genetic and Evolutionary Computation
Conference, pages 469–476, Las Vegas, July 2000b. Morgan Kaufmann.
R. Poli. Exact schema theory for genetic programming and variable-length genetic algorithms with one-point crossover. Genetic Programming and Evolvable Machines, 2(2):
123–163, 2001a.
R. Poli. General schema theory for genetic programming with subtree-swapping crossover.
In Genetic Programming, Proceedings of EuroGP 2001, LNCS, Milan, 18-20 April
2001b. Springer-Verlag.
R. Poli. A simple but theoretically-motivated method to control bloat in genetic programming. In C. Ryan, et al., editors, Genetic Programming, Proceedings of the 6th
European Conference, EuroGP 2003, LNCS, pages 211–223, Essex, UK, 14-16 April
2003. Springer-Verlag.
R. Poli. Tournament selection, iterated coupon-collection problem, and backward-chaining evolutionary algorithms. In A. H. Wright, et al., editors, Foundations of Genetic Algorithms 8, volume 3469 of Lecture Notes in Computer Science, pages 132–155,
Aizu-Wakamatsu City, Japan, 5-9 January 2005. Springer-Verlag. ISBN 3-540-27237-2.
URL http://www.cs.essex.ac.uk/staff/rpoli/papers/foga2005_Poli.pdf. GPBiB
R. Poli, C. Di Chio, and W. B. Langdon. Exploring extended particle swarms: a genetic
programming approach. In H.-G. Beyer, et al., editors, GECCO 2005: Proceedings of
the 2005 conference on Genetic and evolutionary computation, volume 1, pages 169–176, Washington DC, USA, 25-29 June 2005. ACM Press. ISBN 1-59593-010-8. URL
http://www.cs.essex.ac.uk/staff/poli/papers/geccopso2005.pdf.
GPBiB
R. Poli and W. B. Langdon. A new schema theory for genetic programming with one-point
crossover and point mutation. In J. R. Koza, et al., editors, Genetic Programming 1997:
Proceedings of the Second Annual Conference, pages 278–285, Stanford University,
CA, USA, 13-16 July 1997. Morgan Kaufmann. URL http://citeseer.ist.psu.edu/
327495.html.
GPBiB
R. Poli and W. B. Langdon. Schema theory for genetic programming with one-point
crossover and point mutation. Evolutionary Computation, 6(3):231–252, 1998a. URL
http://cswww.essex.ac.uk/staff/poli/papers/Poli-ECJ1998.pdf.
GPBiB
R. Poli and W. B. Langdon. On the search properties of different crossover operators
in genetic programming. In J. R. Koza, et al., editors, Genetic Programming 1998:
Proceedings of the Third Annual Conference, pages 293–301, University of Wisconsin,
Madison, Wisconsin, USA, 22-25 July 1998b. Morgan Kaufmann. ISBN 1-55860-548-7.
URL http://www.cs.essex.ac.uk/staff/poli/papers/Poli-GP1998.pdf.
GPBiB
R. Poli and W. B. Langdon. Sub-machine-code genetic programming. In L. Spector,
et al., editors, Advances in Genetic Programming 3, chapter 13, pages 301–323. MIT
Press, Cambridge, MA, USA, June 1999. ISBN 0-262-19423-6. URL http://cswww.
essex.ac.uk/staff/rpoli/papers/Poli-AIGP3-1999.pdf.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 29; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
R. Poli, N. F. McPhee, and J. E. Rowe. Exact schema theory and Markov chain models for genetic programming and variable-length genetic algorithms with homologous crossover. Genetic Programming and Evolvable Machines, 5(1):31–70, March 2004.
ISSN 1389-2576. URL http://cswww.essex.ac.uk/staff/rpoli/papers/GPEM2004.
pdf.
GPBiB
R. Poli and J. Page. Solving high-order Boolean parity problems with smooth uniform crossover, sub-machine code GP and demes. Genetic Programming and Evolvable Machines, 1(1/2):37–56, April 2000. ISSN 1389-2576. URL http://citeseer.ist.
psu.edu/335584.html.
GPBiB
R. Poli, J. Page, and W. B. Langdon. Smooth uniform crossover, sub-machine code GP
and demes: A recipe for solving high-order Boolean parity problems. In W. Banzhaf, et al., editors, Proceedings of the Genetic and Evolutionary Computation Conference, volume 2, pages 1162–1169, Orlando, Florida, USA, 13-17 July 1999. Morgan Kaufmann. ISBN 1-55860-611-4. URL http://www.cs.bham.ac.uk/~wbl/biblio/
gecco1999/GP-466.pdf.
GPBiB
R. Poli, J. E. Rowe, and N. F. McPhee. Markov chain models for GP and variable-length
GAs with homologous crossover. In L. Spector, et al., editors, Proceedings of the
Genetic and Evolutionary Computation Conference (GECCO-2001), pages 112–119, San Francisco, California, USA, 7-11 July 2001. Morgan Kaufmann. ISBN 1-55860-774-9. URL http://www.cs.bham.ac.uk/~wbl/biblio/gecco2001/d01.pdf.
GPBiB
R. Poli, J. Woodward, and E. K. Burke. A histogram-matching approach to the evolution of bin-packing strategies. In Proceedings of the IEEE Congress on Evolutionary
Computation, Singapore, 2007. accepted.
R. Poli, W. B. Langdon, M. Schoenauer, T. Fogarty, and W. Banzhaf, editors. Late Breaking Papers at EuroGP'98: the First European Workshop on Genetic Programming,
Paris, France, 14-15 April 1998. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/
ftp/papers/csrp-98-10.pdf.
GPBiB
R. Poli, P. Nordin, W. B. Langdon, and T. C. Fogarty, editors. Genetic Programming, Proceedings of EuroGP'99, volume 1598 of LNCS, Goteborg, Sweden, 26-27
May 1999. Springer-Verlag. URL http://www.springerlink.com/openurl.asp?genre=
article&issn=0302-9743&volume=1598.
GPBiB
R. Poli, W. Banzhaf, W. B. Langdon, J. F. Miller, P. Nordin, and T. C. Fogarty, editors. Genetic Programming, Proceedings of EuroGP2000, volume 1802 of LNCS,
Edinburgh, 15-16 April 2000. Springer-Verlag. ISBN 3-540-67339-3.
GPBiB
E. Popovici and K. De Jong. The effects of interaction frequency on the optimization
performance of cooperative coevolution. In M. Keijzer, et al., editors, GECCO 2006:
Proceedings of the 8th annual conference on Genetic and evolutionary computation,
volume 1, pages 353–360, Seattle, Washington, USA, 8-12 July 2006. ACM Press. ISBN
1-59593-186-4. URL http://www.cs.bham.ac.uk/~wbl/biblio/gecco2006/docs/p353.
pdf.
M. A. Potter. The Design and Analysis of a Computational Model of Cooperative Coevolution. PhD thesis, George Mason University, Washington, DC, spring 1997. URL
http://www.cs.gmu.edu/~mpotter/dissertation.html.
S. Priesterjahn, O. Kramer, A. Weimer, and A. Goebels. Evolution of human-competitive
agents in modern computer games. In G. G. Yen, et al., editors, Proceedings of the
2006 IEEE Congress on Evolutionary Computation, pages 777–784, Vancouver, BC,
Canada, 16-21 July 2006. IEEE Press. ISBN 0-7803-9487-9. URL http://ieeexplore.
ieee.org/servlet/opac?punumber=11108.
A. Prügel-Bennett and J. L. Shapiro. An analysis of genetic algorithms using statistical mechanics. Physical Review Letters, 72:1305–1309, 1994.
J. C. F. Pujol. Evolution of Artificial Neural Networks Using a Two-dimensional Representation. PhD thesis, School of Computer Science, University of Birmingham, UK,
April 1999.
GPBiB
J. C. F. Pujol and R. Poli. Evolution of the topology and the weights of neural networks
using genetic programming with a dual representation. Technical Report CSRP-97-7,
University of Birmingham, School of Computer Science, February 1997. URL ftp:
//ftp.cs.bham.ac.uk/pub/tech-reports/1997/CSRP-97-07.ps.gz.
GPBiB
B. Punch and D. Zongker. lil-gp Genetic Programming System, 1998. URL http://
garage.cse.msu.edu/software/lil-gp/index.html.
M. I. Quintana, R. Poli, and E. Claridge. On two approaches to image processing algorithm design for binary images using GP. In G. R. Raidl, et al., editors, Applications
of Evolutionary Computing, EvoWorkshops2003: EvoBIO, EvoCOP, EvoIASP, EvoMUSART, EvoROB, EvoSTIM, volume 2611 of LNCS, pages 422–431, University of
Essex, England, UK, 14-16 April 2003. Springer-Verlag.
GPBiB
M. I. Quintana, R. Poli, and E. Claridge. Morphological algorithm design for binary
images using genetic programming. Genetic Programming and Evolvable Machines,
7(1):81–102, March 2006. ISSN 1389-2576. URL http://cswww.essex.ac.uk/staff/
rpoli/papers/gpem2005.pdf.
GPBiB
A. Ratle and M. Sebag. Genetic programming and domain knowledge: Beyond the limitations of grammar-guided machine discovery. In M. Schoenauer, et al., editors, Parallel
Problem Solving from Nature - PPSN VI 6th International Conference, volume 1917
of LNCS, pages 211–220, Paris, France, 16-20 September 2000. Springer Verlag. URL
http://www.lri.fr/~sebag/REF/PPSN00.ps.
GPBiB
A. Ratle and M. Sebag. Avoiding the bloat with probabilistic grammar-guided genetic
programming. In P. Collet, et al., editors, Artificial Evolution 5th International
Conference, Evolution Artificielle, EA 2001, volume 2310 of LNCS, pages 255–266,
Creusot, France, October 29-31 2001. Springer Verlag. ISBN 3-540-43544-1. URL http:
//link.springer.de/link/service/series/0558/papers/2310/23100255.pdf. GPBiB
J. Reggia, M. Tagamets, J. Contreras-Vidal, D. Jacobs, S. Weems, W. Naqvi, R. Winder,
T. Chabuk, J. Jung, and C. Yang. Development of a large-scale integrated neurocognitive architecture - part 2: Design and architecture. Technical Report TRCS-4827, UMIACS-TR-2006-43, University of Maryland, USA, October 2006. URL
https://drum.umd.edu/dspace/bitstream/1903/3957/1/MarylandPart2.pdf. GPBiB
E. N. Regolin and A. T. R. Pozo. Bayesian automatic programming. In M. Keijzer, et al., editors, Proceedings of the 8th European Conference on Genetic Programming, volume 3447 of Lecture Notes in Computer Science, pages 38–49, Lausanne, Switzerland, 30 March - 1 April 2005. Springer. ISBN 3-540-25436-6.
URL http://springerlink.metapress.com/openurl.asp?genre=article&issn=
0302-9743&volume=3447&spage=38.
GPBiB
D. M. Reif, B. C. White, and J. H. Moore. Integrated analysis of genetic, genomic, and
proteomic data. Expert Review of Proteomics, 1(1):67–75, 2004. ISSN 1473-7159. URL
http://www.future-drugs.com/doi/abs/10.1586/14789450.1.1.67.
GPBiB
C. W. Reynolds. Flocks, herds, and schools: A distributed behavioral model. SIGGRAPH
Computer Graphics, 21(4):25–34, July 1987. ISSN 0097-8930. URL http://www.red3d.
com/cwr/papers/1987/boids.html.
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 36; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 43; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
C. Ryan, T. Soule, M. Keijzer, E. P. K. Tsang, R. Poli, and E. Costa, editors. Genetic Programming, Proceedings of the 6th European Conference, EuroGP 2003, volume 2610 of LNCS, Essex, UK, 14-16 April 2003. Springer-Verlag. ISBN 3-540-00971-X. URL http://www.springerlink.com/openurl.asp?genre=article&issn=
0302-9743&volume=2610.
GPBiB
R. P. Salustowicz and J. Schmidhuber. Probabilistic incremental program evolution. Evolutionary Computation, 5(2):123–141, 1997. URL ftp://ftp.idsia.ch/pub/rafal/
PIPE.ps.gz.
GPBiB
R. P. Salustowicz, M. A. Wiering, and J. Schmidhuber. Learning team strategies: Soccer
case studies. Machine Learning, 33(2-3):263–282, 12 November 1998. ISSN 0885-6125.
URL ftp://ftp.idsia.ch/pub/rafal/soccer.ps.gz.
R. P. Salustowicz and J. Schmidhuber. From probabilities to programs with probabilistic
incremental program evolution. In D. Corne, et al., editors, New Ideas in Optimization, Advanced Topics in Computer Science, chapter 28, pages 433–450. McGraw-Hill,
Maidenhead, Berkshire, England, 1999. ISBN 0-07-709506-5.
GPBiB
A. L. Samuel. AI, where it has been and where it is going. In IJCAI, pages 1152–1157,
1983.
A. Sarafopoulos. Automatic generation of affine IFS and strongly typed genetic programming. In R. Poli, et al., editors, Genetic Programming, Proceedings of EuroGP'99, volume 1598 of LNCS, pages 149–160, Goteborg, Sweden, 26-27 May 1999.
Springer-Verlag. ISBN 3-540-65899-8. URL http://www.springerlink.com/openurl.
asp?genre=article&issn=0302-9743&volume=1598&spage=149.
GPBiB
K. Sastry and D. E. Goldberg. Probabilistic model building and competent genetic programming. In R. L. Riolo and B. Worzel, editors, Genetic Programming Theory and
Practice, chapter 13, pages 205–220. Kluwer, 2003. ISBN 1-4020-7581-2.
GPBiB
M. D. Schmidt and H. Lipson. Co-evolving fitness predictors for accelerating and reducing
evaluations. In R. L. Riolo, et al., editors, Genetic Programming Theory and Practice
IV, volume 5 of Genetic and Evolutionary Computation, chapter 17. Springer,
Ann Arbor, 11-13 May 2006. ISBN 0-387-33375-4.
GPBiB
F. Schmiedle, N. Drechsler, D. Grosse, and R. Drechsler. Priorities in multi-objective
optimization for genetic programming. In L. Spector, et al., editors, Proceedings of the
Genetic and Evolutionary Computation Conference (GECCO-2001), pages 129–136, San Francisco, California, USA, 7-11 July 2001. Morgan Kaufmann. ISBN 1-55860-774-9. URL http://www.cs.bham.ac.uk/~wbl/biblio/gecco2001/d01.pdf.
GPBiB
M. Schoenauer, B. Lamy, and F. Jouve. Identification of mechanical behaviour by genetic programming part II: Energy formulation. Technical report, Ecole Polytechnique,
91128 Palaiseau, France, 1995.
GPBiB
M. Schoenauer and M. Sebag. Using domain knowledge in evolutionary system identification. In K. C. Giannakoglou, et al., editors, Evolutionary Methods for Design,
Optimization and Control with Applications to Industrial Problems, Athens, 19-21
September 2001. URL http://arxiv.org/abs/cs/0602021.
GPBiB
M. Schoenauer, M. Sebag, F. Jouve, B. Lamy, and H. Maitournam. Evolutionary identification of macro-mechanical models.
In P. J. Angeline and K. E.
Kinnear, Jr., editors, Advances in Genetic Programming 2, chapter 23, pages
467–488. MIT Press, Cambridge, MA, USA, 1996. ISBN 0-262-01158-1. URL
http://citeseer.ist.psu.edu/cache/papers/cs/902/http:zSzzSzwww.eeaax.
polytechnique.frzSzpaperszSzmarczSzAGP2.pdf/schoenauer96evolutionary.pdf.
GPBiB
M. Schoenauer, K. Deb, G. Rudolph, X. Yao, E. Lutton, J. J. Merelo, and H.-P. Schwefel,
editors. Parallel Problem Solving from Nature - PPSN VI 6th International Conference, volume 1917 of LNCS, Paris, France, 16-20 September 2000. Springer Verlag.
ISBN 3-540-41056-2. URL http://www.springer.de/cgi-bin/search_book.pl?isbn=
3-540-41056-2.
GPBiB
D. P. Searson, G. A. Montague, and M. J. Willis. Evolutionary design of process controllers. In Proceedings of the 1998 United Kingdom Automatic Control Council International Conference on Control (UKACC International Conference on Control 98), volume 455 of IEE Conference Publications, University of Wales, Swansea,
UK, 1-4 September 1998. Institution of Electrical Engineers (IEE). URL http:
//www.staff.ncl.ac.uk/d.p.searson/docs/Searsoncontrol98.pdf.
GPBiB
L. Sekanina. Evolvable Components: From Theory to Hardware Implementations. Natural
Computing. Springer-Verlag, 2003. ISBN 3-540-40377-9. URL http://www.fit.vutbr.
cz/~sekanina/ehw/books.html.en.
H.-S. Seok, K.-J. Lee, and B.-T. Zhang. An on-line learning method for object-locating robots using genetic programming on evolvable hardware. In M. Sugisaka and H. Tanaka, editors, Proceedings of the Fifth International Symposium on Artificial Life and Robotics, volume 1, pages 321–324, Oita, Japan, 26-28 January 2000. URL
http://bi.snu.ac.kr/Publications/Conferences/International/AROB00.ps. GPBiB
C. Setzkorn. On The Use Of Multi-Objective Evolutionary Algorithms For Classification
Rule Induction. PhD thesis, University of Liverpool, UK, March 2005.
GPBiB
S. C. Shah and A. Kusiak. Data mining and genetic algorithm based gene/SNP selection.
Artificial Intelligence in Medicine, 31(3):183–196, July 2004. URL http://www.icaen.
uiowa.edu/~ankusiak/Journal-papers/Gen_Shital.pdf.
GPBiB
Y. Shan, H. Abbass, R. I. McKay, and D. Essam. AntTAG: a further study. In R. Sarker
and B. McKay, editors, Proceedings of the Sixth Australia-Japan Joint Workshop on
Intelligent and Evolutionary Systems, Australian National University, Canberra, Australia, 30 November 2002.
GPBiB
Y. Shan, R. I. McKay, H. A. Abbass, and D. Essam. Program evolution with explicit
learning: a new framework for program automatic synthesis. In R. Sarker, et al.,
editors, Proceedings of the 2003 Congress on Evolutionary Computation CEC2003,
pages 1639–1646, Canberra, 8-12 December 2003. IEEE Press. ISBN 0-7803-7804-0.
URL http://citeseer.ist.psu.edu/560804.html.
GPBiB
Y. Shan, R. I. McKay, R. Baxter, H. Abbass, D. Essam, and N. X. Hoai. Grammar
model-based program evolution. In Proceedings of the 2004 IEEE Congress on Evolutionary Computation, pages 478–485, Portland, Oregon, 20-23 June 2004. IEEE
Press. ISBN 0-7803-8515-2. URL http://sc.snu.ac.kr/courses/2006/fall/pg/aai/
GP/shan/scfgcec04.pdf.
GPBiB
Y. Shan, R. I. McKay, D. Essam, and H. A. Abbass. A survey of probabilistic model
building genetic programming. In M. Pelikan, et al., editors, Scalable Optimization
via Probabilistic Modeling: From Algorithms to Applications. Springer, 2006. ISBN
3-540-34953-7.
GPBiB
S. Sharabi and M. Sipper. GP-sumo: Using genetic programming to evolve sumobots.
Genetic Programming and Evolvable Machines, 7(3):211–230, October 2006. ISSN
1389-2576.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 50; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 57; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
J. Stender, editor. Parallel Genetic Algorithms: Theory and Applications. IOS press,
1993.
C. R. Stephens and H. Waelbroeck. Effective degrees of freedom in genetic algorithms
and the block hypothesis. In T. Bäck, editor, Proceedings of the Seventh International Conference on Genetic Algorithms (ICGA97), pages 34–40, East Lansing, 1997.
Morgan Kaufmann.
C. R. Stephens and H. Waelbroeck. Schemata evolution and building blocks. Evolutionary
Computation, 7(2):109–124, 1999.
T. Sterling. Beowulf-class clustered computing: Harnessing the power of parallelism in a
pile of PCs. In J. R. Koza, et al., editors, Genetic Programming 1998: Proceedings of
the Third Annual Conference, page 883, University of Wisconsin, Madison, Wisconsin,
USA, 22-25 July 1998. Morgan Kaufmann. ISBN 1-55860-548-7. Invited talk.
A. Stoica, J. Lohn, and D. Keymeulen, editors. The First NASA/DoD Workshop on
Evolvable Hardware, Pasadena, California, 19-21 July 1999. IEEE Computer Society.
URL http://cism.jpl.nasa.gov/ehw/events/nasa_eh/.
W. A. Tackett.
Genetic generation of dendritic trees for image classification.
In Proceedings of WCNN93, pages IV 646–649. IEEE Press, July
1993. URL http://www.cs.ucl.ac.uk/staff/W.Langdon/ftp/ftp.io.com/papers/GP.
feature.discovery.ps.Z.
GPBiB
H. Takagi. Interactive evolutionary computation: Fusion of the capabilities of EC optimization and human evaluation. Proceedings of the IEEE, 89(9):1275–1296, September
2001. ISSN 0018-9219. Invited Paper.
GPBiB
I. Tanev, T. Uozumi, and D. Akhmetov. Component object based single system image for
dependable implementation of genetic programming on clusters. Cluster Computing
Journal, 7(4):347–356, October 2004. ISSN 1386-7857 (Paper) 1573-7543 (Online).
URL http://www.kluweronline.com/issn/1386-7857.
GPBiB
J. Taylor, R. Goodacre, W. G. Wade, J. J. Rowland, and D. B. Kell. The deconvolution
of pyrolysis mass spectra using genetic programming: application to the identification
of some eubacterium species. FEMS Microbiology Letters, 160:237–246, 1998. GPBiB
A. Teller. Genetic programming, indexed memory, the halting problem, and other curiosities. In Proceedings of the 7th annual Florida Artificial Intelligence Research
Symposium, pages 270–274, Pensacola, Florida, USA, May 1994. IEEE Press. URL
http://www.cs.cmu.edu/afs/cs/usr/astro/public/papers/Curiosities.ps. GPBiB
A. Teller. Evolving programmers: The co-evolution of intelligent recombination operators.
In P. J. Angeline and K. E. Kinnear, Jr., editors, Advances in Genetic Programming 2,
chapter 3, pages 45–68. MIT Press, Cambridge, MA, USA, 1996. ISBN 0-262-01158-1.
URL http://www.cs.cmu.edu/afs/cs/usr/astro/public/papers/AiGPII.ps. GPBiB
A. Teller and D. Andre. Automatically choosing the number of fitness cases: The rational
allocation of trials. In J. R. Koza, et al., editors, Genetic Programming 1997: Proceedings of the Second Annual Conference, pages 321–328, Stanford University, CA,
USA, 13-16 July 1997. Morgan Kaufmann. URL http://www.cs.cmu.edu/afs/cs/usr/
astro/public/papers/GR.ps.
GPBiB
A. Teredesai and V. Govindaraju. GP-based secondary classifiers. Pattern Recognition,
38(4):505–512, April 2005.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 64; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
L. Walker. Search engine case study: searching the web using genetic programming and MPI. Parallel Computing, 27(1-2):71–89, January
2001. URL http://www.sciencedirect.com/science/article/B6V12-42K5HNX-4/1/
57eb870c72fb7768bb7d824557444b72.
GPBiB
P. Walsh and C. Ryan. Paragen: A novel technique for the autoparallelisation of sequential programs using genetic programming. In J. R. Koza, et al., editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 406–409, Stanford University, CA, USA, 28-31 July 1996. MIT Press. URL http:
//cognet.mit.edu/library/books/view?isbn=0262611279.
GPBiB
D. C. Weaver. Applying data mining techniques to library design, lead generation and lead optimization. Current Opinion in Chemical Biology, 8(3):264–270,
2004. URL http://www.sciencedirect.com/science/article/B6VRX-4CB69R1-2/2/
84a354cec9064ed07baab6a07998c942.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 71; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
M.-L. Wong, T.-T. Wong, and K.-L. Fok. Parallel evolutionary algorithms on graphics
processing unit. In D. Corne, et al., editors, Proceedings of the 2005 IEEE Congress on
Evolutionary Computation, volume 3, pages 2286–2293, Edinburgh, Scotland, UK, 2-5
September 2005. IEEE Press. ISBN 0-7803-9363-5. URL http://ieeexplore.ieee.
org/servlet/opac?punumber=10417&isvol=3.
A. M. Woodward, R. J. Gilbert, and D. B. Kell. Genetic programming as an analytical tool for non-linear dielectric spectroscopy. Bioelectrochemistry and Bioenergetics, 48(2):389–396, 1999. URL http://www.sciencedirect.com/science/article/
B6TF7-3WJ72RJ-T/2/19fd01a6eb6ae0b8e12b2bb2218fb6e9.
GPBiB
S. Wright. The roles of mutation, inbreeding, crossbreeding and selection in evolution.
In D. F. Jones, editor, Proceedings of the Sixth International Congress on Genetics,
volume 1, pages 356–366, 1932.
H. Xie, M. Zhang, and P. Andreae. Genetic programming for automatic stress detection
in spoken English. In F. Rothlauf, et al., editors, Applications of Evolutionary Computing, EvoWorkshops2006: EvoBIO, EvoCOMNET, EvoHOT, EvoIASP, EvoInteraction, EvoMUSART, EvoSTOC, volume 3907 of LNCS, pages 460–471, Budapest, 10-12 April 2006. Springer Verlag. ISBN 3-540-33237-5. URL http://www.springerlink.
com/openurl.asp?genre=article&issn=0302-9743&volume=3907&spage=460. GPBiB
L. Yamamoto and C. F. Tschudin. Experiments on the automatic evolution of protocols
using genetic programming. In I. Stavrakakis and M. Smirnov, editors, Autonomic
Communication, Second International IFIP Workshop, WAC 2005, Revised Selected
Papers, volume 3854 of Lecture Notes in Computer Science, pages 13–28, Athens,
Greece, October 2-5 2005. Springer. ISBN 3-540-32992-7. URL http://cn.cs.unibas.
ch/people/ly/doc/wac2005-lyct.pdf.
GPBiB
D. Yamashiro, T. Yoshikawa, and T. Furuhashi. Visualization of search process and
improvement of search performance in multi-objective genetic algorithm. In G. G. Yen,
et al., editors, Proceedings of the 2006 IEEE Congress on Evolutionary Computation,
pages 1151–1156, Vancouver, BC, Canada, 16-21 July 2006. IEEE Press. ISBN 0-7803-9487-9. URL http://ieeexplore.ieee.org/servlet/opac?punumber=11108.
K. Yanai and H. Iba. Estimation of distribution programming based on Bayesian network. In R. Sarker, et al., editors, Proceedings of the 2003 Congress on Evolutionary Computation CEC2003, pages 1618–1625, Canberra, 8-12 December 2003. IEEE Press. ISBN
0-7803-7804-0. URL http://www.iba.k.u-tokyo.ac.jp/papers/2003/yanaiCEC2003.
pdf.
GPBiB
K. Yanai and H. Iba. Program evolution by integrating EDP and GP. In K. Deb, et al.,
editors, Genetic and Evolutionary Computation - GECCO-2004, Part I, volume 3102 of Lecture Notes in Computer Science, pages 774–785, Seattle, WA, USA, 26-30 June
2004. Springer-Verlag. ISBN 3-540-22344-4. URL http://www.iba.k.u-tokyo.ac.jp/
papers/2004/yanaiGECCO2004.pdf.
GPBiB
M. Yanagiya. Efficient genetic programming based on binary decision diagrams. In 1995 IEEE Conference on Evolutionary Computation, volume 1, pages 234–239, Perth, Australia, 29 November - 1 December 1995. IEEE Press.
GPBiB
X. Yao, E. Burke, J. A. Lozano, J. Smith, J. J. Merelo-Guervós, J. A. Bullinaria, J. Rowe, P. Tiňo, A. Kabán, and H.-P. Schwefel, editors. Parallel Problem Solving from Nature - PPSN VIII, volume 3242 of LNCS, Birmingham, UK, 18-22 September 2004.
Springer-Verlag. ISBN 3-540-23092-0. URL http://www.springerlink.com/openurl.
asp?genre=issue&issn=0302-9743&volume=3242.
GPBiB
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 78; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 85; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
Index
% protected division, 22
22 bit parity, sub-machine-code GP, 93
abstraction operator, 48
active code, 102
adaptive market hypothesis, 123
ADATE, 48
ADF
crossover, 49
recursion prevention, 49
troubleshooting, 134
agent
evolutionary, 122
image processing, 122
social simulation, 123
aggregate fitness function, 75–76
analogue circuit evolution, 119
ant colony optimisation (ACO), 74
PEEL, 74
ant programming, generalised (GAP), 74
ant-TAG, 74
anytime fitness, 85
applications
arbitrage, 123
architecture, 76
art, 127
bin packing, 127
bioinformatics, 125–126
biology, 125
bomb disposal, 124
cellular automata, 119
chemical engineering, 124
chemistry, 126
circuits, 119
classification, 121
control, 119
curve fitting, 113
data compression, 128
data mining, 85, 125
data modelling, 113
economics, 123
entertainment, 127
exchange rate, 123
film industry, 128
finance, 123
gambling, 123
games, 120, 127
guidelines, 111
heuristics, 126
human competitive, 117
hyper-heuristics, 126
image processing, 121, 122
industry, 124
infrared spectra, 125
jet engine optimisation, 125
mechanics, 120
medical, 122, 125
meta-heuristics, 126
music, 128
neural networks, 121
nuclear reactor control, 124
numerical control, 124
OCR, 122
patents, 119
population size, 116
process control, 124
QSAR, 78
quantum computing, 118
robotics, 118, 119
SAT, 127
side effects, 113
signal processing, 121
stock market, 123
symbolic regression, 113
teaching aids, 124
TSP, 127
video, 129
watermarking, 122
arbitrage, 123
architecture
evolution, 50
program, 50
architecture altering operator, 48, 50
architecture design tool, 76
architecture-defining preparatory step, 50
arity
function, 11
node, 11
art, evolutionary, 127–128
artificial intelligence, 1
human competitive, 117–121
artificial neural network, 66, 121
auto parallelising code, 68
automatically defined function (ADF), 48
limitations, 49
primitive sets, 50
automatically defined iteration (ADI), 50
automatically defined loop (ADL), 50
automatically defined recursion (ADR), 50
automatically defined store (ADS), 50
Backgammon, 120, 127
backward chaining GP, 86
Bayesian automatic programming (BAP),
74
Bayesian optimisation algorithm (BOA), 70
Beagle, Open, C++ implementation, 148
beowulf cluster, 95
bias induced by constraints, 55
bibliography, GP, 148
bin packing problem, 127
bioinformatics, 125, 126
biological system, 125
bloat, 97, 101
control, 76
crossover bias theory, 104
definition, 101
executable model, 102
initialisation effects, 40
lexicographic control, 77
lexicographic parsimony, 77, 78, 80
MDL, 107
multi-objective, 77
mutation effects, 42
nature of program search spaces theory, 102
Pareto control, 78
parsimony pressure, 106
practical effects, 101
removal bias theory, 102
replication accuracy theory, 102
size evolution equation, 103
Tarpeian method, 106
vs. growth, 101
boids, 128
bomb disposal, 124
books, GP, 146
branching factor, 15
bulletin board, 149
cache
data, 86
Cartesian GP, 67
GPU, 90
multi-objective, 81
CCNOT, distribution of circuits, 99
cellular automaton, 68
evolution, 119
cellular encoding, 57
check pointing, 85
Checkers, 127
chemical engineering, 124
chemistry, 78, 126
Chess, 120, 127
circuit design
multi-objective, 76
circuit, analogue
evolution, 119
classification, 132
cellular automaton, majority vote, 68
handwriting, 122
infrared images, 121
M25, 121
optical, 122
PADO, 67
SAR radar, 121
sonar, 122
closure, 21–22
cluster computing, 95
co-evolution, 46
code
active, 102
inactive, 102
communication topology
ring, 94
toroidal grid, 94
compiled vs. interpreted GP, 24
GPU, 90
compiling GP populations, 90
comprehensible programs, 51
compression
fractal, 129
image, 128
lossless, 129
lossy, 128, 129
sound, 128
video, 129
wavelet, 129
computational chemistry, 126
computer art, 127–128
computer game, 127
computer program, evolution, 2
toroidal grid, 94
density of solutions, 57
depth
node, 12
tree, 12
depth limits, 104
derivation tree, 54
developmental GP, 23, 57
direct problem, 112
directed acyclic graph (DAG), 87
DirectX, 90
disassortative mating, 82
Discipulus, 63, 124
discussion group, 149
distributed
evolutionary algorithm, 88–95
GP, 93–95
populations
fine-grained, 94
geographically, 93–95
ring topology, 94
toroidal grid, 94
distribution
sampling, 69
diversity, 94
promotion, multi-objective GP, 78
Draughts, 127
dynamic fitness, 84
dynamic size or depth limits, 105
dynamic subset selection (DSS), 85
EA, 1
EC, 1
ECJ, Java implementation, 148
economic modelling, 123–124
editing operator, 46
efficient market hypothesis, 123
elitism, troubleshooting, 137
Elvis robot, 115
embarrassingly parallel, 89
engine monitoring and control, 121
entropy, 129
ephemeral random constant, 20
example, 29
estimation of distribution algorithms
(EDAs), 69
estimation of distribution programming
(EDP), 72
evaluation safety, 22
even parity, 78
evolutionary algorithm (EA), 1
distributed, 88–95
evolutionary art, 128
evolutionary computation (EC), 1
evolutionary music, 128
[Margin figure (see Sec. B.4): the example GP run on sin(x) at generation 92; plots of the best-of-generation program against sin(x) and of average size, average fitness and best fitness over 100 generations.]
evolutionary quantum computing, 120
evolutionary search, mathematical models,
98
evolving agents, 122
evolving designs, 23
example
crossover, 34
ephemeral random constant, 29
fitness, 30
function set, 30
mutation, 33
parameter, 30
terminal set, 29
termination, 31
exceptions
integer overflow, 22
problems caused by trapping, 22
exchange rate, 123
executable model, bloat, 102
expression simplification, 135
extended compact genetic algorithm
(eCGA), 70
extended compact GP (eCGP), 72
farming in parallel GP, 89
feature selection, 115
field programmable gate array (FPGA), 92
films, feature, 128
films, GP, 146
financial time series prediction, 123–124
fine-grained distributed GP, 94
finite state automata, 66
fitness, 2, 24–26
anytime, 85
case, 26
reduction, 83
combined objectives, 75–76
dynamic, 84
dynamic, multi-objective, 80–81
example, 30
fast
FPGA, 92
GPU, 90–92
sub-machine-code, 93
hits, 76
incremental, 58
multi-objective, 75–80
image processing, 122
RMS, 115
sharing, 77, 137
staged, 81, 85
static, problems with, 84
symbolic regression, 115
for, syntax constraint, 52
Fourier transform, quantum evolution, 120
FPGA, 92
GP implementation, 93
fractal compression, 129
freeware, 148
ECJ, 148
GPC++, 148
Lil-GP, 148
Open Beagle, 148
TinyGP, 151–162
frequency of primitives, 136
full random trees, 12–13
function arity, 11
function set, 19–23
evolving non-programs, 23
example, 30
modelling, 115
side effects, 24
sufficiency, 22–23
function-defining branch, 48
gambling, 123
game theory, 123
games, 127
generalisation-accuracy tradeoff, 107
generalised ant programming (GAP), 74
genetic operator, 2
rates, 26, 116
genetic program representation, 9
GENR8, 76
geographically distributed GP, 93–95
GP implementations, 147–148
TinyGP, 151–162
GP problem solver (GPPS), 51
GP-ZIP, 130
GPC++, implementation, 148
GPU, 90
speedup factor, 92
grammar, 53
based constraint, 53
based GP, 53
initialisation, 57
operators, 57
context-sensitive, 55
tree adjoining, 55
grammar model based program evolution
(GMPE), 74
grammatical evolution, 55
troubleshooting, 134
graph, directed acyclic (DAG), 87
graph-based GP, 65–68
graphics processing unit (GPU), 90
grow random trees, 12–14
tree size bias, 13
growth vs. bloat, 101
Java, 64
T7, 64
SIMD and GPU, 92
introns
troubleshooting, 136
useful with mutation, 64
invention, evolution of, 119
inverse kinematics, 115
inverse problem, 112
iterated functions system (IFS), 128
jet engine optimisation, 125
journals, 147
kinematics, 120
L-system, 74, 76
tree adjoining grammar, 58
Lagrange
distribution of the second kind, 104
initialisation, 41
large populations, 137
learning, machine (ML), 1
lexicographic
parsimony bloat control, 77, 80
preference, 77, 80
libraries, dynamic, 47, 48
Lil-GP, 148
limits, size and depth, 104
Lindenmayer grammar, see L-system
linear GP, 61–64
Cartesian GP, 67
crossover, 64
homologous, 64
instruction format, 62, 63
interpreted, 62, 64
introns, 64
removal, 64
Java, 64
mutation, 64
speed, 62
T7, 64
linear representation, 61–64
Cartesian GP, 67
linearised tree-based GP
prefix, 62
reverse polish, 92
logic network, evolution, 66
lossless compression, 129
lossy compression, 128
machine code GP, 62–64
Intel x86, 63
SPARC, 62, 63
Z80, 62, 63
[Figure: snapshot of the example run at generation 99 — sin(x) versus the evolved GP approximation, plus average program size and average/best fitness over the generations (see Sec. B.4).]
machine intelligence, 1
human-competitive, 141
machine learning (ML), 1
mailing list, GP, iii, 149
Markov chain model
evolutionary algorithms, 98
GP, 98
program execution, 100
master–slave GP, 89
max problem, 72
medical imaging, 122
message passing interface (MPI), 95
meta-heuristics, 126
meta-optimising semantic evolutionary
search (Moses), 72
migration rate, 94
MIMIC, 70
minimum description length (MDL), 46,
107
model car racing, 127
model, executable, 102
modular structure, 47
modules, 59
Moore's Law, 90
MPI, 95
multi-level type systems, 53
multi-objective fitness, 75–76
image processing, 122
multi-objective GP (MO GP), 41, 75
jets, 125
operator pressure, 81
Pareto dominance, 76–80
small vs. good, 81
multi-objective optimisation (MOO), 75
data visualisation, 80
preference information, 77
multiple typed programs, 21–22
music, evolutionary, 128
mutation, 2
constants, 43
dynamic libraries, 48
example, 33
hoist, 43, 106
local search, 43
node replacement, 43
permutation, 43
point, 16, 43
rate, 17
shrink, 43, 106
simulated annealing, 44
size-fair, 42, 105
subtree, 16
survey, 42–44
swap, 43
N-gram GP, 72
NAND, distribution of circuits, 99
nature of program search spaces theory, 102
neural network evolution, 121
with PDGP, 66
niching, 77
NLP
parsing and tagging, multi-objective
GP, 79
text retrieval, multi-objective GP, 79
node
arity, 11
depth, 12
replacement mutation, 43
non-terminal symbols, 53
non-Turing complete program, theory, 99
NSGA-II, extension to GP, 79
nuclear reactor control, 124
numeric regression, 30
numerical control, 124
Odin, 51
one-max, 72
one-point crossover, 44
theory, 98
OpenGL, 92
operator
architecture-altering, 50
composition, 17
constrained, 53
crossover, 2, 34
editing, 46
genetic, 2
grammar-based GP, 57
mutation, 2
rate, 17
reorganisation of subtrees, 46
repair, 81
reproduction, 17, 33
optical character recognition (OCR), 122
Oscar, 128
Othello, 127
over-fitting, 46, 139, 140
dynamic fitness function, 84
overflow, numeric, 22
Pac-Man, Ms, 127
PADO
anytime programming, 67
random access memory, 67
panmictic population, 137
Paragen, 68
parallel computing, 88
parallel distributed GP (PDGP), 65
ADF, 66
link set, 66
no side effects, 67
parallel evolutionary algorithm, 88–95
parallel GP, 89, 94
parallel processing, 90
parallel virtual machine (PVM), 95
parameter
quadratic polynomial example, 30
values, 26–27, 116
Pareto
criterion, 106
dominance, 76–80
relaxation, 78
front, 77
preference, 80
tournament, 78
Parisian GP, 122
parsimony
coefficient, 106
difficulties, 107
dynamic, 106, 107
lexicographic, 77
pressure, 106
covariant, 107
particle swarm optimisation (PSO), 127
patentable invention, 119
penalty bloat control method, 107
permutation mutation, 43
point mutation, 16, 43
Poker, 127
polymorphic types, 53
population size, 26–27
applications, 116
troubleshooting, 134
population variety, 136
population-based incremental learning
(PBIL), 70
post fix notation, 92
precision vs. recall, multi-objective GP, 79
premature convergence, 84
preparatory step
architecture-defining, 50
fitness function, 24
function set, 20
parameters, 26
terminal set, 19
termination, 27
primitive set, 9, 19–23
evolving non-programs, 23
modelling, 115
side effects, 20, 24
vector-based, 51
primitives, number in population, 136
probabilistic incremental program evolution (PIPE), 71
probability distribution
sampling, 69
process control, 124–125
production rule, 53
program
architecture, 50
evolution, 50
evolution, 2
human understandability, 51
program evolution with explicit learning
(PEEL), 74
program execution, 24
programmatic compression, 128, 129
proportional integrative and derivative
(PID) controller, 119
proposals to funding agencies, 139
protected
MOVE AHEAD, 22
division, 22
other operations, 22
pruning neural networks, 82
PushGP, 59
PVM, 95
pygmies and civil servants, 81
QSAR, multi-objective, 78
quantum
algorithm evolution, 118
computing, 118, 120
Fourier transform evolution, 120
quantum programs, distribution of, 99
ramped half-and-half, 11, 13
problems with, 40
ramped uniform initialisation, 40
RapidMind, 92
rate
crossover, 17
mutation, 17
operator, 17
rational allocation of trials, 84
recurrent transition network, evolution, 66
recursion, 59
regression, 114
numeric, 30
symbolic, 30
applications, 113–116
removal bias theory, 102
reorganisation of subtrees, 46
repair operator, 81
replication accuracy theory, 102
representation
GP, 9
graph-based, 6568
prefix notation, 10
reverse polish, 92
syntax tree, 9
tree-based in GP, 10
reproduction operator, 17
in example, 33
result-producing branch, 48
reusable component, 48
reverse polish representation, 92
reversible computing, distribution of, 99
rewrite rule, 53
ring species, 88
RoboCup, 118
robot
control, 119
Elvis, 115
football, 127
run time errors, avoiding, 22
run time exceptions, problems, 22
sampling
probability distribution, 69
SAT, 127
schema theory, 98
search
evolutionary models, 98
stochastic, 97
search space, 24, 97, 99
seeding, 41
selection of parents
tournament, 14–15
backward chaining GP, 86
semantic constraints, 56
sensor, soft, 114
seti@home, 95
shrink mutation, 43, 106
side effects, 20, 24
applications, 113
signal processing, 121–122
SIMD, 90
simple generics, 53
simulated annealing, 44, 46
single typed programs, 21, 51
size
evolution equation, 103
Lagrange distribution, 104
limits, 104
tree, 12
size-fair
crossover, 45, 105
theory, 98
mutation, 42, 105
soft sensor, 114
solution density, 57
sound compression, 128
spatial separation, 94
SPEA2, 78
species
formation, 88
ring, 88
speedup factor, GPU, 92
speedup techniques, 83
staged fitness function, 81
static fitness, problems with, 84
stochastic computing, distribution of, 99
stochastic context-free grammar, 74
stochastic search, 97
stock market, 123
Stroganoff, 43
strongly typed GP, 52
theory, 98
sub-machine-code GP (SMCGP), 93
sub-population, 93
submitting to a journal, 139
subtree
crossover, 15
mutation, 16
reorganisation, 46
sufficiency, 22–23
supercomputer, 89, 95
swap mutation, 43
swarm, 122
symbolic regression, 30
applications, 113–116
syntactic constraints, 56
syntax tree, 9
T7 programming language, 64, 100
take over time, 84
Tarpeian bloat control, 106
teaching aids, 124
terminal set, 19–20
evolving non-programs, 23
example, 29
modelling, 115
side effects, 20, 24
sufficiency, 22–23
terminals of a grammar, 53
termination
criterion, 27
example, 31
theory, 97
bloat, 104
crossover
bias, 104
context-preserving, 98
homologous, 98
one-point, 98
size-fair, 98
non-Turing complete program, 99
schema, 98
search spaces, 99
thesis write up, 139
time series, financial, 123–124
TinyGP, 151–162
Toffoli, distribution of circuits, 99
tournament selection, 14–15, 86
Transputer, 94
trap function, 72
travelling salesman problem (TSP), 127
tree
depth, 12
derivation, 54
editing, 46
size, 12
tree adjoining grammar (TAG), 55
tree-based representation, 10
Tron, 127
Turing complete GP, 64
Turing complete program, theory, 99
Turing test, 117, 142
type
consistency, 21–22, 51
conversion, 51
initialisation, 56
more understandable programs, 51
multiple, 52
single, 51
strong, 52
system
higher-order, 53
multi-level, 53
unexploded bombs, 124
uniform crossover, 44
uniform initialisation, 40
uniform multivariate distribution algorithm
(UMDA), 70
validating results, 132
variety, of population, 136
vector-based primitive, 51
VHDL, multi-objective GP, 79
video compression, 129
videos, GP, 146
virtual reality data visualisation, 80
watermark security
application, 122
multi-objective GP, 79
wavelet lossy compression, 129
world wide GP, 95
Wright's geographic model of evolution, 88
Colophon
This book was primarily written using the LaTeX document preparation
system, along with BibTeX, pdflatex and makeindex. Most of the editing
was done using the emacs and xemacs editors, along with extensions such
as RefTeX; some was done with TeXShop as well. Most of the data plots
were generated using gnuplot and the R statistics package. Diagrams were
generated with a variety of tools, including the Graphviz package, tgif and
xfig. A whole host of programming and scripting languages were used to
automate various processes in both the initial scientific research and in the
production of this book; they are too numerous to list here, but were crucial
nonetheless. The cover was created with Adobe Photoshop and gimp.
Coordinating the work of three busy, opinionated authors is not trivial,
and would have been much more difficult without the use of revision control
systems such as Subversion. Around 500 commits were made in a six-month
period, averaging around 10 commits per day in the final weeks. The actual
files were hosted as a project at http://assembla.com; we didn't realise
until several months into the project that Assembla's president is in fact
Andy Singleton, who did some cool early work in GP in the mid-90s.
The reviews and summaries on the back cover were generated
stochastically using the idea of N-grams from linguistics. For the reviews
we collected a number of reviews of previous books on GP and EAs, and
tabulated the frequency of different triples of adjacent words. These frequencies of triples in the source text were then used to guide the choices of
words in the generated reviews. The only word following the pair “ad”
and “hoc” in our source reviews, for example, was “tweaks”; thus once “ad”
and “hoc” had been chosen, the next word had to be “tweaks”. The pair
“of the”, on the other hand, appears numerous times in our source text,
followed by words such as “field”, “body”, and “rapidly”. However, “theory”
is the most common successor, and, therefore, the most likely to be chosen
to follow “of the” in the generation of new text. The generation of the
summaries was similar, but based on the front matter of the book itself.
See (Poli and McPhee, 2008a) for an application of these ideas in genetic
programming.
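As an aside, the trigram procedure sketched above is simple enough to capture in a few dozen lines of code. The listing below is only an illustration of the idea, not the script that was actually used for the book: the class name, the naive whitespace tokenisation and the starting word pair are all invented for the example, and Java is used simply to match the TinyGP listing elsewhere in the book.

import java.util.*;

// Minimal sketch of trigram-based text generation: tabulate the frequency of
// adjacent word triples in a source text, then generate new text by repeatedly
// sampling a third word given the previous two, weighted by triple frequency.
public class TrigramGenerator {
    // Maps a word pair "w1 w2" to its observed successors and their counts.
    private final Map<String, Map<String, Integer>> triples = new HashMap<>();
    private final Random rng = new Random();

    // Tabulate all adjacent word triples in the source text.
    public void train(String sourceText) {
        String[] w = sourceText.trim().split("\\s+");
        for (int i = 0; i + 2 < w.length; i++) {
            String pair = w[i] + " " + w[i + 1];
            triples.computeIfAbsent(pair, k -> new HashMap<>())
                   .merge(w[i + 2], 1, Integer::sum);
        }
    }

    // Generate up to 'length' further words, starting from the given pair.
    public String generate(String w1, String w2, int length) {
        StringBuilder out = new StringBuilder(w1 + " " + w2);
        for (int i = 0; i < length; i++) {
            Map<String, Integer> successors = triples.get(w1 + " " + w2);
            if (successors == null) break;   // dead end: pair never seen in training
            String next = sample(successors);
            out.append(' ').append(next);
            w1 = w2;
            w2 = next;
        }
        return out.toString();
    }

    // Pick a successor with probability proportional to its observed frequency.
    private String sample(Map<String, Integer> counts) {
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        int r = rng.nextInt(total);
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            r -= e.getValue();
            if (r < 0) return e.getKey();
        }
        throw new IllegalStateException("unreachable");
    }

    public static void main(String[] args) {
        TrigramGenerator g = new TrigramGenerator();
        g.train("the only word following the pair ad hoc was tweaks in the source text");
        System.out.println(g.generate("the", "only", 10));
    }
}

Trained on real review text rather than the toy sentence above, the generated output wanders plausibly from one familiar phrase to the next, precisely because each word is chosen only on the basis of the two words that precede it.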