Genetic Programming Slides
Krzysztof Krawiec
Tomasz Pawlak
Introduction
Outline and objectives
1 Introduction: GP as a variant of EC
2 Specific features of GP
3 Variants of GP
4 Applications
5 Some theory
6 Case studies
Bibliography
Acknowledgment
Credits go to:
A Field Guide to Genetic Programming http://www.gp-field-guide.org.uk/
[42]
Background
Evolutionary Computation (EC)
Formulation:
$p^* = \arg\max_{p \in P} f(p)$
where
$P$ is the considered space (search space) of candidate solutions (solutions for
short)
$f$ is a (maximized) fitness function
$p^*$ is an optimal solution (an ideal) that maximizes $f$.
Generic evolutionary algorithm
[Figure: the evolutionary algorithm as a loop over a population P of individuals: initialization of population P, evaluation of each solution/individual s by the fitness function f (giving f(s)), check of termination criteria, selection.]
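The loop in the figure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the slides' algorithm: the tournament selection of size 2, the mutation-only variation step, and all names (`evolve`, `init`, `mutate`, `target`) are hypothetical choices made for the example.

```python
import random

def evolve(init, fitness, mutate, pop_size=20, generations=50, target=None, rng=None):
    """Generic evolutionary loop: initialize, evaluate, test termination, select, vary."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    population = [init(rng) for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        if target is not None and fitness(best) >= target:
            break  # termination criterion met
        # Selection: size-2 tournaments (an assumption; any scheme fits here).
        parents = [max(rng.sample(population, 2), key=fitness) for _ in range(pop_size)]
        # Variation: mutation only, to keep the sketch short.
        population = [mutate(p, rng) for p in parents]
        best = max(population + [best], key=fitness)
    return best

# Toy usage: maximize f(p) = -(p - 3)^2 over real-valued p.
def f(p):
    return -(p - 3.0) ** 2

best = evolve(init=lambda r: r.uniform(-10, 10),
              fitness=f,
              mutate=lambda p, r: p + r.gauss(0, 0.5))
```

Because `best` is tracked across generations, the returned solution is never worse than the best member of the initial population.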
[Unique] characteristic of EC
What is genetic programming?
In a nutshell:
A variant of EC where the genotypes represent programs, i.e., entities capable of
reading in input data and producing some output data in response to that input.
Fitness function f measures the similarity of the output produced by the program
to the desired output, given as a part of task statement.
Standard representation: expression trees.
The set of program inputs $I$, even if finite, is usually so large that running each
candidate solution on all possible inputs becomes intractable.
GP algorithms typically evaluate solutions on a sample $I' \subset I$, $|I'| \ll |I|$, of possible
inputs, and fitness is only an approximate estimate of solution quality.
The task is given as a set of fitness cases, i.e., pairs $(x_i, y_i) \in I \times O$, where $x_i$
usually comprises one or more independent variables and $y_i$ is the output variable.
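A small Python sketch may make the role of fitness cases concrete. It assumes a hypothetical encoding of expression trees as nested tuples `(op, left, right)` with leaves `"x"` or numeric constants, and it uses the negated total absolute error as the maximized fitness; neither choice comes from the slides.

```python
import operator

# Hypothetical instruction set for this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def run(program, x):
    """Evaluate an expression tree given as nested tuples, e.g. ("+", "x", 1.0)."""
    if isinstance(program, tuple):
        op, left, right = program
        return OPS[op](run(left, x), run(right, x))
    if program == "x":
        return x          # the single independent variable
    return program        # a numeric constant

def fitness(program, cases):
    """Negated total absolute error over the fitness cases (higher is better)."""
    return -sum(abs(run(program, x) - y) for x, y in cases)

# A finite sample of inputs with desired outputs for the target x^2 + x + 1.
xs = [i / 10 for i in range(-10, 11)]
cases = [(x, x * x + x + 1) for x in xs]

perfect = ("+", ("+", ("*", "x", "x"), "x"), 1.0)
approx = ("+", "x", 1.0)
```

Here `fitness(perfect, cases)` is zero up to rounding, while `approx` is penalized by the sum of x^2 over the sample; a program that matches every sampled case may still be wrong on inputs outside the sample.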
The candidate solutions in GP are assembled from elementary entities called
instructions.
Part of the formulation of a GP task is therefore also an instruction set I, i.e., a set of
symbols used by the search algorithm to compose the programs (candidate
solutions).
Design of I usually requires some background knowledge.
In particular, it should comprise all instructions necessary to find a solution to the
problem posed (closure).
function Crossover(p1, p2)
    repeat
        s1 ← Random node in p1
        s2 ← Random node in p2
        (p1', p2') ← Swap subtrees rooted in s1 and s2
    until Depth(p1') < dmax ∧ Depth(p2') < dmax    ▷ dmax is the tree depth limit
    return (p1', p2')
end function
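The crossover pseudocode can be sketched in Python as follows. The tree encoding (nested tuples `(op, child, child)` with string or numeric leaves), the path-based node addressing, and all helper names are assumptions made for this illustration. The loop terminates with probability 1 as long as both parents are themselves shallower than `d_max`, since swapping the two roots always yields valid offspring.

```python
import random

def paths(tree, path=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, path + (i,))

def get(tree, path):
    """Return the subtree rooted at the given path."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def depth(tree):
    """Depth of a tree; a single leaf has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

def crossover(p1, p2, d_max, rng=random):
    """Swap random subtrees of p1 and p2, retrying until both offspring respect d_max."""
    while True:
        s1 = rng.choice(list(paths(p1)))
        s2 = rng.choice(list(paths(p2)))
        c1 = replace(p1, s1, get(p2, s2))
        c2 = replace(p2, s2, get(p1, s1))
        if depth(c1) < d_max and depth(c2) < d_max:
            return c1, c2
```

Because trees are immutable tuples, the parents are left untouched and the offspring share unchanged subtrees with them.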
function Mutation(p, I)
    repeat
        s ← Random node in p
        s' ← RandomProgram(I)
        p' ← Replace the subtree rooted in s with s'
    until Depth(p') < dmax    ▷ dmax is the tree depth limit
    return p'
end function
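A matching Python sketch of subtree mutation is given below. As before, the nested-tuple tree encoding is an assumption, and so are the instruction set (`FUNCTIONS`, `TERMINALS`), the grow-style `random_program` generator with its 0.3 leaf probability, and the `new_depth` parameter. The retry loop terminates as long as `d_max` exceeds `new_depth`, because replacing the root always yields a sufficiently shallow tree.

```python
import random

FUNCTIONS = ["+", "-", "*"]   # hypothetical binary instructions
TERMINALS = ["x", 1.0]        # hypothetical terminals

def random_program(rng, max_depth):
    """Grow a random tree of depth at most max_depth."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            random_program(rng, max_depth - 1),
            random_program(rng, max_depth - 1))

def paths(tree, path=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def depth(tree):
    """Depth of a tree; a single leaf has depth 0."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

def mutation(p, d_max, rng=random, new_depth=2):
    """Replace a random subtree of p with a random program, respecting d_max."""
    while True:
        s = rng.choice(list(paths(p)))
        s_new = random_program(rng, new_depth)
        child = replace(p, s, s_new)
        if depth(child) < d_max:
            return child
```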
Objective: Find a program whose output matches $x^2 + x + 1$ over the range [−1, 1].
Such tasks can be considered as a form of regression.
As solutions are built by manipulating code (instructions), this is referred to as
symbolic regression.
Instruction set:
Nonterminal (function) set: +, -, % (protected division), and ∗; all operating on
floats
Terminal set: x, and constants chosen randomly between -5 and +5
Selection: fitness proportionate (roulette wheel), non-elitist
Initial pop: ramped half-and-half (depth 1 to 2; 50% of terminals are constants)
(to be explained later)
Parameters:
population size 4,
50% subtree crossover,
25% reproduction,
25% subtree mutation, no tree size limits
Termination: when an individual with fitness better than 0.1 is found
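Three ingredients of this setup can be sketched in Python: protected division, roulette-wheel selection, and the termination test. The sketch assumes that fitness here is an error to be minimized, that protected division returns 1 for a zero divisor (a common convention, not stated on the slide), and that selection weights are obtained as 1 / (1 + error); all function names are hypothetical.

```python
import random

def protected_div(a, b):
    """Protected division (%): return 1.0 instead of failing when the divisor is zero."""
    return a / b if b != 0 else 1.0

def roulette_select(population, errors, rng=random):
    """Fitness-proportionate selection; lower error gives a larger slice of the wheel."""
    weights = [1.0 / (1.0 + e) for e in errors]  # assumed error-to-weight mapping
    return rng.choices(population, weights=weights, k=1)[0]

def terminated(errors, threshold=0.1):
    """Stop as soon as some individual has an error better (lower) than the threshold."""
    return any(e < threshold for e in errors)

# Usage with a population of four labelled individuals, as in the example run.
population = ["a", "b", "c", "d"]
errors = [7.7, 11.0, 17.98, 28.7]   # illustrative values only, not taken from the slides
parent = roulette_select(population, errors, rng=random.Random(0))
```

Since selection is non-elitist, the best individual of a generation is not guaranteed to survive into the next one.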
Assume:
a gets reproduced
c gets mutated (at locus 2)
a and d get crossed-over
a and b get crossed over
Population 1:
The solutions evolving under the selection pressure of the fitness function are
themselves functions (programs).
GP operates on symbolic structures of varying lengths.
There are no variables for the algorithm to operate on (at least in the common
sense).
The program can be tested only on a limited number of fitness cases (tests).
http://www.genetic-programming.com/johnkoza.html
GP Tree Representations
Set-based Strongly-Typed Genetic Programming
Ephemeral Random Constants
Automatically Defined Functions and Automatically Defined Macros
Multiple tree forests
Six tree-creation algorithms
Extensive set of GP breeding operators
Grammatical Encoding
Eight pre-done GP application problem domains (ant, regression, multiplexer,
lawnmower, parity, two-box, edge, serengeti)
Standard output:
java ec.Evolve -file ./ec/app/regression/quinticerc.params
...
Threads: breed/1 eval/1
Seed: 1427743400
Job: 0
Setting up
Processing GP Types
Processing GP Node Constraints
Processing GP Function Sets
Processing GP Tree Constraints
{-0.13063322286594392,0.016487577414659428},
{0.6533404396941143,0.1402200189629743},
{-0.03750634856569701,0.0014027712093654706},
...
{0.6602806044824949,0.13869498395598084},
Initializing Generation 0
Subpop 0 best fitness of generation: Fitness: Standardized=1.1303205 Adjusted=0.46941292 H
Generation 1
Subpop 0 best fitness of generation: Fitness: Standardized=0.6804932 Adjusted=0.59506345 H
...
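The two fitness values printed per generation in the log above are consistent with Koza's adjusted fitness, adjusted = 1 / (1 + standardized). The short Python check below reproduces the logged pairs under that assumption; the function name is hypothetical and the tolerance allows for the single-precision rounding visible in the log.

```python
def adjusted_fitness(standardized):
    """Koza-style adjusted fitness: maps an error in [0, inf) to a score in (0, 1]."""
    return 1.0 / (1.0 + standardized)

# (standardized, adjusted) pairs copied from the log lines above.
logged = [(1.1303205, 0.46941292), (0.6804932, 0.59506345)]

for standardized, adjusted in logged:
    # Agreement up to rounding of the printed values.
    assert abs(adjusted_fitness(standardized) - adjusted) < 1e-6
```

A standardized fitness of 0 (a perfect program) maps to an adjusted fitness of exactly 1.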
Game-Playing
TORCS Car Racing [53]
Ms PacMan [10]
Othello [30]
Chessboard Evaluation [47]
Backgammon [47]
Mario [52]
NP-Complete Puzzles [15]
Robocode [47]
Rush Hour [47]
Checkers [47]
Freecell [47]
Dynamic Optimisation
Dynamic Symbolic Regression [38, 39, 54]
Dynamic Scheduling [13]
Traditional Programming
Sorting [16, 1]
How should the particular operators coexist in an evolutionary process? In other words:
Or:
pop.subpop.0.species.pipe.num-sources = 2
pop.subpop.0.species.pipe.source.0 = ec.gp.koza.CrossoverPipeline
pop.subpop.0.species.pipe.source.0.prob = 0.9
pop.subpop.0.species.pipe.source.1 = ec.gp.koza.MutationPipeline
pop.subpop.0.species.pipe.source.1.prob = 0.1
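The intent of this configuration, as read here, is that each offspring is produced by crossover with probability 0.9 and by mutation with probability 0.1. The Python sketch below illustrates that reading only; it is not ECJ's implementation, and the names `SOURCES` and `pick_source` are hypothetical.

```python
import random

# Operator names and probabilities mirroring the two configured sources.
SOURCES = [("crossover", 0.9), ("mutation", 0.1)]

def pick_source(rng=random):
    """Choose which breeding operator produces the next offspring."""
    names, probs = zip(*SOURCES)
    return rng.choices(names, weights=probs, k=1)[0]

# Usage: empirical operator frequencies over many draws.
rng = random.Random(42)
draws = [pick_source(rng) for _ in range(10000)]
crossover_share = draws.count("crossover") / len(draws)
```

With this scheme the operators are mutually exclusive per offspring: an individual is the result of either crossover or mutation, never both in sequence.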
Average number of nodes per generation in a typical run of GP solving the Sextic
problem $x^6 - 2x^4 + x^2$.
Variants of GP
Strongly typed GP (STGP)
Motivation:
Tree-like structures are not natural for contemporary hardware architectures
Program = a sequence of instructions
Data passed via registers
Pros:
Directly portable to machine code, fast execution.
Natural correspondence to standard (GA-like) crossover operator.
Applications: direct evolution of machine code [36].
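[Figure: a linear program as a sequence of instructions O1, O2, O3, O4 operating on registers r1, r2, r3, with labels x1, x2, x3 on the input side and g1, g2, g3 on the output side.]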
Linear GP
Stack-based GP
Push: Example
Program:
( 2 3 INTEGER.* 4.1 5.2 FLOAT.+ TRUE FALSE BOOLEAN.OR )
Initial stack states:
BOOLEAN STACK: ()
CODE STACK: ( 2 3 INTEGER.* 4.1 5.2 FLOAT.+ TRUE FALSE BOOLEAN.OR )
FLOAT STACK: ()
INTEGER STACK: ()
Stack states after program execution:
BOOLEAN STACK: ( TRUE )
CODE STACK: ( ( 2 3 INTEGER.* 4.1 5.2 FLOAT.+ TRUE FALSE BOOLEAN.OR ) )
FLOAT STACK: ( 9.3 )
INTEGER STACK: ( 6 )
http://hampshire.edu/lspector/push3-description.html
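The stack behaviour in this example can be imitated with a very small interpreter. The Python sketch below is an illustration only and covers just the three instructions used above on a flat program; it omits the CODE stack, nested lists, and the rest of the Push instruction set, and the function name `run_push` is hypothetical.

```python
def run_push(program):
    """Execute a flat Push-like program using one stack per data type."""
    stacks = {"INTEGER": [], "FLOAT": [], "BOOLEAN": []}
    instructions = {
        "INTEGER.*": ("INTEGER", lambda a, b: a * b),
        "FLOAT.+": ("FLOAT", lambda a, b: a + b),
        "BOOLEAN.OR": ("BOOLEAN", lambda a, b: a or b),
    }
    for token in program:
        if isinstance(token, str):
            type_name, fn = instructions[token]
            stack = stacks[type_name]
            if len(stack) >= 2:      # with too few arguments the instruction is a no-op
                b, a = stack.pop(), stack.pop()
                stack.append(fn(a, b))
        elif isinstance(token, bool):    # test bool first: bool is a subclass of int
            stacks["BOOLEAN"].append(token)
        elif isinstance(token, int):
            stacks["INTEGER"].append(token)
        elif isinstance(token, float):
            stacks["FLOAT"].append(token)
    return stacks

result = run_push([2, 3, "INTEGER.*", 4.1, 5.2, "FLOAT.+", True, False, "BOOLEAN.OR"])
```

Literals are routed to the stack of their own type, and each instruction takes its arguments from, and returns its result to, the stack named in its prefix, which is why a single program can interleave integer, float, and Boolean code without type errors.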
Other variants of GP
Graph-based GP
Motivation: standard GP cannot reuse subprograms (within a single program)
Example: Cartesian Genetic Programming
Simple EDA-like GP: PIPE
Probabilistic Incremental Program Evolution [44]
Applications of GP
Review
2 Koza, J. R., Keane, M. A., Streeter, M. J., Mydlowec, W., Yu, J., Lanza, G., 2003. Genetic
Programming IV: Routine Human-Competitive Machine Intelligence. Kluwer Academic Publishers.
3 Arcuri, A., Yao, X., A novel co-evolutionary approach to automatic software bug fixing. In: Wang,
J. (Ed.), 2008 IEEE World Congress on Computational Intelligence. IEEE Computational Intelligence
Society, IEEE Press, Hong Kong.
4 Schmidt, M., Lipson, H., 3 Apr. 2009. Distilling free-form natural laws from experimental data.
Science 324 (5923), 81-85.
Humies
(...) Entries were solicited for cash awards for human-competitive results that were
produced by any form of genetic and evolutionary computation and that were published
Selected Gold Humies
2004: Jason D. Lohn, Gregory S. Hornby, Derek S. Linden, NASA Ames Research
Center,
An Evolved Antenna for Deployment on NASA's Space Technology 5 Mission
http://idesign.ucsc.edu/papers/hornby_ec11.pdf
Selected Gold Humies using GP
2008: Lee Spector, David M. Clark, Ian Lindsay, Bradford Barr, Jon Klein
Genetic Programming for Finite Algebras
2010: Natalio Krasnogor, Paweł Widera, Jonathan Garibaldi
Evolutionary design of the energy function for protein structure prediction
GP challenge: evolving the energy function for protein structure prediction
Automated design of energy functions for protein structure prediction by means of genetic
programming and improved structure similarity assessment
2011: Achiya Elyasaf, Ami Hauptmann, Moshe Sipper
GA-FreeCell: Evolving Solvers for the Game of FreeCell
Other applications
Additional resources
The genetic programming `home page' (a little bit messy, but still valuable)
http://www.genetic-programming.com/
Bibliography
[1] A. Agapitos and S. M. Lucas.
Evolving Modular Recursive Sorting Algorithms.
In Proc. EuroGP, 2007.
[2] A. Arcuri and X. Yao.
A novel co-evolutionary approach to automatic software bug fixing.
In J. Wang, editor, 2008 IEEE World Congress on Computational Intelligence,
pages 162-168, Hong Kong, 1-6 June 2008. IEEE Computational Intelligence
Society, IEEE Press.
[3] M. Arganis, R. Val, J. Prats, K. Rodriguez, R. Dominguez, and J. Dolz.
Genetic programming and standardization in water temperature modelling.
Advances in Civil Engineering, 2009, 2009.
[4] A. Brabazon, M. O'Neill, and I. Dempsey.
An Introduction to Evolutionary Computation in Finance.
IEEE Computational Intelligence Magazine, 3(4):42-55, 2008.
[5] R. Bradley, A. Brabazon, and M. O'Neill.
Dynamic High Frequency Trading: A Neuro-Evolutionary Approach.
In Proc. EvoWorkshops, 2009.
[6] G. Cuccu and F. Gomez.
When novelty is not enough.
In Proc. EvoApplications, 2011.
[7] M. Daga and M. C. Deo.
Alternative data-driven methods to estimate wind from waves by inverse
modeling.
Natural Hazards, 49(2):293-310, May 2009.
[8] I. Dempsey, M. O'Neill, and A. Brabazon.
Adaptive Trading With Grammatical Evolution.
In Proc. CEC, 2006.
[9] G. Durrett, F. Neumann, and U.-M. O'Reilly.
Computational Complexity Analysis of Simple Genetic Programming On Two
Problems Modeling Isolated Program Semantics.
In Proc. FOGA, 2011.
[10] E. Galván-López, J. Swafford, M. O'Neill, and A. Brabazon.
Evolving a Ms. PacMan Controller Using Grammatical Evolution.
In Applications of Evolutionary Computation. Springer, 2010.
[11] J. V. Hansen, P. B. Lowry, R. D. Meservy, and D. M. McDonald.
Genetic Programming for Prevention of Cyberterrorism through Dynamic and
Evolving Intrusion Detection.
Decision Support Systems, 43:1362-1374, 2007.
[12] S. Harding, J. F. Miller, and W. Banzhaf.
Developments in Cartesian Genetic Programming: self-modifying CGP.
GPEM, 11:397-439, 2010.
[13] D. Jakobović and L. Budin.
Dynamic Scheduling with Genetic Programming.
In Proc. EuroGP, 2006.
[14] W. Jaśkowski, K. Krawiec, and B. Wieloch.
Evolving strategy for a probabilistic game of imperfect information using genetic
programming.
Genetic Programming and Evolvable Machines, 9(4):281-294, 2008.
[15] G. Kendall, A. Parkes, and K. Spoerer.
A Survey of NP-Complete Puzzles.
International Computer Games Association Journal, 31(1):13-34, 2008.
[16] K. E. Kinnear, Jr.
Evolving a Sort: Lessons in Genetic Programming.
In Proc. of the International Conference on Neural Networks, 1993.
[17] J. Koza.
A Genetic Approach to the Truck Backer Upper Problem and the Inter-twined
Spiral Problem.
In Proc. International Joint Conference on Neural Networks, 1992.
[18] J. R. Koza.
Genetic Programming: On the Programming of Computers by Means of Natural
Selection.
MIT Press, Cambridge, MA, USA, 1992.
[19] J. R. Koza.
Genetic Programming II: Automatic Discovery of Reusable Programs.
MIT Press, Cambridge, Massachusetts, May 1994.
[20] K. Krawiec.
Genetic programming-based construction of features for machine learning and
knowledge discovery tasks.
Genetic Programming and Evolvable Machines, 4:329-343, 2002.
[21] K. Krawiec and B. Bhanu.
Visual learning by evolutionary and coevolutionary feature synthesis.
IEEE Trans. on Evolutionary Computation, 11:635-650, October 2007.
DOI: 10.1109/TEVC.2006.887351.
[22] W. Langdon and W. Banzhaf.
Repeated Patterns in Genetic Programming.
Natural Computing, 7:589-613, 2008.
[23] W. B. Langdon.
Random search is parsimonious.
In E. Cantú-Paz, editor, Late Breaking Papers at the Genetic and Evolutionary
Computation Conference (GECCO-2002), pages 308-315, New York, NY, 9-13
July 2002. AAAI.
[24] W. B. Langdon and W. Banzhaf.
Repeated Sequences in Linear Genetic Programming Genomes.
Complex Systems, 15(4):285-306, 2005.
[25] W. B. Langdon and A. P. Harrison.
Evolving Regular Expressions for GeneChip Probe Performance Prediction.
In Proc. PPSN, pages 1061-1070, 2008.
[26] W. B. Langdon and R. Poli.
Foundations of Genetic Programming.
Springer-Verlag, 2002.
[27] W. B. Langdon, J. Rowsell, and A. P. Harrison.
Creating Regular Expressions as mRNA Motifs with GP to Predict Human Exon
Splitting.
In Proc. GECCO, 2009.
[28] W. B. Langdon, O. Sanchez Graillet, and A. P. Harrison.
Automated DNA Motif Discovery.
arXiv.org, 2010.
[29] M. Lones, A. Tyrrell, S. Stepney, and L. Caves.
Controlling Complex Dynamics with Artificial Biochemical Networks.
In Proc. EuroGP, pages 159-170, 2010.
[30] S. Lucas.
Othello Competition.
http://algoval.essex.ac.uk:8080/othello/html/Othello.html, 2012.
[Online; accessed 27-Jan-2012].
[31] S. Lucas.
The Physical Travelling Salesperson Problem.
http://algoval.essex.ac.uk/ptsp/ptsp.html, 2012.
[Online; accessed 27-Jan-2012].
[32] T. McConaghy.
FFX: Fast, Scalable, Deterministic Symbolic Regression Technology.
In Proc. GPTP, 2011.
[33] T. M. Mitchell.
Machine Learning.
McGraw-Hill, 1997.
[34] D. J. Montana.
Strongly typed genetic programming.
BBN Technical Report #7866, Bolt Beranek and Newman, Inc., 10 Moulton
Street, Cambridge, MA 02138, USA, 7 May 1993.
[35] M. Nicolau, M. Schoenauer, and W. Banzhaf.
Evolving Genes to Balance a Pole.
In Proc. EuroGP, 2010.
[36] P. Nordin and W. Banzhaf.
Genetic programming controlling a miniature robot.
In E. V. Siegel and J. R. Koza, editors, Working Notes for the AAAI Symposium
on Genetic Programming, pages 61-67, MIT, Cambridge, MA, USA, 10-12 Nov.
1995. AAAI.
[37] G. Olague and L. Trujillo.
Evolutionary-computer-assisted design of image operators that detect interest
points using genetic programming.
Image and Vision Computing, In Press, Accepted Manuscript, 2011.
[38] M. O'Neill, A. Brabazon, and E. Hemberg.
Subtree Deactivation Control with Grammatical Genetic Programming in
Dynamic Environments.
In Proc. CEC, 2008.
[39] M. O'Neill and C. Ryan.
Grammatical Evolution by Grammatical Evolution: The Evolution of Grammar
and Genetic Code.
In Proc. EuroGP, pages 138-149. Springer-Verlag, 5-7 Apr. 2004.
[40] M. Paterson, Y. Peres, M. Thorup, P. Winkler, and U. Zwick.
Maximum Overhang.
In Proc. 19th Annual ACM-SIAM Symposium on Discrete Algorithms, 2008.
[41] R. Poli and W. B. Langdon.
On the search properties of different crossover operators in genetic programming.
In J. R. Koza, W. Banzhaf, K. Chellapilla, K. Deb, M. Dorigo, D. B. Fogel, M. H.
Garzon, D. E. Goldberg, H. Iba, and R. Riolo, editors, Genetic Programming
1998: Proceedings of the Third Annual Conference, pages 293-301, University of
Wisconsin, Madison, Wisconsin, USA, 22-25 July 1998. Morgan Kaufmann.
[42] R. Poli, W. B. Langdon, and N. F. McPhee.
A field guide to genetic programming.
Published via http://lulu.com and freely available at
http://www.gp-field-guide.org.uk, 2008.
(With contributions by J. R. Koza).
[43] B. J. Ross and H. Zhu.
Procedural texture evolution using multiobjective optimization.
New Generation Computing, 22(3):271-293, 2004.
[44] R. P. Salustowicz and J. Schmidhuber.
Probabilistic incremental program evolution.
Evolutionary Computation, 5(2):123-141, 1997.
[45] M. Schmidt and H. Lipson.
Distilling free-form natural laws from experimental data.
Science, 324(5923):81-85, 3 Apr. 2009.
[46] S. Silva and L. Vanneschi.
State-of-the-Art Genetic Programming for Predicting Human Oral Bioavailability
of Drugs.
In Proc. 4th International Workshop on Practical Applications of Computational
Biology and Bioinformatics, 2010.
[47] M. Sipper.
Let the Games Evolve!
In Proc. GPTP, 2011.
[48] C. Sivapragasam, R. Maheswaran, and V. Venkatesh.
Genetic programming approach for flood routing in natural channels.
Hydrological Processes, 22(5):623-628, 2007.
[49] C. Sivapragasam, N. Muttil, S. Muthukumar, and V. M. Arun.
Prediction of algal blooms using genetic programming.
Marine Pollution Bulletin, 60(10):1849-1855, 2010.
[50] L. Spector.
Towards practical autoconstructive evolution: Self-evolution of problem-solving
genetic programming systems.
In R. Riolo, T. McConaghy, and E. Vladislavleva, editors, Genetic Programming
Theory and Practice VIII, volume 8 of Genetic and Evolutionary Computation,
chapter 2, pages 17-33. Springer, Ann Arbor, USA, 20-22 May 2010.
[51] L. Spector, C. Perry, and J. Klein.
Push 2.0 programming language description.
Technical report, School of Cognitive Science, Hampshire College, Apr. 2004.
[52] J. Togelius, S. Karakovskiy, J. Koutnik, and J. Schmidhuber.
Super Mario Evolution.
In Proc. IEEE Computational Intelligence and Games, 2009.
[53] TORCS: The Open Car Racing Simulator.
http://torcs.sourceforge.net/, 2012.
[54] L. Vanneschi and G. Cuccu.
Variable Size Population for Dynamic Optimization with Genetic Programming.
In Proc. GECCO, 2009.
[55] E. Vladislavleva, G. Smits, and D. Den Hertog.
Order of Nonlinearity as a Complexity Measure for Models Generated by Symbolic
Regression via Pareto Genetic Programming.
IEEE Trans EC, 13(2):333-349, 2009.
[56] N. Wagner, Z. Michalewicz, M. Khouja, and R. McGregor.
Time Series Forecasting for Dynamic Environments: The DyFor Genetic Program
Model.
IEEE Trans EC, 2007.
[57] J. Walker and J. Miller.
Predicting Prime Numbers Using Cartesian Genetic Programming.
In Proc. EuroGP, 2007.
[58] J. A. Walker, K. Völk, S. L. Smith, and J. F. Miller.
Parallel Evolution using Multi-chromosome Cartesian Genetic Programming.
GPEM, 10:417-445, 2009.
[59] W.-C. Wang, K.-W. Chau, C.-T. Cheng, and L. Qiu.
A comparison of performance of several artificial intelligence methods for
forecasting monthly discharge time series.
Journal of Hydrology, 374(3-4):294-306, 2009.
[60] W. Weimer, S. Forrest, C. Le Goues, and T. Nguyen.
Automatic program repair with evolutionary computation.
Communications of the ACM, 53(5):109-116, June 2010.
[61] P. Widera, J. Garibaldi, and N. Krasnogor.
GP challenge: Evolving energy function for protein structure prediction.
GPEM, 11:61-88, 2010.
[62] T. Yu.
Hierarchical Processing for Evolving Recursive and Modular Programs Using
Higher-Order Functions and Lambda Abstraction.
GPEM, 2:345-380, 2001.