CE310 01 Lecture
Evolutionary Computation
and Genetic Programming
Riccardo Poli
Learning objectives
• Understand the complexity of combinatorial optimization
problems and why we need meta-heuristic stochastic
algorithms to solve them.
• Understand how CE310 is run and its assignments
• Understand Darwin's theory of evolution and natural
selection
• Understand how information is encoded in the DNA via the
genetic code, and how mutations and crossover act on it.
Outline
• Hands-on experience: knapsack simulator and hints
• Why do we need evolutionary computation and
genetic programming?
• What is in the module and how will it be run?
• How does natural evolution work?
Problem 2: Knapsack problem hints (1)
• Given a set of items (i = 1, ..., N), each with a size (s_i) and a
value (v_i), load the knapsack so that the total size <= S_max
and the total value is maximum.
• Hint 1: Representation
Problem 2: Knapsack problem hints (2)
• Hint 2: Simulator
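One common way to represent a knapsack load (an assumption here, since the hint slides do not spell it out) is a list of N bits, where bit i says whether item i is packed. A minimal sketch of a simulator-style evaluator under that assumption follows; the function name `evaluate` and the item data are invented for illustration.

```python
def evaluate(load, sizes, values, s_max):
    """Return (total size, total value, feasible) for a 0/1 load vector."""
    total_size = sum(s for s, bit in zip(sizes, load) if bit)
    total_value = sum(v for v, bit in zip(values, load) if bit)
    return total_size, total_value, total_size <= s_max


# Hypothetical instance with N = 4 items (numbers invented for illustration).
sizes = [4, 3, 2, 5]
values = [10, 7, 4, 12]
S_MAX = 9

print(evaluate([1, 1, 1, 0], sizes, values, S_MAX))  # (9, 21, True)
print(evaluate([1, 0, 1, 1], sizes, values, S_MAX))  # (11, 26, False)
```

The second load has a higher value but exceeds S_max, so it is not a valid solution.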
Combinatorial Optimization
• Combinatorial optimization problems such as the bin
packing and knapsack problems are important, but
brute-force/exact approaches need run times that grow
exponentially with problem size in the worst case
(e.g., N items give 2^N candidate knapsack loads)!
• So, what can we do? Metaheuristics!
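To make the exponential cost concrete, here is a minimal brute-force sketch that enumerates every 0/1 load of a small knapsack. The function name `brute_force_knapsack` and the item data are hypothetical; the point is only that the loop body runs 2^N times.

```python
from itertools import product


def brute_force_knapsack(sizes, values, s_max):
    """Try all 2**N loads and return (best value, best load)."""
    best_value, best_load = 0, tuple([0] * len(sizes))
    for load in product((0, 1), repeat=len(sizes)):
        size = sum(s for s, bit in zip(sizes, load) if bit)
        value = sum(v for v, bit in zip(values, load) if bit)
        if size <= s_max and value > best_value:
            best_value, best_load = value, load
    return best_value, best_load


# Hypothetical instance (numbers invented for illustration).
sizes = [4, 3, 2, 5]
values = [10, 7, 4, 12]
print(brute_force_knapsack(sizes, values, 9))  # (22, (1, 0, 0, 1))

# Number of candidate loads for growing N.
for n in (10, 20, 40, 80):
    print(n, 2 ** n)
```

Four items need only 16 evaluations, but 80 items already give about 1.2 x 10^24 candidates.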
Meta-heuristic stochastic optimization
• Heuristic (from Greek heuriskein = to find, to discover): An approximate
strategy (rule of thumb) or partial search algorithm that
is problem-specific.
• Metaheuristic (metá = beyond): A higher-level, more general
procedure designed to find, generate, or select a heuristic
that may provide a sufficiently good solution to an
optimization problem with an acceptable computational load.
• Stochastic: Randomly determined process
• EAs and GP are highly successful meta-heuristics suitable
for combinatorial optimization.
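As a rough illustration of a stochastic search, the sketch below applies a simple stochastic hill climber (often described as a (1+1) evolutionary algorithm) to a knapsack instance. It is not the module's reference implementation: the bit-string representation, the penalty of -1 for overweight loads, the mutation rate of 1/N and the item data are all assumptions made for this example.

```python
import random


def fitness(load, sizes, values, s_max):
    """Total value of a load, or -1 if the load exceeds the capacity."""
    size = sum(s for s, bit in zip(sizes, load) if bit)
    if size > s_max:
        return -1  # assumed penalty: any overweight load is worse than an empty one
    return sum(v for v, bit in zip(values, load) if bit)


def stochastic_hill_climber(sizes, values, s_max, steps=1000, seed=0):
    """Repeatedly mutate the current load and keep the mutant if it is no worse."""
    rng = random.Random(seed)
    n = len(sizes)
    current = [0] * n  # the empty knapsack is feasible whenever s_max >= 0
    current_fit = fitness(current, sizes, values, s_max)
    for _ in range(steps):
        # Flip each bit independently with probability 1/n.
        candidate = [bit ^ 1 if rng.random() < 1.0 / n else bit for bit in current]
        candidate_fit = fitness(candidate, sizes, values, s_max)
        if candidate_fit >= current_fit:
            current, current_fit = candidate, candidate_fit
    return current_fit, current


# Hypothetical instance (numbers invented for illustration).
sizes = [4, 3, 2, 5]
values = [10, 7, 4, 12]
print(stochastic_hill_climber(sizes, values, s_max=9))
```

The search evaluates only `steps` candidates regardless of N, at the price of no guarantee that the returned load is optimal.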
What is in the module and how will it be run?
• The module is all about optimisation and problem
solving, so we are in the AI domain
Learning Outcomes (2)
After completing this module, you will be able to:
• Compare application domains for GP and associate
these with good GP techniques.
• Identify GP parameters and modify existing GP
operators, representations and fitness functions for
specific applications.
Syllabus (1)
• Evolution in Nature
• Genetic Algorithms
• Evolution Strategies
• The basics of GP
• Fitness functions in GP
• Advanced Representations
• Code growth and methods to control it
Syllabus (2)
• Applications of GP
• Koza's criteria for human-competitive machine
intelligence and review of GP's human-competitive
results
• Advanced GP techniques and tricks of the trade
• Some other nature-inspired metaheuristics.
Delivery
• Normally, two hours of lectures per week to cover the
main body of the course and a one-hour class per week
for formative exercises, problem solving, interactive
coding, etc.
• Learning by doing: To gain insight into the functioning
of the methods, you are expected to code core
components yourself and experiment with the
methods outside of the lecture/class (spend 8-10
hours per week on the module)
Assessment
• 60% Examination
• 40% Assignments (published around week 18 with
deadline in week 23, see FASER)
Components:
1. Code your own GA framework in Python
and solve optimisation problems → 50%
2. Run simulation studies to investigate the impact of
GP parameters on performance → 50%
Academic integrity, authorship and plagiarism
• Plagiarism: “the act of using another person's words or ideas
without giving credit to that person”
[https://www.merriam-webster.com/dictionary/plagiarism]
• Intentional, reckless, or unintentional? Note that intentional or
reckless plagiarism is a disciplinary offence
• Please do discuss the solutions to tasks among yourselves, but
aim to implement the solution yourself.
• Questions? Please check https://www.essex.ac.uk/student/academic-skills/academic-integrity
Sources of Information: Main source
• Lectures and classes. Slides, videos, code and
handouts available on Moodle
• R. Poli, W.B. Langdon, N.F. McPhee. A field guide to
Genetic Programming, 2008.
• Free PDF http://www.gp-field-guide.org.uk/
Sources of Information: Background
• S. Luke, Essentials of Metaheuristics, Lulu, Second
edition, 2013. Start with
https://cs.gmu.edu/~sean/book/metaheuristics/Essentials.pdf
• W.B. Langdon, R. Poli, Foundations of Genetic
Programming, Springer, 2002.
Other On-line Resources (1)
• The Wikipedia article on evolutionary algorithms is a good starting
point for online reading
• http://en.wikipedia.org/wiki/Evolutionary_algorithms
• The Collection of Computer Science Bibliographies has a
large set of AI bibliographies including many on evolutionary
algorithms
• http://liinwww.ira.uka.de/bibliography/Ai/index.html
• Bibliography on genetic programming at
• http://www.cs.bham.ac.uk/~wbl/biblio/
Other On-line Resources (2)
• Google Search
several million hits for both “genetic programming” and
“evolutionary algorithms”
• Google Scholar http://scholar.google.com
300k+ hits for “genetic programming” and 600k+ hits for
“evolutionary algorithms”
Questions on the module?
Darwin's Theory of Evolution
• Charles Robert Darwin (1809-1882)
• 1859: On the origin of species …
• Variations (mutations) are present in all species
• Evolution is due to a “force” called natural selection which “selects”
the individuals best adapted to the environment
• In a constant environment no changes occur as variants will tend to
lose in the struggle for life. So, species preserve their identity.
• In a varying environment, however, some variants will be better than
the originals and will be preserved. Species evolve in this way.
Mendel's independent discoveries
• Gregor Johann Mendel (1822-1884)
• 1865: Versuche über Pflanzenhybriden
[Experiments on Plant Hybridization]
• Factors, what we now call genes/alleles, determine
visible traits in predictable ways
• Dominant and recessive characters
Example: pea colour
• Parents: YY × yy
• First generation (colour?): Yy, Yy
• Second generation (colour?): YY, Yy, Yy, yy
https://www.futurelearn.com/info/courses/genomics-for-educators/0/steps/305264
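The pea-colour crosses can be reproduced with a few lines of code. The sketch below enumerates the allele combinations of a single-gene cross. It assumes, following the usual textbook account of Mendel's peas, that Y (yellow) is dominant and y (green) is recessive; the function names `cross` and `phenotype` are invented for this example.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Count offspring genotypes for a single-gene cross, e.g. 'Yy' x 'Yy'."""
    offspring = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Sorting puts the upper-case (dominant) allele first: 'yY' becomes 'Yy'.
        genotype = "".join(sorted(allele_a + allele_b))
        offspring[genotype] += 1
    return offspring


def phenotype(genotype):
    """Assumed dominance: one copy of 'Y' is enough for a yellow pea."""
    return "yellow" if "Y" in genotype else "green"


first = cross("YY", "yy")   # every offspring is Yy
second = cross("Yy", "Yy")  # 1 YY : 2 Yy : 1 yy
print(first, second)
print({g: phenotype(g) for g in second})
```

Under the dominance assumption, the first generation is entirely yellow and the second shows the 3:1 yellow-to-green ratio.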
How natural selection works
• Individuals of a population that are fitter tend to survive for
longer and reproduce
• Their characteristics, encoded in their genes, are transmitted
to their offspring and, thus, tend to propagate into future
generations
• In sexual reproduction, the genes of the offspring are a mix
of those of their parents.
• Offspring's characteristics are partially inherited from their
parents, and partially the result of new genes created during
the process of reproduction.
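Evolutionary algorithms often imitate the first point above with fitness-proportionate ("roulette wheel") selection, where an individual's chance of becoming a parent is proportional to its fitness. The sketch below is one possible version built on the standard library's `random.choices`; the function name `select_parents`, the population and the fitness values are made up for illustration.

```python
import random
from collections import Counter


def select_parents(population, fitnesses, k, rng):
    """Draw k parents with replacement, with probability proportional to fitness."""
    return rng.choices(population, weights=fitnesses, k=k)


rng = random.Random(42)
population = ["A", "B", "C", "D"]
fitnesses = [8.0, 4.0, 2.0, 0.0]  # hypothetical fitness values

parents = select_parents(population, fitnesses, k=1000, rng=rng)
print(Counter(parents))  # fitter individuals should appear more often
```

Individual D has zero fitness and is never selected, while A is expected to be chosen about twice as often as B.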
Natural selection does not quite mean “the
survival of the fittest”
• “Fittest” means “best adapted”, not “in the best
condition”
• “Best adapted” means “best adapted with respect to a
niche” (a relevant sub-set of the environment)
• “Length of life” does not mean “fertility”
• “Fertility” does not mean “successful reproduction”
(i.e., production of offspring who live long enough to
reproduce themselves)
The cell Membrane
https://siteman.wustl.edu/wp-content/uploads/ncipdq-media/CDR0000761781.jpg
DNA (2)
• DNA is a long description (a book) whose “characters”
are T, G, C, A (about 3 billion bases in humans)
• The “words” of the DNA consist of three bases (triplets or
codons) and represent amino acids (building blocks of
proteins)
e.g., TCT=Serine, CAA=Glutamine, TAA=Stop
• A gene is represented by a “sentence” (a meaningful
sequence of triplets) in the DNA
• Some triplets represent stop symbols (the end of a
sentence)
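The "book" analogy can be turned into a tiny decoder that reads a DNA string three bases at a time until it meets a stop codon. This sketch uses only the three codons listed above; the full genetic code has 64 codons, and the function name `translate` and the example sequence are invented for illustration.

```python
# Partial codon table: only the three examples given on the slide.
CODON_TABLE = {
    "TCT": "Serine",
    "CAA": "Glutamine",
    "TAA": "Stop",
}


def translate(dna):
    """Read a DNA string codon by codon and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon, "?")  # '?' marks codons missing from the table
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


print(translate("TCTCAATAATCT"))  # ['Serine', 'Glutamine']
```

The final TCT is never read because the stop codon TAA ends the "sentence" first.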
Genetic code (1)
Genetic code (2)
• Most genes (sentences in the DNA) code for proteins
(structural genes, i.e., trait determining factors)
E.g., haemoglobin →
Type of mutations
• Substitution: Exchange one base for another
• Codon (triplet) encodes a different amino acid, the same
amino acid or STOP (incomplete proteins)
• Insertion: Extra base pair inserted
• Deletion: Section of DNA lost or deleted
• Frameshift: Insertions/deletions can alter a gene so that
message is no longer correctly parsed.
Example (first letter deleted): The fat cat sat → hef atc ats at
•…
• Mutation can be beneficial, neutral, or harmful!
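The mutation types listed above map directly onto string operations. The following sketch treats a DNA strand as a Python string; the helper names (`substitute`, `insert`, `delete`, `read_triplets`) are invented for this example.

```python
def substitute(dna, pos, base):
    """Substitution: exchange the base at `pos` for another one."""
    return dna[:pos] + base + dna[pos + 1:]


def insert(dna, pos, base):
    """Insertion: add an extra base before position `pos`."""
    return dna[:pos] + base + dna[pos:]


def delete(dna, pos, length=1):
    """Deletion: remove `length` bases starting at `pos`."""
    return dna[:pos] + dna[pos + length:]


def read_triplets(seq):
    """Parse a sequence into consecutive three-letter words."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


sentence = "thefatcatsat"
print(read_triplets(sentence))                        # ['the', 'fat', 'cat', 'sat']
print(read_triplets(delete(sentence, 0)))             # ['hef', 'atc', 'ats', 'at']
print(read_triplets(substitute(sentence, 3, "c")))    # ['the', 'cat', 'cat', 'sat']
```

A substitution changes a single word, whereas deleting one letter shifts the reading frame and garbles every word after it.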
Summary of main properties of DNA
• It stores and transmits information
• It copies itself (mainly) to generate proteins but also
to transmit information
• It can mutate
Meiosis – Reproduction (1)
In meiosis (preparation of egg and sperm cells)
homologous chromosomes duplicate (46→92)
(cyan=maternal, red=paternal)
Shyamala Iyer. (2014, February 03). Cell Division. ASU - Ask A Biologist. Retrieved January 8, 2020
from https://askabiologist.asu.edu/cell-division
Meiosis – Reproduction (2)
Because of close proximity, they can become
entangled, resulting in chromosomes that are a
mixture of maternal and paternal genes (“crossover” or
“genetic recombination”)
Meiosis – Reproduction (3)
• The cell then divides twice so that only one of
the homologous chromosomes of a pair is
present in the resulting cells (sperm or eggs).
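The crossover that happens during meiosis is the inspiration for the recombination operators used in evolutionary algorithms. A minimal sketch of one-point crossover on two equal-length chromosomes follows. Real biological crossover can involve several exchange points, and the function name `one_point_crossover` and the letter-coded chromosomes are assumptions made for illustration.

```python
import random


def one_point_crossover(maternal, paternal, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    if len(maternal) != len(paternal) or len(maternal) < 2:
        raise ValueError("chromosomes must have equal length of at least 2")
    point = rng.randint(1, len(maternal) - 1)  # cut strictly inside the chromosome
    child_a = maternal[:point] + paternal[point:]
    child_b = paternal[:point] + maternal[point:]
    return child_a, child_b


rng = random.Random(1)
# 'M' marks maternal genes and 'P' paternal genes (toy encoding).
child_a, child_b = one_point_crossover("MMMMMMMM", "PPPPPPPP", rng)
print(child_a, child_b)
```

Each child carries a mixture of maternal and paternal genes, and between them the two children hold every gene from both parents exactly once.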
Mating and Growth
• Mating produces (diploid) cells that start duplicating (mitosis)
• Duplication involves chromosome copying and cell division (no
crossover). Mutations can occur during the copying phase.
• The process of growth transforms the information in the genes
(genotype) into an adult individual (phenotype).
• Growth is controlled mainly by the environment (but also by
genetic factors)
• During the development of an individual, cells specialise
(differentiation) and migrate
Cell differentiation
• Wait: all cells have exactly the same DNA! So, how do
we get skin cells, neurons, muscle cells, …?
• Cell differentiation happens thanks to the regulation
of gene expression, which is determined by the environment
surrounding the cell
• Regulation is performed by tracts of DNA that function
as switches for groups of genes.
What did we learn today? (1)
• We have learned that combinatorial optimisation problems
(Knapsack problem & Bin Packing problem) can be solved
with meta-heuristic stochastic optimisation methods such
as Evolutionary Algorithms.
• In this module we will learn about EAs and work with them
as tools to solve different types of (continuous and
combinatorial) optimisation problems.
• EAs can also be used as a form of machine learning: program
induction (Genetic Programming)
• We also learned what is required in terms of assessment.
What did we learn today? (2)
• Natural selection means individuals in a population that are
fitter with respect to their environment produce more
offspring. The trait that made those individuals succeed
becomes common in the population.
• Genes encode the various traits of an individual. All traits of
an individual are combined in a long linear representation
(1-D array) (DNA).
• Genetic recombination (crossover), i.e., the exchange of
genetic material to create offspring that differ from
their parents, and random mutations drive evolution.
Don’t forget to …
…catch up with Python coding if necessary.
Resources:
• Python (https://www.python.org/)
• Jupyter Lab (https://jupyter.org/)
• Anaconda (https://www.anaconda.com/)
• Google Colab (https://colab.research.google.com)
• VirtualLab (https://csee-horizon.essex.ac.uk/)