Ec 05 2023

The document discusses different representations used in evolutionary computation, including binary, integer, real-valued, and permutation representations. It describes the roles of mutation and crossover variation operators and how they apply to different representations. It provides examples of different representations and how variation operators work within those representations.


Evolutionary Computation (演化式計算)
EE-6950, Fall 2023
Lecture # 5 - EC Representation
Outline
I. HW#03 Solution Review
- Rastrigin fitness function & matplotlib stats plotting

II. EC Representation

III. Homework #4

2
I. Homework #03:
Solution Review
HW#03 Solution Review

• Code walkthrough & demo


• Solution has also been uploaded to iLearning (.py source code)

4
II. EC Representation
Remember me? ☺ Evolutionary Algorithm

[Figure: the EA cycle; the variation step is labeled Recombination/Crossover]

6
Representation, Mutation, and Recombination
• Role of representation and variation operators

• Most common representations:


- Binary
- Integer
- Real-Valued or Floating-Point
- Permutation
- Tree

7
Role of representation & variation operators

• First stage of building an EA and most di cult one:


- Choose representation to suit problem
- Encoding for individuals that can be manipulated by variation operators

• Variation operators:
- Mutation
- Crossover (recombination)

• Type of variation operators needed depends on chosen representation

8
Recap:
EC Representation & Variation Components
The 8-queens problem: Representation

External representation:
a board configuration

Internal state:
a permutation of the numbers 1–8

Possible mapping: 1 3 5 2 6 4 7 8

10
Simple 1-D Optimization: Representation

• External representation: The real number, x

• Internal state: The real number, x

f(x) = 50 − x²
11
Mutation

• Role: Causes random variance, potentially outside current “DNA” pool

• Acts on a single state (Individual) and delivers another (e.g., it’s a point
operator)

• Element of randomness is essential, differentiates it from other unary


heuristic operators

12
Mutation

before 1 1 1 1 1 1 1

after 1 1 1 0 1 1 1

13
Recombination (Crossover)

• Role: Merges information from parents into offspring

• Choice of what information to merge is stochastic

• Offspring may be worse, better, or the same as the parents

• Hope is that some are better by combining elements of genotypes that lead to
good traits

14
Recombination

Parents
cut cut
1 1 1 1 1 1 1 0 0 0 0 0 0 0

1 1 1 0 0 0 0 0 0 0 1 1 1 1

Offspring

15
Crossover or Mutation?

• Decades-long debate: which one is better/necessary?

• Answer (at least, rather wide agreement):


• It depends on the problem, but in general it is good to have both
• Both have different and useful roles
• Mutation-only-EA is possible, crossover-only-EA would not work well

16
Crossover or Mutation?
Exploration: Discovering promising areas in the search space, i.e. gaining new
information on the problem

Exploitation: Optimizing within a promising area, i.e. using known information

There is co-operation AND competition between them:


- Crossover is explorative, it makes a big jump to an area somewhere “in between” two
(parent) areas
- Mutation is both exploitative & explorative, frequently creates small random
perturbations near the parent, but can also explore area outside of existing DNA pool

Note: How we define this depends somewhat on our perspective as well as how we implement our
variation operators. From Eiben's perspective, the explorative vs exploitative distinction primarily
relates to how close the generated states are to the parent state(s).
17
Crossover or Mutation?

• Only crossover can combine information from two parents

• Only mutation can introduce new information

• To hit the optimum you often need a ‘lucky’ mutation

18
Binary Representation
Binary Representation

• One of the earliest EC representations


• Internal representation consists of a string of binary digits

20
Mutation

• Alter each bit independently with a probability pm


• pm is called the mutation rate
• Typically between 1/pop_size and 1/word_length

• Mutation can cause undesirable non-uniform effect (can use Gray coding)

21
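A minimal Python sketch of this operator (the function name `bitflip_mutation` is mine, not from the slides):

```python
import random

def bitflip_mutation(bits, pm):
    """Flip each bit independently with probability pm (the mutation rate)."""
    return [(1 - b) if random.random() < pm else b for b in bits]

# e.g. pm = 1/L for a string of length L = 7
child = bitflip_mutation([1, 1, 1, 1, 1, 1, 1], pm=1/7)
```

With pm = 1/L, one bit flips per child on average, matching the rule of thumb above.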
1-Point Crossover
• Choose a random point on the two parents
• Split parents at the crossover point
• Create children by exchanging tails
• pc typically in range (0.6, 0.9)

22
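The three steps above can be sketched as follows (a hedged sketch; in a full EA you would first test against pc and copy the parents unchanged otherwise):

```python
import random

def one_point_crossover(p1, p2):
    """Split both parents at the same random point and exchange tails."""
    point = random.randint(1, len(p1) - 1)   # cut strictly inside the string
    c1 = p1[:point] + p2[point:]
    c2 = p2[:point] + p1[point:]
    return c1, c2
```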
Alternative Crossover Operators

• Why do we need other crossover approaches?

• Performance with 1-point crossover depends on the order that variables


occur in the representation
- More likely to keep together bits that are near each other
- Can never keep together bits from opposite ends of string
- This is known as Positional Bias

23
n-Point Crossover

• Choose n random crossover points


• Split along those points
• Glue parts, alternating between parents
• Generalization of 1-point (but, still some positional bias)

24
Uniform Crossover

• Randomly choose each bit of the first child


• Make an inverse copy for the second child
• Crossover is then independent of position

25
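A possible implementation (the per-position coin flip makes the position-independence explicit):

```python
import random

def uniform_crossover(p1, p2):
    """Choose each gene of the first child at random; second child is the inverse."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if random.random() < 0.5:
            c1.append(g1); c2.append(g2)
        else:
            c1.append(g2); c2.append(g1)
    return c1, c2
```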

Integer Representation
Integer Representation

• Nowadays it is generally accepted that it is better to encode numerical


variables directly (integers, floating-point variables)

• Some problems naturally have integer variables


- e.g. image processing parameters

• Others take categorical values from a fixed set


- e.g. {blue, green, yellow, pink}

27

Integer Representation

Examples:

3 5 6 3 9

1 5 4 2 8

28
Integer Variation Operators

• Extend binary mutation for:


- “Creep” i.e. more likely to move to similar value
• Adding a random positive or negative value to each value with probability p.
• (for problems where integers act like real numbers, value & continuity have meaning)

- Random resetting (esp. categorical variables)


• With probability pm a new value is chosen at random
• (for problems where integers act like indexes)

• Same recombination as for binary representation


• i.e., N-point/uniform crossover operators also work for integers

29
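Both integer mutation flavors can be sketched in a few lines (function names and the `step`/`values` parameters are mine, for illustration):

```python
import random

def creep_mutation(genes, pm, step=1):
    """Add a small random +/- step to each gene with probability pm (ordinal integers)."""
    return [g + random.choice([-step, step]) if random.random() < pm else g
            for g in genes]

def random_reset(genes, pm, values):
    """Replace each gene by a fresh random choice with probability pm (categorical)."""
    return [random.choice(values) if random.random() < pm else g for g in genes]
```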
Mutation

3 5 6 3 9

3 5 4 3 9

30
Crossover

3 5 6 3 9 3 5 4 3 8

1 5 4 2 8 1 5 6 2 9

31
Real-valued/Floating-Point Representation
Real-Valued or Floating-Point Representation

• Many problems occur as multi-variate real valued problems, e.g. continuous


parameter optimization

x = ⟨x1, x2, . . . xn⟩

• Illustration: Ackley’s function


f(x) = −20 · exp( −0.2 · √( (1/n) · Σᵢ₌₁ⁿ xᵢ² ) ) − exp( (1/n) · Σᵢ₌₁ⁿ cos(2π·xᵢ) ) + 20 + e

33
A possible approach (GA): Mapping real values on bit strings
z ∈ [x, y] ⊆ ℝ represented by {a₁, …, a_L} ∈ {0,1}^L

• Γ: {0,1}^L → [x, y] defines the representation

Γ(a₁, …, a_L) = x + (y − x)/(2^L − 1) · Σⱼ₌₀^(L−1) a_(L−j) · 2^j ∈ [x, y]

• Only 2^L values out of infinitely many are represented
• L determines the maximum possible precision of the solution
• High precision → long chromosomes (slow evolution)

• Note: Used in early GA “universal” problem solvers. Given the power of floating-point processors on
modern machines, I don’t recommend this approach unless you’ve got a good reason for using a finite-
precision bit-based representation

34
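A sketch of the decoding map Γ (the name `decode` is mine); the sum Σⱼ a_(L−j)·2^j simply reads the bit string as a big-endian binary integer:

```python
def decode(bits, x, y):
    """Map a bit string {a1,...,aL} onto the interval [x, y] (Gamma in the slide)."""
    L = len(bits)
    # sum_{j=0}^{L-1} a_{L-j} * 2^j : a1 is the most significant bit
    value = int("".join(str(b) for b in bits), 2)
    return x + (y - x) / (2**L - 1) * value
```

All-zeros decodes to x, all-ones to y, and only 2^L points in between are reachable.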
Uniform Mutation

• General scheme of floating-point mutations


x = ⟨x1, …, xl⟩ → x′ = ⟨x′1, …, x′l⟩

xi, x′i ∈ [LBi, UBi]

• Uniform Mutation:
xiʹ drawn randomly (uniform) from [LBi ,UBi ]

• Analogous to bit-flipping (binary) or random resetting (integers)


35
Non-uniform Mutation

• Non-uniform mutation:
- Most common method is to add random deviate to each variable separately,
taken from N(0,σ) Gaussian distribution and then restrict to range
x′i = xi + N(0,σ)

- Standard deviation σ (mutation step size), controls amount of change

36

Self-Adaptive Mutation (Important!)

• Step-sizes are included in the encoding and undergo variation and selection
themselves:
⟨x1, …, xn, σ⟩
• Mutation step size is not set by user but co-evolves with solution

• Different mutation strategies may be appropriate in different stages of the


evolutionary search process.

37
Self-Adaptive Mutation

• Mutate σ first
• Net mutation effect: ⟨x, σ⟩ → ⟨x′, σ′⟩
• Order is important:
• first σ → σ′ (see later how)
• then x → x′ = x + N(0, σ′)
• Rationale: new ⟨x′, σ′⟩ is evaluated twice
• Primary: x′ is good if f(x′) is good
• Secondary: σ′ is good if the x′ it created is good

38











Uncorrelated mutation with single σ

• Encoding: ⟨x1, …, xn, σ⟩


• σ′ = σ exp(τ N(0,1))
• x′i = xi + σ′ Ni(0,1)

• Typically the “learning rate” τ ∝ 1/n^(1/2)


- where n="problem size", typically the number of dimensions, or vector length

• And we typically have boundary rules (no overflow/underflow):


• σ′ < ϵmin → σ′ = ϵmin
• σ′ > ϵmax → σ′ = ϵmax
39
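The two update equations and the boundary rule fit in a short Python sketch (function name and the `eps_min`/`eps_max` parameter names are mine); note that σ is mutated first and the new σ′ is then used to mutate x, as the previous slide requires:

```python
import math
import random

def mutate_single_sigma(x, sigma, eps_min=1e-8, eps_max=None):
    """Uncorrelated self-adaptive mutation with a single step size sigma."""
    n = len(x)
    tau = 1.0 / math.sqrt(n)                      # learning rate, tau ~ 1/sqrt(n)
    sigma_new = sigma * math.exp(tau * random.gauss(0, 1))
    sigma_new = max(sigma_new, eps_min)           # boundary rule: no underflow
    if eps_max is not None:
        sigma_new = min(sigma_new, eps_max)       # boundary rule: no overflow
    x_new = [xi + sigma_new * random.gauss(0, 1) for xi in x]
    return x_new, sigma_new
```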







Uncorrelated mutation with single σ
Circle (hyper-sphere): All dimensions have same mutation probability “scale”

Mutations with equal likelihood

40
Uncorrelated mutation with N σ’s

• Encoding: ⟨x1, …, xn, σ1, …, σn⟩


• σ′i = σi exp(τ′ N(0,1) + τ Ni(0,1))
• x′i = xi + σ′i Ni(0,1)

• Two learning rate parameters:


• τ′ overall learning rate
• τ coordinate wise learning rate

• τ′ ∝ 1/(2n)^(1/2) and τ ∝ 1/(2n^(1/2))^(1/2)

• Boundary rules: σ′i < ϵmin → σ′i = ϵmin , σ′i > ϵmax → σ′i = ϵmax
41
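Extending the previous sketch to one σ per coordinate (names are mine); the single overall draw τ′·N(0,1) is shared by all σ's, while each gets its own τ·Nᵢ(0,1):

```python
import math
import random

def mutate_n_sigmas(x, sigmas, eps_min=1e-8):
    """Uncorrelated self-adaptive mutation with one step size per coordinate."""
    n = len(x)
    tau_prime = 1.0 / math.sqrt(2.0 * n)          # overall learning rate
    tau = 1.0 / math.sqrt(2.0 * math.sqrt(n))     # coordinate-wise learning rate
    common = tau_prime * random.gauss(0, 1)       # one draw shared by all sigmas
    new_sigmas = [max(s * math.exp(common + tau * random.gauss(0, 1)), eps_min)
                  for s in sigmas]
    new_x = [xi + si * random.gauss(0, 1) for xi, si in zip(x, new_sigmas)]
    return new_x, new_sigmas
```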










Uncorrelated mutation with N σ’s
Ellipse (hyper-ellipsoid): Each dimension has its own mutation probability “scale”

Mutations with equal likelihood

42
Correlated mutation

• Encoding: ⟨x1, …, xn, σ1, …, σn, α1, …, αk⟩


where k = n (n − 1)/2

• Covariance matrix C is defined as:


• cii = σi²
• cij = 0 if i and j are not correlated
• cij = ½ · (σi² − σj²) · tan(2·αij) if i and j are correlated

• Note the numbering / indices of the α‘s

43

Correlated mutation
The mutation mechanism is then:

• σ′i = σi · exp(τ′ · N(0,1) + τ · Ni(0,1))


• α’j = αj + β • N (0,1)
• x ’ = x + N(0,C’)
• x stands for the vector 〈 x1,…,xn 〉
• C’ is the covariance matrix C after mutation of the α values

• τ′ ∝ 1/(2n)^(1/2) and τ ∝ 1/(2n^(1/2))^(1/2) and β ≈ 5°


• σ′i < ϵmin ⇒ σ′i = ϵmin, σ′i > ϵmax ⇒ σ′i = ϵmax, and
• |α′j| > π ⇒ α′j = α′j − 2π · sign(α′j)
44
Correlated mutation
Ellipse (hyper-ellipsoid):
Each dimension can have its own mutation probability “scale” & rotation

No free lunch!
• Representation may now be more “accurate”
• But we have also substantially increased the
problem size by including N σ’s and k α’s

Mutations with equal likelihood

45
Crossover operators

• Discrete:
• Each value in offspring z comes from one of its parents (x, y) with equal
probability: zi = xi or yi
• Could use n-point or uniform crossover, similar to binary or integer cases

3.2 5.1 6.5 3.3 9.7 3.2 5.1 4.2 3.3 8.8

1.8 5.6 4.2 2.6 8.8 1.8 5.6 6.5 2.6 9.7

46
Crossover operators
• Intermediate/Arithmetic
• Create children “between” parents (hence a.k.a. arithmetic recombination)
zi = α xi + (1 − α) yi where α : 0 ≤ α ≤ 1
• The parameter α can be:
- Constant: uniform arithmetical crossover
- Variable (e.g. depend on the age of the population)
- Picked at random every time
Parent 1 Parent 2

Child

xi zi yi
47
Single arithmetic crossover

• Parents:⟨x1, …, xn⟩ and ⟨y1, …, yn⟩


• Pick a single k at random
• Child1 is (reverse for Child2):
⟨x1, …, xk = αyk + (1 − α)xk, …, xn⟩

Example with α = 0.5


48
Simple arithmetic crossover

• Parents: ⟨x1, …, xn⟩ and ⟨y1, …, yn⟩


• Pick a random k, then from this point on, mix values
• Child1 is (reverse for Child2):
⟨x1, …, xk−1, αyk + (1 − α)xk, …, αyn + (1 − α)xn⟩

Example with α = 0.5


49
Complete arithmetic crossover

• Most commonly used approach


• Parents: ⟨x1, …, xn⟩ and ⟨y1, …, yn⟩

• Child1 is (reverse for Child2):

αx + (1 − α)y

Note: α often is a random number, with a different value for each child

Example with α = 0.5

50
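A sketch of complete arithmetic crossover, including the stochastic variant where α is drawn fresh (function name is mine):

```python
import random

def arithmetic_crossover(x, y, alpha=None):
    """Complete arithmetic crossover: child = alpha*x + (1-alpha)*y, per gene."""
    if alpha is None:
        alpha = random.random()                   # stochastic variant
    c1 = [alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y)]
    c2 = [alpha * yi + (1 - alpha) * xi for xi, yi in zip(x, y)]
    return c1, c2

c1, c2 = arithmetic_crossover([0.0, 2.0], [4.0, 6.0], alpha=0.5)
print(c1, c2)  # [2.0, 4.0] [2.0, 4.0] -- alpha=0.5 gives the midpoint twice
```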
Blend Crossover

• Parents: ⟨x1, …, xn⟩ and ⟨y1, …, yn⟩


• di = | yi − xi |
• Random sample zi from the interval [xi − αdi, xi + αdi]
• Allows values “outside” of parents
• There are many variations on this theme,
such as SBX operator

51
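A minimal sketch of blend crossover (name is mine). Note this widens the interval around both parent values, in the spirit of the common BLX-α formulation, so children can land "outside" the parents:

```python
import random

def blend_crossover(x, y, alpha=0.5):
    """Blend (BLX-alpha style) crossover: sample each gene from a widened interval."""
    child = []
    for xi, yi in zip(x, y):
        d = abs(yi - xi)
        lo = min(xi, yi) - alpha * d              # extend below the smaller parent
        hi = max(xi, yi) + alpha * d              # extend above the larger parent
        child.append(random.uniform(lo, hi))
    return child
```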
Multi-parent recombination

• Recall that we are not restricted by the practicalities of nature

• Mutation uses n = 1 parent, and “traditional” crossover n = 2:


- The extension to n > 2 is natural to examine

• Been around since 1960s, still rare but studies indicate can be useful

52
Multi-parent recombination, type 1
• Idea: Segment and recombine parents
• Example: diagonal crossover for n parents:
• Choose n-1 crossover points (same in each parent)
• Compose n children from segments of parents along a “diagonal”, wrapping around

• This operator generalizes 1-point crossover


53
Multi-parent recombination, type 2

• Idea: Arithmetic combination of real values


• Example: arithmetic crossover for n parents:
• ith value in child is the average of the parents’ ith values
• Creates center of mass as child
• Unusual in genetic algorithms, but long known and sometimes used in
evolution strategies

54
Permutation Representation
Permutation Representations

• Ordering/sequencing problems form a special type

• Tasks solved by arranging some objects in a certain order


- Example: production scheduling: important thing is which elements are
scheduled before others (order)
- Example: Travelling Salesman Problem (TSP) : important thing is which
elements occur next to each other (adjacency)

• These problems are generally expressed as a permutation:


• If there are n variables then the representation is as a list of n integers,
each of which occurs exactly once

56
Permutation Representation:
TSP example

• Problem:
- Given n cities
- Find a complete tour with minimal length
• Encoding:
- Label the cities 1, 2, … , n
- One complete tour is one permutation (e.g. for
n=4 [1,2,3,4], [3,4,2,1] are OK)

• Search space is BIG:


- for 30 cities there are 30! ≈ 10^32 possible tours

57
Mutation

• “Normal” mutation operators lead to inadmissible solutions


- Can end up with values that occur more than once (illegal)
- Can end up with values that no longer exist! (also illegal)

• Therefore must change at least two values at once

• Mutation now reflects the probability that some operator is applied once to
the whole string, rather than individually in each position

58
Swap mutation

• Pick two slots at random and swap their positions

59
Insert Mutation

• Pick two slot values at random


• Move the second to follow the first, shifting the rest along to accommodate
• Note that this preserves most of the order and the adjacency information

60

Scramble mutation

• Pick a sub-sequence at random


• Randomly rearrange the values in those positions

61
Inversion mutation

• Pick two slots at random and then invert the substring between them.
• Preserves most adjacency information (only breaks two links) but disruptive of
order information

62
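The three simplest permutation mutations above can be sketched as follows (function names are mine); each returns a new permutation and leaves the parent untouched:

```python
import random

def swap_mutation(perm):
    """Exchange the values in two random positions."""
    p = perm[:]
    i, j = random.sample(range(len(p)), 2)
    p[i], p[j] = p[j], p[i]
    return p

def insert_mutation(perm):
    """Move one value to follow another; preserves most order/adjacency info."""
    p = perm[:]
    i, j = sorted(random.sample(range(len(p)), 2))
    p.insert(i + 1, p.pop(j))
    return p

def inversion_mutation(perm):
    """Reverse a random sub-sequence; only breaks two adjacency links."""
    p = perm[:]
    i, j = sorted(random.sample(range(len(p)), 2))
    p[i:j + 1] = reversed(p[i:j + 1])
    return p
```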
Crossover operators

• “Normal” crossover operators will often lead to inadmissible solutions

12345 12321
54321 54345

• Many specialized operators have been devised which focus on combining


order or adjacency information from the two parents

63
Order 1 Crossover

• Idea is to preserve relative order that elements occur


• Informal procedure:
1. Choose an arbitrary sub-sequence from the first parent
2. Copy this part to the first child
3. Copy the numbers that are not in the first part, to the first child:
• starting right from cut point of the copied part,
• using the order of the second parent
• and wrapping around at the end
4. Analogous for the second child, with parent roles reversed

64
Order 1 Crossover

• Copy randomly selected set from first parent

• Copy rest from second parent in order: 1,9,3,8,2

65
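A sketch of the procedure (name is mine; the optional `cuts` parameter fixes the segment for testing instead of choosing it randomly):

```python
import random

def order1_crossover(p1, p2, cuts=None):
    """Order 1 crossover: keep a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(cuts if cuts else random.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]                  # copy the chosen segment from p1
    kept = set(child[a:b + 1])
    # remaining values, in the order they appear in p2 starting after the cut
    fill = [v for v in p2[b + 1:] + p2[:b + 1] if v not in kept]
    for k in range(n - (b - a + 1)):
        child[(b + 1 + k) % n] = fill[k]          # wrap around at the end
    return child
```

With parents [1,…,9] and [9,3,7,8,2,6,5,1,4] and segment positions 3–6, the remaining values are copied in the order 1,9,3,8,2, matching the example above.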

Partially Mapped Crossover (PMX)
• Typically used for “adjacency” type problems (e.g., TSP-like problems)

Informal procedure for parents P1 and P2:


1. Choose random segment and copy it from P1
2. Starting from the first crossover point look for elements in that segment of P2 that have not been copied
3. For each of these i, look in the offspring to see what element j has been copied in its place from P1
4. Place i into the position occupied by j in P2, since we know that we will not be putting j there (as j is already in the offspring)
5. If the place occupied by j in P2 has already been filled in the offspring by an element k, put i in the position occupied by k in P2
6. Having dealt with the elements from the crossover segment, the rest of the offspring can be filled from P2.

Second child is created analogously


66
Partially Mapped Crossover (PMX)

67
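The six-step procedure maps onto a short implementation (name and the test-only `cuts` parameter are mine); the `while` loop is steps 4–5, following the mapping until a free slot is found:

```python
import random

def pmx(p1, p2, cuts=None):
    """Partially Mapped Crossover: keep a segment of p1, map the rest via p2."""
    n = len(p1)
    a, b = sorted(cuts if cuts else random.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]                  # step 1: copy segment from P1
    for i in range(a, b + 1):
        v = p2[i]
        if v in child[a:b + 1]:
            continue                               # already present via P1 segment
        pos = i
        while child[pos] is not None:              # steps 4-5: follow the mapping
            pos = p2.index(child[pos])
        child[pos] = v
    for i in range(n):                             # step 6: rest comes from P2
        if child[i] is None:
            child[i] = p2[i]
    return child

# Textbook-style example, segment positions 3..6:
print(pmx([1, 2, 3, 4, 5, 6, 7, 8, 9],
          [9, 3, 7, 8, 2, 6, 5, 1, 4], cuts=(3, 6)))
# -> [9, 3, 2, 4, 5, 6, 7, 1, 8]
```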
Cycle Crossover
Basic idea: Each value comes from one parent together with its position.

Informal procedure:
1. Make a cycle of values from P1 in the following way.
(a) Start with a random slot in P1.
(b) Look at the value at the same position in P2.
(c) Go to the position with the same value in P1.
(d) Add this value to the cycle.
(e) Repeat step b through d until you arrive at the initial slot you started with in P1.

2. Put the values of the first cycle into the first child on the positions they have in the first
parent; take the values of the next cycle from the second parent, alternating parents for subsequent cycles.

68

Cycle Crossover
• Step 1: identify cycles

• Step 2: copy cycles into offspring

69
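A sketch of both steps for one child (name is mine); the inner loop traces a cycle exactly as in the informal procedure, and cycles are taken from the parents alternately:

```python
def cycle_crossover(p1, p2):
    """Cycle crossover: each value keeps the position it has in one of the parents."""
    n = len(p1)
    child = [None] * n
    take_from_p1 = True
    for start in range(n):
        if child[start] is not None:
            continue                               # slot already covered by a cycle
        pos = start
        while child[pos] is None:                  # trace the cycle through this slot
            child[pos] = p1[pos] if take_from_p1 else p2[pos]
            pos = p1.index(p2[pos])                # same value's position in P1
        take_from_p1 = not take_from_p1            # alternate cycles between parents
    return child
```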

Tree Representation
Tree Representation
• Trees are a universal form, good for symbolic expressions, e.g. consider:

• Arithmetic formula:

2·π + ((x + 3) − (y / (5 + 1)))

• Logical formula: (x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y)))


• Program: i=1;
while (i < 20)
{
i=i+1;
}

71
Tree Representation

2·π + ((x + 3) − (y / (5 + 1)))

72
Tree Representation

(x ∧ true) → ((x ∨ y) ∨ (z ↔ (x ∧ y)))

73
Tree Representation

i=1;
while (i < 20)
{
i=i+1;
}

74
Tree Representation

• In GA, ES, EP representations are linear structures (bit strings, integer string,
real-valued vectors, permutations)

• Tree shaped representations are non-linear structures

• In GA, ES, EP the “size” of the representation is fixed (e.g., a fixed-length


multi-variate vector)

• Trees in GP (genetic programming) may vary in depth and width

75
Tree Representation

• Symbolic expressions (S-Expressions) can be defined by


• Terminal set T
• Function set F (with the arities of function symbols)

• In general, expressions in GP are not typed (closure property: any f ∈ F can


take any g ∈ F as argument)

• LISP-like “syntax”, has traditionally often been used in fields such as AI,
easier programmatic structure to manipulate computationally

76
Mutation

• Most common mutation: replace randomly chosen subtree by randomly


generated tree

77
Mutation

• Mutation has two parameters:


• Probability pm to choose mutation
• Probability to choose an internal point as the root of the subtree to be
replaced

• The size of the child can exceed the size of the parent

78
Recombination

• Most common recombination: exchange two randomly chosen subtrees


among the parents

• Recombination has two parameters:


• Probability pc to choose recombination
• Probability to choose an internal point within each parent as crossover point

• The size of offspring can exceed that of the parents

79
Recombination

Parent 1 Parent 2

Child 1 Child 2

80
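Subtree crossover can be sketched with nested Python lists standing in for S-expressions (all names here are mine; a real GP system would use a proper tree class): pick a random node path in each parent and swap the subtrees rooted there.

```python
import copy
import random

def all_paths(tree, path=()):
    """Paths to every node of a nested-list tree like ['+', 'x', ['*', 'y', 3]]."""
    paths = [path]
    if isinstance(tree, list):
        for i, sub in enumerate(tree[1:], start=1):   # index 0 is the operator
            paths.extend(all_paths(sub, path + (i,)))
    return paths

def get_node(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def set_node(tree, path, new):
    if not path:
        return new                                    # replacing the whole tree
    get_node(tree, path[:-1])[path[-1]] = new
    return tree

def subtree_crossover(p1, p2):
    """Exchange two randomly chosen subtrees between the parents."""
    c1, c2 = copy.deepcopy(p1), copy.deepcopy(p2)
    path1 = random.choice(all_paths(c1))
    path2 = random.choice(all_paths(c2))
    s1 = copy.deepcopy(get_node(c1, path1))
    s2 = copy.deepcopy(get_node(c2, path2))
    return set_node(c1, path1, s2), set_node(c2, path2, s1)
```

Because whole subtrees are exchanged, the children's combined node count equals the parents', but each child's size (depth and width) can differ from either parent, as the slide notes.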
A few final thoughts
Self-Adaptivity

• Adaptive strategy parameters:


- Important! These are not restricted only to real-number problems, typically
used with all representations!

• Adaptive mutation rates:


- Generally tend to decrease in late part of run progression (convergence)
- Non-stationary problems: mutation rate can adaptively respond, increase/
decrease as run progresses (see next slide)

• Why not make everything adaptive?


- No free lunch, it makes the problem and search space larger
- Need to strike the right balance
82
Self-Adaptivity

Example: Non-stationary problem & adaptive mutation rates

(Figure 4.7 in Eiben’s book, Chapter 4: objective value and adaptive mutation strength over the run)
83
Variation & Other Representations

• Mutation operators for given representation often combine multiple


approaches with adaptive rates

• Does variation always need to create “disorder”?


- No! (hybrid algorithms, problem dependent knowledge)

• Other representations beyond these?


• Yes, certainly!
• Layout, network graphs, ANN, etc. (Maybe your problem is different too!)
• We will look at some of these later when discussing more advanced and
specialized applications

84
III. Homework #4
- Due by Oct 12 @ noon
- Submit to iLearning
Homework assignment #4 –EV2
EV2 – Let’s upgrade & improve EV1 a bit:
1. Modify EV1 mutation operator to be self-adaptive:
- Use single-σ uncorrelated adaptive mutation rate
- Hint 1: No longer need mutationProb and mutationStddev parameters from EV1
- Hint 2: Eqns 4.2 & 4.3 Eiben’s book, or my slides 39-41 “Uncorrelated mutation with
single σ”

2. Modify crossover operator to use stochastic arithmetic crossover


- Instead of x=(x1+x2)/2, use x = α·x1 + (1 − α)·x2, where 0 ≤ α ≤ 1 (α is a random value)

86
Homework assignment #4 – EV2

3. Let's exploit the inherent parallel search capabilities of our population


further and allow multiple children per generation

- Create 5 children per generation (crossover & mutate). As in EV1, continue to use random parent
selection, 2 parents per child (note in the case of purely random selection, parents could
potentially participate in reproduction multiple times)

- As in EV1, continue to use "replace-worst" survivor selection, however in EV2, use this selection
process to allow both the parents and children to compete against each other for survival. (Hint:
create all of the children first, then sequentially run replace-worst selection for each child)

87
Homework assignment #4 – EV2
4. Improve code structure, better use of OO:
• Move mutation and crossover to Individual class:
- Hint 1: def crossover(self, other) -> returns new Individual
- Hint 2: def mutate(self) -> mutates internal state of existing Individual

5. Test your EV2 program using the two fitness functions from our previous
self-study exercise (Parabola & Rastrigin functions)

88
Homework assignment #4

• Suggested (optional):
- Use matplotlib to plot the adaptive mutation strength vs. generation count.
How does it behave? Does it eventually decrease with time?

89
Homework #4: Recommended reading

• Recommended reading:
- Eiben Chapter 4

90
Next up…
EC Fitness, Selection,
and Population Management
