
GENETIC ALGORITHM APPLIED ON

MULTIOBJECTIVE OPTIMIZATION

ADDIS ABABA UNIVERSITY


COLLEGE OF COMPUTATIONAL AND NATURAL SCIENCES
DEPARTMENT OF MATHEMATICS
A project submitted in partial fulfillment of the requirements for
the degree of Master of Science in Mathematics

By: Beletew Mekasha


Stream: Optimization
Advisor: Semu Mitiku (PhD)

June 17, 2014


ADDIS ABABA UNIVERSITY
DEPARTMENT OF MATHEMATICS

The undersigned hereby certify that they have read and recommend to the Department
of Mathematics for acceptance the project entitled "GENETIC ALGORITHM APPLIED
ON MULTIOBJECTIVE OPTIMIZATION" by Beletew Mekasha, in partial fulfillment
of the requirements for the degree of Master of Science in Mathematics.

Advisor: Dr. Semu Mitiku Kassa

Signature:

Date

Examiner 1: Dr. ———————————–

Signature:

Date

Examiner 2: Dr. ———————————–

Signature:

Date
Acknowledgment

I would like to express my deep sense of gratitude and indebtedness to my advisor,


Dr. Semu Mitiku, for his continual encouragement and patient guidance during the early
stages of chaos and confusion. I also thank him for indicating to me a future direction that
I hope to continue working on. Without his support, expertise and guidance, this project
would never have come to completion.
Finally, I would also like to express my appreciation to my family for their continuous
encouragement.
Abstract

Multi-objective formulations are realistic models for many complex optimization problems.
In this project we present multiobjective optimization problems and genetic algorithms
developed specifically for problems with multiple objectives. Customized genetic algorithms
have been demonstrated to be particularly effective at determining excellent solutions
(Pareto-optimal points) to such problems. Moreover, in solving multi-objective problems,
designers may be interested in a set of Pareto-optimal points instead of a single point.
Since genetic algorithms (GAs) work with a population of points, it seems natural to use
GAs in multi-objective optimization problems to capture a number of solutions simultaneously.
In this project we also describe the working principle of binary-coded and real-parameter
genetic algorithms, which are ideally suited to handle problems with a continuous search
space. Moreover, a non-dominated sorting-based multi-objective evolutionary algorithm
(MOEA), called the non-dominated sorting genetic algorithm II (NSGA-II), is also
presented.

Keywords: Genetic Algorithm, Multi-objective Optimization, Elitism, Pareto-optimal solutions, Ordering relation.
Introduction

The objective of this paper is to present an overview of multiple-objective optimization
methods using genetic algorithms (GAs). For multiple-objective problems, the objectives
are generally conflicting, preventing simultaneous optimization of each objective. The GA
is inspired by the theory of evolution explaining the origin of species.
In many real-life problems, the objectives under consideration conflict with each other.
Hence, optimizing x with respect to a single objective often results in unacceptable results
with respect to the other objectives. Therefore, a perfect multi-objective solution that
simultaneously optimizes each objective function is almost impossible. A reasonable solution
to a multiobjective problem is to investigate a set of solutions, each of which satisfies the
objectives at an acceptable level without being dominated by any other solution.
Chapter 1 presents the basic terminology used throughout the rest of the project.
Furthermore, a historical overview of single-objective and multiobjective optimization is
discussed, together with a short introduction to evolutionary algorithms and genetic
algorithms. Additionally, we also state the goals of multiobjective optimization algorithms.
Chapter 2 presents the preference attitudes of the decision maker, which play an essential
role in specifying the meaning of optimality or desirability. In this chapter we discuss the
principles of multiobjective optimization and present optimality concepts for any solution
to be optimal in the presence of multiple objectives. These are very often represented
as binary relations on the objective space and are called preference orders. Furthermore, we
discuss solution concepts for multi-objective optimization problems and investigate some
fundamental properties of solutions.
Additionally, in a typical multi-objective optimization problem there exists a set of solutions
which are superior to the rest of the solutions in the search space when all objectives are
considered, but inferior to other solutions in one or more objectives. These solutions are
known as Pareto-optimal or non-dominated solutions[?]. Just as there are different
algorithms for finding the minimum number in a finite set, several approaches for finding
the non-dominated set from a given population of solutions are also mentioned in this
project.
In Chapter 3, the classical methods of multi-objective optimization are discussed. One
way to solve multi-objective problems is to scalarize the vector of objectives into one
objective by averaging the objectives with a weight vector. This process allows a simpler
optimization algorithm to be used, but the obtained solution largely depends on the choice
of the weight vector used in the scalarization process. In this chapter, we also present the
working principle of binary-coded and real-parameter genetic algorithm operators.
Chapter 4 presents the Non-dominated Sorting Genetic Algorithm II (NSGA-II), which
carries out a non-dominated sorting of a combined parent and offspring population. Thereafter,
starting from the best non-dominated solutions, each front is accepted until all population
slots are filled, which makes the algorithm an elitist type. For the solutions of the last
allowed front, a crowding distance-based niching strategy is used to resolve which solutions
are carried over to the new population.
The first multi-objective GA, called the Vector Evaluated Genetic Algorithm (VEGA),
was proposed by Schaffer[?]. Afterward, several major multi-objective evolutionary
algorithms were developed, such as the Multi-objective Genetic Algorithm (MOGA)[?], the
Niched Pareto Genetic Algorithm (NPGA), the Random Weighted Genetic Algorithm (RWGA),
the Non-dominated Sorting Genetic Algorithm (NSGA), the Strength Pareto Evolutionary
Algorithm (SPEA)[?], the Pareto-Archived Evolution Strategy (PAES), the Fast Non-dominated
Sorting Genetic Algorithm (NSGA-II), the Multi-objective Evolutionary Algorithm (MEA)
and the Rank-Density Based Genetic Algorithm (RDGA).
Several survey papers have been published on evolutionary multi-objective optimization.
This project takes a different course: it focuses on important issues in designing a
multi-objective GA, describes common techniques used in multi-objective GAs to attain the
goals of multi-objective optimization, and addresses the inclusion of an elite-preserving
operator to make the algorithms converge better to the Pareto-optimal solutions by using
the fast and elitist Non-dominated Sorting Genetic Algorithm (NSGA-II).
Chapter 1

Preliminary

1.1 Definition and Preliminary Concepts


Optimization is the process of adjusting the inputs or characteristics of a device, mathe-
matical process, or experiment to find the minimum or maximum output or result. The
input consists of variables; the process or function is known as the cost function, objective
function, or fitness function; and the output is the cost or fitness.
Optimization refers to finding one or more feasible solutions which correspond to extreme
values of one or more objectives. The need for finding such optimal solutions in a problem
comes mostly from the purpose of either designing a solution for the minimum possible
cost of fabrication, or for the maximum possible reliability, or others. Because of such extreme
properties of optimal solutions, optimization methods are of great importance in practice[?].
Definition 1.1.1. Decision Variables: The decision variables are the numerical quantities
for which values are to be chosen in an optimization problem. These quantities are denoted
as xj , j = 1, 2, · · · , n. The vector X of n decision variables is represented by:
X = [x1 , x2 , · · · , xn ]T
Definition 1.1.2. Constraints: In most optimization problems there are always restric-
tions imposed by the particular characteristics of the environment or available resources
(e.g., physical limitations, time restrictions, etc.). These restrictions must be satisfied in
order to consider a certain solution acceptable. All these restrictions in general are called
constraints, and they describe dependence among decision variables and constants (or param-
eters) involved in the problem. These constraints are expressed in the form of mathematical
inequalities:
gi (x) ≤ 0, i = 1, · · · , m

or equalities:
hj (x) = 0, j = 1, · · · , p
The set of all n-tuples of real numbers, denoted by R^n, is called Euclidean n-space.
Two Euclidean spaces, the objective space and the decision variable space, are considered
in multi-objective optimization problems.

Given an n-dimensional decision variable vector x = [x_1, …, x_n] in the solution space X,
the task is to find a vector x∗ that minimizes a given set of K objective functions
z(x∗) = [z_1(x∗), …, z_K(x∗)]. The solution space X is generally restricted by a series of
constraints and bounds on the decision variables.
In many real-life problems with more than one criteria, objectives under consideration
conflict with each other. Hence, optimizing x with respect to a single objective often results
in unacceptable value with respect to other objectives. Therefore, a perfect multi-objective
solution that simultaneously optimizes each objective function is almost impossible. A rea-
sonable solution to a multi-objective problem is to investigate a set of solutions, each of
which satisfies the objectives at an acceptable level without being dominated by any other
solution.

• Dominance: If all objective functions are to be minimized, a feasible solution x is
said to dominate another feasible solution y (written x ≺ y) if and only if z_i(x) ≤ z_i(y)
for i = 1, …, K, and z_j(x) < z_j(y) for at least one index j.

• Pareto optimal: A solution is said to be Pareto optimal if it is not dominated by any
other solution in the solution space. A Pareto optimal solution cannot be improved
with respect to any objective without worsening at least one other objective. The set
of all feasible non-dominated solutions in X is referred to as the Pareto optimal set,
and for a given Pareto optimal set, the corresponding objective function values in the
objective space are called the Pareto front.

• Non-conflicting: If the objective functions are not conflicting with each other, the
cardinality of the Pareto-optimal set is one. This means that the minimum solution
corresponding to any objective function is the same.
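Since the dominance relation above is the basic building block of everything that follows, a small sketch may be useful. The Python function below is an illustration rather than part of the original project; it assumes every objective is to be minimized and that each solution is represented by its vector of objective values.

def dominates(z_x, z_y):
    """Return True if objective vector z_x dominates z_y (all objectives minimized):
    z_x is no worse than z_y everywhere and strictly better somewhere."""
    no_worse = all(a <= b for a, b in zip(z_x, z_y))
    strictly_better = any(a < b for a, b in zip(z_x, z_y))
    return no_worse and strictly_better

# (1, 2) dominates (2, 2); (1, 3) and (2, 2) are mutually non-dominated.
assert dominates((1, 2), (2, 2))
assert not dominates((1, 3), (2, 2)) and not dominates((2, 2), (1, 3))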

1.2 Single and multi-objective optimization


Single-objective optimization: When an optimization problem modeling a physical sys-
tem involves only one objective function, the task of finding the optimal solution is called
single-objective optimization[?].

Definition 1.2.1. (General Single-Objective Optimization Problem): A general
single-objective optimization problem is defined as minimizing (or maximizing) f(x) subject
to g_i(x) ≤ 0, i = 1, …, m, and h_j(x) = 0, j = 1, …, p, with x ∈ Ω. A solution minimizes
(or maximizes) the scalar f(x), where x is an n-dimensional decision variable vector
x = (x_1, …, x_n) from some universe Ω.

Observe that g_i(x) ≤ 0 and h_j(x) = 0 represent constraints that must be fulfilled while
optimizing (minimizing or maximizing) f(x), and that Ω contains all possible x that can be
used to satisfy an evaluation of f(x) and its constraints. Of course, x can be a vector of
continuous or discrete variables, and f can likewise be continuous or discrete.

Multi-objective optimization: When an optimization problem involves more than
one objective function, the task of finding one or more optimum solutions is known as multi-
objective optimization.
For multiple-objective problems the objectives are generally conflicting, preventing
simultaneous optimization of each objective; instead, one seeks a multi-dimensional
Pareto-optimal front. As in a single-objective optimization problem, the multi-objective
optimization problem may contain a number of constraints which any feasible solution
(including all optimal solutions) must satisfy.

Definition 1.2.2. (General Multi-objective Optimization Problem): A multi-objective
optimization problem is defined as minimizing (or maximizing) F(x) = (f_1(x), …, f_k(x))
subject to g_i(x) ≤ 0, i = 1, …, m, and h_j(x) = 0, j = 1, …, p, with x ∈ Ω. A multi-objective
optimization problem solution minimizes (or maximizes) the components of a vector F(x),
where x is an n-dimensional decision variable vector x = (x_1, …, x_n) from some universe Ω.

It is noted that g_i(x) ≤ 0 and h_j(x) = 0 represent constraints that must be fulfilled
while minimizing (or maximizing) F(x), and Ω contains all possible x that can be used to
satisfy an evaluation of F(x). The optimal solutions in multi-objective optimization can be
defined using the mathematical concept of partial ordering; in the parlance of multi-objective
optimization, the term domination is used for this purpose.
There are two general approaches to multiple-objective optimization. One is to combine
the individual objective functions into a single composite function. Determination of a single
objective is possible with methods such as utility theory, the weighted-sum method, etc., but
the problem lies in the correct selection of the weights or utility functions to characterize
the decision-maker's preferences.

The second general approach is to determine an entire Pareto optimal solution set or
a representative subset. A Pareto optimal set is a set of solutions that are non-dominated
with respect to each other. While moving from one Pareto solution to another, there is
always a certain amount of sacrifice in one objective to achieve a certain amount of gain in
the other. Pareto optimal solution sets are often preferred to single solutions because they
can be practical when considering real-life problems, since the final solution of the decision
maker is always a trade-off between crucial parameters.

1.2.1 Difference between single and multi-objective optimization


There are a number of fundamental differences between single-objective and multi-objective
optimization. These are[?]:

• Two goals instead of one: In single-objective optimization, there is one goal: the
search for an optimum solution. Although the search space may have a number of local
optimal solutions, the goal is always to find the global optimum solution. Indeed,
most single-objective optimization algorithms aim at finding one optimum solution,
even when there exist a number of optimal solutions.
However, in multi-objective optimization, there are clearly two goals. Progressing
towards the Pareto-optimal front is certainly an important goal, but maintaining
a diverse set of solutions in the non-dominated front is also essential. The achievement
of one goal does not necessarily imply the achievement of the other; explicit or implicit
mechanisms to emphasize both convergence near the Pareto-optimal front and the
maintenance of a diverse set of solutions must be introduced in an algorithm. Because
of these dual tasks, multi-objective optimization is more difficult than single-objective
optimization.

• Dealing with two search spaces: Another difficulty is that multi-objective
optimization involves two search spaces instead of one. In single-objective optimization,
there is only one search space, the decision variable space, and an algorithm works in this
space by accepting and rejecting solutions based on their objective function values.
Here, in addition to the decision variable space, there also exists the objective or
criterion space. Although these two spaces are related by a unique mapping between them,
the mapping is often nonlinear and the properties of the two search spaces are not
similar.

• No artificial fix-ups: Multi-objective optimization for finding multiple Pareto-optimal
solutions eliminates such artificial fix-ups and can, in principle, find a set of optimal
solutions corresponding to different weight vectors and ε-vectors. Its advantages include
the avoidance of multiple simulation runs, the absence of artificial fix-ups, and the
availability of efficient population-based optimization algorithms.

1.3 Objectives in multi-objective optimization algorithms


The ultimate goal of a multi-objective optimization algorithm is to identify solutions in the
Pareto optimal set. However, identifying the entire Pareto optimal set is, for many
multi-objective problems, practically impossible due to its size. In addition, for many problems,
especially combinatorial optimization problems, proof of solution optimality is computationally
infeasible. Therefore, a practical approach to multi-objective optimization is to
investigate a set of solutions (the best-known Pareto set) that represents the Pareto optimal
set as well as possible. With these concerns in mind, a multi-objective optimization
approach should achieve the following three conflicting goals:

• The best-known Pareto front should be as close as possible to the true Pareto front.
Ideally, the best-known Pareto set should be a subset of the Pareto optimal set.

• To find a set of solutions as diverse as possible. In addition to being converged close
to the Pareto-optimal front, the solutions must also be sparsely spaced in the
Pareto-optimal region. Only with a diverse set of solutions can we be assured of having a
good set of trade-off solutions among the objectives[?].

• In addition, the best-known Pareto front should capture the whole spectrum of the
Pareto front. This requires investigating solutions at the extreme ends of the objective
function space.

1.4 Evolutionary Algorithms
The potential of evolutionary algorithms for solving multiobjective optimization problems
was hinted at as early as the late 1960s by Rosenberg in his PhD thesis[?]. An evolutionary
algorithm is characterized by a population of solution candidates, and its reproduction
process enables the combination of existing solutions to generate new solutions. This enables
finding several members of the Pareto-optimal set in a single run, instead of performing a
series of separate runs, as is the case for some of the conventional stochastic processes.
Evolutionary algorithms are based on the principle of evolution, i.e. survival of the fittest.
Unlike classical methods, they do not use a single search point but a population of points
called individuals. Each individual represents a potential solution to the problem. In these
algorithms, the population evolves toward increasingly better regions of the search space by
undergoing stochastic transformations called recombination, mutation and selection.
Evolutionary algorithms have a number of components, procedures, or operators that
must be specified in order to define a particular evolutionary algorithm. The most
important components are:

a. Representation (definition of individuals): Objects forming possible solutions
within the original problem context are referred to as phenotypes, while their encodings,
that is, the individuals within the evolutionary algorithm, are called genotypes.
The first design step is commonly called representation, as it amounts to specifying a
mapping from the phenotypes onto a set of genotypes that are said to represent these
phenotypes. A solution (a good phenotype) is obtained by decoding the best genotype
after termination. To this end it should hold that the (optimal) solution to the problem
at hand (a phenotype) is represented in the given genotype space.
b. Evaluation function (or fitness function): Typically, this function is composed of a
quality measure in the phenotype space and the inverse representation. The evaluation
function is commonly called the fitness function in evolutionary algorithms.

c. Population: The role of the population is to hold (the representation of) possible
solutions. Given a representation, defining a population can be as simple as specifying
how many individuals are in it, that is, its size. In almost all evolutionary algorithm
applications the population size is constant and does not change during the evolutionary
search. The diversity of a population is a measure of the number of different solutions
present; no single measure of diversity exists.
d. Parent selection mechanism: Parent selection chooses the individuals that will
undergo recombination and mutation to become parents of the next generation;
selection effectively gives an individual with a higher fitness value a higher probability of
contributing one or more children to the succeeding generation. The role of parent
selection, or mating selection, is to distinguish among individuals based on their quality
and, in particular, to allow the better individuals to become parents of the next
generation. In evolutionary algorithms, parent selection is typically probabilistic: high-quality
individuals get a higher chance to become parents than those with low quality.
Nevertheless, low-quality individuals are often given a small, but positive, chance; otherwise
the whole search could become too greedy and get stuck in a local optimum.

e. Recombination and mutation: The individuals chosen by selection recombine with
each other, and new individuals are created. The aim is to obtain offspring that
inherit the best possible combination of the characteristics (genes) of their parents.
By means of random changes to some of the genes, it is guaranteed that even if none of
the individuals contains the necessary gene value for the extremum, it is still possible
to reach the extremum.

f. Survivor selection mechanism (replacement): From all individuals in the current
population, those who will continue are chosen, and by means of crossover and
mutation they produce the offspring population. The best n individuals are directly
transferred to the next generation.

Figure 1.1: Flowchart of an evolutionary algorithm iteration. Adapted from [?]

1.4.1 Genetic Algorithms


The genetic algorithm (GA), as the name implies, mimics the biological evolution process to
find the fittest set of parameters. This application of the biological process, named the
genetic algorithm, was developed by Holland and his colleagues in the 1960s and 1970s[?].
A GA searches the solution space in multiple and random directions. It is observed to be an
effective algorithm for highly nonlinear solution spaces, since it is not typically trapped in
local optima. GAs differ from most optimization techniques in their global search from a
population of solutions rather than from one single solution.
GA Operators: The simplest form of genetic algorithm involves three types of operators:
selection, crossover and mutation.

i Selection: Select a new population on the basis of the assigned fitness.

ii Crossover: Make pairs randomly and apply crossover to each pair according to a given
crossover rate (probability) to create two offspring.

iii Mutation: Mutate each individual according to a given mutation rate (probability).

Evolutionary algorithms use the three main principles of natural evolution: reproduction,
natural selection, and diversity of the species, maintained by the differences of each
generation from the previous one.
Genetic algorithms work with a set of individuals representing possible solutions of the
task. The selection principle is applied by using a criterion that evaluates each individual
with respect to the desired solution. The best-suited individuals create the next generation.

Figure 1.2: Flowchart of the working principle of a GA. Adapted from [?]
In its general form, a genetic algorithm (GA) works through the following steps:

(1) Creation of a random initial population of Np potential solutions to the problem and
evaluation of these individuals in terms of their fitness, i.e. of their corresponding
objective function values;

(2) Check for termination of the algorithm. As in most optimization algorithms, it is
possible to stop the genetic optimization by:

– Value of the function: the value of the function of the best individual is within a
defined range around a set value. It is not recommended to use this criterion
alone because, owing to the stochastic element in the search procedure, the
optimization might not finish within a sensible time;

– Maximal number of iterations: this is the most widely used stopping criterion. It
guarantees that the algorithm will give some result within some time, whether or
not it has reached the extremum;

– Stall generations: if within an initially set number of iterations (generations) there
is no improvement in the value of the fitness function of the best individual, the
algorithm stops.

(3) Selection of a pair of individuals as parents;

(4) Crossover of the parents, with generation of two children;

(5) Genetic mutation;

(6) Replacement in the population, so as to maintain the population number Np constant.
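As an illustration of steps (1)-(6), the following is a minimal, self-contained sketch of a generational binary GA in Python. The fitness function (the 'counting ones' toy problem), the population size and the operator rates are assumptions made for this example, not values taken from the project.

import random

def run_ga(fitness, n_bits=20, pop_size=30, crossover_rate=0.9,
           mutation_rate=0.01, max_generations=100):
    """Minimal binary GA: tournament selection, one-point crossover,
    bitwise mutation; stops after a fixed number of generations."""
    # step (1): random initial population of pop_size individuals
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(max_generations):                    # step (2): stopping criterion
        new_pop = []
        while len(new_pop) < pop_size:
            # step (3): binary tournament selection of two parents
            p1, p2 = (max(random.sample(pop, 2), key=fitness) for _ in range(2))
            c1, c2 = p1[:], p2[:]
            if random.random() < crossover_rate:        # step (4): one-point crossover
                cut = random.randrange(1, n_bits)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (c1, c2):                      # step (5): bitwise mutation
                for i in range(n_bits):
                    if random.random() < mutation_rate:
                        child[i] = 1 - child[i]
                new_pop.append(child)
        pop = new_pop[:pop_size]                        # step (6): keep Np constant
    return max(pop, key=fitness)

best = run_ga(fitness=sum)   # 'counting ones': fitness is the number of 1-bits
print(sum(best), "ones in the best individual")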

Chapter 2

Multi-objective optimization

2.1 Multi-objective optimization Formulation


A multi-objective optimization problem has a number of objective functions which are to
be minimized or maximized. As in the single-objective optimization problem, here too the
problem usually has a number of constraints which any feasible solution (including the
optimal solution) must satisfy. The general form of a multi-objective optimization problem
(MOOP) is[?]:
Minimize/Maximize f_m(x), m = 1, 2, …, M;          (2.1)

subject to g_j(x) ≥ 0, j = 1, 2, …, J;
           h_k(x) = 0, k = 1, 2, …, K;
           x_i^(L) ≤ x_i ≤ x_i^(U), i = 1, 2, …, n.

A solution x is a vector of n decision variables: x = (x_1, …, x_n). The last set of
constraints are called variable bounds, restricting each decision variable x_i to take a value
within a lower bound x_i^(L) and an upper bound x_i^(U). These bounds constitute the
decision space.
Associated with the problem are J inequality and K equality constraints, and the terms
g_j(x) and h_k(x) are called constraint functions. A solution x that does not satisfy all of the
(J + K) constraints and all of the 2n variable bounds stated above is called an infeasible
solution. On the other hand, if a solution x satisfies all constraints and variable bounds,
it is known as a feasible solution.
There are M objective functions f(x) = (f_1(x), …, f_M(x))^T considered in the above
formulation, each of which can be either minimized or maximized. The duality principle,
in the context of optimization, suggests that we can convert a maximization problem
into a minimization one by multiplying the objective function by −1[?].
For each solution x in the decision variable space (i.e., the space of which the feasible set X
is a subset), there exists a point in the objective space, denoted by
f(x) = z = (z_1, z_2, …, z_M)^T. The mapping takes place between an n-dimensional solution
vector and an M-dimensional objective vector. Multi-objective optimization is sometimes
referred to as vector optimization, because a vector of objectives, instead of a single objective,
is optimized[?].
Linear and Nonlinear Multi-objective Optimization Problems: If all objective
functions and constraint functions are linear, the resulting multi-objective optimization
problem is called a multi-objective linear program (MOLP). Like linear programming
problems, MOLPs have many theoretical properties. However, if any of the objective or
constraint functions is nonlinear, the resulting problem is called a nonlinear multi-objective
problem[?].
2.1.1 Convex and Non-convex Multi-Objective Optimization Problems
Before we discuss convex multi-objective optimization problems, let us first define a convex
set and a convex function:
Definition 2.1.1. A subset S of R^n is said to be convex if for any pair of elements
x_1, x_2 ∈ S, the following condition is true:

λx_1 + (1 − λ)x_2 ∈ S

for all 0 ≤ λ ≤ 1.
Definition 2.1.2. A function f : R^n → R is a convex function if for any pair of points
x_1, x_2 ∈ R^n, the following condition is true:

f(λx_1 + (1 − λ)x_2) ≤ λf(x_1) + (1 − λ)f(x_2)          (2.2)

for all 0 ≤ λ ≤ 1.
The above definition gives rise to the following properties of a convex function:
• The linear approximation of f(x) at any point in the interval [x_1, x_2] always
underestimates the actual function value.

• The Hessian matrix of f(x) is positive semidefinite for all x.

• For a convex function, a local minimum is always a global minimum.

A function satisfying the inequality shown in (2.2) with a '≥' sign instead of a '≤' sign is
called a non-convex function. To test whether a function is convex within an interval, the Hessian
matrix ∇²f is calculated and checked for its positive-definiteness at all points in the interval.
One of the ways to check the positive-definiteness of a matrix is to compute the eigenvalues of
the matrix and check whether all eigenvalues are positive. To test whether a function is non-convex
in an interval, the Hessian matrix −∇²f is checked for its positive-definiteness[?].
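As a sketch of this eigenvalue test (assuming numpy is available; the test function, the point and the finite-difference step are illustrative choices, not from the source):

import numpy as np

def hessian(f, x, h=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * h, np.eye(n)[j] * h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h * h)
    return H

def is_positive_definite(H):
    """All eigenvalues positive <=> H is positive definite."""
    return bool(np.all(np.linalg.eigvalsh(H) > 0))

f = lambda x: x[0] ** 2 + 2 * x[1] ** 2    # a convex quadratic with Hessian diag(2, 4)
print(is_positive_definite(hessian(f, np.array([1.0, -1.0]))))   # True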
Definition 2.1.3. A multi-objective optimization problem is convex if all its objective func-
tions are convex and the feasible region is convex[?].
Proposition 2.1. Let X be a convex set in R^n and f(x) = (f_1(x), …, f_k(x)), f : R^n → R^k.
Then f is R^k_+-convex if and only if the functions f_i are convex for all i = 1, 2, …, k.

Proof. (⇒): Let f be R^k_+-convex and let x, y ∈ X and λ ∈ [0, 1]. Then

λf(x) + (1 − λ)f(y) − f(λx + (1 − λ)y) ∈ R^k_+.

Writing this componentwise,

[λf_1(x) + (1 − λ)f_1(y) − f_1(λx + (1 − λ)y), …, λf_k(x) + (1 − λ)f_k(y) − f_k(λx + (1 − λ)y)] ∈ R^k_+,

so each component is nonnegative; that is, each f_i is convex.
(⇐): Let f_i be convex for all i = 1, …, k. Then, for all λ ∈ [0, 1],

λf_i(x) + (1 − λ)f_i(y) − f_i(λx + (1 − λ)y) ∈ R_+, for all i = 1, …, k.

Collecting the components,

λf(x) + (1 − λ)f(y) − f(λx + (1 − λ)y) ∈ R^k_+,

so f is an R^k_+-convex function.

2.2 Ordering Relations and Dominance Structure


First we introduce binary relations and some of their properties to define several classes of
orders. That is, let S be any set. A binary relation on S is a subset R of S × S. Some
properties of binary relations are[?]:

Definition 2.2.1. A binary relation R on S is called

• reflexive if (x, x) ∈ R for all x ∈ S,

• irreflexive if (x, x) ∉ R for all x ∈ S,

• symmetric if (x, y) ∈ R ⇒ (y, x) ∈ R for all x, y ∈ S,

• asymmetric if (x, y) ∈ R ⇒ (y, x) ∉ R for all x, y ∈ S,

• antisymmetric if (x, y) ∈ R and (y, x) ∈ R ⇒ x = y for all x, y ∈ S,

• transitive if (x, y) ∈ R and (y, z) ∈ R ⇒ (x, z) ∈ R for all x, y, z ∈ S,

• negatively transitive if (x, y) ∉ R and (y, z) ∉ R ⇒ (x, z) ∉ R for all x, y, z ∈ S,

• connected if (x, y) ∈ R or (y, x) ∈ R for all x, y ∈ S with x ≠ y,

• strongly connected (or total) if (x, y) ∈ R or (y, x) ∈ R for all x, y ∈ S.
Definition 2.2.2. A binary relation R on a set S is
• an equivalence relation if it is reflexive, symmetric, and transitive,

• a preorder (quasi-order) if it is reflexive and transitive.

Instead of (x, y) ∈ R we shall also write xRy. In the case of R being a preorder, the
pair (S, R) is called a preordered set. In the context of (pre)orders yet another notation for
the relation R is convenient: we shall write x ≼ y as shorthand for (x, y) ∈ R and x ⋠ y
for (x, y) ∉ R, and indiscriminately refer to the relation R or the relation ≼. This notation
can be read as "preferred to". From it, ≺ and ∼ can be seen as the strict preference and
equivalence (or indifference) relation, respectively.
Definition 2.2.3. A binary relation ≼ is called
• a partial order if it is reflexive, transitive and antisymmetric,

• a strict partial order if it is asymmetric and transitive (or, equivalently, if it is irreflexive
and transitive).

Let x and y be vectors in R^n. Some of the frequently used orders on R^n are:

Notation   | Definition                                                          | Name
x ≦ y      | x_k ≤ y_k, k = 1, …, n                                              | weak componentwise order
x ≤ y      | x_k ≤ y_k, k = 1, …, n; x ≠ y                                       | componentwise order
x < y      | x_k < y_k, k = 1, …, n                                              | strict componentwise order
x <_lex y  | x_{k*} < y_{k*} for the smallest k* with x_{k*} ≠ y_{k*}, or x = y  | lexicographic order

Table 2.1: Some orders on R^n
Definition 2.2.4. A subset C ⊆ R^p is called a cone if αd ∈ C for all d ∈ C and for all
α ∈ R, α > 0.
Definition 2.2.5. A cone C in R^p is called
• nontrivial or proper if C ≠ ∅ and C ≠ R^p,

• convex if αd_1 + (1 − α)d_2 ∈ C for all d_1, d_2 ∈ C and for all 0 < α < 1,

• pointed if for d ∈ C, d ≠ 0, we have −d ∉ C, i.e., C ∩ (−C) ⊆ {0}.

Due to the definition of a cone, C is convex if for all d_1, d_2 ∈ C we have d_1 + d_2 ∈ C,
too: αd_1 ∈ C and (1 − α)d_2 ∈ C because C is a cone.
A preference order: A preference order represents a preference attitude of the decision
maker in the objective space. It is a binary relation on a set Y = f(X) ⊂ R^P, where f is the
vector-valued objective function and X the feasible decision set[?]. The basic binary relation
≻ means strict preference; that is, y ≻ z for y, z ∈ Y implies that the result (objective value)
y is preferred to z. From this, we may define two other binary relations ∼ and ≽ as

y ∼ z if and only if not y ≻ z and not z ≻ y,

y ≽ z if and only if y ≻ z or y ∼ z.

The relation ∼ is called indifference (read y ∼ z as y is indifferent to z), and ≽ is called
preference-indifference (read y ≽ z as z is not preferred to y).

Definition 2.2.6. (Preference Function)
Given the preference order ≻ on a set Y, a real-valued function u on Y such that

u(y) > u(z) if and only if y ≻ z for all y, z ∈ Y

is called a preference function.

Remark 2.1. Note the following relationships:

y ∼ z ⇐⇒ not y ≻ z and not z ≻ y ⇐⇒ u(y) = u(z),

y ≽ z ⇐⇒ y ≻ z or y ∼ z ⇐⇒ u(y) ≥ u(z).

Definition 2.2.7. (Efficiency)
Let Y be a feasible set in the objective space R^P, and let ≻ be a preference order on Y.
Then an element y′ ∈ Y is said to be an efficient (noninferior) element of Y with respect to
the order ≻ if there does not exist an element y ∈ Y such that y ≻ y′. The set of all efficient
elements is denoted by ξ(Y, ≻). That is,

ξ(Y, ≻) = {y′ ∈ Y | ∄ y ∈ Y : y ≻ y′}.

Domination Structures: Preference orders (and more generally, binary relations)
on a set Y can be represented by a point-to-set map from Y to Y. In fact, a binary
relation may be considered to be a subset of the product set Y × Y, and so it can be
regarded as the graph of a point-to-set map from Y to Y. Namely, we identify the preference
order ≻ with the graph of the point-to-set map P[?]:

P(y) = {y′ ∈ Y : y ≻ y′}

[P(y) is the set of elements in Y less preferred to y].

Another way of representing preference orders by point-to-set maps is the concept of
domination structures. For the preference ≻ and for y ∈ Y ⊂ R^P, the set

D⁺(y) := {d ∈ R^P : y + d ≻ y}

is called the domination set for y, and

D⁻(y) := {d ∈ R^P : y ≻ y + d}

is called the dominated set for y. In addition,

I(y) := {d ∈ R^P : y ∼ y + d}

is called the indifference set for y.


In our multiobjective optimization problem(MOOP) in which each objective function is to
be minimized, the domination set for Pareto order 6 is given by

D+ (y) = RP− {0}

for any y ∈ RP .
For each y ∈ Y ⊂ RP , we define the set of domination factors

D(y) := {d ∈ RP : y  y + d} ∪ {0}.

This means that deviation d ∈ D(y) from y is less preferred to the original y. Then the
point-to-set map D from Y to RP clearly represents the given preference order. We call D
the domination structure.

Definition 2.2.8. Given a set Y in R^P and a domination structure D(·), the set of efficient
elements is defined by

ξ(Y, D) = {y ∈ Y | ∄ y′ ∈ Y : y ∈ y′ + D(y′) \ {0}}.

This set ξ(Y, D) is called the efficient set with respect to the domination structure D.

The most important and interesting special case of domination structures is when
D(·) is a constant point-to-set map, particularly when D(y) is a constant cone for all y.
When D(y) = D (a cone), the domination structure D(·) satisfies

• asymmetry ⇐⇒ [d ∈ D, d ≠ 0 =⇒ −d ∉ D],

• transitivity ⇐⇒ [d, d′ ∈ D =⇒ d + d′ ∈ D].

Pointed convex cones are often used for defining domination structures. We usually write
y ≦_D y′ for y, y′ ∈ R^p if and only if y′ − y ∈ D for a convex cone D in R^p. Also, y ≤_D y′
means that y′ − y ∈ D but y − y′ ∉ D. When D is pointed, y ≤_D y′ if and only if
y′ − y ∈ D \ {0}. When D = R^P_+, the subscript is omitted, as in ≦ or ≤. In other words,

y ≦ y′ if and only if y_i ≤ y′_i for all i = 1, …, p;

y ≤ y′ if and only if y ≦ y′ and y ≠ y′,
i.e.,
y_i ≤ y′_i for all i = 1, …, p, and
y_i < y′_i for some i ∈ {1, …, p}.
2.3 Solution Concept
The concept of optimal solutions to multi-objective optimization problems is not trivial and
is in itself debatable. It is closely related to the preference attitudes of the decision maker. The
most fundamental solution concept is that of efficient solutions (also called non-dominated
solutions or non-inferior solutions) with respect to the domination structure of the decision
maker[?].
We consider the multi-objective optimization problem (P): minimize

f(x) = (f_1(x), f_2(x), …, f_p(x))

subject to x ∈ X ⊂ R^n. Let Y = f(X) = {y : y = f(x), x ∈ X}.


A domination structure representing a preference attitude of the decision maker is supposed
to be given as a point-to-set map D from Y into R^p.

Definition 2.3.1. (Efficient solution) A point x∗ ∈ X is said to be an efficient solution
to the multi-objective optimization problem (P) with respect to the domination structure
D if f(x∗) ∈ ξ(Y, D); that is, if there is no x ∈ X such that f(x∗) ∈ f(x) + D(f(x)) and
f(x) ≠ f(x∗) (i.e., such that f(x∗) ∈ f(x) + D(f(x)) \ {0}).

Proposition 2.2. Given two domination structures D_1 and D_2, D_1 is said to be included
in D_2 if
D_1(y) ⊂ D_2(y)
for all y ∈ Y. In this case, ξ(Y, D_1) ⊃ ξ(Y, D_2).

Proof. Let y ∈ ξ(Y, D_2). Then there is no y′ ∈ Y such that y ∈ y′ + D_2(y′) \ {0}.
Since D_1(y′) ⊂ D_2(y′) for every y′, there is also no y′ ∈ Y with y ∈ y′ + D_1(y′) \ {0}.
This implies that y ∈ ξ(Y, D_1). Therefore, ξ(Y, D_1) ⊃ ξ(Y, D_2).

Proposition 2.3. Let D be a nonempty cone containing 0. Then

ξ(Y, D) ⊃ ξ(Y + D, D),

with equality holding if D is pointed and convex.

Proof. The result is trivial if Y is empty, so we assume otherwise. First suppose
y ∈ ξ(Y + D, D) but y ∉ ξ(Y, D). If y ∉ Y, there exist y′ ∈ Y and a nonzero d ∈ D such that
y = y′ + d. Since 0 ∈ D, we have Y ⊂ Y + D; hence y ∉ ξ(Y + D, D), which is a contradiction. If
y ∈ Y, we directly have a similar contradiction.
Next suppose that D is pointed and convex, and that y ∈ ξ(Y, D) but y ∉ ξ(Y + D, D). Then there
exists y′ ∈ Y + D with y − y′ = d′ ∈ D \ {0}, and y′ = y″ + d″ with y″ ∈ Y, d″ ∈ D.
Hence y = y″ + (d′ + d″) and d′ + d″ ∈ D, since D is a convex cone. Since D is pointed,
d′ + d″ ≠ 0, and so y ∉ ξ(Y, D), which leads to a contradiction. This completes the proof of
the proposition.
Proposition 2.4. Let Y_1 and Y_2 be two sets in R^p, and let D be a constant domination
structure on R^p (a constant cone). Then
ξ(Y_1 + Y_2, D) ⊂ ξ(Y_1, D) + ξ(Y_2, D).
Proof. Let y∗ ∈ ξ(Y_1 + Y_2, D). Then y∗ = y¹ + y² for some y¹ ∈ Y_1 and y² ∈ Y_2. We show
that y¹ ∈ ξ(Y_1, D). If we suppose the contrary, then there exist y ∈ Y_1 and a nonzero d ∈ D
such that y¹ = y + d. Then y∗ = y¹ + y² = y + y² + d and y + y² ∈ Y_1 + Y_2, which contradicts
the assumption y∗ ∈ ξ(Y_1 + Y_2, D). Similarly we can prove that y² ∈ ξ(Y_2, D). Therefore,
y∗ ∈ ξ(Y_1, D) + ξ(Y_2, D).

2.4 Dominance and Pareto Optimality


Most multi-objective optimization algorithms use the concept of dominance in their search.
We first define some special solutions which are often used in multi-objective optimization
algorithms[?].
Ideal objective vector: For each of the M conflicting objectives, there exists one
different optimal solution. An objective vector constructed from these individual optimal
objective values constitutes the ideal objective vector[?].
Definition 2.4.1. The mth component of the ideal objective vector z∗ is the constrained
minimum of the following problem:
minimize f_m(x)
subject to x ∈ S.
Thus, if the minimum solution for the mth objective function is the decision vector x∗^(m)
with function value f_m∗, the ideal vector is

z∗ = f∗ = (f_1∗, f_2∗, …, f_M∗).

Nadir objective vector: The nadir objective vector, z^nad, represents the upper bound
of each objective in the entire Pareto-optimal set, and not in the entire search space.
In order to normalize each objective over the entire range of the Pareto-optimal region, the
knowledge of the nadir and ideal objective vectors can be used as follows:

f_i^norm = (f_i − z_i^*) / (z_i^nad − z_i^*).
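A small sketch of this normalization in Python (the ideal and nadir vectors here are assumed known for illustration; in practice they must be estimated):

import numpy as np

z_ideal = np.array([0.0, 1.0])     # assumed ideal objective vector z*
z_nadir = np.array([4.0, 5.0])     # assumed nadir objective vector z^nad

def normalize(f):
    """Map each objective value into [0, 1] over the Pareto-optimal region."""
    return (f - z_ideal) / (z_nadir - z_ideal)

print(normalize(np.array([2.0, 3.0])))    # [0.5 0.5]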
Definition 2.4.2. A solution x1 is said to dominate another solution x2 if both conditions
1 and 2 are true:
1. The solution x1 is no worse than x2 in all objectives,
2. The solution x1 is strictly better than x2 in at least one objective.

If either of the above conditions is violated, the solution x1 does not dominate the solution x2.
2.4.1 Pareto Optimality
For a given finite set of solutions, we can perform all possible pair-wise comparisons and
find which solution dominates which, and which solutions are non-dominated with respect
to each other. At the end, we expect to have a set of solutions, no two of which dominate
each other.
Among a set of solutions P, the non-dominated set of solutions P′ consists of those that are
not dominated by any member of the set P. When the set P is the entire search space, the
resulting non-dominated set P′ is called the Pareto-optimal set[?].
Definition 2.4.3.

• The non-dominated set of the entire feasible search space S is the globally Pareto-optimal
set.

• A solution x1 strongly dominates a solution x2 if the solution x1 is strictly better
than solution x2 in all M objectives.

• Among a set of solutions P, the weakly non-dominated set of solutions P′ consists of
those that are not strongly dominated by any other member of the set P.

• A decision vector x∗ ∈ S is properly Pareto-optimal if it is Pareto-optimal and there
exists some real number M > 0 such that for each f_i and each x ∈ S satisfying
f_i(x) < f_i(x∗), there exists at least one f_j such that f_j(x∗) < f_j(x) and

(f_i(x∗) − f_i(x)) / (f_j(x) − f_j(x∗)) ≤ M.

2.5 Procedures for Finding Non-dominated Set


Finding the non-dominated set of solutions from a given set of solutions is similar in principle
to finding the minimum of a set of real numbers. In the case of finding the non-dominated set,
the dominance relation ≺ can be used to identify the better of two given solutions.
For finding the non-dominated set from a given population of solutions there are many
approaches; three of them are described below.
Approach 1: Naive and slow[?]: In this approach, each solution x_i is compared with
every other solution in the population to check if it is dominated by any solution in the
population. If the solution x_i is found to be dominated by any solution, there exists at
least one solution in the population which is better than x_i in all objectives, and hence the
solution x_i cannot belong to the non-dominated set. However, if no solution is found to
dominate solution x_i, it is a member of the non-dominated set. In this way every solution
in the population can be checked to see whether it belongs to the non-dominated set. The
following procedure describes, step by step, how to find the non-dominated set in a given
set P of size N.
Step 1 Set the solution counter i = 1 and create an empty non-dominated set P′.

Step 2 For a solution x_j ∈ P (with j ≠ i), check if solution x_j dominates solution x_i. If yes,
go to Step 4.

Step 3 If more solutions are left in P, increment j by one and go to Step 2; otherwise, set
P′ = P′ ∪ {x_i}.

Step 4 Increment i by one. If i ≤ N, go to Step 2; otherwise, stop and declare P′ as the
non-dominated set.
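A direct sketch of Approach 1 in Python; the dominates helper from Chapter 1 is repeated here, and the sample objective vectors are illustrative assumptions (both objectives minimized):

def dominates(a, b):
    """a dominates b when every objective is minimized."""
    return all(p <= q for p, q in zip(a, b)) and any(p < q for p, q in zip(a, b))

def naive_non_dominated(P):
    """Approach 1: keep x_i unless some other member of P dominates it."""
    return [x_i for i, x_i in enumerate(P)
            if not any(dominates(x_j, x_i) for j, x_j in enumerate(P) if j != i)]

P = [(4, 4), (5, 5), (2, 3), (6, 1), (5, 0)]   # illustrative objective vectors
print(naive_non_dominated(P))                   # [(2, 3), (5, 0)]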

Approach 2: Continuously updated[?]: In this approach, every solution from the
population is checked against a partially filled population for domination. To start with, the first
solution from the population is kept in an initially empty set P′. Thereafter, each solution x_i (from
the second solution onwards) is compared with all members of the set P′, one by one. If the
solution x_i dominates any member of P′, then that solution is removed from P′; in this way
non-members of the non-dominated set get deleted from P′. Otherwise, if solution x_i
is dominated by any member of P′, the solution x_i is ignored. If solution x_i is not dominated
by any member of P′, it is entered in P′. This is how the set P′ grows with non-dominated
solutions. When all solutions of the population have been checked, the remaining members of P′
constitute the non-dominated set. The procedure is:
Step 1 Initialize P′ = {x_1}. Set the solution counter i = 2.

Step 2 Set j = 1.

Step 3 Compare solution x_i with the jth member x_j of P′ for domination.

Step 4 If x_i dominates the jth member of P′, delete that member from P′, i.e., update
P′ = P′ \ {P′^(j)}. If j < |P′|, increment j by one and then go to Step 3. Otherwise,
go to Step 5. Alternatively, if the jth member of P′ dominates x_i, increment i by one
and then go to Step 2.

Step 5 Insert x_i in P′, i.e., update P′ = P′ ∪ {x_i}. If i < N, increment i by one and go to
Step 2. Otherwise, stop and declare P′ as the non-dominated set.
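A sketch of Approach 2, reusing the dominates helper and the sample population P from the Approach 1 sketch above:

def continuously_updated(P):
    """Approach 2: maintain a running non-dominated archive P'."""
    archive = [P[0]]                                        # Step 1
    for x_i in P[1:]:
        # Step 4: drop archive members dominated by x_i
        archive = [y for y in archive if not dominates(x_i, y)]
        # Steps 4-5: insert x_i unless a remaining member dominates it
        if not any(dominates(y, x_i) for y in archive):
            archive.append(x_i)
    return archive

print(continuously_updated(P))    # the same non-dominated set: [(2, 3), (5, 0)]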

Approach 3: Kung et al.'s efficient method[?]: This approach first sorts the population
in descending order of importance of the first objective function value.
Thereafter, the population is recursively halved into top (T) and bottom (B) subpopulations.
Knowing that the top half of the population is better in terms of the first objective function,
the bottom half is then checked for domination against the top half. The solutions of B that
are not dominated by any member of T are combined with the members of T to form a merged
population M. The merging and the domination check start with the innermost case (when
there is only one member left in either T or B in the recursive divisions of the population) and
then proceed in a bottom-up fashion. The procedure of this approach is:

Step 1 Sort the population in descending order of importance of the first objective function
value and rename the population as P of size N.
Step 2 Front(P): If |P| = 1, return P as the output of Front(P). Otherwise, set
T = Front(P^(1), …, P^(|P|/2)) and B = Front(P^(|P|/2+1), …, P^(|P|)). For each ith
solution of B that is not dominated by any solution of T, create the merged set
M = T ∪ {x_i}. Return M as the output of Front(P).
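A recursive sketch of Kung et al.'s Front procedure, again reusing the dominates helper and the population P above. For a minimization problem the population is sorted here in ascending order of the first objective (the objective playing the 'most important' role in the description above), so that no member of B can be better than a member of T in that objective:

def front(P):
    """Kung et al.: P must already be sorted by the first objective."""
    if len(P) == 1:
        return P
    half = len(P) // 2
    T = front(P[:half])    # top half: better in objective 1
    B = front(P[half:])    # bottom half
    # merge: keep the members of B not dominated by any member of T
    return T + [b for b in B if not any(dominates(t, b) for t in T)]

def kung_non_dominated(P):
    return front(sorted(P))    # ascending sort by the first objective

print(kung_non_dominated(P))   # [(2, 3), (5, 0)]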

Example 2.1. Let us consider a two-objective optimization problem with five different
solutions shown in the objective space, as illustrated in Figure 2.1. Let us also assume
that objective function 1 is to be maximized while objective function 2 is to be minimized.
Five solutions with different objective function values are shown in this figure. We illustrate
the working principle of the above-stated approaches on this same set of five (N = 5)
solutions. Ideally, the exact objective vector for each solution would be used in executing
the procedures, but here we use the figure to compare different solutions. We follow the
procedures step by step in the following.

Figure 2.1: A population of five solutions. Adapted from [?]

Identifying the non-dominated set: Approach 1.

step 1 We set i = 1 and P′ = ∅.

step 2 We compare solution 1 with all other solutions for domination, starting from solution
2. We observe that solution 2 does not dominate solution 1; on the contrary, solution 1
is better than solution 2 in objective function 1 and also in objective function 2, so
both conditions for solution 1 dominating solution 2 are satisfied.

step 3 However, solution 3 dominates solution 1. Thus, we move to step 4.

step 4 Solution 1 does not belong to the non-dominated set; we increment i to 2 and move
to step 2 to check the fate of solution 2.

step 2 We observe that solution 1 dominates solution 2. We therefore move to step 4.

step 4 Thus, solution 2 does not belong to the non-dominated set either. Next, we check solution 3.

steps 2 and 3 Starting from solution 1, we observe that neither solution 1 nor solution 2
dominates solution 3. In fact, solutions 4 and 5 also do not dominate solution 3. Thus, we
include solution 3 in the non-dominated set: P′ = {3}.

step 4 We now check solution 4.

step 2 Solution 5 dominates solution 4. Thus, solution 4 cannot be a member of P′.

step 4 Now we check the final solution (solution 5).

step 2 We observe that none of the solutions (1 to 4) dominates solution 5.

step 3 So, solution 5 also belongs to the non-dominated set. Thus, we update P′ = {3, 5}.

step 4 We have now considered all five solutions, and the non-dominated set is P′ = {3, 5}.

Identifying the non-dominated set: Approach 2.

step 1 P′ = {1} and we set i = 2.

step 2 We set the solution counter of P′ to j = 1 (which refers to solution 1).

step 3 We now compare solution 2 (i = 2) with the lone member of P′ (solution 1) for
domination. We observe that solution 1 dominates solution 2. Since the jth member of P′
dominates solution i, we increment i to 3 and go to step 2. This means that solution 2
does not belong to the non-dominated set.

step 2 The set P′ still has solution 1 (j = 1) only.

step 3 Now we compare solution 3 with solution 1. We observe that solution 3 (i = 3)
dominates solution 1. Thus, we delete the jth (or first) member from P′ and update
P′ = ∅, so |P′| = 0. This shows how a non-member of the non-dominated set gets
deleted from P′. We now move to step 5.

step 5 We insert i = 3 in P′, i.e., update P′ = {3}. Since i < 5 here, we increment i to 4 and
move to step 2.

step 2 We set j = 1, which refers to the lone element (solution 3) of P′.

step 3 By comparing solution 4 with solution 3, we observe that solution 3 dominates solution
4. Thus, we increment i to 5 and move to step 2.

step 2 We still have solution 3 in P′.

step 3 Now we compare solution 5 with solution 3. We observe that neither of them dominates
the other. Thus, we move to step 5.

step 5 We insert solution 5 in P′ and declare P′ = {3, 5} as the non-dominated set.

2.5.1 Non-Dominated Sorting of a Population


Most evolutionary multi-objective optimization algorithms require us to find the best
non-dominated front in a population; these algorithms classify the population into two sets,
the non-dominated set and the remaining dominated set. In some such algorithms, however,
the whole population needs to be sorted according to ascending levels of non-domination.
The procedure is:

Step 1 Set all non-dominated sets P_j (j = 1, 2, …) as empty sets. Set the non-domination
level counter j = 1.

Step 2 Use any one of Approaches 1 to 3 to find the non-dominated set P′ of the population P.

Step 3 Update P_j = P′ and P = P \ P′.

Step 4 If P ≠ ∅, increment j by one and go to Step 2. Otherwise, stop and declare all the
non-dominated sets P_i, for i = 1, 2, …, j.
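A sketch of this sorting loop, reusing naive_non_dominated and the sample population P from the earlier sketches (any of Approaches 1 to 3 could be substituted in Step 2):

def non_dominated_sort(P):
    """Repeatedly peel off the current best non-dominated front (Steps 1-4)."""
    fronts, remaining = [], list(P)
    while remaining:                                             # Step 4 loop
        front_j = naive_non_dominated(remaining)                 # Step 2
        fronts.append(front_j)                                   # Step 3: P_j = P'
        remaining = [x for x in remaining if x not in front_j]   # P = P \ P'
    return fronts

print(non_dominated_sort(P))
# [[(2, 3), (5, 0)], [(4, 4), (6, 1)], [(5, 5)]]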

Chapter 3

Classical Methods and Genetic Algorithm Operators

3.1 Classical Methods


The task of finding multiple Pareto-optimal solutions is achieved by executing many
independent single-objective optimizations, each time finding a single Pareto-optimal solution. A
parametric scalarizing approach (such as the weighted-sum approach, the ε-constraint approach,
and others) can be used to convert the multiple objectives into a parametric single-objective
function. By simply varying the parameters (the weight vector or the ε-vector) and optimizing
the scalarized function, different Pareto-optimal solutions can be found[?].

3.1.1 ε-Constraint Method


Besides the scalarization approach, one of the solution techniques for multi-objective
optimization is the ε-constraint method proposed by Chankong and Haimes in 1983. The
ε-constraint technique is based upon selecting a primary objective function and then bounding
the others with separate allowable ε-constraints (which must be known a priori). The ε-constraints
are then changed in order to generate another point on the Pareto front (phenotype), and so
forth, resulting in finding elements of the Pareto optimal set (genotype). Non-uniformity in
the distribution of the Pareto front points usually occurs[?]. In mathematical terms, if we
let f_j(x) be the objective function chosen to be minimized, we have the following problem:

min f_j(x)          (3.1)

s.t. f_i(x) ≤ ε_i, ∀ i ∈ {1, …, n}\{j},
     x ∈ S,

for a chosen index j ∈ {1, …, n}.
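To make the method concrete, here is a toy sketch in Python on the classic one-variable problem f_1(x) = x², f_2(x) = (x − 2)²; the two objectives and the coarse discretization of the feasible set S are assumptions made only for this illustration. Minimizing f_1 subject to f_2(x) ≤ ε and varying ε traces out different Pareto-optimal points:

def eps_constraint(f_primary, f_other, eps, grid):
    """Minimize f_primary over the grid subject to f_other(x) <= eps."""
    feasible = [x for x in grid if f_other(x) <= eps]
    return min(feasible, key=f_primary) if feasible else None

f1 = lambda x: x ** 2
f2 = lambda x: (x - 2) ** 2
grid = [i / 100 for i in range(-200, 401)]     # discretized feasible set S = [-2, 4]

for eps in (4.0, 1.0, 0.25):                   # each eps yields another Pareto point
    x = eps_constraint(f1, f2, eps, grid)
    print(f"eps={eps}: x*={x:.2f}, f1={f1(x):.3f}, f2={f2(x):.3f}")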
Theorem 3.1. The solution of the ε-constraint problem (3.1) is weakly Pareto-optimal[?].
Proof. Let x∗ ∈ S be a solution of the ε-constraint problem (3.1). Let us suppose that it is not
weakly Pareto-optimal. In this case, there exists a solution x ∈ S such that f_i(x) < f_i(x∗)
for all i = 1, 2, …, n. In particular, f_i(x) < f_i(x∗) ≤ ε_i for all i ≠ j, so
x is feasible with respect to the ε-constraint problem. Since in addition f_j(x) < f_j(x∗),
we have a contradiction to the assumption that x∗ is a solution of the ε-constraint problem.
Thus, x∗ has to be weakly Pareto-optimal.

Theorem 3.2. A decision vector x∗ ∈ S is Pareto-optimal if and only if it is a solution of the
ε-constraint problem (3.1) for every j = 1, 2, …, n, where ε_i = f_i(x∗) for i = 1, 2, …, n, i ≠ j.

Proof. Necessity: Let x∗ ∈ S be Pareto-optimal. Let us assume that it does not solve the
ε-constraint problem for some j, where ε_i = f_i(x∗) for i = 1, 2, …, n,
i ≠ j. Then there exists a solution x ∈ S such that f_j(x) < f_j(x∗) and f_i(x) ≤ f_i(x∗) when
i ≠ j. This contradicts the Pareto-optimality of x∗.
Sufficiency: Since x∗ ∈ S is by assumption a solution of the ε-constraint problem for every
j = 1, 2, …, n, there is no x ∈ S such that f_j(x) < f_j(x∗) and f_i(x) ≤ f_i(x∗) when i ≠ j.
This is the definition of Pareto-optimality for x∗.

Theorem 3.3. Let the multi-objective optimization problem be convex. If x∗ ∈ S is a
solution of the ε-constraint problem (3.1) for any given f_j to be minimized and ε_i = f_i(x∗) for
i = 1, 2, …, n, i ≠ j, then there exists a weighting vector w ≥ 0 with Σ_{i=1}^n w_i = 1 such
that x∗ is also a solution of the weighting problem (3.2).

NOTE: The proof of this result relies on the generalized Gordan theorem[?]. That is,
let f be an m-dimensional convex vector function on the convex set X ⊂ R^n. Then either

I. f(x) < 0 has a solution x ∈ X, or

II. p f(x) ≥ 0 for all x ∈ X, for some p ≥ 0, p ∈ R^m,

but never both.

Proof. Since x∗ solves the ε-constraint problem (3.1), the system of inequalities

f_j(x) < f_j(x∗), f_i(x) ≤ f_i(x∗), i = 1, 2, …, n, i ≠ j,

has no solution in X.
Upon imposing the convexity assumption and applying the generalized Gordan theorem,
there exists p ∈ R^n with p ≥ 0 such that Σ_{i=1}^n p_i [f_i(x) − f_i(x∗)] ≥ 0 for all x ∈ X. By
choosing w_i = p_i / Σ_{i=1}^n p_i (possible since Σ_{i=1}^n p_i > 0), we have w ≥ 0 and
Σ_{i=1}^n w_i = 1. With this choice of w, we have, for all x ∈ X,

Σ_{i=1}^n w_i [f_i(x) − f_i(x∗)] ≥ 0 =⇒ Σ_{i=1}^n w_i f_i(x) ≥ Σ_{i=1}^n w_i f_i(x∗).

Hence x∗ solves the weighting problem.

Advantages: Different Pareto-optimal solutions can be found by using different εi values. The same method can be used for problems having convex or non-convex objective spaces alike.
Disadvantages: The solution of the problem largely depends on the chosen ε vector. Moreover, as the number of objectives increases, the ε vector contains more elements, thereby requiring more information from the user.
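To make the mechanics concrete, the following is a minimal Python sketch of the ε-constraint method on a hypothetical two-objective problem (the objectives f1 and f2, the bounds and the ε values are illustrative and not taken from this project); each ε value yields one candidate Pareto-optimal point.

    import numpy as np
    from scipy.optimize import minimize

    # Illustrative objectives on S = [0, 2] (hypothetical, chosen only for this sketch)
    f1 = lambda x: x[0] ** 2
    f2 = lambda x: (x[0] - 2.0) ** 2

    def eps_constraint(eps):
        """Minimize the primary objective f1 subject to f2(x) <= eps."""
        cons = ({'type': 'ineq', 'fun': lambda x: eps - f2(x)},)  # scipy expects g(x) >= 0
        res = minimize(f1, x0=np.array([1.0]), bounds=[(0.0, 2.0)], constraints=cons)
        return res.x

    # Sweeping eps traces out an approximation of the Pareto front.
    for eps in (4.0, 2.0, 1.0, 0.5, 0.25):
        x = eps_constraint(eps)
        print(f"eps={eps:5.2f}  x={x[0]:6.3f}  f1={f1(x):6.3f}  f2={f2(x):6.3f}")

Note that the printed points need not be evenly spaced along the front, which illustrates the non-uniformity remark made above.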

3.1.2 Weighted Sum Method

A multi-objective problem is often solved by combining its multiple objectives into one single-objective scalar function. This approach is in general known as the weighted-sum or scalarization method. In more detail, the weighted-sum method minimizes a nonnegatively weighted sum of the objectives, that is,

    min  ∑_{i=1}^n wi fi(x)                           (3.2)
    s.t. ∑_{i=1}^n wi = 1
         wi ≥ 0,  i = 1, …, n
         x ∈ S,

which represents a new optimization problem with a unique objective function.

Theorem 3.4. The solution of the weighting problem (3.2) is weakly Pareto-optimal [?].

Proof. Let x* ∈ S be a solution of the weighting problem. Let us suppose that it is not weakly Pareto-optimal. In this case, there exists a solution x ∈ S such that fi(x) < fi(x*) for all i = 1, 2, …, n. According to the assumptions set on the weighting coefficients, wj > 0 for at least one j. Thus, we have ∑_{i=1}^n wi fi(x) < ∑_{i=1}^n wi fi(x*). This is a contradiction to the assumption that x* is a solution of the weighting problem. Thus x* is weakly Pareto-optimal.

Theorem 3.5. The solution of the weighting problem (3.2) is Pareto-optimal if the weighting coefficients are positive.

Proof. Let x* ∈ S be a solution of the weighting problem with positive weighting coefficients. Let us suppose that it is not Pareto-optimal. This means that there exists a solution x ∈ S such that fi(x) ≤ fi(x*) for all i = 1, 2, …, n and fj(x) < fj(x*) for at least one j. Since wi > 0 for all i = 1, …, n, we have ∑_{i=1}^n wi fi(x) < ∑_{i=1}^n wi fi(x*). This is a contradiction to the assumption that x* is a solution of the weighting problem. Thus x* must be Pareto-optimal.

Theorem 3.6. Let the multi-objective optimization problem be convex. If x* ∈ S is Pareto-optimal, then there exists a weighting vector w (wi ≥ 0, i = 1, 2, …, n, ∑_{i=1}^n wi = 1) such that x* is a solution of the weighting problem (3.2).

Proof. Since x* is Pareto-optimal, it is by Theorem 3.2 a solution of the ε-constraint problem for every objective function fj to be minimized. With the aid of the convexity assumption and Theorem 3.3, the proof is completed.

Advantages: This is probably the simplest way to solve a multi-objective optimization problem. The concept is intuitive and easy to use. For problems having a convex Pareto-optimal front, this method guarantees finding solutions on the entire Pareto-optimal set.
Disadvantages: In most non-linear multi-objective optimization problems, a uniformly distributed set of weight vectors need not produce a uniformly distributed set of Pareto-optimal solutions. Since this mapping is not usually known, it becomes difficult to set the weight vectors so as to obtain a Pareto-optimal solution in a desired region of the objective space. Moreover, different weight vectors need not necessarily lead to different Pareto-optimal solutions. If the chosen single-objective optimization algorithm cannot find all minimum solutions for a weight vector, some Pareto-optimal solutions cannot be found.
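As a companion to the ε-constraint sketch above, the following hypothetical Python fragment sweeps the weight over (0, 1) for the same two illustrative objectives; each weight vector (w, 1 − w) produces one solution of problem (3.2).

    import numpy as np
    from scipy.optimize import minimize_scalar

    # The same illustrative objectives as in the epsilon-constraint sketch (hypothetical)
    f1 = lambda x: x ** 2
    f2 = lambda x: (x - 2.0) ** 2

    # For each weight vector (w, 1 - w), minimize the scalarized objective over S = [0, 2].
    for w in np.linspace(0.1, 0.9, 5):
        res = minimize_scalar(lambda x: w * f1(x) + (1.0 - w) * f2(x),
                              bounds=(0.0, 2.0), method='bounded')
        x = res.x
        print(f"w={w:.2f}  x={x:6.3f}  f1={f1(x):6.3f}  f2={f2(x):6.3f}")

Since this illustrative problem is convex, every positive weight yields a Pareto-optimal point (Theorem 3.5); however, as noted above, uniformly spaced weights need not give uniformly spaced points on the front.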

3.2 Principles of Genetic Algorithm Operators


3.2.1 Binary Genetic Algorithms
A binary code represents text or computer processor instructions using the binary number system's two binary digits, 0 and 1. A binary code assigns a bit string to each symbol or instruction. For example, a binary string of eight binary digits (bits) can represent any of 256 possible values and can therefore correspond to a variety of different symbols, letters or instructions [?].
Representing a solution: In order to use GAs to find the optimal decision variables which satisfy the constraint and objective functions, we first need to represent them as binary strings. The decision variables can in fact be made to take any integer or non-integer values simply by changing the string length and the lower and upper bounds:

    xi = xi^min + [(xi^max − xi^min) / (2^li − 1)] · DV(si),

where li is the string length used to code the i-th variable and DV(si) is the decoded value of the string si (the complete string is s = ∪_{i=1}^n si). This allows the decision variables to take both positive and negative values.
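The decoding rule translates directly into code; the following small Python sketch (the function name and the sample string are our own illustration) maps a binary substring to its real value:

    def decode(s_i, x_min, x_max):
        """Map a binary substring s_i (e.g. '10110') to a real value in [x_min, x_max]."""
        l_i = len(s_i)                  # string length l_i
        dv = int(s_i, 2)                # decoded value DV(s_i)
        return x_min + (x_max - x_min) / (2 ** l_i - 1) * dv

    # A 5-bit substring decoded into [-2, 3]: DV('10110') = 22, so x = -2 + 5 * 22/31
    print(decode('10110', -2.0, 3.0))   # about 1.548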
Assigning fitness to a solution: It is important to reiterate that binary GAs work with strings representing the decision variables, instead of the decision variables themselves. Once a string (or a solution) is created by the genetic operators, it is necessary to evaluate the solution, particularly in the context of the underlying objective and constraint functions. The evaluation of a solution means calculating the objective function value and constraint violations. Thereafter, a metric must be defined using the objective function value and constraint violations to assign a relative merit to the solution (called its fitness).

• Reproduction or selection operator: The primary objective of the reproduction operator is to make duplicates of good solutions and eliminate bad solutions in a population, while keeping the population size constant. Common selection operators are the following:

– Tournament selection: A tournament is played between two solutions, and the better solution is chosen and placed in the mating pool. Two other solutions are picked again and another slot in the mating pool is filled with the better of them. The best solution in a population will win both of its tournaments, thereby making two copies of itself in the new population. By a similar argument, the worst solution will lose both of its tournaments and will be eliminated from the population. In this way, any solution in a population will have zero, one or two copies in the new population.
– Proportionate selection: Solutions are assigned copies, the number of which is proportional to their fitness values. If the average fitness of all population members is favg, a solution with a fitness fi gets an expected fi/favg number of copies.
– Ranking selection: First, the solutions are sorted according to their fitness, from the worst (rank 1) to the best (rank N). Each member in the sorted list is assigned a fitness equal to its rank; proportionate selection is then applied with the ranked fitness values, and N solutions are chosen for the mating pool.

• Crossover operator: This operator randomly chooses a locus and exchanges the subsequences before and after that locus between two chromosomes to create two offspring. As with the selection operator, there exist a number of crossover operators, e.g., the single-point, two-point and uniform crossover operators.

The single-point crossover operator is performed by randomly choosing a crossing site along the string and exchanging all bits on the right side of the crossing site. For example, the strings 10000100 and 11111111 could be crossed over after the third locus in each to produce the two offspring 10011111 and 11100100.

In the two-point crossover operator, two different crossing sites are chosen at random, which divides each string into three substrings. The crossover operation is completed by exchanging the middle substrings between the two strings.

• Mutation operator: The bit-wise mutation operator changes a 1 to a 0, and vice versa, with a mutation probability of pm. Mutation is needed to keep diversity in the population. (A minimal sketch of these three operators is given after this list.)
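The following is a minimal Python sketch of the three binary-GA operators described above: binary tournament selection (for minimization), single-point crossover and bit-wise mutation. The function names are our own, and the crossover call reuses the example strings from the text.

    import random

    def tournament_select(pop, fitness):
        """Binary tournament: pick two random solutions, return the better (smaller fitness)."""
        a, b = random.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    def single_point_crossover(p1, p2):
        """Exchange the tails of two bit strings after a random crossing site."""
        site = random.randint(1, len(p1) - 1)
        return p1[:site] + p2[site:], p2[:site] + p1[site:]

    def bitwise_mutate(s, pm):
        """Flip each bit independently with probability pm."""
        return ''.join(('1' if b == '0' else '0') if random.random() < pm else b for b in s)

    random.seed(0)
    c1, c2 = single_point_crossover('10000100', '11111111')
    print(c1, c2)   # with site = 3 this gives 10011111 and 11100100, as in the text
    print(bitwise_mutate(c1, pm=0.1))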

3.2.2 Real-parameter Genetic Algorithms

Since real parameters are used directly (without any string coding), solving real-parameter optimization problems is one step easier than with binary-coded GAs: the decision variables can be used directly to compute the fitness values. Since the selection operator works with the fitness value, any selection operator used with binary-coded GAs can also be used in real-parameter GAs [?].
Simulated Binary Crossover: The procedure of computing the children solutions xi^(1,t+1) and xi^(2,t+1) from the parent solutions xi^(1,t) and xi^(2,t) is described as follows. A spread factor β is defined as the ratio of the absolute difference in the children values to that in the parent values:

    β = | (xi^(2,t+1) − xi^(1,t+1)) / (xi^(2,t) − xi^(1,t)) |
First, a random number u between 0 and 1 is created. Thereafter, from a specified probability distribution function, the ordinate β is found so that the area under the probability curve from 0 to β is equal to the chosen random number u. The probability distribution used to create a child solution is derived from an analysis of the search power and is given as follows [?]:

    C(β) = 0.5(n + 1) β^n,          if β ≤ 1;
    C(β) = 0.5(n + 1) / β^(n+2),    otherwise,

where n is any non-negative real number. Using this distribution, β can be calculated from u as follows:

    β = (2u)^(1/(n+1)),               if u ≤ 0.5;
    β = (1/(2(1 − u)))^(1/(n+1)),     otherwise.

After obtaining β from the above probability distribution, the children solutions are calculated as follows:

    xi^(1,t+1) = 0.5[(1 + β) xi^(1,t) + (1 − β) xi^(2,t)],
    xi^(2,t+1) = 0.5[(1 − β) xi^(1,t) + (1 + β) xi^(2,t)].
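A minimal Python sketch of SBX for a single variable follows (here the distribution index n of the text is passed as eta, and variable bounds are not handled):

    import random

    def sbx(x1, x2, eta=2.0):
        """Simulated binary crossover for one real variable (bounds ignored in this sketch)."""
        u = random.random()
        if u <= 0.5:
            beta = (2.0 * u) ** (1.0 / (eta + 1.0))
        else:
            beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
        c1 = 0.5 * ((1.0 + beta) * x1 + (1.0 - beta) * x2)
        c2 = 0.5 * ((1.0 - beta) * x1 + (1.0 + beta) * x2)
        return c1, c2

    random.seed(1)
    print(sbx(2.0, 5.0))   # two children spread symmetrically around the parents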

Non-Uniform Mutation: Here, the probability of creating a solution closer to the parent is greater than the probability of creating one away from it. Moreover, as the generations (t) proceed, this probability of creating solutions closer to the parent gets higher and higher:

    yi^(1,t+1) = xi^(1,t+1) + τ (xi^(U) − xi^(L)) (1 − ri^((1 − t/tmax)^b)),

where τ takes the value −1 or 1, each with a probability of 0.5, ri is a uniform random number in [0, 1], tmax is the maximum number of allowed generations, and b is a user-defined parameter.
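A corresponding Python sketch (the function name is ours) shows how the perturbation shrinks with the generation counter:

    import random

    def nonuniform_mutation(x, x_low, x_high, t, t_max, b=2.0):
        """Non-uniform mutation: the step size shrinks as generation t approaches t_max."""
        tau = 1 if random.random() < 0.5 else -1     # direction, each with probability 0.5
        r = random.random()
        return x + tau * (x_high - x_low) * (1.0 - r ** ((1.0 - t / t_max) ** b))

    random.seed(4)
    print(nonuniform_mutation(0.5, 0.0, 1.0, t=10, t_max=100))   # early: large steps possible
    print(nonuniform_mutation(0.5, 0.0, 1.0, t=95, t_max=100))   # late: steps stay near x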
Polynomial Mutation: A child ck is created from a parent pk as

    ck = pk + (pk^U − pk^L) δk,

where pk^U and pk^L are the upper and lower bounds on the parent component, and δk is a small variation calculated from a polynomial distribution by using

    δk = (2rk)^(1/(ηm+1)) − 1,            if rk < 0.5;
    δk = 1 − [2(1 − rk)]^(1/(ηm+1)),      if rk ≥ 0.5,

where rk is a uniformly sampled random number in (0, 1) and ηm is the mutation distribution index.
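In code, polynomial mutation reads as follows (a sketch using the exponent 1/(ηm + 1) and the 0.5 threshold given above):

    import random

    def polynomial_mutation(p, p_low, p_high, eta_m=20.0):
        """Polynomial mutation of one real variable."""
        r = random.random()
        if r < 0.5:
            delta = (2.0 * r) ** (1.0 / (eta_m + 1.0)) - 1.0           # delta in [-1, 0)
        else:
            delta = 1.0 - (2.0 * (1.0 - r)) ** (1.0 / (eta_m + 1.0))   # delta in [0, 1)
        return p + (p_high - p_low) * delta

    random.seed(5)
    print(polynomial_mutation(0.5, 0.0, 1.0))   # a small perturbation around the parent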

Chapter 4

Non-Dominated Sorting Genetic Algorithm

4.1 Common Approaches to Fitness Assignment

Weighted Sum Approaches: The classical approach to solving a multi-objective optimization problem is to assign a weight wi to each objective function fi(x), so that the problem is converted into a single-objective problem with a scalar objective function. The weighted-sum approach is straightforward to implement, since a single objective is used in fitness assignment and a single-objective GA can be used with minimal modifications [?].
Population-based Approaches: The classical example of this sort of approach is the Vector Evaluated Genetic Algorithm (VEGA) proposed by Schaffer [?]. VEGA basically consists of a simple genetic algorithm with a modified selection mechanism. In VEGA, the population Pt is randomly divided into K equal-sized sub-populations P1, P2, …, PK. Then, each solution in sub-population Pi is assigned a fitness value based on the objective function fi alone. Solutions are selected from these sub-populations using proportionate selection for crossover and mutation. Crossover and mutation are performed on the new population in the same way as in a single-objective GA [?].
Pareto-Ranking Approaches: Pareto-ranking approaches explicitly utilize the concept of Pareto dominance in evaluating fitness or assigning selection probabilities to solutions. The population is ranked according to a dominance rule, and then each solution is assigned a fitness value based on its rank in the population, not its actual objective function value [?]. Consider an individual xi at generation t which is dominated by nq(xi; t) individuals in the current population. Its current position in the individuals' rank can be given by

    rank(xi; t) = 1 + nq(xi; t),

where nq(xi; t) is the number of solutions dominating solution xi at generation t. All non-dominated individuals are assigned rank 1.
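This ranking can be computed in a few lines of Python; in the sketch below the four objective vectors are sample points only (minimization of both objectives is assumed).

    def dominates(a, b):
        """True if objective vector a Pareto-dominates b (minimization)."""
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def rank(objs):
        """rank(x_i; t) = 1 + nq(x_i; t), the number of solutions dominating x_i."""
        return [1 + sum(dominates(o, oi) for o in objs) for oi in objs]

    objs = [(0.21, 5.90), (0.31, 6.10), (0.43, 6.79), (0.66, 3.65)]
    print(rank(objs))   # [1, 2, 3, 1]: the first and last points are non-dominated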

4.2 Diversity

Maintaining a diverse population is an important consideration in multi-objective GAs for obtaining solutions uniformly distributed over the true Pareto front. Without taking any preventive measures, the population tends to form relatively few clusters in a multi-objective GA. This phenomenon is called genetic drift, and several approaches are used to prevent it, as follows.
• Fitness Sharing: Fitness sharing aims to encourage the search in unexplored sections of the Pareto front by artificially reducing the fitness of solutions in densely populated areas. To achieve this goal, densely populated areas are identified and a penalty method is used to penalize the solutions located in such areas. A sharing function is used to obtain an estimate of the number of solutions belonging to each optimum. The idea of fitness sharing was first proposed by Goldberg and Richardson [?] in the investigation of multiple local optima for multi-modal functions. They used the following function in their simulation studies:

    sh(dij) = 1 − (dij/σshare)^α,   if dij ≤ σshare;
    sh(dij) = 0,                    otherwise.

The parameter dij is the distance between any two solutions i and j in the population. The parameter α does not have much effect on the performance of the sharing function method; in most applications, α = 1 or 2 is used. The Euclidean distance dij between two decision variable vectors x^(i) and x^(j) can be calculated as

    dij = √( ∑_{k=1}^n (xk^(i) − xk^(j))² ),

and the value of σshare for introducing q equispaced niches in the search space is

    σshare = √( ∑_{k=1}^n (xk^(U) − xk^(L))² ) / (2 q^(1/n)).

If d is zero (meaning that two solutions are identical or their distance is zero), then sh(d) = 1; that is, a solution has full sharing effect on itself. On the other hand, if d ≥ σshare (meaning that two solutions are at least a distance σshare away from each other), then sh(d) = 0; that is, two solutions which are at least a distance σshare apart have no sharing effect on each other.

The niche count nci is calculated for the i-th solution as follows:

    nci = ∑_{j=1}^N sh(dij),

where N is the population size.

The niche count provides an estimate of the extent of crowding near a solution. It is important to note that nci is always greater than or equal to one, because the sum on the right side includes the term sh(dii) = sh(0) = 1. The final task is to calculate the shared fitness value as

    fi' = fi / nci.

Two solutions might be very close in the objective function space while having very different structural features; therefore, fitness sharing based on the objective function space may reduce diversity in the decision variable space. However, Deb and Goldberg [?] reported that fitness sharing in the objective function space usually performs better than sharing based on the decision variable space.
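A small Python sketch of the sharing computation (with α = 1; the points and the raw fitness values are illustrative only):

    import math

    def sh(d, sigma_share, alpha=1.0):
        """Sharing function: full effect at d = 0, no effect beyond sigma_share."""
        return 1.0 - (d / sigma_share) ** alpha if d <= sigma_share else 0.0

    def shared_fitness(pop, fit, sigma_share):
        """Divide each raw fitness by its niche count nc_i = sum_j sh(d_ij)."""
        out = []
        for i, xi in enumerate(pop):
            nc = sum(sh(math.dist(xi, xj), sigma_share) for xj in pop)  # includes sh(0) = 1
            out.append(fit[i] / nc)
        return out

    pop = [(0.10, 0.10), (0.12, 0.11), (0.90, 0.80)]   # two crowded points, one isolated
    print(shared_fitness(pop, [1.0, 1.0, 1.0], sigma_share=0.5))

The two crowded points receive a reduced shared fitness, while the isolated point keeps its full fitness, which is exactly the diversity-preserving effect intended.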

Diversity Preservation
Most multi-objective evolutionary algorithms (MOEAs) try to maintain diversity within the current Pareto set approximation by incorporating density information into the selection process: an individual's chance of being selected decreases with the density of individuals in its neighborhood.
The sharing function method involves a sharing parameter σshare, which sets the extent of sharing desired in a problem. This parameter is related to the distance metric chosen to calculate the proximity measure between two population members: σshare denotes the largest value of that distance metric within which any two solutions share each other's fitness. This parameter is usually set by the user, although there exist some guidelines [?]. In the proposed NSGA-II, the sharing function approach is replaced with a crowded-comparison approach that eliminates any user-defined parameter for maintaining diversity among population members.
Density Estimation: To get an estimate of the density of solutions surrounding a particular solution in the population, we calculate the average distance of the two points on either side of this point along each of the objectives. This quantity idistance serves as an estimate of the perimeter of the cuboid formed by using the nearest neighbors as vertices (we call this the crowding distance).
Crowded-Comparison Operator: The crowded-comparison operator (≺n) guides the selection process at the various stages of the algorithm toward a uniformly spread-out Pareto-optimal front. Assume that every individual xi in the population has two attributes:
1) a non-domination rank (irank);
2) a crowding distance (idistance).
We now define the partial order ≺n as:

    i ≺n j  if (irank < jrank) or ((irank = jrank) and (idistance > jdistance)).

That is, between two solutions with differing non-domination ranks, we prefer the solution with the lower (better) rank. Otherwise, if both solutions belong to the same front, we prefer the solution that is located in the less crowded region.

Elitism
Elitism in the context of a single-objective GA means that the best solution found so far during the search has immunity against elimination by selection and always survives into the next generation. In this respect, all non-dominated solutions discovered by a multi-objective GA are considered elite solutions.
Elitism can be introduced globally in a generational sense. Once the offspring population is created, the parent and offspring populations can be combined together. Thereafter, the best N members may be chosen to form the population of the next generation, without any extra parameter. In this way, parents also get a chance to compete with the offspring population for survival into the next generation. This makes sure that the fitness of the best solutions in the population does not deteriorate: a good solution found early in the run will never be lost unless a better solution is discovered.
In fact, Rudolph (1996) has proved that GAs converge to the globally optimal solution of some functions in the presence of elitism. Moreover, the presence of elites enhances the probability of creating better offspring [?]. Elitism can be implemented to different degrees. For example, one can simply keep track of the best solution in a population and update it if a better solution is discovered at any generation, but not use the elite solutions in any genetic operations. In the other extreme implementation, all elites present in the current population are carried over to the new population; in this case, however, not many new solutions get a chance to enter the new population and the search hardly progresses.

Termination of the Genetic Algorithm


Because the GA is a stochastic search method, it is difficult to formally specify convergence
criteria. As the fitness of a population may remain static for a number of generations
before a superior individual is found, the application of conventional termination criteria
becomes problematic. A common practice is to terminate the GA after a prespecified number
of generations and then test the quality of the best members of the population against
the problem definition. The number of generations that evolve depends on whether an
acceptable solution is reached or a set number of iterations is exceeded. After a while all
the chromosomes and associated costs would become the same if it were not for mutations.
At this point the algorithm should be stopped. If no acceptable solutions are found, the GA
may be restarted or a fresh search initiated.

4.3 Elitist Non-Dominated Sorting Genetic Algorithm


In NSGA-II, the offspring population Qt is first created by using the parent population Pt. However, instead of finding the non-dominated front of Qt only, the two populations are first combined together to form Rt, of size 2N. Then, a non-dominated sorting is used to classify the entire population Rt. Although this requires more effort compared to performing a non-dominated sorting on Qt alone, it allows a global non-domination check among the offspring and parent solutions.
Once the non-dominated sorting is over, the new population is filled with solutions of the different non-dominated fronts, one at a time. The filling starts with the best non-dominated front and continues with the solutions of the second non-dominated front, followed by the third, and so on.
In the following, we outline the algorithm in a step-by-step format. Initially, a random population P0 is created. The population is sorted into different non-domination levels, and each solution is assigned a fitness equal to its non-domination level (1 is the best level); thus, minimization of the fitness is assumed. Binary tournament selection (with the crowded tournament operator described later), recombination and mutation operators are used to create an offspring population Q0 of size N. The NSGA-II procedure is outlined in the following steps.

step 1 Combine the parent and offspring populations and create Rt = Pt ∪ Qt. Perform a non-dominated sorting on Rt and identify the different fronts fi, i = 1, 2, ….

step 2 Set the new population Pt+1 = ∅ and a counter i = 1. Until |Pt+1| + |fi| < N, perform Pt+1 = Pt+1 ∪ fi and i = i + 1.

step 3 Perform the crowding-sort (fi, <c) procedure and include the most widely spread (N − |Pt+1|) solutions of the sorted fi, found by using the crowding distance values, in Pt+1.

step 4 Create the offspring population Qt+1 from Pt+1 by using the crowded tournament selection, simulated binary crossover and polynomial mutation operators.

Figure 4.1: The NSGA-II procedure (adopted from [?]).

Crowding Distance
To get an estimate of the density of solutions surrounding a particular solution xi in the population, we take the average distance of the two solutions on either side of xi along each of the objectives. This quantity di serves as an estimate of the perimeter of the cuboid formed by using the nearest neighbors as vertices (we call this the crowding distance). The following algorithm is used to calculate the crowding distance of each point in the set f.

Crowding Distance Assignment Procedure: crowding-sort(f, <c)

step 1 Call the number of solutions in f as l = |f|. For each i in the set, first assign di = 0.

step 2 For each objective function m = 1, 2, …, M, sort the set in worse order of fm, i.e., find the sorted indices vector I^m = sort(fm, >).

step 3 For m = 1, 2, …, M, assign a large distance to the boundary solutions, i.e., d(I^m_1) = d(I^m_l) = ∞, and for all other solutions j = 2 to (l − 1), assign:

    d(I^m_j) = d(I^m_j) + [ fm(I^m_{j+1}) − fm(I^m_{j−1}) ] / ( fm^max − fm^min ).

The index I^m_j denotes the solution index of the j-th member in the sorted list for objective m. Thus, for any objective, I^m_1 and I^m_l denote the solutions having the lowest and highest objective function values, respectively. The second term on the right side of the last equation is the normalized difference in objective function values between the two neighboring solutions on either side of solution I^m_j.
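A Python sketch of this procedure is given below. Note that it normalizes by the minimum and maximum of each objective within the front, whereas the worked example later in this chapter uses globally fixed values fm^max and fm^min; the resulting numbers then differ, although in that example the induced ordering happens to be the same.

    def crowding_distance(objs, front):
        """Crowding distance d_i for each index i in one front; objs[i] is an objective vector."""
        d = {i: 0.0 for i in front}
        for m in range(len(objs[front[0]])):                  # for each objective m
            order = sorted(front, key=lambda i: objs[i][m])   # sort the front by f_m
            d[order[0]] = d[order[-1]] = float('inf')         # boundary solutions
            f_min, f_max = objs[order[0]][m], objs[order[-1]][m]
            for pos in range(1, len(order) - 1):
                gap = objs[order[pos + 1]][m] - objs[order[pos - 1]][m]
                d[order[pos]] += gap / ((f_max - f_min) or 1.0)
        return d

    # Four sample points forming one non-dominated front (minimization of both objectives)
    objs = [(0.22, 7.09), (0.27, 6.93), (0.31, 6.10), (0.79, 3.97)]
    print(crowding_distance(objs, [0, 1, 2, 3]))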

Crowded Tournament Selection Operator

The crowded tournament selection operator (<c) compares two solutions and returns the winner of the tournament. It assumes that every solution xi has two attributes:
1. a non-domination rank ri in the population;
2. a local crowding distance di in the population.
The crowding distance di of a solution xi is a measure of the search space around xi which is not occupied by any other solution in the population. Based on these two attributes, we can define the crowded tournament selection operator as follows.
Definition 4.3.1. (Crowded tournament selection operator) A solution xi wins a tournament against another solution xj if either of the following conditions is true:
1. solution xi has a better rank, that is, ri < rj;
2. they have the same rank but solution xi has a larger crowding distance than solution xj, that is, ri = rj and di > dj.
The first condition makes sure that the chosen solution lies on a better non-dominated front. The second condition resolves the tie of both solutions being on the same non-dominated front by deciding on their crowding distances: the one residing in the less crowded area (with the larger crowding distance di) wins.
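In code, the comparison itself is a one-line predicate (a sketch; the function name is ours):

    def crowded_winner(r_i, d_i, r_j, d_j):
        """True if solution i beats solution j under the crowded tournament rule."""
        return r_i < r_j or (r_i == r_j and d_i > d_j)

    # Same front, so the larger crowding distance wins (cf. the tournament of 5 vs e below)
    print(crowded_winner(1, float('inf'), 1, 0.54))   # True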

Example 4.1. We consider the following two-objective, two-variable minimization problem to illustrate how the algorithm presented in this project works:

    minimize f1(x) = x1
    minimize f2(x) = (1 + x2)/x1
    subject to 0.1 ≤ x1 ≤ 1,
               0 ≤ x2 ≤ 5.

We have chosen six random solutions as the parent population and assume an offspring population of six solutions in the search space, in order to illustrate the working principle of the algorithm described in this project. These solutions are tabulated in the following table.
Table 4.1: Parent and offspring with their objective function value.

Parent population Offspring population


solution x1 x2 f1 f2 solution x1 x2 f1 f2
1 0.31 0.89 0.31 6.10 a 0.21 0.24 0.21 5.90
2 0.43 1.92 0.43 6.79 b 0.79 2.14 0.79 3.97
3 0.22 0.56 0.22 7.09 c 0.51 2.32 0.51 6.52
4 0.59 3.63 0.59 7.85 d 0.27 0.87 0.27 6.93
5 0.66 1.41 0.66 3.65 e 0.58 1.62 0.58 4.52
6 0.83 2.51 0.83 4.23 f 0.24 1.05 0.24 8.54

step 1 We first combine the populations Pt and Qt to form the set Rt = {1, 2, 3, 4, 5, 6, a, b, c, d, e, f}. Next, we perform a non-dominated sorting on Rt and obtain the following non-dominated fronts:
f1 = {5, a, e},
f2 = {1, 3, b, d},
f3 = {2, 6, c, f},
f4 = {4}.

step 2 We set Pt+1 = ∅ and i = 1. Next, we observe that |Pt+1| + |f1| = 0 + 3 = 3. Since this is less than the population size N (= 6), we include this front in Pt+1 and set Pt+1 = {5, a, e}. With these three solutions, we now need three more solutions to fill up the new parent population. With the inclusion of the second front, the size |Pt+1| + |f2| would be (3 + 4), or 7. Since this is greater than 6, we stop including any more fronts into the population. Note that if fronts 3 and 4 had not been classified earlier, these computations could have been saved.

step 3 Next, we consider the solutions of the second front only and observe that three (of the four) solutions must be chosen to fill the three remaining slots. We choose among this subpopulation (solutions 1, 3, b and d) by using the <c operator, calculating the crowding distance values of these solutions in the front with the step-by-step procedure.

step C1 We notice that l = 4 and set d1 = d3 = db = dd = 0. We also set f1^max = 1, f1^min = 0.1, f2^max = 60 and f2^min = 0.

step C2 For the first objective function, the sorting of these solutions is shown in Table 4.2 and is as follows: I^1 = {3, d, 1, b}.

step C3 Since solutions 3 and b are boundary solutions, we set d3 = db = ∞. For the other two solutions, we obtain:

    dd = 0 + (f1^(1) − f1^(3)) / (f1^max − f1^min) = 0 + (0.31 − 0.22)/(1 − 0.1) = 0.10,
    d1 = 0 + (f1^(b) − f1^(d)) / (f1^max − f1^min) = 0 + (0.79 − 0.27)/(1 − 0.1) = 0.58.

Now, we turn to the second objective function and update the above distances. First, the sorting on this objective yields I^2 = {b, 1, d, 3}. Thus, d3 = db = ∞ and the other two distances are updated as follows:

    dd = dd + (f2^(3) − f2^(1)) / (f2^max − f2^min) = 0.10 + (7.09 − 6.10)/(60 − 0) = 0.12,
    d1 = d1 + (f2^(d) − f2^(b)) / (f2^max − f2^min) = 0.58 + (6.93 − 3.97)/(60 − 0) = 0.63.

The overall crowding distances of the four solutions are:

    d1 = 0.63, d3 = ∞, db = ∞, dd = 0.12.

Evidently, solution d has the smallest perimeter of the hypercube around it among all solutions in the set f2. Now, we move back to the main algorithm.

Table 4.2: The fitness assignment procedure under NSGA-II.

Front 1:
solution   x1     x2     f1     f2     sorting in f1   sorting in f2   distance
5          0.66   1.41   0.66   3.65   third           first           ∞
a          0.21   0.24   0.21   5.90   first           third           ∞
e          0.58   1.62   0.58   4.52   second          second          0.54

Front 2:
solution   x1     x2     f1     f2     sorting in f1   sorting in f2   distance
1          0.31   0.89   0.31   6.10   third           second          0.63
3          0.22   0.56   0.22   7.09   first           fourth          ∞
b          0.79   2.14   0.79   3.97   fourth          first           ∞
d          0.27   0.87   0.27   6.93   second          third           0.12
step 3 (continued) Sorting these crowding distance values in descending order yields the sorted set {3, b, 1, d}. We choose the first three solutions.

step 4 The new population is Pt+1 = {5, a, e, 3, b, 1}. It is important to note that this population is formed by choosing solutions from the better non-dominated fronts.

The offspring population Qt+1 now has to be created by using this parent population. We realize that the exact offspring population will depend on the chosen pairs of solutions participating in the tournaments and on the chosen crossover and mutation operators. Let us say that we pair the solutions (5,e), (a,3), (1,b), (a,1), (e,b) and (3,5), so that each solution participates in exactly two tournaments. In the first tournament, we observe that solutions 5 and e belong to the same front (r5 = re = 1); thus, we choose the one with the larger crowding distance value and find that solution 5 is the winner. In the next comparison, between solutions a and 3, solution a wins, since it belongs to a better front. Performing the other tournaments, we obtain the mating pool {5, 5, a, a, b, e}. Now, these solutions can be mated pair-wise and mutated to create Qt+1. This completes one generation of the NSGA-II.
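The front-by-front filling of steps 1-3 can be checked with a short, self-contained Python sketch. Run on the twelve objective vectors of Table 4.1 with N = 6, it reproduces Pt+1 = {5, a, e, 3, b, 1}; here the crowding distance is normalized within each front rather than by the fixed bounds used above, which changes the distance values but not the selected solutions.

    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    def nondominated_sort(objs):
        """Split the keys of objs into fronts f1, f2, ... by repeated non-domination checks."""
        fronts, remaining = [], set(objs)
        while remaining:
            front = [i for i in remaining
                     if not any(dominates(objs[j], objs[i]) for j in remaining)]
            fronts.append(front)
            remaining -= set(front)
        return fronts

    def crowding_distance(objs, front):
        d = {i: 0.0 for i in front}
        for m in (0, 1):                                      # two objectives
            order = sorted(front, key=lambda i: objs[i][m])
            d[order[0]] = d[order[-1]] = float('inf')
            f_min, f_max = objs[order[0]][m], objs[order[-1]][m]
            for pos in range(1, len(order) - 1):
                gap = objs[order[pos + 1]][m] - objs[order[pos - 1]][m]
                d[order[pos]] += gap / ((f_max - f_min) or 1.0)
        return d

    def survive(objs, N):
        """NSGA-II environmental selection: fill the next population front by front."""
        new = []
        for front in nondominated_sort(objs):
            if len(new) + len(front) <= N:
                new += front
            else:
                d = crowding_distance(objs, front)
                new += sorted(front, key=lambda i: -d[i])[:N - len(new)]
                break
        return new

    # (f1, f2) values of the parent and offspring solutions from Table 4.1
    objs = {'1': (0.31, 6.10), '2': (0.43, 6.79), '3': (0.22, 7.09), '4': (0.59, 7.85),
            '5': (0.66, 3.65), '6': (0.83, 4.23), 'a': (0.21, 5.90), 'b': (0.79, 3.97),
            'c': (0.51, 6.52), 'd': (0.27, 6.93), 'e': (0.58, 4.52), 'f': (0.24, 8.54)}
    print(sorted(survive(objs, N=6)))   # ['1', '3', '5', 'a', 'b', 'e']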
Example 4.2. Consider the following two-objective optimization problem:

    min f1(x) = 3x^3 − 26x + 10
        f2(x) = 9x^2 − 26

subject to the constraint x ≥ −2.5.
We applied the non-dominated sorting genetic algorithm (NSGA-II) to this problem; the algorithm was coded in Matlab. We used the simulated binary crossover (SBX) and polynomial mutation operators, with a crossover probability of pc = 0.9 and a mutation probability of pm = 1/n (where n is the number of decision variables). The distribution indices for the crossover and mutation operators were set to µc = 20 and µm = 20, respectively. Figures 4.2 and 4.3 show a typical result with a population of 50 individuals, in the classical representation for multi-objective optimization, plotting f2 vs. f1 after 5 and 100 generations, respectively.

Figure 4.2: The population after 5 generations with NSGA-II. Figure 4.3: The population after 100 generations with NSGA-II.
Example 4.3. The test problem with functions f1(X) and f2(X) proposed by Zitzler [?] consists of solving the following multi-objective optimization problem:

    minimize f1(X) = x1
    minimize f2(X) = g(X) · (1 − √(f1/g(X)))
    s.t. g(X) = 1 + (9/(n − 1)) ∑_{i=2}^n xi
         0 ≤ xi ≤ 1,

where X = (x1, …, xn) and n = 30.

We applied the same Matlab implementation of NSGA-II to this problem, again with the SBX and polynomial mutation operators, a crossover probability of pc = 0.9, a mutation probability of pm = 1/n (where n is the number of decision variables), and distribution indices µc = 20 and µm = 20 for the crossover and mutation operators, respectively. Figures 4.4 and 4.5 show a typical result with a population of 100 individuals, plotting f2 vs. f1 after 50 and 500 generations, respectively.

Figure 4.4: The population after 50 generations with NSGA-II. Figure 4.5: The population after 500 generations with NSGA-II.

Bibliography

[1] K. Deb, Multi-Objective Optimization using Evolutionary Algorithms, John Wiley & Sons, Ltd, England, (2001).

[2] M. Ehrgott, Multicriteria Optimization, Springer, Berlin, Heidelberg, (2nd edition), (2005).

[3] K. Miettinen, Nonlinear Multiobjective Optimization, Kluwer Academic Publishers, Boston, (1999).

[4] A. Abraham, L. Jain and R. Goldberg (Eds.), Evolutionary Multiobjective Optimization, Springer-Verlag, London, (2005).

[5] Y. Sawaragi, H. Nakayama and T. Tanino, Theory of Multiobjective Optimization, Academic Press Inc., London, (1985).

[6] V. Chankong and Y. Y. Haimes, Multiobjective Decision Making: Theory and Methodology, Elsevier Science Publishing Co. Inc., New York, (1983).

[7] T. Back, Evolutionary Algorithms in Theory and Practice, Oxford University Press, New York, (1996).

[8] C. A. C. Coello, G. B. Lamont and D. A. V. Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems, Springer, New York, (2nd edition), (2002).

[9] O. L. Mangasarian, Nonlinear Programming, McGraw-Hill, New York, (1969).

[10] L. Wang, A. H. C. Ng and K. Deb (Eds.), Multi-objective Evolutionary Optimization for Product Design and Manufacturing, Springer-Verlag, London, (2011).

[11] S. S. Rao, Optimization: Theory and Application, Wiley Eastern Limited, New Delhi, (1991).

[12] C. M. Fonseca and P. J. Fleming, Genetic algorithms for multiobjective optimization: formulation, discussion and generalization, in S. Forrest (Ed.), Proceedings of the Fifth International Conference on Genetic Algorithms, San Mateo, California, pp. 416-423, (1993).

[13] A. Konak, D. W. Coit and A. E. Smith, Multi-objective optimization using genetic algorithms: a tutorial, Elsevier Ltd, (2005).

[14] E. Zitzler and L. Thiele, Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach, IEEE Transactions on Evolutionary Computation 3(4), pp. 257-271, (1999).

[15] J. D. Schaffer, Multiple objective optimization with vector evaluated genetic algorithms, in J. J. Grefenstette (Ed.), Proceedings of an International Conference on Genetic Algorithms and Their Applications, pp. 93-100, Pittsburgh, (1985).

[16] K. Deb and D. E. Goldberg, An investigation of niche and species formation in genetic function optimization, in The Third International Conference on Genetic Algorithms, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 42-50, (1989).

[17] K. Deb, S. Agrawal, A. Pratap and T. Meyarivan, A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II, in M. Schoenauer et al. (Eds.), Parallel Problem Solving from Nature (PPSN VI), Springer, Berlin, pp. 849-858, (2000).

[18] K. Deb and R. B. Agrawal, Simulated binary crossover for continuous search space, Complex Systems 9(2), pp. 115-148, (1995).

[19] K. Deb and M. Goyal, A combined genetic adaptive search for engineering design, Computer Science and Informatics 26(4), pp. 30-45, (1996).
