
May 27th 2015

Numerical Optimization:
Basic Concepts and Algorithms
R. Duvigneau



Outline

- Some basic concepts in optimization
- Some classical descent algorithms
- Some (less classical) semi-deterministic approaches
- Illustrations on various analytical problems
- Constrained optimality
- Some algorithms to account for constraints



Some basic concepts



Problem description

Definition of a single-criterion parametric problem with real unknowns:

Minimize    f(x),  x ∈ R^n              (cost function)

Subject to  g_i(x) = 0,  i = 1, ..., l  (equality constraints)
            h_j(x) ≥ 0,  j = 1, ..., m  (inequality constraints)

What does your cost function look like?

Illustrations: convex problem / multi-modal problem / noisy problem



Some commonly used algorithms

- Descent methods: adapted to convex cost functions
  steepest descent, conjugate gradient, quasi-Newton, Newton, etc.

- Evolutionary methods: adapted to multi-modal cost functions
  genetic algorithms, evolution strategies, particle swarm, ant colony, simulated annealing, etc.

- Pattern search methods: adapted to noisy cost functions
  Nelder-Mead simplex, Torczon's multidirectional search, etc.



Optimality conditions

Definition of a minimum
x* is a minimum of f: R^n → R if and only if there exists ρ > 0 such that:
- f is defined on B(x*, ρ)
- f(x*) < f(y) for all y ∈ B(x*, ρ), y ≠ x*

→ not very useful for building algorithms...

Characterization
A sufficient condition for x* to be a minimum (if f is twice differentiable):
- ∇f(x*) = 0 (stationarity of the gradient vector)
- ∇²f(x*) > 0 (Hessian matrix positive definite)
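As a side illustration (not part of the original slides), both conditions can be checked numerically with finite differences; the quadratic test function below is a made-up example.

import numpy as np

def grad_fd(f, x, h=1e-6):
    # Central finite-difference approximation of the gradient
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def hess_fd(f, x, h=1e-4):
    # Finite-difference Hessian built from the gradient, then symmetrized
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros_like(x); e[i] = h
        H[:, i] = (grad_fd(f, x + e) - grad_fd(f, x - e)) / (2 * h)
    return 0.5 * (H + H.T)

f = lambda x: (x[0] - 1.0)**2 + 2.0 * (x[1] - 2.0)**2   # hypothetical test function
x_star = np.array([1.0, 2.0])                           # its minimum

print(np.linalg.norm(grad_fd(f, x_star)) < 1e-4)           # stationarity of the gradient
print(np.all(np.linalg.eigvalsh(hess_fd(f, x_star)) > 0))  # Hessian positive definite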



Some classical descent algorithms



Descent methods

Model algorithm

For each iteration k (starting from x_k):

- Evaluate the gradient ∇f(x_k)
- Define a search direction d_k(∇f(x_k))
- Line search: choice of the step length ρ_k
- Update: x_{k+1} = x_k + ρ_k d_k
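A minimal Python sketch of this model loop (my own illustration, not from the slides): the search direction is taken to be the steepest-descent one introduced next, and the step length is delegated to a user-supplied line_search routine (an Armijo backtracking example is sketched later, in the step-length slides).

import numpy as np

def descent(f, grad, x0, line_search, max_iter=100, tol=1e-8):
    # Generic descent loop: gradient evaluation, search direction,
    # line search for the step length, update.
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:        # stationary point reached
            break
        d = -g                             # steepest-descent direction
        rho = line_search(f, grad, x, d)   # step length
        x = x + rho * d                    # update
    return x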



Choice of the search direction

Steepest-descent method:

- d_k = -∇f(x_k)

- Descent condition ensured:
  ∇f(x_k) · d_k = -∇f(x_k) · ∇f(x_k) < 0

- But this yields an oscillatory path:
  d_{k+1} · d_k = (-∇f(x_{k+1})) · d_k = 0 (if exact line search)

- Linear convergence rate:
  lim_{k→∞} ||x_{k+1} - x*|| / ||x_k - x*|| = a > 0

Illustration of steepest-descent path



Choice of the search direction

quasi-Newton method

- d_k = -H_k^{-1} · ∇f(x_k), where H_k is an approximation of the Hessian matrix ∇²f(x_k)

- H_k should fulfill the following conditions:
  - Symmetry
  - Positive definiteness: ∇f(x_k) · d_k = -∇f(x_k) · H_k^{-1} · ∇f(x_k) < 0
  - 1D approximation of the curvature: H_{k+1} (x_{k+1} - x_k) = ∇f(x_{k+1}) - ∇f(x_k)

- Example: BFGS method
  H_{k+1} = H_k - (H_k s_k s_k^T H_k^T) / (s_k^T H_k s_k) + (y_k y_k^T) / (y_k^T s_k)
  where s_k = x_{k+1} - x_k and y_k = ∇f(x_{k+1}) - ∇f(x_k)

- Super-linear convergence rate:
  lim_{k→∞} ||x_{k+1} - x*|| / ||x_k - x*|| = 0

Illustration of quasi-Newton method
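A small numpy transcription of the BFGS update written above (illustrative sketch only; practical implementations usually update the inverse of H_k directly).

import numpy as np

def bfgs_update(H, s, y):
    # BFGS update of the approximate Hessian H, with
    # s = x_{k+1} - x_k and y = grad f(x_{k+1}) - grad f(x_k)
    Hs = H @ s
    return H - np.outer(Hs, Hs) / (s @ Hs) + np.outer(y, y) / (y @ s)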



Choice of the step length
A classical criterion to ensure convergence: Armijo-Goldstein

- f(x_k + ρ_k d_k) < f(x_k) + α ∇f(x_k) · ρ_k d_k (Armijo)

- f(x_k + ρ_k d_k) > f(x_k) + β ∇f(x_k) · ρ_k d_k (Goldstein)

Illustration of Armijo-Goldstein criterion



Choice of the step length
Another criterion to ensure convergence (gradient required): Armijo-Wolfe

- f(x_k + ρ_k d_k) < f(x_k) + α ∇f(x_k) · ρ_k d_k (Armijo)

- ∇f(x_k + ρ_k d_k) · d_k > β ∇f(x_k) · d_k (Wolfe)

Illustration of Armijo-Wolfe criterion



Choice of the step length

The step length is determined using an iterative 1D search:

- Start from an initial guess ρ_k^(0) (p = 0)
- Update ρ_k^(p) to ρ_k^(p+1) using:
  - the bisection method
  - polynomial interpolation
  - ...
- until the stopping criteria are fulfilled

A balance is necessary between the computational cost and the accuracy.
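For illustration (not from the slides), a basic backtracking search that shrinks the step until the Armijo condition above is satisfied; alpha, the initial step and the shrink factor are assumed parameters.

import numpy as np

def backtracking_line_search(f, grad, x, d, rho0=1.0, alpha=1e-4, shrink=0.5, max_iter=50):
    # Reduce rho until f(x + rho d) <= f(x) + alpha * rho * grad(x).d (Armijo)
    rho = rho0
    fx = f(x)
    slope = grad(x) @ d   # directional derivative (negative for a descent direction)
    for _ in range(max_iter):
        if f(x + rho * d) <= fx + alpha * rho * slope:
            break
        rho *= shrink
    return rho

This routine could be passed as the line_search argument of the descent() loop sketched earlier.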



Some (less classical) semi-deterministic approaches



Evolutionary algorithms

Principles

Inspired by the Darwinian theory of evolution:

- A population is composed of individuals who have different characteristics
- The fittest individuals survive and reproduce
- An offspring population is generated from the survivors

→ Mechanisms to progressively improve the population performance!



Evolution strategies

Model algorithm: (λ, µ)-ES

At each iteration k, the population is characterized by its mean x̄_k and its variance σ̄_k².
Generation of population k + 1:

- Generation of λ perturbation amplitudes σ_i = σ̄_k e^{τ N(0,1)}
- Generation of λ new individuals x_i = x̄_k + σ_i N(0, Id) (mutation),
  with N(0, Id) the multi-variate normal distribution
- Evaluation of the fitness of the λ individuals
- Choice of µ survivors among the λ new individuals (selection)
- Update of the population characteristics (crossover and self-adaptation):

  x̄_{k+1} = (1/µ) Σ_{i=1}^{µ} x_i        σ̄_{k+1} = (1/µ) Σ_{i=1}^{µ} σ_i
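A compact numpy sketch of one generation of this scheme (my own illustration; tau and the population sizes are assumed parameters).

import numpy as np

def es_generation(f, x_mean, sigma_mean, lam=20, mu=5, tau=0.3, rng=None):
    # One (lambda, mu)-ES step: mutation, evaluation, selection, recombination
    rng = np.random.default_rng() if rng is None else rng
    n = x_mean.size
    sigmas = sigma_mean * np.exp(tau * rng.standard_normal(lam))     # perturbation amplitudes
    pop = x_mean + sigmas[:, None] * rng.standard_normal((lam, n))   # lambda new individuals
    fitness = np.array([f(x) for x in pop])                          # evaluation
    best = np.argsort(fitness)[:mu]                                  # selection of mu survivors
    return pop[best].mean(axis=0), sigmas[best].mean()               # crossover / self-adaptation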



Evolution strategy

Some results

- Proof of convergence towards the global optimum in a statistical sense:
  ∀ε > 0,  lim_{k→∞} P(|f(x̄_k) - f(x*)| ≤ ε) = 1

- Linear convergence rate

- Capability to avoid local optima

- Limited to a rather small number of parameters (O(10))

Illustration of evolution strategy step



Evolution strategies
Method: CMA-ES (Covariance Matrix Adaptation)
Improvement of the ES algorithm by using an anisotropic distribution

- The offspring population is generated using a covariance matrix C_k:

  x_i = x̄_k + σ̄_k N(0, C_k) = x̄_k + σ̄_k B_k D_k N(0, Id)

  with B_k the matrix of eigenvectors of C_k and D_k the diagonal matrix of the square roots of its eigenvalues (C_k^{1/2} = B_k D_k)

- Iterative construction of the covariance matrix:

  C_0 = Id
  C_{k+1} = (1 - c) C_k + (c/m) p_k p_k^T + c (1 - 1/m) Σ_{i=1}^{µ} ω^i y_i y_i^T
            (previous estimation)  (1D update)   (covariance of parents)

  with p_k the evolution path (last moves) and y_i = (x_i - x̄_k)/σ_k
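A direct numpy transcription of the covariance update above (illustrative only; c, m and the weights ω are taken as given, and the full CMA-ES also adapts the step size and the evolution path, which is not shown here).

import numpy as np

def cma_covariance_update(C, p, Y, weights, c=0.1, m=5.0):
    # C: current covariance, p: evolution path,
    # Y: rows are y_i = (x_i - x_mean)/sigma, weights: the omega_i of the mu parents
    rank_one = np.outer(p, p)                                       # "1D" (rank-one) update
    rank_mu = sum(w * np.outer(y, y) for w, y in zip(weights, Y))   # covariance of parents
    return (1.0 - c) * C + (c / m) * rank_one + c * (1.0 - 1.0 / m) * rank_mu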



Some illustrations using analytical functions



Rosenbrock function

- Non-convex unimodal function ("banana valley")
- Dimension n = 16
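For reference (the slides only show the plot), one common n-dimensional form of the Rosenbrock function is:

import numpy as np

def rosenbrock(x):
    # Chained "banana valley" terms; global minimum at x = (1, ..., 1)
    x = np.asarray(x, dtype=float)
    return np.sum(100.0 * (x[1:] - x[:-1]**2)**2 + (1.0 - x[:-1])**2)

x0 = np.zeros(16)               # dimension n = 16, as in the illustrations
print(rosenbrock(x0))           # cost at the starting point
print(rosenbrock(np.ones(16)))  # 0.0 at the global minimum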



Rosenbrock function

Illustrations: steepest descent / quasi-Newton



Rosenbrock function

Illustrations: ES / CMA-ES



Camelback function
- Dimension n = 2
- Six local minima
- Two global minima
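This corresponds to the classical six-hump camelback test function, which can be written as follows (for reference; the slides only show the plots):

def camelback(x, y):
    # Six local minima, two of which are global
    # (approximately at (0.0898, -0.7126) and (-0.0898, 0.7126))
    return (4.0 - 2.1 * x**2 + x**4 / 3.0) * x**2 + x * y + (-4.0 + 4.0 * y**2) * y**2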



Camelback function

Illustrations: quasi-Newton / optimization path



Camelback function

Illustrations: ES / optimization path



Constrained optimality



Introduction

Necessity of constraints

- Often required to define a well-posed problem from a mathematical point of view (existence, uniqueness)
- Often required to define a problem that makes sense from an industrial point of view (manufacturing)

Different types of constraints

- Equality / inequality constraints
- Linear / non-linear constraints



Linear constraints

Optimality conditions

A sufficient condition for x* to be a minimum of f subject to A · x = b:

- A · x* = b (admissibility)
- ∇f(x*) = λ* · A, with λ* the Lagrange multipliers (stationarity)
- A · ∇²f(x*) · A > 0 (projected Hessian positive definite)

Illustration of optimality conditions for linear constraints



Linear constraints

Projection algorithm for descent methods

At each iteration k, from an admissible point x_k:

- Evaluation of the gradient ∇f(x_k)
- Choice of an admissible search direction Z · d_k, with Z a projection matrix onto the admissible space (A · Z = 0)
- Line search: choice of the step length ρ_k
- Update: x_{k+1} = x_k + ρ_k Z · d_k
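As an illustration (not from the slides), Z can be built as an orthonormal basis of the null space of A via an SVD; one projected steepest-descent step then reads:

import numpy as np

def null_space_basis(A, tol=1e-12):
    # Columns of Z span the null space of A, so that A @ Z is (numerically) zero
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

def projected_descent_step(grad, x, A, rho):
    # One steepest-descent step restricted to the admissible space {x : A x = b}
    Z = null_space_basis(A)
    d = -(Z.T @ grad(x))          # reduced search direction
    return x + rho * (Z @ d)      # the move stays admissible since A Z = 0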



Non-linear constraints

Optimality conditions

A sufficient condition for x* to be a minimum of f subject to c(x) = 0:

- c(x*) = 0 (admissibility)
- ∇f(x*) = λ* · A(x*), with A(x) = ∇c(x) (stationarity)
- A(x*) · ∇²L(x*, λ*) · A(x*) > 0, with L(x, λ) = f(x) - λ · c(x) (projected Lagrangian positive definite)

Illustration of optimality conditions for non-linear constraints



Non-linear constraints

Quadratic penalization algorithm

Cost function with penalization: f_q(x, κ) = f(x) + (κ/2) c(x) · c(x)

It can be shown that: lim_{κ→∞} x*(κ) = x*

Algorithm with quadratic penalization:

- Initialisation of κ
- Minimisation of f_q(x, κ)
- Increase κ to reduce the constraint violation

Illustration of quadratic penalization
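A small sketch of the corresponding outer loop (illustrative; scipy.optimize.minimize is used here as the inner unconstrained solver, and the ten-fold increase of κ is an arbitrary choice).

import numpy as np
from scipy.optimize import minimize

def quadratic_penalty(f, c, x0, kappa=1.0, factor=10.0, n_outer=8):
    # Minimize f(x) subject to c(x) = 0 by minimizing f + (kappa/2) ||c||^2
    # for increasing values of kappa
    x = np.asarray(x0, dtype=float)
    for _ in range(n_outer):
        fq = lambda z, k=kappa: f(z) + 0.5 * k * np.dot(c(z), c(z))
        x = minimize(fq, x, method="BFGS").x   # inner unconstrained minimization
        kappa *= factor                        # tighten the penalty
    return x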



Non-linear constraints

Absolute penalization algorithm

Cost function with penalization: f_a(x, κ) = f(x) + κ ||c(x)||

It can be shown that: ∃κ* such that x*(κ) = x* for all κ > κ*

Algorithm with absolute penalization:

- Initialisation of κ
- Minimisation of f_a(x, κ)
- Increase κ until the constraints are satisfied

Illustration of absolute penalization



Non-linear constraints

Optimality conditions in terms of the Lagrangian L(x, λ) = f(x) - λ · c(x)

- ∇_λ L(x*, λ*) = 0 (admissibility)
- ∇_x L(x*, λ*) = 0 (stationarity)
- A(x) · ∇²L(x*, λ*) · A(x) > 0 (positive definiteness)

SQP algorithm (Sequential Quadratic Programming)

At each iteration k, a Newton method is applied to (x, λ):

  [ ∇²f(x_k) - λ_k · ∇²c(x_k)   -A(x_k)^T ]   [ δx ]   [ -∇f(x_k) + λ_k · A(x_k) ]
  [ -A(x_k)                      0        ] · [ δλ ] = [  c(x_k)                 ]
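A sketch of one such Newton step for a single equality constraint (illustrative; grad_f, hess_f, c, grad_c and hess_c are assumed to be supplied by the user).

import numpy as np

def sqp_step(grad_f, hess_f, c, grad_c, hess_c, x, lam):
    # One Newton step on the KKT system for min f(x) subject to c(x) = 0 (scalar c)
    A = grad_c(x)                         # constraint gradient, shape (n,)
    H = hess_f(x) - lam * hess_c(x)       # Hessian of the Lagrangian
    n = x.size
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = H
    K[:n, n] = -A
    K[n, :n] = -A
    rhs = np.concatenate([-grad_f(x) + lam * A, [c(x)]])
    delta = np.linalg.solve(K, rhs)
    return x + delta[:n], lam + delta[n]  # updated (x, lambda)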



Some references

Classical methods

- G. N. Vanderplaats. Numerical Optimization Techniques for Engineering Design. McGraw-Hill, 1984.
- R. Fletcher. Practical Methods of Optimization. John Wiley & Sons, 1987.
- P. E. Gill, W. Murray, and M. H. Wright. Practical Optimization. Academic Press, 1981.

Evolutionary methods

- Z. Michalewicz. Genetic Algorithms + Data Structures = Evolution Programs. AI Series, Springer-Verlag, New York, 1992.
- D. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley, 1989.

