
Script "Introduction to Optimization"

Tom Lahmer, Annika Schleinitz, Sudan Shakya,


Daniela Ardila Ospina, Paul Debus, Zouhour Jaouadi, Simon Marwitz

April 2021
Contents

1 Introduction
  1.1 Basics
  1.2 Definitions
  1.3 Examples
    1.3.1 Himmelblau function
    1.3.2 Rosenbrock's function (Banana Function)
    1.3.3 Optimization Problem with Constraints
  1.4 Model Calibration / Parameter Identification
  1.5 Portfolio Optimization after Markowitz

2 Linear Optimization
  2.1 Linear Optimization Problem
  2.2 Classification of optimization problems
  2.3 Transport Problems
  2.4 Network Problems

3 Nonlinear Optimization Problems
  3.1 Definitions
    3.1.1 Descent Direction
    3.1.2 Conditions of optimality
  3.2 Descent Methods
    3.2.1 Methods of first order
    3.2.2 Methods of second order
    3.2.3 0th order descent methods
  3.3 Constrained Optimization
    3.3.1 Explicit Restriction
    3.3.2 Penalty Methods
    3.3.3 Linear Independence Constraint Qualification (LICQ)
    3.3.4 Lagrange-Newton Method

4 Structural Optimization
  4.1 Dimensioning Problems
    4.1.1 Creating a Box
    4.1.2 Minimization of material required to wrap goods (e.g. sugar)
  4.2 Shape Optimization
    4.2.1 (A very) short introduction to FEM
    4.2.2 Geometric description of shapes by splines
  4.3 Application: Optimization of the Shape of a Dam
  4.4 Topology Optimization
    4.4.1 Examples of optimized structures
    4.4.2 Evolutionary Structural Optimization (ESO)
    4.4.3 ESO for Stiffness Optimization
    4.4.4 Bi-directional Evolutionary Structural Optimization (BESO)
    4.4.5 Solid Isotropic Material with Penalization (SIMP)

Chapter 1

Introduction

This is a script which accompanies the lecture of Prof. Tom Lahmer on Optimization. It is an
introduction to the theory and numerical solution approaches for
• Linear Problems (graphical solution and Simplex Algorithm),
• Nonlinear Problems (descent and Newton methods, gradient-free methods, global search),
• Structural Optimization (incl. Topology Optimization).

Figure 1.1: Paths of iterative optimization algorithms for a non-convex function.

Goals of the Lecture:

• Identification of the potentials for optimization,
• formulation of the optimization task,
• classification of the optimization problem,
• implementation of the optimization problem,
• qualitative assessment of the optimization results.

1.1 Basics
1.2 Definitions
Optimization Optimization is the systematic search to improve a certain system or situation.
Generally, a measure (the objective) is defined with which the gain can be quantified.
We focus on minimization problems as the maximization of any function f equals the minimization
of −f .

General Optimization Problem (OP)


Find x ∈ Rn so that f (x) is minimized under the constraint x ∈ X.
A shorter way to write this:

min f (x) under the constraint x ∈ X.

Hereby X ⊆ Rn is a non-empty set and f : X → R any real-valued function.

Objective
f is called cost function or objective (Zielfunktion).

Optimization Variable
x ∈ Rn is called design or optimization variable.

Set of admissible solutions


X is the range of admissible solutions (Menge der zulässigen Lösungen).

Optimal Solution
x∗ is the optimal solution of the optimization problem (sometimes also x̂).

Global Minimum
x∗ is called global minimum, if
f (x∗ ) ≤ f (x) ∀x ∈ X.

Strict Global Minimum

x∗ is called strict global minimum, if

f (x∗) < f (x) ∀x ∈ X, x ≠ x∗.

Local Minimum
x∗ is called a local minimum, if

f (x∗) ≤ f (x) ∀x ∈ Uε(x∗) = {x | ‖x − x∗‖ < ε}.

Strict Local Minimum

x∗ is called a strict local minimum, if

f (x∗) < f (x) ∀x ∈ Uε(x∗), x ≠ x∗.

Figure 1.2: Local and Global Minimum of a Function: x∗1 : local minimum, x∗2 : global minimum.

Unconstrained Optimization Problem


An unconstrained optimization problem is given, if X = Rn :

min f (x) under the constraint x ∈ Rn .

Optimization Problem with Constraints (Standard Optimization Problem)


The constrained optimization problem is a special case of optimization problems:
Find x ∈ Rn so that f (x) is minimized under the constraints

gi (x) ≤ 0, i = 1, ..., m,

hj (x) = 0, j = 1, ..., p.

Short form:
min f (x) under the constraints g(x) ≤ 0, h(x) = 0,

with f : Rn → R, g = (g1 , ..., gm )⊤ : Rn → Rm and
h = (h1 , ..., hp )⊤ : Rn → Rp as vector-valued real functions.

1.3 Examples
1.3.1 Himmelblau function
The following function should be minimized over all values (x, y)⊤ ∈ R²:

fH (x, y) = (x² + y − 11)² + (x + y² − 7)²


With contour lines (isolines) we can visualize the cost function in order to retrieve optimal solutions
graphically (only possible in 1D or 2D).
A contour line to level c ∈ R is defined as the set of all points x = (x1 , ..., xn )⊤ that fulfill f (x) = c:

Nf (c) := {x ∈ Rn |f (x) = c}

Thus, the function f always has the same value along a contour line.
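As a quick numerical check (a Python sketch; the tasks in this script use Octave or Matlab), one can evaluate fH at its four well-known minimizers. All four lie on the contour line to level c = 0:

```python
def f_H(x, y):
    """Himmelblau function from Section 1.3.1."""
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

# The four minimizers of f_H, each with objective value 0.
# (3, 2) is exact; the other three are standard numerical approximations.
minima = [(3.0, 2.0), (-2.805118, 3.131312),
          (-3.779310, -3.283186), (3.584428, -1.848126)]

for (x, y) in minima:
    print(f"f_H({x:.6f}, {y:.6f}) = {f_H(x, y):.2e}")
```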

Figure 1.3: Surface plot of the function fH . Levels of same color indicate the same value of the
objective.

Task: Draw the surface plot of the Himmelblau function in either OCTAVE or Matlab. How
many minimal / maximal points are there?
The gradient ∇f (x) is the vector of all partial derivatives of f .

∇f (x) := (∂f (x)/∂x1 , ∂f (x)/∂x2 , ..., ∂f (x)/∂xn )⊤ , x ∈ Rn .
The gradient of a function f points in the direction of the steepest increase of f .
Example 1.3.1.

f (x1 , x2 ) = x1² + 4x2³ + 3x1 x2
We see n = 2 and

∇f (x) = (2x1 + 3x2 , 12x2² + 3x1 )⊤ .
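A quick way to validate a hand-computed gradient is to compare it with central finite differences; a small Python sketch for Example 1.3.1 (the evaluation point (1, 2) is an arbitrary choice):

```python
def f(x1, x2):
    # Objective of Example 1.3.1
    return x1**2 + 4*x2**3 + 3*x1*x2

def grad_f(x1, x2):
    # Hand-computed gradient from Example 1.3.1
    return (2*x1 + 3*x2, 12*x2**2 + 3*x1)

def grad_fd(func, x1, x2, h=1e-6):
    # Central finite differences as an independent check
    g1 = (func(x1 + h, x2) - func(x1 - h, x2)) / (2*h)
    g2 = (func(x1, x2 + h) - func(x1, x2 - h)) / (2*h)
    return (g1, g2)

print(grad_f(1.0, 2.0))      # (8.0, 51.0)
print(grad_fd(f, 1.0, 2.0))  # approximately the same values
```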

In the one-dimensional case, i.e. n = 1, we use either df (x)/dx to indicate differentiation or simply
f ′(x). At this stage, we should recall some basic differentiation rules:
For any functions f and g and any real numbers a and b, the derivative of the function
h(x) = af (x) + bg(x) with respect to x is

h′(x) = af ′(x) + bg′(x),

which includes

(af (x))′ = af ′(x)

and the sum rule

(f (x) + g(x))′ = f ′(x) + g′(x).

The product rule tells us for h(x) = f (x)g(x)

h′(x) = (f g)′(x) = f ′(x)g(x) + f (x)g′(x).
The chain rule applies to the derivative of a function applied to a function, i.e. h(x) = f (g(x)); then

h′(x) = f ′(g(x)) · g′(x).
Exercise: Compute the gradient of fH (x, y). Visualize its first and its second entry in two separate
surface plots. Which link between the minimal / maximal points and the plotted gradients can be
made?

1.3.2 Rosenbrock’s function (Banana Function)
We minimize the function

fR (x, y) = 100(y − x²)² + (1 − x)²

with the unique minimal point x∗ = (1, 1)⊤ , where fR (1, 1) = 0.
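A short Python check that (1, 1) is a stationary point with objective value 0 for the standard form of Rosenbrock's function, 100(y − x²)² + (1 − x)²:

```python
def f_R(x, y):
    # Rosenbrock's function in its standard form
    return 100*(y - x**2)**2 + (1 - x)**2

def grad_f_R(x, y):
    # Gradient obtained by differentiating f_R by hand
    return (-400*x*(y - x**2) - 2*(1 - x), 200*(y - x**2))

print(f_R(1.0, 1.0))       # 0.0
print(grad_f_R(1.0, 1.0))  # both components vanish: (1, 1) is stationary
```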

Figure 1.4: Rosenbrock's Function

1.3.3 Optimization Problem with Constraints


Example for a standard optimization problem:

min f (x, y) = (x − 7)² + (y − 7)²

such that x, y ≥ 0
and g(x, y) = x + y ≤ 10.

First, we consider the set of admissible solutions X, i.e. all points (x,y) that fulfill the constraints.
This is the area within the red triangle in the following figure.

Figure 1.5: Function with constraint

It can be seen that the objective function f has a global minimum in the point (7, 7). Because
7 + 7 = 14 > 10, however, this point is not admissible. The actual solution of the problem therefore lies
in the point (x, y) = (5, 5) on the edge of the set of admissible solutions.
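The conclusion can be checked by brute force: scanning the admissible set on a coarse grid (step 0.5, an ad hoc choice) recovers the boundary minimizer (5, 5) with f (5, 5) = 8:

```python
def f(x, y):
    return (x - 7)**2 + (y - 7)**2

# Brute-force scan of the admissible set X = {x, y >= 0, x + y <= 10}
# on a 0.5-spaced grid (illustration only; grid and step are ad hoc).
best = min(
    (f(x/2, y/2), x/2, y/2)
    for x in range(0, 21) for y in range(0, 21)
    if x/2 + y/2 <= 10
)
print(best)  # (8.0, 5.0, 5.0): minimum value 8 at the boundary point (5, 5)
```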

1.4 Model Calibration / Parameter Identification


Idea:
Find optimal parameters x of a model m so that the sum of squared errors between measurements
and the model's response is small:

f (x) = Σ_{i=1}^N (yi − m(ti , x))²

Figure 1.6: Fitted linear function to given data points.

In order to minimize the misfit between the simulated response and the measurements we try to reduce
the sum of squared errors:

min f (x), x ∈ X,

with: N the number of measurement points, yi the measurement at (time) point ti , m(ti , x) the response
of the model at point ti , and x the vector of model parameters.
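For a linear model m(t, x) = x0 + x1 t the least-squares problem has a closed-form solution via the normal equations; a Python sketch with made-up data points (the data are hypothetical, chosen to lie near the line 1 + 2t):

```python
# Least-squares fit of a linear model m(t, x) = x0 + x1*t to data points.
t = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.1, 6.9]

n = len(t)
# Normal equations for the two-parameter linear model (closed form)
St, Sy = sum(t), sum(y)
Stt = sum(ti*ti for ti in t)
Sty = sum(ti*yi for ti, yi in zip(t, y))
x1 = (n*Sty - St*Sy) / (n*Stt - St*St)    # slope
x0 = (Sy - x1*St) / n                     # intercept

# Remaining sum of squared errors f(x) at the optimum
residual = sum((yi - (x0 + x1*ti))**2 for ti, yi in zip(t, y))
print(x0, x1, residual)  # about 1.06, 1.96 and a small residual
```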

1.5 Portfolio Optimization after Markowitz


Given are j = 1, ..., n different possibilities to invest money, e.g. stocks, bonds, ... Each investment
shows a return Rj , which can be a profit as well as a loss, in the next time interval. Unfortunately,
Rj is usually not known in advance and has a random character. In order to minimize the risk of loss, the
total budget of the investment amount is divided into units xj (optimization variable!) and spread
over the investment options j = 1, ..., n. The various investments together make up the portfolio.
The task of a portfolio manager is to find the optimal composition of such a portfolio, which means

that the shares xj of the different investments must be optimally determined in order to achieve a
certain goal. Often the aim is to achieve the highest possible expected return:
max_{xj} E(R) = Σ_{j=1}^n xj E(Rj ), where R = Σ_{j=1}^n xj Rj .

A high profit correlates with a high risk, which can be modelled with the variance

V (R) = E(R − E(R))² = E( Σ_{j=1}^n xj (Rj − E(Rj )) )² .

A good compromise between high profit and low risk can be achieved by solving the following
optimization problem:

min − Σ_{j=1}^n xj E(Rj ) + α E( Σ_{j=1}^n xj (Rj − E(Rj )) )² , α > 0,

under the constraints that

Σ_{j=1}^n xj = 1, xj ≥ 0, j = 1, ..., n.

α is a weighting parameter with which the willingness to assume risks can be integrated:
• high willingness of the investor to take risks: α → 0
• low willingness to take risks: α → ∞.
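A toy two-asset version of the problem illustrates the role of α. All numbers below are made up, and the assets are assumed uncorrelated, so the portfolio variance reduces to x1² σ1² + x2² σ2²; the budget constraint is eliminated via x2 = 1 − x1 and the remaining share is found by grid search:

```python
# Toy two-asset Markowitz problem (all numbers made up for illustration).
# Asset 1: high expected return, high variance; asset 2: the opposite.
mu = [0.10, 0.05]        # E(R_j)
var = [0.04, 0.01]       # Var(R_j), assets assumed uncorrelated

def objective(x1, alpha):
    x2 = 1.0 - x1        # budget constraint: x1 + x2 = 1
    expected = x1*mu[0] + x2*mu[1]
    risk = x1**2 * var[0] + x2**2 * var[1]
    return -expected + alpha*risk

def best_share(alpha, steps=1000):
    # Grid search over the admissible shares x1 in [0, 1]
    return min(range(steps + 1), key=lambda i: objective(i/steps, alpha)) / steps

print(best_share(0.01))   # risk nearly ignored: everything into asset 1
print(best_share(100.0))  # risk dominates: mostly asset 2
```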

Chapter 2

Linear Optimization

2.1 Linear Optimization Problem


2.2 Classification of optimization problems
A function f : Rn → R is said to be linear if and only if

• f (x + y) = f (x) + f (y) ∀ x, y ∈ Rn ,
• f (cx) = cf (x) ∀ c ∈ R, x ∈ Rn

holds true.
Example 2.2.1.

(a) f (x) = x X
(b) f (x) = 3x1 − 5x2 , x = (x1 , x2 ) X
(c) f (x) = d> x, x, d ∈ Rn X
(d) f (x) = Ax, A ∈ Rm×n , x ∈ Rn X

(e) f (x) = 1 ×
(f) f (x) = x + 1 ×
(g) f (x) = x2 ×

(h) f (x) = sin(x) ×


A linear optimization problem is a special case of an optimization problem with

f (x) = c> x,

i.e.

min c⊤x under the constraints Ax = b, x ≥ 0,

with x, c ∈ Rn , b ∈ Rm and A ∈ Rm×n being a matrix.
• n - dimension of the design variable,
• m - number of constraints,
• c - vector of costs.

The linear optimization problem is a special case of a constrained optimization problem by writing
it as follows:

f (x) = c> x,
g(x) = (−x) ≤ 0,
h(x) = Ax − b = 0.

Example 2.2.2.

1. Cultivation of a field of 40 hectare (ha) with beets and wheat


2. Invest: €2400 and 312 working days
3. Costs per hectare: €40 for beets, €120 for wheat
4. Required working days: 6 for beets, 12 for wheat
5. Profit: €100 for a hectare of beets, €250 for a hectare of wheat.
Let x1 be the size of the field for the beets and let x2 be the size of the field for the wheat.
What is the optimal trade off in order to maximize profit?

max f (x1 , x2 ) = 100x1 + 250x2

under the constraints:


g1 (x1 , x2 ) = x1 + x2 ≤ 40 maximal area
g2 (x1 , x2 ) = 40x1 + 120x2 ≤ 2400 maximal investment
g3 (x1 , x2 ) = 6x1 + 12x2 ≤ 312 maximal workdays
x1 , x2 ≥ 0 no negative sizes

For linear problems, the constraints are straight lines which define the set of admissible
solutions. The linear optimization problem can be solved graphically by simply shifting the line of
constant objective value f to the corner point of the set of admissible solutions which gives the
smallest / highest value of the objective.

(Figure: graphical solution of the farmer problem, showing the constraint lines
g1 : x1 + x2 = 40, g2 : 40x1 + 120x2 = 2400, g3 : 6x1 + 12x2 = 312,
and two lines of constant profit, f : 100x1 + 250x2 = 3500 and f : 100x1 + 250x2 = 5500.)

In this example x1 = 30, x2 = 10 and f (30, 10) = 5500; the investment is limiting, so g2 is an
"active" constraint. Those constraints not affecting the value of the objective at the optimal
point are called inactive constraints.
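The corner-point property suggests a brute-force check for the farmer problem: intersect each pair of constraint boundary lines, discard the infeasible intersection points, and evaluate the objective at the remaining corners (a sketch only; a real solver would use the simplex method instead):

```python
from itertools import combinations

# Constraint boundary lines a*x1 + b*x2 = r for the farmer problem,
# including the axes x1 = 0 and x2 = 0.
lines = [
    (1.0, 1.0, 40.0),       # g1: area
    (40.0, 120.0, 2400.0),  # g2: investment
    (6.0, 12.0, 312.0),     # g3: workdays
    (1.0, 0.0, 0.0),        # x1 = 0
    (0.0, 1.0, 0.0),        # x2 = 0
]

def feasible(x1, x2, tol=1e-9):
    return (x1 >= -tol and x2 >= -tol and x1 + x2 <= 40 + tol
            and 40*x1 + 120*x2 <= 2400 + tol and 6*x1 + 12*x2 <= 312 + tol)

corners = []
for (a1, b1, r1), (a2, b2, r2) in combinations(lines, 2):
    det = a1*b2 - a2*b1
    if abs(det) < 1e-12:
        continue  # parallel lines, no intersection point
    x1 = (r1*b2 - r2*b1) / det   # Cramer's rule for the 2x2 system
    x2 = (a1*r2 - a2*r1) / det
    if feasible(x1, x2):
        corners.append((x1, x2))

best = max(corners, key=lambda p: 100*p[0] + 250*p[1])
print(best, 100*best[0] + 250*best[1])  # (30.0, 10.0) with profit 5500
```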

The example can be put into a general approach to solve linear optimization problems: We define
x = (x1 , x2 )⊤ and

c⊤x → max with c = (100, 250)⊤

such that Ax ≤ b with

A = ( 1    1
      40 120
      6   12 ),   b = (40, 2400, 312)⊤ .
What can happen in the general case?
1) The constraints form a set which is convex; then one of the corner points is the optimal
solution.
2) The set of admissible solutions is not bounded and f (x∗ ) takes as value either +∞ or −∞.
3) There are too many (mutually contradictory) constraints
→ the set of admissible solutions is empty → no solution exists.

Figure 2.1: Top) Graphical representation of convex and non-convex sets. Middle) Domain with
bounded, convex set of admissible solutions. Lower) Example of a non-bounded set.

Algorithms for solving linear optimization problems examine solely the corner points of
the convex set of admissible solutions, each of which is an admissible solution. The optimal
solution is found by moving to other corner points so that the value of the objective decreases.
The algorithm stops if
• no further improvement is possible,
• the algorithm detects that the problem is unbounded,
• the algorithm detects that the problem has no solution (too many constraints).
This algorithm, the Simplex Algorithm, was found in 1948 based on works of G. Dantzig, W. Leontief, and T. S. Motzkin.
Further reading: file SimplexAlgo.pdf in Moodle or https://en.wikipedia.org/wiki/Simplex_algorithm.
One implementation of this algorithm can be seen in the following.
One implementation of this algorithm can be seen in the following.
To apply this algorithm, e.g. to the farmer problem, the latter was slightly changed into the
so-called standard form, i.e.

min c⊤x
s.t. Ax = b,
     x ≥ 0.

function SimplexMethodForFarmerProblem
% Tom Lahmer, 2018
% Simplex Method, 2nd phase.
% The following data are provided by the user:
% Matrix A
A = [1 1 1 0 0; 40 120 0 1 0; 6 12 0 0 1];
% Vector b
b = [40, 2400, 312]';
% Vector c
c = [-100, -250, 0, 0, 0]';
% First basis, i.e. actually a corner point being an admissible solution (guessed)
B = [3, 4, 5];

if length(B) ~= length(b)
    fprintf('The basis needs to have as many entries as constraints!');
    return;
end

AllIndices = 1:1:length(c);
N = setdiff(AllIndices, B); % set of non-basis entries
NoOptSolution = true;
iter = 0;

while NoOptSolution
    iter = iter + 1;
    fprintf('\nIteration %d ...', iter);

    if det(A(:,B)) ~= 0 % check for invertibility
        GammaBN = inv(A(:,B))*A(:,N);
        xB = inv(A(:,B))*b;
        zetaN = c(B)'*GammaBN - c(N)';
        costfunction = c(B)'*xB;
    else
        fprintf('The chosen basis does not result in an admissible solution');
    end

    % Find pivot column and test w.r.t. stopping
    if max(zetaN) > 0
        [largestZeta, IndexPC] = max(zetaN);
        pivotColumn = N(IndexPC);
    else
        NoOptSolution = false;
        break; % if all zeta values are negative, stop: opt. solution is found
    end

    % Find pivot row and test w.r.t. stopping, i.e. the smallest fraction
    % xB_i / Gamma_iq; however only for those Gamma_iq which are positive
    xBOverGamma = xB ./ GammaBN(:, IndexPC);

    minVal = max(xBOverGamma);
    for i = 1:length(xBOverGamma)
        if minVal >= xBOverGamma(i) && GammaBN(i, IndexPC) > 0
            % find smaller ones with pos. Gamma
            minVal = xBOverGamma(i); % store it
            IndexPR = i; % the last index where this case is true is the pivot index
        end
    end

    pivotRow = B(IndexPR);
    % change the basis
    [B, N] = changeBasis(B, N, c, pivotRow, pivotColumn);
end
x = zeros(length(c), 1);
x(B) = xB;

fprintf('\nThe optimal solution is:');
x
fprintf('with objective c^T x = %f\n', costfunction);

function [B, N] = changeBasis(B, N, c, pivotRow, pivotColumn)
B = setdiff(B, pivotRow);
B = [B pivotColumn];
AllIndices = 1:1:length(c);
N = setdiff(AllIndices, B);

To arrive at this system, so-called non-negative slack variables have been added to the inequality
constraints. Compare with the definitions of A, B and c in the source code. The usage of the simplex
algorithm is particularly useful when the dimension of the design variable x or the number of
constraints increases, see the following examples.

2.3 Transport Problems


A transport company has m stores from which n consumers can be supplied with a product. The
delivery costs from warehouse i to consumer j are cij units per product unit. Store i has ai
units of the product in stock. Consumer j has a requirement of bj units. In order not to upset
the customers, the supplier must satisfy the customers’ needs. On the other hand, he wants to
minimize his delivery costs.

(Figure: bipartite delivery network between the stores with stocks a1 , ..., am and the consumers
with demands b1 , ..., bn .)

• m number of stores, where store i has ai units of the product in stock,
• n customers with demand bj ,
• ci,j costs of delivering from store i to customer j per product unit,
• xi,j the amount delivered from store i to customer j.

This results in the following transport problem, which is a special linear optimization problem:

min Σ_{i=1}^m Σ_{j=1}^n xi,j ci,j

under the constraints that

Σ_{j=1}^n xi,j ≤ ai , i = 1, ..., m,  (stores cannot deliver more than their stock)
Σ_{i=1}^m xi,j ≥ bj , j = 1, ..., n,  (demands are satisfied)
xi,j ≥ 0, i = 1, ..., m, j = 1, ..., n  (no negative deliveries)
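A tiny instance (all numbers made up: two stores, two consumers) can still be solved by brute force over integer delivery plans, which is a useful sanity check before handing the problem to an LP solver:

```python
from itertools import product

# Tiny transport instance (numbers made up): 2 stores, 2 consumers.
stock = [30, 20]          # a_i
demand = [25, 25]         # b_j
cost = [[1, 2], [3, 1]]   # c_ij per unit from store i to consumer j

best_cost, best_plan = None, None
# Brute force over integer delivery plans x_ij (feasible only at this size)
for x11, x12, x21, x22 in product(range(31), range(31), range(21), range(21)):
    if x11 + x12 > stock[0] or x21 + x22 > stock[1]:
        continue  # store would deliver more than its stock
    if x11 + x21 < demand[0] or x12 + x22 < demand[1]:
        continue  # unmet demand
    c = cost[0][0]*x11 + cost[0][1]*x12 + cost[1][0]*x21 + cost[1][1]*x22
    if best_cost is None or c < best_cost:
        best_cost, best_plan = c, (x11, x12, x21, x22)

print(best_cost, best_plan)  # 55 (25, 5, 0, 20)
```

The optimal plan avoids the expensive route from store 2 to consumer 1 entirely.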

2.4 Network Problems


A company wants to transport as many goods as possible from city A to city D via the illustrated
road network. The numbers next to the edges of the network indicate the maximum capacity of
each edge.
How can this problem be modelled mathematically?
• V = {A, B, C, D} set of nodes corresponding to the cities in the network

(Figure: road network with nodes A, B, C, D; edges AB, BC, BD, AC, CD with
capacities 5, 2, 7, 3, 6.)
• E = {(AB), (BC), (BD), (AC), (CD)} set of edges corresponding to the streets
• xi,j describes the transported amount along the edge (i, j) with maximal capacity

li,j ≤ xi,j ≤ ui,j , (i, j) ∈ E

• no goods are produced nor consumed in B and C:

xBD + xBC − xAB = 0


xCD − xBC − xAC = 0

Since no additional goods are produced or disappear on the way, the task is to maximize the
quantity of goods leaving the start node A (= the quantity of goods arriving at the end node D).

max xAB + xAC .

Remark: The pair G := (V, E), i.e. the combination of the set of all vertices and edges, forms
a graph G. Many optimization problems are solved on graphs, e.g. the shortest-path algorithms
used in navigation software. Generally, these are optimization algorithms on graphs
which allow to find the shortest distance between two cities or two points in a network. Optimization
on graphs is a lecture on its own. Still, most of the algorithms are variants of linear optimization
problems, where the dimension of the matrix A scales with the number of cities (vertices) and streets
(edges). Some further reading here: https://en.wikipedia.org/wiki/Shortest_path_problem

Chapter 3

Nonlinear Optimization Problems

Minimize f (x) with respect to x ∈ Rn , so x = (x1 , x2 , ..., xn )T .

3.1 Definitions
The construction of algorithms for nonlinear Optimization Problems (NLOP) generally requires:

• A good initial guess of the solution x0 ,


• The gradient of f (x), denoted as ∇f (x), i.e. the vector of all partial derivatives of f :

∇f (x) = (∂f (x)/∂x1 , ..., ∂f (x)/∂xn )⊤ ,

• (sometimes) the Hessian matrix Hf (x), the matrix of all second partial derivatives of f (x):

Hf (x) = ( ∂²f (x) / ∂xi ∂xj )_{i,j=1,...,n} .

For twice continuously differentiable f , the Hessian is always symmetric.

Note: There are algorithms which do not require any or only some of these, for example the
Nelder-Mead method, genetic algorithms, etc.

3.1.1 Descent Direction


Definition: Let f : Rn → R and x ∈ Rn . A vector d ∈ Rn is called a descent direction of f at x,
if there is a t̄ > 0 such that

f (x + td) ≤ f (x) ∀ 0 < t < t̄.

Lemma: Let f : Rn → R be a continuously differentiable function. Then d is a descent direction

if ∇f (x)⊤ d < 0.

Consequence: The negative gradient d = −∇f (x) is therefore always a descent direction.

Geometrical interpretation: The angle between gradient and descent direction needs to be
between 90◦ and 270◦ .

Figure 3.1: Descent directions

As long as (∇f (x))> d < 0 there are possibilities to further decrease the value of f . This
condition is violated if ∇f (x) = 0.

3.1.2 Conditions of optimality


3.1.2.1 Necessary Condition of optimality
Necessary condition: Let f : Rn → R be continuously differentiable and let x∗ be a local minimum
point; then ∇f (x∗ ) = 0. All points x with ∇f (x) = 0 are called stationary points.

3.1.2.2 Sufficient condition of optimality:


• If x∗ is a stationary point and Hf (x∗ ) is positive definite, then it is a minimal point. A matrix
A is positive definite, if x⊤Ax > 0 ∀x ∈ Rn , x ≠ 0.
• If x∗ is a stationary point and Hf (x∗ ) is negative definite, then it is a maximal point. A matrix
A is negative definite, if x⊤Ax < 0 ∀x ∈ Rn , x ≠ 0.
• If x∗ is a stationary point and Hf (x∗ ) is indefinite, then it is a saddle point.

If Hf is positive (or negative) definite, the function is said to have positive (or negative)
curvature.
As the matrices Hf are symmetric, we can test the definiteness with the eigenvalue criterion:
A symmetric matrix is positive definite, when all its eigenvalues are larger than zero. It is negative
definite, when all its eigenvalues are smaller than zero. If we have eigenvalues equal to zero, we
speak of semi-definite matrices. Matrices which have positive and negative eigenvalues are called
indefinite.
Example 3.1.1.

f (x, y) = x² + y²
∇f (x, y) = (2x, 2y)⊤
Hf (x, y) = ( 2 0
              0 2 )
x∗ = (0, 0)⊤ , ∇f (x∗ ) = (0, 0)⊤ [Ger16]
Eigenvalues: [2, 2] > 0 → positive definite, minimum

Example 3.1.2.

f (x, y) = −x² − y²
∇f (x, y) = (−2x, −2y)⊤
Hf (x, y) = ( −2  0
               0 −2 )
x∗ = (0, 0)⊤ , ∇f (x∗ ) = (0, 0)⊤ [Ger16]
Eigenvalues: [−2, −2] < 0 → negative definite, maximum

Example 3.1.3.

f (x, y) = x² − y²
∇f (x, y) = (2x, −2y)⊤
Hf (x, y) = ( 2  0
              0 −2 )
x∗ = (0, 0)⊤ , ∇f (x∗ ) = (0, 0)⊤ [Ger16]
Eigenvalues: [−2, 2] → indefinite, saddle point
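The eigenvalue criterion can be turned into a small classification routine; a Python sketch for symmetric 2×2 Hessians, applied to the three examples above:

```python
import math

def eig_sym2(a11, a12, a22):
    """Eigenvalues of a symmetric 2x2 matrix [[a11, a12], [a12, a22]]."""
    mean = (a11 + a22) / 2
    radius = math.hypot((a11 - a22) / 2, a12)
    return mean - radius, mean + radius

def classify(hessian):
    lo, hi = eig_sym2(*hessian)
    if lo > 0:
        return 'minimum'        # positive definite
    if hi < 0:
        return 'maximum'        # negative definite
    if lo < 0 < hi:
        return 'saddle point'   # indefinite
    return 'inconclusive'       # semi-definite case

# Hessians of Examples 3.1.1-3.1.3 at the stationary point (0, 0)
print(classify((2, 0, 2)))     # minimum
print(classify((-2, 0, -2)))   # maximum
print(classify((2, 0, -2)))    # saddle point
```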

3.2 Descent Methods


We derive now algorithms to numerically solve

min f (x), x ∈ Rn , with f : Rn → R.

We distinguish between methods of:

0th order: only the objective value f is available (gradient-free).
1st order: f ′ is available, but f ″ is not available (gradient-based).
2nd order: f ′ and f ″ are available (gradient + Hessian ≅ Newton methods).
We will start by deriving methods of first order (gradient methods) and will extend them to methods
of second order. After this, we will discuss a few approaches for situations where f might
not be differentiable, so that the gradient does not exist. Also for non-differentiable functions
(e.g. noisy functions), some effective methods exist, which are presented in the subsection on
0th order methods.

3.2.1 Methods of first order


3.2.1.1 Gradient methods
Remember:
• Condition of a descent direction (∇f (x))> d < 0
• Direction of steepest descent d = −∇f (x)

3.2.1.2 Method of steepest descent

Algorithm 1: Steepest Descent Algorithm

Initialize at x0 ∈ Rn , k = 1
while ||∇f (xk )|| > ε do
    dk = −∇f (xk )
    Determine step length t > 0 with f (xk + tdk ) < f (xk )
    xk+1 = xk + tdk
    k = k + 1
end
Print optimal point and number of iterations

• In this method t is a step length parameter.
• It should be sufficiently large for efficiency, but not too large, in order to ensure convergence.

Figure 3.2: Descent Method

3.2.1.3 Step length control

1. Assume that the descent direction dk is well chosen.
2. For the determination of the step length it is sufficient to regard the following function:

ϕ(t) := f (xk + tdk ) with t > 0.

3. From ϕ′(0) = ∇f (xk )⊤ dk < 0 it follows ϕ(t) < ϕ(0) ∀ 0 < t < t̄.

This motivates the Armijo rule.

For given numbers β ∈ (0, 1) and σ ∈ (0, 1) determine

tk = max{β^j , j = 0, 1, 2, ...} such that ϕ(tk ) ≤ ϕ(0) + σ tk ϕ′(0).

Demanding this sufficient decrease ensures convergence; accepting the largest such step length
ensures efficiency. Due to ϕ′(0) being negative, the Armijo rule guarantees that the values of the
cost function are decreasing.
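A minimal Python sketch of the Armijo backtracking rule (the values β = 0.5 and σ = 10⁻⁴ are ad hoc choices within the admissible ranges):

```python
def armijo(f, x, fx, g, d, beta=0.5, sigma=1e-4, max_halvings=50):
    """Largest t in {1, beta, beta^2, ...} with
    f(x + t*d) <= f(x) + sigma * t * grad_f(x)^T d  (Armijo rule)."""
    slope = sum(gi*di for gi, di in zip(g, d))   # grad f(x)^T d, must be < 0
    t = 1.0
    for _ in range(max_halvings):
        x_new = [xi + t*di for xi, di in zip(x, d)]
        if f(x_new) <= fx + sigma*t*slope:
            return t
        t *= beta  # shrink the trial step
    return t

# Example: f(x1, x2) = x1^2 + 10*x2^2 at x = (1, 0.1), d = -grad f(x)
f = lambda x: x[0]**2 + 10*x[1]**2
x = [1.0, 0.1]
g = [2*x[0], 20*x[1]]
d = [-g[0], -g[1]]
t = armijo(f, x, f(x), g, d)
print(t)  # 0.125: the first trial steps overshoot and are rejected
```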

Figure 3.3: (a) One-dimensional line search starting from xk towards dk [Ger16]. (b) Increments
that fulfil the Armijo rule [Ger16].

Algorithm 2: Steepest Descent with Simple Armijo-Rule Step-Length Search

Initialize at x0 ∈ Rn , k = 1, β ∈ (0, 1), σ ∈ (0, 1)
while ||∇f (xk )|| > ε do
    dk = −∇f (xk )
    tk = 1
    while f (xk + tk dk ) > f (xk ) + σ tk ∇f (xk )⊤ dk do
        tk = β tk
    end
    xk+1 = xk + tk dk
    k = k + 1
end
Print optimal point and number of iterations

Example 3.2.1.
We use the gradient method to minimize the function

f (x1 , x2 ) = x1² + 10x2² .

The step size is determined with an exact line search using

ϕ(t) := f (xk + tdk ) with t > 0.

We obtain:

∇f (x1 , x2 ) = (2x1 , 20x2 )⊤ ,
d = −∇f (x1 , x2 ) = (d1 , d2 )⊤ ,
ϕ(t) = (x1 + td1 )² + 10(x2 + td2 )² ,
ϕ′(t) = 2(x1 + td1 )d1 + 20(x2 + td2 )d2 .

For an optimal t it needs to hold that ϕ′(t) = 0, which gives

t = −(x1 d1 + 10 x2 d2 ) / (d1² + 10 d2² ).
Then we can update xk+1 = xk + tdk .

With the initial guess of (1, 0.1)⊤ the method of steepest descent with this exact line search
requires 63 iterations to reach the stopping criterion ||∇f (xk )|| ≤ 10⁻⁵, and the optimal point is
(0, 0)⊤ .

Note: The equations used above are specific to this example. Within the computer classes, you
will implement this algorithm for exactly this function. However, try to implement it afterwards
as generally as possible, so that you can replace the function f by any other function.
For the general case, the implementation of the simple Armijo rule is more flexible than the use
of an exactly derived step length (as suggested here).
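The iteration of Example 3.2.1 can be reproduced in a few lines of Python; it indeed needs 63 iterations from (1, 0.1):

```python
def grad(x1, x2):
    # Gradient of f(x1, x2) = x1^2 + 10*x2^2
    return (2*x1, 20*x2)

def norm(g):
    return (g[0]**2 + g[1]**2) ** 0.5

# Steepest descent with the exact line search derived in Example 3.2.1
x1, x2 = 1.0, 0.1       # initial guess from the example
iters = 0
while norm(grad(x1, x2)) > 1e-5:
    d1, d2 = -2*x1, -20*x2                        # d = -grad f
    t = -(x1*d1 + 10*x2*d2) / (d1**2 + 10*d2**2)  # exact step length
    x1, x2 = x1 + t*d1, x2 + t*d2
    iters += 1

print(iters, (x1, x2))  # 63 iterations, final point close to (0, 0)
```

The slow, zigzagging convergence is typical of steepest descent on badly scaled quadratics.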

3.2.1.4 Conjugate Gradient Methods (CG)


The idea of conjugate gradient methods is to expand f (x) in the neighbourhood of the current
point xk as

f (x) = f (xk ) + ∇f (xk )⊤ (x − xk ) + ½ (x − xk )⊤ Hf (x − xk ) + ...

Computing the gradient and following further derivations, the next descent direction dk+1
will include the current descent direction dk :

dk+1 = −∇f (xk+1 ) + βk dk .

In order for the method to converge, these two directions should be conjugate to each other.

Definition: Two search directions are conjugate to each other with respect to a matrix A if

(dk+1 )⊤ A dk = 0.

The matrix A should be the Hessian Hf to be exact. We will use an approximation and obtain a
CG method.
The parameter βk may be computed according to one of the following equations (named after their
developers):

• Fletcher–Reeves:
    βk^FR = ‖∇f (xk )‖² / ‖∇f (xk−1 )‖²

• Polak–Ribière:
    βk^PR = ∇f (xk )⊤ (∇f (xk ) − ∇f (xk−1 )) / ‖∇f (xk−1 )‖²

• Hestenes–Stiefel:
    βk^HS = ∇f (xk )⊤ (∇f (xk ) − ∇f (xk−1 )) / ( dk−1⊤ (∇f (xk ) − ∇f (xk−1 )) )

• Dai–Yuan:
    βk^DY = ‖∇f (xk )‖² / ( dk−1⊤ (∇f (xk ) − ∇f (xk−1 )) )

Note: The method simply requires storage of the old descent direction.

Algorithm 4: Gradient Method CG with β^FR

Initialize at x0 ∈ Rn , k = 1
while ||∇f (xk )|| > ε do
    if k = 1 then
        dk = −∇f (xk )
    else
        compute β
        dk = −∇f (xk ) + β dk−1
    end
    store dk , ∇f (xk ) for the next step
    choose tk , e.g. by approximate line search
    xk+1 = xk + tk dk
    k = k + 1
end
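For comparison with the 63 steepest-descent iterations above, a Python sketch of the CG method with the Fletcher–Reeves update on the same quadratic f (x1, x2) = x1² + 10x2². For this quadratic the exact line search is t = −g⊤d / d⊤Qd with Q = diag(2, 20), and CG reaches the minimum of an n-dimensional quadratic in n steps:

```python
def grad(x1, x2):
    return [2*x1, 20*x2]

# CG with the Fletcher-Reeves update on f(x1, x2) = x1^2 + 10*x2^2
x = [1.0, 0.1]
g = grad(*x)
d = [-g[0], -g[1]]          # first direction: steepest descent
iters = 0
while (g[0]**2 + g[1]**2) ** 0.5 > 1e-10:
    # Exact line search for this quadratic: t = -g.d / d.Q.d, Q = diag(2, 20)
    t = -(g[0]*d[0] + g[1]*d[1]) / (2*d[0]**2 + 20*d[1]**2)
    x = [x[0] + t*d[0], x[1] + t*d[1]]
    g_new = grad(*x)
    # Fletcher-Reeves parameter from the stored old gradient
    beta = (g_new[0]**2 + g_new[1]**2) / (g[0]**2 + g[1]**2)
    d = [-g_new[0] + beta*d[0], -g_new[1] + beta*d[1]]
    g = g_new
    iters += 1

print(iters, x)  # 2 iterations for this 2-dimensional quadratic
```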

3.2.2 Methods of second order


We will now make use of second order derivatives.
Background: Taylor series expansion of the objective.
We start in one dimension, f : R → R:

f (x) ≈ f (xk ) + f ′(xk )(x − xk ) + ½ f ″(xk )(x − xk )² =: f̂ (x).

3.2.2.1 Newton's method

Finding the optimum of f̂ (x) means to differentiate f̂ (x) and to set the result equal to zero
(the higher order terms of the full Taylor series are neglected):

f̂ ′(x) = f ′(xk ) + f ″(xk )(x − xk ) = 0.

The solution will be obtained by the fixed-point iteration

xk+1 = xk − f ′(xk ) / f ″(xk )     (Newton's method).

Going to n dimensions,

f ′(xk ) → ∇f (xk ), f ″(xk ) → Hf (xk ),

so now, for x ∈ Rn and f : Rn → R,

xk+1 = xk − (Hf (xk ))⁻¹ ∇f (xk ) = xk + dN .

Generally, we do not invert Hf , but rather solve the so-called Newton equation

Hf dN = −∇f

to obtain the descent direction according to Newton, dN .

Rates of Convergence: Let xk be the elements of a sequence with limit x∗ . We say that:
xk converges linearly to x∗ if there is a constant 0 < c < 1 such that

||xk+1 − x∗ || ≤ c ||xk − x∗ ||.

Algorithm 5: Local Newton's Method

Initialize at x0 ∈ Rn , ε > 0
while ||∇f (xk )|| > ε do
    solve Hf (xk )dk = −∇f (xk ) for dk (Newton equation, NE)
    xk+1 = xk + dk
    k = k + 1
end

Figure 3.4: Newton's method. (a) Quadratic function: 1 iteration, x∗ = (1, 1).
(b) Rosenbrock's function: 24 iterations, x∗ = (0.9999886, 0.9999768).

x_k converges superlinearly if there is a sequence of constants c_k with c_k → 0 for k → ∞ and

    ||x_{k+1} − x∗|| ≤ c_k ||x_k − x∗||.

x_k converges quadratically if there is a constant c > 0 such that

    ||x_{k+1} − x∗|| ≤ c ||x_k − x∗||².

For the steepest descent method only linear convergence can be shown. For Newton’s method we have at least superlinear convergence. If Hf fulfills a Lipschitz condition,

    ||Hf(x) − Hf(y)|| ≤ L ||x − y||,

then quadratic convergence of Newton’s method can be proven.

Properties of Newton’s Method

• Works without line search (finding t).
• Generally fast, quadratic convergence.
• Computation of the Hessian might be demanding.
• It may diverge if the initial guess x_0 is far away from x∗ and Hf(x_0) is not positive definite.

3.2.2.2 Levenberg-Marquardt Method (LMM):
In order to deal with divergence far from x∗ we may regard the following descent directions:

• d_sd := −I ∇f(x_k)  Steepest descent method, where I is the n × n identity matrix

Table 3.1: Comparison: Steepest descent Method and Newton’s Method

Item | Steepest descent Method                       | Newton’s Method
Pro  | Converges for every initial guess             | Fast convergence
Con  | Slow convergence (high number of iterations)  | Converges only close to x∗

• d_N := −(Hf(x_k))^{−1} ∇f(x_k)  Newton’s method

• d_LM := −(αI + Hf(x_k))^{−1} ∇f(x_k), α > 0  Scaling

The last choice is a hybrid between the steepest descent method and Newton’s method and is called the Levenberg-Marquardt method. For large α, d_LM is close to d_sd, which is preferable at the beginning of the iterations:

    α → ∞  ⇒  d_LM ∼ d_sd.

For small α, d_LM is close to d_N, which is preferable close to the optimal point:

    α → 0  ⇒  d_LM ∼ d_N.

So adapting α during the iterations is beneficial:

    α_0 > α_1 > α_2 > · · ·

Properties:
• Convergence for every initial guess.
• One may implement this with a variable α_k with α_k → 0 as k → ∞.
• For the LMM expect at least superlinear convergence.
• A step length control is recommended.
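A minimal Python sketch of the LMM direction d_LM = −(αI + Hf(x_k))^{−1}∇f(x_k) with a shrinking α_k; the diagonal quadratic test objective and the halving schedule are illustrative assumptions:

```python
import numpy as np

def lmm(grad, hess, x0, alpha0=10.0, shrink=0.5, eps=1e-8, max_iter=200):
    """Levenberg-Marquardt descent: d = -(alpha*I + Hf)^(-1) grad f.
    alpha is simply shrunk each iteration (alpha_0 > alpha_1 > ...),
    moving from a steepest-descent-like step towards a Newton step."""
    x = np.asarray(x0, dtype=float)
    alpha = alpha0
    n = len(x)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:
            break
        d = np.linalg.solve(alpha * np.eye(n) + hess(x), -g)
        x = x + d
        alpha *= shrink          # alpha -> 0 recovers Newton's method
    return x

# Assumed test objective: f(x) = (x1-1)^2 + 10*(x2+2)^2
grad = lambda x: np.array([2*(x[0]-1), 20*(x[1]+2)])
hess = lambda x: np.diag([2.0, 20.0])
x_star = lmm(grad, hess, [5.0, 5.0])
```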

3.2.2.3 Quasi-Newton Methods:

Quasi-Newton methods update an approximation of Hf(x∗) as the iteration progresses. The advantage of Quasi-Newton methods is that the Hessian matrix need not be computed, which is in many cases computationally demanding (2nd order derivatives).

Algorithm 6: Quasi-Newton Algorithm

Initialize at x_0 ∈ R^n, ε > 0
Initialize H_0, often H_0 = I (identity matrix)
while ||∇f(x_k)|| > ε do
    compute d_k = −H_k^{−1} ∇f(x_k)
    x_{k+1} = x_k + t_k d_k (t_k with line search)
    use x_k, x_{k+1} and H_k to compute H_{k+1}^{−1}
    k = k + 1
end

BFGS (Broyden, Fletcher, Goldfarb, Shanno) Method. We consider the following rank-two update

    H_{k+1} = H_k + (y y^T)/(y^T s) − (H_k s)(H_k s)^T/(s^T H_k s),

and for the inverse H_{k+1}^{−1} of H_{k+1} it holds

    H_{k+1}^{−1} = (I − (s y^T)/(y^T s)) H_k^{−1} (I − (y s^T)/(y^T s)) + (s s^T)/(y^T s).

Here s = x_{k+1} − x_k and y = ∇f(x_{k+1}) − ∇f(x_k). Each H_{k+1} is symmetric positive definite. If ||H_0 − ∇²f(x∗)|| ≤ δ, then the BFGS iterates are well-defined and converge q-superlinearly to x∗.
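The inverse update formula avoids both the Hessian and the linear solve. A Python sketch on an assumed quadratic test problem, using the exact line search that is available in closed form there:

```python
import numpy as np

def bfgs_quadratic(A, x0, eps=1e-10, max_iter=100):
    """Quasi-Newton iteration with the inverse BFGS update on the
    quadratic f(x) = 0.5 x^T A x; the exact line search for quadratics
    stands in for a general step-length rule."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    I = np.eye(n)
    Hinv = np.eye(n)                 # H_0 = I
    g = A @ x
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:
            break
        d = -Hinv @ g
        t = -(g @ d) / (d @ A @ d)   # exact line search on the quadratic
        x_new = x + t * d
        g_new = A @ x_new
        s = x_new - x                # s = x_{k+1} - x_k
        y = g_new - g                # y = grad f(x_{k+1}) - grad f(x_k)
        ys = y @ s                   # > 0 on a convex problem
        Hinv = ((I - np.outer(s, y) / ys) @ Hinv @ (I - np.outer(y, s) / ys)
                + np.outer(s, s) / ys)
        x, g = x_new, g_new
    return x

A = np.diag([2.0, 20.0])             # assumed test problem
x_star = bfgs_quadratic(A, [3.0, 2.0])
```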

Figure 3.5: Quasi-Newton method with BFGS. (a) Quadratic function: 14 iterations. (b) Rosenbrock’s function: 41 steps.

3.2.3 0th order descent methods

3.2.3.1 Method of Golden Cut:
Determine the minimizer of a function f : R → R over a closed interval [a, c].
Assumption: The function f is unimodal (f has only one local minimizer).

(a) Method of Golden Cut. (b) Fibonacci series1 forming a spiral; the golden cut is the ratio of the sides of the squares, B/A = A/(A+B), e.g. 13/21 ≈ 21/(13+21) ≈ 0.62.

These few lines are easily implemented. The question is how to make the algorithm fast, so that not too many iterations are required. To answer this question, we normalize the intervals at each iteration k:

    m_k = (b_k − a_k)/(c_k − a_k)

1 (1 + 1 = 2; 1 + 2 = 3; 2 + 3 = 5; 3 + 5 = 8; ...)

Algorithm 7: Golden cut Algorithm
Choose two points d and b such that a < d < b < c
while |f(a) − f(c)| ≥ ε do
if f (d) ≤ f (b)(minimal point is left of b) then
c=b
else
a=d
end
Define new intermediate values
d = a + m(c − a)
b = a + (1 − m)(c − a)
end

and

    1 − m_k = (c_k − b_k)/(c_k − a_k).

The key idea is to choose m in such a way that there is an equal probability for the two cases (f(d) ≤ f(b) and f(d) > f(b)) in the next iteration. This can be achieved by making use of the rule of the golden cut:

    (a + b)/b = b/a =: Φ
    a/b + 1 − b/a = 0
    Φ^{−1} + 1 − Φ = 0
    1 + Φ − Φ² = 0
    Φ = 1/2 + sqrt((1/2)² + 1) ≈ 1.618,  so 1/Φ ≈ 0.618,

    m = 1 − 0.618 = 0.382
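The method is easily sketched in Python. As an assumption for this sketch, the interval width |c − a| is used as the stopping criterion instead of |f(a) − f(c)|:

```python
import math

def golden_cut(f, a, c, eps=1e-8, max_iter=200):
    """Golden-cut search for the minimizer of a unimodal f on [a, c].
    m = 1 - 0.618... ~ 0.382 keeps the interval proportions identical
    in every iteration."""
    m = 1.0 - (math.sqrt(5.0) - 1.0) / 2.0   # ~ 0.382
    for _ in range(max_iter):
        if abs(c - a) < eps:
            break
        d = a + m * (c - a)
        b = a + (1.0 - m) * (c - a)
        if f(d) <= f(b):      # minimal point is left of b
            c = b
        else:
            a = d
    return 0.5 * (a + c)

# Assumed unimodal test function with minimizer at x = 2
x_min = golden_cut(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
```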

3.2.3.2 Method according to Powell

This generalizes one-dimensional search algorithms to n-dimensional problems. We search in one-dimensional directions parallel to the axes of a Cartesian coordinate system, see Figure 3.7. We combine the solutions to a joint search direction d and update the current point

    x_{k+1} = x_k + d,  k = 0, 1, 2, ...  (k: iteration counter).

A further one-dimensional search might be added to find the optimal length of the joint search direction.

Figure 3.7: Method according to Powell.

3.2.3.3 Method of Nelder-Mead (Simplex Method)

(a) Nelder-Mead (Simplex Method) [Ast06] (b) Nelder-Mead Simplex and New Points [Kel99]

In each step, we construct a simplex in R^n which is given by its n + 1 vertices x_j (n = 2: triangle, n = 3: tetrahedron) and sort the vertices according to their objective function values

    f(x_1) ≤ f(x_2) ≤ · · · ≤ f(x_n) ≤ f(x_{n+1}).

The idea is now to replace the worst vertex x_{n+1}, i.e. the point with the highest value of the objective. Therefore we compute the centroid

    x̄ = (1/n) Σ_{i=1}^n x_i,

and check the new point

    x(µ) = (1 + µ)x̄ − µ x_{n+1}.

The value µ is selected from a sequence

    −1 < µ_ic < 0 < µ_oc < µ_r < µ_e,  often −1 < −0.5 < 0 < 0.5 < 1 < 2.

Algorithm 7: Algorithm of Nelder-Mead
Define objective function f(x) with dimension n
Initialize µ_ic = −0.5, µ_oc = 0.5, µ_r = 1, µ_e = 2
for 1 ≤ k ≤ n + 1 do
    f_k = f(x_k)
end
Sort fk and xk
while Stopping criterion do
Compute x̄, x(µr ) and fr = f (x(µr ))
if f (x1 ) ≤ fr < f (xn ) (Reflection) then
Replace xn+1 with x(µr )
else if fr < f (x1 ) (Expansion) then
Compute fe = f (x(µe ))
if fe < fr then
Replace xn+1 with x(µe )
else
Replace xn+1 with x(µr )
end
else if f (xn ) ≤ fr < f (xn+1 ) (Outside Contraction) then
Compute fc = f (x(µoc ))
if fc ≤ fr then
Replace xn+1 with x(µoc )
else
for 2 ≤ i ≤ n + 1 do
set xi = x1 − (xi − x1 )/2, compute f (xi ) (Shrink).
end
end
else if fr ≥ f (xn+1 ) (Inside Contraction) then
Compute fc = f (x(µic ))
if fc < f (xn+1 ) then
Replace xn+1 with x(µic )
else
for 2 ≤ i ≤ n + 1 do
set xi = x1 − (xi − x1 )/2, compute f (xi ) (Shrink).
end
end
for 1 ≤ k ≤ n + 1 do
fk = f (xk )
end
Sort fk and xk
end
The Nelder-Mead method is provided in Matlab or Octave as the fminsearch function. An application to Rosenbrock’s function:

fun = @(x) 100*(x(2)-x(1)^2)^2 + (1-x(1))^2;
x0 = [-1.2, 1]; % initial guess
options = optimset('PlotFcns',@optimplotfval,'Display','iter');
x = fminsearch(fun, x0, options)

3.3 Constrained Optimization
We regard now problems of the following type:

    min f(x) such that
    g_i(x) ≤ 0, i = 1, . . . , m  (inequality constraints),
    h_j(x) = 0, j = 1, . . . , p  (equality constraints),
    x_l[q] ≤ x[q] ≤ x_u[q], q = 1, . . . , n  (explicit restrictions),

where x ∈ R^n, f, g_i, h_j : R^n → R and sufficiently smooth. The bracketed notation x_k[q] denotes the q-th element of the vector x_k at the k-th iteration, e.g.

    x_k = (x_k[1], . . . , x_k[q], . . . , x_k[n])^T.

The set of admissible solutions of Standard Optimization Problems (SOP) is

    X = {x ∈ R^n | g_i(x) ≤ 0, h_j(x) = 0, x_l[q] ≤ x[q] ≤ x_u[q], q = 1, . . . , n}.


We start to derive again optimality criteria as they provide a basis to derive methods for the
solution of Standard Optimization Problems (SOP).

3.3.1 Explicit Restriction

We start regarding problems with explicit restrictions:

Definition: If x∗ is the optimal point, then the constraint x_l[q] is called active if x_l[q] = x∗[q]; equally for x_u[q]. Otherwise, the constraints are called inactive.
A(x) = {q ∈ {1, · · · , n} | x_l[q] = x∗[q] or x_u[q] = x∗[q]} forms the set of active constraints.
I(x) = {1, · · · , n} \ A(x) forms the set of inactive constraints.

Optimality conditions: In the case of no restrictions we required that ∇f(x∗) = 0 and Hf(x∗) positive definite. This needs to be generalized:

    ∇f(x∗)^T (x − x∗) ≥ 0 for all admissible x,

and Hf_R(x∗) is positive definite, with

    (Hf_R(x∗))_ij = δ_ij            if i ∈ A(x) or j ∈ A(x),
    (Hf_R(x∗))_ij = (Hf(x∗))_ij    if i, j ∈ I(x).

In the above, δ_ij is Kronecker’s delta, defined as

    δ_ij = 1 if i = j, 0 else.
How do we now deal with constraints in our algorithms? Defining a projector

    P(x)_q = x_l[q]  if x[q] ≤ x_l[q],
    P(x)_q = x[q]    if x_l[q] ≤ x[q] ≤ x_u[q],
    P(x)_q = x_u[q]  if x_u[q] ≤ x[q],

a point x∗ is stationary if

    x∗ = P(x∗ + t d(x∗)), ∀t > 0.

To apply a projector in the previously discussed algorithms (steepest descent, Newton, Levenberg-Marquardt, ..., Nelder-Mead) the change is simply in the line where we update the parameters, i.e. we have now

    x_{k+1} = P(x_k + t d_k)

for any choice of the descent direction d_k and step length t. Figure 3.9 visualizes how the concept works for a two-dimensional quadratic function with the constraints x < −1 and y < −1.

Figure 3.9: Projection method for the bound constraints x < −1 and y < −1
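The projector idea can be sketched in Python for steepest descent with box constraints; the step length and the test function are assumptions for the demo:

```python
import numpy as np

def project(x, xl, xu):
    """Componentwise projector P onto the box [xl, xu]."""
    return np.minimum(np.maximum(x, xl), xu)

def projected_gradient(grad, x0, xl, xu, t=0.1, max_iter=500):
    """Steepest descent with the update x_{k+1} = P(x_k + t*d_k);
    the projector is the only change to the unconstrained method."""
    x = project(np.asarray(x0, dtype=float), xl, xu)
    for _ in range(max_iter):
        x_new = project(x - t * grad(x), xl, xu)
        if np.linalg.norm(x_new - x) < 1e-12:   # stationary point
            break
        x = x_new
    return x

# Quadratic with unconstrained minimum at the origin, but bound
# constraints x <= -1 and y <= -1 (as in Figure 3.9):
grad = lambda x: 2.0 * x
xl = np.array([-10.0, -10.0]); xu = np.array([-1.0, -1.0])
x_star = projected_gradient(grad, [-5.0, -5.0], xl, xu)
```

The iterates are pulled towards the origin but clamped at the bounds, so the method stops at (−1, −1).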

3.3.2 Penalty Methods

Penalty methods simply increase the value of the objective when a constraint is violated:

    f̂(x) = f(x) + r P_f(x),

with P_f(x) a penalty function and r > 0 a scaling parameter. There are different definitions of P_f, for example

    P_f(x) = Σ_{j=1}^p (h_j(x))² + Σ_{i=1}^m (max(g_i(x), 0))² + Σ_{q=1}^n (min(x[q] − x_l[q], 0))² + Σ_{q=1}^n (min(x_u[q] − x[q], 0))².

The effect on the optimization algorithm can be seen in Figure 3.10. Penalty formulations are more flexible and do not require any changes to the algorithm (only the objective is changed). A difficulty might be the evaluation of points outside the set of admissible points, which may lead to failing model evaluations (for example negative material parameters).

Figure 3.10: Penalty method for the bound constraints x < −1 and y < −1
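A Python sketch of the penalized objective f̂(x) = f(x) + r P_f(x); the function and parameter names are hypothetical:

```python
def penalized(f, x, r, g_list=(), h_list=(), xl=None, xu=None):
    """Penalty objective f(x) + r * P_f(x) with the quadratic penalty
    P_f built from equality, inequality and bound violations."""
    P = 0.0
    for h in h_list:                       # equality constraints
        P += h(x) ** 2
    for g in g_list:                       # inequality constraints
        P += max(g(x), 0.0) ** 2
    if xl is not None:                     # lower-bound violations
        P += sum(min(xi - li, 0.0) ** 2 for xi, li in zip(x, xl))
    if xu is not None:                     # upper-bound violations
        P += sum(min(ui - xi, 0.0) ** 2 for xi, ui in zip(x, xu))
    return f(x) + r * P

# Admissible point: no penalty; violated bound: heavily penalized.
f = lambda x: x[0] ** 2 + x[1] ** 2
val_ok = penalized(f, (-2.0, -2.0), r=100.0, xu=(-1.0, -1.0))
val_bad = penalized(f, (0.0, 0.0), r=100.0, xu=(-1.0, -1.0))
```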

3.3.3 Linear Independence Constraint Qualification (LICQ)
We will now extend the theory and numerical frameworks to also allow for equality (h_j(x) = 0) and inequality (g_i(x) ≤ 0) constraints. These are of particularly high importance in the context of structural optimization, where, e.g., one wants to minimize the volume (material) of a structure while keeping its safety guaranteed. Without this safety constraint, the problem would yield the trivial solution of not using any material at all (zero volume).
We now define a condition on the gradients of the constraints. We regard the linear independence constraint qualification (LICQ) as a condition on X.
Let x∗ be admissible. The set

    A(x∗) = {i ∈ {1, ..., m} | g_i(x∗) = 0}

is the set of indices of ”active inequality constraints”. The LICQ is fulfilled in x∗ if the gradients of the active constraints,

    ∇h_j(x∗), j = 1, ..., p, and ∇g_i(x∗), i ∈ A(x∗),

are linearly independent. This condition implies that X is not degenerate (so it is not empty and does not consist solely of one point or line). We now regard the LAGRANGIAN function L as a combination of the cost function and the constraints:

    L(x, λ, µ) = f(x) + Σ_{i=1}^m λ_i g_i(x) + Σ_{j=1}^p µ_j h_j(x)

with Lagrange multipliers

    λ = (λ_1, · · · , λ_m)^T and µ = (µ_1, · · · , µ_p)^T.

Theorem (Karush-Kuhn-Tucker) (KKT): Let x∗ be an optimal point of the Standard Optimization Problem (SOP), let the functions f, g_i and h_j be continuously differentiable, and let the LICQ be satisfied. Then there exist Lagrange multipliers λ and µ such that:
1. Stationarity of L:

    ∇_x L(x∗, λ, µ) = 0

(which means that ∇f can be represented as a linear combination of the ∇g_i and ∇h_j in x∗).
2. Complementarity conditions:

    λ_i ≥ 0, λ_i g_i(x∗) = 0, g_i(x∗) ≤ 0

(either g_i is active, then λ_i can be an arbitrary nonnegative value, or g_i is not active and λ_i needs to be zero).
Example 3.3.1. Equality constraints

    min f(x_1, x_2) = x_1 + x_2 such that
    h(x_1, x_2) = x_1² − x_2 = 0.

The Lagrangian to this problem is

    L(x, µ) = x_1 + x_2 + µ (x_1² − x_2),

where x_1 + x_2 = f(x_1, x_2) and x_1² − x_2 = h(x_1, x_2). It holds

    ∇_x L(x_1, x_2, µ) = (1 + 2µx_1, 1 − µ)^T,  ∇h(x_1, x_2) = (2x_1, −1)^T,  ∇f(x_1, x_2) = (1, 1)^T.

The LICQ is fulfilled as ∇h ≠ 0 ∀x_1.
We apply KKT: Let (x_1, x_2) be a local minimum. Then there exists a parameter µ with ∇_x L(x_1, x_2, µ) = (0, 0)^T, which gives a system of linear equations

    1 + 2µx_1 = 0,
    1 − µ = 0,

with solution µ = 1 and x_1 = −0.5. Introducing x_1 = −0.5 into the equality constraint we get x_2 = 0.25.

Figure 3.11: Example min f (x1 , x2 ) = x1 + x2
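The KKT point of Example 3.3.1 can be verified numerically in a few lines of Python:

```python
# Check of the KKT point (x1, x2, mu) = (-0.5, 0.25, 1) of
# Example 3.3.1: the Lagrangian gradient vanishes and the
# equality constraint holds.
x1, x2, mu = -0.5, 0.25, 1.0
grad_L = (1.0 + 2.0 * mu * x1,   # d/dx1 of x1 + x2 + mu*(x1^2 - x2)
          1.0 - mu)              # d/dx2
h = x1 ** 2 - x2                 # equality constraint h(x1, x2)
```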

Example 3.3.2.

    min f(x) = x_1, x ∈ R², such that
    g(x) = (x_1 − 4)² + x_2² − 16 ≤ 0.

With this,

    L(x, λ) = x_1 + λ((x_1 − 4)² + x_2² − 16)

and

    ∇_x L = (1 + 2λ(x_1 − 4), 2λx_2)^T,  ∇f(x) = (1, 0)^T,  ∇g(x) = (2(x_1 − 4), 2x_2)^T.

The LICQ is fulfilled for all admissible points, as ∇g ≠ 0 everywhere except at (x_1, x_2) = (4, 0). There we obtain g(4, 0) = −16 < 0, so the inequality constraint is not active at that point.
The KKT theorem tells us that there is a λ such that

    1 + 2λ(x_1 − 4) = 0 and 2λx_2 = 0,    (1)

and from the complementarity conditions

    λ ((x_1 − 4)² + x_2² − 16) = 0,    (2)
    λ ≥ 0,    (3)
    (x_1 − 4)² + x_2² − 16 ≤ 0.    (4)

We regard two cases:

1. λ > 0:
If λ > 0, then x_2 = 0 due to equation (1). From equation (2) it then follows that

    (x_1 − 4)² + x_2² − 16 = 0,

and together with equation (1),

    1 + 2λ(x_1 − 4) = 0.

With this, we have a system of equations for x_1 and λ. It follows that x_1 = 0 or x_1 = 8. For x_1 = 8 we obtain λ = −1/8, which contradicts λ > 0 (conflict with equation (3)). The only point satisfying the KKT conditions is x_1 = 0, x_2 = 0 with λ = 1/8.

2. λ = 0:
If λ = 0, equation (1) would give 1 = 0, which is again a contradiction.

3.3.4 Lagrange-Newton Method

We want to construct algorithms to solve Standard Optimization Problems (SOP) with equality constraints:

    min f(x) such that h_j(x) = 0, j = 1, ..., p.

If x∗ is an optimal point and all ∇h_j(x∗) are linearly independent, then the KKT conditions hold, so there is µ̂ = (µ̂_1, ..., µ̂_p)^T such that

    ∇_x L(x∗, µ̂) = ∇_x f(x∗) + Σ_{j=1}^p µ̂_j ∇h_j(x∗) = 0,    (3.1)
    h_j(x∗) = 0, j = 1, ..., p.    (3.2)

This is a nonlinear system of equations for x and µ. With F : R^n × R^p → R^{n+p} and

    F(x, µ) = (∇_x L(x, µ), h(x))^T,  h(x) = (h_1(x), . . . , h_p(x))^T,
Applying Newton’s method to F (x, µ) = 0 gives the Lagrange-Newton Method.

Algorithm 8: Lagrange-Newton Method Algorithm

Initialize x_0 ∈ R^n and µ_0 ∈ R^p, set k = 0 and ε > 0
while ||F(x_k, µ_k)|| > ε do
    Solve the linear system of equations
        [ ∇²_xx L(x_k, µ_k)   h′(x_k)^T ] [ d_k ]     [ ∇_x L(x_k, µ_k) ]
        [ h′(x_k)             0         ] [ γ_k ] = − [ h(x_k)          ]
    and update
        x_{k+1} = x_k + d_k
        µ_{k+1} = µ_k + γ_k
    Set k = k + 1
end

The Lagrange-Newton method is well defined for all x_0 and µ_0 close to x∗ and µ̂ and converges superlinearly. If Hf and h″ are Lipschitz-continuous, then we have convergence of quadratic order.
Comment: For inequality constraints one generally has to use penalty formulations, i.e. we minimize f̃(x) = f(x) + rP(x) with r > 0 and P(x) a penalty function which returns ”high” values if any of the inequality constraints is violated. The function f̃(x) is then minimized by approaches from unconstrained nonlinear optimization.
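Algorithm 8 can be sketched in Python for Example 3.3.1, where the KKT matrix can be written out by hand:

```python
import numpy as np

def lagrange_newton(x0, mu0, eps=1e-10, max_iter=50):
    """Lagrange-Newton method for Example 3.3.1:
    min x1 + x2  s.t.  h(x) = x1^2 - x2 = 0."""
    z = np.array([x0[0], x0[1], mu0], dtype=float)   # z = (x1, x2, mu)
    for _ in range(max_iter):
        x1, x2, mu = z
        F = np.array([1.0 + 2.0 * mu * x1,   # grad_x L, first component
                      1.0 - mu,              # grad_x L, second component
                      x1 ** 2 - x2])         # h(x)
        if np.linalg.norm(F) <= eps:
            break
        # KKT matrix [[Hxx L, h'(x)^T], [h'(x), 0]] for this problem:
        J = np.array([[2.0 * mu, 0.0, 2.0 * x1],
                      [0.0,      0.0, -1.0],
                      [2.0 * x1, -1.0, 0.0]])
        z = z + np.linalg.solve(J, -F)        # Newton step on F = 0
    return z

z_star = lagrange_newton([1.0, 1.0], 1.0)
```

Starting from (x_0, µ_0) = ((1, 1), 1), the iteration reaches the KKT point (−0.5, 0.25) with µ̂ = 1.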

Chapter 4

Structural Optimization

Dimensioning: The state of a system which is defined by its geometry and the related dimensions such as length, height, radius, etc. is improved by changing these dimensions. For example: finding the optimal depth of cut required in a sheet of paper to obtain a box of maximal volume, or finding the minimal amount of material required to form a cuboid (or cube).

Shape Optimization: Refers to the improvement of the state of an object by changing its shape parameters or the shape of its boundary.

Topology Optimization: Refers to the improvement of the state of a system by changing its topology, i.e. the number of holes. It deals with the material distribution within a design space.

4.1 Dimensioning Problems


4.1.1 Creating a Box

Consider a sheet of paper (e.g. 40 cm × 30 cm) which is to be modified to form a box with maximal volume, where x, the design variable, is the depth of the cut into the sheet of paper:

    max V(x), x ∈ R⁺, V(x) = (b − 2x)(a − 2x)x.

Derivative:

    dV(x)/dx = ab − 4(a + b)x + 12x² = 0,

with the relevant root (the smaller one, at which the volume is maximal)

    x∗ = (a + b)/6 − sqrt(((a + b)/6)² − ab/12).

With a = 30 and b = 40:

    x∗ ≈ 5.66.

Remark: When you use this example together with the method of golden cut, do not be confused by the variables a and b: they are used both in the algorithm and in the example. Better rename them, e.g. in the example to l (length) and w (width).
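The closed-form solution can be checked numerically; the variable names follow the example:

```python
import math

# Optimal cut depth for the box example: the smaller positive root of
# 12 x^2 - 4(a+b) x + ab = 0.
a, b = 30.0, 40.0
x_opt = (a + b) / 6.0 - math.sqrt(((a + b) / 6.0) ** 2 - a * b / 12.0)

def volume(x):
    """Box volume for cut depth x."""
    return (b - 2.0 * x) * (a - 2.0 * x) * x
```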

4.1.2 Minimization of material required to wrap goods (e.g. sugar)

Imagine a certain volume of goods needs to be wrapped, consuming as little material as possible, where

    V = a × b × c,
    A = 2ab + 2bc + 2ac.

Let V be given, e.g. V = 1000:

    c = 1000/(ab).

Minimize now

    A(a, b) = 2ab + 2000/a + 2000/b.

We work with the gradient,

    ∇A(a, b) = (2b − 2000/a², 2a − 2000/b²)^T = 0

    ⇒ a = b = 10 ⇒ c = 10.

The cube is the best shape to wrap goods of any given volume. Remark: When you use this example together with the method of golden cut, do not be confused by the variables a and b; they are used both in the algorithm and in the example. Better rename them, e.g. in the example to x, y and z.
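A quick numerical check of the stationarity condition at a = b = 10:

```python
# Gradient check for the wrapping example: at a = b = 10 both partial
# derivatives of A(a, b) = 2ab + 2000/a + 2000/b vanish, and the
# surface area of the unit-volume-1000 cube is 600.
def grad_A(a, b):
    return (2.0 * b - 2000.0 / a ** 2, 2.0 * a - 2000.0 / b ** 2)

g = grad_A(10.0, 10.0)
area = 2 * 10 * 10 + 2000.0 / 10 + 2000.0 / 10
```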

4.2 Shape Optimization:
Let’s begin with the shape optimization of mechanical structures.

Figure 4.1: Flowchart of the classical structural optimization

4.2.1 (A very) short introduction to FEM

In most cases, the model of an elastostatic problem is solved with the Finite Element Method (FEM):
• We start with the strong form

    B^T c B u = f in Ω,
    u = ū on Γ_e,
    c B u = t̄ on Γ_t,

where
• u – displacements,
• B – strain-displacement differential operator matrix,

    B = [ ∂/∂x   0      0    ]
        [ 0      ∂/∂y   0    ]
        [ 0      0      ∂/∂z ]
        [ 0      ∂/∂z   ∂/∂y ]
        [ ∂/∂z   0      ∂/∂x ]
        [ ∂/∂y   ∂/∂x   0    ]

• Bu – strains,
• c – stiffness tensor/matrix,
with
Ω – computational domain,
Γ – boundary,
Γ_e – part of Γ with essential boundary conditions,
Γ_t – part of Γ with traction forces.
Nodal displacements u_i are computed on the nodes by the FEM directly from the FE system.

Figure 4.2: Computational domain, mesh and boundaries with boundary conditions

Element displacements u_n are calculated by interpolation using the shape functions,

    u_n = N_n u_{ni},

where N_n are the shape functions and u_{ni} the displacements of the nodes connected to element n. The strains are obtained by differentiation of the displacements,

    ε_n = B_n u_{ni},

where B_n is the matrix of derivatives of the shape functions,

    B_n = [ ∂N/∂x   0       0     ]
          [ 0       ∂N/∂y   0     ]
          [ 0       0       ∂N/∂z ]
          [ 0       ∂N/∂z   ∂N/∂y ]
          [ ∂N/∂z   0       ∂N/∂x ]
          [ ∂N/∂y   ∂N/∂x   0     ]

The stresses are obtained by using Hooke’s Law:

    σ_n = c_n B_n u_{ni}.

We start with the strong formulation:

    B^T c B u = f in Ω.

Multiplying with sufficiently smooth functions v which vanish on the boundary and applying partial integration yields

    ∫_Ω (cBu)^T Bv dΩ = ∫_Ω f v dΩ,

which gives the so-called weak formulation of the above problem.


By choosing u and v from finite dimensional function spaces the solution of the weak formulation
is approximated by its algebraic form

Ku = F

where u is the vector of all nodal displacements, F the vector of forces, and K the stiffness matrix. The matrix K is assembled from the elemental stiffness matrices K_n:

    K = Σ_{n=1}^N K_n = Σ_{n=1}^N ∫_{Ω_n} B_n^T c_n B_n dΩ_n.

F is obtained by assembling the volume forces.

The integral is generally approximated by quadrature rules, like Gaussian quadrature.
In elastodynamics the equation of motion considers additionally inertia and damping effects. The
equation of motion in its Finite Element form is given by:

M ü(t) + Du̇(t) + Ku(t) = F (t)


where, M and D are the Mass matrix and Damping matrix respectively.
So, if we assume we have access to the solutions of the Finite Element problem, what are typical objectives for the optimization? Basically, we have two viewpoints:

• Local min-max formulation: this formulation first searches for the highest value of some quantity and then minimizes it. The quantity u(x) mainly stands for responses such as displacements, strains, stresses, volumes, eigenfrequencies, etc.:

    min (max u(x)).

• Global formulation:

    min ∫ u(x) dΩ.

Recommended software: PDETOOL in Matlab (easy to use, export of system matrix and solution vectors to the Matlab workspace for further processing), CALFEM (http://www.byggmek.lth.se/english/calfem/, a Finite Element package based on Matlab provided by the University of Lund), ANSYS (educational license available at BUW), ABAQUS, FREEFEM, OOFEM, ...

4.2.2 Geometric description of shapes by splines:

In optimization, we require flexibility, stability and efficiency. Shapes can be parameterized by a collection of points and defined shape functions. For example, a straight line between two points P_1 and P_2:

    R(u, P_1, P_2) = P_1 + (P_2 − P_1) · u, u ∈ [0, 1].

4.2.2.1 Modeling of boundaries by B-splines


The basis for generating B-splines is a vector of knots

U = {u1 , u2 , ..., uA , ..., un+p+1 }, uA ∈ R

n-number of control points


p-polynomial order
u1 ≤ u2 ≤ u3 ≤ ....

[u_i, u_{i+1}] is called a knot span. To form the basis functions N_{A,p} we use a recursion formula. For p = 0:

    N_{A,0}(u) = 1 if u_A ≤ u < u_{A+1}, 0 else.

For p > 0:

    N_{A,p}(u) = (u − u_A)/(u_{A+p} − u_A) · N_{A,p−1}(u) + (u_{A+p+1} − u)/(u_{A+p+1} − u_{A+1}) · N_{A+1,p−1}(u).

Figure 4.3: Basis functions for splines. Source [FSF02]
The B-spline is finally obtained by combining the control points with the shape functions:

    S(u) = Σ_{A=1}^n P_A N_{A,p}(u),  P = {P_A}_{A=1}^n.

Properties:
• Σ_A N_{A,p}(u) = 1 (partition of unity),
• local support.

Splines can be used to define boundaries. Further, they can be generalized to represent surfaces and solids. The generalization is straightforward:

    Surface: S(u, v) = Σ_{A=1}^n Σ_{B=1}^m P_{A,B} N_{A,B}^{p,q}(u, v),

    Volume: V(u, v, w) = Σ_{A=1}^n Σ_{B=1}^m Σ_{C=1}^l P_{A,B,C} N_{A,B,C}^{p,q,r}(u, v, w).
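The recursion formula for N_{A,p} translates directly into Python (a sketch with 0-based indexing, unlike the 1-based notation above; the uniform knot vector is an assumed example):

```python
def bspline_basis(A, p, u, U):
    """Cox-de Boor recursion for the B-spline basis N_{A,p}(u) on the
    knot vector U; fractions with a zero denominator are taken as 0."""
    if p == 0:
        return 1.0 if U[A] <= u < U[A + 1] else 0.0
    left, right = 0.0, 0.0
    if U[A + p] != U[A]:
        left = (u - U[A]) / (U[A + p] - U[A]) * bspline_basis(A, p - 1, u, U)
    if U[A + p + 1] != U[A + 1]:
        right = ((U[A + p + 1] - u) / (U[A + p + 1] - U[A + 1])
                 * bspline_basis(A + 1, p - 1, u, U))
    return left + right

# Uniform knots, quadratic basis (p = 2); evaluate all basis functions
# at a point inside the valid span to check the partition of unity.
U = [0, 1, 2, 3, 4, 5, 6]
vals = [bspline_basis(A, 2, 2.5, U) for A in range(len(U) - 3)]
```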

4.2.2.2 Modeling of boundaries by NURBS

NURBS are an extension of splines. NURBS stands for Non-Uniform Rational B-Splines. By non-uniform we mean that the knot spans can have arbitrary length. NURBS are defined by a knot vector

    U = {u_1, u_2, ..., u_{n+p+1}},

by control points P = {P_A}_{A=1}^n, by weights W = {w_A}_{A=1}^n, and by the rational basis functions

    R_{A,p}(u) = N_{A,p}(u) w_A / Σ_{B=1}^n N_{B,p}(u) w_B.

The resulting curve is

    T(u) = Σ_{A=1}^n R_{A,p}(u) P_A.

Figure 4.4: Circles generated by rational splines. (a) plain, (b) changed weights: w = [1 s45 5 s45 1 s45 1 s45 1], (c) changed control points: x = [1 1 0.5 -1 -1 -1 0 1 1]; y = [0 1 1 1 0 -1 -1 -1 0];

% Working with splines in Matlab:
clear
close all
% Generates a circle with minimal information
% Define control points
x = [1 1 0 -1 -1 -1 0 1 1];
y = [0 1 1 1 0 -1 -1 -1 0];
% weights
s45 = 1/sqrt(2);
w = [1 s45 1 s45 1 s45 1 s45 1];
% rsmak(KNOTS,COEFS) returns the rBform of the rational spline
% specified by the input.
% Put together rational spline
circle = rsmak(augknt(0:4,3,2), [w.*x; w.*y; w]);
% plot spline
figure
fnplt(circle)
title('Circle described by R-Spline');
xlabel('X');
ylabel('y');
set(gca,'FontSize',18);
print(gcf,'-dpdf','splineCurve.pdf');

Also spheres can be generated by splines:

% Generate a sphere
figure
southcap = rsmak('southcap'); fnplt(southcap)
xpcap = fncmb(southcap,[0 0 -1;0 1 0;1 0 0]);
ypcap = fncmb(xpcap,[0 -1 0; 1 0 0; 0 0 1]);
northcap = fncmb(southcap,-1);
% plot
hold on, fnplt(xpcap),
fnplt(ypcap), fnplt(northcap)
axis equal, shading interp, view(-115,10), axis off, hold off

4.3 Application: Optimization of the Shape of a Dam
The cross-section of a dam described by splines is optimized. For the dam, a hydro-mechanically coupled Finite Element model is solved. The objective is formed by the volume (surface area). A stress threshold is taken as constraint. The Nelder-Mead algorithm is chosen to optimize the structure.

Figure 4.5: Optimized Cross Section

Figure 4.6: Optimization of the Cross Section of a Dam. Left upper: Finite Element Mesh, Middle
upper: Different tested shapes described by splines, physical quantities computed on the optimized
structure.

4.4 Topology Optimization
In topology, the properties of geometric structures are characterized by the number of holes or voids they contain: two objects are topologically equivalent if the number of voids is the same.
Topology optimization allows us to change from one topology class to another.

4.4.1 Examples of optimized structures

Figure 4.7: Meshed and optimized quarter wheel

Figure 4.8: Von Mises distribution in an optimized wheel

4.4.2 Evolutionary Structural Optimization (ESO):

ESO is based on the concept that by sequentially removing inefficient material from the structure, the residual shape evolves toward an optimal solution. ESO uses a sequence of Finite Element solutions where elements possessing low stress values are removed in each iteration. The amount of material to be removed is steered by the Rejection Ratio (RR_k): an element is removed if

    σ_e / σ_max < RR_k,

where σ_e is the element stress and σ_max is the maximum stress in the structure. So, we remove the elements from the regions where this condition is true. The ratio can be adapted during the iterations,

    RR_{k+1} = RR_k + ER,

where k is the iteration index and ER is the evolutionary rate.
Iterations are stopped when a target volume (e.g. 50% of the initial one) is reached.

Figure 4.9: Mesh and Design Variables

4.4.3 ESO for Stiffness Optimization:

We regard now the compliance as objective, which is retrieved in the FEA as follows:

    C(x) = ½ f^T u(x),

where f is the vector of nodal forces, u(x) is the vector of nodal displacements (depending on the design variables), x = [x_1, ..., x_n]^T is the design vector (usually each Finite Element is one design variable), and n is the number of elements, with

    x_i ∈ {0, 1}, i = 1, ..., n,

and f, u related by the equilibrium equation in Finite Element form:

    K(x) u(x) = f.

Let us assume we remove one Finite Element, i.e. x_i = 0, x_j = 1 ∀j ≠ i. Then the stiffness matrix changes by

    ∆K = K∗ − K = −K_i,

where K∗ is the stiffness matrix of the resulting structure after the element is removed and K_i is the i-th elemental stiffness matrix.
Varying both sides of the equilibrium equation gives the change in the displacements,

    ∆u = −K^{−1}(x) ∆K u(x).

From this and the definition of the compliance we have

    ∆C(x) = ½ f^T ∆u = −½ f^T K^{−1}(x) ∆K u(x)
          = −½ (K u(x))^T K^{−1}(x) ∆K u(x)
          = −½ u^T(x) ∆K u(x)
          = ½ u_i^T(x) K_i u_i(x),

where u_i is the vector of nodal displacements related to the i-th element.
To minimize the compliance under a given volume constraint, it is clear that only those elements should be removed which lead to the smallest increase in the compliance. These are the elements where ½ u_i^T K_i u_i is small.
Note, after deleting elements, the domain needs to be remeshed. As the volume decreases, fewer elements are needed, so the solutions may become faster from iteration to iteration. ESO is a purely heuristic algorithm and does not come with any convergence theory. Once an element is deleted, it cannot be added back (as it is outside of the computation).
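The first-order character of ∆C ≈ ½ u_i^T K_i u_i can be checked on a deliberately tiny toy system; the two-springs-in-parallel setup below is an assumption for the check, not from the script:

```python
# Toy check of the sensitivity formula Delta C = 0.5 u_i^T K_i u_i:
# two springs in parallel with a scalar displacement u. Removing the
# soft spring (element 2) increases the compliance by approximately
# the predicted amount.
k1, k2, f = 10.0, 0.1, 1.0
K = k1 + k2                      # "assembled" stiffness
u = f / K                        # equilibrium K u = f
dC_pred = 0.5 * u * k2 * u       # 0.5 u_i^T K_i u_i for element 2
C_before = 0.5 * f * u
C_after = 0.5 * f * (f / k1)     # spring 2 removed
dC_true = C_after - C_before     # exact compliance change
```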

4.4.4 Bi-directional Evolutionary Structural Optimization (BESO):
BESO is an improvement over the ESO method. While ESO is limited to the removal of elements, BESO can also add elements. The optimization task is given as the following compliance minimization (stiffness maximization) problem:

    min C = ½ f^T u(x),
    subject to V∗ − Σ_{i=1}^n V_i x_i = 0,
    and x_i ∈ {0, 1}, i = 1, ..., n,

where V∗ is the target volume and V_i is the volume of element i. The number of elements in the FE domain is denoted by n.
The sensitivities are as in the ESO method for uniform meshes:

    α_i^e = ∆C_i = ½ u_i^T K_i u_i.

For non-uniform meshes:

    α_i^e = (½ u_i^T K_i u_i) / V_i.

Once an element is removed, we do not have any information about its sensitivity anymore. However, with neighboring elements we can average the sensitivity for a certain region including the already deleted element. So, we average the value of α over all elements connected to one node:

    α_j^n = Σ_{i=1}^M w_i α_i^e,

where M is the total number of elements connected to the j-th node and w_i is a weight factor defined as

    w_i = (1/(M − 1)) (1 − r_ij / Σ_{i=1}^M r_ij),

where r_ij is the distance of the midpoint of the i-th element from node j. These nodal sensitivities will now be taken and averaged over different nodes:

    α_i = Σ_{j=1}^k w(r_ij) α_j^n / Σ_{j=1}^k w(r_ij),

where k is the number of nodes in a sub-domain Ω_i and w(r_ij) = r_min − r_ij is a weight factor, j = 1, ..., k.
There are thresholds based on which the elements are kept, deleted or added:
The elements are deleted (switched to 0) if α_i ≤ α_del^th.
The elements are added (switched to 1) if α_i > α_add^th.

Stability: Sensitivities are averaged between two iterations to stabilize the process:

    α_i = (α_i^k + α_i^{k−1}) / 2,

where α_i^k is the current and α_i^{k−1} the old sensitivity.

(Try out the code without this averaging line, then you will know why it is there!) In the above setting an element is either active or not. Thus, we deal with a discrete optimization problem, a so-called hard-kill approach. Generally, in optimization theory it has been found that continuous problems are more convenient to solve. So the following deals with an implementation of BESO as a continuous optimization problem.

Goal:

    x_i ∈ [0, 1],

i.e. each design variable may now take any value in the interval from 0 to 1. Such an approach is called soft kill, where the design variable becomes continuous. (Earlier we started with the discrete (hard-kill) setting where x_i ∈ {0, 1}.)
In soft-kill topology optimization algorithms, the Young’s modulus is defined as a function of the design variable x_i as follows:

    E_i = E_i(x_i) = x_i^p E_0,

where p > 0 is called the penalization factor, E_0 is the Young’s modulus of the solid material, and x_i is the design variable.

Figure 4.10: Effects of the power-law used in the soft-kill approaches.

Figure 4.11: One-sided MBB beam (Messerschmitt–Bölkow–Blohm), which is actually a 3-point bending test.
bending test.

%%%%% A SOFT-KILL BESO CODE BY X. HUANG and Y.M. Xie %%%%%
function beso(nelx,nely,volfrac,er,rmin)
% INITIALIZE
x(1:nely,1:nelx) = 1.; vol = 1.; i = 0; change = 1.; penal = 3.; maxloop = 150;
% START iTH ITERATION
while change > 0.001 && i < maxloop
  i = i + 1; vol = max(vol*(1-er),volfrac);
  if i > 1; olddc = dc; end
% FE-ANALYSIS
  [U] = FE(nelx,nely,x,penal);
% OBJECTIVE FUNCTION AND SENSITIVITY ANALYSIS
  [KE] = lk;
  c(i) = 0.;
  for ely = 1:nely
    for elx = 1:nelx
      n1 = (nely+1)*(elx-1)+ely;
      n2 = (nely+1)*elx+ely;
      Ue = U([2*n1-1;2*n1;2*n2-1;2*n2;2*n2+1;2*n2+2;2*n1+1;2*n1+2],1);
      c(i) = c(i) + 0.5*x(ely,elx)^penal*Ue'*KE*Ue;
      dc(ely,elx) = 0.5*x(ely,elx)^(penal-1)*Ue'*KE*Ue;
    end
  end
% FILTERING OF SENSITIVITIES
  [dc] = check(nelx,nely,rmin,x,dc);
% STABILIZATION OF EVOLUTIONARY PROCESS
  if i > 1; dc = (dc+olddc)/2.; end
% BESO DESIGN UPDATE
  [x] = ADDDEL(nelx,nely,vol,dc,x);
% PRINT RESULTS
  if i > 10
    change = abs(sum(c(i-9:i-5))-sum(c(i-4:i)))/sum(c(i-4:i));
  end
  V(i) = sum(sum(x))/(nelx*nely);
  disp([' It.: ' sprintf('%4i',i) ' Obj.: ' sprintf('%10.4f',c(i)) ...
        ' Vol.: ' sprintf('%6.3f',sum(sum(x))/(nelx*nely)) ...
        ' ch.: ' sprintf('%6.3f',change)])
% PLOT DENSITIES
  colormap(gray); imagesc(-x); axis equal; axis tight; axis off; drawnow;
end
%%%%%%%%%% OPTIMALITY CRITERIA UPDATE %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [x]=ADDDEL(nelx,nely,volfra,dc,x)
l1 = min(min(dc)); l2 = max(max(dc));
while ((l2-l1)/l2 > 1.0e-5)
  th = (l1+l2)/2.0;
  x = max(0.001,sign(dc-th));
  if sum(sum(x))-volfra*(nelx*nely) > 0
    l1 = th;
  else
    l2 = th;
  end
end
%%%%%%%%%% MESH-INDEPENDENCY FILTER %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [dcf]=check(nelx,nely,rmin,x,dc)
dcf = zeros(nely,nelx);
for i = 1:nelx
  for j = 1:nely
    sum = 0.0;
    for k = max(i-floor(rmin),1):min(i+floor(rmin),nelx)
      for l = max(j-floor(rmin),1):min(j+floor(rmin),nely)
        fac = rmin-sqrt((i-k)^2+(j-l)^2);
        sum = sum+max(0,fac);
        dcf(j,i) = dcf(j,i) + max(0,fac)*dc(l,k);
      end
    end
    dcf(j,i) = dcf(j,i)/sum;
  end
end
%%%%%%%%%% FE-ANALYSIS %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [U]=FE(nelx,nely,x,penal)
[KE] = lk;
K = sparse(2*(nelx+1)*(nely+1), 2*(nelx+1)*(nely+1));
F = sparse(2*(nely+1)*(nelx+1),1); U = zeros(2*(nely+1)*(nelx+1),1);
for elx = 1:nelx
  for ely = 1:nely
    n1 = (nely+1)*(elx-1)+ely;
    n2 = (nely+1)*elx+ely;
    edof = [2*n1-1; 2*n1; 2*n2-1; 2*n2; 2*n2+1; 2*n2+2; 2*n1+1; 2*n1+2];
    K(edof,edof) = K(edof,edof) + x(ely,elx)^penal*KE;
  end
end
% DEFINE LOADS AND SUPPORTS (Cantilever)
% F(2*(nelx+1)*(nely+1)-nely,1) = -1.0;
% fixeddofs = [1:2*(nely+1)];
% alldofs   = [1:2*(nely+1)*(nelx+1)];
% freedofs  = setdiff(alldofs,fixeddofs);
% % SOLVING
% U(freedofs,:) = K(freedofs,freedofs) \ F(freedofs,:);
% U(fixeddofs,:) = 0;
% DEFINE LOADS AND SUPPORTS (HALF MBB-BEAM)
F(2,1) = -1;
fixeddofs = union([1:2:2*(nely+1)],[2*(nelx+1)*(nely+1)]);
alldofs   = [1:2*(nely+1)*(nelx+1)];
freedofs  = setdiff(alldofs,fixeddofs);
% SOLVING
U(freedofs,:) = K(freedofs,freedofs) \ F(freedofs,:);
U(fixeddofs,:) = 0;
%%%%%%%%%% ELEMENT STIFFNESS MATRIX %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [KE]=lk
E = 1.;
nu = 0.3;
k = [ 1/2-nu/6   1/8+nu/8   -1/4-nu/12  -1/8+3*nu/8 ...
     -1/4+nu/12 -1/8-nu/8    nu/6        1/8-3*nu/8];
KE = E/(1-nu^2)*[ k(1) k(2) k(3) k(4) k(5) k(6) k(7) k(8)
                  k(2) k(1) k(8) k(7) k(6) k(5) k(4) k(3)
                  k(3) k(8) k(1) k(6) k(7) k(4) k(5) k(2)
                  k(4) k(7) k(6) k(1) k(8) k(3) k(2) k(5)
                  k(5) k(6) k(7) k(8) k(1) k(2) k(3) k(4)
                  k(6) k(5) k(4) k(3) k(2) k(1) k(8) k(7)
                  k(7) k(4) k(5) k(2) k(3) k(8) k(1) k(6)
                  k(8) k(3) k(2) k(5) k(4) k(7) k(6) k(1)];
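The design update in ADDDEL above is a bisection on a sensitivity threshold: elements whose filtered sensitivity lies above the threshold are kept solid (x = 1), the rest are set to the void density 0.001, and the threshold is adjusted until the target volume fraction is met. The same logic can be sketched in Python with NumPy (an illustrative translation; the function name `adddel` is ours, not the original routine):

```python
import numpy as np

def adddel(dc, volfrac):
    """Bisect a sensitivity threshold th so that keeping all elements with
    dc > th (x = 1) and voiding the rest (x = 0.001) meets the target
    volume fraction. Mirrors the ADDDEL routine of the BESO code above."""
    l1, l2 = dc.min(), dc.max()
    while (l2 - l1) / abs(l2) > 1e-5:  # relative bisection tolerance
        th = 0.5 * (l1 + l2)
        # sign(dc - th) is -1/0/1; clipping from below yields 0.001 or 1
        x = np.maximum(0.001, np.sign(dc - th))
        if x.sum() - volfrac * dc.size > 0:
            l1 = th   # too much material kept: raise the threshold
        else:
            l2 = th   # too little material kept: lower the threshold
    return x
```

As in the MATLAB original, the loop converges geometrically because the interval [l1, l2] halves in every iteration.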

Results obtained with BESO for two different volume fractions are shown in Figures 4.13a and 4.13b.

Figure 4.13: (a) Classical half MBB topology optimization problem solved with BESO and a volume
fraction of 0.6; the solution is obtained by calling beso(80,60,0.6,2,3). (b) The same problem with
a volume fraction of 0.3; the solution is obtained by calling beso(80,60,0.3,0.1,2).

4.4.5 Solid Isotropic Material with Penalization (SIMP):
SIMP uses the power law of the soft-kill approach to solve the topology optimization problem:

min C(x) = f^T u(x)

such that V(x) = x^T V - V* <= 0,

x ∈ X = {x ∈ R^n : 0 ≤ xi ≤ 1},

K(x) u(x) = f.

Here V* is the target volume and V is the vector of the volumes of all finite elements.
As we are dealing with a constrained optimization problem (K(x)u(x) = f is an equality constraint),
the Lagrangian function is used in order to solve it:

L(x) = f^T u(x) + λ V(x).

To find an optimal solution, ∂L(x)/∂x must vanish and an optimum value of λ needs to be found.
We start by looking at the derivative of the compliance:

∂c(x)/∂x_l = Σ_{i=1}^{N} (∂c(x)/∂x_i) (∂x_i/∂x_l)

Now,

∂c(x)/∂x_i = f^T ∂u(x)/∂x_i = u(x)^T K(x) ∂u(x)/∂x_i.
Further, to find ∂u(x)/∂x_i, we regard the derivative of K(x)u(x) = f:

∂K(x)/∂x_i u(x) + K(x) ∂u(x)/∂x_i = 0,

which gives

∂u(x)/∂x_i = -K(x)^{-1} ∂K(x)/∂x_i u(x).
Now we have the stiffness matrix K(x) = Σ_{i=1}^{n} x_i^p K_i, so

∂K(x)/∂x_i = p x_i^{p-1} K_i.

Finally, putting everything together we get:

∂C(x)/∂x_i = -p x_i^{p-1} u_i(x)^T K_i u_i(x).
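A derived sensitivity of this kind should always be verified numerically. The following Python sketch is our own toy example, not part of the lecture material: it assembles a two-element spring model with K(x) = Σ x_i^p K_i and compares the analytic sensitivity -p x_i^{p-1} u_i^T K_i u_i against a central finite difference of the compliance:

```python
import numpy as np

p = 3.0
k = np.array([[1.0, -1.0], [-1.0, 1.0]])  # unit element stiffness matrix
x = np.array([0.8, 0.5])                   # element densities
edof = [(0, 1), (1, 2)]                    # element -> global dofs (3 nodes)

def solve(x):
    """Assemble K(x) = sum x_e^p * K_e, fix dof 0, load dof 2, solve Ku=f."""
    K = np.zeros((3, 3))
    for e, (i, j) in enumerate(edof):
        K[np.ix_([i, j], [i, j])] += x[e]**p * k
    f = np.zeros(3); f[2] = 1.0
    u = np.zeros(3)
    u[1:] = np.linalg.solve(K[1:, 1:], f[1:])  # dof 0 is fixed
    return u, f

def compliance(x):
    u, f = solve(x)
    return f @ u

u, _ = solve(x)
for e, (i, j) in enumerate(edof):
    ue = u[[i, j]]
    # analytic sensitivity from the derivation above
    dc_analytic = -p * x[e]**(p - 1) * ue @ k @ ue
    # central finite difference of the compliance
    h = 1e-6
    xp, xm = x.copy(), x.copy()
    xp[e] += h; xm[e] -= h
    dc_fd = (compliance(xp) - compliance(xm)) / (2 * h)
    assert abs(dc_analytic - dc_fd) < 1e-4 * abs(dc_analytic)
```

For this series model the compliance is simply 1/x1^p + 1/x2^p, so both expressions can also be checked by hand against -p x_i^{-p-1}.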
Below is the 99 line topology optimization code by Ole Sigmund (modified for increased speed,
September 2002):

function top(nelx,nely,volfrac,penal,rmin)
% INITIALIZE
x(1:nely,1:nelx) = volfrac;
loop = 0;
change = 1.;
% START ITERATION
while change > 0.01
  loop = loop + 1;
  xold = x;
% FE-ANALYSIS
  [U] = FE(nelx,nely,x,penal);
% OBJECTIVE FUNCTION AND SENSITIVITY ANALYSIS
  [KE] = lk;
  c = 0.;
  for ely = 1:nely
    for elx = 1:nelx
      n1 = (nely+1)*(elx-1)+ely;
      n2 = (nely+1)*elx+ely;
      Ue = U([2*n1-1;2*n1;2*n2-1;2*n2;2*n2+1;2*n2+2;2*n1+1;2*n1+2],1);
      c = c + x(ely,elx)^penal*Ue'*KE*Ue;
      dc(ely,elx) = -penal*x(ely,elx)^(penal-1)*Ue'*KE*Ue;
    end
  end
% FILTERING OF SENSITIVITIES
  [dc] = check(nelx,nely,rmin,x,dc);
% DESIGN UPDATE BY THE OPTIMALITY CRITERIA METHOD
  [x] = OC(nelx,nely,x,volfrac,dc);
% PRINT RESULTS
  change = max(max(abs(x-xold)));
  disp([' It.: ' sprintf('%4i',loop) ' Obj.: ' sprintf('%10.4f',c) ...
        ' Vol.: ' sprintf('%6.3f',sum(sum(x))/(nelx*nely)) ...
        ' ch.: ' sprintf('%6.3f',change)])
% PLOT DENSITIES
  colormap(gray); imagesc(-x); axis equal; axis tight; axis off; pause(1e-6);
end
%%%%%%%%%% OPTIMALITY CRITERIA UPDATE %%%%%%%%%%%%%%%%%%%%%%%%%
function [xnew]=OC(nelx,nely,x,volfrac,dc)
l1 = 0; l2 = 100000; move = 0.2;
while (l2-l1 > 1e-4)
  lmid = 0.5*(l2+l1);
  xnew = max(0.001,max(x-move,min(1.,min(x+move,x.*sqrt(-dc./lmid)))));
  if sum(sum(xnew)) - volfrac*nelx*nely > 0
    l1 = lmid;
  else
    l2 = lmid;
  end
end
%%%%%%%%%% MESH-INDEPENDENCY FILTER %%%%%%%%%%%%%%%%%%%%%%%%%%%
function [dcn]=check(nelx,nely,rmin,x,dc)
dcn = zeros(nely,nelx);
for i = 1:nelx
  for j = 1:nely
    sum = 0.0;
    for k = max(i-floor(rmin),1):min(i+floor(rmin),nelx)
      for l = max(j-floor(rmin),1):min(j+floor(rmin),nely)
        fac = rmin-sqrt((i-k)^2+(j-l)^2);
        sum = sum+max(0,fac);
        dcn(j,i) = dcn(j,i) + max(0,fac)*x(l,k)*dc(l,k);
      end
    end
    dcn(j,i) = dcn(j,i)/(x(j,i)*sum);
  end
end
%%%%%%%%%% FE-ANALYSIS %%%%%%%%%%%%%%%%%%%%%%%%%%%
function [U]=FE(nelx,nely,x,penal)
[KE] = lk;
K = sparse(2*(nelx+1)*(nely+1), 2*(nelx+1)*(nely+1));
F = sparse(2*(nely+1)*(nelx+1),1); U = zeros(2*(nely+1)*(nelx+1),1);
for elx = 1:nelx
  for ely = 1:nely
    n1 = (nely+1)*(elx-1)+ely;
    n2 = (nely+1)*elx+ely;
    edof = [2*n1-1; 2*n1; 2*n2-1; 2*n2; 2*n2+1; 2*n2+2; 2*n1+1; 2*n1+2];
    K(edof,edof) = K(edof,edof) + x(ely,elx)^penal*KE;
  end
end
% DEFINE LOADS AND SUPPORTS (HALF MBB-BEAM)
F(2,1) = -1;
fixeddofs = union([1:2:2*(nely+1)],[2*(nelx+1)*(nely+1)]);
alldofs   = [1:2*(nely+1)*(nelx+1)];
freedofs  = setdiff(alldofs,fixeddofs);
% SOLVING
U(freedofs,:) = K(freedofs,freedofs) \ F(freedofs,:);
U(fixeddofs,:) = 0;
%%%%%%%%%% ELEMENT STIFFNESS MATRIX %%%%%%%%%%%%%%%%%%%%%%%%%
function [KE]=lk
E = 1.;
nu = 0.3;
k = [ 1/2-nu/6   1/8+nu/8   -1/4-nu/12  -1/8+3*nu/8 ...
     -1/4+nu/12 -1/8-nu/8    nu/6        1/8-3*nu/8];
KE = E/(1-nu^2)*[ k(1) k(2) k(3) k(4) k(5) k(6) k(7) k(8)
                  k(2) k(1) k(8) k(7) k(6) k(5) k(4) k(3)
                  k(3) k(8) k(1) k(6) k(7) k(4) k(5) k(2)
                  k(4) k(7) k(6) k(1) k(8) k(3) k(2) k(5)
                  k(5) k(6) k(7) k(8) k(1) k(2) k(3) k(4)
                  k(6) k(5) k(4) k(3) k(2) k(1) k(8) k(7)
                  k(7) k(4) k(5) k(2) k(3) k(8) k(1) k(6)
                  k(8) k(3) k(2) k(5) k(4) k(7) k(6) k(1)];
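The mesh-independency filter in check replaces each element sensitivity by a distance-weighted average over all elements within the radius rmin, which suppresses checkerboard patterns and makes the result less mesh dependent. A Python transcription of the same weighting (the function name `filter_sensitivities` is ours):

```python
import numpy as np

def filter_sensitivities(dc, x, rmin):
    """Distance-weighted averaging of element sensitivities within radius
    rmin, following the check() routine above: weights fac = rmin - dist,
    clipped at zero; the densities x enter the weighting as in the code."""
    nely, nelx = dc.shape
    dcn = np.zeros_like(dc)
    r = int(np.floor(rmin))
    for i in range(nelx):
        for j in range(nely):
            wsum = 0.0
            # inclusive neighbourhood, clipped at the grid boundary
            for kk in range(max(i - r, 0), min(i + r + 1, nelx)):
                for ll in range(max(j - r, 0), min(j + r + 1, nely)):
                    w = max(0.0, rmin - np.hypot(i - kk, j - ll))
                    wsum += w
                    dcn[j, i] += w * x[ll, kk] * dc[ll, kk]
            dcn[j, i] /= x[j, i] * wsum
    return dcn
```

A quick sanity check: for a uniform sensitivity field and uniform densities the filter must return the field unchanged, since it is a weighted average.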

(a) Classical half MBB topology optimization problem solved with SIMP and a volume fraction of 0.6;
the solution is obtained by calling top(80,60,0.6,3,2). (b) The same problem with a volume fraction
of 0.3; the solution is obtained by calling top(80,60,0.3,3,2).

For further details, variants and extensions, please see https://www.topopt.mek.dtu.dk and
[Sig01].
There is a range of packages and codes which provide topology optimization, e.g. Ansys Work-
bench and Matlab (codes top.m, beso.m, top88.m, top3D.m, ...). Architects and designers use Rhino
and Grasshopper.
Often, the optimized structures are remodelled in order to smooth the boundaries and re-
establish symmetry. Below are some works of previous students:

Topology Optimization Projects, SoSe 2019
Prof. Tom Lahmer and students
(Natural Hazards and Risks in Structural Engineering, Digital Engineering,
Bauingenieurwesen, Baustoffingenieurwesen)

Meisam Ansari (NHRE): topology optimization of a bicycle frame. Steps: 1. object and loads;
2. modelling (ANSYS); 3. topology optimization, BESO (50 % volume fraction); 4. re-modelling (Rhino).

Figure 4.14: Students' Works on Topology Optimization

Hanan Hadidi & Mohamad Nour Alkhalaf: topology optimization of a steel base plate. Steps:
object with constraints (ANSYS); topology optimization (50 % mass); verification (ANSYS);
re-modelling (SpaceClaim).

Figure 4.15: Students' Works on Topology Optimization

Sreekanth Buddhiraju (NHRE), Michael Glas (KIM)

Topology optimization of a foot bridge

Figure 4.16: Students’ Works on Topology Optimization

Sreekanth Buddhiraju (NHRE), Michael Glas (KIM)

Topology Optimization of a bridge (Truss)

Figure 4.17: Students’ Works on Topology Optimization

Mohamed Said Helmy Alabassy, Marat Khairtdinov (DEM)

Topology optimization of a drone chassis

Figure 4.18: Students’ Works on Topology Optimization

Alexander Benz (BWM), Andreas Kirchner (BIM): topology optimization. Steps: object
visualization; modelling (ANSYS); topology optimization (15 % volume fraction); re-modelling
(SpaceClaim); 3D printing (144 layers PETG).

Figure 4.19: Students' Works on Topology Optimization

Jason Lai Poh Hwa, Seyed Kasra Valadi Somehsaraei (DEM): topology optimization of a chair.
Steps: object; modelling (ANSYS); stresses; topology optimization; re-modelling (SpaceClaim);
validation (stresses).

Figure 4.20: Students' Works on Topology Optimization

Figure 4.21: Topology Optimization, Rendering and 3D Printing of a Chair (Andreas Lenz, 2019)

Bibliography

[Kel99] C.T. Kelley. Iterative Methods for Optimization. SIAM, 1999.


[Sig01] O. Sigmund. “A 99 line topology optimization code written in Matlab”. In: Structural
and Multidisciplinary Optimization 21.2 (Apr. 2001), pp. 120–127.
[FSF02] Michael Felsberg, Hanno Scharr, and Per-Erik Forssen. The B-Spline Channel Repre-
sentation: Channel Algebra and Channel Based Diffusion Filtering. Tech. rep. 2461.
Linköping University, Computer Vision, 2002, p. 26.
[Ast06] A. Astolfi. Optimization, An introduction. 2006.
[Ger16] Matthias Gerdts. Einführung in die lineare und nichtlineare Optimierung. 2016.

