Metamodel-Based Design Optimization

Ann-Britt Ryberg

LIU-TEK-LIC 2013:1

January 2013
Cover:
Illustration of metamodel-based multidisciplinary design optimization with four different load cases. More information about this specific example is found in Section 6.3.
Printed by:
LiU-Tryck, Linköping, Sweden, 2013
ISBN 978-91-7519-721-0
ISSN 0280-7971
Distributed by:
Linköping University
Department of Management and Engineering
SE-581 83 Linköping, Sweden
Copyright © 2013 Ann-Britt Ryberg
No part of this publication may be reproduced, stored in a retrieval system, or be trans-
mitted, in any form or by any means, electronic, mechanical, photocopying, recording, or
otherwise, without prior permission of the author.
Preface
The work presented in this thesis has been carried out at Saab Automobile AB and
Combitech AB in collaboration with the Division of Solid Mechanics, Linköping
University. It has been partly sponsored by the Swedish Governmental Agency for Innovation Systems (VINNOVA/FFI) in the project “Robust and multidisciplinary optimization of automotive structures”, and it has also been a part of the SFI/ProViking project ProOpt.
I would like to thank my supervisor Professor Larsgunnar Nilsson for his encouraging
guidance throughout the course of this work. A very special appreciation also goes to my
PhD student colleague Rebecka Domeij Bäckryd for our close collaboration and very
fruitful discussions.
Additionally, special thanks to my manager Tomas Sjödin for being one of the initiators of
the project and always supporting me. Likewise, I am very thankful to Gunnar Olsson for
helping me to continue my research after the bankruptcy of Saab Automobile AB.
I am also grateful to all my colleagues, friends and family for their support and interest in
my work. Finally, I would like to especially thank my beloved fiancé Henrik and dedicate
this work to our coming miracle!
Abstract
Automotive companies are exposed to tough competition and therefore strive to design better products in a cheaper and faster manner. This challenge requires continuous improvements of methods and tools, and simulation models are therefore used to evaluate every possible aspect of the product. Optimization has become increasingly popular, but its full potential is not yet utilized. The increased demand for accurate simulation results has led to detailed simulation models that often are computationally expensive to evaluate. Metamodel-based design optimization (MBDO) is an attractive approach to relieve the computational burden during optimization studies. Metamodels are approximations of the detailed simulation models that take little time to evaluate, and they are therefore especially attractive when many evaluations are needed, e.g. in multidisciplinary design optimization (MDO).
In this thesis, state-of-the-art methods for metamodel-based design optimization are covered and different multidisciplinary design optimization methods are presented. An efficient MDO process for large-scale automotive structural applications is developed, where aspects related to its implementation are considered. The process is described and demonstrated in a simple application example. It is found that the process is efficient, flexible, and suitable for common structural MDO applications within the automotive industry. Furthermore, it fits easily into an existing organization and product development process, and improved designs can be obtained even when using metamodels with limited accuracy. It is therefore concluded that by incorporating the described metamodel-based MDO process into the product development, there is a potential for designing better products in a shorter time.
List of Papers
Own contribution
The work resulting in the two appended papers has been a joint effort by Rebecka Domeij Bäckryd and me. My contribution to the first paper includes being an active partner during the writing process. As for the second paper, I have had the main responsibility for writing the paper and conducting the application example.
Contents

Preface
Abstract
Contents
2 Optimization
2.1 Structural Optimization
2.2 Metamodel-Based Design Optimization
2.3 Multi-Objective Optimization
2.4 Probabilistic-Based Design Optimization
2.5 Multidisciplinary Design Optimization
4.3.3 Artificial Neural Networks
4.3.4 Multivariate Adaptive Regression Splines
4.3.5 Support Vector Regression
4.4 Metamodel Validation
4.4.1 Error Measures
4.4.2 Cross Validation
4.4.3 Generalized Cross Validation and Akaike's Final Prediction Error
4.5 Optimization Algorithms
4.5.1 Evolutionary Algorithms
4.5.2 Particle Swarm Optimization
4.5.3 Simulated Annealing
7 Discussion
Bibliography
Part I
organization and fit into the product development process, which places restrictions on the
choice of method. Furthermore, MDO includes evaluations of several different detailed
simulation models for a large number of variable settings, which requires considerable
computer resources.
The VINNOVA/FFI project “Robust and multidisciplinary optimization of automotive structures” (Swedish: ”Robust optimering och multidisciplinär optimering av fordonsstrukturer”) was established to find suitable methods for implementing robust and multidisciplinary design optimization in automotive development. The multidisciplinary aspect is the focus of this thesis, and the goal has been to develop an efficient MDO process for large-scale structural applications. The methodology takes the special characteristics of automotive structural applications into account, as well as considers aspects related to implementation within an existing organization and product development process.
The presented work is also a part of the SFI/ProViking project ProOpt, which aims at developing methods for optimization-driven design. The objective to find an MDO process suitable for automotive structural applications fits perfectly also within the scope of that project.
The chapters following this introduction introduce important optimization concepts and give a short description of automotive product development. Since the use of metamodels is an essential part of automotive structural optimization, the main part of this thesis is devoted to metamodel-based design optimization. After a general description of multidisciplinary design optimization methods, an MDO process suitable for large-scale automotive structural applications is presented and demonstrated in a simple example. The thesis ends with a discussion regarding the presented MDO process, conclusions, and an outlook on further needs.
2 Optimization
Optimization is a procedure for achieving the best possible solution to a specific problem
while satisfying certain restrictions. A general optimization problem can be formulated as
$$
\begin{aligned}
\min_{\mathbf{x}} \quad & f(\mathbf{x}) \\
\text{subject to} \quad & \mathbf{g}(\mathbf{x}) \leq \mathbf{0} \\
& \mathbf{h}(\mathbf{x}) = \mathbf{0} \\
& \mathbf{x}^{\mathrm{lower}} \leq \mathbf{x} \leq \mathbf{x}^{\mathrm{upper}}
\end{aligned}
\qquad (2.1)
$$
The goal is to find the design variables $\mathbf{x}$ that minimize the objective function $f(\mathbf{x})$. In general, the problem is constrained, i.e. there are a number of inequality and equality constraints, represented by the vectors $\mathbf{g}(\mathbf{x})$ and $\mathbf{h}(\mathbf{x})$, that should be fulfilled. A problem that lacks constraints is said to be unconstrained. The design variables are allowed to vary between an upper and a lower limit, called $\mathbf{x}^{\mathrm{upper}}$ and $\mathbf{x}^{\mathrm{lower}}$ respectively, which define the design space. The design variables can be continuous or discrete, meaning that they can take any value, or only certain discrete values, between the upper and lower limits. Design points that fulfil all the constraints are feasible, while all other design points are unfeasible.
The general formulation in Equation (2.1) can be reorganized into the simpler form
$$
\min_{\mathbf{x}} \quad f(\mathbf{x}) \qquad \text{subject to} \quad \mathbf{g}(\mathbf{x}) \leq \mathbf{0}
\qquad (2.2)
$$
In this formulation, the inequality constraints $\mathbf{g}(\mathbf{x})$ contain all three types of constraints from the former formulation. Each equality constraint is replaced by two inequality constraints and included, together with the upper and lower limits of the design variables, in the constraint vector $\mathbf{g}(\mathbf{x})$. Both formulations can be used for maximization problems if the objective function $f(\mathbf{x})$ is multiplied by −1.
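As a concrete illustration, the reformulation into the simpler form of Equation (2.2) can be written out programmatically. The following Python sketch collects the inequality constraints, the equality constraints (each split into two inequalities), and the variable bounds into a single constraint vector; the two-variable problem at the end is made up for illustration and does not come from the thesis.

```python
import numpy as np

def to_inequality_form(g, h, x_lower, x_upper):
    """Collect inequality, equality, and bound constraints into one
    vector-valued constraint G(x) <= 0, as in Equation (2.2)."""
    def G(x):
        x = np.asarray(x, dtype=float)
        return np.concatenate([
            g(x),         # original inequality constraints g(x) <= 0
            h(x),         # each equality h(x) = 0 becomes two
            -h(x),        # inequalities: h(x) <= 0 and -h(x) <= 0
            x_lower - x,  # lower bounds as inequalities
            x - x_upper,  # upper bounds as inequalities
        ])
    return G

# Hypothetical example: one inequality and one equality constraint
f = lambda x: (x[0] - 1.0) ** 2 + x[1] ** 2   # objective (minimized)
g = lambda x: np.array([x[0] + x[1] - 2.0])   # x1 + x2 <= 2
h = lambda x: np.array([x[0] - x[1]])         # x1 = x2
G = to_inequality_form(g, h, np.array([-5.0, -5.0]), np.array([5.0, 5.0]))
print(G(np.array([0.5, 0.5])))  # all entries <= 0, so the point is feasible
```

A maximization problem would simply use the negated objective, lambda x: -f(x), in the same setup.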
The solution to an optimization problem is called the optimum solution and this solution is
usually found using some numerical technique. An iterative search process that uses
information from previous iterations is then applied. When the objective and constraint
functions are evaluated during the solution process, one or several analyzers are used. For
a vector of design variables x, an analyzer returns a number of responses denoted by y.
These responses can be combined into the objective and constraint functions for that
specific vector of design variables.
problem. It has one objective function that should be minimized. Many variants of this
problem can be found. When solving multi-objective optimization (MOO) problems, two
or more objective functions should be minimized simultaneously. The simplest approach is to convert the problem into a single-objective problem. This can be done by minimizing one of the objective functions, usually the most important one, and treating all the others as constraints. Another way is to create a single objective function as a combination of the
original objectives. Weight coefficients can then be used to mirror the relative importance
of the original objective functions. The drawback of the aforementioned methods is that
only one single optimum is found. If the designer wants to modify the relative importance
of the objective functions in retrospect, the optimization process must be performed again.
An alternative approach is to find a number of Pareto optimal solutions. A solution is said to be Pareto optimal if there exists no other feasible solution that yields a lower value of one objective without increasing the value of at least one other objective. The designer will then have a set of solutions to choose among, and the trade-off between the different objective functions can be performed after the optimization process has been carried out.
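To make the Pareto optimality definition concrete, the following Python sketch (a naive pairwise dominance filter, not a method described in the thesis) extracts the non-dominated points from a set of objective vectors, assuming that all objectives are to be minimized.

```python
import numpy as np

def pareto_front(F):
    """Boolean mask of the non-dominated rows of F, where F is an
    (n_points x n_objectives) array and all objectives are minimized."""
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # point j dominates point i if it is no worse in every objective
        # and strictly better in at least one
        dominated = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if np.any(dominated):
            mask[i] = False
    return mask

F = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
print(pareto_front(F))  # [ True  True False  True]: (3, 3) is dominated by (2, 2)
```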
3 Automotive Product Development
The development of a new car is a complicated task that requires many experts with
different skills and responsibilities to cooperate in an organized manner. The product
development process (PDP) describes what should be done at different stages of the
development. It starts with initial concepts, which are gradually refined with the final aim
of fulfilling all predefined targets. Many groups within the company organization are
involved during the development. Some are responsible for designing a part of the
product, e.g. the body, the interior, or the chassis system, while others are responsible for a
performance aspect, e.g. crashworthiness, aerodynamics, or noise, vibration, and harsh-
ness (NVH). The groups work in parallel, and at certain times, the complete design is
synchronized and evaluated. If the design is found to be satisfactory, the development is
allowed to progress to the next phase.
Numerical simulations using finite element methods have been well integrated into the
PDP for more than two decades, and more or less drive the development of today
(Duddeck, 2008). Simulations can roughly be divided into two main categories, in the same way as reflected by the groups within the organization of a company. The first category supports certain design areas, and the other evaluates disciplinary performance aspects that depend on more than one design area. The former consequently evaluates many different aspects, e.g. stiffness, strength, and durability, for a certain area of the vehicle, while the latter focuses on one performance area, which often depends on the complete vehicle. One result of the increased focus on simulations is that the number of prototypes needed to test and improve different concepts has been reduced, although the number of qualities to be considered during development has increased considerably. Hence, the extended use of simulations has resulted in both shortened development times and reduced development costs. However, the increased demand for accuracy of the simulation models often results in detailed models that are time-consuming to evaluate. For example, it is not unusual that a crash model consists of several million elements and takes many hours to run on a high performance computing cluster.
To improve designs in a systematic way, different optimization methods have gained in popularity. Optimization can be used within different stages of the PDP: in the early phases to find promising concepts and in the later phases to fine-tune the design. Even though optimization has been shown to result in better designs, the knowledge and additional resources
needed have delayed the use of its full potential. Optimization studies are often performed
as an occasional effort when considered appropriate, and the time and scope are normally
not defined in the PDP. This is certainly the case for MDO and it is therefore important to
find methods that can fit into a modern PDP without jeopardizing its strict time limits.
Metamodel-based design optimization is an approach that makes it possible to also include computationally expensive simulation models in optimization studies.
4 Metamodel-Based Design Optimization
A metamodel is an approximation of a detailed simulation model, i.e. a model of a model.
It is called metamodel-based design optimization (MBDO) when metamodels are used for the evaluations during the optimization process. There are several descriptions of MBDO, see for example Simpson et al. (2001), Queipo et al. (2005), Wang and Shan (2007), Forrester and Keane (2009), Stander et al. (2010), and Ryberg et al. (2012).
A metamodel is a mathematical description created from a dataset of inputs and the corresponding outputs from a detailed simulation model, see Figure 4.1. The mathematical description, i.e. the metamodel type, suitable for the approximation can vary depending on the intended use or the underlying physics that the model should capture. Different datasets are appropriate for building different metamodels. The process of deciding where to place the design points in the design space, i.e. the input settings for the dataset, is called design of experiments (DOE). Traditionally, the metamodels have been simple polynomials, but other metamodels that are better at capturing complex responses are increasing in popularity.
The number of simulations needed to build a metamodel depends largely on the number of variables. Variable screening is therefore often used to identify the important variables in order to reduce the size of the problem and decrease the required number of detailed simulations. Since metamodels are approximations, it is important to know the accuracy of the models, i.e. how well the metamodels represent the detailed simulation model. This can be done by studying various error measures, which are obtained using different approaches.
Figure 4.1 The concept of building a metamodel of a response depending on two design
variables: a) design of experiments, b) function evaluations, and c) metamodel.
Classical DOEs tend to spread the sample points around the border and only put a few
points in the interior of the design space. They are primarily used for screening purposes
and to build polynomial metamodels. When the dataset is used to fit more advanced
metamodels, other experimental designs are preferred. There seems to be a consensus among scientists that a proper experimental design for fitting global metamodels depending on many variables over a large design space should be space-filling. These types of DOEs aim at spreading the design points within the complete design space, which is desired when the form of the metamodel is unknown and when interesting phenomena can be found in any region of the design space. In addition to the different space-filling designs, different criteria-based designs can be constructed if certain information about the metamodel to be fitted is available a priori, which is not always the case. In an entropy design the purpose is to maximize the expected information gained from an experiment, while the mean squared error design minimizes the expected mean squared error (Koehler and Owen, 1996).
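A basic Latin hypercube sampling scheme of the type illustrated in Figure 4.2 can be sketched in a few lines of Python. This simplified variant assumes variables that are uniformly distributed on the unit cube; each variable range is divided into n equally probable intervals, each interval is sampled exactly once, and the strata are shuffled independently per variable.

```python
import numpy as np

def latin_hypercube(n, k, seed=None):
    """n design points in k variables on the unit cube, one point per
    stratum and variable (basic Latin hypercube sampling)."""
    rng = np.random.default_rng(seed)
    # one random point inside each of the n strata, per variable
    u = (rng.random((n, k)) + np.arange(n)[:, None]) / n
    for j in range(k):
        rng.shuffle(u[:, j])  # decouple the strata between variables
    return u

X = latin_hypercube(5, 2, seed=0)  # five levels, two variables
print(X)
```

Non-uniform variables, like the normally distributed one in Figure 4.2, can be obtained by mapping these uniform samples through the inverse cumulative distribution function.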
and not more than one design point placed in each subspace. A comparison between the
LHS and these improved designs is found in Figure 4.3.
Figure 4.2 Latin hypercube sampling for two variables at five levels, one normally distributed variable and the other uniformly distributed.
Figure 4.3 Comparison between different space-filling DOEs with two variables and four
design points: a) Median Latin hypercube sampling, b) Randomized orthogonal array, and
c) Orthogonal array-based Latin hypercube sampling.
Figure 4.4 Comparison of maximin and minimax designs with seven points in two variables: a) maximin, where the design space is filled with spheres of maximum radius, and b) minimax, where the design space is covered by spheres of minimum radius.
4.2 Screening
The number of simulations needed to build a metamodel depends on the number of design
variables. Eliminating the variables that do not influence the results can therefore substantially reduce the computational cost. The process of studying the importance of different
variables, identifying the ones to be included, and eliminating the ones that do not influence the responses is called variable screening.
Several screening methods exist, see e.g. Viana et al. (2010). One of the simplest
screening techniques uses one-factor-at-a-time plans, which evaluate the effect of
changing one variable at a time. It is a very inexpensive approach but it does not estimate
the interaction effects between variables. Therefore, variants of this method that account for interactions have been proposed. One example is Morris' method (Morris, 1991) which, at the cost of additional runs, tries to determine whether the variables have effects that are (a) negligible, (b) linear and additive, or (c) non-linear or involved in interactions with other variables.
Another category of screening techniques is variance-based. One simple and commonly used approach is based on analysis of variance (ANOVA) as described by Myers et al. (2008).
(2008). The idea is to fit a metamodel using regression analysis, e.g. a simple polynomial
metamodel, and study the coefficients for each term in the model. The importance of a
variable can then be judged both by the magnitude of the related estimated regression
coefficients and by the level of confidence that the regression coefficient is non-zero. This
technique is used to separately identify the main and interaction effects that account for
most of the variance in the response.
An alternative variance-based method is Sobol's global sensitivity analysis (GSA), which
provides the total effect (main and interaction effects) of each variable (Sobol', 2001). The
method can be used for arbitrary complex metamodels and includes the calculation of
sensitivity indices. These indices can be used to rank the importance of the design
variables for a response and thus identify insignificant design variables. It is also possible to quantify the amount of the variance that is caused by a single variable.
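As a sketch of how such indices can be obtained in practice, the Python code below estimates first-order Sobol' indices with a Saltelli-type Monte Carlo scheme. It assumes independent variables scaled to [0, 1] and a model that is cheap to evaluate many times, e.g. a metamodel; the additive test function is made up for illustration.

```python
import numpy as np

def sobol_first_order(model, k, n=10_000, seed=0):
    """Monte Carlo estimate of the first-order Sobol' indices S_i for
    y = model(X) with independent inputs on [0, 1]^k."""
    rng = np.random.default_rng(seed)
    A = rng.random((n, k))
    B = rng.random((n, k))
    yA, yB = model(A), model(B)
    total_var = np.var(np.concatenate([yA, yB]))
    S = np.empty(k)
    for i in range(k):
        AB = A.copy()
        AB[:, i] = B[:, i]  # resample only variable i
        S[i] = np.mean(yB * (model(AB) - yA)) / total_var
    return S

# x1 should dominate and x3 should be identified as insignificant
model = lambda X: 4.0 * X[:, 0] + 1.0 * X[:, 1] + 0.01 * X[:, 2]
print(sobol_first_order(model, k=3))
```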
4.3 Metamodels
When running a detailed simulation model, a vector of input (design variable values) results in a vector of output (response values). Each element in the response vector represents a specific response. For each of these responses, a metamodel can be built to approximate the true response. The metamodel is built from a dataset of input design points $\mathbf{x}_i = (x_1, x_2, \ldots, x_k)^T$ and the corresponding output responses $y_i = f(\mathbf{x}_i)$, where $k$ is the number of design variables, $i = 1, \ldots, n$, and $n$ is the number of designs used to fit the model. For an arbitrary design point $\mathbf{x}$, the predicted response $\hat{y}$ will differ from the true response $y$ of the detailed model, i.e.

$$ y = f(\mathbf{x}) = \hat{y} + \varepsilon = s(\mathbf{x}) + \varepsilon \qquad (4.1) $$

Here, $f(\mathbf{x})$ represents the detailed model, $s(\mathbf{x})$ is the mathematical function defining the
Traditionally, polynomial metamodels have often been used. These models are developed using regression, i.e. fitting a regression model $y = s(\mathbf{x}, \boldsymbol{\beta}) + \varepsilon$ to a dataset of $n$ variable settings $\mathbf{x}_i$ and corresponding responses $y_i$. The method of least squares chooses the regression coefficients $\boldsymbol{\beta}$ so that the quadratic error is minimized. The least squares estimators of the regression coefficients are denoted $\mathbf{b}$ and can be found using matrix algebra (Myers et al., 2008) as

$$ \mathbf{b} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y} \qquad (4.2) $$

where $\mathbf{y}$ is the vector of $n$ responses used to fit the model depending on $k$ variables and $\mathbf{X}$ is the model matrix
$$
\mathbf{X} = \begin{bmatrix}
1 & x_{11} & \cdots & x_{k1} & x_{11}x_{21} & \cdots \\
\vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
1 & x_{1n} & \cdots & x_{kn} & x_{1n}x_{2n} & \cdots
\end{bmatrix},
\qquad
\mathbf{y} = \begin{bmatrix} f(\mathbf{x}_1) \\ \vdots \\ f(\mathbf{x}_n) \end{bmatrix}
\qquad (4.3)
$$
In this matrix, each row corresponds to one fitting point and each column is related to one regression coefficient, i.e. the number of columns depends on the polynomial order and on how many interactions are considered. The resulting polynomial metamodel becomes

$$ \hat{y}(\mathbf{x}) = \mathbf{f}(\mathbf{x})^T \mathbf{b} \qquad (4.4) $$

This metamodel will in general not interpolate the fitting data. One exception is when the fitting set is so small that there is just enough data to determine all the regression coefficients. However, such small fitting sets are generally not recommended. Low-order polynomial metamodels will capture the global trends of the detailed simulation model, but will in many cases not be a good representation of the complete design space. These metamodels are therefore mainly used for screening purposes and in iterative optimization procedures.
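A minimal sketch of this least squares procedure is given below: the model matrix of Equation (4.3) is assembled for a full quadratic polynomial, and the coefficients of Equation (4.2) are obtained with a standard least-squares solver (which is numerically preferable to forming the normal equations explicitly). The helper names are illustrative only.

```python
import numpy as np

def quadratic_model_matrix(X):
    """Model matrix with constant, linear, interaction, and squared
    terms for a full quadratic polynomial in k variables."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i, k)]
    return np.column_stack(cols)

def fit_polynomial(X, y):
    """Least squares estimate b of the regression coefficients,
    Equation (4.2)."""
    b, *_ = np.linalg.lstsq(quadratic_model_matrix(X), y, rcond=None)
    return b

def predict(b, X):
    return quadratic_model_matrix(X) @ b  # Equation (4.4)

rng = np.random.default_rng(1)
X = rng.random((30, 2))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.01 * rng.standard_normal(30)
print(predict(fit_polynomial(X, y), np.array([[0.5, 0.5]])))
```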
Polynomial metamodels can produce large errors for highly non-linear responses but can
provide good local approximations if the response is less complex. These features are
taken advantage of in the method of moving least squares (MLS). For a specific value of
x, a polynomial is fitted according to the least squares method, but the influence of
surrounding points is weighted depending on the distance to x (Breitkopf et al., 2005).
Hence, compared to Equation (4.4) for polynomial metamodels, the MLS model has
coefficients b that depend on the location in the design space, i.e. depend on x. Thus, one
polynomial fit is not valid over the entire domain as for normal polynomial metamodels.
Instead, the polynomial is valid only locally around the point x where the fit is made.
Since b is a function of x, a new MLS model needs to be fitted for each new evaluation.
Furthermore, in order to construct the metamodel, a certain number of fitting points must
fall within the domain of influence. The number of influencing fitting designs can be
adjusted by changing the weight functions, or rather the radius of the domain of influence.
The denser the design space is sampled, the smaller the domain of influence can be, and
the more accurate the metamodel becomes.
Next, some other metamodels suitable for global approximations and frequently mentioned in the literature will be covered in more detail. These metamodels could be possible alternatives for the MDO process presented in Chapter 6.
4.3.1 Kriging
Kriging is named after D. G. Krige, and this method for building metamodels has been used in many engineering applications. Design and analysis of computer experiments
(DACE) is a statistical framework for dealing with Kriging approximations to complex
and expensive computer models presented by Sacks et al. (1989). The idea behind Kriging
is that the deterministic response y(x) can be described as
$$ y(\mathbf{x}) = f(\mathbf{x}) + Z(\mathbf{x}) \qquad (4.5) $$

where $f(\mathbf{x})$ is a known polynomial function of the design variables $\mathbf{x}$ and $Z(\mathbf{x})$ is a stochastic process (random function). This process is assumed to have mean zero, variance $\sigma^2$, and a non-zero covariance. The $f(\mathbf{x})$ term is similar to a polynomial model described in
the previous section and provides a global model of the design space, while the Z(x) term
creates local deviations so that the Kriging model interpolates the n sampled data points.
In many cases, f(x) is simply a constant term and the method is then called ordinary
Kriging. If f(x) is set to 0, implying that the response y(x) has mean zero, the method is
called simple Kriging.
A fitted Kriging model for an unknown point $\mathbf{x}$ can be written as

$$ \hat{y}(\mathbf{x}) = \mathbf{f}(\mathbf{x})^T \mathbf{b} + \mathbf{r}(\mathbf{x})^T \mathbf{R}^{-1} (\mathbf{y} - \mathbf{X}\mathbf{b}) \qquad (4.6) $$

where $\mathbf{f}(\mathbf{x})$ is a vector corresponding to a row of the model matrix $\mathbf{X}$ in the same way as for the polynomial models previously described, $\mathbf{b}$ is a vector of the estimated regression coefficients, $\mathbf{r}(\mathbf{x}) = [R(\mathbf{x}, \mathbf{x}_1), R(\mathbf{x}, \mathbf{x}_2), \ldots, R(\mathbf{x}, \mathbf{x}_n)]^T$ is a vector of correlation functions between the unknown point and the $n$ sample points, $\mathbf{R}$ is the matrix of correlation functions for the fitting sample, and $\mathbf{y}$ is a vector of the observed responses in the fitting sample. The term $(\mathbf{y} - \mathbf{X}\mathbf{b})$ is a vector of residuals for all fitting points when the stochastic term of the model is disregarded. The regression coefficients are found by

$$ \mathbf{b} = (\mathbf{X}^T \mathbf{R}^{-1} \mathbf{X})^{-1} \mathbf{X}^T \mathbf{R}^{-1} \mathbf{y} \qquad (4.7) $$
Many different correlation functions could be used, but two commonly applied functions
are the exponential and the Gaussian correlation functions (Stander et al., 2010), i.e.
20
CHAPTER 4. METAMODEL-BASED DESIGN OPTIMIZATION
$$ R(\mathbf{x}_i, \mathbf{x}_j) = \exp\left( -\sum_{l=1}^{k} \theta_l \, |x_{il} - x_{jl}| \right) \qquad (4.8) $$

and

$$ R(\mathbf{x}_i, \mathbf{x}_j) = \exp\left( -\sum_{l=1}^{k} \theta_l \, (x_{il} - x_{jl})^2 \right) \qquad (4.9) $$

The unknown correlation parameters $\boldsymbol{\theta}$ are typically found by solving the maximum likelihood estimation problem

$$ \max_{\boldsymbol{\theta}} \; -\frac{1}{2} \left[ n \ln(\hat{\sigma}^2) + \ln |\mathbf{R}| \right] \qquad \text{subject to} \quad \theta_l > 0, \; l = 1, \ldots, k \qquad (4.10) $$

where $|\mathbf{R}|$ is the determinant of $\mathbf{R}$ and the estimate of the variance is given by

$$ \hat{\sigma}^2 = \frac{(\mathbf{y} - \mathbf{X}\mathbf{b})^T \mathbf{R}^{-1} (\mathbf{y} - \mathbf{X}\mathbf{b})}{n} \qquad (4.11) $$
data (Simpson et al., 2001). An interpolating Kriging model can also be modified by
adding a regularization constant to the diagonal of the correlation matrix so that the model
does not interpolate the data. The Kriging method is thus flexible and well suited for
global approximations of the complete design space. Kriging models also provide an
estimate of the prediction error in an unobserved point directly (Sacks et al., 1989), which
is a feature that can be used in adaptive sequential sampling approaches.
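The following Python sketch implements an ordinary Kriging predictor along the lines of Equations (4.6) and (4.7), assuming a constant trend, a Gaussian correlation function, and a fixed, user-chosen θ (in practice θ would be found from the maximum likelihood problem in Equation (4.10)). A small nugget is added to the correlation matrix for numerical conditioning, so the model is very nearly, but not exactly, interpolating.

```python
import numpy as np

def fit_ordinary_kriging(X, y, theta):
    """Ordinary Kriging: constant trend plus a Gaussian-correlated
    stochastic process with fixed correlation parameters theta."""
    def corr(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2 * theta).sum(axis=2)
        return np.exp(-d2)
    n = X.shape[0]
    R = corr(X, X) + 1e-10 * np.eye(n)  # small nugget for conditioning
    Ri = np.linalg.inv(R)
    ones = np.ones(n)
    beta = (ones @ Ri @ y) / (ones @ Ri @ ones)  # Equation (4.7), f = 1
    resid = Ri @ (y - beta * ones)
    def predict(Xnew):
        r = corr(Xnew, X)        # correlations to the fitting points
        return beta + r @ resid  # Equation (4.6)
    return predict

rng = np.random.default_rng(2)
X = rng.random((20, 2))
y = np.sin(6 * X[:, 0]) + X[:, 1]
predict = fit_ordinary_kriging(X, y, theta=np.array([10.0, 10.0]))
print(predict(np.array([[0.3, 0.7]])))
```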
4.3.2 Radial Basis Functions

A radial basis function (RBF) depends only on the distance to a sample point, i.e.

$$ \varphi(\mathbf{x}, \mathbf{x}_i) = \varphi(\|\mathbf{x} - \mathbf{x}_i\|) = \varphi(r) \qquad (4.12) $$

where $r$ is the distance between the points $\mathbf{x}$ and $\mathbf{x}_i$. The RBFs can be of many forms but are always radially symmetric. The Gaussian function and Hardy's multiquadrics are commonly used and expressed as

$$ \varphi(r) = e^{-r^2/c^2} \qquad (4.13) $$

and

$$ \varphi(r) = \sqrt{r^2 + c^2} \qquad (4.14) $$

respectively, where $c$ is a shape parameter that controls the smoothness of the function, see Figure 4.5.
Figure 4.5 Examples of radial basis functions: a) Gaussian RBF and b) Hardy's multiquadric RBF.
An RBF metamodel can be written as a linear combination of radial basis functions, i.e.

$$ \hat{y}(\mathbf{x}) = \sum_{i=1}^{n} w_i \, \varphi(\|\mathbf{x} - \mathbf{x}_i\|) = \mathbf{w}^T \boldsymbol{\Phi} \qquad (4.15) $$

The metamodel is thus represented by a sum of $n$ RBFs, each associated with a sample point $\mathbf{x}_i$, representing the centre of the RBF, and weighted by a coefficient $w_i$. The coefficients $w_i$, i.e. the unknown parameters that must be determined when building the metamodel, can be collected in a vector $\mathbf{w}$. The vector $\boldsymbol{\Phi}$ contains the evaluations of the RBF for all distances between the studied point $\mathbf{x}$ and the sample designs $\mathbf{x}_i$.
Radial basis function metamodels are often interpolating, i.e. the parameters $w_i$ are chosen such that the approximation matches the responses in the sampled dataset $(\mathbf{x}_i, y_i)$, where $i = 1, \ldots, n$. This can be obtained if the number of RBFs equals the number of samples in the fitting set, resulting in a linear system of equations in $w_i$

$$ \mathbf{B}\mathbf{w} = \mathbf{y} \qquad (4.16) $$

where $\mathbf{y}$ is the vector of responses, $\mathbf{w}$ is the vector of unknown coefficients, and $\mathbf{B}$ is the $n \times n$ symmetric interpolation matrix that contains the evaluations of the RBF for the distances between all the fitting points

$$ B_{ij} = \varphi(\|\mathbf{x}_i - \mathbf{x}_j\|), \qquad i, j = 1, \ldots, n \qquad (4.17) $$
The equation system (4.16) can be solved by standard methods, using matrix decompositions, for small $n$, but special methods have to be applied when $n$ becomes large (Dyn et al., 1986), since the interpolation matrix is often full and ill-conditioned.
When the number of basis functions $n_{RBF}$ is smaller than the sample size $n_s$, the model will be approximating. Similarly to the polynomial regression model, the optimal weights in the least squares sense are obtained as

$$ \mathbf{w} = (\mathbf{B}^T \mathbf{B})^{-1} \mathbf{B}^T \mathbf{y} \qquad (4.18) $$

where $\mathbf{B}$ is now the rectangular $n_s \times n_{RBF}$ matrix of basis function evaluations.
23
CHAPTER 4. METAMODEL-BASED DESIGN OPTIMIZATION
known point. A small value of $c$, on the other hand, means that only nearby points will influence the prediction. Consequently, the selection of $c$ also influences the risk of overfitting or underfitting. If the value is chosen too small, overfitting will occur, i.e. every sample point will influence only its very close neighbourhood. On the other hand, if the value is selected too large, underfitting will appear and the model loses fine details, see Figure 4.6. So, while the correct choice of $\mathbf{w}$ ensures that the metamodel can reproduce the training data, the correct estimate of $c$ enables a smaller prediction error in unknown points. The prediction error for an RBF metamodel can easily be evaluated at any point in the design space (Forrester and Keane, 2009), which is a property that can be useful in e.g. sequential sampling.
Figure 4.6 Examples of models with poor prediction capabilities due to a) overfitting and b) underfitting.
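An interpolating Gaussian RBF metamodel following Equations (4.15)-(4.17) can be sketched as below. The implementation is deliberately naive: it solves the dense linear system directly, which is only sensible for small n, and the shape parameter c is simply assumed rather than tuned.

```python
import numpy as np

def fit_rbf(X, y, c=0.3):
    """Interpolating Gaussian RBF metamodel: solve Bw = y with
    B_ij = exp(-||x_i - x_j||^2 / c^2), Equations (4.16)-(4.17)."""
    def basis(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / c ** 2)
    w = np.linalg.solve(basis(X, X), y)
    return lambda Xnew: basis(Xnew, X) @ w  # Equation (4.15)

rng = np.random.default_rng(3)
X = rng.random((15, 1))
y = np.sin(8 * X[:, 0])
model = fit_rbf(X, y, c=0.3)
print(model(X)[:3], y[:3])  # interpolation: predictions match the data
```

Trying a few values of c and checking the error at points held out from the fit is a simple way to probe the over-/underfitting trade-off discussed above.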
4.3.3 Artificial Neural Networks

A neural network (NN) consists of interconnected neurons. Each neuron $m$ computes its output from a weighted sum of its inputs, see Figure 4.7, i.e.

$$ y_m = f\left( b_m + \sum_i w_{mi} \, x_i \right) = f(a) \qquad (4.19) $$

where $f$ is the transfer or activation function, $b_m$ is the bias value, and $w_{mi}$ the weight of
Figure 4.7 Illustration of neuron m in a neural network, where the input is variables or output from previous neurons.
One very common architecture is the multi-layer feedforward neural network (FFNN), see Figure 4.8, in which information is only passed forward in the network and none is fed backward. The transfer function in the hidden layers of an FFNN is
often a sigmoid function, i.e.

$$ f(a) = \frac{1}{1 + e^{-a}} \qquad (4.20) $$

which is an S-shaped curve ranging from 0 to 1, where $a$ is defined in Equation (4.19). For the input and output layers, a linear transfer $f(a) = a$ is often used, with bias added to the output layer but not to the input layer. This means that a simple neural network with only one hidden layer of $M$ neurons can be of the form

$$ \hat{y}(\mathbf{x}) = b + \sum_{m=1}^{M} \frac{w_m}{1 + e^{-\left( b_m + \sum_i w_{mi} x_i \right)}} \qquad (4.21) $$
where $b$ is the bias of the output neuron, $w_m$ is the weight on the connection between the $m$th hidden neuron and the output neuron, $b_m$ is the bias in the $m$th hidden neuron, and $w_{mi}$ is the weight on the connection between the $i$th input and the $m$th hidden neuron.
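The forward pass of Equation (4.21) translates directly into code. The sketch below evaluates a one-hidden-layer FFNN with sigmoid hidden units and a linear output neuron; the weights and biases are random placeholders, since in practice they are the result of training.

```python
import numpy as np

def ffnn_predict(x, W, b_hidden, w_out, b_out):
    """One-hidden-layer FFNN, Equation (4.21). W is the (M, k) matrix of
    input-to-hidden weights, b_hidden the M hidden biases, w_out the M
    hidden-to-output weights, and b_out the output bias."""
    a = W @ x + b_hidden               # net input a of each hidden neuron
    hidden = 1.0 / (1.0 + np.exp(-a))  # sigmoid transfer, Equation (4.20)
    return b_out + w_out @ hidden      # linear output with bias

# k = 2 inputs, M = 3 hidden neurons, random (untrained) parameters
rng = np.random.default_rng(4)
W = rng.standard_normal((3, 2))
y_hat = ffnn_predict(np.array([0.2, 0.7]), W,
                     rng.standard_normal(3), rng.standard_normal(3), 0.1)
print(y_hat)
```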
There are two distinct steps in building a neural network. The first is to choose the
architecture and the second is to train the network to perform well with respect to the
training set of input (design variable values) and corresponding output (response values).
The second step means that the free parameters of the network, i.e. the weights and biases
Figure 4.8 Illustration of a feedforward neural network architecture with multiple hidden layers.
If the steepest descent algorithm is used for the optimization, the training is said to be done by back-propagation (Rumelhart et al., 1986), which means that the weights are adjusted in proportion to

$$ \Delta w \propto -\frac{\partial E}{\partial w} \qquad (4.22) $$

The studied error measure $E$ is the sum of the squared differences between the target output and the actual output from the network over all $n$ points in the training set, i.e.

$$ E = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \qquad (4.23) $$
The adjustment of the weights starts at the output layer and is thus based on the difference between the response from the NN and the target response from the training set. For the hidden layers, where there is no specified target value $y_i$, the adjustments of the weights are instead determined recursively based on the sum of the changes at the connecting nodes multiplied by their respective weights. In this way the adjustments of the weights are distributed backwards in the network, hence the name back-propagation.
It has been shown by Hornik et al. (1989) that FFNNs with one hidden layer can approximate any continuous function to any desired degree of accuracy, given a sufficient number of neurons in the hidden layer and the correct interconnection weights and biases. In
theory, FFNN metamodels thus have the flexibility to approximate very complex functions, and FFNNs are therefore well suited for global approximations of the design space.
Deciding on the appropriate number of neurons in the hidden layer or layers is not trivial. Generally, the correct number of neurons in the hidden layer(s) is determined experimentally, i.e. a number of candidate networks are constructed and the one judged to be the best is selected. Often, only one hidden layer is used. Although FFNNs with one hidden layer theoretically should be able to approximate any continuous function, a single hidden layer is not necessarily optimal. One hidden layer may require many more neurons to accurately capture complex functions than a network with two hidden layers, since it might be easier to improve an approximation locally without making it worse elsewhere in a network with two hidden layers (Chester, 1990).
Evidently, if the number of free parameters is sufficiently large and the training optimization is run long enough, it is possible to drive the training error as close to zero as preferred. However, a very small training error is not necessarily desirable since it can lead to overfitting instead of a model with good prediction capabilities. An overfitted model does not capture the underlying function properly. It describes the noise rather than the principal relationship and can result in poor predictions even for noise-free data, see Figure 4.9a. Overfitting generally occurs when a model is excessively complex, i.e. when it has too many parameters relative to the number of observations in the training set. On the other hand, if the network model is not sufficiently complex, the model can also fail in capturing the underlying function, leading to underfitting, see Figure 4.9b. Given a fixed amount of training data, it is beneficial to reduce both the number and the size of the weights and biases in order to avoid overfitting.
Figure 4.9 Examples of models with poor prediction capabilities due to a) overfitting and b) underfitting.
Regularization means that some constraints are applied to the construction of the NN
model in order to reduce the prediction error in the final model. For FFNN models,
27
CHAPTER 4. METAMODEL-BASED DESIGN OPTIMIZATION
regularization can be done by controlling the number of hidden neurons in the network.
Another way is to impose penalties on the weights and biases or to use a combination of
both methods (Stander et al., 2010). A fundamental problem when modelling noisy data or using very limited data is to balance the goodness of fit against the strength of the constraints forced on the model by regularization.
Another common type of neural network, in addition to the FFNN, is the radial basis
function neural network (RBFNN), which has activation functions in the form of RBFs.
An RBFNN has a defined three-layer architecture with the single hidden layer built of
non-linear radial units, each responding only to a local region of the design space. The
input layer is linear and the output layer performs a biased weighted sum of the hidden
layer units and creates an approximation over the entire design space, see Figure 4.10. The
RBFNN model is sometimes complemented with a linear part corresponding to additional
direct connections from the input neurons to the output neuron.
Figure 4.10 Illustration of a radial basis function neural network with Gaussian activation functions.
Each radial unit computes the distance

$$ r_m = \sqrt{\sum_{i=1}^{k} (x_i - w_{mi})^2} \qquad (4.24) $$

between the input vector $\mathbf{x} = (x_1, \ldots, x_k)^T$ and the RBF centres $\mathbf{w}_m = (w_{m1}, \ldots, w_{mk})$ in the $k$-dimensional space. For a given input vector $\mathbf{x}$, the output from an RBFNN with $k$ input
28
CHAPTER 4. METAMODEL-BASED DESIGN OPTIMIZATION
neurons and a hidden layer consisting of $M$ RBF units (but without a linear part) is given by

$$ \hat{y}(\mathbf{x}) = b + \sum_{m=1}^{M} w_m \, h_m(\mathbf{x}) \qquad (4.25) $$

where the activation of the $m$th radial unit, here a Gaussian, is

$$ h_m(\mathbf{x}) = e^{-r_m^2 / w_{m0}^2} \qquad (4.26) $$

Thus, the hidden layer parameters $\mathbf{w}_m = (w_{m1}, \ldots, w_{mk})$ represent the centre of the $m$th radial unit, while $w_{m0}$ determines its width. The parameters $b$ and $w_1, \ldots, w_M$ are the bias and weights of the output layer, respectively. All these parameters and the number of neurons $M$ must be determined when building the RBFNN metamodel.
In the same way as a feedforward neural network can approximate any continuous function to any desired degree of accuracy, an RBFNN with enough hidden neurons can too. An important feature of the RBFNNs, which differs from the FFNNs, is that the hidden layer parameters, i.e. the parameters governing the RBFs, can be determined by semi-empirical, unsupervised training techniques. This means that RBFNNs can be trained much faster than FFNNs, although the RBFNN may require more hidden neurons than a comparable FFNN (Stander et al., 2010).
The training process for RBFNNs is generally done in two steps. First, the hidden layer parameters, i.e. the centres and widths of the radial units, are set. Then, the bias and weights of the linear output layer are optimized, while the basis functions are kept fixed. In comparison, all of the parameters of an FFNN are usually determined simultaneously as part of a single optimization procedure (training). The optimization in the second step of the RBFNN training is done to minimize some performance criterion, e.g. the mean sum of squares of the network errors on the training set (MSE), see Equation (4.39). If the hidden layer parameters are kept fixed, the MSE performance function is a quadratic function of the output layer parameters and its minimum can be found as the solution to a set of linear equations. The possibility to avoid time-consuming non-linear optimization during the training is one of the major advantages of RBFNNs compared to FFNNs.
Commonly, the number of RBFs is chosen to be equal to the number of samples in the training dataset ($M = n$), the RBF centres are set at the fitting designs ($\mathbf{w}_m = \mathbf{x}_m$, $m = 1, \ldots, n$), and the widths of the radial units are all selected equal. In general, the widths are set to be a multiple $s_w$ of the average distance between the RBF centres so that they overlap to some degree and hence result in a relatively smooth representation of the data. Sometimes the widths are instead individually set to the distance to the $n_w$ ($\ll n$) closest neighbours so that the widths become smaller in areas with many samples close to
each other. This results in a model that preserves fine details in densely populated areas
and interpolates the data in sparse areas of the design space.
When building an RBFNN metamodel, the goal is to find a smooth model that captures
the underlying functional response without fitting potential noise, i.e. avoid overfitting.
For noisy data, the exact RBFNN that interpolates the training dataset is usually a highly
oscillatory function, and this needs to be addressed when building the model. Similarly to what can be done for an FFNN or a Kriging model, regularization can be applied to adjust the
output layer parameters in the second phase of the training. This will yield a model that no
longer passes through the fitting points. However, it is probably more effective to properly
select the hidden layer parameters, i.e. the width and centres of the RBF units, in the first
step of the training. Regularization in the second step can never compensate for large
inaccuracies in the model parameters. Another way of constructing an approximating
model is to reduce the number of RBFs. This could be done by starting with an empty
subset of basis functions and adding, one at a time, the basis function which reduces some
error metric the most. The selection is done from the $n$ possible basis functions, which are centred around the observed data points $\mathbf{x}_i$, and the process is continued until no significant decrease in the studied error metric is observed.
Since the accuracy of the metamodel strongly depends on the hidden layer parameters, it is important to estimate them well. Instead of just selecting the values, the widths can be found by looping over several trial values of $s_w$ or $n_w$ and finally selecting the best RBFNN. The selection can for example be based on the generalized cross validation error, which is a measure of goodness of fit that also takes the model complexity into account, see Section 4.4.3. Another approach to find the best possible RBFNN metamodel can be to include the widths as adjustable parameters along with the output layer parameters in the second step of training. However, this requires a non-linear optimization in combination with a sophisticated regularization, and one of the benefits of the RBFNN, the speed of training, would be lost.
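The two-step training procedure can be sketched as follows: the hidden layer is fixed by placing one Gaussian unit at every sample, with a common width set to a multiple s_w of the average distance between centres, after which the output layer bias and weights are found from a regularized linear least-squares problem. The regularization constant is an assumed placeholder.

```python
import numpy as np

def train_rbfnn(X, y, s_w=2.0, reg=1e-6):
    """Two-step RBFNN training sketch: (1) fix centres and a common
    width, (2) solve a linear problem for the output layer."""
    n = X.shape[0]
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    w0 = s_w * dist[np.triu_indices(n, k=1)].mean()  # common width
    def hidden(A):  # Gaussian activations, Equation (4.26)
        d2 = ((A[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / w0 ** 2)
    H = np.column_stack([np.ones(n), hidden(X)])  # bias + M = n units
    # ridge-regularized normal equations for bias b and weights w
    coef = np.linalg.solve(H.T @ H + reg * np.eye(n + 1), H.T @ y)
    b, w = coef[0], coef[1:]
    return lambda Xnew: b + hidden(Xnew) @ w  # Equation (4.25)

rng = np.random.default_rng(5)
X = rng.random((25, 2))
y = np.sin(5 * X[:, 0]) * X[:, 1]
model = train_rbfnn(X, y)
print(model(np.array([[0.4, 0.6]])))
```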
where $t$ is the truncation location, also called the knot location, and $q$ is the order of the spline. The subscript “+” indicates that the argument, i.e. the value within the square brackets, should be positive; otherwise the function is assumed to be zero. For $q > 0$, the spline is continuous and has $q - 1$ continuous derivatives. Often $q = 1$ is recommended, and the splines then become "hinge functions", see Figure 4.11. The resulting MARS model will then have discontinuous derivatives but could be modified to have continuous first order derivatives (Friedman, 1991).
Figure 4.11 The pair of hinge functions $b_+(x - t) = \max(0, x - t)$ and $b_-(x - t) = \max(0, t - x)$.
A MARS metamodel is a weighted sum of basis functions,

$$ \hat{y}(\mathbf{x}) = a_0 + \sum_{m=1}^{M} a_m B_m(\mathbf{x}) \qquad (4.28) $$

where each basis function is a product of hinge functions,

$$ B_m(\mathbf{x}) = \prod_{j=1}^{J_m} \left[ s_{jm} \cdot \left( x_{v(j,m)} - t_{jm} \right) \right]_+^q \qquad (4.29) $$

where $J_m$ is the number of factors in the $m$th basis function, i.e. the number of functions $b$ in the product. The parameter $s_{jm} = \pm 1$ and indicates the "left" or "right" version of the function, $x_{v(j,m)}$ denotes the $v$th variable, where $1 \leq v(j,m) \leq k$ and $k$ is the total number of
variables, and $t_{jm}$ is the knot location for each of the corresponding variables. As previously, $q$ indicates the power of the function.
Building a MARS metamodel is done in two steps. The first step starts with $a_0$, which is the mean of the response values in the fitting set. Basis functions $B_m$ and $B_{m+1}$ are then added in pairs to the model, choosing the ones that minimize a certain measure of lack of fit. Each new pair of basis functions consists of a term already in the model multiplied with the "left" and "right" versions of a truncated power function $b$, respectively. The functions $b$ are defined by a variable $x_v$ and a knot location $t$. When adding a new pair of basis functions, the algorithm must therefore search over all combinations of the existing terms of the metamodel (to select the term to be used), all variables (to select the one for
the new basis function), and all values of each variable (to find the knot location). For each of these combinations, the best set of coefficients $a_m$ is found through a least squares regression of the model response $\hat{y}$ onto the response from the fitting set $y$. The process of adding terms to the model is continued until a pre-defined maximum number of terms is reached or until the improvement in lack of fit is sufficiently small. This so-called forward pass usually builds a model that overfits the data. The second step of the model building is therefore a backward pass where model terms are removed one by one, deleting the least effective term until the best metamodel is obtained. The lack of fit for the models is
calculated using a modified form of generalized cross validation (see Section 4.4.3), which
takes both the error and complexity of the model into account. More details can be found
in Friedman (1991). The backward pass has the advantage that it can choose to delete any term except $a_0$. The forward pass can only add pairs of terms at each step, which are based on the terms already in the model.
A lot of searches have to be done during the model building. However, Jin et al. (2001)
state that one of the advantages of the MARS metamodel, compared to Kriging, is the
reduction in computational cost associated with building the model.
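To make the structure of Equations (4.28) and (4.29) concrete, the sketch below evaluates a small MARS-style model built from q = 1 hinge functions. The coefficients, knots, and terms are invented for illustration; an actual MARS build selects them through the forward and backward passes described above.

```python
import numpy as np

def hinge(x, t, s):
    """Truncated power basis with q = 1: the 'right' (s = +1) or
    'left' (s = -1) hinge function max(0, s * (x - t))."""
    return np.maximum(0.0, s * (x - t))

def mars_predict(X, a0, terms):
    """Evaluate y = a0 + sum_m a_m * B_m(x), Equation (4.28), where each
    B_m is a product of hinge functions, Equation (4.29). `terms` is a
    list of (a_m, [(variable_index, knot, sign), ...]) tuples."""
    y = np.full(X.shape[0], a0)
    for a_m, factors in terms:
        B = np.ones(X.shape[0])
        for v, t, s in factors:
            B *= hinge(X[:, v], t, s)
        y += a_m * B
    return y

# Hypothetical model: one main-effect term and one interaction term
terms = [(1.5, [(0, 0.3, +1)]),                  # 1.5 * max(0, x1 - 0.3)
         (-2.0, [(0, 0.3, +1), (1, 0.5, -1)])]   # two-factor interaction
X = np.array([[0.8, 0.2], [0.1, 0.9]])
print(mars_predict(X, a0=0.7, terms=terms))
```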
4.3.5 Support Vector Regression

A support vector regression (SVR) metamodel can be written as

$$ \hat{y}(\mathbf{x}) = b + \sum_{m=1}^{M} w_m Q_m(\mathbf{x}) = b + \mathbf{w}^T \mathbf{Q}(\mathbf{x}) \qquad (4.30) $$

Hence, a sum of basis functions $\mathbf{Q} = [Q_1(\mathbf{x}), \ldots, Q_M(\mathbf{x})]^T$ with weights $\mathbf{w} = [w_1, \ldots, w_M]^T$ is
added to a base term $b$. The parameters $b$ and $w_m$ are to be estimated, but in a different way than their counterparts in other metamodels. The basis functions $\mathbf{Q}$ in the SVR model can be seen as a transformation of $\mathbf{x}$ into some feature space in which the model is linear, see Figure 4.12.
One of the main ideas of SVR is that a margin $\varepsilon$ is given within which a difference between the fitting set responses and the metamodel prediction is accepted. This means that the fitting points that lie within the $\pm\varepsilon$ band (called the $\varepsilon$-tube) are ignored, and the metamodel is defined entirely by the points, called support vectors, that lie on or outside this region, see Figure 4.12. This can be useful when the fitting data has an element of random error due to numerical noise etc. A suitable value of $\varepsilon$ might be found by a sensitivity study. In practical cases, however, the dataset is often not large enough to afford not using all of the samples when building the metamodel. In addition, the time needed to train an SVR model is longer than what is required for many other metamodels.
Estimating the unknown parameters of an SVR metamodel is an optimization problem. The goal is to find a model that has at most a deviation of $\varepsilon$ from the observed responses $y_i$ ($i = 1, \ldots, n$) and at the same time minimizes the model complexity, i.e. makes the metamodel as flat as possible in feature space (Smola and Schölkopf, 2004). Flatness means that $\mathbf{w}$ should be small, which can be ensured by minimizing the vector norm $\|\mathbf{w}\|^2$. Since it might be impossible to find a solution that approximates all $y_i$ with precision $\pm\varepsilon$, and since better predictions might be obtained if the possibility of outliers is allowed, slack variables $\xi^+$ and $\xi^-$ can be introduced, see Figure 4.12. The optimization problem can then be stated as
$$
\begin{aligned}
\min \quad & \frac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{n} \left( \xi_i^+ + \xi_i^- \right) \\
\text{subject to} \quad & \mathbf{w} \cdot \mathbf{Q}(\mathbf{x}_i) + b - y_i \leq \varepsilon + \xi_i^+ \\
& y_i - \mathbf{w} \cdot \mathbf{Q}(\mathbf{x}_i) - b \leq \varepsilon + \xi_i^- \\
& \xi_i^+, \, \xi_i^- \geq 0
\end{aligned}
\qquad (4.31)
$$
This problem is a trade-off between model complexity and the degree to which errors larger than $\varepsilon$ are tolerated. The trade-off is governed by the user-defined constant $C > 0$, and this method of tolerating errors is known as the $\varepsilon$-insensitive loss function, see Figure 4.12. Other loss functions are also possible. The $\varepsilon$-insensitive loss function means that no loss is associated with the points inside the $\varepsilon$-tube, while points outside have a loss that increases linearly at a rate determined by $C$. A small constant leads to a flatter prediction, i.e. more emphasis on minimizing $\|\mathbf{w}\|^2$, usually with fewer support vectors. A larger constant leads to a closer fit of the data, i.e. more emphasis on minimizing $\sum (\xi_i^+ + \xi_i^-)$, usually with a larger number of support vectors. Although there might be an optimum value of $C$, the exact choice is not critical according to Forrester and Keane (2009).
Figure 4.12 SVR metamodel in one design variable with support vectors marked with dark dots and the designs disregarded in the model build marked with light dots. The non-linear model is reduced to a linear model by the mapping $\mathbf{Q}$ from input space into feature space, and the support vectors contribute to the cost through the $\varepsilon$-insensitive loss function.
In most cases, the optimization problem described by Equation (4.31) is more easily solved in its dual form, and it is therefore written as the minimization of the corresponding Lagrangian function $L$. At the optimum, the partial derivatives of $L$ with respect to its primal variables $\mathbf{w}$, $b$, $\xi^+$, and $\xi^-$ must vanish, which leads to the optimization problem

$$
\begin{aligned}
\max_{\boldsymbol{\alpha}^+, \boldsymbol{\alpha}^-} \quad & -\frac{1}{2} \sum_{i,j=1}^{n} \left( \alpha_i^+ - \alpha_i^- \right)\left( \alpha_j^+ - \alpha_j^- \right) k(\mathbf{x}_i, \mathbf{x}_j) - \varepsilon \sum_{i=1}^{n} \left( \alpha_i^+ + \alpha_i^- \right) + \sum_{i=1}^{n} y_i \left( \alpha_i^+ - \alpha_i^- \right) \\
\text{subject to} \quad & \sum_{i=1}^{n} \left( \alpha_i^+ - \alpha_i^- \right) = 0 \\
& \alpha_i^+, \, \alpha_i^- \in [0, C]
\end{aligned}
\qquad (4.32)
$$
where $\alpha_i^+$ and $\alpha_i^-$ are dual variables (Lagrange multipliers) and $k(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{Q}(\mathbf{x}_i) \cdot \mathbf{Q}(\mathbf{x}_j)$ represents the so-called kernel function. This problem can be solved using a quadratic programming algorithm to find the optimal choices of the dual variables. The kernel functions must have certain properties, and possible choices include linear and Gaussian functions etc., as seen in Table 4.1 (Smola and Schölkopf, 2004).
Setting the partial derivative of $L$ with respect to $\mathbf{w}$ to zero yields $\mathbf{w} = \sum_{i=1}^{n} (\alpha_i^+ - \alpha_i^-) \mathbf{Q}(\mathbf{x}_i)$. Equation (4.30) can then be rewritten to provide the response in an unknown point $\mathbf{x}$ as

$$ \hat{y}(\mathbf{x}) = b + \mathbf{w} \cdot \mathbf{Q}(\mathbf{x}) = b + \sum_{i=1}^{n} \left( \alpha_i^+ - \alpha_i^- \right) \mathbf{Q}(\mathbf{x}_i) \cdot \mathbf{Q}(\mathbf{x}) = b + \sum_{i=1}^{n} \left( \alpha_i^+ - \alpha_i^- \right) k(\mathbf{x}_i, \mathbf{x}) \qquad (4.33) $$
The optimization problems in Equations (4.31) and (4.32) correspond to finding the flattest function in feature space, not in input space. The base term $b$ is still unknown but can be determined from the conditions

$$ b = y_i - \mathbf{w} \cdot \mathbf{Q}(\mathbf{x}_i) - \varepsilon \quad \text{for} \quad \alpha_i^+ \in (0, C) \qquad (4.34) $$

$$ b = y_i - \mathbf{w} \cdot \mathbf{Q}(\mathbf{x}_i) + \varepsilon \quad \text{for} \quad \alpha_i^- \in (0, C) \qquad (4.35) $$

which means that $b$ can be calculated for one or more $\alpha_i^{\pm}$ that fulfil the conditions. Better results are obtained for $\alpha_i^{\pm}$ not too close to the bounds according to Forrester and Keane (2009). The set of equations could also be solved via linear regression.
It can be seen that SVR methods produce RBF networks with all width parameters set to
the same value and centres corresponding to the support vectors. The number of basis
functions, i.e. hidden layer units, M in Equation (4.25), is the number of support vectors.
Table 4.1 Kernel functions for SVR where c, ϑ, and κ are constants.
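In practice, SVR metamodels are usually fitted with an existing library rather than by solving Equation (4.32) by hand. The sketch below uses scikit-learn's SVR class with a Gaussian (RBF) kernel on a made-up noisy one-dimensional dataset; the values of C and ε are assumptions that would normally be tuned, e.g. by the sensitivity study mentioned above.

```python
import numpy as np
from sklearn.svm import SVR  # assumes scikit-learn is installed

rng = np.random.default_rng(6)
X = rng.random((40, 1))
y = np.sin(6 * X[:, 0]) + 0.05 * rng.standard_normal(40)  # noisy response

# epsilon sets the half-width of the tolerated error tube, C the trade-off
# between flatness and errors larger than epsilon
model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
print("support vectors:", model.support_vectors_.shape[0], "of", len(y))
print(model.predict(np.array([[0.5]])))
```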
residuals mean that the model reflects the dataset more accurately than if the residuals
were larger, i.e. the fitting error is smaller. Several different error measures can be
evaluated based on the residuals, e.g. the maximum absolute error (MAE), the average
absolute error (AAE), the mean absolute percentage error (MAPE), the mean squared
error (MSE), and the root mean squared error (RMSE).
$$ MAE = \max_i \, |y_i - \hat{y}_i| \,, \qquad i = 1, \ldots, n \qquad (4.36) $$

$$ AAE = \frac{\sum_{i=1}^{n} |y_i - \hat{y}_i|}{n} \qquad (4.37) $$

$$ MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \qquad (4.38) $$

$$ MSE = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n} \qquad (4.39) $$

$$ RMSE = \sqrt{ \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n} } \qquad (4.40) $$
where $n$ is the number of samples in the fitting set. The smaller these error measures are, the smaller the fitting error is. The AAE, MAPE, MSE, and RMSE provide measures of the overall accuracy, while the MAE is a measure of the local accuracy of the model. RMSE is the most commonly used metric, but it can be misleading since the residuals are not measured relative to the response magnitude. If the dataset contains both high and low response values, it might be desirable to study a relative error instead, i.e. an error measure that is independent of the magnitude of the response. The MAPE measure takes this aspect into consideration.
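The error measures of Equations (4.36)-(4.40) are straightforward to compute from the residuals, as the following sketch shows (the MAPE expression assumes that no observed response is zero).

```python
import numpy as np

def error_measures(y, y_hat):
    """Residual-based error measures, Equations (4.36)-(4.40)."""
    r = y - y_hat
    return {
        "MAE":  np.max(np.abs(r)),               # local accuracy
        "AAE":  np.mean(np.abs(r)),
        "MAPE": np.mean(np.abs(r / y)) * 100.0,  # relative error in %
        "MSE":  np.mean(r ** 2),
        "RMSE": np.sqrt(np.mean(r ** 2)),
    }

y = np.array([1.0, 2.0, 4.0])
y_hat = np.array([1.1, 1.8, 4.3])
print(error_measures(y, y_hat))
```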
Another commonly used statistic is the coefficient of determination $R^2$, which is a measure of how well the metamodel is able to capture the variability in the dataset.

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (4.41) $$
where $n$ is the number of design points and $\bar{y}$, $\hat{y}_i$, and $y_i$ represent the mean, the predicted, and the actual response as defined in Figure 4.13. $R^2 \in [0, 1]$, where 1.0 indicates a perfect fit. However, a high $R^2$ value can be deceiving if it is due to overfitting, which implies that the model will have poor prediction capabilities between the fitting points. Another occasion when the $R^2$ value can be misleading is when the response is insensitive to the studied variables, i.e. when the metamodel equals the mean value of the observed responses. In this case $R^2$ will be close to 0 even for a well-fitted model.
Figure 4.13 Illustration of the mean ($\bar{y}$), predicted ($\hat{y}_i$), and actual ($y_i$) response for a metamodel in one variable.
Some metamodels interpolate the dataset, which means that there are no residuals and that $R^2$ equals 1.0. For the deterministic simulation case without random error or numerical noise, this is ideal. However, there is no guarantee that these interpolating metamodels predict the response between the known points better than other models. Furthermore, when numerical noise is present, it can be beneficial to filter the response by using an approximating model.
Often, it is more interesting to estimate how well the metamodel can predict responses at unknown design points, i.e. to study the prediction error, than to study the fitting error. This can be done by evaluating the error measures mentioned above for a set of points that are not used to fit the model. It is essential that the validation set is large enough and spread over the design domain to provide a reliable picture of the accuracy. It is also important that the points in the validation set are not placed too close to the fitting points, since that can lead to an over-optimistic evaluation of the metamodel (Iooss et al., 2010).
omitted points. This approach is computationally more expensive than the p-fold CV. However, for the special case when $k = 1$, called leave-one-out CV, an estimation of the prediction error can be inexpensively computed for some metamodels, e.g. polynomial (Myers et al., 2008), Kriging (Martin and Simpson, 2004), and RBF models (Goel and Stander, 2009). The generalization error, i.e. the prediction error, for a leave-one-out calculation when the error is described by the MSE is represented by

$$ GE = \frac{1}{n} \sum_{i=1}^{n} e_i^2 = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i^{(-i)} \right)^2 \qquad (4.42) $$

where $\hat{y}_i^{(-i)}$ represents the prediction at $\mathbf{x}_i$ using the metamodel constructed utilizing all sample points except $(\mathbf{x}_i, y_i)$, e.g. see Forrester and Keane (2009).
The vector of leave-one-out errors needed to estimate the prediction error for a Kriging model fitted to all $n$ points can be evaluated as

$$ \mathbf{e} = \mathbf{Q} \, \mathbf{R}^{-1} (\mathbf{y} - \mathbf{X}\mathbf{b}) \qquad (4.43) $$

where $\mathbf{R}$ is the correlation matrix, $\mathbf{y}$ is the vector of observed responses, $\mathbf{b}$ is the vector of estimated regression coefficients, $\mathbf{X}$ is the model matrix, and $\mathbf{Q}$ is a diagonal matrix with elements that are the inverse of the diagonal elements of $\mathbf{R}^{-1}$.
For an RBF metamodel of the form $\hat{y}(\mathbf{x}) = b + \sum_{j=1}^{n_{RBF}} w_j f_j(\mathbf{x})$, the vector of leave-one-out errors can be evaluated as

$$ e_i = \frac{(\mathbf{P}\mathbf{y})_i}{P_{ii}} \qquad (4.44) $$

where $\mathbf{y}$ is the vector of observed responses and $\mathbf{P}$ is the projection matrix, which is defined by

$$ \mathbf{P} = \mathbf{I} - \mathbf{F} \left( \mathbf{F}^T \mathbf{F} + \boldsymbol{\Lambda} \right)^{-1} \mathbf{F}^T \qquad (4.45) $$

$\mathbf{F}$ is the design matrix constructed using the responses of the radial functions at the design points such that $F_{i1} = 1$ and $F_{i,j+1} = f_j(\mathbf{x}_i)$, $i = 1, \ldots, n$ and $j = 1, \ldots, n_{RBF}$. $\boldsymbol{\Lambda}$ is a diagonal matrix, where $\Lambda_{ii}$, $i = 1, \ldots, n_{RBF}$, is the regularization parameter associated with the $i$th weight, as briefly mentioned at the end of Section 4.3.3.
It has been shown by Meckesheimer et al. (2002) that $k = 1$ in leave-$k$-out CV, i.e. leave-one-out, provides a good prediction error estimate for RBF and low-order polynomial metamodels. For Kriging models, the recommendation is instead to choose $k$ as a function of the sample size, e.g. $k = 0.1n$ or $k = \sqrt{n}$.
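A brute-force leave-one-out estimate of the prediction error, Equation (4.42), can be written generically for any metamodel with a fit-and-predict interface, as sketched below. For the model types mentioned above, the analytical shortcuts of Equations (4.43)-(4.45) avoid the n refits.

```python
import numpy as np

def leave_one_out_mse(fit, X, y):
    """Equation (4.42) by refitting: `fit(X, y)` must return a predictor
    that maps an (m, k) array of points to m predicted responses."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = fit(X[keep], y[keep])            # refit without point i
        errors[i] = y[i] - model(X[i:i + 1])[0]  # error at left-out point
    return np.mean(errors ** 2)

# Usage with any of the earlier fitting sketches, for example:
# loo = leave_one_out_mse(lambda X, y: fit_rbf(X, y, c=0.3), X, y)
```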
The leave-one-out CV is a measure of how sensitive the metamodel is to lost information
at its data points. An insensitive metamodel is not necessarily accurate and an accurate
4.4.3 Generalized Cross Validation and Akaike's Final Prediction Error

The generalized cross validation (GCV) error and Akaike's final prediction error (FPE) are given by

$$ GCV = \frac{MSE}{\left( 1 - \frac{\nu}{n} \right)^2} \qquad (4.46) $$

and

$$ FPE = MSE \, \frac{1 + \frac{\nu}{n}}{1 - \frac{\nu}{n}} = MSE \, \frac{n + \nu}{n - \nu} \qquad (4.47) $$
respectively (Stander et al., 2010). n is the number of fitting points, which should be large,
and ν is the number of (effective) model parameters. In the original forms, valid for linear
or unbiased models without regularization, ν is the number of model parameters.
Otherwise, e.g. for neural network models, ν should be the number of effective model
parameters, which could be estimated in different ways.
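Both measures are cheap to evaluate once the fitting residuals are available. A minimal sketch implementing (4.46)-(4.47); the residuals and the value of ν in the example are arbitrary illustration values:

```python
import numpy as np

def gcv_fpe(residuals, nu):
    """GCV and FPE error estimates from Eqs. (4.46)-(4.47).

    residuals: fitting errors at the n sample points.
    nu: number of (effective) model parameters.
    """
    n = len(residuals)
    mse = np.mean(np.asarray(residuals) ** 2)
    gcv = mse / (1.0 - nu / n) ** 2
    fpe = mse * (n + nu) / (n - nu)
    return gcv, fpe

print(gcv_fpe([0.1, -0.2, 0.05, 0.15, -0.1, 0.02, -0.07, 0.12], nu=3))
```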
Many of the stochastic optimization algorithms are population-based, i.e. a set of design
points searches for the optimum solution. These algorithms are particularly suitable for
multi-objective optimization problems since a set of Pareto optimal points can be found in
one single optimization run. One popular algorithm for solving MOO problems is the non-
dominated sorting genetic algorithm (NSGA-II) developed by Deb et al. (2002).
Even if the stochastic optimization algorithms are associated with some drawbacks, they
are often a suitable choice for the MDO process presented in Chapter 6. Some of these
methods will therefore be presented in more detail in subsequent sections.
Since the different optimization algorithms have different benefits, hybrid optimization
algorithms can be used in which the merits of different methods are taken advantage of.
One example can be to initially use a stochastic optimization algorithm to find the vicinity
of the global optimum, and then use a gradient-based algorithm to identify the optimum
with greater accuracy.
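A sketch of such a hybrid strategy, here using SciPy's differential evolution followed by a gradient-based refinement; the Rastrigin-type test function stands in for a metamodel and is an assumption of this sketch:

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize

# Multimodal test function (Rastrigin-type); stands in for a metamodel
def f(x):
    return np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x)) + 10.0 * len(x)

bounds = [(-5.12, 5.12)] * 3

# Stage 1: stochastic global search to locate the region of the optimum
coarse = differential_evolution(f, bounds, maxiter=100, seed=1)

# Stage 2: gradient-based refinement from the best point found
fine = minimize(f, coarse.x, method="L-BFGS-B", bounds=bounds)
print(coarse.x, fine.x, fine.fun)
```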
[Figure: the general cycle of an evolutionary algorithm: after initialisation of the population, parents are selected, offspring are created through recombination and mutation, survivors are selected, and the loop repeats until termination.]
Different variants of evolutionary algorithms follow the same basic cycle. They differ only
in details related to a number of components, procedures, and operators that must be
specified in order to define a particular EA (a minimal code sketch follows the list):
1. Representation
The candidate solutions are defined by a set of design variable settings and possibly
additional information. These so-called genes must be represented in some way for
the EA, for example by a string of binary code, a string of integers, or a string of
real numbers.
2. Fitness function
The fitness function assigns a quality measure to the candidate solutions. This is nor-
mally the objective function or a simple transformation of it. If penalty functions are
used to handle constraints, the fitness is reduced for unfeasible solutions.
3. Population
A set of individuals, or candidate designs, forms a population. The number of indi-
viduals within the population, i.e. the population size, needs to be defined.
4. Parent selection mechanism
The role of parent selection is to distinguish among individuals based on their
quality and to allow the better ones to become parents in the next generation. This
selection is usually probabilistic so that high-quality individuals have a higher chance
of becoming parents than those with low quality. Nevertheless, low-quality indi-
viduals often still have a small chance of being selected, which prevents the
algorithm from being trapped in a local optimum.
5. Variation operators
The role of variation operators is to create new individuals (offspring) from old
ones (parents), i.e. generate new candidate designs. Recombination is applied to two
or more selected candidates and results in one or more new candidates. Mutation is
applied to one candidate and results in one new candidate. Both operators are
stochastic and the outcome depends on a series of random choices. Several different
versions exist for the various representations.
6. Survivor selection mechanism
The role of survivor selection is to select, based on their quality, the individuals that
should form the next generation. Survivor selection is often deterministic, for
instance ranking the individuals and selecting the top segment from parents and
offspring (fitness biased) or selecting only from the offspring (age biased).
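A minimal GA sketch tying the six components together; the bit lengths, tournament parent selection, and the toy fitness function are illustrative assumptions rather than recommended settings:

```python
import numpy as np
rng = np.random.default_rng(2)

N_BITS, POP, GENS, P_MUT = 15, 30, 60, 1.0 / 15

def decode(bits):
    """Representation: map a 15-bit string to three variables in [0, 1]."""
    genes = bits.reshape(3, 5)
    return genes @ (2.0 ** np.arange(4, -1, -1)) / 31.0

def fitness(bits):
    """Fitness function (toy): maximize closeness to the point 0.3."""
    x = decode(bits)
    return -np.sum((x - 0.3) ** 2)

pop = rng.integers(0, 2, size=(POP, N_BITS))          # population init
for _ in range(GENS):
    fit = np.array([fitness(ind) for ind in pop])
    # Parent selection: binary tournaments (probabilistic, quality-biased)
    idx = rng.integers(0, POP, size=(POP, 2))
    parents = pop[np.where(fit[idx[:, 0]] > fit[idx[:, 1]],
                           idx[:, 0], idx[:, 1])]
    # Recombination: 1-point crossover on consecutive parent pairs
    children = parents.copy()
    for i in range(0, POP - 1, 2):
        cut = rng.integers(1, N_BITS)
        children[i, cut:], children[i + 1, cut:] = \
            parents[i + 1, cut:].copy(), parents[i, cut:].copy()
    # Mutation: bit-flip with probability P_MUT per gene
    flip = rng.random(children.shape) < P_MUT
    children = np.where(flip, 1 - children, children)
    pop = children                    # survivor selection (age biased)

best = max(pop, key=fitness)
print("best design:", decode(best))
```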
Figure 4.15 Typical variation operators used in a simple GA for a three variable design in a binary string representation: a) recombination using 1-point crossover, and b) bit-flip mutation.
Evolution strategies (ESs) also belong to the EA family and were developed by I.
Rechenberg and H.-P. Schwefel, see Beyer and Schwefel (2002). In the original ES
algorithm, one parent individual is subjected to mutation to form one offspring and the
better of these two individuals is chosen to form the next generation. Development of the
method has since led to more complex algorithms. General ESs have a real-valued
representation, random parent selection, and mutation as the primary operator for
generating new candidate solutions. After creating λ offspring and calculating their fitness,
the best μ are chosen deterministically, either from the offspring only, called (μ, λ)
selection, or from the union of parents and offspring, called (μ + λ) selection. Often (μ, λ)
selection is preferred, especially if local optima exist. The value λ is generally much
higher than the value μ; a ratio of 1 to 7 is recommended by Eiben and Smith (2003). Most
ESs are self-adaptive, which means that some strategy parameters, e.g. the mutation step
sizes, are included in the representation and evolve together with the design variables.
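A minimal (μ, λ) ES sketch with self-adapted step sizes; the population sizes follow the 1 to 7 ratio mentioned above, while the objective function, learning rate τ, and initial step size are illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(3)

MU, LAM, GENS, DIM = 4, 28, 80, 3     # lambda/mu = 7 (Eiben and Smith)

def f(x):                              # toy objective to minimize
    return np.sum((x - 0.5) ** 2) + 0.1 * np.sin(20 * x).sum()

# Each individual carries its design variables and its own step size
x = rng.uniform(size=(MU, DIM))
sigma = np.full(MU, 0.3)
tau = 1.0 / np.sqrt(DIM)

for _ in range(GENS):
    parents = rng.integers(0, MU, size=LAM)       # random parent selection
    # Self-adaptation: mutate the step size, then the design variables
    s = sigma[parents] * np.exp(tau * rng.normal(size=LAM))
    xo = x[parents] + s[:, None] * rng.normal(size=(LAM, DIM))
    # (mu, lambda) selection: best mu chosen from the offspring only
    best = np.argsort([f(xi) for xi in xo])[:MU]
    x, sigma = xo[best], s[best]

print("best design:", x[0], "objective:", f(x[0]))
```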
Figure 4.16 Typical variation operators used in an ES for a three variable design in a real valued representation: a) discrete and intermediary recombination, and b) mutation by Gaussian perturbation.
A comparison between GAs and ESs is presented in Table 4.2. In general, GAs are con-
sidered more likely to find the global optimum, while ESs are considered faster. A general
recommendation is therefore to use a GA if it is important to find the global optimum,
while an ES should be used if speed is important and a "good enough" solution is accep-
table. However, the results depend on the algorithm settings and the problem at hand, and
there is no generally accepted conclusion on the superiority of any of the algorithms.
Instead, the merits of both algorithms can be taken advantage of if they are used
together (Hwang and Jang, 2008).
Table 4.2 Overview of typical features of genetic algorithms and evolution strategies
according to Eiben and Smith (2003).
Constraints are often enforced by using penalty functions that reduce the fitness of un-
feasible solutions. Preferably, the fitness is reduced in proportion to the number of con-
straints that are violated. A good idea is often also to reduce the fitness in proportion to the
distance from the feasible region. The penalty functions are sometimes set so large that
unfeasible solutions will not survive. Occasionally the penalty functions are allowed to
change over time and even adapt to the progress of the algorithm. There are also other
techniques to handle constraints. One of them is to use a repair function that modifies an
unfeasible solution into a feasible one.
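A sketch of a penalty-based fitness of the kind described above, reducing the fitness in proportion to both the number of violated constraints and the distance from the feasible region; the constraint form g(x) ≤ 0 and the scale factor are assumptions of this sketch:

```python
import numpy as np

def penalized_fitness(obj, constraints, x, scale=100.0):
    """Fitness of a design x to be maximized, reduced for unfeasible
    solutions. Constraints are assumed on the form g(x) <= 0."""
    g = np.array([c(x) for c in constraints])
    violation = np.maximum(g, 0.0)             # zero when feasible
    n_violated = np.count_nonzero(violation)
    return obj(x) - scale * (n_violated + violation.sum())

# Usage: maximize obj subject to x0 + x1 <= 1 and x0 >= 0.2
obj = lambda x: x[0] + 2.0 * x[1]
cons = [lambda x: x[0] + x[1] - 1.0, lambda x: 0.2 - x[0]]
print(penalized_fitness(obj, cons, np.array([0.5, 0.8])))
```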
$$\mathbf{x}_i^{q+1} = \mathbf{x}_i^q + \mathbf{v}_i^{q+1}\,\Delta t \qquad (4.48)$$
where i refers to the i th particle in the swarm, q to the q th iteration and v i q to the velocity.
The time increment ∆t is usually set to one and the velocity vector is updated in each
iteration using
$$\mathbf{v}_i^{q+1} = w\,\mathbf{v}_i^q + c_1 r_1\,\frac{\mathbf{p}_i - \mathbf{x}_i^q}{\Delta t} + c_2 r_2\,\frac{\mathbf{p}_g - \mathbf{x}_i^q}{\Delta t} \qquad (4.49)$$
where w is the inertia parameter, r1 and r2 are random numbers between 0 and 1, c1 and c2
are the trust parameters, pi is the best point found so far by the i th particle, and pg is the
best point found by the swarm. Thus, the user needs to select or tune the values of w, c1,
and c2, and decide on the number of particles in the swarm, as well as how many iterations
should be performed. The inertia parameter w controls the search behaviour of the
algorithm. Larger values (around 1.4) result in a more global search, while smaller values
(around 0.5) result in a more local search (Venter, 2010). The c1 trust parameter indicates
how much the particle trusts itself, while c2 specifies how much the particle trusts the
group. Recommended values are c1 = c2 = 2 (Venter, 2010). Finally, pg can be selected to
represent either the best point in a small subset of particles or the best point in the whole
swarm.
The original PSO algorithm has been developed and enhanced, and different versions have
been applied to different types of optimization problems. Constraints can be handled by
penalty methods, as described in the previous section. Another simple approach is to use
strategies that preserve feasibility. Hu et al. (2003) describe a method where each particle
is initialized repeatedly until it satisfies all the constraints and where the particles then
search the whole space but only keep the feasible solutions in their memory.
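A basic PSO sketch implementing (4.48)-(4.49) with Δt = 1; c1 = c2 = 2 follows the recommendation above, while w = 0.7, the swarm size, and the clipping of particles to the bounds are illustrative assumptions:

```python
import numpy as np
rng = np.random.default_rng(4)

def pso(f, bounds, n_particles=20, iters=100, w=0.7, c1=2.0, c2=2.0):
    """Basic particle swarm optimization of f over box bounds."""
    lo, hi = np.array(bounds, dtype=float).T
    x = rng.uniform(lo, hi, size=(n_particles, len(lo)))
    v = np.zeros_like(x)
    p = x.copy()                                  # personal best points
    pf = np.array([f(xi) for xi in x])            # personal best values
    g = p[np.argmin(pf)]                          # swarm best point
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)   # Eq. (4.49)
        x = np.clip(x + v, lo, hi)                          # Eq. (4.48)
        fx = np.array([f(xi) for xi in x])
        better = fx < pf
        p[better], pf[better] = x[better], fx[better]
        g = p[np.argmin(pf)]
    return g, f(g)

print(pso(lambda x: np.sum(x ** 2), [(-5, 5)] * 2))
```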
2. Sampling
A new sampling point x' ∈ X is selected using a sampling distribution D(X^(q)), and
the corresponding energy E' = E(x') is calculated. The set of checked points
X^(q+1) = X^(q) ∪ {x'} now contains q + 2 designs.
3. Acceptance check
A random number ζ is sampled from the uniform distribution [0, 1] and the new
design is accepted according to

$$\mathbf{x}^{(q+1)} = \begin{cases} \mathbf{x}' & \text{if } \zeta \le A\left(E', E^{(q)}, T^{(q)}\right) \\ \mathbf{x}^{(q)} & \text{otherwise} \end{cases} \qquad (4.50)$$

where A is the acceptance function that determines if the new point is accepted. The
most commonly used acceptance function is the Metropolis criterion

$$A\left(E', E^{(q)}, T^{(q)}\right) = \min\left\{1,\; \exp\left(-\frac{E' - E^{(q)}}{T^{(q)}}\right)\right\} \qquad (4.51)$$
4. Temperature update
The cooling schedule T (q+1) = C (X (q+1), T (q)) is applied to the temperature. It has
been proven that a global minimum will be obtained if the cooling is made
sufficiently slowly (Geman and Geman, 1984).
5. Convergence check
The search is ended if the stopping criterion is met, otherwise q = q + 1 and the
search continues at step 2. Usually, the search is stopped when there is no noticeable
improvement over a number of iterations, when the number of iterations has reached
a predefined value, or when the temperature has fallen to a desired level.
It is obvious that the efficiency of the algorithm depends on the appropriate choices of the
mechanisms to generate new candidate states D, the cooling schedule C, the acceptance
criterion A, and the stopping criterion. The choices of D and C are generally the most im-
portant issues in defining an SA algorithm and they are strongly interrelated. The next
candidate design x' is usually selected randomly in the neighbourhood of the current
design x, with the same probability for all neighbours. The size of the neighbourhood is
typically selected based on the idea that the algorithm should have more freedom when the
current energy is far from the global optimum. Larger step sizes are therefore allowed
initially. However, a more complicated, non-uniform selection procedure is used in adap-
tive simulated annealing to allow much faster cooling rates (Stander et al., 2010). The
basic idea of the cooling schedule is to start at a high temperature and then gradually drop
the temperature to zero. The primary goal is to quickly reach a temperature with low
energies, but where it is still possible to explore different areas of the design space.
Thereafter, the SA algorithm lowers the temperature slowly until the system freezes and
no further changes occur.
Simulated annealing algorithms generally handle constraints by penalty methods similar to
the ones described in Section 4.5.1, i.e. the energy for unfeasible solutions is increased so
that the probability of selecting such designs is reduced.
Hill-climbing is a very simple local optimization algorithm where new candidate designs
are iteratively tested in the region of the current design and adopted if they are better. This
enables the algorithm to climb uphill until a local maximum is found. A similar technique
could be used to find a local minimum. Simulated annealing differs from these simple
algorithms in that new candidate solutions can be chosen with a certain probability even if
they are worse than the previous one, i.e. have higher energy. A new worse solution is
more likely to be chosen early in the search when the temperature is high and if the dif-
ference in energy is small. Simulated annealing hence goes from being similar to a random
search initially, with the aim of finding the region of the global optimum, to being very
similar to "Hill-climbing" in order to locate the minimum more exactly.
5
Multidisciplinary Design Optimization
Historically, the roots of multidisciplinary design optimization can be found in structural
optimization, mainly within the aerospace industry (Agte et al., 2010). Disciplines
strongly interacting with structural parts were first included in the optimization problem,
making it multidisciplinary. The development has then been heading towards incorpo-
rating whole systems in the MDO studies, i.e. also including design variables important
for other disciplines than the structural ones.
The development of MDO can be described in terms of three generations (Kroo and
Manning, 2000), see Figure 5.1. Initially, when the problem size was limited, all discip-
linary analyses were integrated directly with an optimizer. To be able to perform more
extensive MDO studies, analyses were distributed in the second generation of MDO
methods. The first two generations are so-called single-level optimization methods, i.e.
they rely on a central optimizer to coordinate the optimization and make all design
decisions. When MDO was applied to even larger problems involving several departments
within a company, it was found impractical to rely on a central optimizer as the only
decision-maker. The third generation of MDO methods therefore consists of the so-called
multi-level optimization methods, where the optimization process as such is distributed.
The distribution of decisions more closely resembles the standard product development
process where different groups are responsible for the development of different parts and
aspects of the product.
Figure 5.1 The three generations of MDO methods: a) single-level optimization with integrated analyses, b) single-level optimization with distributed analyses, and c) multi-level optimization.
As the first generation of MDO methods is unsuitable for large-scale applications, single-
level methods will refer to the second generation of MDO methods in the following.
Several multi-level optimization methods have been proposed, and the most well-known
ones are briefly presented here. More detailed descriptions can be found in e.g.
Ryberg et al. (2012).
Concurrent subspace optimization (CSSO) was first introduced by J. Sobieszczanski-
Sobieski (1988). It is an iterative approach that starts with finding a multidisciplinary
consistent solution for the initial design. The main idea is then to distribute each shared
variable to the subspace whose objective and constraint functions it affects the most. Next,
each subspace is optimized with respect to its local variables and a subset of the shared
variables, while all other variables are held constant. The formulation of each subspace
optimization problem includes minimization of the objective function subject to one
combined local constraint and to approximations of the combined constraints from the
other subspaces. The responsibility for fulfilling the constraints is thus shared between all
the subspace optimizers, and the distribution of the responsibility is governed by a system
coordinator. The new design point is simply the combination of optimized variables from
the different subspaces, and this point is not necessarily feasible (Pan and Diaz, 1989).
Finally, the system coordinator redistributes the responsibility for the different constraints
for the next iteration and the process is continued until convergence is reached.
Unfortunately, the method can experience convergence issues and may fail to solve some
simple problems (Shankar et al., 1993).
A more recent version of CSSO was developed by Renaud and Gabriele (1991, 1993,
1994) in a series of articles, and many subsequent approaches are based on their work. In
their formulation, the subspace optimizations are followed by the solution of an app-
roximation of the global optimization problem. This approximation is constructed around
the combination of optimized variables from the different subspaces. The obtained
optimum is then the design vector input to the next iteration. All variables are conse-
quently dealt with at the system level. This restricts the autonomy of the groups res-
ponsible for each subspace, which is the main motivation for using a multi-level method.
Bilevel integrated system synthesis (BLISS) was first presented by J. Sobieszczanski-
Sobieski et al. (1998) and the method has some similarities with CSSO. The original
implementation concerns four coupled subspaces of a supersonic business jet: structures,
aerodynamics, propulsion, and aircraft range. The method is iterative and optimizes the
design in two main steps. First, subspace optimizations are performed in parallel with
respect to the local variables subject to local constraints. Next, the system optimizer finds
the best design with respect to the shared variables. A linear approximation of the global
objective function is constructed and split into objectives for the system and subspace
optimization problems. Normally, the system optimization problem is considered to be an
unconstrained problem. However, if the constraints in the subspace optimizations depend
more strongly on the shared and coupling variables than on the local variables, they might
need to be included in the system optimization, turning the system optimization problem
into a constrained one. The BLISS procedure separates the optimization with respect to
the local and shared variables, and sometimes a different solution can be obtained
compared to the case when all variables are optimized simultaneously (Kodiyalam and
Sobieszczanski-Sobieski, 2000).
A reformulation of the original method, referred to in the literature as BLISS 2000, or
simply BLISS, was presented by Sobieszczanski-Sobieski et al. (2003). The key concept
in BLISS 2000 is the use of surrogate models to represent optimized subspaces. To create
these surrogate models, a DOE is created and a number of subspace optimization prob-
lems are solved with respect to the local variables. In each subspace optimization problem,
the sum of the coupling variables output from that specific subspace multiplied with
weighting coefficients is minimized subject to local constraints. The resulting surrogate
models represent the coupling variables output from each subspace as functions of the
shared variables, the coupling variables input to that subspace, and the weighting coef-
ficients. Polynomial surrogate models are used in the original version of BLISS 2000, but
each subspace could, in principle, be given the freedom to choose their own surrogate
model. The system optimizer uses the surrogate models to minimize the global objective
subject to consistency constraints. The BLISS 2000 formulation was developed to handle
coupled subspaces and is not relevant for problems lacking coupling variables since the
subspace objective functions then no longer exist.
An early description of Collaborative optimization (CO) was published by Kroo et al.
(1994), and it was further refined by Braun (1996). Collaborative optimization can handle
coupling variables and has mainly been used for aerospace applications. In CO, the system
optimizer is in charge of target values of the shared and coupling variables. The subspaces
are given local copies of these variables, which they have the freedom to change during
the optimization process. The local copies converge towards the target values at optimum,
i.e. a consistent design is obtained. The system optimizer minimizes the global objective
function subject to constraints that ensure a consistent design. The subspace optimizers
minimize the deviation from consistency subject to local constraints. There are a number
of numerical problems associated with CO when used in combination with gradient-based
algorithms (DeMiguel and Murray, 2000; Alexandrov and Lewis, 2002). These problems
hinder convergence proofs and have an unfavourable effect on the convergence rate.
A number of attempts to modify the CO formulation in order to overcome the numerical
difficulties are documented in the literature. These include the introduction of polynomial
surrogate models to represent the subspace objective functions (Sobieski and Kroo, 2000),
modified collaborative optimization (DeMiguel and Murray, 2000), and enhanced colla-
borative optimization (Roth, 2008). In enhanced collaborative optimization (ECO), the
goal of the system optimizer is to find a consistent design. There are no constraints at the
system level, which makes the system optimization problem easy to solve. The objective
functions of the subspaces contain the global objective in addition to measures of the
deviation from consistency. It is intuitively more appealing for the subspaces to work
towards minimizing a global objective, rather than towards minimizing a deviation from
consistency as is done in the original CO formulation. The subspaces are subject to local
constraints as well as to linearized versions of the constraints in the other subspaces. The
inclusion of the latter constraints provides a direct understanding of the preferences of the
other subspaces, unlike CO, where this knowledge is only obtained indirectly from the
system optimizer. However, the complexity of ECO is a major drawback.
Analytical target cascading (ATC) was developed for automotive applications (Kim,
2001). It was originally intended as a product development tool for propagating targets,
i.e. convert targets on the overall system to targets on smaller parts of the system. How-
ever, it can also be used for optimization if the targets are unattainable. Analytical target
cascading was established for an arbitrary number of levels, but a formulation with two
levels, as in the previously presented methods, is also possible. The objective functions of
the optimization problems on all levels consist of terms that minimize the unattainability
of the target and terms that ensure consistency (Michalek and Papalambros, 2005b). In
each of these problems, there are local variables and local constraints. Shared and
coupling variables are handled in a fashion similar to CO, i.e. an upper level optimizer has
target values and the lower level optimizers have local copies of these variables.
Normally, consistency is obtained by including the square of the L2-norm of the deviation
from consistency multiplied by penalty weights in the objective functions. In this for-
mulation, it is important to choose the penalty weights appropriately (Michalek and
Papalambros, 2005a). Too small weights can yield solutions far from the solution of the
original problem, and too large weights can cause numerical problems. Other types of
penalty functions have therefore been proposed in the literature, see e.g. Tosserams et al.
(2006).
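Written out, the quadratic consistency penalty mentioned above is commonly expressed as

$$\pi(\mathbf{c}) = \left\lVert \mathbf{w} \circ \left(\mathbf{t} - \mathbf{r}\right) \right\rVert_2^2$$

where t contains the target values set by the upper level, r the corresponding lower-level responses (local copies), w the penalty weights, and ∘ denotes the elementwise product; the notation follows common ATC conventions, cf. Michalek and Papalambros (2005b).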
Multi-level methods can also suffer from convergence difficulties when combined with
gradient-based algorithms. Furthermore, it is not unusual that multi-level MDO methods
require more computational resources than single-level methods.
When performing optimization of automotive structures, it is often considered necessary
to use metamodels to reduce the required computational effort. It is possible to use meta-
models in both single-level and multi-level MDO methods. However, the effects are more
interesting and important when metamodels are used in combination with single-level
methods. In both cases, the effect from computational efficiency is obtained, but in
addition, the drawback of limited autonomy and concurrency associated with single-level
methods can be relieved by the introduction of metamodels. In metamodel-based design
optimization, the main computational effort is spent on building the metamodels. During
this process, the groups involved can work concurrently and autonomously using their pre-
ferred methods and tools. The issue of not participating in design decisions when using
single-level methods can also partly be compensated by involving the different groups in
the setup of the optimization problem and in the assessment of the results. However, since
the optimization is done on a system level, the individual groups cannot govern the choice
of optimization methods and tools. In addition, groups that have inexpensive simulations
either have to create metamodels and introduce an unnecessary source of error, or let the
central optimizer call their analyzers directly and give up more of their autonomy. Despite
these drawbacks, a single-level method in combination with the use of metamodels is
often the most convenient way to solve MDO problems for automotive structural
applications that involve computationally expensive simulations. The complexity of the
multi-level methods and lack of readily available software make them a less attractive
alternative.
6
An MDO Process for Automotive Structures
Multidisciplinary design optimization is not yet a standard methodology for automotive
structures, but there are obvious benefits if this could be achieved. In order to implement
MDO into the product development process, there are a number of aspects that need to be
considered, both related to the characteristics of the problems and the implementation of
the MDO method itself. An MBDO approach based on global metamodels turns out to be
an appropriate methodology for MDO of automotive structures. Its suitability and
efficiency for general automotive structural problems can clearly be demonstrated in a test
problem, which resembles a typical MDO problem from the automotive industry.
6.1 Requirements
To solve a large-scale automotive MDO problem generally involves several groups within
a company. A process that fits the company organization and product development
process is therefore needed for MDO to be a part of the daily work. It is also important to
make use of the existing expertise within the company, i.e. let the experts take part in the
design decisions and let them use their own methods and tools. In addition, MDO studies
must be realized in a reasonable time in order to fit the product development process.
Altogether, this can be summarized in a first requirement: the MDO process must fit the
existing organization and product development process and make use of the expertise
within the company.
The MDO process also needs to be computationally efficient. It is important to limit the
time needed for the study and not exceed the available computer resources. Multi-
disciplinary design optimization requires evaluations of many different variable settings
for all the included loadcases, and the detailed simulation models are commonly
computationally expensive to evaluate. Consequently, it is often efficient to use meta-
models in the optimization process in order to reduce the required computational
resources. Hence, a second requirement is that the process must be computationally
efficient, e.g. through the use of metamodels.
To solve an MDO problem that involves coupling variables is significantly more comp-
licated than if only shared variables are considered. Since coupling variables are rare when
solving MDO problems for automotive structures they can be neglected in the proposed
process. Consequently, the proposed process does not need to handle coupling variables.
The nature of MDO studies can differ substantially, e.g. be extensive or limited depending
on the number of variables and loadcases, and have one or several objectives. Variations
are always present in reality, and there is an interest to consider uncertainties in the design
also during optimization. It is important that the process can handle as many of these
variants as possible. Thus, the process must be flexible, e.g. able to handle multiple
objectives and to incorporate robustness considerations.
The main driver for implementing a multi-level optimization method is to distribute the
design process (Kroo and Manning, 2000). However, multi-level methods complicate the
solution process and to justify their use, the advantages must be greater than the cost. It
was concluded in Section 5.3 that a single-level method in combination with the use of
metamodels is the most straightforward way of solving automotive MDO problems with-
out coupling variables. The drawback of not making design decisions can be relieved by
involving the different groups in the setup of the optimization problem and in the assess-
ment of the results. The benefits of a multi-level method are thus not considered to com-
pensate for the drawbacks. Hence, the proposed process is based on a single-level
optimization method in combination with metamodels.
Figure 6.1 A metamodel-based multidisciplinary design optimization process.
The aim of the described optimization process is to increase the knowledge of the system
so that balanced design decisions can be made, which result in improved product perfor-
mance. However, there are a number of prerequisites for the MDO study to be successful.
Firstly, stable simulation models that accurately capture the behaviour of the product must
be available. A stable model can be run without failures when the variables are changed
within the design space. Additionally, the variations in output are mainly caused by
changes in input and are not due to numerical noise. It is also essential to have a thorough
understanding of the simulation models in order to know their limitations and be able to
assess the obtained results. Moreover, it is important to be aware of the simulation budget
to make a decision regarding the amount of simulations that can be afforded in the
different steps of the process. The six steps of the process are:
Step 1: Setup
First, the scope of the problem must be specified, i.e. the appropriate
loadcases should be selected and the problem should be mathema-
tically defined. The mathematical definition includes selection of the
objective or objectives; the design variables and the range within
which they are allowed to vary, i.e. the design space; and the perfor-
mance constraints. The objectives and performance constraints are
responses from the simulation models or derived from the responses.
Step 5: Optimization
When the metamodels are found to be satisfactory, the optimization
can be performed. In general, optimization algorithms cannot guaran-
tee that the global optimum is obtained. It is therefore preferable to
use more than one optimization algorithm for each set of meta-
models. Several design proposals are then obtained and can be
studied further.
Step 6: Verification
Based on the optimization results, one or several potential designs can
be selected and verified using the detailed simulation models.
Differences between results from the optimization study and results
from the verification simulations are caused by inaccurate meta-
models or improper selection of variables during the screening.
Sometimes, there are large constraint margins or no feasible design is
found. Then, manual adjustments of the design proposals based on
information from the metamodels and the screening can improve the
results. However, if the discrepancies are large, it might be necessary
to go back and improve the step causing the issues.
Figure 6.2 a) Geometry of the studied structure. b) Loadcases with the measured responses indicated.
When the described process is followed, there are many choices that must be made, e.g.
related to software, screening methods, DOEs, metamodels, and optimization algorithms.
The example presented here is only an illustration of the proposed process and the selec-
tion of methods for other studies should be related to the nature of the considered problem,
rather than a copy of the methods used here. The execution and results for the different
steps are briefly described below, and a more detailed description can be found in Paper II.
The screening is performed using a simple technique in which one variable at a time is set
to the lowest value, while the other variables are left at their nominal values. It is then
possible to identify the variables that influence the different responses by comparing the
obtained results with the results from the nominal design. Here, the selection is done based
on a global sensitivity analysis (see Section 4.2) and a criterion that the difference in result
compared to the nominal design must not be too large. In this way, the number of
variables is reduced from the original 25 to 15, 7, 11, and 12, respectively, for the four dif-
ferent loadcases. It is noted that three of the variables are not considered to be important
for any loadcase and these variables are therefore set to their minimum values for the rest
of the study.
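A sketch of the one-at-a-time screening described above; the simulate function and the threshold criterion are hypothetical stand-ins for the detailed FE analyses and the actual selection criterion used in the study:

```python
import numpy as np

def oat_screening(simulate, nominal, lows, threshold):
    """One-at-a-time screening: set each variable to its lowest value in
    turn, keep the others nominal, and flag variables whose effect on
    the response exceeds a threshold."""
    y_nom = simulate(nominal)
    important = []
    for i in range(len(nominal)):
        x = nominal.copy()
        x[i] = lows[i]                        # perturb one variable only
        if abs(simulate(x) - y_nom) > threshold:
            important.append(i)
    return important

# Toy response: only variables 0 and 2 matter
simulate = lambda x: 3.0 * x[0] + 0.01 * x[1] + 2.0 * x[2]
nominal, lows = np.ones(4), 0.7 * np.ones(4)
print(oat_screening(simulate, nominal, lows, threshold=0.1))
```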
Among the different space-filling algorithms available, a maximin design is chosen (see
Section 4.1). The sample size is initially set to 3n i , where i represents the different load-
cases and n i is the number of variables for loadcase i. Based on the estimated accuracy of
the metamodels, it is judged whether or not a larger DOE sample size is needed for the
different loadcases. Additional design points are then added sequentially in groups of n i ,
up to a maximum of 6n i design points for loadcase i. The final number of simulations for
the different loadcases is presented in Table 6.1. The variables not included in the
different DOEs are seen as constants and set to their nominal values.
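A simple greedy maximin construction can illustrate the idea of the space-filling DOE; the candidate-set approach below is an assumption of this sketch and not necessarily the algorithm used in the study:

```python
import numpy as np
rng = np.random.default_rng(6)

def maximin_design(n_points, dim, n_candidates=2000):
    """Greedy maximin design: grow the DOE by always adding the candidate
    that maximizes the minimum distance to the points already chosen."""
    cand = rng.uniform(size=(n_candidates, dim))
    design = [cand[0]]
    for _ in range(n_points - 1):
        dists = np.linalg.norm(
            cand[:, None, :] - np.array(design)[None, :, :], axis=2)
        d_min = dists.min(axis=1)             # distance to nearest point
        design.append(cand[np.argmax(d_min)]) # farthest-from-design point
    return np.array(design)

doe = maximin_design(n_points=3 * 15, dim=15)  # e.g. 3*n_i for n_i = 15
print(doe.shape)
```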
Many different metamodel types can be used to approximate the detailed FE models (see
Section 4.3). First, radial basis function neural networks are chosen. An attractive feature
of these metamodels is that they can easily provide an estimation of the prediction error
(see Section 4.4.2). One metamodel is built for each response with variables selected
according to the screening. The metamodel for the mass is built with all the variables in-
cluded, i.e. from the sum of the initial datasets from all the loadcases. In order to deter-
mine the DOE sample sizes and judge the quality of the metamodels, the fitting error
represented by R 2 and the prediction error represented by RMSECV are studied. The target
is to reach R 2 > 0.95 and RMSE CV < 5% of the mean value. The mass is a linear function
of the variables and a perfect fit is therefore obtained. As can be seen in Figure 6.3, the
error measure targets are achieved for the responses in the roof crush and the modal
analysis loadcases, but not for the other two loadcases. In the hope of improving the
accuracy, feedforward neural networks are also fitted to the same datasets. This
metamodel type often results in a closer fit to the simulated responses, but it provides no
direct information about the prediction error.
[Plot: R2 and RMSECV (% of mean) for each constraint response (intr_mid_front, tx05_mid_front, intr_upper_bending, intr_lower_bending, forc_3_roof, forc_max_roof, freq_m1_modal, freq_m2_modal) at DOE sample sizes from 3ni to 6ni.]
Figure 6.3 Error measures for the RBFNN metamodels representing the constraints.
Table 6.2 Design proposals from optimization studies using RBFNN and FFNN meta-
models. (+) indicates an increased thickness and (-) a decreased thickness. Results from
both metamodels and verification simulations are presented.
Variable     Thickness (mm)
             Min.    Nom.    Max.    RBFNN       FFNN
t_1001 0.70 1.00 1.30 0.80 (-) 0.85 (-)
t_1002 0.70 1.00 1.30 0.75 (-) 0.85 (-)
t_1003 0.70 1.00 1.30 1.20 (+) 0.80 (-)
t_1004 0.70 1.00 1.30 0.80 (-) 0.70 (-)
t_1005 0.70 1.00 1.30 0.70 (-) 0.70 (-)
t_1006 0.70 1.00 1.30 0.75 (-) 1.05 (+)
t_1007 0.70 1.00 1.30 0.70 (-) 0.70 (-)
t_1008 0.70 1.00 1.30 1.15 (+) 0.70 (-)
t_2010 1.40 2.00 2.60 2.60 (+) 2.25 (+)
t_2011 1.40 2.00 2.60 1.40 (-) 1.55 (-)
t_2012 1.40 2.00 2.60 1.75 (-) 2.25 (+)
t_2013 1.40 2.00 2.60 2.20 (+) 2.15 (+)
t_2014 1.90 2.50 3.10 3.05 (+) 1.90 (-)
t_2015 1.40 2.00 2.60 2.20 (+) 2.15 (+)
t_3016 1.40 2.00 2.60 1.40 (-) 1.40 (-)
t_3017 1.90 2.50 3.10 2.95 (+) 2.95 (+)
t_3018 1.40 2.00 2.60 1.40 (-) 2.40 (+)
t_4019 1.40 2.00 2.60 1.40 (-) 1.40 (-)
t_4020 1.40 2.00 2.60 1.40 (-) 1.40 (-)
t_4021 1.40 2.00 2.60 1.40 (-) 1.40 (-)
t_4022 1.40 2.00 2.60 2.00 (-) 2.40 (+)
t_4023 1.40 2.00 2.60 1.80 (-) 2.15 (+)
t_4024 1.40 2.00 2.60 2.60 (+) 1.45 (-)
t_4025 1.40 2.00 2.60 2.55 (+) 2.50 (+)
t_4026 1.40 2.00 2.60 2.60 (+) 2.55 (+)
Response            Unit   Nom.    Req.     RBFNN            FFNN
                                            Pred.   Calc.    Pred.   Calc.
Mass kg 56.96 Min. 56.30 56.30 55.95 55.95
intr_mid_front mm 7.74 < 7.8 7.80 5.37 7.77 7.33
tx05_mid_front ms 11.54 > 11.5 11.51 11.33 11.51 11.68
intr_upper_side mm 25.38 < 25.4 25.11 23.54 25.04 24.28
intr_lower_side mm 25.37 < 25.4 24.40 23.83 25.24 24.12
forc_3_roof kN 44.81 > 44.8 49.77 48.78 48.63 46.88
forc_max_roof kN 57.11 > 57.1 57.34 57.86 57.16 58.86
freq_m1_modal Hz 107.5 > 107 113.8 115.7 115.3 109.3
freq_m2_modal Hz 146.5 > 146 146.4 152.7 146.1 148.6
Mass reduction 7.89% 12.01%
The accuracy of the two sets of metamodels cannot be compared since there is no known
prediction error for the FFNN metamodels. Both design proposals are therefore simulated
using the detailed FE models. It is found that the design with the largest mass reduction
fulfils all the constraints, while the other design violates one of the constraints, see Table
6.2. By studying the deformations and the responses as functions of time, it is concluded
that no undesired behaviour is obtained for the design with 12.0% mass reduction. The
optimization study can thus be considered successful and the design proposal obtained
with the FFNN metamodels resulting in a 12.0% mass reduction is seen as the final result.
No manual adjustments of the design proposal are necessary since all the constraints are
fulfilled. The satisfactory results indicate that no important variables are omitted in the
screening process and that the metamodels are reasonably accurate.
7
Discussion
Metamodels have been used for several years to approximate detailed and computationally
expensive simulation models. In particular, it has been an attractive method in opti-
mization studies where many evaluations of different variable settings are required. His-
torically, low-order polynomials, which have the ability to capture the behaviour of a
limited part of the design space, have often been used in an iterative approach. Today, the
trend goes towards using more advanced metamodels that can capture the behaviour of the
complete design space. If the metamodels are sufficiently accurate, the optimization can
then be performed in a single-stage approach. For these optimization problems, a
stochastic optimization algorithm is often the best choice in order to find the global opti-
mum. The approach with global metamodels is very attractive since it is flexible and can
be used for different kinds of optimization studies. In principle, there is no difference
between performing multidisciplinary design optimization and performing optimization
that takes into account responses from one single loadcase.
Metamodel-based MDO has been investigated by the automotive industry, but has not yet
been implemented within the product development process. According to Duddeck
(2008), the difficulties so far have mainly been related to insufficient computational re-
sources and inadequate metamodel accuracy. However, new metamodels are being deve-
loped and computer capacity is constantly increasing. Therefore, Agte et al. (2010) claim
that the late introduction of MDO is more related to organizational and cultural issues than
to technical barriers. An MDO process as straight-forward as the one proposed in this
thesis is therefore appealing, since it works well in an existing organizational structure and
leaves much of the work and part of the decision freedom to the disciplinary experts.
The success of the defined process depends on the accuracy of the metamodels. Some res-
ponses from the detailed simulation models are complex and difficult to capture, which
can result in inaccurate metamodels. As shown by the presented application example, this
is often the case for responses in front crash loadcases, where phenomena related to axial
buckling can be hard to model. In situations with limited metamodel accuracy, it is often
useful to try different metamodels and optimization algorithms in order to obtain several
different design proposals to choose from. Furthermore, it is shown by the presented
application example that the results can be satisfactory even when the metamodels do not
have the desired accuracy. In addition, if constraints are found to be violated in the
verification simulations, manual adjustments of the design proposals based on information
from the metamodels can often improve the results.
8
Conclusion and Outlook
The aim of this thesis has been to find a suitable method for implementing multi-
disciplinary design optimization in automotive development. The goal has been to develop
an efficient MDO process for large-scale structural applications where different groups
need to work concurrently and autonomously using computationally expensive simulation
models. The presented process can be categorized as a single-level MDO method that uses
global metamodels to achieve autonomy and concurrency for the groups during the exe-
cution of the most computationally demanding steps of the process. The process has been
demonstrated in a simple application example and the results show that:
- The presented process is efficient, flexible, and suitable for common structural
MDO applications in the automotive industry, such as weight optimization
with respect to NVH and crashworthiness.
- The process can easily fit into an existing organization and product develop-
ment process with different groups developing the metamodels in parallel using
the tools of their choice.
- The different groups of disciplinary experts can take part in the design
decisions by participating in the setup of the problem and in the selection of
final designs.
A traditional approach during automotive development has been to find a design that
meets all requirements, i.e. a design that is feasible but not optimal. By incorporating the
described metamodel-based process for multidisciplinary design optimization into the
67
CHAPTER 8. CONCLUSION AND OUTLOOK
product development process, there is a potential for designing better products in a shorter
time.
Even if the process is suitable for implementation today, there are several aspects that can
be studied further. For example, it would be interesting to investigate the implications of
coupled disciplines, i.e. the existence of coupling variables, to broaden the scope of app-
lication. The method is developed to be flexible, with the possibility to incorporate
robustness considerations and multiple objectives. The use of the process for different
types of problems and applications is therefore welcomed.
The success of the proposed method relies on global metamodels that can represent
the complete design space accurately. It is therefore appropriate also to investigate ways to
achieve improved metamodels for complex responses that include e.g. discontinuities.
9
Review of Appended Papers
Paper I
Multidisciplinary design optimization methods for automotive
structures
The aim of the first paper is to find suitable MDO methods for large-scale automotive
structural applications. The paper includes descriptions and a comparison of a number of
MDO formulations presented in the literature. The suitability of these methods is then
assessed in relation to the special characteristics of automotive structural MDO problems.
It is stated that the method should allow the involved groups to work concurrently and
autonomously, must allow the use of metamodels, but does not need to handle coupling
variables. It is found that a single-level method in combination with the use of metamodels
is often the most convenient way of solving MDO problems for automotive structural
applications involving computationally expensive simulations.
Paper II
A metamodel-based multidisciplinary design optimization process
for automotive structures
The aim of the second paper is to describe an MDO process for automotive structures and
consider aspects related to its implementation. The process is developed to fit a normal
product development process of a large automotive company and is based on the findings
from the first paper. The requirements placed on the process are presented, and a process
meeting these requirements is described. The suitability of the process is then
demonstrated in a simple application example. It is concluded that the presented process is
efficient, flexible, and suitable for common structural MDO applications within the
automotive industry. Furthermore, it fits easily into an existing organization and product
development process.
Bibliography
Agte, J., de Weck, O., Sobieszczanski-Sobieski, J., Arendsen, P., Morris, A., and Spieck,
M. (2010). MDO: Assessment and direction for advancement - An opinion of one
international group. Structural and Multidisciplinary Optimization, 40(1-6), 17-33.
Akaike, H. (1970). Statistical predictor identification. Annals of the Institute of Statistical
Mathematics, 22(1), 203-217.
Alexandrov, N. M., and Lewis, R. M. (2002). Analytical and computational aspects of
collaborative optimization for multidisciplinary design. AIAA Journal, 40(2), 301-
309.
Allison, J., Kokkolaras, M., and Papalambros, P. (2005). On the impact of coupling
strength on complex system optimization for single-level formulations. ASME
International Design Engineering Technical Conferences & Computers and
Information in Engineering Conference. Long Beach, California, USA.
Balling, R. J., and Sobieszczanski-Sobieski, J. (1994). Optimization of coupled systems: A
critical overview of approaches. AIAA-94-4330-CP. 5th AIAA/USAF/NASA/ISSMO
Symposium on Multidisciplinary Analysis and Optimization. Panama City Beach,
Florida, USA.
Beyer, H.-G., and Schwefel, H.-P. (2002). Evolution strategies - A comprehensive
introduction. Natural Computing, 1(1), 3-52.
Braun, R. D. (1996). Collaborative optimization: An architecture for large-scale
distributed design. Ph.D. thesis, Department of Aeronautics and Astronautics,
Stanford University.
Breitkopf, P., Naceur, H., Rassineux, A., and Villon, P. (2005). Moving least squares
response surface approximation: Formulation and metal forming applications.
Computers and Structures, 83(17-18), 1411-1428.
Chester, D. L. (1990). Why two hidden layers are better than one. In M. Caudill (Ed.),
Proceedings of the International Joint Conference on Neural Networks (IJCNN-90-
WASH DC), (pp. 265-268). Washington DC, USA.
Clarke, S. M., Griebsch, J. H., and Simpson, T. W. (2005). Analysis of support vector
regression for approximation of complex engineering analyses. Journal of
Mechanical Design, 127(6), 1077-1087.
Craig, K. J., Stander, N., Dooge, D. A., and Varadappa, S. (2002). MDO of automotive
vehicle for crashworthiness and NVH using response surface methods. AIAA 2002-
5607. 9th AIAA/ISSMO Symposium on Multidisciplinary Analysis and Optimization.
Atlanta, Georgia, USA.
Cramer, E. J., Dennis Jr., J. E., Frank, P. D., Lewis, R. M., and Shubin, G. R. (1994).
Problem formulation for multidisciplinary optimization. SIAM Journal on
Optimization, 4(4), 754-776.
Craven, P., and Wahba, G. (1979). Smoothing noisy data with spline functions.
Numerische Mathematik, 31(4), 377-403.
Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. (2002). A fast and elitist
multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary
Computation, 6(2), 182-197.
DeMiguel, A.-V., and Murray, W. (2000). An analysis of collaborative optimization
methods. AIAA-2000-4720. 8th AIAA/USAF/NASA/ISSMO Symposium on
Multidisciplinary Analysis and Optimization, 6-8 September 2000. Long Beach,
California, USA.
Duddeck, F. (2008). Multidisciplinary optimization of car bodies. Structural and
Multidisciplinary Optimization, 35(4), 375-389.
Dyn, N., Levin, D., and Rippa, S. (1986). Numerical procedures for surface fitting of
scattered data by radial functions. SIAM Journal on Scientific and Statistical
Computing, 7(2), 639-659.
Eiben, A. E., and Smith, J. E. (2003). Introduction to evolutionary computing. Berlin:
Springer.
Fang, K.-T., Lin, D. K. J., Winker, P., and Zhang, Y. (2000). Uniform design: Theory and
application. Technometrics, 42(3), 237-248.
Forrester, A. I., and Keane, A. J. (2009). Recent advances in surrogate-based optimization.
Progress in Aerospace Sciences, 45(1-3), 50-79.
Friedman, J. H. (1991). Multivariate adaptive regression splines. The Annals of Statistics,
19(1), 1-67.
Geman, S., and Geman, D. (1984). Stochastic relaxation, Gibbs distribution, and the
Bayesian restoration in images. IEEE Transactions of Pattern Analysis and Machine
Intelligence, 6(6), 721-741.
Jin, R., Chen, W., and Simpson, T. (2001). Comparative studies of metamodelling
techniques under multiple modelling criteria. Structural and Multidisciplinary
Optimization, 23(1), 1-13.
Jin, R., Chen, W., and Sudjianto, A. (2002). On sequential sampling for global
metamodeling in engineering design. DETC2002/DAC-34092. Proceedings of the
ASME 2002 International Design Engineering Technical Conferences and
Computers and Information in Engineering Conference (DETC2002). Montreal,
Canada.
Johnson, M., Moore, L., and Ylvisaker, D. (1990). Minimax and maximin distance
designs. Journal of Statistical Planning and Inference, 26(2), 131-148.
Kalagnanam, J. R., and Diwekar, U. M. (1997). An efficient sampling technique for off-
line quality control. Technometrics, 39(3), 308-319.
Kennedy, J., and Eberhart, R. (1995). Particle swarm optimization. Proceedings of the
1995 IEEE International Conference on Neural Networks, (pp. 1942-1948). Perth,
Australia.
Kim, H. M. (2001). Target cascading in optimal system design. Ph.D. thesis, Mechanical
Engineering, University of Michigan, Ann Arbor.
Kim, B.-S., Lee, Y.-B., and Choi, D.-H. (2009). Comparison study on the accuracy of
metamodeling technique for non-convex functions. Journal of Mechanical Science
and Technology, 23(4), 1175-1181.
Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. (1983). Optimization by simulated
annealing. Science, 220(4598), 671-680.
Kodiyalam, S., and Sobieszczanski-Sobieski, J. (2000). Bilevel integrated system
synthesis with response surfaces. AIAA Journal, 38(8), 1479-1485.
Kodiyalam, S., and Sobieszczanski-Sobieski, J. (2001). Multidisciplinary design
optimization - Some formal methods, framework requirements, and application to
vehicle design. International Journal of Vehicle Design, 25(1-2 SPEC. ISS.), 3-22.
Kodiyalam, S., Yang, R., Gu, L., and Tho, C.-H. (2004). Multidisciplinary design
optimization of a vehicle system in a scalable, high performance computing
environment. Structural and Multidisciplinary Optimization, 26(3), 256-263.
Koehler, J., and Owen, A. (1996). Design and analysis of experiments. In S. Ghosh, and
C. Rao (Eds.), Handbook of Statistics (pp. 261-308). Amsterdam: North-Holland.
Kroo, I., Altus, S., Braun, R., Gage, P., and Sobieski, I. (1994). Multidisciplinary
optimization methods for aircraft preliminary design. AIAA-94-4325-CP. 5th
AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and
Optimization. Panama City Beach, Florida, USA.
Kroo, I., and Manning, V. (2000). Collaborative optimization: status and directions.
AIAA-2000-4721. 8th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary
Analysis and Optimization. Long Beach, California, USA.
Li, Y., Ng, S., Xie, M., and Goh, T. (2010). A systematic comparison of metamodeling
techniques for simulation optimization in decision support systems. Applied Soft
Computing, 10(4), 1257-1273.
Lin, Y. (2004). An efficient robust concept exploration method and sequential exploratory
experimental design. Ph.D. thesis, Mechanical Engineering, Georgia Institute of
Technology, Atlanta.
Martin, J. D., and Simpson, T. W. (2004). On the use of Kriging models to approximate
deterministic computer models. DETC2004/DAC-57300. Proceedings of the ASME
2004 International Design Engineering Technical Conferences and Computers and
Information in Engineering Conference (DETC2004). Salt Lake City, Utah, USA.
McKay, M. D., Beckman, R. J., and Conover, W. J. (1979). A comparison of three
methods for selecting values of input variables in the analysis of output from a
computer code. Technometrics, 21(2), 239-245.
Meckesheimer, M., Booker, A., Barton, R., and Simpson, T. (2002). Computationally
inexpensive metamodel assessment strategies. AIAA Journal, 40(10), 2053-2060.
Michalek, J. J., and Papalambros, P. Y. (2005a). An efficient weighting update method to
achieve acceptable consistency deviation in analytical target cascading. Journal of
Mechanical Design, Transactions of the ASME, 127(2), 206-214.
Michalek, J. J., and Papalambros, P. Y. (2005b). Weights, norms, and notation in
analytical target cascading. Journal of Mechanical Design, Transactions of the
ASME, 127(3), 499-501.
Morris, M. D. (1991). Factorial sampling plans for preliminary computational
experiments. Technometrics, 33(2), 161-174.
Myers, R. H., Montgomery, D. C., and Andersson-Cook, C. M. (2008). Response surface
methodology: Process and product optimization using designed experiments (Third
ed.). Hoboken, New Jersey, USA: Wiley.
Owen, A. B. (1992). Orthogonal arrays for computer experiments, integration and
visualization. Statistica Sinica, 2, 439-452.
Pan, J., and Diaz, A. R. (1989). Some results in optimization of non-hierarchic systems.
The 1989 ASME Design Technical Conferences - 15th Design Automation
Conference. Montreal, Quebec, Canada.
Queipo, N. V., Haftka, R. T., Shyy, W., Goel, T., Vaidyanathan, R., and Tucker, P. K.
(2005). Surrogate-based analysis and optimization. Progress in Aerospace Sciences,
41(1), 1-28.
Renaud, J. E., and Gabriele, G. A. (1991). Sequential global approximation in non-
hierarchic system decomposition and optimization. 17th Design Automation
Conference presented at the 1991 ASME Design Technical Conferences, 32, pp.
191-200. Miami, Florida, USA.
Renaud, J. E., and Gabriele, G. A. (1993). Improved coordination in nonhierarchic system
optimization. AIAA Journal, 31(12), 2367-2373.
Renaud, J. E., and Gabriele, G. A. (1994). Approximation in nonhierarchic system
optimization. AIAA Journal, 32(1), 198-205.
Roth, B. D. (2008). Aircraft family design using enhanced collaborative optimization.
Ph.D. thesis, Department of Aeronautics and Astronautics, Stanford University.
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning internal
representations by error propagation. In D. E. Rumelhart, and J. L. McClelland
(Eds.), Parallel distributed processing: explorations in the microstructure of
cognition (Vol. 1: Foundations, pp. 318-362). Cambridge: MIT Press.
Ryberg, A.-B., Bäckryd, R. D., and Nilsson, L. (2012). Metamodel-based
multidisciplinary design optimization for automotive applications. Technical Report
LIU-IEI-R-12/003. Linköping University, Division of Solid Mechanics.
Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H. P. (1989). Design and analysis of
computer experiments. Statistical Science, 4(4), 409-423.
Shankar, J., Haftka, R. T., and Watson, L. T. (1993). Computational study of a
nonhierarchical decomposition algorithm. Computational Optimization and
Applications, 2(3), 273-293.
Sheldon, A., Helwig, E., and Cho, Y.-B. (2011). Investigation and application of multi-
disciplinary optimization for automotive body-in-white development. Proceedings
of the 8th European LS-DYNA Users Conference. Strasbourg, France.
Shi, L., Yang, R. J., and Zhu, P. (2012). A method for selecting surrogate models in
crashworthiness optimization. Structural and Multidisciplinary Optimization, 46(2),
159-170.
Simpson, T., Peplinski, J., Koch, P. N., and Allen, J. (2001). Metamodels for computer-
based engineering design: Survey and recommendations. Engineering with
Computers, 17(2), 129-150.
Smola, A. J., and Schölkopf, B. (2004). A tutorial on support vector regression. Statistics
and Computing, 14(3), 199-222.
76
BIBLIOGRAPHY
77
BIBLIOGRAPHY
78