Elements of Structural Optimization
by
RAPHAEL T. HAFTKA
Department of Aerospace and Ocean Engineering,
Virginia Polytechnic Institute and State University,
Blacksburg, Virginia, U.S.A.
and
ZAFER GÜRDAL
Department of Engineering Science and Mechanics,
Virginia Polytechnic Institute and State University,
Blacksburg, Virginia, U.S.A.
SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.
Library of Congress Cataloging-in-Publication Data
Haftka, Raphael T.
Elements of structural optimization / by Raphael T. Haftka and
Zafer Gurdal. -- 3rd rev. and expanded ed.
p. cm. -- (Solid mechanics and its applications ; v. 11)
Includes bibliographical references and indexes.
ISBN 978-0-7923-1505-6  ISBN 978-94-011-2550-5 (eBook)
DOI 10.1007/978-94-011-2550-5
1. Structural optimization. I. Gurdal, Zafer. II. Title.
III. Series.
TA658.8.H34 1991
624.1'7--dc20  91-37690
CIP
ISBN 978-0-7923-1505-6
All Rights Reserved
© 1992 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1992
Softcover reprint of the hardcover 1st edition 1992
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
This book is dedicated to
Rose
Preface ............................................................... xiii
Chapter 1. Introduction ................................................. 1
1.1 Function Optimization and Parameter Optimization ................. 1
1.2 Elements of Problem Formulation ................................... 3
Design Variables ................................................. 3
Objective Function ............................................... 5
Constraints ...................................................... 9
Standard Formulation ............................................. 9
1.3 The Solution Process .............................................. 12
1.4 Analysis and Design Formulations .................................. 14
1.5 Specific Versus General Methods ................................... 15
1.6 Exercises .......................................................... 16
1.7 References ......................................................... 19
Chapter 2. Classical Tools in Structural Optimization ............... 23
2.1 Optimization Using Differential Calculus ........................... 23
2.2 Optimization Using Variational Calculus ........................... 29
Introduction to the Calculus of Variations ...................... 29
2.3 Classical Methods for Constrained Problems ....................... 33
Method of Lagrange Multipliers .................................. 34
Function Subjected to an Integral Constraint .................... 37
Finite Subsidiary Conditions .................................... 40
2.4 Local Constraints and the Minmax Approach ...................... 44
2.5 Necessary and Sufficient Conditions for Optimality ................. 49
Elastic Structures of Maximum Stiffness ......................... 50
Preface
The field of structural optimization is still a relatively new field undergoing rapid
changes in methods and focus. Until recently there was a severe imbalance between
the enormous amount of literature on the subject, and the paucity of applications
to practical design problems. This imbalance is being gradually redressed. There is
still no shortage of new publications, but there are also exciting applications of the
methods of structural optimizations in the automotive, aerospace, civil engineering,
machine design and other engineering fields. As a result of the growing pace of
applications, research into structural optimization methods is increasingly driven by
real-life problems.
Most engineers who design structures employ complex general-purpose software
packages for structural analysis. Often they do not have any access to the source
program, and even more frequently they have only scant knowledge of the details of
the structural analysis algorithms used in these software packages. Therefore the major
challenge faced by researchers in structural optimization is to develop methods that
are suitable for use with such software packages. Another major challenge is the high
computational cost associated with the analysis of many complex real-life problems.
In many cases the engineer who has the task of designing a structure cannot afford
to analyze it more than a handful of times.
This environment motivates a focus on optimization techniques that call for
minimal interference with the structural analysis package, and require only a small number
of structural analysis runs. A class of techniques of this type, pioneered by Lucien
Schmit, and which are becoming widely used, are referred to in this book as sequen-
tial approximate optimization techniques. These techniques use the analysis package
for the purpose of constructing an approximation to the structural design problem,
and then employ various mathematical optimization techniques to solve the approx-
imate problem. The optimum of the approximate problem is then used as a basis for
performing one or more structural analyses for the purpose of updating or refining
the approximate design problem. Most of the approximate design problems are based
on derivatives of the structural response with respect to design parameters.
In the new environment the structural designer is typically called upon to provide
the interface between a commercially available analysis program, and a commercially
available optimization software package. The three most important ingredients of
the interface are: sensitivity derivative calculation, construction of an approximate
problem, and evaluation of results for the purpose of fine-tuning the approximate
problem or the optimization method for maximum efficiency and reliability.
This textbook is organized so that its middle part (Chapters 6, 7 and 8) deals with
the two issues of constructing the approximate problem and obtaining sensitivity
derivatives. Evaluating the results of the optimization calls for a basic understanding
of optimality conditions and optimization methods. This is dealt with in Chapters
1 through 5. The last three chapters deal with the specialized topics of optimality
criteria methods, multi-level optimization, and applications to composite materials.
The material in the textbook can be used in various ways in teaching a graduate
course in structural optimization, depending on the available amount of time, and
whether students have prior preparation in optimization techniques.
For students without prior preparation in optimization techniques, the minimum
time requirement is one semester. It is then suggested to cover Chapter 1,
sections 2.1, 2.2 and 2.3 of Chapter 2, Sections 3.1 and 3.4 of Chapter 3, some
material from Chapters 4 and 5 depending on the instructor's favorite optimization
methods, most of Chapter 6 and the first two sections of Chapter 7. With a two-
quarter sequence it is suggested to cover Chapters 1 and 2, selected topics of Chapters
3 to 5 and Chapter 6 in the first quarter, and Chapters 7, 9, 11 and either Chapter
8 or Chapter 10 in the second quarter. Finally, in a two-semester sequence it is
recommended to cover Chapters 1 through 6 in the first semester, and Chapters 7
through 11 in the second semester.
With a preparatory course in mathematical optimization, one-quarter and
one-semester versions of the course can be considered. A one-quarter version could
include Chapters 1 and 2, sections 3.1, 3.2, 3.3 and 3.7 of Chapter 3, Chapter
6, the first two sections of Chapter 7, and Chapter 9 or 11. A one-semester version
could include the same part of Chapters 1 through 7 and then Chapters 9 through
11.
The authors gratefully acknowledge the assistance of Drs. H. Adelman, B.
Barthelemy, J-F. Barthelemy, L. Berke, R. Grandhi, D. Grierson, E. Haug, R. Plaut,
J. Sobieski, and J. Starnes in reviewing parts of the manuscript and offering critical
comments.
Chapter 1: Introduction
Before the advent of high speed computation most of the solutions of structural
analysis problems were based on formulations employing differential equations. These
differential equations were solved analytically (e.g., by using infinite series) with
occasional use of numerical methods at the very end of the solution process. The
unknowns were functions (representing displacements, stresses, etc.) defined over a
continuum.
The early beginning of structural optimization followed the same route, in that
the unknowns were functions defining the optimal structural properties. Consider,
for example, the beam shown in Figure 1.1.1. Structural analysis is concerned with
finding the displacement w( x) of the beam by solving the well-known governing equa-
tion
$$\frac{d^2}{dx^2}\left(EI\,\frac{d^2w}{dx^2}\right) = q(x). \qquad (1.1.1)$$
The structural designer may want to find the optimum distribution of the moment of
inertia I(x) of the beam along its length. Of course, the notion of optimality requires
that we have an objective function that we wish to maximize or to minimize. For
example, the objective function may be the mass of the beam. For many common
beam cross sections the mass m is given as
$$m = c\int_0^l I^p(x)\,dx, \qquad (1.1.2)$$
where the exponent p is usually between 0.4 and 0.5, and c is a known constant. An
optimization problem typically involves a number of constraints. Without any con-
straint the optimum beam would have zero moment of inertia and zero mass. In the
design of a beam, a typical constraint would be to limit the maximum displacement
of the beam to some specified allowable value w₀,
Section 1.2: Elements of Problem Formulation
in Chapter 2. The class of structural optimization problems that seeks an optimum
structural function is called function or distributed parameter structural optimization.
In the late fifties and early sixties high speed electronic computers had a profound
effect on structural analysis solution procedures. Techniques that were well suited to
computer implementation, in particular the finite element method (FEM), became
dominant. The finite element method discretizes the structure at the very beginning
of the analysis, so that the unknowns in the analysis are discrete values of displace-
ments and stresses at nodes of the finite element model, rather than functions. The
differential equations solved by earlier analysts are replaced by systems of algebraic
equations for the variables that describe the discretized system.
The same transformation began to take hold in the early sixties in the field of
structural optimization. When optimizing a structure discretized by finite elements
it is natural to discretize the structural properties which are optimized. Consider
again the beam example of Figure 1.1.1. A finite element solution for the displace-
ments starts by dividing the beam into a number of constant-property segments or
finite elements. An optimization of the same beam would naturally use the moments
of inertia of the segments as design parameters. Thus, instead of searching for an
optimum function, we will be looking for the optimum values of a number of param-
eters. The mathematical discipline that deals with parameter optimization is called
mathematical programming. The bulk of this text (Chapters 3-7, 9-11) is concerned,
therefore, with mathematical programming techniques and their application to struc-
tural optimization problems defined by discretized models. In particular, it is often
implicitly assumed that the structural analysis is based on the finite element method.
are commonly treated as continuous are often made discrete due to manufacturing
considerations. For example, if the beam of Figure 1.1.1 is designed to minimize cost,
then we may need to limit ourselves to commercially available cross sections. The
moment of inertia would then cease to be a continuous design variable, and would
become a discrete one.
In most structural design problems we tend to disregard the discrete nature of
the design variables in the solution of the optimization problem. Once the optimum
design is obtained, we then adjust the values of the design variables to the nearest
available discrete value. This approach is taken because solving an optimization
problem with discrete design variables is usually much more difficult than solving a
similar problem with continuous design variables. However, rounding off the design
to the closest integer solution works well when the available values of the design
variables are spaced reasonably close to one another, so that changing the value of a
design variable to the nearest integer does not change the response of the structure
substantially. In some cases the discrete values of the design variables are spaced too
far apart, and we have to solve the problem with discrete variables. This is done by
employing a branch of mathematical programming called integer programming. In
this text it is assumed that design variables are continuous unless otherwise stated.
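The round-off strategy just described can be sketched in a few lines; the catalog of available section properties below is hypothetical:

```python
def round_to_catalog(design, catalog):
    """Replace each continuous optimum value by the nearest available
    discrete value (the usual post-processing step described above)."""
    return [min(catalog, key=lambda c: abs(c - v)) for v in design]

# Hypothetical catalog of commercially available moments of inertia (in^4)
catalog = [0.5, 1.0, 2.0, 4.0, 8.0]
rounded = round_to_catalog([0.9, 2.7, 5.1], catalog)  # -> [1.0, 2.0, 4.0]
```

When the catalog spacing is coarse, this nearest-value rounding can noticeably change the structural response, which is precisely the situation where integer programming becomes necessary.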
Figure 1.2.1 Plate with element thicknesses as design variables, analyzed by a 7 × 7 finite element mesh.
The choice of design variables can be critical to the success of the optimization
process. In particular it is important to make sure that the choice of design variables is
consistent with the analysis model. Consider, for example, the process of discretizing
a structure by a finite element model and applying the optimization procedure to the
model. If the design variable distribution has a one-to-one correspondence with the
finite element model we can encounter serious accuracy problems. For example, the
plate shown in Figure 1.2.1 was analyzed [20] by a 7 × 7 finite element mesh, with
most design variables specifying the thickness of individual elements. While the 7 × 7
model was adequate for the initial design which had uniform thickness, it was not
adequate for the final design shown in the Figure.
Figure 1.2.2 Optimized shape of a hole in a plate, (a) initial design, (b) final design.
A similar problem may be encountered when the coordinates of nodes of the finite
element model are used as design variables. For example, the shape of the hole in the
plate shown in Figure 1.2.2 was optimized [21] to reduce the stress concentration near
the hole with the coordinates of the boundary nodes serving as design parameters.
Again, the finite element model was adequate for the analysis of the initial circular
shape of the hole, but not the "optimal shape" obtained. In general, the distribution
of design variables should be much coarser than the distribution of finite elements
(except for skeletal structures where often each element corresponds to a physical
member of the structure).
The notion of optimization also implies that there is some merit function f(x), or a set
of functions f(x) = [f₁(x), f₂(x), ..., fₚ(x)], that can be improved and can be used as a
measure of effectiveness of the design. The common terminology for such functions is
objective functions. Optimization with more than one objective is generally referred
to as Multicriteria Optimization. For structural optimization problems, weight, dis-
placements, stresses, vibration frequencies, buckling loads, and cost or any combina-
tion of these can be used as objective functions. Consider, for example, the three-bar
truss of Figure 1.2.3. Our design problem may be to vary the horizontal locations of
the three support points so as to minimize the mass of the truss and the stresses in
its members. We have four objective functions: the mass and the three stresses.
Dealing with multiple objective functions is complicated and is usually avoided.
There are two intuitive ways commonly used for reducing the number of objective
functions to one. The first way is to generate a composite objective function that
replaces all the objectives. For example, if the mass of the structure is denoted m and
Figure 1.2.3 Three-bar truss; the load P = 10,000 lb is applied 100 in below the three support points 1, 2, 3, with displacements u and v along the x and y axes.
the stresses in the three bars as σᵢ, i = 1, 2, 3, then a composite objective function f
could be
$$f = \alpha_0 m + \alpha_1\sigma_1 + \alpha_2\sigma_2 + \alpha_3\sigma_3, \qquad (1.2.1)$$
where the αᵢ are weighting coefficients selected to reflect the relative importance of
the four objective functions.
The second intuitive way to reduce the number of objective functions is to select
the most important as the only objective function and to impose limits on the others.
Thus we can formulate the three-bar truss design problem as minimization of mass,
subject to upper limits on the values of the three stresses.
When it is not intuitively clear how to weight or choose between the objective
functions, a systematic approach to the problem is through a branch of mathematical
programming called Edgeworth-Pareto optimization that deals with multiple objective
functions [22-24]. Stadler [25,26] was probably the first to apply Edgeworth-Pareto
optimality to structural design. More recent applications can be found in Refs. 27-31.
A vector of design variables x* is said to be Edgeworth-Pareto optimal if, for any
other vector x, either the values of all the objective functions remain the same, or
at least one of them worsens compared to its value at x*. When it is not possible to
specify intuitively the relative importance of the objective functions in an equation
such as (1.2.1), the values of the weights αᵢ, i = 0, 1, 2, 3 in Eq. (1.2.1) can be decided
by studying various Edgeworth-Pareto optimal designs. Thus the design process is an
interactive process, and the imposition of constraints is postponed until knowledge of
the optimum performance is gained by studying Edgeworth-Pareto optimal designs.
One of the approaches for generating a Pareto-optimal solution to multiple ob-
jective function optimization problems is based on the minimization of the deviation
of the individual objective functions from their individual minimum values. If the
independent minimizations of each of the objective functions result in function val-
ues of f₁*, f₂*, ..., fₚ* associated with design points x₁*, x₂*, ..., xₚ*, then for an arbitrary
value of the design variable vector x the normalized distance of each of the objective
It is then possible to pose the problem either as the minimization of the largest
deviation of the objective functions from their individual minima (l∞ norm),
or of the distance (i.e., the l₂ or Euclidean norm) from the reference point f* =
(f₁*, f₂*, ..., fₚ*) to f = (f₁, f₂, ..., fₚ):
$$\text{minimize}\quad \sum_{i=1}^{p} d_i^2. \qquad (1.2.4)$$
It is also possible to use weighting coefficients in Eq. (1.2.4) for the contributions
of the individual objective functions. A more detailed discussion of the methods for
solving multicriteria optimization problems and their design applications is given by
Eschenauer et al. [31].
Example 1.2.1
Figure 1.2.4 Design of a beam cross-section of width w and height h for minimum
area and minimum shear stress (left: area contours; right: shear stress contours).
The contour lines for the two objective functions,
$$f_1 = A = wh, \qquad \text{and} \qquad f_2 = \tau = \frac{3}{2wh}, \qquad (a)$$
are shown in Figure 1.2.4. The individual minima for the two functions are at the
opposite corners of the design space, w₁* = h₁* = 0.5 and w₂* = h₂* = 5.0, with
associated function values of f₁* = 0.25 in² and f₂* = 0.06 lb/in².
The weighted objective function approach with equal weights results in minimization
of the function
$$F = wh + \frac{3}{2wh}\,. \qquad (b)$$
Since design variables w and h appear everywhere in the form of a product, we
can treat this product as a single variable. Minimization of Eq. (b) with respect
to the product results in w*h* = √(3/2) ≈ 1.225 with objective function values of
f₁ = f₂ = 1.225. If, on the other hand, we use the minimization of the Euclidean
norm of the distance from the individual minima, the function that needs to be
minimized is
$$\left(\frac{wh - 0.25}{0.25}\right)^2 + \left(\frac{3/(2wh) - 0.06}{0.06}\right)^2.$$
The resulting design is w*h* = 2.5 with objective function values of f₁ = 2.5 and
f₂ = 0.6.
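The two scalarized problems above can be checked numerically. A sketch of ours, treating the product t = wh as the single variable and using a deliberately simple dense-grid minimizer in place of a proper optimization routine:

```python
def f1(t):        # cross-sectional area A = w*h
    return t

def f2(t):        # maximum shear stress tau = 3/(2*w*h)
    return 3.0 / (2.0 * t)

def grid_min(F, lo, hi, n=200001):
    """Dense-grid minimizer; adequate for these smooth one-variable functions."""
    ts = (lo + (hi - lo) * k / (n - 1) for k in range(n))
    return min(ts, key=F)

# Equal-weight composite objective, Eq. (b): minimum at t = sqrt(3/2)
t_w = grid_min(lambda t: f1(t) + f2(t), 0.25, 25.0)

# Squared normalized distance from the individual minima f1* = 0.25, f2* = 0.06
t_d = grid_min(lambda t: ((f1(t) - 0.25) / 0.25) ** 2
                         + ((f2(t) - 0.06) / 0.06) ** 2, 0.25, 25.0)
```

The grid search recovers t_w ≈ 1.225 and t_d ≈ 2.5, the two designs quoted above.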
Figure 1.2.5 Pareto-optimal designs in the function space (f₁ = area, f₂ = shear stress), with the designs w*h* = 0.25, 0.5, 1.225, 2.5, 3.0, and 25.0 marked along the efficiency curve.
The two designs obtained above and the designs corresponding to the minimization
of the individual functions constitute Pareto optima. There are other solutions
that satisfy the condition for Pareto optimality. These solutions can be obtained
either by varying the weighting coefficients of the individual objectives, or by
imposing one of the objectives as a constraint and varying the desired level of this
constraint. For example, if the second objective function is turned into a constraint
by imposing the condition that f₂ ≤ 0.5 while minimizing the area, we would obtain a
design w*h* = 3.0 with objective function values of f₁ = 3.0 and f₂ = 0.5. Similarly,
if we minimize f₂ subject to the constraint that f₁ ≤ 0.5, we obtain w*h* = 0.5 with
f₁ = 0.5 and f₂ = 3. All of these solutions lie on a curve in the function space that
connects the two individual minima as shown in Figure 1.2.5. This curve is usually
called the efficiency curve. •••
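For this example the constraint-sweep idea has a closed form. Since f₂ = 3/(2wh) decreases monotonically as the product wh grows, the constraint f₂ ≤ ε is active when the area f₁ = wh is minimized, so each limit ε gives one point of the efficiency curve (a sketch of ours, not from the text):

```python
def eps_constraint_design(eps):
    """Minimize f1 = w*h subject to f2 = 3/(2*w*h) <= eps.

    Because f2 decreases with w*h, the stress constraint is active at the
    optimum, giving w*h = 3/(2*eps) in closed form."""
    return 3.0 / (2.0 * eps)

# Sweeping the limit traces the efficiency curve of Figure 1.2.5:
designs = [eps_constraint_design(eps) for eps in (3.0, 1.225, 0.5, 0.06)]
```

The limits ε = 0.5 and ε = 0.06 reproduce the designs w*h* = 3.0 and w*h* = 25.0 discussed above.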
1.2.3 Constraints
The formulation of the three-bar truss example where the stresses are subject to
upper limits, and the beam cross-section design problem where the height and width
variables are limited to take values only in a certain range, introduces the notion
of limits on the design variables. Because of their simplicity, these upper and lower
limit constraints on the values of the design variables are often treated in a special
way by solution procedures, and are referred to as side constraints. Constraints
which impose upper or lower limits on quantities are by their very nature inequality
constraints. Sometimes we need equality constraints. For example, the three-bar
truss may be designed subject to a requirement that the vertical component of the
displacement at the point of application of the force be zero. Another example of
equality constraints is provided by the equations of equilibrium that a structure must
satisfy in terms of its design variables.
Some strategies for the solution of nonlinear optimization problems are unable
to handle equality constraints, but are limited to inequality constraints only. In
such instances it is possible to replace the equality constraint with two inequality
constraints that form upper and lower bound constraints with the same limiting value.
However, it is usually undesirable to increase the number of constraints. Another way
of handling equality constraints in such situations will be discussed later in Chapter
5.
The notation adopted in this text for design variables, objective function and con-
straints is summarized in the following formulation of the optimization problem. In
this text we deal only with problems formulated to have a single objective function.
$$\begin{aligned}
\text{minimize}\quad & f(\mathbf{x}) \\
\text{such that}\quad & g_j(\mathbf{x}) \ge 0, \qquad j = 1,\ldots,n_g, \\
& h_k(\mathbf{x}) = 0, \qquad k = 1,\ldots,n_e,
\end{aligned} \qquad (1.2.5)$$
where x denotes a vector of design variables with components xᵢ, i = 1, ..., n. The
equality constraints hₖ(x) and the inequality constraints gⱼ(x) are assumed to be
transformed into the form (1.2.5). The fact that the optimization problem is as-
sumed to be a minimization rather than a maximization problem is not restrictive
since instead of maximizing a function it is always possible to minimize its negative.
Similarly, if we have an inequality of the opposite type, that is,
$$\bar g_j(\mathbf{x}) \le 0, \qquad (1.2.6)$$
it can be replaced by the equivalent constraint
$$g_j(\mathbf{x}) = -\bar g_j(\mathbf{x}) \ge 0. \qquad (1.2.7)$$
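These two transformations can be sketched as a small helper (illustrative only; the sample problem below is not from the text):

```python
def to_standard_form(F_max, gbar_leq):
    """Convert 'maximize F subject to gbar_j(x) <= 0' into the minimization
    form of Eq. (1.2.5): minimize f = -F with constraints g_j = -gbar_j >= 0."""
    f = lambda x: -F_max(x)
    gs = [lambda x, gb=gb: -gb(x) for gb in gbar_leq]
    return f, gs

# Illustration: maximize -x^2 subject to x - 2 <= 0
f, gs = to_standard_form(lambda x: -x * x, [lambda x: x - 2.0])
```

Note the default argument gb=gb in the list comprehension; it binds each constraint at definition time, which is the standard idiom for building closures in a loop.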
Example 1.2.2
Consider the three-bar truss of Figure 1.2.3. Assume that it is made of steel (density
0.29 lb/in³), and that we want to minimize the mass subject to the constraint that the
stress in any member does not exceed 30,000 psi in tension or compression. We also
impose a side constraint that the minimum area of any member is 0.1 in². The design
variables are the member cross-sectional areas A₁, A₂, and A₃, and the horizontal
coordinates x₁, x₂ and x₃ of the support points. The point of application of the
force is assumed to be fixed. We seek to formulate this optimization problem in the
standard form of (1.2.5).
The objective function is easy to write in terms of the design variables:
$$m = 0.29\sum_{i=1}^{3} A_i L_i,$$
where
$$L_i = \sqrt{x_i^2 + 100^2}, \qquad i = 1, 2, 3.$$
To calculate the stress constraint it is convenient to introduce the displacements u
and v at the point of application of the force as intermediate variables. It can be
verified that the equations governing u and v are
where
$$k_{11}(\mathbf{x}) = E\sum_{i=1}^{3}\frac{A_i x_i^2}{L_i^3}, \qquad
k_{12}(\mathbf{x}) = -E\sum_{i=1}^{3}\frac{100\,A_i x_i}{L_i^3}, \qquad
k_{22}(\mathbf{x}) = E\sum_{i=1}^{3}\frac{10{,}000\,A_i}{L_i^3},$$
and where E is Young's modulus for steel (30 × 10⁶ psi). In terms of u and v, the
stresses in the members are given as
Based on the above analysis, one way of formulating the optimization problem in the
standard form is to add u and v to the list of design variables. The formulation is
We then have a problem with eight design variables (Aᵢ, xᵢ, i = 1, 2, 3, and u, v), two
equality constraints and nine inequality constraints. This formulation including the
response variables u and v together with the structural dimensions as design variables
is called simultaneous analysis and design. Most structural optimization formulations
eliminate the response variables by using the equations of equilibrium. In this problem
we can solve for u and v from the equality constraints, thus eliminating two equality
constraints and two design variables. The new formulation, which does not include the
displacements as design variables, is much more common in structural optimization.
As a result it is rare to encounter formulations of structural optimization problems
which include equality constraints. •••
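A sketch of the nested (displacement-eliminated) analysis for this example. The stiffness terms follow the expressions above; the load vector (a single downward force P at the free node) and the trial geometry are our assumptions, since the governing equations and Figure 1.2.3 are not fully reproduced here:

```python
import math

E = 30.0e6    # Young's modulus for steel, psi
P = 10000.0   # applied load, lb (Figure 1.2.3)
RHO = 0.29    # density of steel, lb/in^3

def lengths(x):
    # L_i = sqrt(x_i^2 + 100^2), the member lengths
    return [math.sqrt(xi ** 2 + 100.0 ** 2) for xi in x]

def mass(A, x):
    # m = 0.29 * sum(A_i * L_i), the objective function of the example
    return RHO * sum(Ai * Li for Ai, Li in zip(A, lengths(x)))

def displacements(A, x):
    """Solve the 2x2 equilibrium system k(x) d = p for u and v, eliminating
    them from the design problem as described above.  The right-hand side
    (0, -P) is our assumption about the load direction."""
    L = lengths(x)
    k11 = E * sum(Ai * xi ** 2 / Li ** 3 for Ai, xi, Li in zip(A, x, L))
    k12 = -E * sum(100.0 * Ai * xi / Li ** 3 for Ai, xi, Li in zip(A, x, L))
    k22 = E * sum(10000.0 * Ai / Li ** 3 for Ai, Li in zip(A, L))
    det = k11 * k22 - k12 ** 2
    pu, pv = 0.0, -P
    return (k22 * pu - k12 * pv) / det, (k11 * pv - k12 * pu) / det

# Symmetric trial design: by symmetry k12 = 0, so the horizontal motion u = 0
A, x = [1.0 / 3.0] * 3, [-50.0, 0.0, 50.0]
u, v = displacements(A, x)
```

With the displacements available from the analysis, the stress constraints can be evaluated for any trial (Aᵢ, xᵢ), which is exactly the nested formulation that avoids equality constraints.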
While the above formulation of Example 1.2.2 conforms to our standard formu-
lation, we may expect to encounter numerical difficulties when we solve this example
using many standard solution techniques. The reason for the expected numerical
difficulties is the large discrepancy between the magnitudes of the different design
variables and constraints. Consider first the design variables. The area design vari-
ables may be expected to be of the order of the ratio of the applied force to the
allowable stress, that is, between 0.1 and 1 in². The coordinate design variables, on
the other hand, may be expected to be of the order of 100 in.
Next consider the constraints. If the displacements u and v are about ten percent
below or above their optimal values we can expect the equality constraints h₁ and
h₂ to be of the order of magnitude of ten percent of the applied load. Similarly the
inequality constraints g₁ through g₆ will be of the order of ten percent of the allowable
stress, 30,000 psi. However, the minimum gage constraints g₇ through g₉ will be of
the order of 0.1 in².
Because many optimization software packages are not numerically robust, it is
a good idea to eliminate such wide variations in the magnitudes of design variables
and constraints by normalization. Design variables may be normalized to order 1
by scaling. In Example 1.2.2 the coordinate design variables may be normalized by
the given vertical distance (100 in), and the area design variables by a nominal area,
A₀ = 1/3 in², which is the ratio of the applied load to the allowable stress.
The constraints may be similarly normalized. Usually, inequality constraints can
be normalized by the allowable value which is used to form them. Thus a constraint
that a stress component σ be smaller than an allowable stress σₐₗ is often written as
$$g = \sigma_{al} - \sigma \ge 0. \qquad (1.2.8)$$
The value of the constraint depends on the units used, and can be large or small.
Instead the constraint can be normalized as
$$g = 1 - \frac{\sigma}{\sigma_{al}} \ge 0. \qquad (1.2.9)$$
Now the constraint values are of order one, and do not depend on the units used.
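A sketch of this normalization for Example 1.2.2, using the reference values given in the text:

```python
SIGMA_AL = 30000.0   # allowable stress, psi
A0 = 1.0 / 3.0       # nominal area P/sigma_al, in^2
X0 = 100.0           # given vertical distance, in

def normalize_design(A, x):
    # scale both groups of design variables to order one
    return [Ai / A0 for Ai in A], [xi / X0 for xi in x]

def stress_constraint(sigma):
    # Eq. (1.2.9): g = 1 - sigma/sigma_al >= 0, independent of units
    return 1.0 - sigma / SIGMA_AL

g = stress_constraint(15000.0)   # half the allowable stress -> g = 0.5
```

Expressed this way, a unit change in any normalized variable or constraint has roughly comparable significance, which is what the numerically fragile optimizers mentioned above need.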
The optimization methods discussed in this text are mostly numerical search tech-
niques. These techniques start from an initial design and proceed in small steps to
improve the value of the objective function, or the degree of compliance with the
constraints, or both. The search is terminated when no progress can be made in
improving the objective function without violating some of the constraints. Some
optimization methods terminate when progress in improving the objective function
becomes very slow. Others check for optimality by employing the necessary condi-
tions, called the Kuhn-Tucker conditions (see Chapter 5), that must be satisfied at a
minimum. We will typically use n to denote the number of design variables, so that
the search for the optimum is carried out in the n-dimensional space of real variables
Rn. Every point in this space constitutes a possible design.
In structural optimization problems the constraints imposed on the design, such
as stress, displacements or frequency constraints, are important. That is, such con-
straints will affect the final design and force the objective function to assume a higher
value than it would take without the constraints. For example, in Example 1.2.2, if
the stress constraints were removed all the cross-sectional areas would be reduced to
their minimum-gage values of 0.1 in 2 , and the coordinates of points 1, 2 and 3, would
Section 1.3: The Solution Process
lie directly above point 4, so that the lengths of all three members would take the
minimum value, 100 in., corresponding to a total mass of 8. 7 lb. The resulting stresses
in the members would tend to infinity. Since we cannot tolerate infinite stresses, we
impose stress constraints, and we may expect that the optimum mass will be heavier
than 8.7 lb., and that, at the optimum design, the stress in at least one member will
be equal to the maximum allowable stress of 30,000 psi.
In general, we divide the space of design variables into a feasible domain and
infeasible domain. The feasible domain contains all possible design points that sat-
isfy all the constraints. The infeasible domain is the collection of all design points
that violate at least one of the constraints. Because we expect that the constraints
influence the optimum design, we expect that some constraints will be critical at the
optimum design. This is equivalent to the optimum being on the boundary between
the feasible and infeasible domains. Inequality constraints in our standard formu-
lation, Eq. (1.2.5), are critical when they are equal to zero. These constraints are
also called active constraints, while the rest of the constraints are inactive or pas-
sive. For example, consider the minimum gage constraint g₇ of Example 1.2.2. For
A₁ = 0.1 in² the constraint is active, for A₁ = 0.11 in² the constraint is passive, and
for A₁ = 0.09 in² the constraint is violated.
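In numerical work the three states are distinguished with a small tolerance. A minimal classifier, using a normalized form g₇ = A₁/0.1 − 1 of the minimum gage constraint (the normalization is our choice):

```python
def g7(A1):
    # normalized minimum gage constraint on member 1: A1 >= 0.1 in^2
    return A1 / 0.1 - 1.0

def status(g, tol=1e-6):
    """Classify a constraint value (g >= 0 form) as active, passive, or violated."""
    if g < -tol:
        return "violated"
    if g <= tol:
        return "active"
    return "passive"

# The three cases quoted in the text:
states = [status(g7(A1)) for A1 in (0.10, 0.11, 0.09)]
```

The tolerance matters: an exact test g == 0 would almost never flag a constraint as active in floating-point arithmetic, so active set strategies always work with a band around zero.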
It may be intuitively assumed that all the constraints which are active at the
optimum design influence it; that is, if they were removed the objective function
could be improved. This is not always true. It is possible to have constraints that
are active and can be removed without any impact on the optimum design. Many
optimization procedures calculate, along with the optimum design, a set of numbers,
one for each active constraint, called the Lagrange multipliers (see Chapter 5) which
measure the sensitivity of the optimum design to changes in each constraint. When
the Lagrange multiplier associated with a constraint is zero, it indicates that, to
a first order approximation, removing this constraint will not have any effect on
the optimum value of the objective function. These multipliers also provide very
important design information because in many structural optimization applications
there is some degree of arbitrariness in the choice of parameters that determine the
constraints such as stress limits or minimum gage values. For example, when we
impose stress constraints on a steel structure we typically select ahead of time the
grade of steel to be used. We can use the Lagrange multipliers to estimate the effect
of a change in the stress limit on the objective function. If we find that the optimum
design is very sensitive to this value we may consider using a better grade of steel.
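This first-order role of the multipliers can be illustrated with a one-variable sketch (a made-up toy problem, not one from the text): for minimizing f(x) = x² subject to h(x) = x − c = 0, the multiplier of the Lagrangian f + λh is λ = −2c, and a small change dc in the constraint limit changes the optimum objective by approximately −λ dc (the sign depends on the convention used for h):

```python
# Sketch: the Lagrange multiplier as a first-order sensitivity of the optimum.
# Toy problem: minimize f(x) = x**2 subject to h(x) = x - c = 0.
# Stationarity of L = f + lam*h gives 2x + lam = 0 with x = c, so lam = -2c.

def optimum_and_multiplier(c):
    x_star = c              # the equality constraint fixes the optimum
    lam = -2.0 * x_star     # from dL/dx = 2x + lam = 0
    return x_star**2, lam   # (optimum objective, Lagrange multiplier)

c, dc = 3.0, 1e-4
f0, lam = optimum_and_multiplier(c)
f1, _ = optimum_and_multiplier(c + dc)

# first-order prediction: df* is approximately -lam*dc under the L = f + lam*h
# sign convention
assert abs((f1 - f0) - (-lam * dc)) < 1e-6
```

This is the same mechanism that lets a designer estimate, without reoptimizing, how much a relaxed stress limit would reduce the structural weight.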
One of the major problems in almost all optimization solution procedures is the
determination of the set of active constraints. If the solution procedure attempts
to consider all constraints during the search process the computational cost of the
optimization may be significantly increased. If, on the other hand, the procedure
deals only with constraints that are active or near active for the trial design, the
convergence of the optimization process may be endangered due to oscillation in the
set of active constraints. Most optimization procedures are usually complemented by
an active set strategy used to determine the set of constraints to be considered at
each trial design.
During the optimization process we move from one design point to another. While
Chapter 1: Introduction
there are many optimization techniques, most of them proceed through four basic
steps in performing the move. The first step is the selection of the active constraint
set discussed above. The second step is the calculation of a search direction based
on the objective function and the active constraint set. Some methods (such as
the gradient projection method) look for a direction which is tangent to the active
constraint boundaries. Other methods, such as the feasible direction or the interior
penalty function method seek to move away from the constraint boundaries. The
third step is to determine how far to go in the direction found in the previous step.
This is often done by a process called a one dimensional line search because it seeks
the magnitude of a single scalar which is the distance to be travelled along the given
direction. The last step is a convergence step which determines whether additional
moves are required.
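The four steps can be sketched as a generic loop. The steepest-descent direction and backtracking line search below are placeholder choices for illustration, not a particular method from the text:

```python
# A runnable sketch of the four basic steps most methods share (assumptions:
# unconstrained test problem, steepest-descent direction, backtracking line
# search; the function names are placeholders).

def grad(f, x, h=1e-6):
    # forward-difference gradient
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        g.append((f(xp) - f(x)) / h)
    return g

def optimize(f, x, constraints=(), tol=1e-10, max_moves=200):
    for _ in range(max_moves):
        # Step 1: select the active constraint set (unused by this simple sketch)
        active = [g for g in constraints if abs(g(x)) < 1e-6]
        # Step 2: search direction (here: steepest descent)
        d = [-gi for gi in grad(f, x)]
        slope = -sum(di * di for di in d)
        # Step 3: one-dimensional line search (backtracking with Armijo test)
        step = 1.0
        while f([xi + step * di for xi, di in zip(x, d)]) > f(x) + 1e-4 * step * slope:
            step *= 0.5
            if step < 1e-12:
                break
        x_new = [xi + step * di for xi, di in zip(x, d)]
        # Step 4: convergence check
        if abs(f(x_new) - f(x)) < tol:
            return x_new
        x = x_new
    return x

x_opt = optimize(lambda v: (v[0] - 1.0)**2 + (v[1] + 2.0)**2, [0.0, 0.0])
assert abs(x_opt[0] - 1.0) < 1e-3 and abs(x_opt[1] + 2.0) < 1e-3
```

Real methods differ mainly in how steps 1-3 are carried out; the skeleton of the move is the same.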
Section 1.5: Specific Versus General Methods
do not. For applications in structural optimization the former procedure gains an
advantage because the eigenvectors (vibration modes) change only gradually as the
design is modified. Therefore the eigenvectors from an earlier design can serve as
good initial approximations for the current eigenvectors.
Finally, in some cases it may be worthwhile to integrate the analysis and design
procedures. This happens when structural analysis is iterative in nature, as in the
case of nonlinear structural behavior. The analysis and design iterations may be then
integrated so that the analysis iteration is only partially converged for each design
iteration (e.g. [32,33]). In some cases it may be worthwhile to combine the analysis
and design iterations into a single iterative process. This simultaneous analysis and
design approach is discussed in Chapter 10.
The solution methods commonly used for obtaining optimum designs in structural
optimization may be divided into different categories. An important classification
of solution methods considers specific versus general methods. Specific methods are
used exclusively in structural optimization (even if they could be applicable also in
other disciplines). General methods apply to optimization problems in several other
fields. In the early stages of the development of structural optimization, specific
methods enjoyed great popularity. These included methods tailored to some special
structural optimization problems which they could solve more efficiently than any
general method.
The most successful of these specific methods was the fully stressed design tech-
nique described in Chapter 9. It is a method applicable to the design of a structure
subject to stress constraints only, and it works well for lightly-redundant single-
material structures.
The popularity of specific methods is currently waning as their limitations be-
come increasingly apparent. The approach taken in this text is to emphasize general
methods rather than specific ones. General methods not only have the advantage of
wider applicability but also a wider base of resources. Researchers in many disciplines
are constantly improving these methods and developing efficient and reliable software
implementations.
Besides playing down the role of specialized methods for structural design we also
do not discuss some mathematical programming methods applicable to problems of
specialized form such as dynamic programming, geometric programming and optimal
control techniques. These methods have been applied successfully to structural design
problems, but because of space considerations they are not covered here. The reader
is referred to Refs. 34-36 for information on the application of these methods to
structural design.
The important considerations for a structural analyst using general optimization
methods have to do with providing an interface between structural analysis software
and optimization software. This interface includes the three major components of
formulation, sensitivity and approximation, and is one of the major thrusts of this
text.
The formulation of a structural design problem is of crucial importance for the
success of the design process. A poor formulation can lead to poor results or pro-
hibitive computational cost. Chapter 3, for example, describes various structural
design problems that can be formulated with a linear objective function and lin-
ear constraints. The reason for the usefulness of a linear formulation is the highly
advanced state of methodology and software for solving such linear problems.
The efficient calculation of derivatives of the constraints and objective function
with respect to design variables, often referred to as sensitivity derivatives, is dis-
cussed in Chapters 7 and 8. Most general purpose optimization algorithms require
such derivatives, and their calculation is often the major computational expense in
the optimization of structures modeled by complex finite element models. These
derivatives can also be used to form constraint approximations which can then be
employed instead of costly exact constraint evaluations during portions of the op-
timization process. The use of constraint approximations is discussed in Chapter
6.
The importance of efficient and accurate calculation of sensitivity derivatives
and of employing constraint approximations is now recognized by most structural
optimization specialists. We believe that it affects the success and overall computa-
tional cost of the optimization process even more than the choice of the optimization
method.
1.6 Exercises
1. A tripod is made from three steel pipes as shown in Figure 1.6.1. The ends of these
pipes are placed 120° apart on a circle of radius 6 ft. A vertical downward force of
25 kips is applied at the top. It is required to minimize the weight of the tripod such
that the tripod is safe with respect to Euler buckling, local buckling and yielding.
Assume E = 30 × 10⁶ psi, σ_yield = 60 × 10³ psi, and calculate the local buckling stress
in psi by the formula
σ_cr = 36 × 10⁶ (t/d).
Figure 1.6.2 A narrow rectangular cantilever beam (length 20 ft, tip load P = 10 kips).
where E is Young's modulus, I_least is the smallest moment of inertia and c is the
torsional rigidity of the beam given by 0.312hb³G, G being the shear modulus. Design
a minimum weight beam so as to prevent failure in both flexure and twisting. Assume
E = 30 × 10⁶ psi, G = 12 × 10⁶ psi and σ_all = 75 ksi in tension and compression.
Locate the optimum solution graphically.
3. Consider the design of the cross-section of an I-beam shown in Figure 1.6.3 with the
objectives of minimizing the cross-sectional area and minimizing the normal stresses
resulting from bending about the horizontal neutral axis. The thicknesses of the
flange and the web of the cross-section are fixed at t = 0.1 in. The design variables
are the width w and the height h of the cross-section. Determine graphically the
designs which minimize the individual objectives if the width and the height are
Figure 1.6.3 An I-beam cross-section.
constrained to remain in the range 0.1 ≤ w, h ≤ 10. Also find the designs obtained by
using the weighting function approach with equal weights, and by using Eq. (1.2.4).
4. The elastic grillage of Figure 1.6.4 consists of two uniform beams with cross-
sectional areas A_1 and A_2. Both beams are subjected to a uniformly distributed load
of 1000 lb/in. The minimum weight design of such a structure was first proposed by
Moses and Onoda [37]. Develop expressions for the maximum stresses in tension and
compression at sections 1, 2 and 3 in terms of A_1 and A_2. Assume that the section
modulus z and moment of inertia I are related to the cross-sectional area as
z = (A/1.48)^1.82,    I = 1.007 (A/1.48)^2.65.
1.7 References
[1] Wilde, D.J., Globally Optimal Design, John Wiley and Sons, New York, 1978.
[2] Wasiutynski, Z., and Brandt, A., "The Present State of Knowledge in the Field
of Optimum Design of Structures," Appl. Mech. Rev., 16 (5), pp. 341-348, May
1963.
[3] Sheu, C.Y., and Prager, W., "Recent Developments in Optimum Structural De-
sign," Appl. Mech. Rev., 21 (10), pp. 985-992, Oct. 1968.
[4] Schmit, L.A. Jr., "Structural Synthesis 1959-1969: A Decade of Progress," in Re-
cent Advances in Matrix Methods of Structural Analysis and Design, University
of Alabama Press, Huntsville, pp. 565-634, 1971.
[5] Pierson, B.L., "A Survey of Optimal Structural Design Under Dynamic Con-
straints," Int. J. Num. Meth. Eng., 4, pp. 491-499, 1972.
[6] Niordson, F.I., and Pedersen P., "A Review of Optimal Structural Design," in
Theoretical and Applied Mechanics, Proceedings of the Thirteenth International
Congress of Theoretical and Applied Mechanics, E. Becker and G. K. Mikhalov
(eds.), pp. 264-278, Springer-Verlag, Berlin, 1973.
[7] Rao, S.S., "Optimum Design of Structures under Shock and Vibration Environ-
ment," Shock Vibr. Digest, 7 (12), pp. 61-70, Dec. 1975.
[8] Olhoff, N. J., "A Survey of Optimal Design of Vibrating Structural Elements,
Parts I and II," Shock Vibr. Digest, 8 (8&9), pp. 3-10, 1976.
[9] Venkayya, V. B., "Structural Optimization: A Review and Some Recommenda-
tions," Int. J. Num. Meth. Eng., 13, pp. 203-228, 1978.
[10] Lev, O. E., (ed.), Structural Optimization-Recent Developments and Applica-
tions, ASCE Committee on Electronic Computation, New York, 1981.
[11] Schmit, L.A., "Structural Synthesis-its Genesis and Development," AIAA J., 19
(10), pp. 1249-1263, 1981.
[12] Haug, E.J., "A Review of Distributed Parameter Structural Optimization Liter-
ature," in Optimization of Distributed Parameter Structures, E.J. Haug and J.
Cea (eds.), Vol. 1, pp. 3-74, Sijthoff and Noordhoff, Alphen aan den Rijn, the
Netherlands, 1981.
[13] Ashley, H., "On Making Things the Best – Aeronautical Uses of Optimization,"
J. Aircraft, 19 (1), pp. 5-28, 1982.
[14] Kruzelecki, J., and Zyczkowski, M., "Optimal Structural Design of Shells – A
Survey," SM Archives, 10, pp. 101-170, 1985.
[15] Haftka, R. T., and Grandhi, R. V., "Structural Shape Optimization – A Survey,"
Computer Methods in Applied Mechanics and Engineering, 57, pp. 91-106, 1986.
[16] Bushnell, D., Holmes A. M. C., Flaggs, D. L., and McCormick, P. J., "Opti-
mum Design, Fabrication and Test of Graphite-Epoxy, Curved, Stiffened, Locally
Buckled Panels Loaded in Axial Compression" , in Buckling of Structures (ed. I.
Elishakoff et al.) Elsevier Science Publishers B. V., Amsterdam, pp. 61-131, 1988.
[17] Kirsch, U., "Optimal Topologies of Structures," Appl. Mech. Rev., 42, No. 8, pp.
223-239, 1989.
[18] Friedmann, P. P., "Helicopter Vibration Reduction Using Structural Optimization
with Aeroelastic/Multidisciplinary Constraints – A Survey," J. Aircraft, 28, No.
1, pp. 8-21, 1991.
[19] Sobieszczanski-Sobieski, J., "Structural Optimization: Challenges and Opportu-
nities," Int. J. Vehicle Design, 7, pp. 242-263, 1986.
[20] Prasad, B., and Haftka, R. T., "Optimal Structural Design with Plate Finite
Elements," ASCE J. Structural Division, 105, pp. 2367-2382, 1979.
[21] Braibant, V., Fleury, C., and Beckers, P., "Shape Optimal Design: An Approach
Matching C.A.D. and Optimization Concepts," Report SA-109, Aerospace Lab-
oratory of the University of Liege, Belgium, 1983.
[22] Edgeworth, F. Y., Mathematical Physics, London, England, 1881.
[23] Pareto, V., Manuale di Economia Politica, Societa Editrice Libraria, Milano, Italy,
1906. Translated into English by A.S. Schwier as Manual of Political Economy,
MacMillan, New York, 1971.
[24] Zeleny, M., Multiple Criteria Decision Making, McGraw-Hill Book Company, New
York, 1972.
[25] Stadler, W., "Natural Structural Shapes of Shallow Arches," J. Appl. Mech., 44,
pp. 291-298, 1977.
[26] Stadler, W., "Natural Structural Shapes (The Static Case)," Q. J. Mech. Appl.
Math., 31, pp. 169-217, 1978.
[27] Adali, S., "Pareto Optimal Design of Beams Subjected to Support Motions,"
Computers and Structures, 16, pp. 297-303, 1983.
[28] Bendsøe, M.P., Olhoff, N., and Taylor, J.E., "A Variational Formulation for
Multicriteria Structural Optimization," J. Struct. Mech., 11 (4), pp. 523-544,
1984.
Chapter 2: Classical Tools in Structural Optimization
Classical optimization tools used for finding the maxima and minima of functions
and functionals have direct applications in the field of structural optimization. The
term 'classical tools' is used here to encompass the classical techniques of ordinary
differential calculus and the calculus of variations. Exact solutions to a few
relatively simple unconstrained or equality constrained problems have been obtained
in the literature using these two techniques. It must be pointed out, however, that
such problems are often the result of simplifying assumptions which at times lack
realism, and result in unreasonable configurations. Still, the consideration of such
problems is not a purely academic exercise, but is very helpful in the process of
solving more realistic problems.
In recent years there has been an increased interest in the application of classical
tools, especially variational methods, in structural optimization. Mathematical for-
mulations of broad classes of structures as optimization problems have been achieved
by adopting variational methods. In addition, the study of classical problems not only
serves to portray the underlying principles of the techniques of classical methods, but
it serves an even more basic need in structural optimization. Closed form exact so-
lutions to classical problems serve to validate solutions obtained using more general
but approximate numerical techniques. More importantly, classical optimization is
perhaps the best vehicle for letting a student of structural optimization appreciate
fully the questions of the existence and uniqueness of the optimum designs, and the
establishment of the necessity and sufficiency of the optimality conditions. Such
questions can be rigorously answered for only the simplest problems of optimization
similar to those considered in this chapter.
x* for which the n partial derivatives
∂f/∂x_1, ∂f/∂x_2, ..., ∂f/∂x_n (2.1.1)
vanish simultaneously. This is the necessary condition for the point x* to be a
stationary point. We will see in later chapters that this property proves to be a
valuable tool in locating the optimum solution. For a scalar valued function, the
vector of first derivatives is referred to as the gradient vector ∇f and is used for
finding search directions in optimization algorithms.
Development of a sufficient condition for a stationary point x* to be an extreme
point requires the evaluation of the matrix of second derivatives H of the objective
function. The matrix of second derivatives is also referred to as the Hessian matrix,
defined as
H = [H_ij],   H_ij = ∂²f/∂x_i∂x_j,   i, j = 1, ..., n. (2.1.2)
It can be proved that if the matrix of second derivatives evaluated at x* is positive-
definite then the stationary point is a minimum, and if it is negative-definite then the
stationary point is a maximum point [1]. A symmetric matrix H is said to be positive
(negative)-definite if the quadratic form Q = xᵀHx is positive (negative) for every
x, and is equal to zero if and only if x = 0. A computational check for the positive
and negative definiteness of a matrix involves determinants of the principal minors,
H_i (i = 1, ..., n). A principal minor H_i is a square sub-matrix of H of order i whose
principal diagonal lies along the principal diagonal of the matrix H. The matrix H
is positive-definite if the determinants of all the principal minors located at the top
left corner of the matrix are positive, and negative-definite if -H is positive definite.
Alternatively, -H is positive definite if H_1 is negative and the determinants of the
following principal minors, H_2, H_3, ..., H_n, are alternately positive and negative [1]. Another property
of positive (negative)-definite matrices can be used as a test. A symmetric matrix is
positive (negative)-definite if and only if all its eigenvalues are positive (negative).
A symmetric matrix H is called positive semi-definite if the quadratic form Q =
xTHx is non-negative for every x. This happens when the eigenvalues of the matrix
are non-negative. Unfortunately, the expected condition that the principal minors
are non-negative is not sufficient for positive semi-definiteness. If a matrix is positive
semi-definite but not positive-definite, then there exists at least one x ≠ 0 such
that the quadratic form is zero, at least one of the principal minors is zero, the
matrix is singular, and at least one of the eigenvalues is zero. In that case higher
order derivatives of the function f are needed to establish sufficient conditions for
a minimum. Similarly, when -H is positive semi-definite then H is negative semi-
definite. If H is negative semi-definite but not negative-definite, we need higher
order derivatives to establish sufficient conditions for a maximum. Finally when H
is neither positive semi-definite nor negative semi-definite, it is called indefinite. In
that case the stationary point is neither a minimum nor a maximum but a saddle
point.
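The principal-minor test just described lends itself to a short numerical sketch; `det` below is a naive recursive determinant written purely for illustration:

```python
# Sketch of the definiteness tests using leading principal minors.
# det() is a tiny cofactor-expansion determinant (fine for small matrices).

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def is_positive_definite(H):
    # positive-definite: every leading principal minor determinant > 0
    return all(det([row[:k] for row in H[:k]]) > 0 for k in range(1, len(H) + 1))

def is_negative_definite(H):
    # negative-definite: -H is positive-definite, i.e. the leading minor
    # determinants alternate in sign starting negative
    return is_positive_definite([[-h for h in row] for row in H])

H = [[2.0, -1.0], [-1.0, 2.0]]       # minors: 2 > 0 and det = 3 > 0
assert is_positive_definite(H)
assert not is_negative_definite(H)
assert is_negative_definite([[-2.0, 1.0], [1.0, -2.0]])
```

The eigenvalue test mentioned above is equivalent and, in practice, numerically more robust for larger matrices.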
Section 2.1: Optimization Using Differential Calculus
Two simple examples demonstrate the use of differential calculus in finding opti-
mum structural configurations.
Example 2.1.1
For the loading shown in the figure, the forces in each of the members can be
expressed in terms of the geometry of the structure as
F_2 = -P/2, (2.1.3)

F_5 = [(h_1 - h_2)² + L²]^{1/2} P/(2h_1). (2.1.4)
If each member is to be fully stressed, the cross sectional areas of the members A_i
can be related to the forces carried by the members as
A_i = |F_i|/σ_0. (2.1.5)
From Eq. (2.1.3) the cross-sectional area A_3 of the horizontal members vanishes.
However, based on stability considerations these members may be assumed to have
a minimum area of Amin. The contribution of the weight of these members to the
total weight of the structure is independent of the design variables hI and h 2 , and
will be ignored for the minimization problem. The total volume of material in the
remaining truss structure is the sum of the products of the cross sectional areas and
the member lengths that can be expressed in terms of the unknown variables. It can
be shown that the remaining total volume is
(2.1.6)
(2.1.7)
h~ = ~L, (2.1.8)
(2.1.9)
The matrix of second derivatives of the objective function for the problem is
(2.1.10)
H* = (2√3 P/(σ_0 L)) [[1, -1/2], [-1/2, 1]]. (2.1.11)
The matrix H* is positive definite (check the principal minors), thereby proving the
sufficiency condition for the optimality of the design. •••
Example 2.1.2
Consider an inextensible structural cable with zero bending stiffness. The cable is
stretched by applying a horizontal force Fh at the ends of the cable, two points
separated by a distance L, and carries a vertical distributed load of intensity p(x),
Figure (2.1.2).
If the cross-sectional area of the cable is allowed to vary along its length so that
the axial stress is equal to the allowable stress σ_0, determine the optimum value of
the horizontal pull F_h that will minimize the total volume of material of the cable for
a uniform load of p(x) = p_0.
where θ is the angle between the horizontal coordinate axis x and the tangent to the
arc length coordinate s such that cos θ = dx/ds. For a uniform loading, the second
equilibrium equation can be solved for the vertical displacement along the length of
the cable by integrating twice and making use of the zero displacement conditions at
the two ends to yield
y(x) = (p_0/(2F_h)) x (L - x). (2.1.13)
V = ∫ dV, (2.1.14)
where
dV = A(s) ds. (2.1.15)
With the assumption that the cross-sectional area is to be fully stressed, A(s) = F/σ_0,
the total volume can be expressed as
V = (1/σ_0) ∫ F ds. (2.1.16)
Since
F = F_h (ds/dx), with (ds/dx)² = 1 + (dy/dx)², (2.1.17)
Eq. (2.1.16) can be written as
V = (F_h/σ_0) ∫_0^L [1 + (dy/dx)²] dx. (2.1.18)
Substituting the first derivative of the displacement function of Eq. (2.1.13) into
the above equation, we can show that the volume of the material is related to the
horizontal pull as
V = (1/σ_0) [F_h L + p_0²L³/(12F_h)]. (2.1.19)
If the horizontal pull is small, the volume increases because the cable becomes longer.
If, on the other hand, the horizontal pull is very large the cross-sectional area has
to be large in order to keep the stress level at a o , although the length of the cable
approaches the minimum distance between the support points.
dV/dF_h = 0, (2.1.20)
which produces
F_h* = p_0 L/√12. (2.1.21)
This corresponds to a minimum total volume of
V* = p_0 L²/(√3 σ_0). (2.1.22)
The corresponding fully stressed cross-sectional area distribution is
A = (F_h/σ_0) [1 + (dy/dx)²]^{1/2} = (p_0 L/σ_0) [1/12 + (x/L - 1/2)²]^{1/2}. (2.1.23)
•••
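The trade-off described above can be checked numerically. Assuming the volume reduces to the closed form V(F_h) = (F_h L + p_0²L³/(12F_h))/σ_0, consistent with Eq. (2.1.18) for a parabolic deflection, a one-dimensional golden-section search recovers F_h* = p_0 L/√12:

```python
import math

# Sketch: golden-section search for the horizontal pull minimizing the cable
# volume.  The closed form V(Fh) = (Fh*L + p0**2*L**3/(12*Fh))/sigma0 is an
# assumption consistent with Eq. (2.1.18); the numbers are illustrative.

p0, L, sigma0 = 10.0, 100.0, 20000.0

def volume(fh):
    return (fh * L + p0**2 * L**3 / (12.0 * fh)) / sigma0

def golden_min(f, a, b, tol=1e-10):
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - r * (b - a), a + r * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - r * (b - a)
        else:
            a, c = c, d
            d = a + r * (b - a)
    return 0.5 * (a + b)

fh_opt = golden_min(volume, 1.0, 10.0 * p0 * L)
assert abs(fh_opt - p0 * L / math.sqrt(12.0)) < 1e-3              # Eq. (2.1.21)
assert abs(volume(fh_opt) - p0 * L**2 / (math.sqrt(3.0) * sigma0)) < 1e-9
```

The two competing terms in V(F_h), one growing and one shrinking with the pull, are exactly the cable-length and cross-section effects described in the text.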
Although applications of classical calculus can be demonstrated for many other
structures such as beams and arches, it is appropriate to mention the aspects and
assumptions which make these problems tractable using ordinary calculus. The truss
example discussed above, for example, could be treated by ordinary calculus because
of several simplifying assumptions. First, some of the potential design variables, such
as the cross-sectional areas of the truss members, were eliminated by assuming the
stresses in each member to be equal to the maximum allowable value. Second, the
analysis was simplified by neglecting the effect of the self-weight of the truss members on
structural response, and by ignoring possible buckling of those members loaded in
compression. Most realistic structural optimization problems cannot be simplified to
the point where they can be solved by ordinary calculus.
Section 2.2: Optimization Using Variational Calculus
Consider the problem of determining a function y(x) given at two points, y(a) =
y_a and y(b) = y_b, for which the integral
J = ∫_a^b F(x, y, y') dx
attains an extremum.
where ε is a small amplitude parameter and η(x) a shape function. The function η(x)
must satisfy the kinematic boundary conditions
η(a) = η(b) = 0.
Knowing that the value of the integral J attains an extremum for ε = 0, one can
use ordinary calculus to write the necessary condition
∫_a^b (∂F/∂y · dy/dε + ∂F/∂y' · dy'/dε) dx = 0. (2.2.5)
Using Eq. (2.2.2) and defining ε(dJ/dε)|_{ε=0} to be the first variation of the functional
J, denoted by δJ, we obtain
δJ = ∫_a^b (∂F/∂y δy + ∂F/∂y' δy') dx = 0. (2.2.6)
J = ∫_a^b F(x, y_1, y_2, y_1', y_2', y_2'') dx, (2.2.8)
then the condition that the variation of the functional is zero may be written as
δJ = ∫_a^b (∂F/∂y_1 δy_1 + ∂F/∂y_1' δy_1' + ∂F/∂y_2 δy_2 + ∂F/∂y_2' δy_2' + ∂F/∂y_2'' δy_2'') dx = 0. (2.2.9)
The necessary condition for an extremum expressed in the form of Eq. (2.2.6) or
(2.2.9) is usually not very useful. The terms that involve variations of derivatives
can be integrated by parts in order to obtain more useful conditions. For example,
integrating the second term of Eq. (2.2.6) by parts and rearranging we write
∫_a^b (∂F/∂y') δy' dx = [(∂F/∂y') δy]_a^b - ∫_a^b (d/dx)(∂F/∂y') δy dx. (2.2.10)
For our problem the first term on the right hand side of Eq. (2.2.10) vanishes due
to the fact that the arbitrary function η(x) satisfies the boundary conditions, η(a) =
η(b) = 0. By the definition of the variation it follows that
δy(a) = δy(b) = 0. (2.2.11)
Substituting Eq. (2.2.10) into Eq. (2.2.6) then gives
δJ = ∫_a^b [∂F/∂y - (d/dx)(∂F/∂y')] δy dx = 0, (2.2.12)
and, since δy is arbitrary, the Euler-Lagrange equation
∂F/∂y - (d/dx)(∂F/∂y') = 0 (2.2.13)
must be satisfied.
If the value of the unknown function is not specified at either or both ends, then
the variation of y(x) need not vanish at those points. However, the first term on
the right hand side of Eq. (2.2.10) must still vanish independently, in order for the
relation to hold. That is, if y(x) is not prescribed at the end points the following
conditions, often called the natural boundary conditions, must be satisfied:
[∂F/∂y']_{x=a} = 0, and [∂F/∂y']_{x=b} = 0. (2.2.14)
Example 2.2.1
Consider a cable suspended between two supports A and B placed at x = -l/2 and
x = l/2. The equilibrium shape of the hanging cable minimizes its potential energy,
which we write as a function of the shape y(x) as
J = ∫ ρg y ds, (2.2.15)
where ρg is the weight per unit length and ds is an element of arc length of the cable.
Relating the arc length to the horizontal coordinate x, with the origin at the center,
we rewrite Eq. (2.2.15) as
J = ρg ∫_{-l/2}^{l/2} y (1 + y'²)^{1/2} dx. (2.2.16)
At this point one can either take the variation of Eq. (2.2.16) or, since this is a
fixed-end-point problem, apply the Euler-Lagrange equation of Eq. (2.2.13) derived
previously. The resulting necessary condition for the potential energy to be minimum
reduces to the following ordinary differential equation
(1 + y'²)^{1/2} - (d/dx) [y y'/(1 + y'²)^{1/2}] = 0. (2.2.17)
Expanding the second term and rearranging, we simplify Eq. (2.2.17) to
y y'' = 1 + y'². (2.2.18)
Substituting t = dy/dx, so that y'' = t dt/dy, separates the variables as
t dt/(t² + 1) = dy/y. (2.2.19)
Integrating both sides and solving for t we obtain
t = dy/dx = (y²/c_1² - 1)^{1/2}, (2.2.20)
which integrates to
y = c_1 cosh[(x + c_2)/c_1]. (2.2.21)
The condition
(dy/dx)|_{x=0} = 0 (2.2.22)
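A quick finite-difference check confirms that the catenary y = c_1 cosh(x/c_1) satisfies the relation y y'' = 1 + (y')², whose separated form is Eq. (2.2.19); the value c_1 = 2 is an arbitrary illustrative constant:

```python
import math

# Finite-difference check that the catenary y = c1*cosh(x/c1) satisfies
# y*y'' = 1 + (y')**2, the relation behind Eq. (2.2.19).  c1 = 2 is an
# arbitrary illustrative choice.

c1 = 2.0

def y(x):
    return c1 * math.cosh(x / c1)

h = 1e-5
for x in (-1.0, 0.0, 0.7, 1.5):
    yp = (y(x + h) - y(x - h)) / (2.0 * h)            # central first derivative
    ypp = (y(x + h) - 2.0 * y(x) + y(x - h)) / h**2   # central second derivative
    assert abs(y(x) * ypp - (1.0 + yp**2)) < 1e-4
```

Checks of this kind, comparing a closed-form classical solution against a discretized residual, are exactly how such solutions are used to validate numerical structural optimization codes.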
Section 2.3: Classical Methods for Constrained Problems
2.3.1 Method of Lagrange Multipliers
However, now the derivative terms cannot be set to zero individually because the
differential changes in the design variables (dx_1, dx_2, ..., dx_n) are dependent on one
another through the constraint equations.
For simplicity, assume only a single constraint relation h(x) = 0; the differential
changes in the design variables are then related through
dh = (∂h/∂x_1)dx_1 + (∂h/∂x_2)dx_2 + ⋯ + (∂h/∂x_n)dx_n = 0. (2.3.6)
We can multiply Eq. (2.3.6) by an arbitrary (for the time being) constant, λ, and
add it to Eq. (2.3.5) to obtain (see [4])
(∂f/∂x_1 + λ ∂h/∂x_1)dx_1 + (∂f/∂x_2 + λ ∂h/∂x_2)dx_2 + ⋯ + (∂f/∂x_n + λ ∂h/∂x_n)dx_n = 0. (2.3.7)
Let λ be determined so that the quantities inside each of the parentheses vanish
to satisfy the previous equation. This leads to n equations for the n + 1 unknowns:
the n design variables and the unknown multiplier λ, called the Lagrange multiplier.
The constraint relation h(x) = 0 provides the requisite (n + 1)th relation. Equations
(2.3.7) and (2.3.2) are exactly what one would obtain by an unconstrained mini-
mization of an auxiliary function f + λh with respect to the design variables and the
Lagrange multiplier λ.
For multiple constraint functions, one has to introduce a Lagrange multiplier for
each of the constraint functions. Therefore, in general an optimization problem with
an objective function of n design variables plus n_e equality constraints stated in
Eq. (2.3.1) is equivalent to an unconstrained problem with an auxiliary function
𝓛(x, λ) = f(x) + Σ_{j=1}^{n_e} λ_j h_j. (2.3.8)
The optimum values of the design variables can be obtained by solving the system of
n + n_e equations
∂𝓛/∂x_i = 0, i = 1, ..., n,
∂𝓛/∂λ_j = h_j = 0, j = 1, ..., n_e, (2.3.9)
for the n + n_e unknowns.
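For a quadratic objective with a single linear constraint, the system of Eq. (2.3.9) can be solved by hand; the problem below is a made-up illustration:

```python
# Made-up illustration of Eq. (2.3.9): minimize f = x1**2 + x2**2 subject to
# h = x1 + x2 - 1 = 0.  Stationarity of L = f + lam*h gives
#   dL/dx1 = 2*x1 + lam = 0,  dL/dx2 = 2*x2 + lam = 0,  dL/dlam = h = 0,
# so x1 = x2 = -lam/2, and the constraint then fixes lam = -1.

lam = -1.0
x1 = x2 = -lam / 2.0

assert abs(2.0 * x1 + lam) < 1e-12       # stationarity in x1
assert abs(2.0 * x2 + lam) < 1e-12       # stationarity in x2
assert abs(x1 + x2 - 1.0) < 1e-12        # constraint satisfied
assert abs(x1**2 + x2**2 - 0.5) < 1e-12  # constrained minimum f* = 1/2
```

Note that all n + 1 equations are satisfied simultaneously; the multiplier is not chosen in advance but emerges as part of the solution.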
34
Section 2.3: Classical Methods for Constrained Problems
Example 2.3.1
δW*_I + δW*_E = 0. (2.3.10)
Here δW*_I is the internal complementary virtual work and δW*_E is the complementary
virtual work of the external forces. The dummy-load method starts by applying
a unit virtual load at the point of unknown displacement along the displacement
component of interest. The internal complementary virtual work under this loading
can be expressed as
δW*_I = -δU* = -∫_V δσ_ij ε_ij dV, (2.3.11)
where δU* is the complementary strain energy, ε_ij is the strain field under the actual
loads, and δσ_ij is the virtual stress field due to the dummy load. In the absence of
body forces, the complementary virtual work of the external forces can be expressed as
δW*_E = ∫_S u_i δt_i dS, (2.3.12)
where u_i are the components of the surface displacements and δt_i are the components
of the applied virtual tractions. For a two dimensional truss structure with n constant-
cross-section members, Eqs. (2.3.10), (2.3.11), and (2.3.12) yield
Δ × 1 = Σ_{i=1}^n δσ_i ε_i L_i A_i, (2.3.13)
where L_i is the length of the ith member, ε_i is the strain due to the actual loads, and
δσ_i is the dummy stress in the ith member. Relating the stresses and strains to the
design variables, we can rewrite Eq. (2.3.13) as
Δ = Σ_{i=1}^n f_i F_i L_i/(A_i E_i), (2.3.14)
where f_i and F_i are the dummy and actual internal forces in the ith member, respec-
tively, and E_i is the elastic modulus of the ith member.
We can now formulate the design problem in the standard form of Eq. (2.3.1) as
minimize V = Σ_{i=1}^n A_i L_i
subject to Σ_{i=1}^n f_i F_i L_i/(A_i E_i) - Δ = 0. (2.3.15)
The auxiliary function of Eq. (2.3.8) is then
𝓛(A, λ) = Σ_{i=1}^n A_i L_i + λ (Σ_{i=1}^n f_i F_i L_i/(A_i E_i) - Δ). (2.3.16)
Then the necessary conditions for an extremum are given by the following set of
equations:
∂𝓛/∂A_i = L_i - λ f_i F_i L_i/(A_i² E_i) = 0, (2.3.17)
∂𝓛/∂λ = Σ_{i=1}^n f_i F_i L_i/(A_i E_i) - Δ = 0. (2.3.18)
Solving for the cross-sectional areas from Eq. (2.3.17) in terms of the Lagrange
multiplier and substituting back into Eq. (2.3.18), we can determine the value of the
Lagrange multiplier in terms of the specified displacement Δ as
λ = [(1/Δ) Σ_{i=1}^n L_i (f_i F_i/E_i)^{1/2}]². (2.3.19)
A_i = (λ f_i F_i/E_i)^{1/2} = [(1/Δ) Σ_{j=1}^n L_j (f_j F_j/E_j)^{1/2}] (f_i F_i/E_i)^{1/2}. (2.3.20)
Note that the term inside the square brackets is a constant. We determine the
corresponding total volume of material by substituting Eq. (2.3.20) into the objective
function to obtain
V = (1/Δ) [Σ_{i=1}^n L_i (f_i F_i/E_i)^{1/2}]². (2.3.21)
•••
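The pattern of this solution, member areas proportional to (f_i F_i/E_i)^{1/2} and scaled so that the displacement constraint is met exactly, can be verified numerically; the member data below are made up for illustration, with the products f_i F_i taken nonnegative so the square roots are real:

```python
import math

# Numerical check (made-up member data) of the solution pattern of Example
# 2.3.1: areas proportional to sqrt(f_i*F_i/E_i), scaled so the prescribed
# displacement Delta is met exactly.

lengths = [30.0, 40.0, 50.0]        # member lengths L_i
F = [1000.0, -800.0, 1200.0]        # actual member forces F_i
f = [0.6, -0.4, 0.8]                # dummy (unit-load) member forces f_i
E = [30e6, 30e6, 10e6]              # elastic moduli E_i
Delta = 0.05                        # prescribed displacement

# scale factor: (1/Delta) * sum_j L_j * sqrt(f_j*F_j/E_j)
k = sum(Lj * math.sqrt(fj * Fj / Ej)
        for Lj, fj, Fj, Ej in zip(lengths, f, F, E)) / Delta
A = [k * math.sqrt(fi * Fi / Ei) for fi, Fi, Ei in zip(f, F, E)]

# the dummy-load displacement sum f_i*F_i*L_i/(A_i*E_i) must equal Delta
disp = sum(fi * Fi * Li / (Ai * Ei)
           for fi, Fi, Li, Ai, Ei in zip(f, F, lengths, A, E))
assert abs(disp - Delta) < 1e-9
```

The check works for any member data with f_i F_i ≥ 0 because the scaling makes the constraint an identity; the optimality of this scaling is what the Lagrange multiplier derivation establishes.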
For problems in which the unknown design variables are functions constrained by
functionals, variational calculus also employs Lagrange multipliers. Recall that for
the supported cable problem the Euler-Lagrange equation was obtained by allowing
the variation δy of the cable shape function to be arbitrary, or in other words by
allowing y(x) to be completely unconstrained except for the kinematic boundary
conditions. However, if the function y(x) is required to satisfy a subsidiary integral
constraint of the form
∫_a^b g[y(x)] dx = c, (2.3.22)
then the extremum of the functional J[y(x)] can be determined by the use of the
Lagrange multiplier technique. In this case the necessary condition for an extremum
is the vanishing of the first variation of an auxiliary functional
𝓛* = J[y(x)] + λ (∫_a^b g[y(x)] dx - c). (2.3.23)
In the following example we illustrate the use of this technique for determination
of the cross-sectional area distribution of minimum weight beams for a specified
displacement at a point along the span.
Example 2.3.2
Consider a beam of length l carrying a distributed load p(x) and deflecting by w(x).
We seek the cross-sectional area distribution A(x) of minimum volume such that the
deflection at a point x = ξ equals a prescribed value Δ:
minimize V = ∫_0^l A(x) dx
subject to w(ξ) - Δ = 0. (2.3.24)
By the dummy-load method, the deflection at x = ξ is
w(ξ) = ∫_0^l M(x)m(x)/(E I(x)) dx, (2.3.25)
where m(x) is the moment distribution generated by a unit load applied at x = ξ, E
is the elastic modulus of the beam material, and I(x) is the cross-sectional moment
of inertia. Since the cross-sectional area distribution function of the beam is the
design variable, the moment of inertia term has to be expressed in terms of the area.
Commonly, the beam moment of inertia function is related to the cross-sectional area
function as
I(x) = α[A(x)]ⁿ, (2.3.26)
where α is a constant related to some physical dimension of the cross-section, and
n is a constant that depends on the physical relation between the two functions.
Here we limit the constant n to the integer values of 1,2, or 3. The case of n = 1
is for a rectangular cross-section beam of constant depth whose width varies along
the length. Such a beam is sometimes referred to as a plane-tapered beam. The
case n = 2 is obtained when both the width and the depth of the cross-section vary
without changing its aspect ratio, and finally the case n = 3 is for a cross-section
with a variable depth and a constant width. The latter may be referred to as the
depth-tapered beam.
The auxiliary functional for the minimization problem, Eq. (2.3.24), takes the following form

C = ∫₀^l A(x)dx + λ [ ∫₀^l M(x)m(x)/(aE[A(x)]^n) dx − Δ ] . (2.3.27)

The necessary condition for the constrained minimum is the vanishing of the first variation of this auxiliary functional. At this point we set n = 1 in order to simplify the following derivation. The first variation of Eq. (2.3.27) becomes
δC = ∫₀^l [ 1 − λ M(x)m(x)/(aEA²(x)) ] δA dx = 0 . (2.3.28)
The corresponding Euler-Lagrange equation is

1 − λ M(x)m(x)/(aEA²(x)) = 0 , or A(x) = [ λ M(x)m(x)/(aE) ]^{1/2} . (2.3.29)
The unknown Lagrange multiplier in Eq. (2.3.29) must be determined from the
displacement constraint in Eq. (2.3.24). That is, using Eqs. (2.3.25), (2.3.26), and
(2.3.29) in Eq. (2.3.24) we can extract
λ^{1/2} = (1/Δ) ∫₀^l [ M(x)m(x)/(aE) ]^{1/2} dx . (2.3.30)
Then, the optimal area distribution and the corresponding volume are given by
A*(x) = (1/Δ) [ M(x)m(x)/(aE) ]^{1/2} ∫₀^l [ M(x)m(x)/(aE) ]^{1/2} dx , (2.3.31)
and
V* = (1/Δ) [ ∫₀^l [ M(x)m(x)/(aE) ]^{1/2} dx ]² , (2.3.32)
respectively.•••
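The result can be illustrated numerically. The sketch below is our own construction with hypothetical data, not part of the example: it takes a cantilever with a tip load, for which M(x) = P(l − x) and m(x) = l − x, and evaluates Eqs. (2.3.30)-(2.3.32).

```python
import numpy as np

# Hypothetical data: cantilever of length l with a tip load P, tip
# deflection constrained to Delta; aE is the product in Eq. (2.3.26), n = 1.
l, P, aE, Delta = 1.0, 1.0, 1.0, 0.01

x = np.linspace(0.0, l, 2001)
M = P * (l - x)        # bending moment due to the applied tip load
m = l - x              # moment due to a unit load at xi = l

def trap(y):
    # trapezoidal rule on the uniform grid
    return float(np.sum(y[1:] + y[:-1]) * (x[1] - x[0]) / 2.0)

# Eq. (2.3.30): multiplier from the displacement constraint
sqrt_lam = trap(np.sqrt(M * m / aE)) / Delta

# Eq. (2.3.31): optimal area; Eq. (2.3.32): the corresponding volume
A = sqrt_lam * np.sqrt(M * m / aE)
V = trap(A)

# Check Eq. (2.3.25): the deflection of the optimal design equals Delta.
# The integrand is finite because A vanishes linearly at the tip.
integrand = np.where(A > 0.0, M * m / np.maximum(A, 1e-300), 0.0)
defl = trap(integrand)
```

For this load case the integrals can also be carried out in closed form, giving V = Pl⁴/(4aEΔ), which the numerical sketch reproduces.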
2.3.3 Finite Subsidiary Conditions
The problems discussed in the previous section involve a rather simple integral constraint that requires a constant Lagrange multiplier in the auxiliary functional. In a more general case, as mentioned earlier, we are interested in extremizing functionals of several functions and their derivatives with respect to more than one independent variable [see Eq. (2.2.8)]. In addition, there may be m finite subsidiary constraints of the form

G_j[y₁, y₂, ... ; x₁, ..., x_n] = 0 , j = 1, ..., m , (2.3.33)

imposed on the problem. These constraints may range from simple algebraic equations to highly complicated differential equations that must be satisfied at every point over the entire domain of the problem. Each constraint is adjoined to the functional with its own Lagrange multiplier, giving the auxiliary functional

C = ∫_D [ F + Σ_{j=1}^m λ_j G_j ] dx₁ ... dx_n . (2.3.34)
The Lagrange multipliers, however, are no longer constants but functions of the
coordinates Xl, ... , X n .
Example 2.3.3
The problem described above can be best illustrated by a design example of a cantilever beam of prescribed volume and prescribed loads for minimum deflection. Except for a slight change of notation, this example is based upon Makky and Ghalib's solution [7].
Figure 2.3.3 Optimum Design of a Beam for Minimum Deflection.
Figure 2.3.3 shows an elastic cantilever beam fixed at the end x = 0, free at the end x = l, and acted upon by a specified distribution of transverse loading q(x) per unit length. The objective is to minimize some norm of the transverse displacement of the beam for a given total volume, V₀. The norm we choose is the integral of the transverse displacement w over the length of the beam. The loading q(x) is restricted to be unidirectional in order to render the norm appropriate.
The functional to be minimized, in this case, is an integral of the displacement
field w(x) which must satisfy the equation of equilibrium of the beam as well as the
constraint on the total volume of material. The equation of equilibrium is expressed
as
[s(x)w"]" - q(x) = 0, (2.3.35)
with boundary conditions
at x = 0: w = 0, and w' = 0, (2.3.36)
at x = l: sw'' = 0, and s'w'' + sw''' = 0, (2.3.37)

s(x) being the bending stiffness of the beam that can be related, through Eq. (2.3.26), to the cross-sectional area of the beam by

s(x) = EI(x) = aEA^n(x) , n = 1, 2, or 3. (2.3.38)
Equations (2.3.43) through (2.3.46) together with the associated boundary conditions
are general enough that they apply to simply supported as well as to clamped beams.
For the cantilever beam the boundary conditions are Eqs. (2.3.36) and (2.3.37).
Since the bending moment and the shear force at x = 0 cannot vanish because of the
unidirectional nature of the applied loading, the above conditions reduce to
We can integrate Eqs. (2.3.43) and (2.3.46) twice and make use of both boundary
conditions of Eqs. (2.3.37) and (2.3.54) to get
sλ'' = −(1/2)(x − l)² , (2.3.55)
and

sw'' = p(x) , (2.3.56)

p(x) being the bending moment produced by the applied loading, from which

λ''w'' = −(x − l)²p(x) / (2[s(x)]²) . (2.3.57)
Combining the last equation with the second Euler-Lagrange equation (2.3.44), we obtain

s²(x) dA/ds = (x − l)²p(x) / (2λ₁) , (2.3.58)

λ₁ being the constant multiplier of the volume constraint. For the plane-tapered beam (n = 1),

dA/ds = 1/(aE) = c² = constant . (2.3.60)
Combining Eqs. (2.3.58) and (2.3.60), for a uniform load q(x) = q₀ [so that p(x) = q₀(x − l)²/2], gives

s(x) = [(x − l)²/(2c)] (q₀/λ₁)^{1/2} . (2.3.61)
Substituting Eq. (2.3.64) into Eq. (2.3.59) and integrating it twice, we obtain the deflection function corresponding to the optimal beam as

w(x) = [c²q₀l³/(12V₀)] x² , (2.3.65)
where the boundary conditions in Eq. (2.3.36) were used. The constant c for a
rectangular plane-tapered beam with constant thickness h and varying width b(x) is
c² = 12/(Eh²) , (2.3.66)

so that the deflection of the optimal design becomes

w(x) = [q₀l³/(Eh²V₀)] x² . (2.3.67)
For comparison, consider an equivalent uniform beam of the same total volume
Vo, length l, constant thickness h, but a constant width
b₀ = V₀/(hl) . (2.3.68)
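To quantify the comparison, the deflections of the two designs can be evaluated numerically. The following sketch is our own: it assumes the optimal-beam deflection of Eqs. (2.3.65)-(2.3.67) together with the standard uniform-cantilever deflection formula, with illustrative parameter values.

```python
import numpy as np

# Illustrative data (not from the text): uniform load q0 on a cantilever of
# length l, total material volume V0, constant depth h, Young's modulus E.
q0, l, V0, h, E = 1.0, 1.0, 0.1, 0.05, 1.0

x = np.linspace(0.0, l, 2001)

# optimal plane-tapered beam: w = q0*l^3*x^2/(E*h^2*V0), Eqs. (2.3.65)-(2.3.67)
w_opt = q0 * l**3 * x**2 / (E * h**2 * V0)

# uniform beam of equal volume: width b0 = V0/(h*l), Eq. (2.3.68)
I0 = (V0 / (h * l)) * h**3 / 12.0
w_uni = q0 * x**2 * (6.0 * l**2 - 4.0 * l * x + x**2) / (24.0 * E * I0)

def trap(y):
    # trapezoidal rule on the uniform grid
    return float(np.sum(y[1:] + y[:-1]) * (x[1] - x[0]) / 2.0)

# ratio of mean deflections (the norm used in this example)
ratio = trap(w_opt) / trap(w_uni)
```

Carrying out the two integrals analytically shows the ratio is 5/9 regardless of the parameter values, i.e. the tapered design reduces the mean deflection by about 44% for the same volume.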
Section 2.4: Local Constraints and the Minmax Approach
The auxiliary functional of Eq. (2.3.34) becomes

C = ∫_V [ f + Σ_{j=1}^m λ_j (g_j − t_j²) ] dv . (2.4.3)
This equation implies that the Lagrange multipliers are equal to zero when the slack
variables are not zero. That is, the Lagrange multipliers are zero at points in the
design space where the corresponding constraint is not critical. Equation (2.4.4) may also be written as

λ_j g_j = 0 , j = 1, ..., m , (2.4.5)

because t_j = 0 if and only if g_j = 0. It can be shown that if we use Eq. (2.4.5), which
is called a constraint qualification equation, we can dispense with the slack functions
in the auxiliary functional. When we do that, we also dispense with the variation
of the auxiliary functional with respect to the Lagrange multiplier, and instead add
the inequality constraints to the optimality conditions. This treatment of inequality
constraints is demonstrated in the following example.
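The complementary-slackness condition (2.4.5) can first be seen on a one-variable toy problem. The problem below and its numbers are our own hypothetical illustration, using the common sign convention L = f − λg for a constraint g ≥ 0.

```python
# Toy problem: minimize f(x) = x**2 subject to g(x) = x - 1 >= 0.
# Stationarity of L = f - lam*g gives 2*x - lam = 0, and the condition
# lam*g = 0 of Eq. (2.4.5) leaves two cases to examine.

# Case 1: constraint not critical (lam = 0) -> x = 0, but g(0) = -1 < 0.
g_inactive = 0.0 - 1.0      # infeasible, so this case is rejected

# Case 2: constraint critical (g = 0) -> x = 1, lam = 2*x = 2 >= 0.
x_opt = 1.0
lam = 2.0 * x_opt

# Complementary slackness holds at the optimum: lam * g(x_opt) = 0.
slack_product = lam * (x_opt - 1.0)
```

The multiplier is nonzero exactly because the constraint is critical at the optimum, mirroring the statement above that λ vanishes wherever the corresponding constraint is not critical.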
Example 2.4.1
Figure 2.4.1 Hanging cable: (a) general cross section; and (b) two constant-area
segments.
The cable in Figure 2.4.1(a) is loaded by a hanging weight W plus its own self-weight. The cross-sectional area A(x) of the cable is to be designed for minimum volume, subject to the constraint that the stress does not exceed an allowable value of σ₀, and the cross-sectional area is not less than a minimum, A₀. We assume that
minimize ∫₀^l A(x)dx

such that A(x)σ₀ − P(x) ≥ 0 , (2.4.8)
A − A₀ ≥ 0 ,
and P' + ρA = 0 ,

where P(x) is the tensile force in the cable and ρ is the specific weight of the cable material. The auxiliary functional is

£(A(x), P(x), λ₁(x), λ₂(x), λ₃(x)) = ∫₀^l A dx + ∫₀^l λ₁(Aσ₀ − P)dx + ∫₀^l λ₂(A − A₀)dx + ∫₀^l λ₃(P' + ρA)dx .
We integrate the term including OP' by parts to convert it to oP, and then set the
coefficients of oA and OP to zero to obtain
1 + σ₀λ₁ + λ₂ + ρλ₃ = 0 , (2.4.11)
λ₁ + λ₃' = 0 . (2.4.12)
These equations are augmented with the two inequalities
Aao - P 2: 0, (2.4.13)
A - Ao 2: 0, (2.4.14)
the constraint qualification equations
λ₁(Aσ₀ − P) = 0 , (2.4.15)
P = W + ρ(l − x)A₀ . (2.4.21)

This solution becomes invalid when P exceeds A₀σ₀, which from Eq. (2.4.21) happens at x = x_t,

x_t = l − (A₀σ₀ − W)/(ρA₀) . (2.4.22)

For x < x_t we have A > A₀, so that P = Aσ₀, and Eq. (2.4.7) can be replaced by

A'σ₀ + ρA = 0 . (2.4.23)

This equation is easily solved to yield

A(x) = A₀ e^{ρ(x_t − x)/σ₀} , x < x_t . (2.4.24)
•••
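The two-regime solution above can be sketched numerically. The values below are illustrative assumptions, not from the text, with x measured from the upper support.

```python
import math

# Illustrative data: allowable stress sigma0, specific weight rho, hanging
# weight W, minimum gage A0, and cable length ell.
sigma0, rho, W, A0, ell = 100.0, 1.0, 50.0, 1.0, 60.0

# Eq. (2.4.22): point where the minimum-gage segment becomes fully stressed
xt = ell - (A0 * sigma0 - W) / (rho * A0)

def area(x):
    if x >= xt:                 # lower segment at minimum gage, Eq. (2.4.14)
        return A0
    return A0 * math.exp(rho * (xt - x) / sigma0)     # Eq. (2.4.24)

def force(x):
    if x >= xt:                 # weight carried below x, Eq. (2.4.21)
        return W + rho * (ell - x) * A0
    return sigma0 * area(x)     # fully stressed segment: P = sigma0 * A

stresses = [force(x) / area(x) for x in (0.0, 0.5 * xt, xt, 0.5 * (xt + ell), ell)]
```

The checks confirm continuity of the cable force at x_t, stresses nowhere above σ₀, and the equilibrium equation P' + ρA = 0 in the fully stressed segment.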
Another formulation of the problem in Example 2.4.1 is to find a cable with a
given volume that has the lowest possible stress. The objective function is
min_{A(x)} max_{0≤x≤l} σ(x) . (2.4.25)
This is an example of the so-called min-max problems that are common in structural
optimization. Min-max problems present a difficulty in that the maximum of a func-
tion does not have continuous derivatives. This can be seen by considering even the
simplest case of the maximum of a function at two points. Consider, for example,
the case when the cross-sectional area of the cable has to be piecewise constant to
keep down manufacturing cost. Figure 2.4.1(b) shows a case where the number of
segments is limited to two, and the design variables are the two cross-sectional areas
Al and A 2 .
Example 2.4.2
Formulate the problem of designing the cable of Figure 2.4.1(a) for minimum maximum stress, subject to a limit V₀ on the available volume, and a lower limit A₀ on
the cross-sectional area.
Section 2.5: Necessary and Sufficient Conditions for Optimality
∫₀^l A(x)dx = V₀ , (2.4.27)
and P' + ρA = 0 .
Note that we have formulated the volume constraint as an equality rather than an
inequality constraint, because common sense tells us that the volume will be fully
utilized in order to minimize the stress. Next we replace the min-max formulation
with the Taylor-Bendsøe 'beta' formulation

minimize β

such that A(x)β − P(x) ≥ 0 ,
A − A₀ ≥ 0 ,
∫₀^l A(x)dx = V₀ , (2.4.28)
and P' + ρA = 0 .
2.5 Necessary and Sufficient Conditions for Optimality

The development in this section is based on the work of Prager and his collaborators
(see Refs. 9,10).
Consider an elastic structure being acted upon by a load 2P at a point X. Assume the load is such that it produces a unit displacement in its direction. Then by the principle of conservation of energy [11]

(1/2)(2P × 1) = P = ∫_V s(X)e[Q(X)]dv , (2.5.1)
where e[Q(X)] is the specific elastic strain energy or the strain energy in a structure of
unit stiffness due to a strain field Q(X) produced by the prescribed unit displacement
at X, and s(X) is the specific stiffness of the structure at X. That is, s(X) is the
stiffness per unit length for one-dimensional structures and stiffness per unit of area
for two-dimensional structures.
Thus s(X) specifies the design of the structure while the function e[Q(X)] is independent of design parameters. For instance s(X) and e[Q(X)] for a one-dimensional beam element would be EI(x) and (1/2)(curvature)², respectively.
We wish to design a structure of a given total stiffness so as to maximize the
magnitude P of the load producing the prescribed unit displacement at X. From
Eq. (2.5.1) it is clear that maximizing P subject to the integral constraint on specific stiffness

∫_V s(X)dv = s₀ , (2.5.2)

may be accomplished by seeking the stationary points of the auxiliary functional

P* = ∫_V s(X)e[Q(X)]dv − λ [ ∫_V s(X)dv − s₀ ] , (2.5.3)

whose first variation is

δP* = ∫_V [ (e[Q(X)] − λ)δs + s δe ] dv = 0 . (2.5.4)
Since the structure is required to satisfy the equations of equilibrium for every
structural design, then by the principle of minimum strain energy (which is a special
case of the principle of minimum potential energy for prescribed displacements) the second term within the first integral vanishes, yielding

e[Q(X)] = λ = constant . (2.5.6)
Equation (2.5.6) is the necessary condition for optimality. That is, the stiffness of an elastic structure is stationary for a given structural design if the specific elastic strain energy is constant throughout the structure. We wish to examine if it is also sufficient. To answer this question we assume two distinct designs s and s̄ with associated specific strain energies e[Q(X)] and ē[Q̄(X)], both satisfying the constant total stiffness constraint

∫_V s(X)dv = ∫_V s̄(X)dv = s₀ .
Since Q̄(X) is also a kinematically admissible strain field for the design s, if we replace Q(X) in the definition of P with Q̄(X) we are guaranteed by the principle of minimum strain energy that P ≤ ∫_V s(X)ē[Q̄(X)]dv. Thus

P − P̄ ≤ ∫_V s(X)ē[Q̄(X)]dv − ∫_V s̄(X)ē[Q̄(X)]dv . (2.5.11)

If the design s̄ satisfies the optimality condition (2.5.6), so that ē = λ is constant, the right-hand side reduces to λ ∫_V (s − s̄)dv = 0 by virtue of Eq. (2.5.2); hence P ≤ P̄. This implies that condition (2.5.6) is not only a necessary but also a sufficient condition for optimality.
p = max_{I(x)} min_{w(x)} [ ∫₀^l EI(x)w''²dx / ∫₀^l w'²dx ] = max_{A(x)} min_{w(x)} [ ∫₀^l Ea[A(x)]^n w''²dx / ∫₀^l w'²dx ] , (2.5.14)
subject to the constant volume constraint

∫₀^l A(x)dx = V₀ . (2.5.15)

The Lagrangian is

£ = max_{A(x)} min_{w(x)} [ ∫₀^l Ea[A(x)]^n w''²dx / ∫₀^l w'²dx ] − λ [ ∫₀^l A(x)dx − V₀ ] . (2.5.16)
The necessary conditions for stability and optimality can be determined by requiring
the first variation of the Lagrangian to vanish, that is
δ£ = [ 2∫₀^l Ea[A(x)]^n w'' δw'' dx / ∫₀^l w'²dx ] − [ 2∫₀^l Ea[A(x)]^n w''²dx / (∫₀^l w'²dx)² ] ∫₀^l w'δw' dx + [ ∫₀^l nEa[A(x)]^{n−1} w''² δA dx / ∫₀^l w'²dx ] − λ ∫₀^l δA dx = 0 . (2.5.17)
Since the second term in equation (2.5.21) is a constant, the equation can be simplified
to
nEa[A(x)]^{n−1}w''² = λ ∫₀^l w'²dx , (2.5.22)
and this can be verified to be a statement of constant strain energy density in the
buckled mode shape of the optimum column.
The sufficiency of the optimality condition can be very easily established for the
case n=l. For this case Eq. (2.5.22) reduces to
Eaw''² = λ ∫₀^l w'²dx = constant . (2.5.23)
We begin by assuming two distinct designs A(x) and Ā(x), both of which satisfy the constant volume constraint (2.5.15) to yield
∫₀^l (A − Ā)dx = 0 . (2.5.24)
The corresponding buckling loads Pcr and P̄cr with associated buckling modes w and w̄ are given by

Pcr = ∫₀^l EaAw''²dx / ∫₀^l w'²dx , P̄cr = ∫₀^l EaĀw̄''²dx / ∫₀^l w̄'²dx . (2.5.25)
Since the buckling mode w is also kinematically admissible for design Ā(x), by the Rayleigh quotient, Eq. (2.5.14), the quantity p̄ defined by

p̄ = ∫₀^l EaĀw''²dx / ∫₀^l w'²dx (2.5.26)

cannot be less than P̄cr; that is, P̄cr ≤ p̄.
Thus,

Pcr − P̄cr ≥ Pcr − p̄ = [ ∫₀^l Eaw''²(A − Ā)dx ] / ∫₀^l w'²dx . (2.5.29)
If the design A(x) satisfies the optimality condition (2.5.23), then by virtue of Eq. (2.5.24)

Pcr − P̄cr ≥ 0 , (2.5.30)
meaning that of all the designs with different cross-sectional shapes the one that
satisfies the optimality condition has the largest value of the critical load, thereby
establishing the sufficiency of the optimality condition.
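The n = 1 case also lends itself to a numerical check. The sketch below is our own construction (the parameter values and the quadratic trial design are assumptions): for a simply supported column with n = 1 the moment-curvature relation gives EaA(x)w'' = −pw, so the buckling load is the smallest eigenvalue of a finite-difference operator, and we compare the uniform column with the equal-volume quadratic distribution A(x) = 6V₀x(l − x)/l³, which satisfies the constant-w'' optimality condition.

```python
import numpy as np

# Hypothetical data and discretization parameters
Ea, V0, l, N = 1.0, 1.0, 1.0, 400

x = np.linspace(0.0, l, N + 1)[1:-1]       # interior nodes, w = 0 at the ends
h = l / N
D2 = (np.diag(np.full(N - 2, 1.0), -1) - 2.0 * np.eye(N - 1)
      + np.diag(np.full(N - 2, 1.0), 1)) / h**2

def p_cr(A):
    # smallest eigenvalue of -Ea*diag(A)*D2, the discrete buckling load
    vals = np.linalg.eigvals(-Ea * np.diag(A) @ D2)
    return float(np.min(vals.real))

p_uniform = p_cr(np.full(N - 1, V0 / l))             # A = V0/l everywhere
p_optimal = p_cr(6.0 * V0 * x * (l - x) / l**3)      # quadratic, same volume
ratio = p_optimal / p_uniform
```

The computed ratio approaches 12/π² ≈ 1.22, the value quoted for the n = 1 optimum at the end of the series-solution example of Section 2.6.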
Prager and Taylor [9] provide a similar sufficiency proof for the dual problem, namely the case of minimizing the volume or weight of an Euler-Bernoulli column for a given buckling load.
Although it is difficult to prove the sufficiency of the optimality condition for values of n other than 1, explicit solutions for the optimum designs for all classical boundary conditions are well known and are available from Refs. 12-16. Approximate numerical solutions using the finite element displacement models have also been reported by Refs. 17-20 for elastically supported columns with a very general distributed axial loading and for portal frames.
Earlier works, especially those of Tadjbaksh and Keller [13], assumed unimodal behavior and did not allow for discontinuity in the slope and the shear force at places where the area of cross-section vanished. Olhoff and Rasmussen [21] have shown that the design of Tadjbaksh and Keller [13] for the clamped column is non-optimal and have outlined more accurate bimodal numerical solutions with a constraint on the minimum cross-sectional area. Olhoff and Rasmussen identify a threshold value for the minimum area constraint below which the optimum clamped columns exhibit a bimodal behavior. Papers by Masur [22,23], Olhoff [24], and by Plaut, Johnson, and Olhoff [25] outline less approximate and properly formulated multi-modal solutions for the elastically supported columns.
Example 2.5.1
By way of illustration we outline the solution for one of the classical cases here while relegating others to the exercises. Consider maximizing the critical load of a simply-supported column of length l subject to the constant volume constraint, Eq. (2.5.15). An explicit solution to this problem was first outlined in [19]. We begin by listing the governing equations and boundary conditions of the problem.
The differential equation of (2.5.36) and the associated boundary conditions can be
solved by using a change of variables. Letting
A = u^{1/2} , (2.5.38)

the equation can be integrated once to give

u'² = c₁ − 4β²u^{1/2} , (2.5.39)

c₁ being a constant of integration. The above equation can be integrated once more giving

|x − c₂| = ∫ du / (c₁ − 4β²u^{1/2})^{1/2} . (2.5.40)
Using another change of variables with c₁ − 4β²u^{1/2} = t we can integrate the right-hand side of this equation once more to give

|x − c₂| = (3c₁ − t)t^{1/2} / (12β⁴) . (2.5.41)

The two constants of integration, namely c₁ and c₂, can be determined by using the boundary condition given in Eq. (2.5.35), which yields

c₂ = l/2 , c₁^{3/2} = 3β⁴l , (2.5.42)

and consequently

c₁ = (3β⁴l)^{2/3} . (2.5.43)
The optimal value of the cross-sectional area at any point along the length of the
column can, therefore, be determined from Eq. (2.5.41).
To determine the critical load parameter β we use the volume constraint

∫₀^l A(x)dx = V₀ = 2 ∫₀^{u(l/2)} u^{1/2} (dx/du) du = 2 ∫₀^{u(l/2)} u^{1/2} / (c₁ − 4β²u^{1/2})^{1/2} du . (2.5.45)
Recalling the definition of u, we can find the value of u(l/2) from Eqs. (2.5.41) and (2.5.43) as

u(l/2) = c₁² / (16β⁴) . (2.5.47)
Substituting Eq. (2.5.47) and the value of the constant c₁ from Eq. (2.5.43) into Eq. (2.5.46) we determine the optimum value of the load parameter and the critical load to be

β²opt = (15V₀)³ / (243 l⁵) , and (Pcr)opt = (125/9) EaV₀³/l⁵ . (2.5.48)
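The chain of reconstructed intermediate results above can be verified numerically. The sketch below is our own consistency check, not part of the original solution: with V₀ = l = Ea = 1 and β² from Eq. (2.5.48), integrating A = u^{1/2} along the column (via the substitutions t = c₁ − 4β²u^{1/2} and t = s²) must recover the prescribed volume.

```python
import numpy as np

V0, l = 1.0, 1.0
beta2 = (15.0 * V0) ** 3 / (243.0 * l**5)     # Eq. (2.5.48); equals 125/9 here
beta4 = beta2**2
c1 = (3.0 * beta4 * l) ** (2.0 / 3.0)         # Eq. (2.5.43)

# With t = c1 - 4*beta^2*u^(1/2) and t = s^2, the volume integrand A dx
# transforms to (c1 - s^2)^2 / (16*beta^6) ds over 0 <= s <= sqrt(c1),
# doubled for the symmetric half of the column.
s = np.linspace(0.0, np.sqrt(c1), 20001)
integrand = (c1 - s**2) ** 2 / (16.0 * beta2**3)
V = 2.0 * float(np.sum(integrand[1:] + integrand[:-1]) * (s[1] - s[0]) / 2.0)
```

The recovered volume matches V₀, closing the loop between Eqs. (2.5.41), (2.5.43), (2.5.45), and (2.5.48).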
The resulting optimal area distribution, plotted as Al/V₀ versus x/l, peaks at midspan and vanishes at the supports.
along the length of the column. However, this is not true for thin plates. The in-plane stress-resultants in the prebuckled state of a thin plate are indeed functions of the thickness distribution. The problem of optimizing thin plates for stability is, therefore, significantly more complicated than that for a column.
The situation is not as bad for thin circular plates, for which the resulting govern-
ing equations (stability and optimality) are ordinary nonlinear differential equations
which can be solved approximately by some numerical schemes like those proposed
by Frauenthal [26].
The problem is more complicated for thin rectangular plates which are gov-
erned by nonlinear partial differential equations. For instance, questions about the
uniqueness of solutions are not as easily answered. Under the assumption of in-
extensional prebuckling deformations, which lead to thickness-independent in-plane
stress-resultants in the pre-buckled state, a condition of uniform strain energy density
has been established as being the optimality condition for such plates [27]. Even so, optimization of plates on the basis of such assumptions has led to unsatisfactory solutions for plates with aspect ratios close to unity.
Armand and Lodier [28] have attempted to explain this difficulty in optimizing
plates by linking it to the existence of infinitely many local extrema rather than
a single global optimum. According to this explanation, the solution obtained by
Frauenthal [26] is only a local optimum in the class of continuous thickness distri-
butions. Simitses [29] has shown that for the same volume, stiffened circular plates
yield much higher buckling loads than Frauenthal's optimum plate. Similarly, Kamat [27], who optimized finite element models of rectangular plates, observed discontinuous thickness distributions that exhibit a tendency toward formation of ribs, and suspected that stiffened plates would be superior. Haftka and Prasad [30], in their
survey paper on optimum structural design with plate bending elements, explain the
radically different designs obtained for the same problem by different researchers by
offering the conjecture that rib-stiffened plates are better than optimum plates with
continuous thickness distributions. Olhoff [31] provides a mathematical justification
for this behavior and for the questions of singularities and local optima in plates. The
reader is referred to the monograph by Gajewski and Zyczkowski [32] for additional
references on this topic.
ω² = max_{A(x)} min_{w(x)} [ ∫₀^l Ea[A(x)]^n w''²dx / ∫₀^l ρA(x)w²dx ] , (2.5.50)
over all kinematically admissible displacement fields [11]. However, even though both
stability and vibration of Euler-Bernoulli beams are governed by a similar eigenvalue
system, the criteria for optimization of freely vibrating Euler-Bernoulli beams are
different from those for Euler-Bernoulli columns. Unlike the case of columns, the denominator of the Rayleigh quotient for a freely vibrating beam involves the structural mass, which is a function of the cross-sectional area.
Consider the problem of the maximization of the fundamental frequency of a
freely vibrating Euler-Bernoulli beam of a specified volume V₀ and specific mass ρ. We again assume that I(x) = a[A(x)]^n, n = 1, 2, or 3.
The equation of motion of the beam and the necessary optimality condition are
then obtained by maximizing the minimum value of the Rayleigh quotient, w 2 , subject
to the constant volume constraint. In other words starting with the Lagrangian
which is a functional of the functions w(x) and A(x), and setting its total variation
with respect to both functions to zero we get
δ£ = [ 2∫₀^l Ea[A(x)]^n w'' δw'' dx / ∫₀^l ρA(x)w²dx ] − [ 2∫₀^l Ea[A(x)]^n w''²dx / (∫₀^l ρA(x)w²dx)² ] ∫₀^l ρA(x)w δw dx + [ ∫₀^l ( nEa[A(x)]^{n−1}w''² − ω²ρw² ) δA dx / ∫₀^l ρA(x)w²dx ] − λ ∫₀^l δA dx = 0 .
Integrating by parts the first term on the right-hand side of the above equation and collecting the coefficients of δw and δA, we obtain the equation of motion and the optimality condition, Eq. (2.5.56).
Equation (2.5.56) can be interpreted to imply that the Lagrangian energy density
must be uniform in the fundamental mode of an optimum vibrating beam.
As with columns the sufficiency of this optimality condition can be easily demon-
strated for the case n = 1. For this case Eq. (2.5.56) reduces to
Eaw''² − ω²ρw² = c = constant . (2.5.57)
We begin by assuming two designs A(x) and Ā(x), both of which satisfy the constant volume constraint, Eq. (2.5.15), and hence also Eq. (2.5.24). Assume ω and ω̄ to be the fundamental frequencies and w and w̄ to be the associated fundamental modes corresponding to the two designs A(x) and Ā(x), respectively. Thus

ω² = ∫₀^l EaAw''²dx / ∫₀^l ρAw²dx , ω̄² = ∫₀^l EaĀw̄''²dx / ∫₀^l ρĀw̄²dx . (2.5.58)

Since the mode w is also kinematically admissible for the design Ā, the Rayleigh quotient evaluated with it bounds ω̄² from above:

ω̃² = ∫₀^l EaĀw''²dx / ∫₀^l ρĀw²dx ≥ ω̄² . (2.5.59)
But

ω² ∫₀^l ρAw²dx = ∫₀^l EaAw''²dx , (2.5.60)

so that

ω̃² ∫₀^l ρĀw²dx − ω² ∫₀^l ρAw²dx = ∫₀^l Ea(Ā − A)w''²dx . (2.5.62)

Now assume that the design A(x) is one that satisfies the optimality condition, Eq. (2.5.57). Equation (2.5.62) can then be written as

ω̃² ∫₀^l ρĀw²dx − ω² ∫₀^l ρAw²dx = ∫₀^l (Ā − A)(c + ω²ρw²)dx , (2.5.63)

which, by virtue of Eq. (2.5.24), reduces to

ω̃² ∫₀^l ρĀw²dx = ω² ∫₀^l ρĀw²dx , or ω̃² = ω² . (2.5.64)

In light of Eq. (2.5.59) it follows that

ω̄² ≤ ω² , (2.5.65)
thereby establishing the sufficiency of the optimality condition of Eq. (2.5.57).
It should be noted that the same optimality condition can be shown to hold for the dual problem of the minimum weight design of the beam for a specified frequency. Several similar examples of optimization with frequency constraints may be found in Refs. 33-38. In particular, Turner [34] and Taylor [35] provide exact solutions for axially vibrating minimum-mass structures at specified natural frequencies.
As in the case of columns, several approximate numerical solutions using the finite
element displacement method are available for maximum fundamental frequency of
elastically supported vibrating beams of fixed weight carrying a combination of con-
centrated and distributed non-structural masses and subjected to upper and lower
bounds on cross-sectional areas. For examples of this kind of approximate design, see Refs. 39 and 40. By comparison, published literature on the more practical dual problem of minimizing the weight of beams for specified lower bounds on natural frequencies and upper and lower bounds on design variables appears to be limited [41]. It is not clear whether the primal and dual problems in this case are always
equivalent [41,42].
In closing this topic of vibrating beams, it is appropriate to point out that the
same optimality condition, Eq. (2.5.57) also applies to the optimum design of sand-
wich beams under the constraint of prescribed deflection at the point of application of
a single concentrated periodic load (e.g. see Icerman [43]). A more general optimal-
ity condition for the constraint of a prescribed deflection at a specified point under
a general distributed loading has been provided by Plaut [44]. For sandwich beams,
Plaut has shown that it is possible to establish the sufficiency of the optimality con-
dition on the basis of the principle of stationary mutual potential energy introduced
by Shield and Prager [45]. A mathematically more rigorous study of this problem
using the dynamic compliance of the structure as a constraint has been provided by
Mroz [46]. A very extensive bibliography on the topic of optimization for dynamic
response may be found in the survey papers referenced in Chapter 1.
Optimum design of Thin Plates for Vibration. The problem of the optimum
design of thin plates for vibration is not beset with the difficulty (encountered in
the design for buckling) associated with the dependence of the prebuckling stress-
resultants on the thickness distribution. That may explain why the problem of the
optimum design of thin plates for vibration appears to have received greater attention than the corresponding problem for stability. Haftka and Prasad [30] have
provided an extensive bibliography on the optimum design of plate bending elements
for vibration.
The solution to the problem of the optimum design of a circular plate for vi-
bration was first provided by Olhoff [47]. Olhoff showed (see Exercise 8) that under
the assumption of a rotationally symmetric lowest mode, the problem reduces to
an ordinary, fourth-order, nonlinear, singular but homogeneous eigenvalue problem.
An approximate numerical solution to this problem was generated, but the solution
so obtained is only a local optimum belonging to the class of continuous thickness
distributions. For the same volume, it is easy to devise stiffened circular plate con-
figurations that possess far higher fundamental vibration frequencies than that of
Olhoff's original solution [47].
Section 2.6: Use of Series Solutions in Structural Optimization
For rectangular plates, the optimum designs of finite element models that allow
discontinuous thickness distributions again exhibit a tendency to distribute the ma-
terial of the plate along discrete ribs [48-50]. For the same volume, a stiffened rectan-
gular plate can be expected to have much higher fundamental frequency of vibration
than that of a plate optimized on the basis of a continuous thickness distribution.
2.6 Use of Series Solutions in Structural Optimization

The methods of calculus of variations discussed in the previous sections are ideally suited for simple problems where the unknowns are design functions such as area distributions. These problems are called distributed parameter optimization problems.
Another approach for solving distributed parameter problems which are not sim-
ple enough to be attacked by the methods of Variational Calculus is the use of series
solutions. The basic idea is to assume a series representation of the unknown design
function within the domain of the structure along with the assumed response func-
tions such as displacements. In general, therefore, the series solution method reduces
continuous mathematical programming problems to discrete ones with a finite number
of design variables. These variables are the coefficients of the series representation of
the unknown design function. This idea was initially presented by Balasubramanyam and Spillers [51] who solved various vibration and buckling problems using Fourier series representation of the cross-sectional area of beam and column structures. A similar procedure was recently used by Parbery [52] to obtain minimum-area shapes for desired torsional and flexural rigidity. The method will be demonstrated by the following example.
Example 2.6.1
The optimum design of a buckling critical simply supported column is repeated in this
example [51] to demonstrate the use of Fourier series approach. As in the examples
discussed earlier there is a fixed material volume constraint, see Eq. (2.5.15). The
objective is to find the cross-sectional area distribution of a plane-tapered column
that maximizes the buckling load. That is, the cross-sectional area distribution is as-
sumed to be related to a change in width (direction perpendicular to the deformation
direction) of a rectangular section with constant depth. This corresponds to n = 1
in Eq. (2.3.26).
We start with the governing stability equation for the problem
EaA(x)w'' + pw = 0 . (2.6.1)
Expanding the unknown quantities in two-term truncated Fourier series we have
w = a₁ sin(πx/L) + a₃ sin(3πx/L) , (2.6.2)

A(x) = β₀ − β₂ cos(2πx/L) . (2.6.3)
Note that the boundary conditions of the column eliminate the need for a constant term in Eq. (2.6.2). Because of the expected symmetry of the mode shape and the cross-sectional area distribution, the a₂ and β₁ terms are omitted from the Fourier series. The selection of the cosine representation for the cross-sectional area makes it possible to reduce the products of Fourier series (Aw'') directly to a single series.
The key strategy in this application is to reduce the number of unknown terms
by substituting these assumed forms into the appropriate equations. Equating the
coefficients of the similar trigonometric functions one obtains algebraic equations that
must be satisfied by these coefficients. For example, using the volume constraint, Eq. (2.5.15), we can show that

β₀ = V₀/L . (2.6.4)
Substituting β₀ back into Eq. (2.6.3) and then using Eq. (2.6.2) we obtain the following product

aA(x)w'' = −(π/L)² [ (aV₀a₁/L + aβ₂a₁/2 − 9aβ₂a₃/2) sin(πx/L) + (−aβ₂a₁/2 + 9aV₀a₃/L) sin(3πx/L) − (9aβ₂a₃/2) sin(5πx/L) ] , (2.6.5)
where the trigonometric identity

sin α cos γ = (1/2)[ sin(α + γ) + sin(α − γ) ] (2.6.6)

has also been used. Using Eq. (2.6.5) in the equilibrium equation and equating the
coefficients of the sine terms we obtain the following algebraic equations.
−E(π/L)² ( aV₀a₁/L + aβ₂a₁/2 − 9aβ₂a₃/2 ) + pa₁ = 0 , (2.6.7)

−E(π/L)² ( 9aV₀a₃/L − aβ₂a₁/2 ) + pa₃ = 0 . (2.6.8)
For a nontrivial solution, the determinant of the coefficient matrix for the unknown mode shape (a₁, a₃)^T must vanish. This results in the following quadratic relation for the buckling load p in terms of the only unknown coefficient β₂ left in the problem

p² − Ea(π/L)² (10V₀/L + β₂/2) p + 9[Ea(π/L)²]² [ (V₀/L)(V₀/L + β₂/2) − β₂²/4 ] = 0 . (2.6.10)
The expression for the critical load is the smaller root of Eq. (2.6.10),

pcr = [Ea π²/(2L²)] { (10V₀/L + β₂/2) − [ (10V₀/L + β₂/2)² − 36( (V₀/L)(V₀/L + β₂/2) − β₂²/4 ) ]^{1/2} } . (2.6.11)

In order to determine the value of β₂ that maximizes the buckling load we take the derivative of Eq. (2.6.11) with respect to the unknown parameter β₂ and equate it to zero. The resulting optimum value of β₂ is

β₂* = 32V₀/(37L) , (2.6.12)

and the corresponding critical load is

p*cr = (45/37) π²EaV₀/L³ = (45/37) p₀cr , (2.6.13)
where p₀cr is the buckling load of the constant-cross-section column of volume V₀. Although 22% stronger than the constant cross-section column of the same volume, this design is inferior to the design obtained in Example 2.5.1. In that example the change in area was achieved by varying the depth of the cross section keeping the width constant (n = 3). Clearly modifying the depth of the cross section is a more effective way of achieving increased buckling resistance. Example 2.5.1 repeated with n = 1 results in a quadratic distribution of the cross-sectional area with a critical load of p*cr = (12/π²) p₀cr , which is almost identical to the result obtained in Eq. (2.6.13).
Moreover, the advantage of this method over other classical methods is in its ability
to deal with more general structural problems under a variety of load conditions that
may not be possible to solve using variational calculus .•••
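The two-term solution is easily checked by a direct scan. The sketch below is our own, with Ea = V₀ = L = 1 assumed for convenience: for each β₂ the buckling load is taken as the smaller root of the determinant quadratic from Eqs. (2.6.7)-(2.6.8), and the maximum over β₂ is located numerically.

```python
import numpy as np

Ea, V0, L = 1.0, 1.0, 1.0
k = Ea * (np.pi / L) ** 2     # shorthand for Ea*(pi/L)^2
mu = V0 / L                   # beta0 from Eq. (2.6.4)

def p_lower(b2):
    # smaller root of the quadratic det(...) = 0, cf. Eq. (2.6.10)
    B = k * (10.0 * mu + 0.5 * b2)
    C = 9.0 * k**2 * (mu * (mu + 0.5 * b2) - 0.25 * b2**2)
    return (B - np.sqrt(B**2 - 4.0 * C)) / 2.0

# scan beta2 over [0, beta0) so that A(x) stays positive
b2_grid = np.linspace(0.0, 0.99 * mu, 100001)
p_vals = p_lower(b2_grid)
i_best = int(np.argmax(p_vals))
b2_opt, p_opt = float(b2_grid[i_best]), float(p_vals[i_best])
```

The scan reproduces β₂* = 32V₀/(37L) and p*cr = (45/37)π²EaV₀/L³ of Eqs. (2.6.12)-(2.6.13), about 22% above the uniform column's π²EaV₀/L³.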
The success of the series solution in optimization is closely related to the form
of the series chosen for the representation of the unknown function. In order to keep
the number of design variables to a minimum, only a few terms in the series represen-
tation should be used. But, with a small number of terms used in the series, the
approximation of the solution of the governing differential equations of the problem
may be poor. Selection of the two-term approximation for the mode shape in the
example just covered makes it possible to come up with a one parameter solution for
the maximum buckling load in a closed form. However, it is important to note that
the two-term solution shown above does not satisfy the equilibrium equation exactly.
The last term in Eq. (2.6.5), when substituted into the equilibrium equation does
not vanish. If, on the other hand, one uses too many terms in the series, finding the
optimum values of the coefficients of the terms becomes difficult, and may require
the use of a formal search technique. A simple way of reducing the number of design
variables without the loss of accuracy is to use possible symmetry inherent to the
problem so that only a part of the geometry needs to be modelled. A good example
of this approach is demonstrated in [52] where three-fold symmetry is used for the
cross-sectional shape of a bar in torsion.
Chapter 2: Classical Tools in Structural Optimization
2.7 Exercises
1. π = (1/2) ∫_V σε dV − ∫_0^l q(x) w dx ,
where
ε = −y d^2w/dx^2 , and σ = Eε ,
and q(x) is the distributed external transverse loading acting along the beam.
2. Solve Example Problem 2.3.2 for q = qo, ~ = 1/2, assuming n = 2 and n = 3.
3. Solve Example Problem 2.3.3 for the following cases:
a) n = 1; q(x) = q0(l − x)/l.
b) n = 1; q(x) = 4q0(lx − x^2)/l^2.
c) n = 2; q(x) = q0.
d) n = 2; q(x) = q0(l − x)/l.
e) n = 3; q(x) = q0.
f) n = 3; q(x) = q0(l − x)/l.
4. Solve Example 2.4.2.
5. Determine the optimum area distribution and corresponding buckling loads of the
following Euler-Bernoulli columns subject to a constant volume constraint:
a) cantilever column, n = 1, 2, and 3.
b) simply supported column, n = 1, 2.
6. The Rayleigh quotient for an axially vibrating bar with an attached non-structural
mass m at the free end x = l is given by
ω^2 = [∫_0^l EA(x)(u')^2 dx] / [∫_0^l ρA(x)u^2 dx + m u(l)^2] .
a) Derive the equation of motion and the optimality condition for the minimum
mass design of the bar with a specified fundamental frequency ω0.
b) Verify Turner's solution [34] for the area distribution of such a bar, where
β = ω0 √(ρ/E).
7. Begin with a Rayleigh quotient similar to that of the previous problem for a
vibrating cantilever beam of sandwich construction. Assume that the beam carries a
distributed non-structural mass m(x) ≫ ρA(x). Verify Taylor's solution [35] for the
area distribution A(x) of the optimum beam with a specified fundamental frequency ω0.
8. Consider an axisymmetrically vibrating circular plate of radius a and thickness
distribution t(r). Let ξ = r/a be the non-dimensional radial coordinate, with primes
denoting differentiation with respect to ξ, and let
V0 = ∫_0^a 2πt r dr
be the plate volume. Derive the optimality condition for maximizing the fundamental
frequency of such a plate with a specified volume V0.
2.8 References
[1] Hancock, H., Theory of Maxima and Minima. Ginn and Company, New York,
1917.
[2] Gelfand, I.M. and Fomin, S.V., Calculus of Variations. Prentice Hall, Inc., En-
glewood Cliffs, NJ, 1963.
[3] Pars, L.A., An Introduction to the Calculus of Variations. Heinmann, London,
1962.
[4] Hildebrand, F.B., Methods of Applied Mathematics. Prentice-Hall, New Jersey,
1965.
[5] Reddy, J.N., Energy and Variational Methods in Applied Mechanics. John Wiley
and Sons, New York, 1984.
[6] Barnett, R.L., "Minimum Weight Design of Beams for Deflection," J. EM Divi-
sion, ASCE, Vol. EMl, 1961, pp. 75-95.
[7] Makky, S.M. and Ghalib, M.A., "Design for Minimum Deflection," Eng. Opt., 4,
pp. 9-13, 1979.
[8] Taylor, J.E., and Bendsøe, M.P., "An Interpretation for Min-Max Structural
Design Problems Including a Method for Relaxing Constraints," International
Journal of Solids and Structures, 20, 4, pp. 301-314, 1984.
[9] Prager, W. and Taylor, J.E., "Problems of Optimal Structural Design," J. Appl.
Mech., 35, pp. 102-106, 1968.
[10] Prager, W., "Optimization of Structural Design," J. Optimization Theory and
Applications, 6, pp. 1-21, 1970.
[11] Washizu, K., Variational Methods in Elasticity and Plasticity. 2nd ed. Pergamon
Press, 1975.
[12] Keller, J.B., "The Shape of the Strongest Column," Arch. Rat. Mech. Anal., 5,
pp. 275-285, 1960.
[13] Tadjbakhsh, I. and Keller, J.B., "Strongest Columns and Isoperimetric Inequalities
for Eigenvalues," J. Appl. Mech., 29, pp. 159-164, 1962.
[14] Keller, J.B. and Niordson, F.I., "The Tallest Column," J. Math. Mech., 16, pp.
433-446, 1966.
[15] Huang, N.C. and Sheu, C.Y., "Optimal Design of an Elastic Column of Thin-
Walled Cross Section," J. Appl. Mech., 35, pp. 285-288, 1968.
[16] Taylor, J.E., "The Strongest Column - An Energy Approach," J. Appl. Mech.,
34, pp. 486-487, 1967.
[17] Salinas, D., On Variational Formulations for Optimal Structural Design. Ph.D.
Dissertation, University of California, Los Angeles, 1968.
[18] Simitses, G.J., Kamat, M.P. and Smith, C.V., Jr., "The Strongest Column by the
Finite Element Displacement Method," AIAA Paper No: 72-141, 1972.
[19] Hornbuckle, J.C., On the Automated Optimal Design of Constrained Structures.
Ph.D. Dissertation, University of Florida, 1974.
[20] Turner, H.K. and Plaut, R.H., "Optimal Design for Stability under Multiple
Loads," J. EM Div. ASCE 12, pp. 1365-1382,1980.
[21] Olhoff, N. and Rasmussen, S.H., "On Single and Bimodal Optimal Buckling
Modes of Clamped Columns," Int. J. Solids and Structures, 13, pp. 605-614,
1977.
[22] Masur, E.F., "Optimal Structural Design under Multiple Eigenvalue Con-
straints," Int. J. Solids Structures, 20, pp. 211-231, 1984.
[23] Masur, E.F., "Some Additional Comments on Optimal Structural Design under
Multiple Eigenvalue Constraints," Int. J. Solids Structures, 21, pp. 117-120, 1985.
[24] Olhoff, N., "Structural Optimization by Variational Methods," in Computer
Aided Structural Design: Structural and Mechanical Systems (C.A. Mota Soares,
Editor), Springer Verlag, pp. 87-164, 1987.
[25] Plaut, R.H., Johnson, L.W. and Olhoff, N., "Bimodal Optimization of Com-
pressed Columns on Elastic Foundations," J. Appl. Mech., 53, pp. 130-134,1986.
[26] Frauenthal, J.C., "Constrained Optimal Design of Circular Plates against Buck-
ling," J. Struct. Mech., 1, pp. 159-186,1972.
[27] Kamat, M.P., Optimization of Structural Elements for Stability and Vibration.
Ph.D. Dissertation, Georgia Institute of Technology, Atlanta, GA, 1972.
[28] Armand, J.L. and Lodier, B., "Optimal Design of Bending Elements," Int. J.
Num. Meth. Eng., 13, pp. 373-384, 1978.
[29] Simitses, G.J., "Optimal Versus the Stiffened Circular Plate," AIAA J., 11, pp.
1409-1412, 1973.
[30] Haftka, R.T. and Prasad, B., "Optimum Structural Design with Plate Bending
Elements - A Survey," AIAA J., 19, pp. 517-522, 1981.
[31] Olhoff, N., "On Singularities, Local Optima and Formation of Stiffeners in Op-
timal Design of Plates," In: Optimization in Structural Design, A. Sawczuk and
Z. Mroz (eds.). Springer-Verlag, 1975, pp. 82-103.
[32] Gajewski, A., and Zyczkowski, M., Optimal Structural Design under Stability
Constraints, Kluwer Academic Publishers, 1988.
[33] Niordson, F.I., "On the Optimal Design of a Vibrating Beam," Quart. Appl.
Math., 23, pp. 47-53, 1965.
[34] Turner, M.J., "Design of Minimum-Mass Structures with Specified Natural Fre-
quencies," AIAA J., 5, pp. 406-412, 1967.
[35] Taylor, J.E., "Minimum-Mass Bar for Axial Vibration at Specified Natural Fre-
quency," AIAA J., 5, pp. 1911-1913, 1967.
[36] Zarghamee, M.S., "Optimum Frequency of Structures," AIAA J., 6, pp. 749-750,
1968.
[37] Brach, R.M., "On Optimal Design of Vibrating Structures," J. Optimization The-
ory and Applications, 11, pp. 662-667, 1973.
[38] Miele, A., Mangiavacchi, A., Mohanty, B.P. and Wu, A.K., "Numerical Determi-
nation of Minimum Mass Structures with Specified Natural Frequencies," Int. J.
Num. Meth. Engng., 13, pp. 265-282, 1978.
[39] Kamat, M.P. and Simitses, G.J., "Optimum Beam Frequencies by the Finite
Element Displacement Method," Int. J. Solids and Structures, 9, pp. 415-429,
1973.
[40] Kamat, M.P., "Effect of Shear Deformations and Rotary Inertia on Optimum
Beam Frequencies," Int. J. Num. Meth. Engng., 9, pp. 51-62, 1975.
[41] Pierson, B.L., "A Survey of Optimal Structural Design under Dynamic Con-
straints," Int. J. Num. Meth. Engng., 4, pp. 491-499, 1972.
[42] Kiusalaas, J., "An Algorithm for Optimal Structural Design with Frequency Con-
straints," Int. J. Num. Meth. Engng., 13, pp. 283-295, 1978.
[43] Icerman, L.J., "Optimal Structural Design for given Dynamic Deflection," Int. J.
Solids and Structures, 5, pp. 473-490, 1969.
[44] Plaut, R.H., "Optimal Structural Design for given Deflection under Periodic
Loading," Quart. App!. Math., 29, pp. 315-318, 1971.
[45] Shield, R.T. and Prager, W., "Optimal Structural Design for given Deflection,"
Z. Angew. Math. Phys., 21, pp. 513-523, 1970.
[46] Mroz, Z., "Optimal Design of Elastic Structures subjected to Dynamic, Harmon-
ically Varying Loads," Z. Angew. Math. Mech., 50, pp. 303-309,1970.
[47] Olhoff, N., "Optimal Design of Vibrating Circular Plates," Int. J. Solids and
Structures, 6, pp. 139-156,1970.
[48] Olhoff, N., "Optimal Design of Vibrating Rectangular Plates," Int. J. Solids and
Structures, 10, pp. 93-109, 1974.
[49] Kamat, M.P., "Optimal Thin Rectangular Plates for Vibration," Recent Advances
in Engineering Science, Vol. 3. Proceedings of the 10th Annual Meeting of the
Society of Engineering Science, pp. 101-108, 1973.
[50] Armand, J.L., Lurie, K.A. and Cherkaev, A.V., "Existence of Solutions of
the Plate Optimization Problem," Proceedings of the International Symposium on
Optimum Structural Design, Tucson, AZ, pp. 3.1-3.2, 1981.
[51] Balasubramanyam, K. and Spillers, W.R., "Examples of the Use of Fourier Series
in Structural Optimization," Quart. of Appl. Math., 3, pp. 559-566, 1986.
[52] Parbery, R.D., "On Minimum-Area Convex Shapes of given Torsional and Flex-
ural Rigidity," Eng. Opt., 13, pp. 189-196,1988.
Chapter 3: Linear Programming
3.1 Limit Analysis and Design of Structures Formulated as LP Problems
A simple three-bar truss is used in the following example to illustrate
the difference between the calculation of the load which initiates yielding and the
estimate of the collapse load.
Example 3.1.1
We perform the collapse analysis of a three bar pin-jointed truss under a vertical
load as shown in Fig. 3.1.2. All three bars have the same cross-sectional area A, and
are made of material having Young's modulus E and yield stress σ0. We start by
calculating the load p at which the first bar yields. Denoting the vertical displacement
at the common joint D by v, we obtain the strains in the three members
ε_B = v/l ,  ε_A = ε_C = v/4l . (3.1.1)
The corresponding member forces are
n_B = EA v/l ,  n_A = n_C = EA v/4l , (3.1.2)
and vertical equilibrium of joint D requires
n_B + (n_A + n_C)/2 = p . (3.1.3)
Clearly, as the load is increased from zero, member B yields first, when
n_B = Aσ0 , (3.1.4)
or p = 1.25Aσ0 . (3.1.5)
The structure does not collapse, however, at p = 1.25Aσ0, since members A and
C can still carry the applied load without experiencing excessive deformations. We
may increase the load until member A or C yields. Since we have assumed elastic-
perfectly plastic material behavior, the stress in member B remains at σ0 as we
increase the load beyond the initial yield load. Due to the symmetry of this problem,
the next yielding takes place simultaneously in members A and C. Therefore, at
collapse all three members will be at the yield point, so that
n_A = n_B = n_C = Aσ0 , (3.1.6)
and the collapse load is
p = n_B + (n_A + n_C)/2 = 2Aσ0 . (3.1.7)
This is a 60% increase over the load at which yielding first starts. •••
In example 3.1.1 it was easy to identify the sequence of yielding of the members
and determine the state of stress in the members at collapse. This fact permitted us to
determine the collapse load without difficulty. In general, it is not easy to determine
the combination of members that will yield at collapse, and the stress distribution at
the collapse is not known. Fortunately, it is possible to cast the problem as an LP
problem in order to determine the collapse load [1] based on a general theorem of
the theory of plasticity. This theorem is the lower bound theorem, and it is quoted
below from Calladine Ref. 2.
The Lower Bound Theorem: If any stress distribution throughout the structure
can be found which is everywhere in equilibrium internally and balances the external
loads, and at the same time does not violate the yield conditions, these loads will be
carried safely by the structure.
The application of this theorem will now be demonstrated for a problem where
the choice of stress at collapse is not as trivial as it was in example 3.1.1. We use the
same structure used in the previous example, but with an added horizontal load at
point D.
Example 3.1.2
Figure 3.1.3 Limit analysis of a three bar truss subjected to two loads.
Consider the limit analysis of the three bar truss of Figure 3.1.3 under the com-
bined vertical and horizontal loads of equal magnitude, p. The equations of equilib-
rium in this case are
n_B + (1/2)(n_A + n_C) − p = 0 ,
(√3/2)(n_A − n_C) − p = 0 , (3.1.8)
and we have the yield constraints
−Aσ0 ≤ n_A, n_B, n_C ≤ Aσ0 . (3.1.9)
It is no longer easy to know which two of the three bars yield at the collapse. However,
we may try different combinations of nA, nB, and nc that satisfy the equations of
equilibrium in order to obtain a lower bound to the collapse load. For example, if we
try nc = 0, we obtain from the equilibrium relations (3.1.8)
n_A = (2/√3)p = 1.155p , and n_B = 0.423p . (3.1.10)
Clearly in this case n_A reaches its yield value of Aσ0 before n_B, so that
n_A = Aσ0 ,  n_B = 0.366Aσ0 , and p = (√3/2)Aσ0 = 0.866Aσ0 . (3.1.11)
Having satisfied all the requirements for the lower bound theorem, we thus know
that the collapse load is bounded below by 0.866Aσ0. We can now try different
combinations of member force distribution until we obtain a higher value of p than
the one obtained in Eq. (3.1.11). To get the best estimate, we cast the problem as a
maximization problem
maximize P
such that Eqs. (3.1.8) and Eqs. (3.1.9) are satisfied. (3.1.12)
This is clearly an LP problem in the variables n_A, n_B, n_C, and p, and may be solved
using any LP algorithm. It is also simple enough to admit a graphical solution if
required (see Exercise 1). •••
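The LP of Eq. (3.1.12) is also easy to check numerically. The sketch below assumes the SciPy library is available; forces are measured in units of Aσ0, so the yield constraints (3.1.9) become simple bounds, and the collapse load it returns lies above the lower bound of Eq. (3.1.11), as the lower bound theorem requires.

```python
# Limit analysis of the three bar truss of Example 3.1.2 as an LP
# (a sketch assuming SciPy is available; forces in units of A*sigma_0).
import numpy as np
from scipy.optimize import linprog

s3 = np.sqrt(3.0)

# Variables: x = [nA, nB, nC, p]; maximize p by minimizing -p.
c = [0.0, 0.0, 0.0, -1.0]

# Equilibrium, Eq. (3.1.8):
#   nB + (nA + nC)/2 - p = 0
#   (sqrt(3)/2)(nA - nC) - p = 0
A_eq = [[0.5, 1.0, 0.5, -1.0],
        [s3 / 2, 0.0, -s3 / 2, -1.0]]
b_eq = [0.0, 0.0]

# Yield constraints, Eq. (3.1.9): -1 <= n_j <= 1; load amplitude p >= 0.
bounds = [(-1.0, 1.0)] * 3 + [(0.0, None)]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.x, -res.fun)   # collapse load p* = 3 - sqrt(3) ~ 1.268 A*sigma_0
```

The optimum has members A and B in tension at yield with member C below yield, and confirms that 0.866Aσ0 was indeed only a lower bound.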
The general formulation of the calculation of the limit load for truss structures
is similar to the procedure used in Example 3.1.2. It is assumed that no part of the
truss structure fails by buckling before the plastic collapse load is reached. If we have
a truss structure with r members loaded by a system of loads λp, where p is a given
load vector and λ is a scalar, the limit load can be determined by finding the largest
value of λ that the structure can support. The equations of equilibrium are written
as
Σ_{j=1}^r e_ij n_j = λ p_i ,  i = 1, …, m , (3.1.13)
where n_j (j = 1, …, r) are the forces in each of the truss members, e_ij are direction
cosines, and m is the number of equilibrium equations. The yield constraints are
written as
A_j σ_Cj ≤ n_j ≤ A_j σ_Tj ,  j = 1, …, r , (3.1.14)
where A_j, σ_Cj, and σ_Tj are the cross-sectional areas, and the yield stresses in com-
pression and tension, respectively. The limit or collapse load is then the solution to
the following linear programming problem:
maximize λ
such that Eq. (3.1.13) and Eq. (3.1.14) are satisfied, (3.1.15)
where λ and the member forces n_j are treated as the design variables.
A related problem is the problem of limit design where the collapse load is spec-
ified and the optimal cross-sectional areas are sought. Often, the objective is to
minimize the total mass of the structure
f = Σ_{j=1}^r ρ_j l_j A_j , (3.1.16)
where ρ_j and l_j are the mass density and the length of member j, respectively. The
minimization problem of Eq. (3.1.16) has the same set of constraints, Eqs. (3.1.13)
and (3.1.14), that applies to the limit analysis problem, but both nj and Aj are
treated as design variables. This time, however, the load amplitude λ is specified.
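The limit design problem is set up the same way. As an illustration of our own (not taken from the text), the sketch below applies Eqs. (3.1.13) - (3.1.16) to the three bar truss of Example 3.1.1 with SciPy assumed available and with p = σ0 = ρ = l = 1, so that bar B has length 1 and bars A and C length 2.

```python
# Limit design of the three bar truss of Example 3.1.1 (illustrative sketch
# assuming SciPy; units chosen so p = sigma_0 = rho = l = 1).
import numpy as np
from scipy.optimize import linprog

s3 = np.sqrt(3.0)

# Variables: x = [nA, nB, nC, AA, AB, AC]; mass = 2*AA + 1*AB + 2*AC.
c = [0.0, 0.0, 0.0, 2.0, 1.0, 2.0]

# Equilibrium for a vertical load of specified amplitude p = 1.
A_eq = [[0.5, 1.0, 0.5, 0.0, 0.0, 0.0],
        [s3 / 2, 0.0, -s3 / 2, 0.0, 0.0, 0.0]]
b_eq = [1.0, 0.0]

# Yield constraints |n_j| <= A_j * sigma_0, written as two inequalities each.
A_ub, b_ub = [], []
for j in range(3):
    for sign in (1.0, -1.0):
        row = [0.0] * 6
        row[j] = sign        #  +n_j - A_j <= 0  and  -n_j - A_j <= 0
        row[3 + j] = -1.0
        A_ub.append(row)
        b_ub.append(0.0)

bounds = [(None, None)] * 3 + [(0.0, None)] * 3   # forces free, areas >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(res.fun, res.x)
```

Here both the forces n_j and the areas A_j are design variables. For this (admittedly degenerate) geometry the LP puts all the material into the vertical bar, A_B = 1 and A_A = A_C = 0, for a minimum mass of 1; real limit design problems would add minimum-gage bounds on the areas.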
Example 3.1.3
Formulate the limit analysis and design of the five bar truss shown in Figure 3.1.4
as linear programs. Assume that all bars are made of the same material and that
σ_C = −σ_T = −σ0.
The vertical and horizontal equations of equilibrium at the unrestrained nodes of
the structure are
n13 + (√2/2) n23 = 0 ,  n24 + (√2/2) n14 = 0 , (3.1.17a)
n34 + (√2/2) n23 = 0 ,  n34 + (√2/2) n14 = p . (3.1.17b)
Figure 3.1.4 Limit analysis and design of a five bar truss.
For the limit design problem both the cross-sectional areas and the member forces
are treated as design variables. •••
The analysis and design of structures that include members under bending may
be formulated as LP problems as in Refs. 3-5. Cohn, Ghosh, and Parimi [3] provide
an excellent unified approach to both the analysis and design of beams, frames, and
arches of given configurations under fixed, alternating, and variable repeated or shake-
down loadings. We focus our attention here only on simple examples in this class of
problems.
The basic hypothesis regarding the material is that the beam or frame is elastic-
perfectly plastic. The fully plastic moment, m p , of a beam cross-section is defined as
the bending moment, m, required to make the entire cross-section yield so as to form
a hinge with constant bending resistance.
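For a rectangular cross-section the fully plastic moment can be evaluated in closed form, and a short sketch (our own illustration, with hypothetical dimensions and yield stress) makes the definition concrete by comparing it with the moment at which the section first yields.

```python
# Fully plastic moment of a rectangular b x h cross-section (illustrative
# numbers). At full plasticity each half of the section carries sigma_0
# over area b*h/2 with lever arm h/2, so m_p = sigma_0 * b * h**2 / 4.
def plastic_moment(b, h, sigma0):
    return sigma0 * b * h**2 / 4.0

def first_yield_moment(b, h, sigma0):
    # elastic section modulus of a rectangle: z = b*h**2/6
    return sigma0 * b * h**2 / 6.0

b, h, sigma0 = 0.05, 0.10, 250.0e6   # m, m, Pa (hypothetical mild steel)
mp = plastic_moment(b, h, sigma0)
my = first_yield_moment(b, h, sigma0)
print(mp, mp / my)   # shape factor m_p/m_y = 1.5 for a rectangle
```

The ratio m_p/m_y, the shape factor, measures the reserve of moment capacity between first yield and formation of a plastic hinge; it is 1.5 for a rectangle and smaller for I-sections.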
Example 3.1.4
maximize λ
subject to −m_p ≤ m_i ≤ m_p ,  i = 1, 2, 3 , (3.1.20)
where m1, m2, and m3 are the bending moments at those points
along the beam which have the potential to form plastic hinges; at these points
the bending moments have local maxima. These three moments are also unknowns
for the problem and need to be determined. At the onset of either of the collapse
mechanisms shown in Figure 3.1.5, we can write down two equations of equilibrium
by using the principle of virtual displacements. The basic assumption in writing the
virtual displacements is that the hinges in the figure are not plastic hinges, but are
introduced to permit the small displacements that are assumed to take place while
the members between them remain straight. The resulting equilibrium relations are
(3.1.21)
(3.1.22)
where θ1, θ2 are the virtual rotations of the members at the expected plastic hinges and
δ1, δ2 the virtual displacements of the beam under the load points. The virtual dis-
placements and the rotations are related to one another through kinematic relations,
and can be eliminated from the equations. Furthermore, using the two equilibrium
equations, we can eliminate the two variables, ml and m3, to reduce the LP problem
of Eq. (3.1.20) to finding the λ and m2 such that
maximize λ
subject to −m_p ≤ (pl/4)λ − (1/2)m2 ≤ m_p ,
−m_p ≤ m2 ≤ m_p , (3.1.23)
−m_p ≤ (pl/2)λ − (1/2)m2 ≤ m_p .
This is a simple two-variable (m2 and λ) LP problem that can be solved graphically.
•••
Example 3.1.5
As an illustration of limit design for bending type problems, consider the well-known
problem of minimizing the weight of a plane frame to resist a given set of ultimate
loads. A single bay, single story portal frame is loaded by a horizontal and a vertical
load of magnitude p as shown in Figure 3.1.6. For this design problem the top hori-
zontal member is assumed to be different from the two vertical columns. Accordingly,
we assume the beam and the column cross-sections to have associated fully plastic
moments mpB and mpC, respectively. These two plastic moments depend on the
cross-sectional properties of their respective members and, therefore, are the design
variables for the problem.
Figure 3.1.6 Portal frame under horizontal and vertical loads of magnitude p.
For commonly used sections there is an approximately linear relation
between the weight per running foot, w_l, and the plastic section modulus, m_p/σ0.
Over the relevant range of sections that may be expected to be used for a given
frame the error involved in this linearization is of the order of 1%. It is this single
assumption which renders the plastic design problem linear.
We will, therefore, assume that the problem of minimizing the weight of a frame
for a set of ultimate loads reduces to minimizing a function
(3.1.24)
(3.1.25)
Figure 3.1.7 Collapse mechanisms for the portal frame of Figure 3.1.6.
The equations of equilibrium can be obtained by using the same approach used
in the previous example. Figure 3.1.7 shows all possible collapse mechanisms for the
frame. The ultimate load carrying capacity of the structure for any given collapse
mechanism is obtained by the virtual work equivalence between the external work
of the applied loads and the internal work of the fully plastic moments experienced
Section 3.2: Prestressed Concrete Design by Linear Programming
while undergoing virtual rotations of the plastic hinges. Thus a permissible design
is one for which the capacity for internal virtual work is greater than or equal to
the external work. It is left as an exercise (see Exercise 4) to verify that behavioral
constraints associated with the collapse mechanism of Figure 3.1.7 reduce to
and (3.1.32)
Thus the problem of weight minimization under a set of ultimate loads has been
reduced to the determination of those non-negative values of x1 and x2 for which f,
as given by Eq. (3.1.25), is minimized subject to the constraints of Eqs. (3.1.26 - 3.1.32).
The problem is clearly an LP problem. We will defer the analytical solution of this
problem until later. •••
It is commonly assumed that, because of losses, the prestressing force in service is a
constant fraction α of the initial prestressing force f0.
In calculating the bending moment distribution or the deflected shape of a prestressed
beam, in addition to the usual dead and live loads, we must allow for the equivalent
distributed loading (see Exercise 6a) and the end loads resulting from the curved
profile of the eccentrically placed tendons. It can be shown [7,8] that for parabolic
profiles of the cables (see Figure 3.2.1) the induced moments and deflections are
linearly related to the quantity foe with the constant of proportionality k being a
function of the known material and cross-sectional properties. With this assumption,
the maximum stresses and deflections of a simply supported beam occur at the center
of the beam. If the maximum positive bending moment and maximum deflection at
the center of the simply-supported beam of Figure 3.2.1 due to external loads in
the ith loading condition are denoted by mei and bei, respectively, then the beam
optimization problem reduces to
minimize f(f0, e) = f0 (3.2.1)
subject to σ_i^l ≤ −αf0/a ± (m_ei − αf0 e)/z ≤ σ_i^u , (3.2.2)
δ_i^l ≤ δ_ei + αk f0 e ≤ δ_i^u , (3.2.3)
e^l ≤ e ≤ e^u , (3.2.4)
f0 ≥ 0 ,  i = 1, …, n_l . (3.2.5)
Here n_l denotes the number of different loading conditions; σ^l, σ^u, δ^l, δ^u, e^l, and e^u
denote lower and upper bounds on the stresses, the deflections, and the tendon eccentricity;
a and z denote the effective area and the section modulus of the cross-section.
The problem as formulated by Eqs. (3.2.1) through (3.2.5) is not an LP problem
because it includes the product foe of the two variables. However, it can be easily
cast as one by letting
m = foe, (3.2.6)
and expressing the problem in terms of the new design variables f0 and m. The
transformed problem thus reduces to the following LP problem
minimize f(f0, m) = f0 (3.2.7)
Section 3.3: Minimum Weight Design of Statically Determinate Trusses
subject to
σ_i^l ≤ −αf0/a ± (m_ei − αm)/z ≤ σ_i^u , (3.2.8)
δ_i^l ≤ δ_ei + αkm ≤ δ_i^u , (3.2.9)
m^l ≤ m ≤ m^u , (3.2.10)
f0 ≥ 0 ,  i = 1, …, n_l , (3.2.11)
with m^l and m^u being the lower and upper bounds on f0 e.
Morris [9] has treated a similar problem, but with additional constraints on ulti-
mate moment capacity. He also modified the constraint (3.2.11) to incorporate the Ameri-
can Concrete Institute's limit on the prestressing force, intended to prevent premature
failure of the beam by pure crushing of the concrete. Morris linearizes part of the
problem by using the reciprocal of the prestressing force as one of the design variables;
this transformation however fails to linearize the constraint on the ultimate moment
capacity. In the interest of linearization, this nonlinear constraint is replaced by a
series of piecewise linear connected chords with true values at chord intersections.
Kirsch [10] has shown that appropriate transformations can also be used to reduce
the design of continuous prestressed concrete beams to equivalent linear program-
ming problems. These problems involve not only the optimization of the prestressing
force and the tendon configuration, but also the optimization of the cross-sectional
dimensions of the beam.
3.3 Minimum Weight Design of Statically Determinate Trusses
As another example of the design problems that can be turned into LP problems
we consider the minimum weight design of statically determinate trusses under stress
and deflection constraints. The difficulty in these problems arises from the nonlinear
dependence of the deflections on the design variables, which are the cross-
sectional areas of the truss members. This type of problem, however, belongs to
the class of what is known as separable programming [11] problems. In this class of
programming the objective function and the constraints can be expressed as sums of
functions each involving a single design variable. Each such function can be approximated
by a piecewise linear function, that is, a set of connected line segments or chords
interpolating the actual function at the chord intersections.
A nonlinear separable function of n design variables,
f(x1, …, xn) = Σ_{j=1}^n f_j(x_j) , (3.3.1)
can be linearized as
f ≈ Σ_{j=1}^n Σ_{k=0}^m f_j(x_jk) η_jk , (3.3.2)
with
x_j = Σ_{k=0}^m x_jk η_jk ,  Σ_{k=0}^m η_jk = 1 ,  η_jk ≥ 0 , (3.3.3)
where the x_jk (k = 0, …, m) are the breakpoints of the piecewise linear approximation
for the jth design variable.
The variables (x1, …, xn) of the function have thus been replaced by the interpola-
tion variables η_jk, only two adjacent ones of which are non-zero for each of the
design variables. Therefore, we have a piecewise linear approximation to the function in
every design variable.
Example 3.3.1
As an illustration we consider a problem similar to the one solved by Majid [12]. The
objective is the minimum weight design of the four bar statically determinate truss
shown in Figure 3.3.1 with stress constraints in the members and a displacement
constraint at the tip joint of the truss. In order to simplify the problem we assume
members 1 through 3 to have the same cross-sectional area A1, and member 4 the
area A2. Under the specified loading, the member forces and the vertical displacement
at joint 2 can easily be verified to be
(3.3.7)
(3.3.8)
84
Section 3.3: Minimum Weight Design of Statically Determinate Trusses
and (3.3.9)
as
minimize f(x1, x2) = 1/x1 + √3/x2 (3.3.10)
subject to 18x1 + 6√3 x2 ≤ 3 , (3.3.11)
0.05 ≤ x1 ≤ 0.1546 , (3.3.12)
0.05 ≤ x2 ≤ 0.1395 , (3.3.13)
where the lower bounds on x1 and x2 have been assumed to be 0.05. Except for the
objective function, which is a separable nonlinear function, the rest of the problem is
linear. The objective function can be put in a piecewise linear form by using Eqs.
(3.3.2) and (3.3.3). For the purpose of demonstration, we divide the design variable
intervals of Eqs. (3.3.12) and (3.3.13) into two equal segments (m = 2) resulting in
x10 = 0.05 ,  x11 = 0.1023 ,  x12 = 0.1546 ,
x20 = 0.05 ,  x21 = 0.09475 ,  x22 = 0.1395 .
Objective function values corresponding to these points are
f10 = 20 ,  f11 = 9.76 ,  f12 = 6.47 ,
f20 = 34.64 ,  f21 = 18.28 ,  f22 = 12.42 .
Therefore, the linearized objective function is
f(x1, x2) = 20η10 + 9.76η11 + 6.47η12 + 34.64η20 + 18.28η21 + 12.42η22 .
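The linearized problem can be assembled and solved as an ordinary LP in the six interpolation variables; because 1/x is convex, the LP solution automatically uses only adjacent non-zero η's. A sketch assuming SciPy is available, built from the tabulated breakpoint values above:

```python
# Separable-programming (piecewise linear) version of Example 3.3.1,
# solved as an LP in the interpolation variables eta_jk (sketch using SciPy).
import numpy as np
from scipy.optimize import linprog

x1k = [0.05, 0.1023, 0.1546]          # breakpoints for x1
x2k = [0.05, 0.09475, 0.1395]         # breakpoints for x2
f1k = [20.0, 9.76, 6.47]              # 1/x1 at the breakpoints
f2k = [34.64, 18.28, 12.42]           # sqrt(3)/x2 at the breakpoints

# Variables: [eta10, eta11, eta12, eta20, eta21, eta22], all >= 0.
c = f1k + f2k                         # linearized objective, Eq. (3.3.2)

# Each set of interpolation variables must sum to one, Eq. (3.3.3).
A_eq = [[1, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 1]]
b_eq = [1.0, 1.0]

# Displacement constraint (3.3.11): 18*x1 + 6*sqrt(3)*x2 <= 3.
A_ub = [[18 * x for x in x1k] + [6 * np.sqrt(3.0) * x for x in x2k]]
b_ub = [3.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * 6)
x1 = sum(e * x for e, x in zip(res.x[:3], x1k))
x2 = sum(e * x for e, x in zip(res.x[3:], x2k))
print(res.fun, x1, x2)   # approximate minimum weight, about 25.35
```

The LP drives x2 to its upper bound and interpolates x1 between its first two breakpoints; refining the breakpoints (larger m) would bring the answer closer to the exact minimum of the nonlinear problem.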
After substituting the linearized objective function, the problem can be solved by
standard LP techniques. •••
3.4 Graphical Solutions of Simple LP Problems
For simple problems with no more than two design variables a graphical solution
technique may be used to solve an LP problem. A graphical method
not only gives a solution, but also helps us to understand the nature of LP problems.
The following example is included in order to illustrate the nature of the design space
and the optimal solution.
Example 3.4.1
Consider the portal frame limit design problem of example 3.1.5. The problem was
reduced to minimizing the objective function
x1 = x2 = 1/2 , (3.4.2)
Section 3.4: Graphical Solutions of Simple LP Problems
Figure 3.4.1 Graphical representation of the design space of the portal frame problem in the (x1, x2) plane, showing the constraint boundaries, including those of Eqs. (3.1.27) and (3.1.28).
(3.4.3)
with c being a constant, then every point along the line [a, b] in Figure 3.4.1 would
constitute an optimum solution.
The concept of a convex polygon with corners or vertices in two dimensions
generalizes to a convex polytope with extreme points in Rn. For example, a convex
polytope [11] is defined to be the set which is obtained by the intersection of a finite
number of closed half-spaces. Similarly, an extreme point of a set is defined to be a
point x in Rn which cannot be expressed as a convex combination θx1 + (1 − θ)x2
(0 < θ < 1) of two distinct points x1 and x2 belonging to the set. Finally, as in the
two-dimensional case of Figure 3.4.1, barring degeneracy, a linear objective function
in Rn achieves its minimum only at an extreme point of a bounded convex polytope.
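This property is easy to verify numerically on a small two-dimensional example of our own choosing (assuming SciPy is available): the LP minimum coincides with the smallest objective value over the enumerated vertices of the feasible polygon.

```python
# A linear objective over a bounded convex polygon attains its minimum at a
# vertex (illustrative sketch). Feasible set: x1, x2 >= 0, x1 + 2*x2 <= 4,
# 3*x1 + x2 <= 6; its four vertices can be listed by hand.
from scipy.optimize import linprog

c = [-2.0, -1.0]                      # minimize -2*x1 - x2
A_ub = [[1.0, 2.0], [3.0, 1.0]]
b_ub = [4.0, 6.0]

# Vertices: origin, axis intercepts, and the intersection of the two lines.
vertices = [(0.0, 0.0), (2.0, 0.0), (1.6, 1.2), (0.0, 2.0)]
best_vertex = min(c[0] * x + c[1] * y for x, y in vertices)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.fun, best_vertex)           # both -4.4, attained at (1.6, 1.2)
```

Enumerating vertices is hopeless in higher dimensions, which is precisely why the simplex method of Section 3.6 visits only a selected sequence of extreme points.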
4x2 − x3 = 1 , (3.5.4)
2x1 + 2x2 − x4 = 1 , (3.5.5)
x1 + x2 − x5 = 1 , (3.5.6)
2x1 − x6 = 1 , (3.5.7)
2x1 + 4x2 − x7 = 3 , (3.5.8)
4x1 + 2x2 − x8 = 3 , (3.5.9)
by the addition of the surplus variables x3 through x8, provided that these variables
are restricted to be non-negative, that is
so that the new variable never becomes negative during the design. Such artificial
variables are often used in structural design problems where quantities such as stresses
are used as design variables. Stresses can be either positive or negative depending upon
the loading condition. It is clear from Eq. (3.5.11) that putting an LP program into
standard form may cause an increase in the dimension of the design space. Using
Eq. (3.5.12) does not increase the dimension of the problem but it may be difficult to
know a priori the value of the constant M that will make the design variable positive
(the choice of a very large number may result in numerical ill-conditioning).
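The splitting device of Eq. (3.5.11) can be illustrated on a small made-up problem with one sign-unrestricted variable, assuming SciPy is available:

```python
# Handling a sign-unrestricted design variable in the standard LP form
# (illustrative sketch). Problem: minimize f = x + 2*y subject to
# x + y >= -1 and y >= 0, where x (think of a stress) may be negative.
from scipy.optimize import linprog

# Split x = xp - xm with xp, xm >= 0, so the variable vector [xp, xm, y]
# is non-negative at the price of one extra dimension.
c = [1.0, -1.0, 2.0]                 # f = xp - xm + 2*y
A_ub = [[-1.0, 1.0, -1.0]]           # -(x + y) <= 1
b_ub = [1.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
x = res.x[0] - res.x[1]              # recover the original free variable
print(x, res.fun)                    # x = -1, f = -1
```

The alternative shift of Eq. (3.5.12), x' = x + M, keeps the number of variables unchanged, but requires guessing a constant M large enough that x' stays non-negative at the optimum.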
Going back to Eq. (3.5.2) we notice that if m = n and all the equations are
linearly independent, we have a unique solution to the system of equations, whereas
with m > n we have, in general, an inconsistent system of equations. It is only when
m < n that we have many possible solutions. Of all these solutions we seek the one
which satisfies the non-negativity constraints and minimizes the objective function
f.
3.5.1 Basic Solution
We assume the rank of the matrix A to be m and select from the n columns of A a
set of m linearly independent columns. We denote this m X m matrix by D. Then
D is non-singular and we can obtain the solution
x_D = D^−1 b , (3.5.13)
with x_D of dimension m × 1, D of dimension m × m, and b of dimension m × 1. Setting
the remaining n − m variables to zero then yields a basic solution of the system. (3.5.14)
3.6 The Simplex Method
The idea of the simplex method is to continuously decrease the value of the
objective function by going from one basic feasible solution to another until the
minimum value of the objective function is achieved. We will postpone the discussion
of how to generate a basic feasible solution and assume that we have a basic feasible
solution to start the algorithm. Indeed, when the constraint equations are in the
canonical form, with the sth equation reading
x_s + a_{s,m+1} x_{m+1} + … + a_{s,n} x_n = b_s ,  s = 1, …, m , (3.6.7)
a basic feasible solution is immediately available as
x1 = b1 ,  x2 = b2 ,  … ,  x_m = b_m ,  x_{m+1} = x_{m+2} = … = x_n = 0 . (3.6.8)
The variables x1 through x_m are called basic variables, and x_{m+1} through x_n are
called non-basic variables.
The simplex procedure changes the set of basic variables while improving the ob-
jective function at the same time. However, for the purpose of clarity we will first
demonstrate the approach for going from one basic feasible solution to another. The
objective function improvement will be discussed in the following section.
We wish to make one of the current non-basic variables of Eq. (3.6.7), say x_t (m <
t ≤ n), basic, and in the process cause a basic variable, x_s (1 ≤ s ≤ m), to become
non-basic. At this point we assume that we know the variable Xt which we will bring
into the basic set. We only need to decide which variable to drop from the basic set.
Consider the selected terms shown below for the coefficients of the sth equation and
an additional arbitrary ith equation.
(column s)  (column t)
equation i:    0      a_it    = b_i
equation s:    1      a_st    = b_s (3.6.9)
Since we want to make x_t basic, we need to eliminate it from all the equations
except the sth one by reducing the coefficients a_it (i = 1, …, m; i ≠ s) to zero, and
making the coefficient a_st unity by dividing the sth equation throughout by a_st. We
can do this only if a_st is non-zero. Also, unless a_st is positive, the process of dividing
the sth equation by a_st will produce a negative term on the right-hand side, since
b_s is positive because the current solution is a basic feasible solution. To eliminate
the new basic variable x_t from the ith equation (i = 1, …, m; i ≠ s) we have to
multiply the sth equation by the factor (a_it/a_st) and subtract the resulting equation
from each of these equations. The resulting coefficient on the right-hand side of the
ith equation will be
b'_i = b_i − b_s (a_it / a_st) . (3.6.10)
To guarantee that the resulting solution is a basic feasible solution we must require that b_i' ≥ 0; rearranging Eq. (3.6.10), this means b_s / a_st ≤ b_i / a_it for every i with a_it > 0. The variable x_s to leave the basis must therefore be chosen so that row s yields the smallest of these ratios.
Chapter 3: Linear Programming
Example 3.6.1
We illustrate the foregoing discussion with an example. Consider the system of equations

2x1 + 2x2 + x3 = 6 ,
3x1 + 4x2 + x4 = 10 ,      (3.6.13)
x1 + 2x2 + x5 = 4 .
The system is already in the canonical form, with a basic feasible solution being

x3 = 6 , x4 = 10 , x5 = 4 , x1 = x2 = 0 .      (3.6.14)
The variables x1 and x2 are the non-basic variables, whereas x3, x4, and x5 are the basic variables. Now let us assume that we want to make x1 basic. Rewriting Eqs. (3.6.13) in matrix form (columns x1 through x5, with the right-hand sides shown after the bar), we have

[ 2  2  1  0  0 |  6 ]
[ 3  4  0  1  0 | 10 ]      (3.6.15)
[ 1  2  0  0  1 |  4 ]
Since x1 is to be made basic, we consider the first column. To choose the variable to be made non-basic, we form the ratios b_i/a_i1, i = 1, 2, 3:

b1/a11 = 6/2 = 3 ,   b2/a21 = 10/3 ,   b3/a31 = 4/1 = 4 .

The smallest ratio is b1/a11, and so we pivot on a11. Thus the new system of equations is

[ 1  1   1/2  0  0 | 3 ]
[ 0  1  −3/2  1  0 | 1 ]      (3.6.16)
[ 0  1  −1/2  0  1 | 1 ]
and the process of making x1 basic has resulted in the variable x3 becoming non-basic. The new basic feasible solution is

x1 = 3 , x4 = 1 , x5 = 1 .

The reader may verify that using a pivot other than a11 would lead to an infeasible basic solution. For example, pivoting on a31 we obtain

x1 = 4 , x3 = −2 , x4 = −2 ,

which violates the non-negativity requirements. •••

Example 3.6.2

We now repeat the process, this time carrying the objective function along in the tableau. Consider the problem
minimize f = Xl + X2 + X3 (3.6.18)
subject to 2XI + 2X2 + X3 = 6 , (3.6.19)
3XI + 4X2 + X4 = 10, (3.6.20)
Xl + 2X2 + X5 = 4 . (3.6.21)
As mentioned above, we rewrite the constraint equations (3.6.19) through (3.6.21) in matrix form, together with the objective function appended as the last row (columns x1 through x5):

[ 2  2  1  0  0 |  6 ]
[ 3  4  0  1  0 | 10 ]
[ 1  2  0  0  1 |  4 ]      (3.6.22)
[ 1  1  1  0  0 |  0 ]
A basic solution is

x3 = 6 , x4 = 10 , x5 = 4 , x1 = x2 = 0 .      (3.6.23)

The variable x3 is a basic variable that appears in the last equation of Eqs. (3.6.22) and must be eliminated from it, so that the right-hand side of the last row yields the negative of the current value of the objective function. Subtracting the first row from the last row gives

[  2   2  1  0  0 |  6 ]
[  3   4  0  1  0 | 10 ]
[  1   2  0  0  1 |  4 ]      (3.6.24)
[ −1  −1  0  0  0 | −6 ] = −f
We can pivot on either column (1) or column (2); that is to say, the objective function will decrease in value by bringing either x1 or x2 into the basis. If we pivot on column (1) (bringing x1 into the basis), the pivot element is a11 because it yields the smallest ratio b_i/a_i1. The new simplex tableau becomes
[ 1  1   1/2  0  0 |  3 ]
[ 0  1  −3/2  1  0 |  1 ]
[ 0  1  −1/2  0  1 |  1 ]      (3.6.25)
[ 0  0   1/2  0  0 | −3 ] = −f
The value of the objective function has been reduced from 6 to 3. Since the last equation contains no non-basic variable with a negative coefficient, it is no longer possible to decrease the value of the objective function further. Thus the minimum value of the objective function is 3 and corresponds to the basic solution

x1 = 3 , x4 = 1 , x5 = 1 .      (3.6.26)
If we had decided to bring x2 into the basis first, we would have reduced the objective function from 6 to 4, and there would have been a negative coefficient in the first column of the last equation, indicating the need for another round of pivoting to bring x1 into the basis. •••
This would have completed the discussion of the simplex method except for the
fact that we need a basic feasible solution to start the simplex method and we may
not have one readily available. This is our next topic.
In the process of converting an LP problem given in the form of Eqs. (3.6.4) and (3.6.5),

Ax ≤ b , where b > 0 , and x ≥ 0 ,      (3.6.27)

into the standard form by adding slack variables, we obtained a basic feasible solution to start the simplex method. However, when we have a linear program which is
already in the standard form of Eqs. (3.5.2) and (3.5.3) we cannot, in general,
identify a basic feasible solution. The following technique can be used in such cases.
Consider the following minimization problem:

minimize Σ_{i=1}^{m} y_i      (3.6.28)
subject to Ax + y = b ,      (3.6.29)
x ≥ 0 , and y ≥ 0 ,      (3.6.30)

where y is a vector of artificial variables. There is no loss of generality in assuming that b > 0, so that the LP problem has the known basic feasible solution

y = b , and x = 0 ,      (3.6.31)
so that the simplex method can easily be applied to solve the LP problem of Eqs. (3.6.28) through (3.6.30). Note that if a basic feasible solution to the original LP problem exists, then the optimum solution to the modified problem must have the y_i's as non-basic variables (y = 0). However, if no basic feasible solution to the original problem exists, then the minimum value of Eq. (3.6.28) will be greater than zero.
Example 3.6.3
We illustrate the use of artificial variables with the following example for which we
seek a basic feasible solution to the system

2x1 + x2 + 3x3 = 13 ,
x1 + 2x2 + x3 = 7 ,      (3.6.32)
x_i ≥ 0 , i = 1, 2, 3 .
Introduce the artificial variables Yl and Y2 and pose the following minimization prob-
lem.
minimize f = y1 + y2      (3.6.33)
subject to 2x1 + x2 + 3x3 + y1 = 13 ,
x1 + 2x2 + x3 + y2 = 7 ,      (3.6.34)
x_i ≥ 0 , i = 1, 2, 3 , and y_j ≥ 0 , j = 1, 2 .
With the basic feasible solution y1 = 13, y2 = 7, and x1 = x2 = x3 = 0 known, we append the objective function (3.6.33) and clear the basic variables y1 and y2 from it to obtain the initial simplex tableau (columns x1, x2, x3, y1, y2):

[  2   1   3  1  0 |  13 ]
[  1   2   1  0  1 |   7 ]      (3.6.35)
[ −3  −3  −4  0  0 | −20 ] = −f
Since it has the largest negative coefficient, we choose the third column for pivoting, with a13 as the pivot element since 13/3 < 7/1:

[  2/3   1/3  1   1/3  0 | 13/3 ]
[  1/3   5/3  0  −1/3  1 |  8/3 ]      (3.6.36)
[ −1/3  −5/3  0   4/3  0 | −8/3 ] = −f

One more pivot, on the second column (the ratio test gives (8/3)/(5/3) < (13/3)/(1/3)), removes the remaining negative coefficients from the last row and reduces f to zero, leaving the basic feasible solution x2 = 8/5, x3 = 19/5, x1 = 0 of the original system, with both artificial variables non-basic. •••
3.7 Duality in Linear Programming

It was shown by Dantzig [13] that the primal problem of minimizing a linear function over a set of linear constraints is equivalent to the dual problem of maximizing another linear function over another set of constraints. Both the objective function and the constraints of the dual problem are obtained from the objective function and constraints of the primal problem. Thus, if the primal problem is defined to be
minimize f_p = Σ_{j=1}^{n} c_j x_j      (n variables)
subject to Σ_{j=1}^{n} a_ij x_j ≥ b_i , i = 1, ..., m ,      (m constraints)
x_j ≥ 0 , j = 1, ..., n ,      (3.7.1)

then the corresponding dual problem is

maximize f_d = Σ_{i=1}^{m} b_i λ_i      (m variables)
subject to Σ_{i=1}^{m} a_ij λ_i ≤ c_j , j = 1, ..., n ,      (n constraints)
λ_i ≥ 0 , i = 1, ..., m .      (3.7.2)
The choice of the primal or dual formulation depends on the number of design vari-
ables and the number of constraints. The computational effort in solving an LP
problem increases as the number of constraints increases. Therefore, if the number
of constraint relations is large compared to the number of design variables then it
may be desirable to solve the dual problem which will require less computational
effort. The classification of problems into the primal and dual categories is, however,
arbitrary since if the maximization problem is defined as the primal then the min-
imization problem is its dual. It can be shown [13] that the optimal values of the
basic variables of the primal can be obtained from the solution of the dual and that
(f_p)min = (f_d)max. Thus, if x_j is a basic variable in the primal problem, the jth constraint of the dual problem is active, and vice versa.
If the primal problem is stated in its standard form, namely with equality constraints,

minimize f_p = c^T x      (n variables)
subject to Ax = b , x ≥ 0 ,

then the corresponding dual problem is the maximization of f_d = b^T λ subject to A^T λ ≤ c, with the dual variables λ_i unrestricted in sign.

Example 3.7.1
As an example of the simplex method for solving an LP problem via the dual formu-
lation we use the portal frame problem formulated in Example 3.1.5 with a slightly
different loading condition. The new loading condition is assumed to correspond to
a 25% increase in the magnitude of the horizontal load while keeping the magnitude
of the vertical load the same. The corresponding constraint equations have different
right-hand sides than those given in Eqs. (3.5.4) through (3.5.9), namely
4X2 ~ 1,
2Xl + 2X2 ~ 1,
Xl + X2 ~ 1.25, (3.7.5)
2Xl ~ 1.25,
However, when put into the standard form, not only does the problem involve a total
of 8 variables, but also a basic feasible solution to the problem is not immediately
obvious. Because the objective function (3.1.25) involves only two variables Xl and X2
the solution of the dual problem may be more efficient. The dual problem is
maximize f_d = λ1 + λ2 + (5/4)λ3 + (5/4)λ4 + (7/2)λ5 + (7/2)λ6      (3.7.7)
subject to 2λ2 + λ3 + 2λ4 + 2λ5 + 4λ6 ≤ 2 ,
4λ1 + 2λ2 + λ3 + 4λ5 + 2λ6 ≤ 1 ,      (3.7.8)
λ_i ≥ 0 , i = 1, ..., 6 .
Maximizing f_d is the same as minimizing −f_d, and the process of converting the above linear problem to the standard form yields

minimize −f_d = −λ1 − λ2 − (5/4)λ3 − (5/4)λ4 − (7/2)λ5 − (7/2)λ6      (3.7.9)
subject to 2λ2 + λ3 + 2λ4 + 2λ5 + 4λ6 + λ7 = 2 ,
4λ1 + 2λ2 + λ3 + 4λ5 + 2λ6 + λ8 = 1 ,      (3.7.10)
λ_i ≥ 0 , i = 1, ..., 8 ,

with the basic feasible solution

λ_i = 0 , i = 1, ..., 6 , and λ7 = 2 , λ8 = 1 .
We can begin with the initial simplex tableau, with the basic variables already cleared from the last equation, which represents the objective function (columns λ1 through λ8):

[  0   2   1     2     2     4   1  0 | 2 ]
[  4   2   1     0     4     2   0  1 | 1 ]      (3.7.11)
[ −1  −1  −5/4  −5/4  −7/2  −7/2  0  0 | 0 ] = −f_d
Although we should perhaps choose the fifth or sixth column for pivoting, since it has the largest negative coefficient, pivoting on the third column produces the same final answer with one less simplex tableau. Pivoting on element a23 we have

[ −4   0    0     2    −2    2   1   −1  |  1  ]
[  4   2    1     0     4    2   0    1  |  1  ]      (3.7.12)
[  4   3/2  0   −5/4   3/2  −1   0   5/4 | 5/4 ] = −f_d
Because of the presence of negative terms in the last equation, it is clear that the objective function can still be decreased. Pivoting on element a14 we obtain

[ −2    0    0   1  −1    1    1/2  −1/2 |  1/2 ]
[  4    2    1   0   4    2    0     1   |   1  ]      (3.7.13)
[  3/2  3/2  0   0  1/4  1/4   5/8   5/8 | 15/8 ] = −f_d
Since no negative coefficients remain in the last row, we conclude that (−f_d)min = −15/8, or (f_d)max = (f_p)min = 15/8, with the solution

λ3 = 1 , λ4 = 1/2 , λ1 = λ2 = λ5 = λ6 = 0 .      (3.7.14)
The non-zero λ's indicate that the active constraints in the primal problem are the third and fourth, namely

2x1 = 1.25 , and x1 + x2 = 1.25 .      (3.7.15)

Solution of Eqs. (3.7.15) yields x1 = x2 = 5/8. •••
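The primal-dual relations of this example can be verified with a few lines of arithmetic. In this sketch (our own illustration) the constraint rows and right-hand sides are transcribed from Eq. (3.7.5), the fifth and sixth rows being recovered from the dual constraints (3.7.8), together with the primal and dual optima quoted above:

```python
# Numerical check of the primal-dual relations of Example 3.7.1.
A = [[0, 4], [2, 2], [1, 1], [2, 0], [2, 4], [4, 2]]   # rows of Eq. (3.7.5)
b = [1, 1, 1.25, 1.25, 3.5, 3.5]
c = [2, 1]
x = [0.625, 0.625]                 # primal optimum x1 = x2 = 5/8
lam = [0, 0, 1, 0.5, 0, 0]         # dual optimum, Eq. (3.7.14)

fp = sum(ci * xi for ci, xi in zip(c, x))
fd = sum(bi * li for bi, li in zip(b, lam))
print(fp, fd)                      # 1.875 1.875: (f_p)min == (f_d)max

# Complementary slackness: a positive lambda_i implies an active constraint.
for row, bi, li in zip(A, b, lam):
    slack = sum(a * xi for a, xi in zip(row, x)) - bi
    assert li == 0 or abs(slack) < 1e-12
```

The two objective values agree, and the only constraints with zero slack at the primal optimum are the third and fourth, matching the non-zero dual variables.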
In closing this section, it is interesting to point out that the dual variables can be interpreted as the prices of the constraints. For a given variation Δb in the right-hand side b of the constraint relations of Eq. (3.7.5), the change in the optimum value of the objective function can be determined from

Δf* = Σ_i λ_i* Δb_i .
Example 3.7.2
Consider the portal frame design problem solved in Example 3.7.1 using dual variables. We will determine the change in the value of the optimum objective function f* = 1.875 corresponding to a 25% reduction in the value of the horizontal force,
keeping the vertical force at p. These loads correspond to the problem formulated in Example 3.1.5 and solved graphically in Example 3.4.1.

From Eqs. (3.7.5) and (3.1.26) through (3.1.31), the changes in the right-hand sides are Δb3 = Δb4 = −1/4 and Δb5 = Δb6 = −1/2. Using the values of the dual variables from Example 3.7.1 we obtain

Δf* = λ3* Δb3 + λ4* Δb4 = (1)(−1/4) + (1/2)(−1/4) = −3/8 ,

so that the new optimum is f* = 1.875 − 0.375 = 1.5, in agreement with the graphical solution of Example 3.4.1. •••
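The constraint-price computation is just a weighted sum of the right-hand-side changes, with the dual variables as weights. A minimal check (our own sketch; the exact values of Δb5 and Δb6 are immaterial, since λ5* = λ6* = 0):

```python
# Constraint-price check for Example 3.7.2, using the nonzero dual variables
# lambda_3* = 1 and lambda_4* = 1/2 from Example 3.7.1.
lam = {3: 1.0, 4: 0.5}
db = {3: -0.25, 4: -0.25, 5: -0.5, 6: -0.5}
df = sum(lam.get(i, 0.0) * dbi for i, dbi in db.items())
print(df, 1.875 + df)    # -0.375 1.5, the optimum of Example 3.4.1
```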
3.8 An Interior Method - Karmarkar's Algorithm

In using the simplex algorithm discussed in Section 3.6, we operate entirely along the boundaries of the polytope in R^n, moving from one extreme point (vertex) to another along the shortest path between them, an edge of the polytope. Of all the possible vertices adjacent to the one at which we start, the next vertex is selected on the basis of the maximum reduction in the objective function. With these basic premises, the simplex algorithm is only a systematic approach for identifying and examining candidate solutions to the LP problem. The number of operations needed for convergence can therefore grow very rapidly with the problem size: in the worst case, the number of vertices that may have to be examined for an n-variable problem with a set of s constraints is s!/[n!(s − n)!]. However, it is possible to choose a move direction that is different from an edge of the polytope, is consistent with the constraint relations, and attains larger gains in the objective function. Although such a choice can lead to a rapid descent toward the optimal vertex, it does so through intermediate points which are not vertices.
Interior methods of solving LP problems have drawn serious attention only since the dramatic introduction of Karmarkar's algorithm [14] at AT&T Bell Laboratories. This new algorithm was originally claimed to be 50 times faster than the simplex
method. Since then, much work has been invested in improvements and extensions of Karmarkar's algorithm. Developments include a demonstration of how dual solutions can be generated during the course of the algorithm [15], and an extension of Karmarkar's algorithm to treat upper and lower bounds more efficiently [16] by eliminating the slack variables which are commonly used for such bounds in the simplex algorithm.
Because some of the recent developments of the algorithm are mathematically involved and beyond the scope of this book, only a general outline of Karmarkar's algorithm is presented in the following sections. At this point we warn the reader that the tools used in the algorithm were originally introduced for the minimization of constrained and unconstrained nonlinear functions, which are covered in Chapters 4 and 5. Therefore, the reader is advised to read these chapters before proceeding to the next section.
3.8.1 Direction of Move
Example 3.8.1

minimize f = c^T x = x1 + 2x2 + 3x3
subject to x1 + x2 + x3 = 1 ,      (3.8.4)
x ≥ 0 .      (3.8.5)

Starting at the initial point x(0) = (1/3, 1/3, 1/3)^T, determine the direction of move.
The design space and the constraint surface for the problem are shown in Figure (3.8.1), in which the direction corresponding to the negative of the gradient vector is marked as −c and the projected direction as Pc. The projection matrix for the problem can be obtained from Eq. (3.8.2) with A = [1 1 1]. The system AA^T y = Ac produces a scalar,

y = 2 .

The projected direction Pc is then given by

Pc = c − y A^T ,      (3.8.7)

Pc = (1, 2, 3)^T − (2, 2, 2)^T = (−1, 0, 1)^T ,      (3.8.8)

and the move toward the minimum is made along −Pc = (1, 0, −1)^T,
and the procedure has to be repeated until reasonable convergence to the minimum point is achieved. •••
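For a single equality constraint the projection of Eq. (3.8.7) reduces to scalar arithmetic. A small sketch of the computation above (variable names are our own):

```python
# Projected direction for the example above. With the single row
# A = [1, 1, 1], the system AA^T y = Ac is scalar.
c = [1.0, 2.0, 3.0]
A = [1.0, 1.0, 1.0]
y = sum(a * ci for a, ci in zip(A, c)) / sum(a * a for a in A)   # y = 2
Pc = [ci - y * a for ci, a in zip(c, A)]                         # c - y A^T
print(Pc)     # [-1.0, 0.0, 1.0]; the move is along -Pc = (1, 0, -1)
assert abs(sum(a * p for a, p in zip(A, Pc))) < 1e-12            # A Pc = 0
```

The final assertion confirms that the projected direction lies in the constraint surface Ax = 0, so a move along it preserves feasibility of the equality constraint.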
In the preceding example no explanation was provided for the selection of the initial design point, nor for the distance travelled in the chosen direction. Karmarkar [14] stops the move before hitting the polytope boundary, say at x(1) = (19/30, 1/3, 1/30)^T in the previous example, so that there will be room left to move in the next iteration. That is, starting either at the polytope boundary or close to it increases the chances of hitting another boundary before making real gains in the objective function. This difficulty is overcome by transforming the design space, as discussed in the next section.
3.8.2 Transformation of Coordinates
In order to focus on the ideas which are important for his algorithm, Karmarkar
[14] makes several assumptions with respect to the form of the LP problem. In his
canonical representation, the LP problem takes the following form,
minimize f = c^T x̄      (3.8.9)
subject to A x̄ = 0 ,      (3.8.10)
e^T x̄ = 1 ,      (3.8.11)
x̄ ≥ 0 ,      (3.8.12)

where e is the n-vector e = (1, ..., 1)^T. The variable x̄ represents the transformed coordinates, defined such that the initial point is the center, x̄(0) = e/n, of a unit simplex, and is a feasible point, A x̄(0) = 0. A simplex is a generalization to n dimensions of the 2-dimensional triangle and the 3-dimensional tetrahedron. A unit simplex has edges of unit length along each of the coordinate directions. Karmarkar also assumes that
c^T x̄ ≥ 0 for every point that belongs to the simplex, and that the target minimum value of
the objective function is zero. Conversion of the standard form of an LP problem into
this new canonical form can be achieved through a series of operations that involve combining the primal and dual forms of the standard formulation, introduction of slack and artificial variables, and transformation of coordinates. The combination of the primal and dual formulations is needed to accommodate the assumption that the target minimum value of the objective function be zero. Details of the formation of this new canonical form are provided in Ref. [14]. In this section we demonstrate the coordinate transformation, which is referred to as the projective rescaling transformation. This is the transformation that creates room for the move as we proceed from one iteration to the next.
Consider an arbitrary initial point x^(a) in the design space, and let

D = diag( x_1^(a), ..., x_n^(a) ) .      (3.8.13)

The transformation T used by Karmarkar maps each facet of the simplex given by x_i = 0 onto the corresponding facet x̄_i = 0 in the transformed space, and is given by

x̄ = D^{−1} x / ( e^T D^{−1} x ) .      (3.8.14)
While mapping the unit simplex onto itself, this transformation moves the point x^(a) to the center of the simplex, x̄^(a) = (1/n)e. Karmarkar showed that repeated application of this transformation leads, even in the worst case, to convergence to the optimal corner in a number of arithmetic operations that grows only polynomially with the problem size (O(n^3.5 L) in his analysis, L being the length of the problem data).
Karmarkar's transformation is nonlinear, and a simpler form of the transformation has been suggested [18]. This linear transformation,

x̄ = D^{−1} x ,      (3.8.15)

rescales the design variables without the projective normalization.
3.9 Integer Linear Programming

Solution techniques for the LP problems considered so far have been developed under the assumption that the design variables are positive and continuous valued; they can thus assume any value between their lower and upper bounds. In certain design situations, some or all of the variables of an LP problem are restricted to take discrete values. That is, the standard form of the LP problem of Eqs. (3.5.1)-(3.5.3) takes the form

minimize f(x) = c^T x
such that Ax = b ,      (3.9.1)
x_i ∈ X_i = {d_i1, d_i2, ..., d_il} , i ∈ I_d ,

where I_d is the set of design variables that can take only discrete values, and X_i is
the set of allowable discrete values. Design variables such as cross-sectional areas of
trusses and ply thicknesses of laminated composite plates often fall into this category. Problems with discrete-valued design variables are called discrete programming problems.
When the allowable discrete values are integers, the problem becomes an integer linear programming problem. The form in which certain design variables are allowed to remain continuous is referred to as a mixed integer linear programming (MILP) problem. Problems where all variables are integer are called pure ILP problems or, in short, ILP problems. It is also common to have problems where design variables are used to indicate a 0/1 type decision-making situation. Such problems are referred to as zero/one or binary ILP problems. For example, a truss design problem where the presence of a particular member, or the lack of it, is represented by a binary variable falls into this category. Any ILP problem with an upper bound of 2^K − 1 on the design variable x_i can be posed as a binary ILP problem by replacing the variable with K binary variables x_i1, ..., x_iK such that

x_i = x_i1 + 2 x_i2 + ... + 2^(K−1) x_iK .      (3.9.3)
The MILP problem may then be stated as

minimize f(x) = c^T x
such that Ax = b , x ≥ 0 ,      (3.9.4)
x_i integer for i ∈ I_d .
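The binary expansion of Eq. (3.9.3) is conveniently written with shift operations. A minimal sketch (the helper names are our own):

```python
# Binary expansion of Eq. (3.9.3): an integer variable 0 <= x_i <= 2^K - 1
# is replaced by K zero/one variables.
def encode(x, K):
    """Return the binary variables x_i1, ..., x_iK of Eq. (3.9.3)."""
    return [(x >> k) & 1 for k in range(K)]

def decode(bits):
    """Recover x_i = x_i1 + 2 x_i2 + ... + 2^(K-1) x_iK."""
    return sum(b << k for k, b in enumerate(bits))

K = 4                        # admits 0 <= x_i <= 15
bits = encode(11, K)
print(bits, decode(bits))    # [1, 1, 0, 1] 11
```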
Example 3.9.1

Consider the binary ILP problem of choosing a combination of five 0/1 variables x_i such that the following summation is satisfied:

f = Σ_{i=1}^{5} i x_i = 5 .

The combinations can be examined systematically by means of the enumeration tree shown in Figure 3.9.1.

Figure 3.9.1 Enumeration tree for the binary ILP problem of f = Σ_{i=1}^{5} i x_i = 5.
For the present problem, after considering 19 possible combinations of variables, we identified 3 feasible solutions, which are marked by an asterisk. This is a 40% reduction in the total number of possible trials, namely 2^5 = 32, needed to identify all feasible solutions. For a structural design problem, in which trials with different combinations of variables would possibly require expensive analyses, an enumeration tree can yield substantial savings. •••
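The enumeration tree saves work by pruning partial combinations; a plain brute force over all 2^5 = 32 binary vectors finds the same three feasible solutions:

```python
from itertools import product

# Brute-force counterpart of the enumeration tree of Figure 3.9.1: try all
# 2^5 = 32 binary vectors (x1, ..., x5) and keep those with sum(i * x_i) = 5.
feasible = [x for x in product((0, 1), repeat=5)
            if sum(i * xi for i, xi in enumerate(x, start=1)) == 5]
print(len(feasible))      # 3: x5 = 1; or x2 = x3 = 1; or x1 = x4 = 1
```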
Branch-and-Bound Algorithm

The basic concept behind the enumeration technique forms the basis for this powerful algorithm, which is suitable for MILP problems as well as nonlinear mixed integer problems [20,21]. The original algorithm, developed by Land and Doig [22], relies on calculating upper and lower bounds on the objective function, so that nodes that lead to designs with objective functions outside the bounds can be fathomed and, therefore, the number of analyses required can be cut back. Consider the mixed ILP problem of
Eq. (3.9.4). The first step of the algorithm is to solve the LP problem obtained from
the MILP problem by assuming the variables to be continuous valued. If all the x
variables for the resulting solution have integer values, there is no need to continue; the problem is solved. Suppose, instead, that several of the variables assume noninteger values and the objective function value is f_1. The value f_1 forms a lower bound f_l = f_1 for the MILP, since imposing conditions that require any of the noninteger-valued variables to take integer values can only cause the objective function to increase. This initial problem is labeled LP-1 and is placed in the top node of the enumeration tree, as shown in Figure (3.9.2). For the purpose of illustration, it is assumed that only two variables, x_k and x_{k+1}, violate the integer requirement, with x_k = 4.3 and x_{k+1} = 2.8.
The second step of the algorithm is to branch from the node into two new LP
problems by adding a new constraint to the LP-1 that would involve only one of the
noninteger variables, say Xk. One of the problems, LP-2, will require the value of the
branched variable, Xk to be less than or equal to the largest integer smaller than Xk,
and the other, LP-3, will have a constraint that x_k is greater than or equal to the smallest integer larger than x_k. As will be demonstrated later in Example 3.9.2, these two problems actually do partition the feasible design space of LP-1 into two segments. There are several possibilities for the solution of these two new problems. One of these
possibilities is to have no feasible solution for the new problem. In that case the new
node will be fathomed. Another possibility is to reach an all integer feasible solution
(see LP-3 of Figure 3.9.2) in which case the node will again be fathomed but the value
of the objective function will become an upper bound f_u for the MILP problem. That is, beyond this solution point, any node that has an LP solution with a larger value of the objective function will be fathomed, and only those solutions that have the potential of producing an objective function between f_l and f_u will be pursued. If there are no solutions with an objective function smaller than f_u, then the node is an optimum solution. If there are other solutions with an objective function smaller than f_u, they may still include noninteger-valued variables (LP-2 of Figure 3.9.2), and are labeled as live nodes. Live nodes are then branched again by considering one of the remaining noninteger variables, and the resulting solutions are analyzed until all the nodes are fathomed.
Example 3.9.2
Consider the portal frame problem of Example 3.1.5 (see Eqs. (3.1.25) through (3.1.31)) with the requirement that x_i ∈ {0.0, 0.2, 0.4, 0.6, 0.8, 1.0}, i = 1, 2. We rescale the design variables (and the objective function) by a factor of 5 to pose the problem as an integer linear programming problem,

minimize f = 2x1 + x2
such that x2 ≥ 1.25 ,
x1 + x2 ≥ 2.5 ,
x1 + x2 ≥ 5 ,
x1 ≥ 2.5 ,
x1 + 2x2 ≥ 7.5 ,
2x1 + x2 ≥ 7.5 ,
x_i ≥ 0 and integer, i = 1, 2 .
Graphical solution of this scaled problem (presented, without the integer design variable requirement and before scaling, in Example 3.4.1) gives

x1 = x2 = 2.5 , f = 7.5 ,

which forms a lower bound for the objective function, f_l = 7.5. That is, the optimal integer solution cannot have an objective function smaller than f_l = 7.5. Next, we choose x1 and investigate solutions for which x1 ≤ 2 and x1 ≥ 3 by forming two new LP's, adding each one of these constraints to the original set of constraints. Since the original set has a constraint that requires x1 ≥ 2.5, the first LP problem with x1 ≤ 2 has no solution. The solution of the second LP is shown graphically in Figure (3.9.3). The active constraints at the optimum are x1 ≥ 3 and x1 + 2x2 ≥ 7.5, and the solution is

x1 = 3 , x2 = 2.25 , f = 8.25 .
Since x2 is still noninteger, we create two more LP's, this time by imposing x2 ≤ 2 and x2 ≥ 3, respectively. Graphical solutions of the new LP's are shown in Figure (3.9.4). The solution for the case x2 ≥ 3 is at the vertex x1 = 3 and x2 = 3, and is a feasible solution of the integer problem with an objective function value of f = 9. This value of the objective function, therefore, establishes an upper bound, f_u = 9, for the problem. The solution for the case x2 ≤ 2, on the other hand, is at the intersection of x2 = 2 and x1 + 2x2 = 7.5, leading to

x1 = 3.5 , x2 = 2 , and f = 9 .
This solution is not discrete and could be interrogated further by branching on x1 (that is, by creating new LP's with the added constraints x1 ≤ 3 and x1 ≥ 4). However, since its objective function is already equal to the upper bound, we cannot improve the objective function any further; introducing an additional constraint could only increase it. Therefore, the optimal solution is the one with x1 = x2 = 3, and f = 9. •••
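Since the scaled variables are restricted to {0, 1, ..., 5}, the branch-and-bound result can also be confirmed by exhaustive enumeration. A small check (our own sketch of the constraints of this example):

```python
from itertools import product

# Exhaustive check of Example 3.9.2: over the scaled integer grid
# x1, x2 in {0, ..., 5}, the result x1 = x2 = 3, f = 9 is the optimum.
def feasible(x1, x2):
    return (x2 >= 1.25 and x1 + x2 >= 2.5 and x1 + x2 >= 5 and
            x1 >= 2.5 and x1 + 2 * x2 >= 7.5 and 2 * x1 + x2 >= 7.5)

best = min((2 * x1 + x2, x1, x2)
           for x1, x2 in product(range(6), repeat=2) if feasible(x1, x2))
print(best)   # (9, 3, 3)
```

Such a brute force is affordable only because the grid here has 36 points; for larger problems the pruning of branch-and-bound is what keeps the search tractable.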
Branch-and-Bound is only one of the algorithms for the solution of ILP or MILP problems. However, because of its simplicity, it is incorporated into many commercially available computer programs [23, 24]. There are a number of other techniques which are capable of handling general discrete-valued problems (see, for example, Ref. [25]). Some of these algorithms are good not only for ILP problems but also for NLP problems with integer variables. In particular, methods based on probabilistic search algorithms are emerging for many applications, including structural design applications, that involve linear and nonlinear programming problems. Two such techniques, namely simulated annealing and genetic algorithms, are discussed in Chapter 4. Another approach, based on an extension of the penalty function approach for constrained NLP problems, is presented in Chapter 5. Finally, the use of dual variables (which were shown to be useful as prices of constraints in Section 3.7) in ILP problems is discussed in Chapter 9.
One of the interesting design applications of ILP was introduced by Haftka and Walsh [26] for the stacking sequence design of laminated composite plates for improved buckling response. Since the formulation of this problem involves material introduced in Chapter 11, discussion and demonstration of this application is presented in that chapter.
3.10 Exercises
1. Estimate the limit load for the three-bar truss of Example 3.1.2 using a graphical approach. Verify your solution using the simplex method.
2. Consider the platform support system shown in Figure 3.10.1, in which cables 1 and 2 can support loads up to 400 lb each, cables 3 and 4 up to 150 lb each, and cables 5 and 6 up to 75 lb each. Neglect the weight of the platforms and cables, and assume the weights w1, w2, and w3 act at the positions indicated in the figure. Also neglect the bending failure of the platforms. Using linear programming, determine the maximum total load that the system can support.
3. Solve the limit design problem for the truss of Figure 3.1.4 using the simplex algorithm. Assume A13 = A24 = A34 and A14 = A23, and use an appropriate non-dimensionalization.
4. Using the method of virtual displacements, verify that the collapse mechanisms for the portal frame of Figure 3.1.6 lead to Eqs. (3.1.26) through (3.1.31) in terms of the nondimensional variables x1 and x2.
5. The single bay, two story portal frame shown in Figure (3.10.2) is subjected
to a single loading condition consisting of 4 concentrated loads as shown. Following
Example 3.1.5 formulate the LP problem for the minimum weight design of the frame
against plastic collapse.
6. Consider the continuous prestressed concrete beam shown in Figure (3.10.3).
a) Verify that the equivalent uniformly distributed upward force exerted on the concrete beam by a prestressing cable with a force f and a parabolic profile, defined by the eccentricities y1, y2, and y3 at the three points x = 0, x = l/2, and x = l, respectively, is given by
b) The beam in the figure is subjected to two loading conditions: the first con-
sisting of a dead load of 1 kip/ft together with an equivalent load due to a parabolic
prestressing cable with a force f, and the second due to an additional live load of 2.5 kips/ft in service. It is assumed, however, that in service a 15% loss of the prestressing force is to be expected. Formulate the LP problem for the minimum cost design of the beam, assuming f, y1, and y2 as design variables. Assume the allowable stresses for the two loading conditions to be σ1 = 200 psi, σ1' = −3000 psi, σ2 = 0 psi, σ2' = −2000 psi, and the upper and lower bounds on the eccentricities y1 and y2 to be 0.4 ft ≤ y_i ≤ 2.6 ft, i = 1, 2.
c) Solve the LP problem by the simplex algorithm and obtain the solution for the
minimum prestressing force and the tendon profile.
7. Consider the statically determinate truss of Figure 3.3.1 and its minimum weight design formulation as described by Eqs. (3.3.9) through (3.3.13). Use the linearization scheme implied by Eqs. (3.3.2) through (3.3.5) to formulate the LP problem for m = 3. Solve the LP by the simplex algorithm and compare the approximate solution with
3.11 References
[1] Charnes, A. and Greenberg, H. J., "Plastic Collapse and Linear Programming,"
Bull. Am. Math. Soc., 57, 480, 1951.
[2] Calladine, C.R., Engineering Plasticity. Pergamon Press, 1969.
[3] Cohn, M.Z., Ghosh, S.K. and Parimi, S.R., "Unified Approach to Theory of Plas-
tic Structures," Journal of the EM Division, 98 (EM5), pp. 1133-1158, 1972.
[4] Neal, B. G., The Plastic Methods of Structural Analysis, 3rd edition, Chapman
and Hall Ltd., London, 1977.
[5] Zeman, P. and Irvine, H. M., Plastic Design, An Imposed Hinge-Rotation Ap-
proach, Allen and Unwin, Boston, 1986.
[6] Massonet, C.E. and Save, M.A., Plastic Analysis and Design, Beams and Frames,
Vol. 1. Blaisdell Publishing Co., 1965.
[7] Lin, T.Y. and Burns, N.H., Design of Prestressed Concrete Structures, 3rd ed.
John Wiley and Sons, New York, 1981.
[8] Parme, A.L. and Paris, G.H., "Designing for Continuity in Prestressed Concrete
Structures," J. Am. Concr. Inst., 23 (1), pp. 45-64, 1951.
[9] Morris, D., "Prestressed Concrete Design by Linear Programming," J. Struct.
Div., 104 (ST3), pp. 439-452, 1978.
[10] Kirsch, U., "Optimum Design of Prestressed Beams," Computers and Structures
2, pp. 573-583, 1972.
[11] Luenberger, D. G., Introduction to Linear and Nonlinear Programming, Addison-
Wesley, Reading, Mass., 1973.
[12] Majid, K.I., Nonlinear Structures, London, Butterworths, 1972.
[13] Dantzig, G., Linear Programming and Extensions, Princeton University Press,
Princeton, NJ, 1963.
[14] Karmarkar, N., "A New Polynomial-Time Algorithm for Linear Programming,"
Combinatorica,4 (4), pp. 373-395, 1984.
[15] Todd, M. J. and Burrell, B. P., "An Extension of Karmarkar's Algorithm for
Linear Programming Using Dual Variables," Algorithmica, 1, pp. 409-424, 1986.
[16] Rinaldi, G., "A Projective Method for Linear Programming with Box-type Con-
straints," Algorithmica, 1, pp. 517-527, 1986.
[17] Strang, G., "Karmarkar's Algorithm and its Place in Applied Mathematics," The
Mathematical Intelligencer, 9, 2, pp. 4-10, 1987.
[18] Vanderbei, R. F., Meketon, M. S., and Freedman, B. A., "A Modification of Kar-
markar's Linear Programming Algorithm," Algorithmica, 1, pp. 395-407, 1986.
[19] Garfinkel, R. S., and Nemhauser, G. L., Integer Programming, John Wiley &
Sons, Inc., New York, 1972.
[20] Lawler, E. L., and Wood, D. E., "Branch-and-Bound Methods: A Survey," Operations Research, 14, pp. 699-719, 1966.
[21] Tomlin, J. A., "Branch-and-Bound Methods for Integer and Non-convex Programming," in Integer and Nonlinear Programming, J. Abadie (ed.), pp. 437-450, Elsevier Publishing Co., New York, 1970.
[22] Land, A. H., and Doig, A. G., "An Automatic Method for Solving Discrete Pro-
gramming Problems," Econometrica, 28, pp. 497-520, 1960.
[23] Johnson, E. L., and Powell, S., "Integer Programming Codes," in Design and
Implementation of Optimization Software, Greenberg, H. J. (ed.), pp. 225-240,
1978.
[24] Schrage, L., Linear, Integer, and Quadratic Programming with LINDO, 4th Edition, The Scientific Press, Redwood City, CA, 1989.
[25] Kovacs, I. B., Combinatorial Methods of Discrete Programming, Mathematical Methods of Operations Research Series, Vol. 2, Akadémiai Kiadó, Budapest, 1980.
[26] Haftka, R. T., and Walsh, J. L., "Stacking-sequence Optimization for Buckling
of Laminated Plates by Integer Programming," AIAA J. (in press).
4 Unconstrained Optimization
4.1 Minimization of Functions of One Variable

x = x_k + \alpha s_k ,   (4.1.1)

where α is usually referred to as the step length. The function f(x) to be minimized can, therefore, be expressed as

f(x_k + \alpha s_k) = f(\alpha) .   (4.1.2)

Thus, the minimization problem reduces to finding the value α* that minimizes the function f(α). In fact, one of the simplest methods used in minimizing functions of n variables is to seek the minimum of the objective function by changing only one variable at a time, while keeping all other variables fixed, and performing a one-dimensional minimization along each of the coordinate directions of an n-dimensional design space. This procedure is called the univariate search technique.
In classifying the minimization algorithms for both the one-dimensional and
multi-dimensional problems we generally use three distinct categories. These cat-
egories are the zeroth, first, and second order methods. Zeroth order methods use
only the value of the function during the minimization process. First order methods
employ values of the function and its first derivatives with respect to the variables.
Finally, second order methods use the values of the function and its first and sec-
ond derivatives. In the following discussion of one-variable function minimizations,
the function is assumed to be in the form f = f(a). However, the methods to be
discussed are equally applicable for minimization of multivariable problems along a
preselected direction, s, using Eq. (4.1.1).
Bracketing Method. As the name suggests, this method brackets the minimum of the function to be minimized between two points, through a series of function evaluations. The method begins with an initial point α₀, a function value f(α₀), a step size β₀, and a step expansion parameter γ > 1. The steps of the algorithm [2] are outlined as

1. Evaluate f(α₀) and f(α₀ + β₀).

2. If f(α₀ + β₀) < f(α₀), let α₁ = α₀ + β₀ and β₁ = γβ₀, and evaluate f(α₁ + β₁). Otherwise go to step 4.

3. If f(α₁ + β₁) < f(α₁), let α₂ = α₁ + β₁ and β₂ = γβ₁, and continue incrementing the subscripts this way until f(α_k + β_k) > f(α_k). Then, go to step 8.

4. Let α₁ = α₀ and β₁ = −δβ₀, where δ is a constant that satisfies 0 < δ < 1/γ, and evaluate f(α₁ + β₁).

5. If f(α₁ + β₁) > f(α₁) go to step 7.

6. Let α₂ = α₁ + β₁ and β₂ = γβ₁, and continue incrementing the subscripts this way until f(α_k + β_k) > f(α_k). Then, go to step 8.

7. The minimum has been bracketed between the points (α₀ − δβ₀) and (α₀ + β₀). Go to step 9.

8. The last three points satisfy the relations f(α_{k−2}) > f(α_{k−1}) and f(α_{k−1}) < f(α_k), and hence the minimum is bracketed.
9. Use either one of the two end points of the bracket as the initial point. Begin
with a reduced step size and repeat steps 1 through 8 to locate the minimum to a
desired degree of accuracy.
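The stepping logic of steps 1 through 8 can be sketched as follows; the test functions, starting values, and the parameter choices γ = 2 and δ = 0.25 are illustrative assumptions only:

```python
def bracket_minimum(f, a0=0.0, beta0=0.1, gamma=2.0, delta=0.25):
    """Bracket a minimum of f by expanding steps (steps 1-8 above).

    gamma > 1 expands the step on each success; 0 < delta < 1/gamma
    shrinks the first backward step when f initially increases.
    """
    f0 = f(a0)
    a, beta = a0, beta0
    if f(a0 + beta0) >= f0:
        # Step 4: reverse direction with a reduced step.
        beta = -delta * beta0
        if f(a0 + beta) >= f0:
            # Step 7: f rises on both sides of a0, so the minimum lies
            # between a0 - delta*beta0 and a0 + beta0.
            return (a0 - delta * beta0, a0 + beta0)
    # Steps 2-3 (or 5-6): keep stepping until the function increases.
    while f(a + beta) < f(a):
        a = a + beta
        beta = gamma * beta
    # Step 8: the minimum lies between the last three points.
    lo, hi = sorted((a - beta / gamma, a + beta))
    return (lo, hi)
```

Per step 9, the returned interval would then be re-bracketed with a smaller β₀ until the desired accuracy is reached.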
Quadratic Interpolation. The method known as quadratic interpolation was first proposed by Powell [3] and uses the values of the function f to be minimized at three points to fit a parabola

p(\alpha) = a + b\alpha + c\alpha^2 ,   (4.1.3)

through those points. The method starts with an initial point, say, α = 0 with a function value p₀ = f(x₀), and a step size β. Two more function evaluations are performed as described in the following steps to determine the points for the polynomial fit. In general, however, we start with a situation where we have already bracketed the minimum between α_l and α_u by using the bracketing method described earlier. In that case we will only need an intermediate point α₀ in the interval (α_l, α_u).

1. Evaluate p₁ = p(β) = f(x₀ + βs).

2. If p₁ < p₀, then evaluate p₂ = p(2β) = f(x₀ + 2βs). Otherwise evaluate p₂ = p(−β) = f(x₀ − βs). The constants a, b, and c in Eq. (4.1.3) can now be uniquely expressed in terms of the function values p₀, p₁, and p₂ as

a = p_0, \quad b = \frac{4p_1 - 3p_0 - p_2}{2\beta}, \quad c = \frac{p_2 - 2p_1 + p_0}{2\beta^2} , \quad \text{if } p_2 = f(x_0 + 2\beta s),   (4.1.4)

a = p_0, \quad b = \frac{p_1 - p_2}{2\beta}, \quad c = \frac{p_1 - 2p_0 + p_2}{2\beta^2} , \quad \text{if } p_2 = f(x_0 - \beta s).   (4.1.5)

3. The value α = α* at which p(α) is extremized for the current cycle is then given by

\alpha^* = -\frac{b}{2c} .   (4.1.6)
\alpha_1 = a_0 + \frac{F_{n-1}}{F_{n+1}} l_0 ,   (4.1.9)

\alpha_2 = b_0 - \frac{F_{n-1}}{F_{n+1}} l_0 ,   (4.1.10)

and

\alpha_{k+1} = a_k + \frac{F_{n-(k+1)}}{F_{n-(k-1)}} l_k , \quad \text{or} \quad \alpha_{k+1} = b_k - \frac{F_{n-(k+1)}}{F_{n-(k-1)}} l_k ,   (4.1.11)

where F_n are the Fibonacci numbers defined by the sequence F₀ = 1, F₁ = 1, F_n = F_{n−2} + F_{n−1}, and l_k is the length of the kth interval (a_k, b_k). The total number of required function evaluations n may be determined from the desired level of accuracy. It can be shown that the interval of uncertainty after n function evaluations is 2εl₀, where

\varepsilon = \frac{1}{F_{n+1}} .   (4.1.12)

Thus, it is possible to approximate the optimal location of the points given by Eqs. (4.1.9 - 4.1.11) by the following relations

\alpha_1 = a_k + 0.382\, l_k ,   (4.1.14)

\alpha_2 = b_k - 0.382\, l_k ,   (4.1.15)

since the ratio F_{n−1}/F_{n+1} approaches (3 − √5)/2 ≈ 0.382 for large n. The resulting method is known as the golden section search.
Example 4.1.1

Determine the value of α, to within ε = ±0.1, that minimizes the function f(α) = α(α − 3) on the interval 0 ≤ α ≤ 2 using the golden section search technique.

From Eqs. (4.1.14) and (4.1.15) we can calculate

α₁ = 0 + 0.382(2) = 0.764,  f(α₁) = −1.708,
α₂ = 2 − 0.382(2) = 1.236,  f(α₂) = −2.180 .

Since f(α₂) < f(α₁) we retain (α₁, 2). Thus, the next point is located at

α₃ = 2 − 0.382(2 − 0.764) = 1.5278,  f(α₃) = −2.249 .

Since f(α₃) < f(α₂) we reject the interval (α₁, α₂). The new interval is (α₂, 2). The next point is located at

α₄ = 2 − 0.382(2 − 1.236) = 1.7082,  f(α₄) = −2.207 .
Figure 4.1.2 Iteration history for the function minimization f(α) = α(α − 3).
Since f(α₄) < f(α₂) < f(2) we reject the interval (α₄, 2) and retain (α₂, α₄) as the next interval and locate the point α₅ at

α₅ = 1.236 + 0.382(1.7082 − 1.236) = 1.4164,  f(α₅) = −2.243 .

Since f(α₅) < f(α₄) < f(α₂) we retain the interval (α₅, α₄). The next point is located at

α₆ = 1.7082 − 0.382(1.7082 − 1.4164) = 1.5967,  f(α₆) = −2.241 .

Since f(α₆) < f(α₄) we reject the interval (α₆, α₄) and retain the interval (α₅, α₆) of length 0.18, which is less than the interval of specified accuracy, 2ε = 0.2. The iteration history for the problem is shown in Figure 4.1.2. Hence, the minimum has been bracketed to within a resolution of ±0.1. That is, the minimum lies between α₅ = 1.4164 and α₆ = 1.5967. We can take the middle of the interval, α = 1.5066 ± 0.0902, as the solution. The exact location of the minimum is at α = 1.5, where the function has the value −2.25. •••
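A minimal golden section search along the lines of Eqs. (4.1.14) and (4.1.15) can be written as follows; applied to the function of this example, it brackets the minimizer near α = 1.5:

```python
def golden_section(f, a, b, eps):
    """Golden section search: shrink (a, b) until its length is below
    2*eps, cutting a 0.382 fraction of the interval at each step."""
    tau = 0.381966  # limit of F_{n-1}/F_{n+1}, the golden section ratio
    x1 = a + tau * (b - a)
    x2 = b - tau * (b - a)
    f1, f2 = f(x1), f(x2)
    while (b - a) > 2.0 * eps:
        if f1 > f2:      # minimum cannot lie in (a, x1): discard it
            a, x1, f1 = x1, x2, f2
            x2 = b - tau * (b - a)
            f2 = f(x2)
        else:            # minimum cannot lie in (x2, b): discard it
            b, x2, f2 = x2, x1, f1
            x1 = a + tau * (b - a)
            f1 = f(x1)
    return 0.5 * (a + b)   # middle of the final interval
```

Note how each cycle reuses one interior point and its function value, so only one new function evaluation is needed per interval reduction.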
Bisection Method. Like the bracketing and the golden section search techniques which progressively reduce the interval where the minimum is known to lie, the bisection technique locates the zero of the derivative f′ by reducing the interval of uncertainty. Beginning with a known interval (a, b) for which f′(a)f′(b) < 0, an approximation to the root of f′ is obtained from

\alpha^* = \frac{a + b}{2} ,   (4.1.17)

which is the point midway between a and b. The value of f′ is then evaluated at α*. If f′(α*) agrees in sign with f′(a), then the point a is replaced by α* and the new interval of uncertainty is given by (α*, b). If, on the other hand, f′(α*) agrees in sign with f′(b), then the point b is replaced by α* and the new interval of uncertainty is (a, α*). The process is then repeated using Eq. (4.1.17).
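A sketch of this procedure, assuming the derivative f′ is available as a function and changes sign on (a, b):

```python
def bisection_min(fprime, a, b, tol=1e-8):
    """Locate a zero of f' in (a, b) by halving the interval of
    uncertainty (Eq. 4.1.17); requires f'(a) and f'(b) of opposite sign."""
    ga, gb = fprime(a), fprime(b)
    assert ga * gb < 0.0, "f' must change sign on (a, b)"
    while (b - a) > tol:
        m = 0.5 * (a + b)
        gm = fprime(m)
        if gm * ga > 0.0:   # f'(m) agrees in sign with f'(a)
            a, ga = m, gm
        else:               # f'(m) agrees in sign with f'(b)
            b, gb = m, gm
    return 0.5 * (a + b)
```

Each step halves the interval, so the number of derivative evaluations grows only logarithmically with the required accuracy.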
Davidon's Cubic Interpolation Method. This is a polynomial approximation method which uses both the values of the function and its derivatives for locating its minimum. It is especially useful in those multivariable minimization techniques which require the evaluation of the function and its gradients.

We begin by assuming the function to be minimized, f(x₀ + αs), to be approximated by a polynomial of the form

p(\alpha) = a + b\alpha + c\alpha^2 + d\alpha^3 ,   (4.1.18)

with constants a, b, c, and d to be determined from the values of the function, p₀ and p₁, and its derivatives, g₀ and g₁, at two points, one located at α = 0 and the other at α = β,

p_0 = p(0) = f(x_0), \qquad p_1 = p(\beta) = f(x_0 + \beta s),   (4.1.19)

and

g_0 = \frac{dp}{d\alpha}(0) = s^T \nabla f(x_0), \qquad g_1 = \frac{dp}{d\alpha}(\beta) = s^T \nabla f(x_0 + \beta s) .   (4.1.20)

After substitutions, Eq. (4.1.18) takes the following form

p(\alpha) = p_0 + g_0 \alpha - \frac{g_0 + e}{\beta}\, \alpha^2 + \frac{g_0 + g_1 + 2e}{3\beta^2}\, \alpha^3 ,   (4.1.21)

where

e = \frac{3(p_0 - p_1)}{\beta} + g_0 + g_1 .   (4.1.22)

We can now locate the minimum, α = α_m, of Eq. (4.1.21) by setting its derivative with respect to α to zero. This results in

\alpha_m = \beta \left( \frac{g_0 + e \pm h}{g_0 + g_1 + 2e} \right) ,   (4.1.23)
where

h = \left( e^2 - g_0 g_1 \right)^{1/2} .   (4.1.24)

It can be easily verified, by checking d²p/dα², that the positive sign must be retained in Eq. (4.1.23) for α_m to be a minimum rather than a maximum. Thus, the algorithm for Davidon's cubic interpolation [5] may be summarized as follows.

1. Evaluate p₀ = f(x₀) and g₀ = sᵀ∇f(x₀), and make sure that g₀ < 0.

2. In the absence of an estimate of the initial step length β, we may calculate it on the basis of a quadratic interpolation derived using p₀, g₀, and an estimate of p_min. Thus,

\beta = \frac{2(p_{min} - p_0)}{g_0} .   (4.1.25)
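The cubic step of Eqs. (4.1.21)-(4.1.24) can be packaged as a small routine; the inputs are the two function values and the two directional derivatives at α = 0 and α = β:

```python
import math

def cubic_step(p0, g0, p1, g1, beta):
    """Minimizer of the cubic fit through (0, p0) and (beta, p1) with
    slopes g0 and g1 along the search direction (Eqs. 4.1.21-4.1.24)."""
    e = 3.0 * (p0 - p1) / beta + g0 + g1       # Eq. (4.1.22)
    h = math.sqrt(e * e - g0 * g1)             # Eq. (4.1.24)
    # Eq. (4.1.23); the positive sign selects the minimum of the cubic.
    return beta * (g0 + e + h) / (g0 + g1 + 2.0 * e)
```

For the cubic f(α) = α³ − 3α, for instance, the data p₀ = 0, g₀ = −3, p₁ = 2, g₁ = 9 at β = 2 reproduce the minimizer α = 1 exactly, as expected since the fit is then exact.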
serves as a new approximation for a repeated application of Eq. (4.1.29) with i re-
placed by i + 1. For a successful convergence to the minimum it is necessary that the
second derivative of the function f be greater than zero. Even so the method may
diverge depending on the starting point. Several strategies exist [6] which modify Newton's method to make it globally convergent (that is, converging to a minimum regardless of the starting point) for multivariable functions; some of these will be covered in the next section.
The reason this method is known as a second order method is not only because it uses second derivative information about the function f, but also because its rate of convergence to the minimum is quadratic. In other words, Newton's algorithm converges to the minimum α* such that

\lim_{i \to \infty} \frac{|\alpha_{i+1} - \alpha^*|}{(\alpha_i - \alpha^*)^2} = \beta ,   (4.1.30)

where α_i and α_{i+1} are the ith and the (i + 1)st estimates of the minimum α*, and β is a non-zero constant.
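A minimal sketch of the one-variable Newton iteration, assuming the standard update α_{i+1} = α_i − f′(α_i)/f′′(α_i):

```python
def newton_1d(fprime, fsecond, a0, tol=1e-10, max_iter=50):
    """Newton's method for a one-variable minimum: iterate the update
    a_{i+1} = a_i - f'(a_i)/f''(a_i) until the step becomes negligible.
    Requires f'' > 0 near the minimum for reliable convergence."""
    a = a0
    for _ in range(max_iter):
        step = fprime(a) / fsecond(a)
        a -= step
        if abs(step) < tol:
            break
    return a
```

For the quadratic f(α) = α(α − 3) of Example 4.1.1, f′′ is constant and the iteration reaches α* = 1.5 in a single step, illustrating the quadratic convergence noted above.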
4.2 Minimization of Functions of Several Variables

Several methods exist for minimizing a function of several variables using only function values. However, only two of these methods may be regarded as being useful. These are the sequential simplex method of Spendley, Hext and Himsworth [8] and
Powell's conjugate direction method [3]. Both of these methods require that the function f(x), x ∈ Rⁿ, be unimodal; that is, the function f has only one minimum. The sequential simplex does not require that the function f be differentiable, while the differentiability requirement on f is implicit in the exact line searches of Powell's method. It appears from tests by Nelder and Mead [9] that for most problems the performance of the sequential simplex method is comparable to, if not better than, that of Powell's method. Both of these methods are considered inefficient for n ≥ 10; Powell's method may fail to converge for n ≥ 30. A more recent modification of the simplex method by Chen, et al. [10] extends the applicability of this algorithm to high dimensional cases. If the function is differentiable, it is usually more efficient to use the more powerful first and second order methods with derivatives obtained explicitly or from finite difference formulae.
Sequential Simplex Method. The sequential simplex method was originally proposed by Spendley, Hext and Himsworth [8] and was subsequently improved by Nelder and Mead [9]. The method begins with a regular geometric figure, called the simplex, consisting of n + 1 vertices in an n-dimensional space. These vertices may be defined by the origin and by points along each of the n coordinate directions. Such a simplex may not be geometrically regular. The following equations are suggested in Ref. 8 for the calculation of the positions of the vertices of a regular simplex of size a in the n-dimensional design space

x_j = x_0 + p\, e_j + q \sum_{k=1,\, k \ne j}^{n} e_k , \qquad j = 1, \ldots, n,   (4.2.1)

with

p = \frac{a}{n\sqrt{2}} \left( \sqrt{n+1} + n - 1 \right), \quad \text{and} \quad q = \frac{a}{n\sqrt{2}} \left( \sqrt{n+1} - 1 \right),   (4.2.2)

where e_k is the unit base vector along the kth coordinate direction, and x₀ is the initial base point. For example, for a problem in a two-dimensional design space Eqs. (4.2.1) and (4.2.2) lead to an equilateral triangle of side a.
Once the simplex is defined, the function f is evaluated at each of the n + 1 vertices x₀, x₁, …, x_n. Let x_h and x_l denote the vertices where the function f assumes its maximum and minimum values, respectively, and x_s the vertex where it assumes the second highest value. The simplex method discards the vertex x_h and replaces it by a point where f has a lower value. This is achieved by three operations, namely reflection, contraction, and expansion.

The reflection operation creates a new point x_r along the line joining x_h to the centroid x̄ of the remaining points, defined as

\bar{x} = \frac{1}{n} \sum_{i=0,\, i \ne h}^{n} x_i .   (4.2.3)
The reflected point is obtained from

x_r = \bar{x} + \alpha (\bar{x} - x_h) ,   (4.2.4)

with α being a positive constant called the reflection coefficient, which is usually assumed to be unity. Any positive value of the reflection coefficient in Eq. (4.2.4) guarantees that x_r is on the other side of x̄ from x_h. If the value of the function at this new point, f_r = f(x_r), satisfies the condition f_l < f_r ≤ f_s, then x_h is replaced by x_r and the process is repeated with this new simplex. If, on the other hand, the value of the function f_r at the end of the reflection is less than the lowest value of the function, f_l = f(x_l), then there is a possibility that we can still decrease the function by going further along the same direction. We seek an improved point x_e by the expansion technique using the relation

x_e = \bar{x} + \beta (x_r - \bar{x}) ,   (4.2.5)

with the expansion coefficient β often being chosen to be 2. If the value of the function f_e is smaller than the value at the end of the reflection step, then we replace x_h by x_e and repeat the process with the new simplex. However, if the expansion leads to a function value equal to or larger than f_r, then we form the new simplex by replacing x_h by x_r and continue.
Finally, if the process of reflection leads to a point x_r such that f_r < f_h, then we replace x_h by x_r and perform contraction. Otherwise (f_r ≥ f_h), we perform contraction without any replacement using

x_c = \bar{x} + \gamma (x_h - \bar{x}) ,   (4.2.6)

with the contraction coefficient γ, 0 < γ < 1, usually chosen to be 1/2. If f_c = f(x_c) is greater than f_h, then we replace all the points by a new set of points

x_i = x_i + \frac{1}{2} (x_l - x_i) , \qquad i = 0, 1, \ldots, n,   (4.2.7)
and restart the process with this new simplex. Otherwise, we simply replace x_h by x_c and restart the process with this simplex. The operation in Eq. (4.2.7) causes the distance between the points of the old simplex and the point with the lowest function value to be halved, and is therefore referred to as the shrinkage operation. The flow chart of the complete method is given in Figure 4.2.1. For the convergence criterion to terminate the algorithm, Nelder and Mead [9] proposed the following

\left\{ \frac{1}{n+1} \sum_{i=0}^{n} \left[ f_i - f(\bar{x}) \right]^2 \right\}^{1/2} < \varepsilon ,   (4.2.8)
Figure 4.2.1 Flow chart of the sequential simplex method.

The modification of the simplex method proposed by Chen, et al. [10] operates on several vertices of the simplex at a time, based on a cutting value of the function

f_{cv} = \frac{f_h + f_l}{2} + \eta s ,   (4.2.9)
where s is the standard deviation of the function values at the n + 1 vertices of the current simplex,

s = \left\{ \frac{1}{n+1} \sum_{i=0}^{n} \left( f_i - \bar{f} \right)^2 \right\}^{1/2} ,   (4.2.10)

and η is a parameter (discussed below) that controls the number of vertices to be operated on. The f̄ value in Eq. (4.2.10) is the average of the function values over the entire current simplex.
The vertices with function values higher than the cutting value form the group to be reflected (and to be dropped). The other vertices serve as reference points. If the parameter η is sufficiently large, all the vertices of the simplex except x_h stay in the group to be used as the reference points and, therefore, the algorithm is equivalent to the original form. For sufficiently small values of the parameter η, all points except x_n are dropped. The selection of the parameter η depends on the difficulty of the problem as well as the number of variables. Recommended values for η are given in Table II of Ref. [10]. Among the n + 1 vertices of the current simplex, we rearrange and number the vertices from largest to smallest function values as x₀, x₁, …, x_{n_cv}, …, x_n, where i = 0, …, n_cv are the elements of the group to be reflected next. The centroid of the vertices in the reference group is defined as
\bar{x} = \frac{1}{n - n_{cv}} \sum_{i=n_{cv}+1}^{n} x_i .   (4.2.11)
The performance of this modified simplex method has been compared [10] with the simplex method proposed by Nelder and Mead, and also with more powerful methods such as the second order Davidon-Fletcher-Powell (DFP) method, which will be discussed later in this chapter. For high dimensional problems the modified simplex algorithm was found to be more efficient and robust than the DFP algorithm. Nelder and Mead [9] have also provided several illustrations of the use of their algorithm in minimizing classical test functions and compared its performance with Powell's conjugate directions method, which will be discussed next.
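The reflection-expansion-contraction-shrinkage cycle described above can be sketched compactly as follows, with the usual coefficient choices α = 1, β = 2, γ = 1/2; for brevity this sketch starts from coordinate steps rather than the regular simplex of Eq. (4.2.2):

```python
import numpy as np

def nelder_mead(f, x0, size=1.0, alpha=1.0, beta=2.0, gamma=0.5,
                eps=1e-8, max_iter=1000):
    """Minimal sequential simplex (Nelder-Mead) sketch."""
    n = len(x0)
    pts = [np.asarray(x0, dtype=float)]
    for k in range(n):                      # initial simplex vertices
        v = pts[0].copy()
        v[k] += size
        pts.append(v)
    vals = [f(p) for p in pts]
    for _ in range(max_iter):
        order = np.argsort(vals)            # sort: best first, worst last
        pts = [pts[i] for i in order]
        vals = [vals[i] for i in order]
        if np.std(vals) < eps:              # Eq. (4.2.8)-type criterion
            break
        xb = np.mean(pts[:-1], axis=0)      # centroid excluding x_h, Eq. (4.2.3)
        xr = xb + alpha * (xb - pts[-1])    # reflection, Eq. (4.2.4)
        fr = f(xr)
        if fr < vals[0]:                    # better than x_l: try expansion
            xe = xb + beta * (xr - xb)      # Eq. (4.2.5)
            fe = f(xe)
            if fe < fr:
                pts[-1], vals[-1] = xe, fe
            else:
                pts[-1], vals[-1] = xr, fr
        elif fr < vals[-2]:                 # f_l <= f_r < f_s: accept reflection
            pts[-1], vals[-1] = xr, fr
        else:                               # reflection failed: contract
            if fr < vals[-1]:
                pts[-1], vals[-1] = xr, fr
            xc = xb + gamma * (pts[-1] - xb)   # Eq. (4.2.6)
            fc = f(xc)
            if fc > vals[-1]:               # shrink toward x_l, Eq. (4.2.7)
                pts = [pts[0] + 0.5 * (p - pts[0]) for p in pts]
                vals = [f(p) for p in pts]
            else:
                pts[-1], vals[-1] = xc, fc
    return pts[0], vals[0]
```

Applied to the cantilever beam function of Example 4.2.1 below, this sketch converges to the neighborhood of x* = (−1/3, −1/2) without ever using a derivative.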
Powell's Conjugate Directions Method and its Subsequent Modification. Although most problems have functions which are not quadratic, many unconstrained minimization algorithms are developed to minimize a quadratic function. This is because a function can be approximated well by a quadratic function near a minimum. Powell's conjugate directions algorithm is a typical example. A quadratic function in Rⁿ may be written as

f(x) = \frac{1}{2} x^T Q x + b^T x + c .   (4.2.12)

Furthermore, it can be shown that if the function f is minimized once along each direction of a set s of linearly independent Q-conjugate directions, then the minimum
of f will be located at or before the nth step regardless of the starting point provided
that no round-off errors are accumulated. This property is commonly referred to as
the quadratic termination property. Powell provided a convenient method for gen-
erating such conjugate directions by a suitable combination of the simple univariate
search and a pattern search technique [3]. However, in certain cases Powell's algo-
rithm generates directions which are linearly dependent and thereby fails to converge
to the minimum. Hence, Powell modified his algorithm to make it robust but at the
expense of its quadratic termination property.
1. Starting from the point x₀^k, perform a cycle of univariate minimizations of f along n linearly independent directions s₁^k, …, s_n^k (initially the coordinate directions), producing the points x₁^k, …, x_n^k.

2. After completing the univariate cycle, find the index m corresponding to the direction of the univariate search which yields the largest function decrease, Δf_m^k, in going from x_{m−1}^k to x_m^k.

3. Define the pattern direction

s_p^k = x_n^k - x_0^k ,   (4.2.13)

and determine the step length α^k that minimizes f(x₀^k + α s_p^k); the minimizing point serves as the starting point x₀^{k+1} of the next cycle.

4. If

|\alpha^k| < \left[ \frac{f(x_0^k) - f(x_0^{k+1})}{\Delta f_m^k} \right]^{1/2} ,   (4.2.14)

then use the same old directions again for the next univariate cycle (that is, do not discard any of the directions of the previous cycle in preference to the pattern direction s_p^k). If Eq. (4.2.14) is not satisfied, then replace the mth direction by the pattern direction s_p^k.
5. Begin the next univariate cycle with the directions decided in step 4, and repeat steps 2 through 4 until convergence to a specified accuracy. Convergence is assumed to be achieved when the Euclidean norm ‖x^{k−1} − x^k‖ is less than a prespecified quantity ε.
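The steps above can be sketched as follows; the golden-section inner line search, its search span, and the exact form of the step-4 test (inferred from Example 4.2.1 below) are assumptions of this sketch:

```python
import numpy as np

def line_min(f, x, s, span=10.0, tol=1e-10):
    """Crude golden-section search for the a minimizing f(x + a*s)."""
    phi = lambda a: f(x + a * s)
    lo, hi = -span, span
    r = 0.381966
    a1, a2 = lo + r * (hi - lo), hi - r * (hi - lo)
    f1, f2 = phi(a1), phi(a2)
    while hi - lo > tol:
        if f1 > f2:
            lo, a1, f1 = a1, a2, f2
            a2 = hi - r * (hi - lo)
            f2 = phi(a2)
        else:
            hi, a2, f2 = a2, a1, f1
            a1 = lo + r * (hi - lo)
            f1 = phi(a1)
    return 0.5 * (lo + hi)

def powell(f, x0, n_cycles=50, eps=1e-9):
    """Sketch of Powell's conjugate directions method (steps 1-5)."""
    x0 = np.asarray(x0, dtype=float)
    n = len(x0)
    dirs = [np.eye(n)[i].copy() for i in range(n)]
    for _ in range(n_cycles):
        x, f_start, drops = x0.copy(), f(x0), []
        for s in dirs:                      # step 1: univariate cycle
            a = line_min(f, x, s)
            x_new = x + a * s
            drops.append(f(x) - f(x_new))   # record per-direction decrease
            x = x_new
        sp = x - x0                         # step 3: pattern direction
        if np.linalg.norm(sp) < eps:
            break
        a = line_min(f, x0, sp)
        x_next = x0 + a * sp
        m = int(np.argmax(drops))           # step 2: largest decrease
        if drops[m] > 0.0:                  # step 4: keep or swap directions
            crit = np.sqrt(max(f_start - f(x_next), 0.0) / drops[m])
            if abs(a) >= crit:
                dirs[m] = sp / np.linalg.norm(sp)
        if np.linalg.norm(x_next - x0) < eps:
            x0 = x_next
            break
        x0 = x_next                         # step 5: next cycle
    return x0
```

On the beam function of Example 4.2.1 this sketch reproduces the convergence to x* = (−1/3, −1/2) using only function values.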
Example 4.2.1

Figure 4.2.2 Tip loaded cantilever beam and its finite element model.

For the tip loaded cantilever beam of Figure 4.2.2, modeled by a single beam finite element, the transverse displacement is approximated as

v(\xi) = \left[ (1 - 3\xi^2 + 2\xi^3) \;\; l(\xi - 2\xi^2 + \xi^3) \;\; (3\xi^2 - 2\xi^3) \;\; l(-\xi^2 + \xi^3) \right] (v_1,\; \theta_1,\; v_2,\; \theta_2)^T ,   (4.2.15)
where ξ = x/l. The corresponding potential energy of the beam model is given by

\Pi = \frac{1}{2} \int_0^l EI \left( \frac{d^2 v}{dx^2} \right)^2 dx + P\, v(l) .   (4.2.16)

Because of the cantilever end condition at ξ = 0, the first two degrees of freedom in Eq. (4.2.15) are zero. Therefore, substituting Eq. (4.2.15) into Eq. (4.2.16) we obtain

\Pi = \frac{EI}{l^3} \left( 6 v_2^2 - 6 v_2 \theta_2 l + 2 \theta_2^2 l^2 \right) + P v_2 .   (4.2.17)

Defining f = 2Πl³/EI, x₁ = v₂, x₂ = θ₂l, and choosing Pl³/EI = 1, the problem of determining the tip deflection and rotation of the beam reduces to the unconstrained minimization of

f = 12 x_1^2 + 4 x_2^2 - 12 x_1 x_2 + 2 x_1 .   (4.2.18)

Starting with the initial point x₀ᵀ = (−1, −2) and f(x₀) = 2, we will minimize f using Powell's conjugate directions method. The exact solution of this problem is at x*ᵀ = (−1/3, −1/2).
Since we have an explicit relation for the objective function f, the one dimensional minimizations along a given direction will be performed exactly, without resorting to any of the numerical techniques discussed in the previous section. However, if these minimizations were done numerically, one of the zeroth order techniques would be sufficient. We use superscripts to denote the univariate cycle number and subscripts to denote the iteration number within a cycle.

First, we perform the univariate search along the x₁ and x₂ directions. Choosing s₁¹ = (1, 0)ᵀ we have

x_1^1 = (-1,\; -2)^T + \alpha\, (1,\; 0)^T = (-1 + \alpha,\; -2)^T ,   (4.2.19)

and

f(\alpha) = 12(-1 + \alpha)^2 + 4(-2)^2 - 12(-1 + \alpha)(-2) + 2(-1 + \alpha) .   (4.2.20)

Taking the derivative of Eq. (4.2.20) with respect to α, we obtain the value of α which minimizes f to be α = −1/12. Hence,

x₁¹ = (−13/12, −2)ᵀ and f(x₁¹) = 1.916666667 .
Choosing s₂¹ = (0, 1)ᵀ, we obtain

x_2^1 = (-13/12,\; -2)^T + \alpha\, (0,\; 1)^T = (-13/12,\; -2 + \alpha)^T ,   (4.2.21)

and

f(\alpha) = 12(-13/12)^2 + 4(-2 + \alpha)^2 - 12(-13/12)(-2 + \alpha) + 2(-13/12) .   (4.2.22)

Minimizing Eq. (4.2.22) with respect to α yields α = 3/8, so that

x_2^1 = (-13/12,\; -13/8)^T \quad \text{and} \quad f(x_2^1) = 1.354166667 .   (4.2.23)

The pattern direction is s_p¹ = x₂¹ − x₀¹ = (−1/12, 3/8)ᵀ, and minimizing along it from x₀¹ we have

x_0^2 = (-1,\; -2)^T + \alpha\, (-1/12,\; 3/8)^T = (-1 - \alpha/12,\; -2 + 3\alpha/8)^T ,   (4.2.24)
which attains its minimum value for α = 40/49 at

x₀² = (−157/147, −83/49)ᵀ and f(x₀²) = 1.319727891 .
The direction that corresponds to the largest decrease in the objective function f during the first cycle of the univariate search is associated with the second variable. We can now decide whether we want to replace the second (m = 2) univariate search direction by the pattern direction or not by checking the condition stated in step 4 of the algorithm, Eq. (4.2.14). That is, Powell's criterion

|\alpha| = \frac{40}{49} < \left[ \frac{2 - 1.319727891}{1.916666667 - 1.354166667} \right]^{1/2}   (4.2.25)

is satisfied; therefore, we retain the old univariate search directions for the second cycle and restart the procedure by going back to step 2 of the algorithm. The results of the second cycle are tabulated in Table 4.2.1.
Table 4.2.1. Solution of the beam problem using Powell's conjugate directions method

Cycle No.   x₁          x₂          f
0           −1.0        −2.0        2.0
1           −1.083334   −2.0        1.916667
1           −1.083334   −1.625      1.354167
2           −0.895834   −1.625      0.9322967
2           −0.895834   −1.34375    0.6158854
2           −0.33334    −0.499999   −0.333333
the step length can be determined directly by substituting Eq. (4.2.30) into Eq. (4.2.31) for the (k + 1)st iteration, followed by a minimization of f with respect to α, which yields

\alpha_{k+1} = \frac{\nabla f_k^T \nabla f_k}{\nabla f_k^T Q \nabla f_k} .   (4.2.32)
In obtaining Eq. (4.2.32) we assume that the Hessian matrix Q of the quadratic form is available explicitly, and we make use of the symmetry of Q.

The performance of the steepest descent method depends on the condition number of the Hessian matrix Q. The condition number of a matrix is the ratio of its largest to its smallest eigenvalue. A large condition number implies that the contours of the function to be minimized form an elongated design space, and therefore the progress made by the steepest descent method is very slow, proceeding in a zigzag pattern known as hemstitching. This is true even for quadratic functions, and can be improved by re-scaling the variables.
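For a quadratic with Hessian Q, the exact step length along the steepest descent direction is the standard expression α = gᵀg/(gᵀQg); a minimal sketch:

```python
import numpy as np

def steepest_descent(grad_f, Q, x0, n_iter=200):
    """Steepest descent with the exact step length for a quadratic
    with Hessian Q: alpha = (g.g)/(g.Q.g) at each iteration."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad_f(x)
        gQg = g @ Q @ g
        if gQg <= 1e-30:            # gradient essentially zero
            break
        x = x - ((g @ g) / gQg) * g
    return x
```

On the beam function, whose Hessian has a condition number near 19, the iterates zigzag toward (−1/3, −1/2) at the slow linear rate described above; after the scaling of Eq. (4.2.33) the same routine would stop in one step.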
Example 4.2.2
The cantilever problem discussed in the previous example illustrates this behavior most vividly. The steepest descent method when applied to this problem may exhibit the typical hemstitching phenomenon, as shown in Figure 4.2.3 for certain initial starting points. However, a simple transformation of variables to improve the scaling of the variables causes the steepest descent method to converge to the minimum in a single step. For example, consider the following transformation

y_1 = 2\sqrt{3}\, \left( x_1 - \frac{x_2}{2} \right), \qquad y_2 = x_2 .   (4.2.33)

The function f may now be expressed in terms of the new variables y₁ and y₂ as

f = y_1^2 + y_2^2 + \frac{1}{\sqrt{3}}\, y_1 + y_2 .   (4.2.34)
As a result of the scaling and the elimination of the cross-product term, the condition number of the Hessian of f is unity. Contours of the function f in the y₁ - y₂ plane will appear as circles. Beginning with any arbitrary starting point y₀ and applying the steepest descent method we have

y_1 = \left( -\frac{\sqrt{3}}{6},\; -\frac{1}{2} \right)^T ,

at which the gradient of f is zero, implying that it is a minimum point. The corresponding values of the original variables x₁* and x₂* are −1/3 and −1/2, respectively. This simple demonstration clearly shows the effectiveness of scaling in the convergence of the steepest descent algorithm to the minimum of a function in Rⁿ. It can be shown [6] that the steepest descent method has only a linear rate of convergence in the absence of an appropriate scaling. •••
Unfortunately, in most multivariable function minimizations it is not easy to determine the appropriate scaling transformation that leads to a one step convergence to the minimum of a general quadratic form in Rⁿ using the steepest descent algorithm. This would require calculating the Hessian matrix and then performing an expensive eigenvalue analysis of the matrix. Hence, we are forced to look at other alternatives for rapid convergence to the minimum of a quadratic form. One such alternative is provided by minimizing along a set of conjugate gradient directions, which guarantees the quadratic termination property. Hestenes and Stiefel [12] and later Fletcher and Reeves [13] offered such an algorithm, which will be covered next.
Fletcher-Reeves' Conjugate Gradient Algorithm. This algorithm begins from an initial point x₀ by first minimizing f along the steepest descent direction, s₀ = −∇f(x₀) = g₀, to obtain the new iterate x₁. The direction s₁ for the next iteration must be constructed so that it is Q-conjugate to s₀, where Q is the Hessian of the quadratic f. The function is then minimized along s₁ to yield the next iterate x₂. The next direction s₂ from x₂ is constructed to be Q-conjugate to the previous directions s₀ and s₁, and the process is continued until convergence to the minimum is achieved. By virtue of Powell's theorem on conjugate directions for quadratic functions, convergence to the minimum is theoretically guaranteed at the end of the minimization of the function f along the conjugate direction s_{n−1}. For functions which are not quadratic, conjugacy of the directions s_i, i = 1, …, n loses its meaning, since the Hessian of the function is not a matrix of constants. However, it is common practice to use this algorithm for non-quadratic functions. Since, for such functions, convergence to the minimum will rarely be achieved in n steps or less, the algorithm is restarted after every n steps. The basic steps of the algorithm at the (k + 1)st iteration are as follows.
1. Calculate x_{k+1} = x_k + α_{k+1}s_k, where α_{k+1} is determined such that

\frac{df(\alpha_{k+1})}{d\alpha_{k+1}} = 0 .   (4.2.36)

2. Compute the new conjugate direction

s_{k+1} = -\nabla f(x_{k+1}) + \beta_{k+1} s_k , \quad \text{with} \quad \beta_{k+1} = \frac{\nabla f(x_{k+1})^T \nabla f(x_{k+1})}{\nabla f(x_k)^T \nabla f(x_k)} .   (4.2.37)
We will show the effectiveness of this method on the cantilever beam problem, for which we minimize

f = 12 x_1^2 + 4 x_2^2 - 12 x_1 x_2 + 2 x_1 ,

starting with the initial design point x₀ᵀ = (−1, −2). The initial move direction is calculated from the gradient,

s_0 = -\nabla f(x_0) = (-2,\; 4)^T ,

and at the end of the first step we have x₁ = (−1.0961, −1.8077)ᵀ. Then

\beta_1 = \frac{(-2.6154)^2 + (-1.3077)^2}{(-2)^2 + (4)^2} = 0.4275 ,

s_1 = -(-2.6154,\; -1.3077)^T + 0.4275\, (-2,\; 4)^T = (1.76036,\; 3.0178)^T ,

and

x_2 = (-1.0961,\; -1.8077)^T + \alpha_2\, (1.76036,\; 3.0178)^T .
Again setting df(α₂)/dα₂ = 0, we obtain α₂ = 0.4334, so that

x_2 = (-0.3334,\; -0.50)^T .

Finally, since

s_0^T Q\, s_1 = (-2,\; 4) \begin{bmatrix} 24 & -12 \\ -12 & 8 \end{bmatrix} \begin{Bmatrix} 1.76036 \\ 3.0178 \end{Bmatrix} \approx 0 ,

we have verified the Q-conjugacy of the two directions s₀ and s₁. The progress of the minimization using this method is illustrated in Figure 4.2.3. •••
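The iteration just worked out can be sketched for a quadratic with known Hessian Q, using the exact step length α = −gᵀs/(sᵀQs) in place of a numerical line search:

```python
import numpy as np

def fletcher_reeves(grad_f, Q, x0, tol=1e-10, max_restarts=20):
    """Fletcher-Reeves conjugate gradient for a quadratic with Hessian Q,
    restarted from steepest descent after every n inner steps."""
    x = np.asarray(x0, dtype=float)
    n = len(x)
    for _ in range(max_restarts):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            return x
        s = -g                               # restart: steepest descent
        for _ in range(n):
            alpha = -(g @ s) / (s @ Q @ s)   # exact line search, quadratic f
            x = x + alpha * s
            g_new = grad_f(x)
            if np.linalg.norm(g_new) < tol:
                return x
            beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves beta
            s = -g_new + beta * s             # new Q-conjugate direction
            g = g_new
    return x
```

For the two-variable beam function the loop terminates at the minimum in two inner steps, consistent with the quadratic termination property.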
Beale's Restarted Conjugate Gradient Technique. In minimizing non-quadratic functions using the conjugate gradient method, restarting the method after every n steps is not always a good strategy. Such a strategy is insensitive to the nonlinear character of the function being minimized. Beale [14] and later Powell [15] have proposed restart techniques that take the nonlinearity of the function into account in deciding when to restart the algorithm. Numerical experiments with the minimization of several general functions have led to the following algorithm by Powell [15].

1. Given x₀, define s₀ to be the steepest descent direction,

s_0 = -\nabla f(x_0) = -g_0 ,

let k = t = 0, and begin iterations by incrementing k.

2. For k ≥ 1 the direction s_k is defined by Beale's formula [14]

s_k = -g_k + \beta_k s_{k-1} + \gamma_k s_t , \quad \text{with} \quad g_k = \nabla f(x_k) ,   (4.2.38)

where

\beta_k = \frac{g_k^T [g_k - g_{k-1}]}{s_{k-1}^T [g_k - g_{k-1}]} ,   (4.2.39)

and

\gamma_k = \frac{g_k^T [g_{t+1} - g_t]}{s_t^T [g_{t+1} - g_t]} , \quad \text{if } k > t + 1,   (4.2.40)

\gamma_k = 0 , \quad \text{if } k = t + 1 .   (4.2.41)
Newton's Method. The oldest second order method for minimizing a nonlinear multivariable function in Rⁿ is Newton's method. The motivation behind Newton's method is identical to that of the steepest descent method. In arriving at the steepest descent direction, s, we minimized the directional derivative, Eq. (4.2.27), subject to the condition that the Euclidean norm of s was unity, Eq. (4.2.28). The Euclidean norm, however, does not take the curvature of the surface into account. Hence, it motivates the definition of a different norm, or metric, of the surface. Thus, we pose the problem as finding the direction s that minimizes

s^T \nabla f \quad \text{subject to} \quad s^T Q s = 1 ,   (4.2.44)

whose solution lies along

s = -Q^{-1} \nabla f ,   (4.2.46)
where Q is the Hessian of the objective function. The general form of the update equation of Newton's method for minimizing a function in Rⁿ is given by

x_{k+1} = x_k - Q_k^{-1} \nabla f(x_k) .   (4.2.47)

For the quadratic function of Eq. (4.2.12) this update yields

x_1 = x_0 - Q^{-1}(Q x_0 + b) = -Q^{-1} b ,   (4.2.48)

that is, the minimum is reached in a single iteration regardless of the initial point x₀.
Newton's method can also be shown to have a quadratic rate of convergence (see, for example, [4] or [8]), but the serious disadvantages of the method are the need to evaluate the Hessian Q and then solve the system of equations

Q s = -\nabla f ,   (4.2.49)

to obtain the direction vector s. For every iteration (if Q is non-sparse), Newton's method involves the calculation of the n(n + 1)/2 elements of the symmetric Q matrix, and of the order of n³ operations for obtaining s from the solution of Eq. (4.2.49). It is this feature of Newton's method that has led to the development of methods known as quasi-Newton or variable-metric methods, which seek to use the gradient information to construct approximations of the Hessian matrix or its inverse.
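Following the cost argument above, one Newton iteration obtains s from Eq. (4.2.49) by a linear solve rather than by forming Q⁻¹ explicitly; a one-step sketch on the quadratic beam function of Example 4.2.1:

```python
import numpy as np

def newton_step(grad_f, hess_f, x):
    """One Newton iteration: solve Q s = -grad f for the direction s
    (Eq. 4.2.49) instead of inverting the Hessian, then take a full step."""
    x = np.asarray(x, dtype=float)
    s = np.linalg.solve(hess_f(x), -grad_f(x))
    return x + s
```

Because the beam function is quadratic, a single step from x₀ = (−1, −2) lands exactly on the minimum x* = (−1/3, −1/2), as predicted by Eq. (4.2.48).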
Quasi-Newton or Variable Metric Algorithms. Consider the Taylor series expansion of the gradient of f around x_{k+1},

\nabla f(x) \approx \nabla f(x_{k+1}) + Q (x - x_{k+1}) ,   (4.2.50)

where Q is the actual Hessian of the function f. Assuming A_k (A_k ≡ A(x_k)) to be an approximation to the Hessian at the kth iteration, we may write Eq. (4.2.50) in a more compact form as

A_{k+1}\, p_k = y_k ,   (4.2.51)

where

p_k = x_{k+1} - x_k \quad \text{and} \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k) .   (4.2.52)
Similarly, the solution of Eq. (4.2.51) for p_k can be written as

B_{k+1}\, y_k = p_k ,   (4.2.53)

with B_{k+1} being an approximate inverse of the Hessian Q. If B_{k+1} is to behave eventually as Q⁻¹, then B_{k+1}A_{k+1} = I. Equation (4.2.53) is known as the quasi-Newton or the secant relation. The basis for all variable-metric or quasi-Newton methods is that the formulae which update the matrix A_k or its inverse B_k must satisfy Eq. (4.2.53) and, in addition, maintain the symmetry and positive definiteness properties. In other words, if A_k or B_k is positive definite, then A_{k+1} or B_{k+1} must remain so.
A typical variable-metric algorithm with an inverse Hessian update may be stated as

x_{k+1} = x_k + \alpha_{k+1} s_k ,   (4.2.54)

where

s_k = -B_k \nabla f(x_k) ,   (4.2.55)

with B_k being a positive definite symmetric matrix.
Rank-One Updates. In the class of rank-one updates we have the well-known symmetric Broyden update [19] for B_{k+1}, given as

B_{k+1} = \left[ B_k - \frac{B_k y_k y_k^T B_k}{y_k^T B_k y_k} + \theta_k v_k v_k^T \right] \rho_k + \frac{p_k p_k^T}{p_k^T y_k} ,   (4.2.57)

where

v_k = \left( y_k^T B_k y_k \right)^{1/2} \left[ \frac{p_k}{p_k^T y_k} - \frac{B_k y_k}{y_k^T B_k y_k} \right] ,   (4.2.58)

and θ_k and ρ_k are scalar parameters that are chosen appropriately. Updates given by Eqs. (4.2.57) and (4.2.58) are subsets of Huang's family of updates [20], which guarantee that B_{k+1}y_k = p_k for all choices of θ_k and ρ_k. If we set θ_k = 0 and ρ_k = 1 for all k, we obtain the Davidon-Fletcher-Powell (DFP) update formula [21, 22], which is given as

B_{k+1} = B_k - \frac{B_k y_k y_k^T B_k}{y_k^T B_k y_k} + \frac{p_k p_k^T}{p_k^T y_k} .   (4.2.59)
The DFP update formula preserves the positive definiteness and symmetry of the matrices B_k, and has some other interesting properties as well. When used for minimizing quadratic functions, it generates Q-conjugate directions and, therefore, at the nth iteration B_n becomes the exact inverse of the Hessian Q. Thus, it has the features of the conjugate gradient as well as the Newton-type algorithms. The DFP algorithm can be used without an exact line search in determining α_{k+1} in Eq. (4.2.54). However, the step length must guarantee a reduction in the function value, and must be such that p_k^T y_k > 0 in order to maintain positive definiteness of B_k. The performance of the algorithm, however, was shown to deteriorate as the accuracy of the line search decreases [20]. In most cases the DFP formula works quite successfully. In a few cases the algorithm has been known to break down because B_k became singular.
This has led to the introduction of another update formula developed simultaneously
Chapter 4: Unconstrained Optimization
by Broyden [19], Fletcher [23], Goldfarb [24], and Shanno [25] and known as the BFGS formula. This formula can be obtained by putting θ_k = 1 and ρ_k = 1 in Eq. (4.2.57), which reduces to
B_{k+1} = B_k + [ 1 + (y_k^T B_k y_k)/(p_k^T y_k) ] (p_k p_k^T)/(p_k^T y_k) - (p_k y_k^T B_k)/(p_k^T y_k) - (B_k y_k p_k^T)/(p_k^T y_k).   (4.2.60)
Equation (4.2.60) can also be written in the more compact manner

B_{k+1} = [ I - (p_k y_k^T)/(p_k^T y_k) ] B_k [ I - (y_k p_k^T)/(p_k^T y_k) ] + (p_k p_k^T)/(p_k^T y_k).   (4.2.61)

Using A_{k+1} = B_{k+1}^{-1} and A_k = B_k^{-1} we can invert the above formula to arrive at an update for the Hessian approximations. It is found that this update formula reduces to
A_{k+1} = A_k - (A_k p_k p_k^T A_k)/(p_k^T A_k p_k) + (y_k y_k^T)/(y_k^T p_k),   (4.2.62)
which is the analog of the DFP formula (4.2.59) with B_k replaced by A_k, and p_k and y_k interchanged. Conversely, if the inverse Hessian B_k is updated by the DFP formula then the Hessian A_k is updated according to an analog of the BFGS formula. It is for this reason that the BFGS formula is often called the complementary DFP formula. Numerical experiments with the BFGS algorithm [26] suggest that it is superior to all known variable-metric algorithms. We will illustrate its use by minimizing the potential energy function of the cantilever beam problem.
Example 4.2.4
Minimize f(x_1, x_2) = 12x_1^2 + 4x_2^2 - 12x_1x_2 + 2x_1 by using the BFGS update algorithm with exact line searches, starting with the initial guess x_0^T = (-1, -2).

We initiate the algorithm with a line search along the steepest descent direction. This is associated with the assumption that B_0 = I, which is symmetric and positive definite. The resulting point was calculated previously in Example 4.2.3 to be

x_1 = {-1.0961, -1.8077}^T, and ∇f(x_1) = {-2.6154, -1.3077}^T.

We then form

p_0 = {-1.0961, -1.8077}^T - {-1, -2}^T = {-0.0961, 0.1923}^T,
y_0 = {-2.6154, -1.3077}^T - {2, -4}^T = {-4.6154, 2.6923}^T,

with p_0^T y_0 = 0.96127, and update B_0 from the BFGS formula (4.2.60), written in its compact product form:

B_1 = ( [1 0; 0 1] - (1/0.96127) [0.44354 -0.25873; -0.88754 0.51773] )
    × ( [1 0; 0 1] - (1/0.96127) [0.44354 -0.88754; -0.25873 0.51773] )
    + (1/0.96127) [0.00923 -0.01848; -0.01848 0.03698]
    = [0.37213 0.60225; 0.60225 1.10385].

Next, we calculate the new move direction from Eq. (4.2.55),

s_1 = -B_1 ∇f(x_1) = {1.7608, 3.0186}^T,

and obtain

x_2 = {-1.0961, -1.8077}^T + α_2 {1.7608, 3.0186}^T.

Setting the derivative of f(x_2) with respect to α_2 to 0 yields the value α_2 = 0.4332055, and

x_2 = {-0.3333, -0.5000}^T, with ∇f(x_2) = {0, 0}^T.

This implies convergence to the exact solution. It is left to the reader to verify that if B_1 is updated once more we obtain

B_2 = [0.1667 0.25; 0.25 0.5],

which is the exact inverse of the Hessian Q of f. It can also be verified that, as expected, the directions s_0 and s_1 are Q-conjugate.
•••
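The arithmetic of this example is easy to check numerically. The sketch below (ours, using NumPy; the function, its gradient, and the exact line-search formula for a quadratic are taken from the example, while the helper names are illustrative) applies the product form of the BFGS inverse-Hessian update for two iterations:

```python
import numpy as np

def f(x):
    return 12*x[0]**2 + 4*x[1]**2 - 12*x[0]*x[1] + 2*x[0]

def grad(x):
    return np.array([24*x[0] - 12*x[1] + 2.0, 8*x[1] - 12*x[0]])

Q = np.array([[24.0, -12.0], [-12.0, 8.0]])   # Hessian of f

def bfgs_update(B, p, y):
    # product form of the BFGS update for the inverse Hessian approximation
    I = np.eye(len(p))
    r = 1.0 / (p @ y)
    return (I - r*np.outer(p, y)) @ B @ (I - r*np.outer(y, p)) + r*np.outer(p, p)

x, B = np.array([-1.0, -2.0]), np.eye(2)
for _ in range(2):
    s = -B @ grad(x)
    alpha = -(grad(x) @ s) / (s @ Q @ s)   # exact line search for a quadratic
    x_new = x + alpha*s
    B = bfgs_update(B, x_new - x, grad(x_new) - grad(x))
    x = x_new
# after two iterations x is the exact minimizer (-1/3, -1/2) and B equals Q^-1
```

Because the function is quadratic in two variables and the line searches are exact, the second updated matrix reproduces the exact inverse Hessian, as claimed in the example.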
Q-conjugacy of the directions of travel has meaning only for quadratic functions, and is guaranteed for such problems in the case of variable-metric algorithms belonging to Huang's family only if the line searches are exact. In fact, Q-conjugacy of the directions is not necessary for ensuring a quadratic termination property [26]. This realization has led to the development of methods based on the DFP and BFGS formulae that abandon the computationally expensive exact line searches. The line searches must be such that they guarantee positive definiteness of the A_k or B_k matrices while reducing the function value appropriately. Positive definiteness is
guaranteed as long as p_k^T y_k > 0. To ensure a wide radius of convergence for a quasi-Newton method, it is also necessary to satisfy the following two criteria. First, a sufficiently large decrease in the function f must be achieved for the step taken and, second, the rate of decrease of f in the direction s_k at x_{k+1} must be smaller than the rate of decrease of f at x_k [26]. In view of these observations, most algorithms with inexact line searches require the satisfaction of the following two conditions

f(x_{k+1}) ≤ f(x_k) + ε_1 α_{k+1} s_k^T ∇f(x_k),   (4.2.63)

and

s_k^T ∇f(x_{k+1}) ≥ ε_2 s_k^T ∇f(x_k),   (4.2.64)

where 0 < ε_1 < ε_2 < 1.
The convergence of the BFGS algorithm under these conditions has been studied by
Powell [27]. Similar convergence studies with Beale's restarted conjugate gradient
method under the same two conditions have been carried out by Shanno [28].
where the Hessian of f and the Jacobian of g are the same. In cases where the
problems are posed directly as
g(x) = 0, ( 4.2.66)
Dennis and Schnabel [6] and others solve Eq. (4.2.66) by minimizing the nonlinear least squares function

f(x) = ½ g^T(x) g(x).   (4.2.67)
In this case, however, the Hessian of f and the Jacobian of g are not identical but a
positive definite approximation to the Hessian of f appropriate for most minimiza-
tion schemes can be easily generated from the Jacobian of g [6]. Minimization of f
then permits the determination of not only stable but also unstable equilibrium con-
figurations provided the minimization does not converge to a local minimum. In the
case of convergence to a local minimum, certain restart [6] or deflation and tunnelling
techniques [29, 30] can be invoked to force convergence to the global minimum of f at which ||g|| = 0.
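A minimal sketch of this approach (ours, not the implementation of Ref. 6, which adds line searches and trust-region safeguards): for f = ½ g^T g the gradient is J^T g, and J^T J built from the Jacobian J of g serves as the positive semi-definite Hessian approximation, giving the Gauss-Newton iteration.

```python
import numpy as np

def gauss_newton(g, jac, x0, iters=30):
    """Minimize f = 0.5*||g(x)||^2 using the Jacobian-based Hessian
    approximation J^T J; the gradient of f is J^T g."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        J, r = jac(x), g(x)
        # small diagonal shift keeps J^T J invertible if J is rank deficient
        step = np.linalg.solve(J.T @ J + 1e-12*np.eye(x.size), -J.T @ r)
        x = x + step
    return x

# illustrative system g(x) = 0: intersection of a circle and a line
g = lambda x: np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])
jac = lambda x: np.array([[2*x[0], 2*x[1]], [1.0, -1.0]])
root = gauss_newton(g, jac, [1.0, 0.5])
```

At a solution of g(x) = 0 the residual vanishes, so the Gauss-Newton Hessian approximation becomes exact there and convergence is fast near the root.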
4.3 Specialized Quasi-Newton Methods
The rank-one and rank-two updates that we discussed in the previous section yield
updates which are symmetric but not necessarily sparse. In other words the Hessian
or Hessian inverse updates lead to symmetric matrices which are fully populated. In
most structural analysis problems using the finite element method it is well known
that the Hessian of the potential energy (the tangent stiffness matrix) is sparse. This
may also be true of many structural optimization problems. For such sparse systems the solution phase for finite element models exploits the triple factorization LDL^T.
Thus the Hessian or the Hessian inverse updates discussed previously are not ap-
propriate for solving large-scale structural analysis problems which involve sparse
Hessians.
In applying the BFGS method for solving large-scale nonlinear problems of struc-
tural analysis Matthies and Strang [31] have proposed an alternate implementation
of the method suitable for handling large sparse problems by storing the vectors
p_k = x_{k+1} - x_k,   (4.3.1)

and

y_k = ∇f(x_{k+1}) - ∇f(x_k),   (4.3.2)
and reintroducing them to compute the new search directions. After a sequence of
five to ten iterations during which the BFGS updates are used, the stiffness matrix
is recomputed and the update information is deleted.
Sparse updates for solving large-scale problems were perhaps first proposed by
Schubert [32], who proposed a modification of Broyden's method [33] according to
which the ith row of the Hessian Ak+l is updated by using
(4.3.3)
Curtis, Powell and Reid [36], and Powell and Toint [37] have proposed finite
difference strategies for the direct evaluation of sparse Hessians of functions. In
addition to using the finite difference operations, they used concepts from graph
theory that minimize the number of gradient evaluations required for computing the
few non-zero entries of a sparse Hessian. By using these strategies, we can exploit
the sparsity not only in the computation of the Newton direction but also in the
formation of Hessians [38, 39].
The Curtis-Powell-Reid (CPR) strategy exploits sparsity, but not the symmetry
of the Hessian. It divides the columns of the Hessian into groups, so that in each
group the row numbers of the unknown elements of the column vectors are all dif-
ferent. After the formation of the first group, other groups are formed successively
by applying the same strategy to columns not included in the previous groups. The
number of such groups for sparse or banded matrices is usually very small by compar-
ison with n. To evaluate the Hessian of f at x_0 we first evaluate the gradient of f at x_0. After this initial gradient evaluation, only as many additional gradient evaluations as the number of groups are needed to evaluate all the non-zero elements of the Hessian using a forward difference approximation. Thus

a_ij = ∂g_i/∂x_j = [ g_i(x_0 + h_j e_j) - g_i(x_0) ] / h_j,   (4.3.4)

where e_j is the jth coordinate vector and h_j is a suitable step size. Each step size may be adjusted such that the greatest ratio of the round-off to truncation error for any column of the Hessian falls within a specified range. However, such an adjustment of step sizes would require a significantly larger number of gradient evaluations. Hence, to economize on the number of gradient evaluations, the step sizes are not allowed to leave the range

(4.3.5)

where ε is the greatest relative round-off in a single operation, η is the relative machine precision, and h_uj is an upper bound on h_j [36].
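The grouping idea can be sketched as follows (our illustrative implementation, not the code of Ref. 36; the sparsity pattern of the Hessian is assumed known in advance, and the greedy pass below is one simple way to form the groups):

```python
import numpy as np

def cpr_groups(pattern):
    """Greedy CPR grouping: columns whose sets of non-zero rows are
    mutually disjoint can share one gradient evaluation."""
    groups, unassigned = [], list(range(pattern.shape[1]))
    while unassigned:
        used_rows, group, rest = set(), [], []
        for j in unassigned:
            rows = set(np.flatnonzero(pattern[:, j]))
            if rows & used_rows:
                rest.append(j)          # conflicts with this group
            else:
                group.append(j)
                used_rows |= rows
        groups.append(group)
        unassigned = rest
    return groups

def sparse_hessian(grad, x0, pattern, h=1e-6):
    """Forward-difference estimate of a sparse Hessian, Eq. (4.3.4),
    using one extra gradient evaluation per column group."""
    n = len(x0)
    A = np.zeros((n, n))
    g0 = grad(x0)
    for group in cpr_groups(pattern):
        step = np.zeros(n)
        step[group] = h                 # perturb all columns of the group at once
        dg = grad(x0 + step) - g0
        for j in group:
            rows = np.flatnonzero(pattern[:, j])
            A[rows, j] = dg[rows] / h
    return A
```

For a tridiagonal Hessian of any size this strategy needs only three groups, so the full Hessian is recovered from four gradient evaluations regardless of n.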
Powell and Toint [37] extended the CPR strategy to exploit symmetry of the
Hessian. They proposed two methods, one of which is known as the substitution
method. According to this, the CPR strategy is first applied to the lower triangular
part, L, of the symmetric Hessian, A. Because not all the elements of A computed this way will be correct, the incorrect elements are corrected by a back-substitution scheme. Details of this back-substitution scheme may be found in Ref. 37.
The Powell-Toint (PT) strategy of estimating sparse Hessians directly appears
to be a much better alternative to Toint's sparse update algorithm [38]. One major
drawback of Toint's update algorithm is that the updated Hessian approximation is
not guaranteed to remain positive definite even if the initial Hessian approximation
was positive definite.
4.3.2 Coercion of Hessians for Suitability with Quasi-Newton Methods
Newton's method requires the Hessian of the function to be positive definite. If this is not so, then Newton's direction is not guaranteed to be a descent
direction. There are several strategies for coercing an indefinite Hessian to a positive
definite form. Prominent among these strategies is the one proposed by Gill and
Murray [40]. The most impressive feature of this strategy is that the coercion of
the Hessian takes place during its LDL^T decomposition for the computation of the Newton direction. The diagonal elements of the D matrix are forced to be sufficiently positive to avoid numerical difficulties, while the off-diagonal terms of LD^{1/2} are limited by a quantity designed to guarantee positive definiteness of the resulting matrix.
This is equivalent to modifying the original non-positive definite Hessian matrix by
the addition of an appropriate diagonal matrix. Because this matrix modification
is carried out during its LDL^T decomposition, the strategy for the computation of
Newton's descent direction does not entail a great deal of additional computations.
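A much simpler, cruder way to achieve the same effect, shown here only to illustrate the idea of diagonal modification (the Gill-Murray strategy does this work selectively inside the LDL^T factorization itself), is to add a multiple of the identity until a Cholesky factorization succeeds:

```python
import numpy as np

def make_positive_definite(H, beta=1e-3, max_tries=60):
    """Add tau*I to an indefinite Hessian until Cholesky succeeds.
    A crude stand-in for the Gill-Murray modified factorization."""
    tau = 0.0
    for _ in range(max_tries):
        try:
            np.linalg.cholesky(H + tau*np.eye(H.shape[0]))   # fails if indefinite
            return H + tau*np.eye(H.shape[0])
        except np.linalg.LinAlgError:
            tau = beta if tau == 0.0 else 2.0*tau            # grow the shift
    raise ValueError("failed to coerce matrix to positive definite form")
```

The modified matrix then yields a guaranteed descent direction s = -H_mod^{-1} ∇f, at the cost of distorting the Newton step when the shift is large.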
4.3.3 Making Quasi-Newton Methods Globally Convergent
Dealing with the problem of local minima becomes even worse if the design variables are required to take discrete values. First of all, for such problems the design space is discontinuous and disjoint, and therefore derivative information is either useless or not defined. Secondly, the use of discrete values for the design variables introduces multiple minima corresponding to various combinations of the variables, even if the objective function for the problem has a single minimum for continuous variables. A methodical way of dealing with multiple minima for discrete optimization problems is to use either random search techniques that sample the design space for a global minimum or to employ enumerative type algorithms. In either case, the efficiency of the solution process deteriorates dramatically as the number of variables is increased.
4.4 Probabilistic Search Algorithms
the new state might be accepted or rejected based on a random probabilistic decision.
The probability of acceptance, P(ΔE), of a higher energy state is computed as

P(ΔE) = exp( -ΔE / k_B T ),   (4.4.1)

where k_B is Boltzmann's constant. If the temperature of the system is high, then
the probability of acceptance of a higher energy state is close to one. If, on the other
hand, the temperature is close to zero, then the probability of acceptance becomes
very small.
The decision to accept or reject is made by randomly selecting a number in an
interval (0,1) and comparing it with P(ΔE). If the number is less than P(ΔE), then the perturbed state is accepted; if it is greater than P(ΔE), the state is rejected.
At each temperature, a pool of atomic structures would be generated by randomly
perturbing positions until a steady state energy level is reached (commonly referred
to as thermal equilibrium). Then the temperature is reduced to start the iterations
again. These steps are repeated iteratively while reducing the temperature slowly to
achieve the minimal energy state.
The analogy between the simulated annealing and the optimization of functions
with many variables was established recently by Kirkpatrick et al. [49], and Cerny
[50]. By replacing the energy state with an objective function f, and using variables x for the configurations of the particles, we can apply the Metropolis algorithm to optimization problems. The method requires only function values. A move in the design space from one point, x_i, to another, x_j, causes a change in the objective function, Δf_ij. The temperature T now becomes a control parameter that regulates
the convergence of the process. Important elements that affect the performance of
the algorithm are the selection of the initial value of the "temperature", To, and
how to update it. In addition, the number of iterations (or combinations of design variables) needed to achieve "thermal equilibrium" must be decided before T can be reduced. These parameters are collectively referred to as the "cooling schedule".
A flow chart of a typical simulated annealing algorithm is shown in Figure 4.4.1.
The definition of the cooling schedule begins with the selection of the initial temper-
ature. If a low value of To is used, the algorithm would have a low probability of
reaching a global minimum. The initial value of To must be high enough to permit vir-
tually all moves in the design space to be acceptable so that almost a random search
is performed. Typically, To is selected such that the acceptance ratio X (defined as
the ratio of the number of accepted moves to total number of proposed moves) is
approximately Xo = 0.95 [51]. Johnson et al. [52] determined To by calculating the
average increase in the objective function, z;:t+), over a predetermined number of
moves and solved
( 4.4.2)
leading to
(4.4.3)
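The overall procedure can be sketched as follows (an illustrative minimal version: the neighbor-generation rule, the geometric cooling factor, and the fixed move counts stand in for a problem-specific cooling schedule):

```python
import math
import random

def simulated_annealing(f, x0, neighbor, T0, a=0.9, moves_per_T=100,
                        n_temperatures=50, seed=0):
    """Metropolis acceptance with a simple geometric cooling schedule."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, f_best = x, fx
    T = T0
    for _ in range(n_temperatures):
        for _ in range(moves_per_T):
            y = neighbor(x, rng)
            df = f(y) - fx                    # change in the objective
            # accept improvements always, deteriorations with P = exp(-df/T)
            if df <= 0 or rng.random() < math.exp(-df / T):
                x, fx = y, fx + df
                if fx < f_best:
                    best, f_best = x, fx
        T *= a                                # reduce the "temperature"
    return best, f_best

# discrete example: minimize f(x) = x^2 over the integers
best, f_best = simulated_annealing(lambda x: x*x, 30,
                                   lambda x, rng: x + rng.choice((-1, 1)),
                                   T0=10.0)
```

At high temperature nearly every move is accepted and the search is essentially random; as T shrinks, uphill moves become increasingly unlikely and the chain settles into a low-valued region.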
Figure 4.4.1 Flow chart of a typical simulated annealing algorithm.
Once the temperature is set, a number of moves in the variable space is performed by perturbing the design. The number of moves at a given temperature must be large enough to allow the solution to escape from a local minimum. One possibility is to move until the value of the objective function does not change for a specified number, M, of successive iterations. Another possibility, suggested by Aarts [53] for discrete valued design variables, is to make sure that every possible combination of design variables in the neighborhood of a steady state design is visited at least once with a probability P, which leads to

M = S ln( 1/(1 - P) ),   (4.4.4)
where P = 0.99 for S > 100, and P = 0.995 for S < 100. For discrete valued
variables there are often many options for defining the neighborhood of the design.
One possibility is to define it as all the designs that can be obtained by changing one
design variable to its next higher or lower value. A broader immediate neighborhood
can be defined by changing more than one design variable to their next higher or
lower values. For an n variable problem, the immediate neighborhood has
S = 3^n - 1.   (4.4.5)

A commonly used rule for reducing the temperature is

T_{k+1} = a T_k,   (4.4.6)

where 0.5 ≤ a ≤ 0.95. Nahar [54] fixes the number of decrement steps K, and suggests determination of the values of the T_k experimentally. It is also possible to divide the interval [0, T_0] into a fixed number K of steps and use

T_k = ((K - k)/K) T_0,   k = 1, 2, ..., K.   (4.4.7)
Genetic algorithms use techniques derived from biology, and rely on the principle of
Darwin's theory of survival of the fittest. When a population of biological creatures
is allowed to evolve over generations, individual characteristics that are useful for
survival tend to be passed on to the future generations, because individuals carry-
ing them get more chances to breed. Those individual characteristics in biological
populations are stored in chromosomal strings. The mechanics of natural genetics
is based on operations that result in structured yet randomized exchange of genetic
information (i.e., useful traits) between the chromosomal strings of the reproducing
parents, and consists of reproduction, crossover, occasional mutation, and inversion
of the chromosomal strings.
Genetic algorithms, developed by Holland [59], simulate the mechanics of natural
genetics for artificial systems based on operations which are the counterparts of the
natural ones (even called by the same names), and are extensively used as multi-
variable search algorithms. As will be described in the following paragraphs, these
operations involve simple, easy to program, random exchanges of location of num-
bers in a string, and, therefore, at the outset look like a completely random search for an extremum in the parameter space based on function values only. However, genetic algorithms have been demonstrated experimentally to be robust, and the reader is referred to Goldberg [47] for further discussion of the theoretical properties of genetic algorithms. Here we discuss the genetic representation of a minimization problem, and focus on the mechanics of three commonly used genetic operations, namely reproduction, crossover, and mutation.
Application of the operators of the genetic algorithm to a search problem first
requires the representation of the possible combinations of the variables in terms
of bit strings that are counterparts of the chromosomes. Naturally, the measure of
goodness of a specific combination of genes is represented in an artificial system by
the objective function of the search problem. For example, if we have a minimization
problem
minimize f(x), (4.4.8)
a binary string representation of the variable space could be of the form

{ 0110 101 11 1011 },   (4.4.9)

where string equivalents of the individual variables are connected head-to-tail, and, in this example, the base 10 values of the variables are x_1 = 6, x_2 = 5, x_3 = 3, x_4 = 11, and their ranges correspond to {0 ≤ x_1, x_4 ≤ 15}, {0 ≤ x_2 ≤ 7}, and {0 ≤ x_3 ≤ 3}. Because of the bit string representation of the variables, genetic algorithms are
ideally suited for problems where the variables are required to take discrete or integer values. For problems where the design variables are continuous within a range x_i^l ≤ x_i ≤ x_i^u, one may need to use a large number of bits to represent the variables to high accuracy. The number of bits that are needed depends on the accuracy required for the final solution. For example, if a variable is defined in a range {0.01 ≤ x_i ≤ 1.81} and the accuracy needed for the final value is x_incr = 0.001, then the number of binary digits needed for an appropriate representation can be calculated from

2^m ≥ (x_i^u - x_i^l)/x_incr + 1,   (4.4.10)

where m is the number of digits. In this example, the smallest number of digits that satisfies the requirement is m = 11, which actually produces increments of 0.00087 in the value of the variable, instead of the required value of 0.001.
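The bit-count rule and the corresponding decoding can be sketched as follows (illustrative helper names; the decoding maps the 2^m bit patterns onto 2^m - 1 equal increments of the range):

```python
def bits_needed(lo, hi, incr):
    """Smallest m with 2^m >= (hi - lo)/incr + 1."""
    m = 1
    while 2**m < (hi - lo) / incr + 1:
        m += 1
    return m

def decode(bits, lo, hi):
    """Map a bit string back to a real value in [lo, hi]."""
    return lo + int(bits, 2) * (hi - lo) / (2**len(bits) - 1)
```

For the range {0.01, 1.81} with x_incr = 0.001, `bits_needed` returns m = 11, matching the count worked out above.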
Unlike the search algorithms discussed earlier that move from one point to another
in the design variable space, genetic algorithms work with a population of strings
(chromosomes). This aspect of the genetic algorithms is responsible for capturing
near global solutions, by keeping many solution points that may have the potential
of being close to minima (local or global) in the pool during the search process rather
than singling out a point early in the process and running the risk of getting stuck at
a local minimum. Working on a population of designs also suggests the possibility of
implementation on parallel computers. However, the concept of parallelism is even
more basic to genetic algorithms in that evolutionary selection can improve in parallel
many different characteristics of the design. Also, the outcome of a genetic search is
a population of good designs rather than a single design. This aspect can be very
useful to the designer.
Initially the size of the population is chosen and the values of the variables in
each string are decided by randomly assigning O's and 1's to the bits. The next
important step in the process is reproduction, in which individual strings with good
objective function values are copied to form a new population, an artificial version
of the survival of the fittest. The bias towards strings with better performance can
be achieved by increasing the probability of their selection in relation to the rest of
the population. One way to achieve this is to create a biased roulette wheel where
individual strings occupy areas proportional to their function values in relation to
the cumulative function value of the entire population. Therefore, the population
resulting from the reproduction operation would have multiple copies of the highly
fit individuals.
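The biased roulette wheel can be sketched as follows (our illustrative version; here `fitness` is a list of non-negative scores already transformed so that larger is better — for a minimization problem one common choice, not prescribed by the text, is 1/(1 + f)):

```python
import random

def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its share
    of the cumulative fitness of the population."""
    total = sum(fitness)
    r = rng.uniform(0.0, total)
    running = 0.0
    for individual, fit in zip(population, fitness):
        running += fit
        if r <= running:
            return individual
    return population[-1]       # guard against round-off at the wheel's end
```

Repeated selection with this rule produces, on average, multiple copies of the highly fit strings, which is exactly the reproduction bias described above.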
Once the new population is generated, the members are paired off randomly for
crossover. The mating of the pair also involves a random process. A random integer
k between 1 and L - 1, where L is the string length, is selected and two new strings
are generated by exchanging the 0's and 1's that come after the kth location in the
first parent with the corresponding locations of the second parent. For example, if the two strings of length L = 9,

parent 1:  0 1 1 0 1 | 0 1 1 1
parent 2:  0 1 0 0 1 | 0 0 0 1,   (4.4.11)

are mated with a crossover point of k = 5, the offspring will have the following composition:

offspring 1:  0 1 1 0 1 0 0 0 1
offspring 2:  0 1 0 0 1 0 1 1 1.   (4.4.12)
Multiple point crossovers, in which information between the two parents is swapped among more string segments, are also possible, but because of the mixing of the strings the crossover becomes a more random process and the performance of the algorithm might degrade, De Jong [60]. An exception to this is the two-point crossover. In fact, the one-point crossover can be viewed as a special case of the two-point crossover in which the end of the string is the second crossover point. Booker [61] showed that by choosing the end-point of the segment to be crossed randomly, the performance of the algorithm can actually be improved.
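One-point crossover and bit-wise mutation can be sketched as follows (strings of '0'/'1' characters; the mutation probability is an illustrative default):

```python
import random

def crossover(p1, p2, rng):
    """One-point crossover: swap everything after a random cut point k."""
    k = rng.randint(1, len(p1) - 1)
    return p1[:k] + p2[k:], p2[:k] + p1[k:]

def mutate(s, rng, p_mut=0.01):
    """Flip each bit independently with small probability p_mut."""
    return ''.join('10'[int(b)] if rng.random() < p_mut else b for b in s)
```

With the cut point fixed at k = 5 the `crossover` helper reproduces the exchange of string tails worked out for the two parent strings above.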
Mutation serves the important task of preventing premature loss of important genetic information by the occasional introduction of random alterations of a string. As
In closing, the basic ideas behind the simulation of natural phenomena are finding a more mathematically sound foundation in the area of probabilistic search algorithms, especially for discrete variables. Improvements in the performance of the algorithms are constantly being made. For example, modifications in the cooling schedule proposed by Szu [65] led to the development of a new algorithm known as fast simulated annealing. Applications and analysis of other operations that mimic natural biological genetics (such as inversion, dominance, niches, etc.) are currently being evaluated for genetic algorithms.
4.5 Exercises
Begin with x_0^T = (-1, -2). For the simplex algorithm assume an initial simplex of size a = 2.0. Assume an initial base point x_0 with the coordinates of the other vertices given by Eqs. (4.2.1) and (4.2.2).
4. The potential energy of the shallow truss of problem 4 is

f(x_1, x_2) = ½ m γ (-α_1 x_1 + ½x_1^2 + x_2)^2 + ½ γ (-α_1 x_1 + ½x_1^2 - x_2)^2 - p̄ γ x_1,   p̄ = P/(E A_2),

where E is the elastic modulus, and A_1 and A_2 are the cross-sectional areas of the bars. Using the BFGS algorithm determine the equilibrium configuration in terms of x_1 and x_2 for m = 5, γ = 4, α_1 = 0.02, p̄ = 2 × 10^{-5}. Use x_0^T = (0, 0).
5. Continuing the analysis of problem 4, it can be shown that the critical load P_cr at which the shallow truss is unstable (snap-through instability) is given by

P_cr = E A_1 A_2 γ (γ + 1)^2 α_1^3 / [ 3√3 (A_1 + A_2 γ) ].

Suppose now that P_cr as given above is to be maximized subject to the condition that

A_1 l_1 + A_2 l_2 = V_0 = constant.

The exterior penalty formulation of Chapter 5 reduces the above problem to the unconstrained minimization of
where r is a penalty parameter. Carry out the minimization of an appropriately nondimensionalized form of the penalized objective for l_1 = 200 in, l_2 = 50 in, h = 2.50 in, V_0 = 200 in^3, E = 10^6 psi, r = 10^4 to determine an approximate solution for the optimum truss configuration and the corresponding value of P_cr. Use the BFGS algorithm for unconstrained minimization beginning with an initial feasible guess of A_1 = 0.952381 in^2 and A_2 = 0.190476 in^2.
6. a) Minimize the directional derivative of f in the direction s, subject to the constraint

Σ_{i=1}^n s_i^2 = 1,

to show that the steepest descent direction is given by

s = -∇f / ||∇f||.   (4.5.1)

b) Repeat the above with the constraint condition on s replaced by

s^T Q s = 1,

to show that the Newton direction is given by

s = -Q^{-1} ∇f / (∇f^T Q^{-1} ∇f)^{1/2}.
4.6 References
[1] Kamat, M.P. and Hayduk, R.J., "Recent Developments in Quasi-Newton Meth-
ods for Structural Analysis and Synthesis," AIAA J., 20 (5), 672-679, 1982.
[2] Avriel, M., Nonlinear Programming: Analysis and Methods. Prentice-Hall, Inc.,
1976.
[3] Powell, M.J.D., "An Efficient Method for Finding the Minimum of a Function of
Several Variables without Calculating Derivatives," Computer J., 7, pp. 155-162,
1964.
[4] Kiefer, J., "Sequential Minimax Search for a Maximum," Proceedings of the American Mathematical Society, 4, pp. 502-506, 1953.
[5] Walsh, G.R., Methods of Optimization, John Wiley, New York, 1975.
[6] Dennis, J.E. and Schnabel, R.B., Numerical Methods for Unconstrained Opti-
mization and Nonlinear Equations, Prentice-Hall, 1983.
[7] Gill, P.E., Murray, W. and Wright, M.H., Practical Optimization, Academic
Press, New York, p. 92, 1981.
[8] Spendley, W., Hext, G. R., and Himsworth, F. R., "Sequential Application of
Simplex Designs in Optimisation and Evolutionary Operation," Technometrics,
4 (4), pp. 441-461,1962.
[9] Nelder, J. A. and Mead, R., "A Simplex Method for Function Minimization,"
Computer J., 7, pp. 308-313, 1965.
[10] Chen, D. H., Saleem, Z., and Grace, D. W., "A New Simplex Procedure for
Function Minimization," Int. J. of Modelling & Simulation, 6, 3, pp. 81-85, 1986.
[11] Cauchy, A., "Methode Generale pour la Resolution des Systemes D'equations
Simultanees," Comp. Rend. de l'Academie des Sciences, Paris, 5, pp. 536-538, 1847.
[12] Hestenes, M.R. and Stiefel, E., "Methods of Conjugate Gradients for Solving
Linear Systems," J. Res. Nat. Bureau Stand., 49, pp. 409-436, 1952.
[13] Fletcher, R. and Reeves, C.M., "Function Minimization by Conjugate Gradients,"
Computer J., 7, pp. 149-154, 1964.
[14] Gill, P.E. and Murray, W., "Conjugate-Gradient Methods for Large Scale Nonlin-
ear Optimization," Technical Report 79-15; Systems Optimization Lab., Dept. of
Operations Res., Stanford Univ., pp. 10-12, 1979.
[15] Powell, M.J.D., "Restart Procedures for the Conjugate Gradient Method," Math.
Prog., 12, pp. 241-254, 1977.
[16] Polak, E., Computational Methods in Optimization: A Unified Approach, Aca-
demic Press, 1971.
[17] Axelsson, O. and Munksgaard, N., "A Class of Preconditioned Conjugate Gra-
dient Methods for the Solution of a Mixed Finite Element Discretization of the
Biharmonic Operator," Int. J. Num. Meth. Engng., 14, pp. 1001-1019, 1979.
[18] Johnson, O.G., Micchelli, C.A. and Paul, G., "Polynomial Preconditioners for
Conjugate Gradient Calculations," SIAM J. Num. Anal., 20 (2), pp. 362-376,
1983.
[19] Broyden, C.G., "The Convergence of a Class of Double-Rank Minimization Al-
gorithms 2. The New Algorithm," J. Inst. Math. Appl., 6, pp. 222-231, 1970.
[20] Oren, S.S. and Luenberger, D., "Self-scaling Variable Metric Algorithms, Part I," Manage. Sci., 20 (5), pp. 845-862, 1974.
[21] Davidon, W.C., Variable Metric Method for Minimization. Atomic Energy Com-
mission Research and Development Report, ANL-5990 (Rev.), November 1959.
[22] Fletcher, R. and Powell, M.J.D., "A Rapidly Convergent Descent Method for Minimization," Computer J., 6, pp. 163-168, 1963.
[23] Fletcher, R., "A New Approach to Variable Metric Algorithms," Computer J., 13 (3), pp. 317-322, 1970.
[24] Goldfarb, D., "A Family of Variable-metric Methods Derived by Variational Means," Math. Comput., 24, pp. 23-26, 1970.
[25] Shanno, D.F., "Conditioning of Quasi-Newton Methods for Function Minimiza-
tion," Math. Comput., 24, pp. 647-656, 1970.
[26] Dennis, J.E., Jr. and More, J.J., "Quasi-Newton Methods, Motivation and The-
ory," SIAM Rev., 19 (1), pp. 46-89, 1977.
[27] Powell, M.J.D., "Some Global Convergence Properties of a Variable Metric Algo-
rithm for Minimization Without Exact Line Searches," In: Nonlinear Program-
ming (R.W.Cottle and C.E. Lemke, eds.), American Mathematical Society, Prov-
idence, RI, pp. 53-72, 1976.
[28] Shanno, D.F., "Conjugate Gradient Methods with Inexact Searches," Math. Oper. Res., 3 (2), pp. 244-256, 1978.
[29] Kamat, M.P., Watson, L.T. and Junkins, J.L., "A Robust Efficient Hybrid
Method for Finding Multiple Equilibrium Solutions," Proceedings of the Third
Intl. Conf. on Numerical Methods in Engineering, Paris, France, pp. 799-807,
March 1983.
[30] Kwok, H.H., Kamat, M.P. and Watson, L.T., "Location of Stable and Unstable
Equilibrium Configurations using a Model Trust Region, Quasi-Newton Method
and Tunnelling," Computers and Structures, 21 (6), pp. 909-916, 1985.
[31] Matthies, H. and Strang, G., "The Solution of Nonlinear Finite Element Equa-
tions," Int. J. Num. Meth. Enging., 14, pp. 1613-1626, 1979.
[32] Schubert, L.K., "Modification of a Quasi-Newton Method for Nonlinear Equations
with a Sparse Jacobian," Math. Comput., 24, pp. 27-30, 1970.
[33] Broyden, C.G., "A Class of Methods for Solving Nonlinear Simultaneous Equa-
tions," Math. Comput., 19, pp. 577-593, 1965.
[34] Toint, Ph.L., "On Sparse and Symmetric Matrix Updating Subject to a Linear
Equation," Math. Comput., 31, pp. 954-961, 1977.
[35] Shanno, D.F., "On Variable-Metric Methods for Sparse Hessians," Math. Com-
put., 34, pp. 499-514, 1980.
[36] Curtis, A.R., Powell, M.J.D. and Reid, J.K., "On the Estimation of Sparse Jaco-
bian Matrices," J. Inst. Math. Appl., 13, pp. 117-119,1974.
[37] Powell, M.J .D. and Toint, Ph.L., "On the Estimation of Sparse Hessian Matrices,"
SIAM J. Num. Anal., 16 (6), pp. 1060-1074,1979.
[38] Kamat, M.P., Watson, L.T. and VandenBrink, D.J., "An Assessment of Quasi-
Newton Sparse Update Techniques for Nonlinear Structural Analysis," Comput.
Meth. Appl. Mech. Enging., 26, pp. 363-375, 1981.
[39] Kamat, M.P. and VandenBrink, D.J., "A New Strategy for Stress Analysis Using
the Finite Element Method," Computers and Structures, 16 (5), pp. 651-656,
1983.
[40] Gill, P.E. and Murray, W., "Newton-type Methods for Linearly Constrained Opti-
mization," In: Numerical Methods for Constrained Optimization (Gill & Murray,
eds.), pp. 29~66. Academic Press, New York 1974.
[41] Griewank, A.O., Analysis and Modifications of Newton's Method at Singularities.
Ph.D. Thesis, Australian National University, 1980.
[42] Decker, D.W. and Kelley, C.T., "Newton's 11ethod at Singular Points, I and II,"
SIAM J. Num. Anal., 17, pp. 66~70; 465~471, 1980.
[43] Hansen, E., "Global Optimization Using Interval Analysis~ The Multi Dimen-
sional Case," Numer. Math., 34, pp. 247~270, 1980.
[44] Kao, J.-J., Brill, E. D., Jr., and Pfeffer, J. T., "Generation of Alternative Optima
for Nonlinear Programming Problems," Eng. Opt., 15, pp. 233~251, 1990.
[45] Ge, R., "Finding More and More Solutions of a System of Nonlinear Equations,"
Appl. Math. Computation, 36, pp. 15-30, 1990.
[46] Laarhoven, P. J. M. van., and Aarts, E., Simulated Annealing: Theory and Ap-
plications, D. Reidel Publishing, Dordrecht, The Netherlands, 1987.
[47] Goldberg, D. E., Genetic Algorithms in Search, Optimization, and Machine
Learning, Addison-Wesley Publishing Co. Inc., Reading, Massachusetts, 1989.
[48] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller,
E., "Equation of State Calculations by Fast Computing Machines," J. Chern.
Physics, 21 (6), pp. 1087~1092, 1953.
[49] Kirkpatrick, S., Gelatt, C. D., Jr., and Vecchi, M. P., "Optimization by Simulated
Annealing," Science, 220 (4598), pp. 671 ~680, 1983.
[50] Cerny, V., "Thermodynamical Approach to the Traveling Salesman Problem: An
Efficient Simulation Algorithm," J. Opt. Theory Appl., 45, pp. 41~52, 1985.
[51] Rutenbar, R. A., "Simulated Annealing Algorithms: An Overview," IEEE Cir-
cuits and Devices, January, pp. 19~26, 1989.
[52] Johnson, D. S., Aragon, C. R., McGeoch, L. A., and Schevon, C., "Optimization
by Simulated Annealing: An Experimental Evaluation. Part I. Graph Partition-
ing," Operations Research, 37, 1990, pp. 865~893.
[53] Aarts, E., and Korst, J., Simulated Annealing and Boltzmann Machines, A
Stochastic Approach to Combinatorial Optimization and Neural Computing,
John Wiley & Sons, 1989.
157
Chapter 4: Unconstrained Optimization
[541 Nahar, S., Sahni, S., and Shragowithz, E. V., in the Proceedings of 22nd Design
Automation Conf., Las Vegas, June 1985, pp. 748-752.
[551 Elperin, T, "Monte Carlo Structural Optimization in Discrete Variables with
Annealing ALgorithm," Int. J. Num. Meth. Eng., 26, 1988, pp. 815-821.
[56] Kincaid, R. K., and Padula, S. L., "Minimizing Distortion and Internal Forces
in Truss Structures by Simulated Annealing," Proceedings of the AIAA/ ASME
/ ASCE/ AHS/ ASC 31st Structures, Structural Dynamics, and Materials Confer-
ence, Long Beach, CA., 1990, Part 1, pp. 327-333.
[57] Balling, R. J., and May, S. A., "Large-Scale Discrete Structural Optimization:
Simulated Annealing, Branch-and-Bound, and Other Techniques," presented at
the AIAA/ AS ME/ ASCE/ AHS/ ASC 32nd Structures, Structural Dynamics, and
Materials Conference, Long Beach, CA., 1990,
[58] Chen, G.-S., Bruno, R. J., and Salama, M., "Optimal Placement of Active/Passive
Members in Structures Using Simulated Annealing," AIAA J., 29 (8), August
1991, pp. 1327-1334.
[59] Holland, J. H., Adaptation of Natural and Artificial Systems, The University of
Michigan Press, Ann Arbor, MI, 1975.
[60] De Jong, K. A., Analysis of the Behavior of a Class of Genetic Adaptive Systems
(Doctoral Dissertation, The University of Michigan; University Microfilms No.
76-9381), Dissertation Abstracts International, 36 (10), 5140B, 1975.
[61] Booker, L., "Improving Search in Genetic Algorithms," in Genetic Algorithms
and Simulated Annealing, Ed. L. Davis, Morgan Kaufmann Publishers, Inc., Los
Altos, CA. 1987, pp. 61-73.
[62] Goldberg, D. E., and Samtani, M. P., "Engineering Optimization via Genetic
Algorithm," Proceedings of the Ninth Conference on Electronic Computation,
ASCE, February 1986, pp. 471-482.
[63] Hajela, P., "Genetic Search-An Approach to the Nonconvex Optimization Prob-
lem," AIAA J., 28 (7), July 1990, pp. 1205-1210.
[64] Rao, S. S., Pan, T.-S., and Venkayya, V. B., "Optimal Placement of Actuators in
Actively Controlled Structures Using Genetic Algorithms," AIAA J., 29 (6), pp.
942-943, June 1991.
[65] Szu, H., and Hartley, R.L., "Nonconvex Optimization by Fast Simulated Anneal-
ing," Proceedings of the IEEE, 75 (11), pp. 1538-1540,1987.
158
5 Constrained Optimization
great impact on the design, so that typically several of the inequality constraints are
active at the minimum.
While the methods described in this section are powerful, they can often per-
form poorly when design variables and constraints are scaled improperly. To prevent
ill-conditioning, all the design variables should have similar magnitudes, and all con-
straints should have similar values when they are at similar levels of criticality. A
common practice is to normalize constraints such that g(x) = 0.1 corresponds to a
ten percent margin in a response quantity. For example, if the constraint is an upper
limit σ_a on a stress measure σ, then the constraint may be written as

g = 1 − σ/σ_a ≥ 0 .    (5.2)
Some of the numerical techniques offered in this chapter for the solution of constrained nonlinear optimization problems are not able to handle equality constraints, but are limited to inequality constraints. In such instances it is possible to replace an equality constraint of the form h_i(x) = 0 with two inequality constraints, h_i(x) ≤ 0 and h_i(x) ≥ 0. However, it is usually undesirable to increase the number of constraints. For problems with large numbers of inequality constraints, it is also possible to construct a single equivalent constraint to replace them. One way to replace a family of inequality constraints (g_i(x) ≥ 0, i = 1, …, m) by an equivalent constraint is to use the Kreisselmeier-Steinhauser function [1] (KS-function), defined as

KS[g_i(x)] = −(1/ρ) ln [ Σ_{i=1}^{m} e^{−ρ g_i(x)} ] ,    (5.3)

where ρ is a user-selected parameter. Denoting the smallest of the g_i by g_min, the KS-function satisfies

g_min ≥ KS[g_i(x)] ≥ g_min − ln(m)/ρ .    (5.4)
[Figure: the KS-function envelope (shown for ρ = 0.1) of the pair of constraints obtained from an equality constraint h(x) = 0.]
Equality constraints may also be handled with the KS-function. The problem

minimize f(x)
such that h_k(x) = 0 ,  k = 1, …, n_e ,    (5.6)

may be reformulated as

minimize f(x)
such that KS(h_1, −h_1, h_2, −h_2, …, h_{n_e}, −h_{n_e}) ≥ −ε ,    (5.7)

where ε is a small positive tolerance.
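As a quick numerical illustration (the function name and sample values below are ours, not the text's), the KS-function of Eq. (5.3) and the bounds of Eq. (5.4) can be checked directly; the shifted form used here is algebraically identical to Eq. (5.3) but avoids overflow in the exponentials for large ρ:

```python
import numpy as np

def ks(g, rho):
    # Kreisselmeier-Steinhauser function of Eq. (5.3).  The shift by g.min()
    # is algebraically neutral but avoids overflow in exp for large rho.
    gmin = g.min()
    return gmin - np.log(np.sum(np.exp(-rho * (g - gmin)))) / rho

g = np.array([0.3, 1.2, 0.35, 2.0])        # sample values of the g_i
for rho in (5.0, 50.0, 500.0):
    val = ks(g, rho)
    # bounds of Eq. (5.4): g_min >= KS >= g_min - ln(m)/rho
    assert g.min() >= val >= g.min() - np.log(len(g)) / rho
    print(rho, val)                         # KS approaches g_min = 0.3 as rho grows
```

As ρ grows the KS-function becomes an increasingly tight conservative envelope of the most critical constraint, which is exactly the property that Eq. (5.4) quantifies.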
5.1 The Kuhn-Tucker Conditions

In general, problem (5.1) may have several local minima. Only under special circumstances are we sure of the existence of a single global minimum. The necessary conditions for a minimum of the constrained problem are obtained by using the Lagrange multiplier method. We start by considering the special case of equality constraints only. Using the Lagrange multiplier technique, we define the Lagrangian function

ℒ(x, λ) = f(x) − Σ_{j=1}^{n_e} λ_j h_j(x) ,    (5.1.1)
where λ_j are unknown Lagrange multipliers. The necessary conditions for a stationary point are

∂ℒ/∂x_i = ∂f/∂x_i − Σ_{j=1}^{n_e} λ_j ∂h_j/∂x_i = 0 ,  i = 1, …, n ,    (5.1.2)

∂ℒ/∂λ_j = h_j(x) = 0 ,  j = 1, …, n_e .    (5.1.3)
These conditions, however, apply only at a regular point, that is at a point where the
gradients of the constraints are linearly independent. If we have constraint gradients
that are linearly dependent, it means that we can remove some constraints without
affecting the solution. At a regular point, Eqs. (5.1.2) and (5.1.3) represent n + ne
equations for the ne Lagrange multipliers and the n coordinates of the stationary
point.
The situation is somewhat more complicated when inequality constraints are
present. To be able to apply the Lagrange multiplier method we first transform the
inequality constraints to equality constraints by adding slack variables. That is, the
inequality constraints are written as

g_j(x) − t_j² = 0 ,  j = 1, …, n_g ,    (5.1.4)

where t_j is a slack variable which measures how far the jth constraint is from being
critical. We can now form a Lagrangian function

ℒ(x, λ, t) = f(x) − Σ_{j=1}^{n_g} λ_j [g_j(x) − t_j²] .    (5.1.5)

The conditions for ℒ to be stationary are

∂ℒ/∂x_i = ∂f/∂x_i − Σ_{j=1}^{n_g} λ_j ∂g_j/∂x_i = 0 ,  i = 1, …, n ,    (5.1.6)

∂ℒ/∂λ_j = −g_j + t_j² = 0 ,  j = 1, …, n_g ,    (5.1.7)

∂ℒ/∂t_j = 2 λ_j t_j = 0 ,  j = 1, …, n_g .    (5.1.8)
Equations (5.1.7) and (5.1.8) imply that when an inequality constraint is not critical
(so that the corresponding slack variable is non-zero) then the Lagrange multiplier
associated with the constraint is zero. Equations (5.1.6) to (5.1.8) are the necessary
conditions for a stationary regular point. Note that for inequality constraints a regular
point is one where the gradients of the active constraints are linearly independent.
These conditions are modified slightly to yield the necessary conditions for a minimum
and are known as the Kuhn-Tucker conditions. The Kuhn-Tucker conditions may be
summarized as follows:
g_j(x*) ≥ 0 ,  j = 1, …, n_g ,
λ_j ≥ 0 ,  λ_j g_j(x*) = 0 ,  j = 1, …, n_g ,    (5.1.9)
∂f/∂x_i − Σ_{j=1}^{n_g} λ_j ∂g_j/∂x_i = 0 ,  i = 1, …, n .

Figure 5.1.1 A geometrical interpretation of the Kuhn-Tucker conditions for the case of two constraints.
A direction s that does not violate the active constraints must satisfy

sᵀ∇g_j ≥ 0 ,  j ∈ I_A ,    (5.1.10)

where I_A is the set of active constraints. Equality in Eq. (5.1.10) is permitted only for linear or concave constraints (see Section 5.1.2 for definition of concavity). The condition for a usable direction (one that decreases the objective function) is

sᵀ∇f < 0 .    (5.1.11)

Multiplying the last of the conditions in Eq. (5.1.9) by s_i and summing over i, we obtain

sᵀ∇f = Σ_{j ∈ I_A} λ_j sᵀ∇g_j .    (5.1.12)

In view of Eqs. (5.1.10) and (5.1.11), Eq. (5.1.12) is impossible if the λ_j's are positive.
If the Kuhn-Tucker conditions are satisfied at a point it is impossible to find a
direction with a negative slope for the objective function that does not violate the
constraints. In some cases, though, it is possible to move in a direction which is
tangent to the active constraints and perpendicular to the gradient (that is, has zero
slope), that is

sᵀ∇f = 0 ,  sᵀ∇g_j = 0 ,  j ∈ I_A .    (5.1.13)
The effect of such a move on the objective function and constraints can be determined
only from higher derivatives. In some cases a move in this direction could reduce the
objective function without violating the constraints even though the Kuhn-Tucker
conditions are met. Therefore, the Kuhn-Tucker conditions are necessary but not
sufficient for optimality.
The Kuhn-Tucker conditions are sufficient when the number of active constraints is equal to the number of design variables. In this case Eq. (5.1.13) cannot be satisfied with s ≠ 0, because the set of gradients ∇g_j includes n linearly independent directions (in n-dimensional space a vector cannot be orthogonal to n linearly independent vectors).
When the number of active constraints is not equal to the number of design
variables sufficient conditions for optimality require the second derivatives of the
objective function and constraints. A sufficient condition for optimality is that the
Hessian matrix of the Lagrangian function is positive definite in the subspace tangent
to the active constraints. If we take, for example, the case of equality constraints, the Hessian matrix of the Lagrangian is

∇²ℒ = ∇²f − Σ_{j=1}^{n_e} λ_j ∇²h_j .
Example 5.1.1

Minimize

f = −x1³ + 10x1 − 2x2² − 2x2³ − 6

subject to

g1 = 10 − x1x2 ≥ 0 ,
g2 = x1 ≥ 0 ,
g3 = 10 − x2 ≥ 0 .

The Kuhn-Tucker conditions are

−3x1² + 10 + λ1x2 − λ2 = 0 ,
−4x2 − 6x2² + λ1x1 + λ3 = 0 ,

together with λ_j g_j = 0 and λ_j ≥ 0, j = 1, 2, 3. We first consider the possibility that no constraint is active, λ1 = λ2 = λ3 = 0; then x1 = 1.826 (the root x1 = −1.826 violates g2) and x2 = 0 or x2 = −2/3. At (1.826, 0) the Hessian of the Lagrangian,

∇²ℒ = [ −6x1  0 ; 0  −4 − 12x2 ] = [ −10.95  0 ; 0  −4 ] ,

is clearly negative definite, so that this point is a maximum. We next assume that the
first constraint is active, x1x2 = 10, so that x1 ≠ 0 and g2 is inactive and therefore λ2 = 0. We have two possibilities for the third constraint. If it is active we get x1 = 1, x2 = 10, λ1 = −0.7, and λ3 = 639.3, so that this point is neither a minimum nor a maximum. If the third constraint is not active, λ3 = 0 and we obtain the following three equations

−3x1² + 10 + λ1x2 = 0 ,
−4x2 − 6x2² + λ1x1 = 0 ,
x1x2 = 10 .
The only solution of these equations that satisfies the constraints on x1 and x2 is

x1 = 3.847 ,  x2 = 2.6 ,  λ1 = 13.24 .

This point satisfies the Kuhn-Tucker conditions for a minimum. However, the Hessian of the Lagrangian at that point,

∇²ℒ = [ −23.08  13.24 ; 13.24  −35.19 ] ,

is not positive definite in the subspace tangent to the active constraint, so the sufficiency condition is not satisfied and the point need not be a minimum.
Next we consider the possibility that g1 is not active, so that λ1 = 0, and

−3x1² + 10 − λ2 = 0 ,
−4x2 − 6x2² + λ3 = 0 .
We have already considered the possibility of both λ's being zero, so we need to consider only the three possibilities of one of these Lagrange multipliers being nonzero, or both being nonzero. The first case is λ2 ≠ 0, λ3 = 0; then g2 = 0 and we get x1 = 0, x2 = 0, λ2 = 10, and f = −6, or x1 = 0, x2 = −2/3, λ2 = 10, and f = −6.99. Both points satisfy the Kuhn-Tucker conditions for a minimum, but not the sufficiency condition. In fact, the vectors tangent to the active constraint (x1 = 0 is the only one) have the form sᵀ = (0, a), and it is easy to check that sᵀ∇²ℒs < 0. It is also easy to check that these points are indeed not minima by reducing x2 slightly.
The next case is λ2 = 0, λ3 ≠ 0, so that g3 = 0. We get x1 = 1.826, x2 = 10, λ3 = 640 and f = −2194. This point satisfies the Kuhn-Tucker conditions, but it is not a minimum either. It is easy to check that ∇²ℒ is negative definite in this case, so that the sufficiency condition could not be satisfied. Finally, we consider the case x1 = 0, x2 = 10, λ2 = 10, λ3 = 640, f = −2206. Now the Kuhn-Tucker conditions are satisfied, and the number of active constraints is equal to the number of design variables, so that this point is a minimum. •••
5.1.2 Convex Problems
There is a class of problems, namely convex problems, for which the Kuhn-Tucker
conditions are not only necessary but also sufficient for a global minimum. To define
convex problems we need the notions of convexity for a set of points and for a function.
A set of points S is convex whenever the entire line segment connecting two points
that are in S is also in S. That is,

if x1, x2 ∈ S, then αx1 + (1 − α)x2 ∈ S ,  0 < α < 1 .    (5.1.17)

A function is convex if

f(αx1 + (1 − α)x2) ≤ αf(x1) + (1 − α)f(x2) ,  0 < α < 1 .    (5.1.18)

This is shown pictorially for a function of a single variable in Figure (5.1.2): the straight segment connecting any two points on the curve must lie above the curve. Alternatively, we note that for a convex function of a single variable the second derivative is non-negative, f″(x) ≥ 0. It can be shown that a function of n variables is convex if its matrix of second derivatives is positive semi-definite.
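The positive semi-definiteness test just stated can be sketched numerically (the helper below is ours, not from the text): approximate the Hessian by central finite differences and inspect its eigenvalues.

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    # Central finite-difference approximation of the Hessian of f at x.
    n = len(x)
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

f = lambda x: x[0]**2 + x[0]*x[1] + x[1]**2      # a convex test function
H = hessian_fd(f, np.array([0.3, -1.2]))         # Hessian is [[2, 1], [1, 2]]
print(np.all(np.linalg.eigvalsh(H) >= -1e-6))    # True: positive semi-definite
```

For a general function the test must hold at every point of the domain, not just at one sample point; for the quadratic above the Hessian is constant, so one evaluation suffices.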
A convex optimization problem has a convex objective function and a convex
feasible domain. It can be shown that the feasible domain is convex if all the inequality
constraints gj are concave (that is, -gj are convex) and the equality constraints are
linear. A convex optimization problem has only one minimum, and the Kuhn-Tucker
conditions are sufficient to establish it. Most optimization problems encountered in
practice cannot be shown to be convex. However, the theory of convex programming is
still very important in structural optimization, as we often approximate optimization problems by a series of convex approximations (see Chapter 9). The simplest such approximation is a linear approximation for the objective function and constraints; this produces a linear programming problem.
Example 5.1.2
Consider the minimum weight design of the four bar truss shown in Figure (5.1.3).
For the sake of simplicity we assume that members 1 through 3 have the same area
A1 and member 4 has an area A 2. The constraints are limits on the stresses in the
members and on the vertical displacement at the right end of the truss. Under the
specified loading the member forces and the vertical displacement δ at the end are found to be

f1 = 5p ,  f2 = −p ,  f3 = 4p ,  f4 = −2√3 p ,

δ = (6pl/E) (3/A1 + √3/A2) .
We assume the allowable stresses in tension and compression to be 8.74 × 10⁻⁴E and 4.83 × 10⁻⁴E, respectively, and limit the vertical displacement to be no greater than 3 × 10⁻³l. The minimum weight design subject to stress and displacement constraints
can be formulated in terms of the nondimensional design variables

x1 = 10⁻³ A1 E / p ,  x2 = 10⁻³ A2 E / p ,

as

minimize f = 3x1 + √3 x2

subject to g1 = 3 − 18/x1 − 6√3/x2 ≥ 0 ,
g2 = x1 − 5.73 ≥ 0 ,
g3 = x2 − 7.17 ≥ 0 .
The Kuhn-Tucker conditions are

∂f/∂x_i − Σ_{j=1}^{3} λ_j ∂g_j/∂x_i = 0 ,  i = 1, 2 .

Assuming that only the displacement constraint g1 is active, these conditions give x1 = x2, and g1 = 0 then yields x1 = x2 = 6 + 2√3 = 9.46, with f = 44.78 and λ1 = x1²/6 = 14.93 > 0, so the Kuhn-Tucker conditions are satisfied. To show that this minimum is global we check convexity. The objective function and the constraints g2 and g3 are linear, and the Hessian of −g1,

−∇²g1 = [ 36/x1³  0 ; 0  12√3/x2³ ] ,

is clearly positive definite for x1 > 0 and x2 > 0, so that −g1 is convex and g1 is concave. The problem is therefore convex, and the minimum that we found is a global minimum. •••
5.2 Quadratic Programming Problems
Quadratic programming problems have a quadratic objective function and linear constraints:

minimize f(x) = cᵀx − ½ xᵀQx
such that Ax ≥ b ,  x ≥ 0 .    (5.2.1)

Converting the inequality constraints to equalities with slack variables, the Lagrangian function is

ℒ(x, λ, μ, t, s) = cᵀx − ½ xᵀQx − λᵀ(Ax − {t_j²} − b) − μᵀ(x − {s_i²}) ,    (5.2.2)

where λ and μ are the vectors of Lagrange multipliers for the inequality constraints and the nonnegativity constraints, respectively, and {t_j²} and {s_i²} are the vectors of positive slack variables for the same. The necessary conditions for a stationary point are obtained by differentiating the Lagrangian with respect to x, λ, μ, t, and s:
∂ℒ/∂x = c − Qx − Aᵀλ − μ = 0 ,    (5.2.3)

∂ℒ/∂λ = Ax − {t_j²} − b = 0 ,    (5.2.4)

∂ℒ/∂μ = x − {s_i²} = 0 ,    (5.2.5)

∂ℒ/∂t_j = 2 λ_j t_j = 0 ,  j = 1, …, n_g ,    (5.2.6)

∂ℒ/∂s_i = 2 μ_i s_i = 0 ,  i = 1, …, n .    (5.2.7)
Defining q_j = t_j², Eqs. (5.2.3) through (5.2.7) may be rewritten as

Qx + Aᵀλ + μ = c ,    (5.2.8)

Ax − q = b ,    (5.2.9)

λ_j q_j = 0 ,  j = 1, …, n_g ,    (5.2.10)

μ_i x_i = 0 ,  i = 1, …, n ,    (5.2.11)

x ≥ 0 ,  λ ≥ 0 ,  μ ≥ 0 ,  and q ≥ 0 .    (5.2.12)
Equations (5.2.8) and (5.2.9) form a set of n + n_g linear equations for the unknowns x_i, λ_j, μ_i, and q_j, which also need to satisfy Eqs. (5.2.10) and (5.2.11). Despite the nonlinearity of Eqs. (5.2.10) and (5.2.11), this problem can be solved as proposed by Wolfe [3] by using the procedure described in Section 3.6.3 for generating a basic feasible solution through the use of artificial variables. Introducing a set of artificial variables, y_i, i = 1, …, n, we define an artificial cost function to be minimized,
minimize Σ_{i=1}^{n} y_i    (5.2.13)

subject to Qx + Aᵀλ + μ + y = c ,    (5.2.14)

Ax − q = b ,    (5.2.15)

x ≥ 0 ,  λ ≥ 0 ,  μ ≥ 0 ,  and y ≥ 0 .    (5.2.16)
Equations (5.2.13) through (5.2.16) can be solved by using the standard simplex
method with the additional requirement that (5.2.10) and (5.2.11) be satisfied. These
requirements can be implemented during the simplex algorithm by simply enforcing
that the variables λ_j and q_j (and μ_i and x_i) not be included in the basic solution simultaneously. That is, we restrict a non-basic variable μ_i from entering the basis if the corresponding x_i is already among the basic variables.
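The complementarity logic can be mimicked on a small scale by brute force: instead of running a restricted simplex search, one can enumerate the complementarity cases of Eqs. (5.2.10) and (5.2.11) directly. The sketch below (the function name and data are ours, and the 2^(n+m) sweep is for illustration only; Wolfe's method locates one consistent case far more efficiently) solves Eqs. (5.2.8)-(5.2.12):

```python
import numpy as np
from itertools import product

def kt_points(Q, c, A, b, tol=1e-9):
    # Solve the optimality system of Eqs. (5.2.8)-(5.2.12):
    #   Q x + A^T lam + mu = c,   A x - q = b,
    #   lam_j q_j = 0,  mu_i x_i = 0,  x, lam, mu, q >= 0,
    # by sweeping all complementarity cases: each pair (mu_i, x_i) and
    # (lam_j, q_j) has at least one member forced to zero.
    n, m = len(c), len(b)
    points = []
    for zero_x in product([False, True], repeat=n):
        for zero_lam in product([False, True], repeat=m):
            N = 2 * (n + m)                    # unknowns ordered (x, lam, mu, q)
            M, r = np.zeros((N, N)), np.zeros(N)
            M[:n, :n], M[:n, n:n+m], M[:n, n+m:2*n+m] = Q, A.T, np.eye(n)
            r[:n] = c                          # rows for Eq. (5.2.8)
            M[n:n+m, :n], M[n:n+m, 2*n+m:] = A, -np.eye(m)
            r[n:n+m] = b                       # rows for Eq. (5.2.9)
            row = n + m
            for i in range(n):                 # Eq. (5.2.11): fix x_i or mu_i
                M[row, i if zero_x[i] else n + m + i] = 1.0
                row += 1
            for j in range(m):                 # Eq. (5.2.10): fix lam_j or q_j
                M[row, n + j if zero_lam[j] else 2*n + m + j] = 1.0
                row += 1
            try:
                z = np.linalg.solve(M, r)
            except np.linalg.LinAlgError:
                continue                       # inconsistent case
            if z.min() < -tol:
                continue                       # violates Eq. (5.2.12)
            x = z[:n]
            if not any(np.allclose(x, p) for p, _ in points):
                points.append((x, float(c @ x - 0.5 * x @ Q @ x)))
    return points

# One-variable illustration: Q = [[2]], c = [4], one constraint x >= 1
pts = kt_points(np.array([[2.0]]), np.array([4.0]),
                np.array([[1.0]]), np.array([1.0]))
print(sorted((round(float(x[0]), 4), round(f, 4)) for x, f in pts))
# -> [(1.0, 3.0), (2.0, 4.0)]
```

The two returned points correspond to the constraint being active (x = 1, λ = 2) and inactive (x = 2, λ = 0); the sweep produces all candidates, while the simplex-based method converges to a single consistent basis.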
Other methods for solving the quadratic programming problem are also available,
and the reader is referred to Gill et al. ([4], pp. 177-180) for additional details.
5.3 Computing the Lagrange Multipliers
As may be seen from Example 5.1.1, trying to find the minimum directly from
the Kuhn-Tucker conditions may be difficult because we need to consider many com-
binations of active and inactive constraints, and this would in general involve the
solution of highly nonlinear equations. The Kuhn-Tucker conditions are, however,
often used to check whether a candidate minimum point satisfies the necessary con-
ditions. In such a case we need to calculate the Lagrange multipliers (also called the
Kuhn-Tucker multipliers) at a given point x. As we will see in the next section, we
may also want to calculate the Lagrange multipliers for the purpose of estimating the
sensitivity of the optimum solution to small changes in the problem definition. To
calculate the Lagrange multipliers we start by writing Eq. (5.1.6) in matrix notation
as
∇f − Nλ = 0 ,    (5.3.1)

where the matrix N is defined by

n_ij = ∂g_j/∂x_i ,  j = 1, …, r ,  and  i = 1, …, n .    (5.3.2)
We consider only the active constraints and associated Lagrange multipliers, and assume that there are r of them.
Typically, the number, r, of active constraints is less than n, so that with n
equations in terms of r unknowns, Eq. (5.3.1) is an overdetermined system. We
assume that the gradients of the constraints are linearly independent so that N has
rank r. If the Kuhn-Tucker conditions are satisfied the equations are consistent and we have an exact solution. We could therefore use a subset of r equations to solve for the Lagrange multipliers. However, this approach may be susceptible to amplification of errors. Instead we can use a least-squares approach to solve the equations. We define a residual vector u,

u = Nλ − ∇f .    (5.3.3)

A least-squares solution of Eq. (5.3.1) will minimize the square of the Euclidean norm of the residual with respect to λ,

‖u‖² = (Nλ − ∇f)ᵀ(Nλ − ∇f) .    (5.3.4)

To minimize ‖u‖² we differentiate it with respect to each one of the Lagrange multipliers and get

NᵀNλ − Nᵀ∇f = 0 ,    (5.3.5)

or

λ = (NᵀN)⁻¹Nᵀ∇f .    (5.3.6)
This is the best solution in the least-squares sense. However, if the Kuhn-Tucker conditions are satisfied it should be the exact solution of Eq. (5.3.1). Substituting λ from Eq. (5.3.6) into Eq. (5.3.1) we obtain

P∇f = 0 ,    (5.3.7)

where

P = I − N(NᵀN)⁻¹Nᵀ .    (5.3.8)

P is called the projection matrix. It will be shown in Section 5.5 that it projects a vector into the subspace tangent to the active constraints. Equation (5.3.7) implies that for the Kuhn-Tucker conditions to be satisfied the gradient of the objective function has to be orthogonal to that subspace.
In practice Eq. (5.3.6) is no longer popular for the calculation of the Lagrange
multipliers. One reason is that the method is ill-conditioned and another is that it is
not efficient. An efficient and better conditioned method for least-squares calculations is based on the QR factorization of the matrix N, which consists of an r × r upper triangular matrix R and an n × n orthogonal matrix Q such that

QN = [ R ; 0 ] ,    (5.3.9)

where 0 denotes the (n − r) × r zero matrix. Denoting the first r rows of Q by Q1 and the last n − r rows by Q2, the orthogonality of Q gives

‖u‖² = ‖Qu‖² = ‖Rλ − Q1∇f‖² + ‖Q2∇f‖² .    (5.3.10)

From this form it can be seen that ‖u‖² is minimized by choosing λ so that

Rλ = Q1∇f .    (5.3.11)

The last n − r rows of the matrix Q, denoted Q2, are also important in the following. They are orthogonal vectors which span the null space of Nᵀ; that is, Nᵀ times each one of these vectors is zero.
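The two routes to the multipliers can be compared in a short numerical sketch (the data and names below are illustrative, not from the text); note that numpy's `qr` returns the reduced factorization N = Q_r R, where Q_rᵀ plays the role of Q1 above:

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 2                            # n design variables, r active constraints
N = rng.standard_normal((n, r))        # columns are the constraint gradients
grad_f = rng.standard_normal(n)

# Normal equations, Eq. (5.3.6): lam = (N^T N)^{-1} N^T grad_f
lam_normal = np.linalg.solve(N.T @ N, N.T @ grad_f)

# QR route, Eqs. (5.3.9)-(5.3.11): solve R lam = Q_r^T grad_f
Q_r, R = np.linalg.qr(N)
lam_qr = np.linalg.solve(R, Q_r.T @ grad_f)

# Projection matrix of Eq. (5.3.8): P grad_f is the least-squares residual
P = np.eye(n) - N @ np.linalg.solve(N.T @ N, N.T)
print(np.allclose(lam_normal, lam_qr))               # True
print(np.allclose(P @ grad_f, grad_f - N @ lam_qr))  # True
```

Both routes give the same multipliers in exact arithmetic; the QR route avoids forming NᵀN, whose condition number is the square of that of N.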
Example 5.3.1
Check whether the point (−2, −2, 4) is a local minimum of the problem

minimize f = x1 + x2 + x3 ,
such that g1 = 8 − x1² − x2² ≥ 0 ,
g2 = x3 − 4 ≥ 0 ,
g3 = x2 + 8 ≥ 0 .

Only the first two constraints are critical at (−2, −2, 4). Their gradients at that point are

∇g1 = (−2x1, −2x2, 0)ᵀ = (4, 4, 0)ᵀ ,  ∇g2 = (0, 0, 1)ᵀ ,

so that

N = [ 4 0 ; 4 0 ; 0 1 ] ,  ∇f = (1, 1, 1)ᵀ .
λ = (NᵀN)⁻¹Nᵀ∇f = { 1/4 ; 1 } ;

also

[I − N(NᵀN)⁻¹Nᵀ] ∇f = 0 .
Equation (5.3.7) is satisfied, and all the Lagrange multipliers are positive, so the
Kuhn-Tucker conditions for a minimum are satisfied .•••
5.4 Sensitivity of Optimum Solution to Problem Parameters
The Lagrange multipliers are not only useful for checking optimality, but they
also provide information about the sensitivity of the optimal solution to problem
parameters. In this role they are extremely valuable in practical applications. In
most engineering design optimization problems we have a host of parameters such as
material properties, dimensions and load levels that are fixed during the optimization.
We often need the sensitivity of the optimum solution to these problem parameters,
either because we do not know them accurately, or because we have some freedom to
change them if we find that they have a large effect on the optimum design.
We assume now that the objective function and constraints depend on a param-
eter p so that the optimization problem is defined as
minimize f(x, p)
such that g_j(x, p) ≥ 0 ,  j = 1, …, n_g .    (5.4.1)
The solution of the problem is denoted x*(p) and the corresponding objective function f*(p) = f(x*(p), p). We want to find the derivatives of x* and f* with respect to p. The equations that govern the optimum solution are the Kuhn-Tucker conditions, Eq. (5.3.1), and the set of active constraints,

g_a(x, p) = 0 ,    (5.4.2)

where g_a denotes the vector of r active constraint functions. Equations (5.3.1) and
(5.4.2) are satisfied by x*(p) for all values of p that do not change the set of active
constraints. Therefore, the derivatives of these equations with respect to p are zero,
provided we consider the implicit dependence of x and λ on p. Differentiating Eqs. (5.3.1) and (5.4.2) with respect to p we obtain

(A − Z) dx*/dp − N dλ/dp + ∂(∇f)/∂p − (∂N/∂p) λ = 0 ,    (5.4.3)

Nᵀ dx*/dp + ∂g_a/∂p = 0 ,    (5.4.4)
where A is the Hessian matrix of the objective function f, a_ij = ∂²f/∂x_i∂x_j, and Z is a matrix whose elements are

z_ij = Σ_{k=1}^{r} λ_k ∂²g_k/∂x_i∂x_j .    (5.4.5)
Equations (5.4.3) and (5.4.4) are a system of simultaneous equations for the deriva-
tives of the design variables and of the Lagrange multipliers. Different special cases
of this system are discussed by Sobieski et al. [6].
Often we do not need the derivatives of the design variables or of the Lagrange multipliers, but only the derivatives of the objective function. In this case the sensitivity analysis can be greatly simplified. We can write

df/dp = ∂f/∂p + (∇f)ᵀ dx*/dp .    (5.4.6)

Using Eq. (5.3.1) to replace ∇f by Nλ, and then Eq. (5.4.4) to replace Nᵀdx*/dp by −∂g_a/∂p, we obtain

df/dp = ∂f/∂p − λᵀ ∂g_a/∂p .    (5.4.7)
Equation (5.4.7) shows that the Lagrange multipliers are a measure of the effect
of a change in the constraints on the objective function. Consider, for example, a constraint of the form g_j(x) = G_j(x) − p ≥ 0. By increasing p we make the constraint more difficult to satisfy. Assume that many constraints are critical, but that p affects only this single constraint. We see that ∂g_j/∂p = −1, and from Eq. (5.4.7) df/dp = λ_j; that is, λ_j is the 'marginal price' that we pay in terms of an increase in the objective function for making g_j more difficult to satisfy.
The interpretation of Lagrange multipliers as the marginal prices of the constraints also explains why at the optimum all the Lagrange multipliers have to be non-negative. A negative Lagrange multiplier would indicate that we can reduce the objective function by making a constraint more difficult to satisfy, which is an absurdity.
Example 5.4.1

Consider the problem

minimize f = x1 + x2 + x3 ,
such that g1 = p − x1² − x2² ≥ 0 ,
g2 = x3 − 4 ≥ 0 ,
g3 = x2 + p ≥ 0 .
This problem was analyzed for p = 8 in Example 5.3.1, and the optimal solution was
found to be (−2, −2, 4). We want to find the derivative of this optimal solution with respect to p. At the optimal point we have f = 0 and λᵀ = (0.25, 1.0), with the
first two constraints being critical. We can calculate the derivative of the objective
function from Eq. (5.4.7):

∂f/∂p = 0 ,  ∂g_a/∂p = { 1 ; 0 } ,

so

df/dp = −0.25 .
To calculate the derivatives of the design variables and of the Lagrange multipliers we need to set up Eqs. (5.4.3) and (5.4.4). Because f is linear and N does not depend on p, we get

A = 0 ,  ∂(∇f)/∂p = 0 ,  ∂N/∂p = 0 ,

while from Eq. (5.4.5)

z11 = −2λ1 = −0.5 ,  z22 = −2λ1 = −0.5 ,  Z = [ −0.5 0 0 ; 0 −0.5 0 ; 0 0 0 ] .

With N from Example 5.3.1, Eq. (5.4.3) gives us

0.5ẋ1 − 4λ̇1 = 0 ,
0.5ẋ2 − 4λ̇1 = 0 ,
λ̇2 = 0 ,

where a dot denotes derivative with respect to p. From Eq. (5.4.4) we get

4ẋ1 + 4ẋ2 + 1 = 0 ,
ẋ3 = 0 .

The solution of these equations is

ẋ1 = ẋ2 = −0.125 ,  ẋ3 = 0 ,  λ̇1 = −0.0156 ,  λ̇2 = 0 .
We can check the derivatives of the objective function and design variables by changing p from 8 to 9 and re-optimizing. It is easy to check that we get x1 = x2 = −2.121, x3 = 4, f = −0.242. These values compare well with the linear extrapolation based on the derivatives, which gives x1 = x2 = −2.125, x3 = 4, f = −0.25. •••
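For this example the optimum is available in closed form (our derivation from the active constraints: x1 = x2 = −√(p/2), x3 = 4, hence f*(p) = 4 − √(2p)), so Eq. (5.4.7) can be checked against a finite difference; the sketch and its names are ours:

```python
import numpy as np

# Example 5.4.1: closed-form optimum x1 = x2 = -sqrt(p/2), x3 = 4,
# so f*(p) = 4 - sqrt(2 p).
def f_star(p):
    return 4.0 - np.sqrt(2.0 * p)

lam = np.array([0.25, 1.0])       # multipliers at p = 8 (from Example 5.3.1)
dga_dp = np.array([1.0, 0.0])     # only g1 = p - x1^2 - x2^2 depends on p
df_dp_pred = 0.0 - lam @ dga_dp   # Eq. (5.4.7) with  partial f / partial p = 0

h = 1e-6                          # central finite difference on f*(p)
df_dp_fd = (f_star(8.0 + h) - f_star(8.0 - h)) / (2.0 * h)
print(df_dp_pred, round(df_dp_fd, 6))   # both -0.25
```

The agreement illustrates the point of Section 5.4: the multiplier-based derivative requires no re-optimization, only quantities already available at the optimum.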
5.5 Gradient Projection and Reduced Gradient Methods

Rosen's gradient projection method is based on projecting the search direction into
the subspace tangent to the active constraints. Let us first examine the method for
the case of linear constraints [7]. We define the constrained problem as

minimize f(x)
such that g_j(x) = Σ_{i=1}^{n} a_ji x_i − b_j ≥ 0 ,  j = 1, …, n_g .    (5.5.1)

In vector form,

g(x) = Ax − b ≥ 0 .    (5.5.2)
If we select only the r active constraints (j ∈ I_A), we may write the constraint equations as

g_a(x) = Nᵀx − b_a = 0 ,    (5.5.3)

where g_a is the vector of active constraints, b_a the corresponding right-hand sides, and the columns of the matrix N are the gradients of these constraints. The basic assumption of the gradient projection method is that the next point remains in the subspace tangent to the active constraints. If the move is

x = x_i + αs ,    (5.5.4)

then the search direction s must satisfy

Nᵀs = 0 .    (5.5.5)
If we want the steepest descent direction satisfying Eq. (5.5.5), we can pose the problem as

minimize sᵀ∇f
such that Nᵀs = 0 ,    (5.5.6)
and sᵀs = 1 .

That is, we want to find the direction with the most negative directional derivative which satisfies Eq. (5.5.5). We use Lagrange multipliers λ and μ to form the Lagrangian

ℒ(s, λ, μ) = sᵀ∇f − sᵀNλ − μ(sᵀs − 1) .    (5.5.7)

The condition for ℒ to be stationary is

∂ℒ/∂s = ∇f − Nλ − 2μs = 0 .    (5.5.8)

Premultiplying by Nᵀ and using Eq. (5.5.5) we obtain

λ = (NᵀN)⁻¹Nᵀ∇f ,    (5.5.9)

or

s = (1/2μ) [I − N(NᵀN)⁻¹Nᵀ] ∇f = (1/2μ) P∇f .    (5.5.10)
P is the projection matrix defined in Eq. (5.3.8). The factor of 1/(2μ) is not significant
because s defines only the direction of search, so in general we use s = −P∇f. To
show that P indeed has the projection property, we need to prove that if w is an
arbitrary vector, then Pw is in the subspace tangent to the active constraints, that
is Pw satisfies
NᵀPw = 0 .    (5.5.12)
We can easily verify this by using the definition of P.
Equation (5.3.8), which defines the projection matrix P, does not provide the most efficient way for calculating it. Instead it can be shown that

P = Q2ᵀ Q2 ,    (5.5.13)

where the matrix Q2 consists of the last n − r rows of the Q factor in the QR factorization of N (see Eq. (5.3.9)).
A version of the gradient projection method known as the generalized reduced gradient method was developed by Abadie and Carpentier [8]. As a first step we select r linearly independent rows of N, denote their transpose as N1, and partition Nᵀ as

Nᵀ = [ N1  N2 ] ,    (5.5.14)

where N1 is r × r and nonsingular. Next we consider Eq. (5.5.5) for the components s_i of the direction vector. The r equations corresponding to N1 are then used to eliminate r components of s and obtain a reduced-order problem for the direction vector. Once we have identified N1 we can easily obtain Q2, which is given as

Q2 = [ −(N1⁻¹N2)ᵀ  I ] ,    (5.5.15)

where I is the (n − r) × (n − r) identity matrix.
After a search direction has been determined, a one-dimensional search must be carried out to determine the value of α in Eq. (5.5.4). Unlike the unconstrained case, there is an upper limit on α set by the inactive constraints. As α increases, some
of them may become active and then violated. Substituting x = x_i + αs into Eq. (5.5.2) we obtain, for each inactive constraint,

g_j(x_i + αs) = g_j(x_i) + α a_jᵀs ≥ 0 ,    (5.5.16)

or, for the constraints with a_jᵀs < 0,

α ≤ α_j = −g_j(x_i) / (a_jᵀs) ,    (5.5.17)

so that the maximum step is

ᾱ = min_{j ∉ I_A, α_j > 0} α_j .    (5.5.18)
At the end of the move, new constraints may become active, so that the set of active
constraints may need to be updated before the next move is undertaken.
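The projection move and the step limit of Eqs. (5.5.16)-(5.5.18) can be sketched for linear constraints as follows (a minimal illustration with data and names of our own choosing):

```python
import numpy as np

def projected_step(grad_f, A, b, x, active, tol=1e-10):
    # Linear constraints g_j(x) = A[j] @ x - b[j] >= 0, as in Eq. (5.5.1).
    # Project the steepest-descent direction into the subspace tangent to
    # the active constraints, s = -P grad_f (Eq. (5.5.10)), then bound the
    # step by the inactive constraints, Eqs. (5.5.16)-(5.5.18).
    N = A[active].T                         # columns: active constraint gradients
    P = np.eye(len(x)) - N @ np.linalg.solve(N.T @ N, N.T)   # Eq. (5.3.8)
    s = -P @ grad_f
    alpha_max = np.inf
    for j in range(len(b)):
        if j in active:
            continue
        rate = A[j] @ s                     # d g_j / d alpha along s
        if rate < -tol:                     # this constraint is closing in
            alpha_max = min(alpha_max, (b[j] - A[j] @ x) / rate)
    return s, alpha_max

# Illustrative data: g1 = x1 >= 0 is active, g2 = x2 >= 0 is inactive
A = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([0.0, 0.0])
s, a_max = projected_step(np.array([1.0, 1.0]),
                          A, b, x=np.array([0.0, 2.0]), active=[0])
print(s, a_max)   # direction along the active boundary; g2 limits alpha to 2
```

In the small example the projected direction slides along the active constraint g1 = 0, and the step is capped at the point where the previously inactive constraint g2 becomes active, after which the active set would be updated, exactly as described above.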
The projection need not be applied only to the steepest-descent direction; more generally a direction

s = −B∇f    (5.5.19)

may be used, where B is a suitable positive definite matrix (for example, a quasi-Newton approximation).
The gradient projection method has been generalized by Rosen to nonlinear constraints [9]. The method is based on linearizing the constraints about x_i, so that

g_j(x) ≈ g_j(x_i) + (x − x_i)ᵀ∇g_j(x_i) .    (5.5.21)
[Figure: a projection move in the subspace tangent to the linearized constraints, followed by a restoration move back to the nonlinear constraint boundary.]
Example 5.5.1
N = [ 2 1 0 ; 1 1 0 ; 1 2 0 ; 4 1 1 ] ,  NᵀN = [ 22 9 4 ; 9 7 1 ; 4 1 1 ] ,

(NᵀN)⁻¹ = (1/11) [ 6 −5 −19 ; −5 6 14 ; −19 14 73 ] ,

P = I − N(NᵀN)⁻¹Nᵀ = (1/11) [ 1 −3 1 0 ; −3 9 −3 0 ; 1 −3 1 0 ; 0 0 0 0 ] .

The projection move direction is s = −P∇f = [8/11, −24/11, 8/11, 0]ᵀ. Since the magnitude of a direction vector is unimportant we scale s to sᵀ = [1, −3, 1, 0]. For a 10% improvement in the objective function γ = 0.1, and from Eq. (5.5.25)

ᾱ = −γf / (sᵀ∇f) = −(0.1 × 5)/(−8) = 0.0625 .
For the correction move we need the vector g_a of constraint values, g_aᵀ = (0, −0.1, 0), so the correction is

−N(NᵀN)⁻¹g_a = (1/110) [−4, 1, 7, 0]ᵀ .

Combining the projection and restoration moves, Eq. (5.5.26),

x1 = x0 + 0.0625 [1, −3, 1, 0]ᵀ + (1/110) [−4, 1, 7, 0]ᵀ ,

we get f(x1) = 4.64, g1(x1) = 0, g2(x1) = 0.016. Note that instead of a 10% reduction we obtained only 7%, due to the nonlinearity of the objective function. However, we did satisfy the nonlinear constraint. •••
Example 5.5.2
Consider the four bar truss of Example 5.1.2. The problem of finding the minimum
weight design subject to stress and displacement constraints was formulated as
minimize f = 3x1 + √3 x2

subject to g1 = 3 − 18/x1 − 6√3/x2 ≥ 0 ,
g2 = x1 − 5.73 ≥ 0 ,
g3 = x2 − 7.17 ≥ 0 .

Assume that we start the search at the intersection of g1 = 0 and g3 = 0, where x1 = 11.61, x2 = 7.17, and f = 47.25.
The gradients of the objective function and the two active constraints are

∇f = { 3 ; √3 } ,  ∇g1 = { 0.1335 ; 0.2021 } ,  ∇g3 = { 0 ; 1 } ,  N = [ 0.1335 0 ; 0.2021 1 ] .
Because N is nonsingular, Eq. (5.3.8) shows that P = 0. Also since the number of
linearly independent active constraints is equal to the number of design variables the
tangent subspace is a single point, so that there is no more room for progress. Using
Eqs. (5.3.6) or (5.3.11) we obtain
λ = { 22.47 ; −2.798 } .
The negative multiplier associated with g3 indicates that this constraint can be
dropped from the active set. Now
N = { 0.1335 ; 0.2021 } .
The projection matrix is calculated from Eq. (5.3.8),

P = I − N(NᵀN)⁻¹Nᵀ = [ 0.696 −0.460 ; −0.460 0.304 ] ,

and the search direction is s = −P∇f = { −1.29 ; 0.854 }. For α = 0.988 the projection move produces

x1 = x0 + αs = { 11.61 ; 7.17 } + 0.988 { −1.29 ; 0.854 } = { 10.34 ; 8.01 } .
At the new point the displacement constraint is slightly violated, so a restoration move is needed. Now

N = ∇g1 = { 0.1684 ; 0.1620 } ,

and the restoration move is calculated to be

−N(NᵀN)⁻¹g_a = { 0.118 ; 0.113 } .
Altogether,

x2 = x1 − N(NᵀN)⁻¹g_a = { 10.34 ; 8.01 } + { 0.118 ; 0.113 } = { 10.46 ; 8.12 } ,

with f(x2) = 45.44 and g1(x2) ≈ 0. •••
5.6 The Feasible Directions Method

The feasible directions method [11] has the opposite philosophy to that of the
gradient projection method. Instead of following the constraint boundaries, we try to
stay as far away as possible from them. The typical iteration of the feasible direction
method starts at the boundary of the feasible domain (unconstrained minimization
techniques are used to generate a direction if no constraint is active).
Section 5.6: The Feasible Directions Method
Figure 5.6.1 Selection of search direction using the feasible directions method.
A direction s is feasible if a small move along it does not violate the active constraints,

sᵀ∇g_j ≥ 0, j ∈ I_A , (5.6.1)

and usable if it reduces the objective function,

sᵀ∇f ≤ 0 . (5.6.2)

A direction which is, in some sense, both as feasible and as usable as possible is found by solving the linear program

maximize β
such that −sᵀ∇g_j + θ_j β ≤ 0, j ∈ I_A ,
sᵀ∇f + β ≤ 0 , (5.6.3)
|s_i| ≤ 1, i = 1, …, n .
The θ_j are positive numbers called "push-off" factors because their magnitude determines how far x will move from the constraint boundaries. A value of θ_j = 0 will result in a move tangent to the boundary of the jth constraint, and so may be appropriate for a linear constraint. A large value of θ_j will result in a large angle between the constraint boundary and the move direction, and so is appropriate for a highly nonlinear constraint.
Chapter 5: Constrained Optimization
The optimization problem defined by Eq. (5.6.3) is linear and can be solved using the simplex algorithm. If β_max > 0, we have found a usable feasible direction. If we get β_max = 0 it can be shown that the Kuhn-Tucker conditions are satisfied.
Once a direction of search has been found, the choice of step length is typically based on a prescribed reduction in the objective function (using Eq. (5.5.25)). If at the end of the step no constraints are active, we continue in the same direction as long as sᵀ∇f is negative. We start the next iteration when x hits the constraint boundaries, or use a direction based on an unconstrained technique if x is inside the feasible domain. Finally, if some constraints are violated after the initial step we make x retreat based on the value of the violated constraints. The method of feasible directions is implemented in the popular CONMIN program [12].
Example 5.6.1
Consider the four bar truss of Example 5.1.2. The problem of finding the minimum
weight design subject to stress and displacement constraints was formulated as
minimize f = 3x₁ + √3 x₂
subject to g₁ = 3 − 18/x₁ − 6√3/x₂ ≥ 0,
g₂ = x₁ − 5.73 ≥ 0,
g₃ = x₂ − 7.17 ≥ 0.
The first constraint represents a limit on the vertical displacement, and the other two
constraints represent stress constraints.
Assume that we start the search at the intersection of g₁ = 0 and g₃ = 0, where x₀ᵀ = (11.61, 7.17) and f = 47.25. The gradients of the objective function and the two active constraints are

∇f = (3, √3)ᵀ, ∇g₁ = (0.1335, 0.2021)ᵀ, ∇g₃ = (0, 1)ᵀ .

With push-off factors θ_j = 1, the linear program (5.6.3) for the direction s is
maximize β
subject to −0.1335s₁ − 0.2021s₂ + β ≤ 0 ,
−s₂ + β ≤ 0 ,
3s₁ + √3 s₂ + β ≤ 0 ,
−1 ≤ s₁ ≤ 1 ,
−1 ≤ s₂ ≤ 1 .
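This direction-finding linear program is small enough to check with any LP solver. A minimal sketch using scipy.optimize.linprog (our choice of tool, not one used in the book), with the variable vector z = (s₁, s₂, β):

```python
import numpy as np
from scipy.optimize import linprog

# Direction-finding LP of Example 5.6.1; maximize beta == minimize -beta
c = np.array([0.0, 0.0, -1.0])
A_ub = np.array([
    [-0.1335, -0.2021, 1.0],   # -s^T grad g1 + beta <= 0
    [0.0, -1.0, 1.0],          # -s^T grad g3 + beta <= 0
    [3.0, np.sqrt(3.0), 1.0],  #  s^T grad f  + beta <= 0
])
b_ub = np.zeros(3)
bounds = [(-1, 1), (-1, 1), (None, None)]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
s1, s2, beta = res.x
```

The solver returns the vertex where the g₁ and objective-function rows are tight and s₂ sits at its upper bound, matching the direction quoted in the text.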
The solution of the linear program is s₁ = −0.6172, s₂ = 1, so that the one-dimensional search is

x₁ = {11.61; 7.17} + α{−0.6172; 1} .
Because the objective function is linear, this direction will remain a descent direction indefinitely, and α will be limited only by the constraints. The requirement that g₂ is not violated leads to α = 9.527, x₁ = 5.73, x₂ = 16.7, which violates g₁. We see that because g₁ is nonlinear, even though we start the search by moving away from it we still bump into it again (see Figure 5.6.2). It can be easily checked that for α > 5.385 we violate g₁. So we take α = 5.385 and obtain x₁ = 8.29, x₂ = 12.56, f = 46.62.
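The limiting step α = 5.385 can be recovered by solving g₁(x₀ + αs) = 0 numerically, for example by bisection (a sketch; the book does not specify the root-finding procedure):

```python
import numpy as np

# Find the step at which the nonlinear constraint g1 is re-encountered
# along s = (-0.6172, 1) from x0 = (11.61, 7.17) (Example 5.6.1).
def g1_along(alpha):
    x1 = 11.61 - 0.6172 * alpha
    x2 = 7.17 + alpha
    return 3.0 - 18.0 / x1 - 6.0 * np.sqrt(3.0) / x2

lo, hi = 1.0, 9.527          # g1 > 0 at lo, g1 < 0 at hi
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if g1_along(mid) > 0.0:
        lo = mid
    else:
        hi = mid
alpha = 0.5 * (lo + hi)      # about 5.385
```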
At the new point the gradients are

∇g₁ = (0.2619, 0.0659)ᵀ , ∇f = (3, √3)ᵀ .
The linear program for obtaining s is

maximize β
subject to −0.2619s₁ − 0.0659s₂ + β ≤ 0 ,
3s₁ + √3 s₂ + β ≤ 0 ,
−1 ≤ s₁ ≤ 1 ,
−1 ≤ s₂ ≤ 1 .
The solution of the linear program is s₁ = 0.5512, s₂ = −1, so that the one-dimensional search is

x = {8.29; 12.56} + α{0.5512; −1} .
Again α is limited only by the constraints. The lower limit on x₂ dictates α ≤ 5.35. However, the constraint g₁ is again more critical. It can be verified that for α > 4.957 it is violated, so we take α = 4.957, x₁ = 11.02, x₂ = 7.60, f = 46.22. The optimum design found in Example 5.1.2 is x₁ = x₂ = 9.464, f = 44.78. The design space and the two iterations are shown in Figure 5.6.2. •••
5.7 Penalty Function Methods

When the energy crisis erupted in the middle seventies, the United States Congress
passed legislation intended to reduce the fuel consumption of American cars. The
target was an average fuel economy of 27.5 miles per gallon for new cars in 1985.
Rather than simply legislate this limit Congress took a gradual approach, with a
different limit set each year to bring up the average from about 14 miles per gallon
to the target value. Thus the limit was set at 26 for 1984, 25 for 1983, 24 for 1982,
and so on. Furthermore, the limit was not absolute, but there was a fine of $50 per
0.1 miles per gallon violation per car.
This approach to constraining the automobile companies to produce fuel efficient
cars has two important aspects. First, by legislating a penalty proportional to the
violation rather than an absolute limit, the government allowed the auto companies
more flexibility. That meant they could follow a time schedule that approximated
the government schedule without having to adhere to it rigidly. Second, the gradual
approach made enforcement easier politically. Had the government simply set the ul-
timate limit for 1985 only, nobody would have paid attention to the law in the 1970's.
Then as 1985 moved closer there would have been a rush to develop fuel efficient cars.
The hurried effort could mean both non-optimal car designs and political pressure to
delay the enforcement of the law.
The fuel efficiency law is an example in which constraints on behavior or eco-
nomic activities are imposed via penalties whose magnitude depends on the degree of
violation of the constraints. It is no wonder that this simple and appealing approach
has found application in constrained optimization. Instead of applying constraints
we replace them by penalties which depend on the degree of constraint violations.
This approach is attractive because it replaces a constrained optimization problem
by an unconstrained one.
The penalties associated with constraint violation have to be high enough so
that the constraints are only slightly violated. However, just as there are political
problems associated with imposing abrupt high penalties in real life, so there are
numerical difficulties associated with such a practice in numerical optimization. For
this reason we opt for a gradual approach where we start with small penalties and
increase them gradually.
Section 5.7: Penalty Function Methods
5.7.1 Exterior Penalty Function
The constrained problem

minimize f(x)
such that h_i(x) = 0, i = 1, …, n_e , (5.7.1)
g_j(x) ≥ 0, j = 1, …, n_g ,

is replaced by

minimize φ(x, r) = f(x) + r Σ_{i=1}^{n_e} h_i²(x) + r Σ_{j=1}^{n_g} ⟨−g_j⟩² , (5.7.2)

where the minimization is repeated for a sequence of increasing values of the penalty multiplier, r = r_i → ∞,
and ⟨a⟩ denotes the positive part of a, or max(a, 0). The inequality terms are
treated differently from the equality terms because the penalty applies only for con-
straint violation. The positive multiplier r controls the magnitude of the penalty
terms. It may seem logical to choose a very high value of r to ensure that no con-
straints are violated. However, as noted before, this approach leads to numerical
difficulties illustrated later in an example. Instead the minimization is started with
a relatively small value of r, and then r is gradually increased. A typical value for
ri+1/r; is 5. A typical plot of ¢(x, r) as a function of r is shown in Figure 5.7.1 for a
simple example.
Figure 5.7.1 Exterior penalty function for f(x) = 0.5x subject to x − 4 ≥ 0, for several values of r.
Note the high values of the curvature of φ associated with large values of r, which often lead to numerical difficulties. By using a sequence of values of r, we use the minima obtained for small values of r as starting points for the search with higher r values. Thus the ill-conditioning associated with the large curvature is counterbalanced by the availability of a good starting point.
Based on the type of constraint normalization given by Eq. (5.2) we can select
a reasonable starting value for the penalty multiplier r. A rule of thumb is that
one should start with the total penalty being about equal to the objective function
for typical constraint violation of 50% of the response limits. In most optimization
problems the total number of active constraints is about the same as or just slightly
lower than the number of design variables. Assuming we start with one quarter of
the eventual active constraints being violated by about 50% (or g = -0.5) then we
have

f(x₀) = (n/4) r₀ (0.5)² , or r₀ = 16 f(x₀)/n . (5.7.3)
It is also important to obtain a good starting point for restarting the optimization as r is increased. The minimum of the optimization for the previous value of r is a reasonable starting point, but one can do better. Fiacco and McCormick [13] show that the position of the minimum of φ(x, r) has the asymptotic form

x*(r) = a + b/r , as r → ∞ . (5.7.4)

Once the optimum has been found for two values of r, say r_{i−1} and r_i, the vectors a and b may be estimated, and the value of x*(r) predicted for subsequent values of r.
It is easy to check that in order to satisfy Eq. (5.7.4), a and b are given as

a = [c x*(r_{i−1}) − x*(r_i)] / (c − 1) ,
b = [x*(r_{i−1}) − a] r_{i−1} , (5.7.5)

where

c = r_{i−1}/r_i . (5.7.6)
In addition to predicting a good value of the design variables for restarting the optimization for the next value of r, Eq. (5.7.4) provides us with a useful convergence criterion, namely

‖x* − a‖ ≤ ε₁ , (5.7.7)

where a is estimated from the last two values of r, and ε₁ is a specified tolerance chosen to be small compared to a typical value of ‖x‖.
A second convergence criterion is based on the magnitude of the penalty terms, which, as shown in Example 5.7.1, go to zero as r goes to infinity. Therefore, a reasonable convergence criterion is

[φ(x*, r) − f(x*)] / f(x*) ≤ ε₂ . (5.7.8)
Finally, a criterion based on the change in the value of the objective function at the minimum f* is also used,

| f*(r_i) − f*(r_{i−1}) | / f*(r_i) ≤ ε₃ . (5.7.9)

A typical value for ε₂ or ε₃ is 0.001.
Example 5.7.1

Consider the problem

minimize f(x) = x₁² + 10x₂²
such that h(x) = x₁ + x₂ − 4 = 0 .

The augmented function is

φ(x, r) = x₁² + 10x₂² + r(x₁ + x₂ − 4)² ,

and its gradient is

g = { 2x₁(1 + r) + 2rx₂ − 8r
      2x₂(10 + r) + 2rx₁ − 8r } .

Setting the gradient to zero yields the minimum of φ for each value of r; the results for an increasing sequence of r values are given in Table 5.7.1.

Table 5.7.1 Exterior penalty function minimization for Example 5.7.1

r        x₁      x₂       f       φ
1        1.905   0.1905   3.99    7.62
10       3.333   0.3333   12.22   13.33
100      3.604   0.3604   14.29   14.42
1000     3.633   0.3633   14.52   14.53

It can be seen that as r is increased the solution converges to the exact solution of xᵀ = (3.636, 0.3636), f = 14.54. The convergence is indicated by the shrinking difference between the objective function and the augmented function φ. The Hessian of φ is given as

H = [ 2 + 2r    2r
      2r       20 + 2r ] .

As r increases this matrix becomes more and more ill-conditioned, as all four components become approximately 2r.
This ill-conditioning of the Hessian matrix for large
values of r often occurs when the exterior penalty function is used, and can cause
numerical difficulties for large problems.
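Both trends, convergence of x*(r) and growth of the Hessian condition number, are easy to verify numerically. The closed form x₁ = 40r/(10 + 11r) below follows from setting the gradient of φ to zero for this example:

```python
import numpy as np

# Exterior penalty for f = x1^2 + 10 x2^2, h = x1 + x2 - 4 (Example 5.7.1)
conds = []
for r in [1.0, 10.0, 100.0, 1000.0]:
    x1 = 40.0 * r / (10.0 + 11.0 * r)   # minimizer of phi; x2 = x1/10
    x2 = x1 / 10.0
    H = np.array([[2.0 + 2.0 * r, 2.0 * r],
                  [2.0 * r, 20.0 + 2.0 * r]])
    conds.append(np.linalg.cond(H))
    print(r, round(x1, 3), round(x2, 4), round(conds[-1], 1))

# x*(r) approaches the exact solution (3.636, 0.3636)
# while the condition number of H grows roughly linearly in r
```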
We can use Table 5.7.1 to test the extrapolation procedure, Eq. (5.7.4). For example, with the values of r = 1 and r = 10, Eq. (5.7.5) gives

a = [0.1 x*(1) − x*(10)] / (−0.9) = (3.492, 0.3492)ᵀ ,
b = [x*(1) − a] · 1 = (−1.587, −0.1587)ᵀ .

We can now use Eq. (5.7.4) to find a starting point for the optimization for r = 100,

a + b/100 = (3.476, 0.3476)ᵀ ,

which is substantially closer to x*(100) = (3.604, 0.3604)ᵀ than is x*(10) = (3.333, 0.3333)ᵀ. •••
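Using the closed-form minimizers of this example, the extrapolation of Eqs. (5.7.4)-(5.7.6) can be tested directly (a sketch; rounded hand-calculated values may differ slightly):

```python
import numpy as np

# Extrapolation x*(r) = a + b/r from the minima at r = 1 and r = 10
def xstar(r):
    x1 = 40.0 * r / (10.0 + 11.0 * r)
    return np.array([x1, x1 / 10.0])

c = 1.0 / 10.0                          # c = r_{i-1}/r_i
a = (c * xstar(1.0) - xstar(10.0)) / (c - 1.0)
b = (xstar(1.0) - a) * 1.0              # b = [x*(r_{i-1}) - a] r_{i-1}
pred = a + b / 100.0                    # predicted start for r = 100

# the prediction is much closer to x*(100) than the previous minimum is
err_pred = np.linalg.norm(pred - xstar(100.0))
err_prev = np.linalg.norm(xstar(10.0) - xstar(100.0))
```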
With the exterior penalty function, constraints contribute penalty terms only when they are violated. As a result, the design typically moves in the infeasible domain. If the minimization is terminated before r becomes very large (for example, because of a shortage of computer resources) the resulting designs may be useless. When only
inequality constraints are present, it is possible to define an interior penalty function
that keeps the design in the feasible domain. The common form of the interior penalty
method replaces the inequality constrained problem
minimize f(x)
such that g_j(x) ≥ 0, j = 1, …, n_g , (5.7.10)
by

φ(x, r) = f(x) + r Σ_{j=1}^{n_g} 1/g_j(x) , r = r_i → 0, r_i > 0 . (5.7.11)

Figure 5.7.2 Interior penalty function for f(x) = 0.5x subject to x − 4 ≥ 0.
The penalty term is proportional to 1/g_j and becomes infinitely large at the boundary of the feasible domain, creating a barrier there (interior penalty function methods are sometimes called barrier methods). It is assumed that the search is confined to the feasible domain; otherwise, the penalty becomes negative, which does not make any sense. Figure 5.7.2 shows the application of the interior penalty function to the simple example used for the exterior penalty function in Figure 5.7.1.
Besides the inverse penalty function defined in Eq. (5.7.11), there has been some use of a logarithmic interior penalty function

φ(x, r) = f(x) − r Σ_{j=1}^{n_g} log(g_j(x)) . (5.7.12)
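For the one-variable example of Figure 5.7.2 the barrier subproblem can be solved in closed form: setting the derivative of φ = 0.5x + r/(x − 4) to zero gives x*(r) = 4 + √(2r), which approaches the constrained optimum x = 4 from the feasible side as r → 0. A quick numerical check:

```python
import numpy as np

# Interior (barrier) penalty for f(x) = 0.5x subject to x - 4 >= 0.
# phi = 0.5 x + r/(x - 4) has its minimum at x*(r) = 4 + sqrt(2 r).
xs = []
for r in [1.0, 0.1, 0.01, 0.001]:
    x = 4.0 + np.sqrt(2.0 * r)
    dphi = 0.5 - r / (x - 4.0) ** 2   # stationarity check: phi'(x*) = 0
    assert abs(dphi) < 1e-12
    xs.append(x)
```

Note that x*(r) = 4 + √2 · r^{1/2} has exactly the asymptotic form a + b r^{1/2} discussed later for interior penalty functions.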
While the interior penalty function has the advantage over the exterior one in
that it produces a series of feasible designs, it also requires a feasible starting point.
Unfortunately, it is often difficult to find such a feasible starting design. Also, because
of the use of approximation (see Chapter 6), it is quite common for the optimization
process to stray occasionally into the infeasible domain. For these reasons it may be
advantageous to use a combination of interior and exterior penalty functions called
an extended interior penalty function. An example is the quadratic extended interior
penalty function of Haftka and Starnes [14]
φ(x, r) = f(x) + r Σ_{j=1}^{n_g} p(g_j) , r = r_i → 0 , (5.7.13)

where

p(g_j) = 1/g_j ,  if g_j ≥ g₀ ,
p(g_j) = (1/g₀)[(g_j/g₀)² − 3(g_j/g₀) + 3] ,  if g_j < g₀ . (5.7.14)
It is easy to check that p(g_j) has continuity up to second derivatives at g_j = g₀. The transition parameter g₀, which defines the boundary between the interior and exterior parts of the penalty terms, must be chosen so that the penalty associated with the constraint, rp(g_j), becomes infinite for negative g_j as r tends to zero. This results in the requirement that

r/g₀³ → ∞ , as r → 0 . (5.7.15)

This can be achieved by selecting g₀ as

g₀ = c r^{1/2} , (5.7.16)
where c is a constant.
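The smooth transition at g₀ can be verified numerically. The sketch below implements the penalty term p(g_j) of Eq. (5.7.14) and compares one-sided finite differences across the transition point:

```python
# Quadratic extended interior penalty term, Eq. (5.7.14)
def p(g, g0):
    if g >= g0:
        return 1.0 / g
    q = g / g0
    return (q * q - 3.0 * q + 3.0) / g0

g0, eps = 0.1, 1e-6
# value and slope match across g = g0
left = p(g0 - eps, g0)
right = p(g0 + eps, g0)
dleft = (p(g0, g0) - p(g0 - eps, g0)) / eps
dright = (p(g0 + eps, g0) - p(g0, g0)) / eps
```

At g₀ = 0.1 both branches give p = 10 and slope −1/g₀² = −100, so the penalty and its derivative are continuous as claimed.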
It is also possible to include equality constraints with interior and extended interior penalty functions. For example, the interior penalty function, Eq. (5.7.11), is augmented as

φ(x, r) = f(x) + r Σ_{j=1}^{n_g} 1/g_j(x) + r^{−1/2} Σ_{i=1}^{n_e} h_i²(x) . (5.7.17)
Figure 5.7.3 Extended interior penalty function for f(x) = 0.5x subject to g(x) = x − 4 ≥ 0.
The considerations for the choice of an initial value of r are similar to those for
the exterior penalty function. A reasonable choice for the interior penalty function
would require that n/4 active constraints at g = 0.5 (that is 50% margin for properly
normalized constraints) would result in a total penalty equal to the objective function.
Using the same reasoning as for Eq. (5.7.3) we obtain

f(x) = (n/4)(r/0.5) , or r = 2f(x)/n .

For the extended interior penalty function it is more reasonable to assume that the n/4 constraints are critical (g = 0), so that from Eq. (5.7.13)

f(x) = r(n/4)(3/g₀) , or r = (4/3) g₀ f(x)/n .
A reasonable starting value for g₀ is 0.1. As for the exterior penalty function, it is possible to obtain an expression for the asymptotic (as r → 0) coordinates of the minimum of φ as [10]

x*(r) = a + b r^{1/2} , as r → 0 , (5.7.18)

and

f*(r) = ā + b̄ r^{1/2} , as r → 0 .
a, b, ā and b̄ may be estimated once the minimization has been carried out for two values of r. For example, the estimates for a and b are

a = [c^{1/2} x*(r_{i−1}) − x*(r_i)] / (c^{1/2} − 1) ,
b = [x*(r_{i−1}) − a] / r_{i−1}^{1/2} , (5.7.19)

where c = r_i/r_{i−1}. As in the case of the exterior penalty function, these expressions may be used for convergence tests and extrapolation.
5.7.3 Unconstrained Minimization with Penalty Functions
For the exterior penalty function applied to the equality constrained problem, the second derivatives of the augmented function φ of Eq. (5.7.2) are

∂²φ/∂x_k∂x_l = ∂²f/∂x_k∂x_l + 2r Σ_{i=1}^{n_e} (∂h_i/∂x_k)(∂h_i/∂x_l) + 2r Σ_{i=1}^{n_e} h_i ∂²h_i/∂x_k∂x_l . (5.7.21)

Because of the equality constraint, h_i is close to zero, especially in the later stages
of the optimization (large r), and we can neglect the last term in Eq. (5.7.21). For
large values of r we can also neglect the first term, so that we can calculate second
derivatives of ¢ based on first derivatives of the constraints. The availability of
inexpensive second derivatives permits the use of Newton's method where the number
of iterations is typically independent of the number of design variables. Quasi-Newton
and conjugate gradient methods, on the other hand, require a number of iterations
proportional to the number of design variables. Thus the use of Newton's method
becomes attractive when the number of design variables is large. The application of
Newton's method with the above approximation of second derivatives is known as
the Gauss-Newton method.
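A minimal sketch of a Gauss-Newton step for the exterior penalty function: the Hessian of φ is approximated from first derivatives of the constraints only. For Example 5.7.1 the approximation is exact, since h is linear, so a single Newton step from the origin lands on the minimizer of φ:

```python
import numpy as np

r = 10.0
def grad_phi(x):
    # gradient of phi = x1^2 + 10 x2^2 + r (x1 + x2 - 4)^2
    h = x[0] + x[1] - 4.0
    return np.array([2.0 * x[0] + 2.0 * r * h,
                     20.0 * x[1] + 2.0 * r * h])

# Gauss-Newton Hessian: d2f + 2 r (dh)(dh)^T, neglecting the h * d2h term
dh = np.array([1.0, 1.0])
H = np.diag([2.0, 20.0]) + 2.0 * r * np.outer(dh, dh)

x = np.zeros(2)
x = x - np.linalg.solve(H, grad_phi(x))   # one Newton step
```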
For the interior penalty function we have a similar situation. The second derivatives of the augmented objective function φ of Eq. (5.7.11) are

∂²φ/∂x_k∂x_l = ∂²f/∂x_k∂x_l + 2r Σ_{j=1}^{n_g} (1/g_j³)(∂g_j/∂x_k)(∂g_j/∂x_l) − r Σ_{j=1}^{n_g} (1/g_j²) ∂²g_j/∂x_k∂x_l . (5.7.23)
Now the argument for neglecting the first and last terms in Eq. (5.7.23) is somewhat lengthier. First we observe that because of the 1/g_j³ term, the second derivatives are dominated by the critical constraints (g_j small). For these constraints the last term in Eq. (5.7.23) is negligible compared to the first-derivative term because g_j is small. Finally, from Eq. (5.7.18) it can be shown that r/g_j³ goes to infinity for active constraints as r goes to zero, so that the first term in Eq. (5.7.23) can be neglected compared to the second. The same argument can also be used for extended interior penalty functions [14].
The power of the Gauss-Newton method is shown in [14] for a high-aspect-ratio wing made of composite materials (see Figure 5.7.4) designed subject to stress and displacement constraints.
Figure 5.7.4 Aerodynamic planform and structural box for high-aspect-ratio wing, from [14].
The structural box of the wing was modeled with a finite element model with
67 nodes and 290 finite elements. The number of design variables controlling the
thickness of the various elements was varied from 13 to 146. The effect of the number
of design variables on the number of iterations (analyses) is shown in Table 5.7.2.
The augmented function is then

φ(x, r, s) = f(x) + r Σ_{j=1}^{n_g} 1/g_j(x) + s Σ_{i=1}^{n} ψ_d(x_i) , (5.7.25)

where s is a penalty multiplier for non-discrete values of the design variables, and ψ_d(x_i) is the penalty term for non-discrete values of the ith design variable. Different forms for the discrete penalty function are possible. The penalty terms ψ_d(x_i) are assumed to take the following sine-function form in Ref. [16],
ψ_d(x_i) = ½ { sin( 2π[x_i − ¼(d_{i(j+1)} + 3d_{ij})] / (d_{i(j+1)} − d_{ij}) ) + 1 } , d_{ij} ≤ x_i ≤ d_{i(j+1)} . (5.7.26)
While penalizing the non-discrete valued design variables, the functions ψ_d(x_i) assure the continuity of the first derivatives of the augmented function at the discrete values of the design variables. The response surfaces generated by Eq. (5.7.25) are
determined according to the values of the penalty multipliers rand s. In contrast
to the multiplier r, which initially has a large value and decreases as we move from
one iteration to another, the value of the multiplier s is initially zero and increases
gradually.
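The behavior of the sine penalty term, Eq. (5.7.26), is easy to check: it vanishes with zero slope at the discrete values and peaks midway between them. A sketch for the interval [1.0, 1.5] of the discrete set used in Example 5.7.2:

```python
import numpy as np

def psi_d(x, dj, dj1):
    # sine-shaped non-discrete penalty of Eq. (5.7.26) on [dj, dj1]
    return 0.5 * (np.sin(2.0 * np.pi * (x - 0.25 * (dj1 + 3.0 * dj))
                         / (dj1 - dj)) + 1.0)

vals = [psi_d(x, 1.0, 1.5) for x in (1.0, 1.25, 1.5)]
# vanishes at the discrete values 1.0 and 1.5, peaks at the midpoint
```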
One of the important factors in the application of the proposed method is to
determine when to activate s, and how fast to increase it to obtain discrete optimum
design. Clearly, if the initial value of s is too big and introduced too early in the
design process, the design variables will be trapped away from the global minimum,
resulting in a sub-optimal solution. To avoid this problem, the multiplier s has to be
activated after optimization of several response surfaces which include only constraint
penalty terms. In fact, since the optimum design with discrete values is sometimes far from the continuous optimum, the penalty terms for the non-discrete values are activated only after the constraint penalty has become small compared to the objective function,

(φ − f)/f ≤ ε_c . (5.7.27)

A typical value for ε_c is 0.01. The magnitude of the non-discrete penalty multiplier,
s, at the first discrete iteration is calculated such that the penalty associated with the discrete-valued design variables that are not at their allowed values is of the order of 10 percent of the constraint penalty; that is,

s = 0.1 r Σ_{j=1}^{n_g} 1/g_j(x) . (5.7.28)
As the iteration for discrete optimization proceeds, the non-discrete penalty multiplier
for the new iteration is increased by a factor of the order of 10. It is also important to
decide how to control the penalty multiplier for the constraints, r, during the discrete
optimization process. If r is decreased for each discrete optimization iteration as in
the continuous optimization process, the design can be stalled due to high penalties
for constraint violation. Thus, it is suggested that the penalty multiplier r be frozen at
the end of the continuous optimization process. However, the nearest discrete solution
at this response surface may not be a feasible design, in which case the design must
move away from the continuous optimum by moving back to the previous response
surface. This can be achieved by increasing the penalty multiplier, r, by a factor of
10.
The solution process for the discrete optimization is terminated if the design variables are sufficiently close to the prescribed discrete values. The convergence criterion for discrete optimization is

min_j | x_i − d_{ij} | ≤ ε_d , i = 1, …, n . (5.7.29)
Example 5.7.2
Cross-sectional areas of the members of the two-bar truss shown in Figure 5.7.5 are to be selected from a discrete set of values, A_i ∈ {1.0, 1.5, 2.0}, i = 1, 2. Determine the minimum weight structure using the modified penalty function approach such that the horizontal displacement u at the point of application of the force does not exceed (2/3)(Fl/E). Use a tolerance ε_c = 0.1 for the activation of the penalty terms for non-discrete valued design variables, and a convergence tolerance for the design variables ε_d = 0.001.
The minima of the augmented function as functions of the penalty multiplier r are shown in Table 5.7.3. After four iterations the constraint penalty (φ − f) is within the desired range of the objective function to activate the penalty terms for the non-discrete values of the design variables.
From Eq. (5.7.25) the augmented function for the modified penalty function approach has the form

φ = x₁ + x₂ + r/(1.5 − 1/x₁ − 1/x₂) + (s/2){1 + sin[4π(x₁ − 1.125)]} + (s/2){1 + sin[4π(x₂ − 1.125)]} .
Table 5.7.3 Minimization of φ without the discrete penalty

r        x₁      x₂      f       g        φ
start    5.000   5.000   10.00   1.100    —
11       3.544   3.544   7.089   0.9357   18.844
1.1      2.033   2.033   4.065   0.5160   6.197
0.11     1.554   1.554   3.109   0.2134   3.624
0.011    1.403   1.403   2.807   0.0747   2.954
The minimum of the augmented function can again be obtained by setting the gradient to zero. By symmetry x₁ = x₂, and the stationarity condition is

1 − r/[(1.5 − 2/x₁)²x₁²] + 2πs cos[4π(x₁ − 1.125)] = 0 ,
which can be solved numerically. The initial value of the penalty multiplier s is
calculated from Eq. (5.7.28),

s = 0.1 (0.011/0.0747) = 0.0147 .
The minima of the augmented function (which includes the penalty for the non-
discrete valued variables) are shown in Table 5.7.4 as a function of s.
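The stationarity condition above can be solved by bisection. A sketch for the first discrete iteration (r = 0.011, s = 0.0147), bracketing the root just above the previous continuous minimum x = 1.403:

```python
import numpy as np

r, s = 0.011, 0.0147

def dphi(x):
    # derivative of the symmetric augmented function (x1 = x2 = x)
    g = 1.5 - 2.0 / x
    return (1.0 - r / (g * g * x * x)
            + 2.0 * np.pi * s * np.cos(4.0 * np.pi * (x - 1.125)))

lo, hi = 1.403, 1.45        # dphi(lo) < 0 < dphi(hi)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if dphi(mid) < 0.0:
        lo = mid
    else:
        hi = mid
x = 0.5 * (lo + hi)         # slightly above the continuous minimum 1.403
```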
5.8 Multiplier Methods

Multiplier methods combine the use of Lagrange multipliers with penalty functions. When only Lagrange multipliers are employed the optimum is a stationary point rather than a minimum of the Lagrangian function. When only penalty functions are employed we have a minimum but also ill-conditioning. By using both we may hope to get an unconstrained problem where the function to be minimized does not suffer from ill-conditioning. A good survey of multiplier methods was conducted by
Section 5.8: Multiplier Methods
Bertsekas [17]. We study first the use of multiplier methods for equality constrained
problems.
minimize f(x)
such that h_j(x) = 0, j = 1, …, n_e . (5.8.1)

The augmented Lagrangian function is defined as

L(x, λ, r) = f(x) − Σ_{j=1}^{n_e} λ_j h_j(x) + r Σ_{j=1}^{n_e} h_j²(x) . (5.8.2)
If all the Lagrange multipliers are set to zero, we get the usual exterior penalty function. On the other hand, if we use the correct values of the Lagrange multipliers, λ_j*, it can be shown that we get the correct minimum of problem (5.8.1) for any positive value of r. Then there is no need to use the large value of r required for the exterior penalty function. Of course, we do not know the correct values of the Lagrange multipliers in advance.
Multiplier methods are based on estimating the Lagrange multipliers. When the estimates are good, it is possible to approach the optimum without using large r values. The value of r needs to be only large enough so that L has a minimum rather than a stationary point at the optimum. To obtain an estimate for the Lagrange multipliers we compare the stationarity conditions for L,
∂f/∂x_i + Σ_{j=1}^{n_e} (2r h_j − λ_j) ∂h_j/∂x_i = 0 , (5.8.3)

with the optimality conditions for problem (5.8.1),

∂f/∂x_i − Σ_{j=1}^{n_e} λ_j* ∂h_j/∂x_i = 0 . (5.8.4)

Comparing the two, we see that

λ_j* ≈ λ_j − 2r h_j (5.8.5)

as the minimum is approached. Based on this relation, Hestenes [18] suggested using Eq. (5.8.5) as an estimate for λ_j. That is,

λ_j^{(i+1)} = λ_j^{(i)} − 2r h_j(x^{(i)}) . (5.8.6)
Example 5.8.1

Consider the problem of Example 5.7.1,

minimize f(x) = x₁² + 10x₂²
such that h(x) = x₁ + x₂ − 4 = 0 .

The augmented Lagrangian is

L(x, λ, r) = x₁² + 10x₂² − λ(x₁ + x₂ − 4) + r(x₁ + x₂ − 4)² ,

and the stationarity conditions ∂L/∂x₁ = ∂L/∂x₂ = 0 yield

x₁ = 10x₂ = (5λ + 40r)/(10 + 11r) .

We want to compare the results with those of Example 5.7.1, so we start with the same initial r value r₀ = 1 and the initial estimate λ = 0, and get x₁ = (1.905, 0.1905)ᵀ, h = −1.905. Estimating λ from Eq. (5.8.6), λ = 0 − 2(1)(−1.905) = 3.81, and increasing r to 10, we obtain

x₂ = (3.492, 0.3492)ᵀ , h = −0.1587 .

For the same value of r, we obtained in Example 5.7.1 x = (3.333, 0.3333)ᵀ, so that we are now closer to the exact solution of x = (3.636, 0.3636)ᵀ. Now we estimate a new λ from Eq. (5.8.6), λ = 3.81 − 2(10)(−0.1587) = 6.98, and, keeping r = 10,

x₃ = (3.624, 0.3624)ᵀ , h = −0.0136 ,

which shows that good convergence can be obtained without increasing r. •••
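The multiplier iteration of this example can be scripted using the closed-form stationary point of L = f − λh + rh², namely x₁ = 10x₂ = (5λ + 40r)/(10 + 11r), with the update λ ← λ − 2rh:

```python
# Multiplier method for f = x1^2 + 10 x2^2, h = x1 + x2 - 4 (Example 5.8.1)
lam, r = 0.0, 1.0
hist = []
for _ in range(6):
    x1 = (5.0 * lam + 40.0 * r) / (10.0 + 11.0 * r)
    h = 1.1 * x1 - 4.0          # h = x1 + x2 - 4 with x2 = x1/10
    lam = lam - 2.0 * r * h     # multiplier update, Eq. (5.8.6)
    hist.append((x1, h, lam))
    r = 10.0                    # r raised once, then held fixed
```

After a handful of multiplier updates the iterates converge to the exact solution x₁ = 3.636 without r ever exceeding 10, which is the whole point of the method.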
Section 5.9: Projected Lagrangian Methods (Sequential Quadratic Programming)
There are several ways to extend the multiplier method to deal with inequality constraints. The formulation below is based on Fletcher's work [19]. The constrained problem that we examine is

minimize f(x)
such that g_j(x) ≥ 0, j = 1, …, n_g . (5.8.7)
The problem is replaced by the unconstrained minimization of the augmented Lagrangian

L(x, λ, r) = f(x) + r Σ_{j=1}^{n_g} ⟨λ_j/2r − g_j⟩² , (5.8.8)

where ⟨a⟩ = max(a, 0). The stationarity conditions for L are

∂f/∂x_i − Σ_{j=1}^{n_g} 2r ⟨λ_j/2r − g_j⟩ ∂g_j/∂x_i = 0 , (5.8.9)

which we compare with the Kuhn-Tucker conditions at the optimum,

∂f/∂x_i − Σ_{j=1}^{n_g} λ_j* ∂g_j/∂x_i = 0 , (5.8.10)

where it is also required that λ_j* g_j = 0. Comparing Eqs. (5.8.9) and (5.8.10) we expect an estimate for λ_j of the form

λ_j^{(i+1)} = ⟨λ_j^{(i)} − 2r g_j(x^{(i)})⟩ . (5.8.11)
5.9 Projected Lagrangian Methods (Sequential Quadratic Programming)

Projected Lagrangian (or sequential quadratic programming) methods solve a sequence of quadratic approximations of the problem

minimize f(x)
such that g_j(x) ≥ 0, j = 1, …, n_g . (5.9.1)

Assume that at the ith iteration the design is at x_i, and we seek a move direction s. The direction s is the solution of the following quadratic programming problem

minimize φ(s) = f(x_i) + sᵀ∇f(x_i) + ½ sᵀA(x_i, λ_i)s
such that g_j(x_i) + sᵀ∇g_j(x_i) ≥ 0, j = 1, …, n_g , (5.9.2)

where A is an approximation to the Hessian of the Lagrangian function.
Once the direction is found, a one-dimensional search is performed to minimize the function

ψ(α) = f(x_i + αs) + Σ_{j=1}^{n_g} μ_j | min(0, g_j(x_i + αs)) | , (5.9.4)

where the μ_j are equal to the absolute values of the Lagrange multipliers for the first iteration, and thereafter

μ_j^{(i)} = max[ |λ_j^{(i)}| , ½( μ_j^{(i−1)} + |λ_j^{(i)}| ) ] , (5.9.5)
with the superscript i denoting the iteration number. The matrix A is initialized to some positive definite matrix (e.g., the identity matrix) and then updated using a BFGS type equation (see Chapter 4),

A_new = A − (AΔxΔxᵀA)/(ΔxᵀAΔx) + (ΔlΔlᵀ)/(ΔxᵀΔl) , (5.9.6)

where

Δx = x_{i+1} − x_i , (5.9.7)
Δl = ∇_x L(x_{i+1}, λ) − ∇_x L(x_i, λ) , (5.9.8)
where L is the Lagrangian function and ∇_x denotes the gradient of the Lagrangian function with respect to x. To guarantee the positive definiteness of A, Δl is modified if ΔxᵀΔl ≤ 0.2ΔxᵀAΔx, and replaced by

Δl_new = θΔl + (1 − θ)AΔx ,

where

θ = 0.8 ΔxᵀAΔx / (ΔxᵀAΔx − ΔxᵀΔl) . (5.9.9)
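A sketch of this safeguarded update, tried with A = I and the Δx, Δl values that arise in Example 5.9.1 below (the Δl digits are our reconstruction); the damping branch is not triggered here, and the updated matrix stays positive definite and satisfies the secant condition A_new Δx = Δl:

```python
import numpy as np

def bfgs_update(A, dx, dl):
    # damped BFGS update of Eqs. (5.9.6)-(5.9.9)
    if dx @ dl <= 0.2 * (dx @ A @ dx):
        theta = 0.8 * (dx @ A @ dx) / (dx @ A @ dx - dx @ dl)
        dl = theta * dl + (1.0 - theta) * (A @ dx)
    Adx = A @ dx
    return (A - np.outer(Adx, Adx) / (dx @ Adx)
              + np.outer(dl, dl) / (dx @ dl))

A = np.eye(2)
dx = np.array([-2.84, 1.88])
dl = np.array([-1.29, 0.96])
Anew = bfgs_update(A, dx, dl)
eigs = np.linalg.eigvalsh(Anew)    # both eigenvalues positive
```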
Example 5.9.1
Consider the four bar truss of Example 5.1.2. The problem of finding the minimum weight design subject to stress and displacement constraints was formulated as

minimize f = 3x₁ + √3 x₂
subject to g₁ = 3 − 18/x₁ − 6√3/x₂ ≥ 0,
g₂ = x₁ − 5.73 ≥ 0,
g₃ = x₂ − 7.17 ≥ 0.

We start the search at x₀ᵀ = (11.61, 7.17), where g₁ = g₃ = 0 and f = 47.25, with A initialized to the identity matrix. The gradients of the objective function and the two active constraints are

∇f = (3, √3)ᵀ, ∇g₁ = (0.1335, 0.2021)ᵀ, ∇g₃ = (0, 1)ᵀ ,

so that

N = [ 0.1335  0
      0.2021  1 ] .
We solve this quadratic programming problem directly with the use of the Kuhn-Tucker conditions

3 + s₁ − 0.1335λ₁ − λ₂ = 0 ,
√3 + s₂ − 0.2021λ₁ − λ₃ = 0 .
A consideration of all possibilities for active constraints shows that the optimum is obtained when only g₁ is active, so that λ₂ = λ₃ = 0 and λ₁ = 12.8, s₁ = −1.29, s₂ = 0.855. The next design is

x₁ = {11.61; 7.17} + α{−1.29; 0.855} ,
where α is found by minimizing ψ(α) of Eq. (5.9.4). For the first iteration μ_j = |λ_j|, so

ψ = 3(11.61 − 1.29α) + √3(7.17 + 0.855α) + 12.8 | min(0, 3 − 18/(11.61 − 1.29α) − 6√3/(7.17 + 0.855α)) | .
By changing α systematically we find that ψ has a minimum near α = 2.2, so that

x₁ = (8.77, 9.05)ᵀ, f(x₁) = 41.98, g₁(x₁) = −0.201 .
To update A we need Δx and Δl. We have

∇_x L = (3 − 230.4/x₁², √3 − 133.0/x₂²)ᵀ ,

so that

Δx = x₁ − x₀ = (−2.84, 1.88)ᵀ , Δl = ∇_x L(x₁) − ∇_x L(x₀) = (−1.29, 0.96)ᵀ .
With A being the identity matrix we have ΔxᵀAΔx = 11.6, ΔxᵀΔl = 5.53. Because ΔxᵀΔl > 0.2ΔxᵀAΔx we can use Eq. (5.9.6) to update A. For the second iteration the penalty multiplier in ψ is obtained from Eq. (5.9.5),

μ₁ = max(λ₁, ½(|λ₁| + μ₁^{(1)})) = 14.31 .
The one-dimensional search yields approximately α = 0.5, so that

x₂ = (9.30, 8.86)ᵀ, f(x₂) = 43.25, g₁(x₂) = −0.108 ,

so that we have made good progress towards the optimum x* = (9.46, 9.46)ᵀ. •••
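The first move direction of this example follows directly from the Kuhn-Tucker conditions with A = I and only g₁ active: s = −∇f + λ₁∇g₁, with λ₁ fixed by the linearized constraint sᵀ∇g₁ = 0. A quick numerical check:

```python
import numpy as np

# First QP subproblem of Example 5.9.1 (A = I, only g1 active)
gf = np.array([3.0, np.sqrt(3.0)])
gg1 = np.array([0.1335, 0.2021])

lam1 = (gf @ gg1) / (gg1 @ gg1)    # about 12.8
s = -gf + lam1 * gg1               # about (-1.29, 0.855)
```

The linearized g₃ constraint s₂ ≥ 0 is satisfied by this s, confirming that dropping it from the active set was legitimate.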
Section 5.11: References
5.10 Exercises
Check for a minimum at the following points: (a) (5/3, 5.00) (b) (1/3, 5.00) (c)
(3.97,1.55).
3. Calculate the derivative of the solution of Example 5.1.2 with respect to a change in
the allowable displacement. First use the Lagrange multiplier to obtain the derivative
of the objective function, and then calculate the derivatives of the design variables
and Lagrange multipliers and verify the derivative of the objective function. Finally,
estimate from the derivatives of the solution how much we can change the allowable
displacement without changing the set of active constraints.
4. Solve for the minimum of problem 1 using the gradient projection method from
the point (17, 1/2, 4).
6. Find a feasible usable direction for problem 1 at the point (17, 1/2,4).
9. Consider the design of a box of maximum volume such that the surface area is
equal to S and there is one face with an area of S /4. Use the method of multipliers
to solve this problem, employing three design variables.
5.11 References
[1] Kreisselmeier, G., and Steinhauser, R., "Systematic Control Design by Optimiz-
ing a Vector Performance Index," Proceedings of IFAC Symposium on Computer
Aided Design of Control Systems, Zurich, Switzerland, pp. 113-117, 1979.
[2] Sobieszczanski-Sobieski, J., "A Technique for Locating Function Roots and for
Satisfying Equality Constraints in Optimization," NASA TM-104037, NASA
LaRC, 1991.
[3] Wolfe, P., "The Simplex Method for Quadratic Programming," Econometrica, 27
(3), pp. 382-398, 1959.
[4] Gill, P.E., Murray, W., and Wright, M.H., Practical Optimization, Academic
Press, 1981.
[5] Dahlquist, G., and Bjorck, A., Numerical Methods, Prentice Hall, 1974.
[6] Sobieszczanski-Sobieski, J., Barthelemy, J.F., and Riley, K.M., "Sensitivity of
Optimum Solutions of Problem Parameters", AIAA Journal, 20 (9), pp. 1291-
1299, 1982.
[7] Rosen, J.B., "The Gradient Projection Method for Nonlinear Programming-
Part 1: Linear Constraints", The Society for Industrial and Appl. Mech. Journal, 8 (1), pp. 181-217, 1960.
[8] Abadie, J., and Carpentier, J., "Generalization of the Wolfe Reduced Gradient
Method for Nonlinear Constraints", in: Optimization (R. Fletcher, ed.), pp. 37-
49, Academic Press, 1969.
[9] Rosen, J.B., "The Gradient Projection Method for Nonlinear Programming-Part
II: Nonlinear Constraints", The Society for Industrial and Appl. Mech. Journal,
9 (4), pp. 514-532, 1961.
[10] Haug, E.J., and Arora, J.S., Applied Optimal Design: Mechanical and Structural
Systems, John Wiley, New York, 1979.
[11] Zoutendijk, G., Methods of Feasible Directions, Elsevier, Amsterdam, 1960.
[12] Vanderplaats, G.N., "CONMIN-A Fortran Program for Constrained Function
Minimization", NASA TM X-62282, 1973.
[13] Fiacco, A.V., and McCormick, G.P., Nonlinear Programming: Sequential Uncon-
strained Minimization Techniques, John Wiley, New York, 1968.
[14] Haftka, R.T., and Starnes, J.H., Jr., "Applications of a Quadratic Extended
Interior Penalty Function for Structural Optimization", AIAA Journal, 14 (6),
pp. 718-724, 1976.
[15] Moe, J., "Penalty Function Methods in Optimum Structural Design-Theory and
Applications", in: Optimum Structural Design (Gallagher and Zienkiewicz, eds.),
pp. 143-177, John Wiley, 1973.
[16] Shin, D.K., Gürdal, Z., and Griffin, O.H., Jr., "A Penalty Approach for Nonlinear
Optimization with Discrete Design Variables," Engineering Optimization, 16, pp.
29-42, 1990.
[17] Bertsekas, D.P., "Multiplier Methods: A Survey," Automatica, 12, pp. 133-145,
1976.
[18] Hestenes, M.R., "Multiplier and Gradient Methods," Journal of Optimization
Theory and Applications, 4 (5), pp. 303-320, 1969.
[19] Fletcher, R., "An Ideal Penalty Function for Constrained Optimization," Journal
of the Institute of Mathematics and its Applications, 15, pp. 319-342, 1975.
[20] Powell, M.J.D., "A Fast Algorithm for Nonlinearly Constrained Optimization
Calculations", Proceedings of the 1977 Dundee Conference on Numerical Analy-
sis, Lecture Notes in Mathematics, Vol. 630, pp. 144-157, Springer-Verlag, Berlin,
1978.
6 Aspects of the Optimization Process in Practice
Occasionally, a structural analyst will write a design program that includes the
calculation of structural response as well as an implementation of a constrained opti-
mization algorithm, such as those discussed in Chapter 5. More often, however, the
analyst will have a structural analysis package, such as a finite-element program, as
well as an optimization software package available to him. The task of the analyst
is to combine the two so as to bring them to bear on the structural design problem
that he wishes to solve.
Two major difficulties are associated with the process of interfacing a structural
analysis package with an optimization program. The first is a programming difficulty.
Optimization packages typically expect subroutines that evaluate the objective func-
tion and constraints. When the structural analysis program is large, or if the analyst
does not have access to the source code of the program (a common situation), it
is very difficult to transform the analysis package into a subroutine called by the
optimization program.
The second serious problem is the high computational cost required for many
applications. For many structural optimization problems the evaluation of objective
function and constraints requires the execution of costly finite element analyses for
displacements, stresses or other structural response quantities. The optimization pro-
cess may require evaluating objective function and constraints hundreds or thousands
of times. The cost of repeating the finite element analysis so many times is usually
prohibitive.
Fortunately, there is an approach to interfacing an optimization program with
an analysis program that solves both problems. This increasingly popular approach,
called sequential approximate optimization, was suggested by Schmit and Farshi [1].
The computational cost problem is addressed by the use of approximate analyses
during portions of the optimization process. The structural analysis package is first
used to analyze an initial design, and then to generate information that allows the
construction of constraint approximations. For example, when the number of design
variables is small it is practical to analyze the structure at a number of points in
the design space, and use the response at those points to construct a polynomial
approximation to the response at other points. The optimization package is then
Chapter 6: Aspects of The Optimization Process in Practice
applied to the approximate problem represented by the polynomial approximation.
Since the polynomial approximation is typically easy to program, it is straightforward
to interface it to the optimization package.
The simple approximations generated by repeated use of the analysis package
are often referred to as low-cost explicit approximations, in contrast to the implicit
dependence of the response on the structural design variables via a finite element
solution. The polynomial approximation obtained by analyzing the structure at a
number of design points is a global approximation. Obtaining such a global approxi-
mation can be quite expensive for a large number of design variables. For example, if
we want to fit the structural response by a quadratic polynomial, we need to analyze
the structure for at least n(n + 1)/2 design points (typically many more to ensure a
robust approximation), where n is the number of design variables. This will result
in thousands of analyses when the number of design variables is larger than, say 40.
Therefore, it is more common to use local approximations based on derivatives of the
objective function and constraints with respect to the design variables. The simplest
approach is to replace the objective function and constraints with linear approxima-
tions based on these derivatives. However, these approximations are useful only in a
neighborhood of the nominal design point. Therefore, it is necessary to impose limits, called
move limits, on the magnitudes of changes in the design that are permitted while the
approximate analysis is used.
Following an optimization based on approximate analysis and move limits, an ex-
act analysis is performed at the design point obtained by the approximate optimiza-
tion, and new derivatives are calculated so that a new approximation for objective
function and constraints can be constructed. The process is repeated until conver-
gence is achieved, typically measured by the magnitude of changes in the objective
function or the degree of satisfaction of the optimality conditions (e.g., the Kuhn-
Tucker conditions). Because each approximate optimization is only one cycle in the
overall optimization process, it is usually possible to employ lax convergence criteria
for these approximate problems, except for the last one. To distinguish them from
the iterations inside approximate optimizations, each such optimization is referred to
as a cycle rather than as an iteration.
When linear approximations are used, and the move limits are posed as linear in-
equalities, this process is called sequential linear programming (SLP), and was known
for many years before Schmit and Farshi proposed the use of approximations for
structural optimization. However, there is no need to limit the process to linear ap-
proximations, as long as the approximations are substantially cheaper to calculate
than the exact analyses. For example, Schmit and Farshi demonstrated the use of in-
expensive nonlinear approximations by using the reciprocal approximation, discussed
in Section 6.1.
The use of sequential approximate optimization in the design process is the key
step in interfacing a structural analysis program with an optimization program, and
so it is the major topic discussed in this chapter. However, there are other aspects of
the practical use of the optimization process in design that deserve consideration. For
shape optimization problems, it is important to be able to modify the discretization of
the structure (e.g., the finite-element model) as the design is changed. This requires
Section 6.1: Generic Approximations
sophisticated mesh generators, and is discussed in Section 6.5. Other topics discussed
in this chapter include optimization packages, and test problems that are often used
to check on the performance of these packages. One important topic which is not
discussed in this chapter is the calculation of the derivatives of the response of the
structure needed for constructing the approximation. This topic requires a more
detailed study and is the subject of Chapters 7 and 8.
The use of sequential approximate optimization is by no means universally ac-
cepted as the only way to deal with the optimization of complex structures. Many
analysts prefer to use their judgement so as to produce a design model of the problem
which employs a much coarser discretization than they would accept for the final
analysis of the structure. They hope that the design trends revealed by optimizing
the coarse model will hold for the more refined model. While this approach is quite
legitimate, it will not be discussed here, because it requires a great deal of experi-
ence on the part of the analyst, and is highly problem dependent. As such it is very
difficult to codify in a textbook.
The simplest local approximation is the linear approximation based on the Taylor
series. Given a function g(x), the linear approximation g_L(x) is

g_L(x) = g(x_0) + Σ_{i=1}^n (x_i − x_0i) (∂g/∂x_i)_x0 .      (6.1.1)
For many applications the linear approximation is inaccurate even for design
points x that are close to Xo. Accuracy can be increased by retaining additional
terms in the Taylor series expansion. This, however, requires the costly calculation of
higher-order derivatives. A more attractive alternative is to find intervening variables
that would make the approximated function behave more linearly. That is, define

g(x) = g(y_1, y_2, ..., y_m) ,      (6.1.2)

where the y_i are m functions of the design variables called intervening variables. The
linear approximation, g_I, in terms of the intervening variables is

g_I(y) = g(y_0) + Σ_{i=1}^m (y_i − y_0i) (∂g/∂y_i)_y0 ,      (6.1.3)

where y_0i = y_i(x_0), and the derivatives of g with respect to the y_i's can be calculated
from the derivatives with respect to the x_i's.
Example 6.1.1
The beam shown in Fig. (6.1.1) has a rectangular cross section of width b_i and
height h_i, i = 1, 2. The tip displacement is constrained not to exceed w_all; with
elementary beam theory this constraint can be written as

g = w_all − 28pl³/(E b_1 h_1³) − 4pl³/(E b_2 h_2³) ≥ 0 .

This expression is a highly nonlinear function of the design variables, but it can be
linearized by using the intervening variables

y_1 = 1/I_1 = 12/(b_1 h_1³)   and   y_2 = 1/I_2 = 12/(b_2 h_2³) ,

so that

g = w_all − (7/3)(pl³/E) y_1 − (1/3)(pl³/E) y_2 .
•••
The cases where intervening variables can exactly linearize the constraint are
rather rare. Example (6.1.1) is typical of statically determinate structures where
such linearization is often possible. However, as shown by Mills-Curran et al. [2],
even in the case of statically indeterminate beam and frame structures, the reciprocals
of moments of inertia are good intervening variables for displacement constraints.
In many applications the intervening variables are functions of a single design
variable, that is
Yi = Yi(Xi) i = 1, ... ,n . (6.1.4)
In this case it is often convenient to write g_I, Eq. (6.1.3), in terms of the original
variables

g_I(x) = g(x_0) + Σ_{i=1}^n [y_i(x_i) − y_i(x_0i)] (∂g/∂x_i)_x0 / (dy_i/dx_i)_x0 .      (6.1.5)

Note that while g_I is a linear function of y it is, in general, a nonlinear function of x.
One of the more popular intervening variables is the reciprocal of x_i,

y_i = 1/x_i .      (6.1.6)
This popularity reflects the fact that many of the early structural optimization studies
were performed on structures consisting of truss or plane-stress elements. The design
variables in these studies were usually the cross-sectional areas of the truss elements
and the thicknesses of the plane-stress elements. For statically determinate structures
stress and displacements constraints are linear functions of the reciprocals of these
design variables. For statically indeterminate structures, using the reciprocals of the
design variables still proved to be a useful device in making the constraints more
linear (see, for example, Storaasli and Sobieszczanski [3], and Noor and Lowder [4]).
For the reciprocal approximation Eq. (6.1.5) becomes

g_R(x) = g(x_0) + Σ_{i=1}^n (x_i − x_0i) (x_0i/x_i) (∂g/∂x_i)_x0 .      (6.1.7)
One of the attractive features of the reciprocal approximation, even for statically in-
determinate structures, is that it preserves the property of scaling. That is, when the
stiffness matrix is a homogeneous function of order h in the components of x, the dis-
placements are homogeneous functions of order −h in the components of x. For truss
and membrane elements, h = 1 so that the displacements are homogeneous functions
of the reciprocals of the design variables. If all the design variables are scaled by a
factor, the displacement vector is scaled by the reciprocal of that factor. Therefore
the reciprocal approximation is exact for scaling the design. Fuchs [5] has investi-
gated the importance of the homogeneity property, and Fuchs and Haj Ali [6] have
proposed a family of approximations that generalizes the reciprocal approximation
to any order of homogeneity.
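The scaling property is easy to check numerically. The sketch below (plain Python; the displacement function u(x) = 1/(x_1 + x_2) and the base point are illustrative choices, homogeneous of order −1 as for truss displacements) confirms that the reciprocal approximation of Eq. (6.1.7) is exact along a uniformly scaled design αx_0, while the linear approximation is not.

```python
# Reciprocal approximation (Eq. 6.1.7) vs. linear approximation, checked on
# u(x) = 1/(x1 + x2): homogeneous of order -1 in x, like the displacements
# of a truss when all cross sections are scaled (h = 1). Illustrative choice.
def u(x):
    return 1.0 / (x[0] + x[1])

def grad_u(x):
    d = -1.0 / (x[0] + x[1])**2
    return [d, d]

x0 = [1.0, 3.0]
u0, g0 = u(x0), grad_u(x0)

def u_lin(x):        # linear approximation about x0
    return u0 + sum((x[i] - x0[i]) * g0[i] for i in range(2))

def u_rec(x):        # reciprocal approximation, Eq. (6.1.7)
    return u0 + sum((x[i] - x0[i]) * (x0[i] / x[i]) * g0[i] for i in range(2))

alpha = 2.0
xs = [alpha * v for v in x0]     # uniformly scaled design
exact = u(xs)                    # equals u(x0)/alpha by homogeneity
```

Euler's theorem for homogeneous functions guarantees that the reciprocal approximation reproduces `exact` at the scaled design, whatever scale factor is chosen.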
Another approximation, called the conservative approximation [7], is a hybrid
form of the linear and reciprocal approximations which is more conservative than
either. It is particularly suitable for interior and extended interior penalty function
methods (see Section 5.7) which do not tolerate well constraint violations. To obtain
the conservative approximation we start by subtracting the reciprocal approximation
from the linear approximation
g_L(x) − g_R(x) = Σ_{i=1}^n [(x_i − x_0i)²/x_i] (∂g/∂x_i)_x0 .      (6.1.8)
The sign of each term in the sum is determined by the sign of the ratio (∂g/∂x_i)/x_i,
which is also the sign of the product x_i(∂g/∂x_i). Contributions from design vari-
ables for which this product is negative make the reciprocal approximation larger
(more positive) than the linear approximation, and vice versa. Since the constraint
is expressed as g(x) ≥ 0, a more positive approximation is less conservative. The
conservative approximation, g_C, is, therefore, created by selecting for each design
variable the smaller (less positive) contribution

g_C(x) = g(x_0) + Σ_{i=1}^n c_i (x_i − x_0i) (∂g/∂x_i)_x0 ,      (6.1.9)

where

c_i = { 1            if x_0i (∂g/∂x_i)_x0 ≤ 0 ,
      { x_0i/x_i     otherwise .                      (6.1.10)

Note that c_i = 1 corresponds to a linear approximation, and c_i = x_0i/x_i corresponds
to a reciprocal approximation in x_i.
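The selection rule of Eq. (6.1.10) always picks the smaller of the linear and reciprocal contributions, so g_C can never exceed g_L or g_R at any point with positive design variables. A minimal sketch (plain Python; the constraint function and evaluation points are illustrative choices, not from the text):

```python
# Conservative approximation g_C of Eqs. (6.1.9)-(6.1.10): for each variable
# pick the smaller (less positive) of the linear and reciprocal contributions.
def approximations(g, grad, x0, x):
    g0, d = g(x0), grad(x0)
    n = len(x0)
    lin = [(x[i] - x0[i]) * d[i] for i in range(n)]
    rec = [(x[i] - x0[i]) * (x0[i] / x[i]) * d[i] for i in range(n)]
    gL = g0 + sum(lin)
    gR = g0 + sum(rec)
    # Eq. (6.1.10): c_i = 1 if x0_i * (dg/dx_i) <= 0, else c_i = x0_i / x_i
    gC = g0 + sum(lin[i] if x0[i] * d[i] <= 0.0 else rec[i] for i in range(n))
    return gL, gR, gC

g = lambda x: 1.0 - 2.0 / x[0] - 0.5 * x[1]      # illustrative constraint
grad = lambda x: [2.0 / x[0]**2, -0.5]
x0 = [1.0, 1.0]
for x in ([1.5, 0.7], [0.8, 1.3], [2.0, 2.0]):
    gL, gR, gC = approximations(g, grad, x0, x)
    assert gC <= gL + 1e-12 and gC <= gR + 1e-12  # g_C is the most conservative
```

The assertion holds because, for x_i > 0, the difference between the reciprocal and linear terms is −(∂g/∂x_i)(x_i − x_0i)²/x_i, whose sign is fixed by the sign test of Eq. (6.1.10).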
The conservative approximation is not the only hybrid linear-reciprocal approx-
imation possible. Sometimes physical considerations may dictate the use of linear
approximation for some variables and the reciprocal for others (see Haftka and Shore
[8], and Prasad [9]). The conservative approximation, however, has the advantage of
being concave (Exercise 1). If all the constraints are approximated by the conserva-
tive approximation, the feasible domain of the approximate optimization problem is
convex (see Section 5.1.2). If we also approximate the objective function by a convex
function, the approximate optimization problem is convex. Convex problems are
guaranteed to have only a single optimum, and they are amenable to treatment by
dual methods (see Section 9.2.2). In fact, a convex approximation fc(x) to the objec-
tive function, f(x), is obtained by reversing the process for obtaining the conservative
concave approximation. That is (Exercise 1),

f_C(x) = f(x_0) + Σ_{i=1}^n F_i (x_i − x_0i) (∂f/∂x_i)_x0 ,      (6.1.11)

where

F_i = { x_0i/x_i     if x_0i (∂f/∂x_i)_x0 ≤ 0 ,
      { 1            otherwise .                      (6.1.12)
This process of using the conservative approximation for the constraints and the
convex approximation for the objective function has been introduced by Braibant and
Fleury [10], and is known as convex linearization. In many papers and textbooks, the
constraints are posed as g(x) ≤ 0 rather than g(x) ≥ 0. In this case, the conservative
approximation is convex rather than concave (that is, we use the form of Eqs. (6.1.11)
and (6.1.12) also for the constraints). There are other conservative approximations
(for example, see Prasad [11] or Woo [12]), but it is important to note that the one
presented here, as well as the others, are not guaranteed to be conservative in an
absolute sense (that is, we do not know that the approximation is more conservative
than the exact constraint, g_C(x) ≤ g(x)). The approximation presented here is only
more conservative than either the linear or the reciprocal approximation.
Higher order approximations are also used occasionally. For example, the
quadratic approximation, g_Q, is obtained by including the quadratic terms in the
Taylor series expansion

g_Q(x) = g(x_0) + Σ_{i=1}^n (x_i − x_0i) (∂g/∂x_i)_x0
         + (1/2) Σ_{i=1}^n Σ_{j=1}^n (x_i − x_0i)(x_j − x_0j) (∂²g/∂x_i∂x_j)_x0 .      (6.1.13)
Example 6.1.2
Consider the three-bar truss shown in Figure 6.1.2. The horizontal force p can act
either to the right (as shown) or to the left. The truss is designed subject to stress
and displacement constraints with the design variables being the cross-sectional areas
A_A, A_B, and A_C. Because of the symmetry of the truss and the arbitrary direction
of the horizontal load we must have A_A = A_C. We examine the approximations to
the constraint on the stress in member C, which requires that stress to be less than
σ_0 both in tension and compression.
The stresses in the three members can be expressed in terms of the displacement
components u (horizontal) and v (vertical) at the tip of the truss as

σ_A = (E/4l)(v + √3 u) ,   σ_B = Ev/l ,   σ_C = (E/4l)(v − √3 u) .

From the horizontal equation of equilibrium

(√3/2) A_A (σ_A − σ_C) = p ,   or   (3EA_A/4l) u = p .

Similarly, from the vertical equation of equilibrium

(1/2) A_A (σ_A + σ_C) + A_B σ_B = 8p ,   or   (Ev/l)(A_B + A_A/4) = 8p ,

so that

v = 8pl/E(A_B + 0.25A_A) ,

and

σ_C = p ( −√3/(3A_A) + 2/(A_B + 0.25A_A) ) .
Assuming that member C is in tension, we may write the constraint function as

g = 1 − σ_C/σ_0 = 1 − (p/σ_0) ( −√3/(3A_A) + 2/(A_B + 0.25A_A) ) .
We now define normalized design variables

x_1 = σ_0 A_A / p ,   x_2 = σ_0 A_B / p ,

so that

g = 1 + √3/(3x_1) − 2/(x_2 + 0.25x_1) .
We approximate g about the point x_0^T = (1, 1). The first derivatives are

(∂g/∂x_1)_x0 = ( −√3/(3x_1²) + 0.5/(x_2 + 0.25x_1)² )_x0 = −0.2574 ,

(∂g/∂x_2)_x0 = ( 2/(x_2 + 0.25x_1)² )_x0 = 1.28 .
The second derivatives are

(∂²g/∂x_1²)_x0 = ( 2√3/(3x_1³) − 0.25/(x_2 + 0.25x_1)³ )_x0 = 1.0267 ,

(∂²g/∂x_1∂x_2)_x0 = ( −1/(x_2 + 0.25x_1)³ )_x0 = −0.512 ,

(∂²g/∂x_2²)_x0 = ( −4/(x_2 + 0.25x_1)³ )_x0 = −2.048 .
Using these derivatives and g(x_0) = −0.0227 we can construct the following approx-
imations

g_L = −0.0227 − 0.2574(x_1 − 1) + 1.28(x_2 − 1) ,

g_R = −0.0227 − 0.2574(x_1 − 1)/x_1 + 1.28(x_2 − 1)/x_2 ,

g_C = −0.0227 − 0.2574(x_1 − 1) + 1.28(x_2 − 1)/x_2 ,

g_Q = g_L + 0.5134(x_1 − 1)² − 0.512(x_1 − 1)(x_2 − 1) − 1.024(x_2 − 1)² ,

g_QR = −0.0227 − 0.2574(2 − 1/x_1)(1 − 1/x_1) + 1.28(2 − 1/x_2)(1 − 1/x_2)
       + 0.5134(1 − 1/x_1)² − 0.512(1 − 1/x_1)(1 − 1/x_2) − 1.024(1 − 1/x_2)² .
All of these approximations have the correct value and correct derivatives at x_0^T =
(1, 1). The two quadratic approximations also have the correct second derivatives
at that point. The reciprocal approximations tend to one as the design variables
tend to infinity. This corresponds to the stress in member C tending to zero as the
cross-sectional areas tend to infinity. This correct physical behavior is not shared by
the other approximations. Table 6.1.1 compares the predictions of the five approxi-
mations to the exact values when x_1 and x_2 vary between 0.75 and 1.25.
Table 6.1.1
x_1    x_2    g        g_L      g_R      g_C      g_Q      g_QR
0.75 0.75 -0.3635 -0.2783 -0.3635 -0.3850 -0.3422 -0.3635
1.00 0.75 -0.4227 -0.3426 -0.4493 -0.4493 -0.4066 -0.4209
1.25 0.75 -0.4205 -0.4070 -0.5008 -0.5137 -0.4070 -0.4280
0.75 1.00 0.0856 0.0417 0.0631 0.0417 0.0738 0.0915
1.25 1.00 -0.0619 -0.0870 -0.0741 -0.0871 -0.0549 -0.0639
0.75 1.25 0.3786 0.3617 0.3191 0.2977 0.3617 0.3919
1.00 1.25 0.2440 0.2974 0.2334 0.2334 0.2334 0.2435
1.25 1.25 0.1819 0.2330 0.1819 0.1690 0.1691 0.1819
The Table shows that the approximations based on reciprocal variables are more
accurate than the approximations based on the actual variables, and in particular,
they are exact when the two variables are scaled by the same factor (that is, x is
replaced by αx where α is a scalar). The quadratic approximations are substan-
tially more accurate than the three first-order approximations. The conservative
approximation is not guaranteed to be more conservative than the second-order ap-
proximations, but usually, as in this example, it is. We see, however, that the price
of this extra conservativeness is that it is the least accurate approximation.
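The entries of Table 6.1.1 can be reproduced directly from the derivatives computed above. A sketch in plain Python, using the rounded derivative values quoted in the example:

```python
import math

# Reproduce rows of Table 6.1.1 for g = 1 + sqrt(3)/(3 x1) - 2/(x2 + 0.25 x1),
# approximated about x0 = (1, 1), with the derivative values of Example 6.1.2.
def g(x1, x2):
    return 1.0 + math.sqrt(3.0) / (3.0 * x1) - 2.0 / (x2 + 0.25 * x1)

g0 = g(1.0, 1.0)                          # about -0.0227
g1, g2 = -0.2574, 1.28                    # first derivatives at x0
g11, g12, g22 = 1.0267, -0.512, -2.048    # second derivatives at x0

def gL(x1, x2):          # linear approximation
    return g0 + g1 * (x1 - 1) + g2 * (x2 - 1)

def gR(x1, x2):          # reciprocal approximation
    return g0 + g1 * (x1 - 1) / x1 + g2 * (x2 - 1) / x2

def gC(x1, x2):          # conservative: x01*g1 < 0 -> linear, x02*g2 > 0 -> reciprocal
    return g0 + g1 * (x1 - 1) + g2 * (x2 - 1) / x2

def gQ(x1, x2):          # quadratic approximation
    return (gL(x1, x2) + 0.5 * g11 * (x1 - 1)**2
            + g12 * (x1 - 1) * (x2 - 1) + 0.5 * g22 * (x2 - 1)**2)
```

Evaluating these at (0.75, 0.75), for example, recovers the first row of the table to the printed four decimals.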
[Figure 6.1.3: Error in the approximations g_L, g_R, g_C, g_Q, and g_QR as a function of t; the error axis runs from −0.4 to 0.4 and t from 0 to 1.]
The constraint approximations can also be used to check for errors in the deriva-
tives used to construct them. This is done by calculating the exact constraint along
a line in design space and plotting the error in the approximation along that line. A
first order approximation must have a zero slope for the error curve at the nominal
design, while a second-order approximation must also have zero curvature there. For
example, let us compare the various approximations along the line
where t = 0.5 represents the nominal design. Figure 6.1.3 shows the error as a
function of t. It is seen that the first-order approximations indeed have zero slope
at t = 0.5, while the second-order approximations also have zero curvature there.
For this example, the reciprocal approximation is quite conservative, so that the
conservative approximation is almost identical to it. •••
The approximations covered so far are obtained by algebraically manipulating the
constraint functions. In an effort to improve the quality of the approximations recent
research efforts have concentrated on the extension of the concept of intermediate
design variables to the concept of intermediate response quantities. The concept was
introduced by Schmit and Miura [13] in 1976, but it was not applied until about ten
years later (e.g., [14]). The approach seeks intermediate response quantities that are
well approximated linearly. If the response quantities appearing in the constraint
can be calculated inexpensively from the intermediate response, then we can have an
inexpensive and accurate nonlinear approximation.
One of the most successful intermediate response approximations was proposed for
stress constraints in structural design by Vanderplaats and coworkers (e.g., [15-17]).
Vanderplaats argued that an approximation for member forces will be more accurate
than the corresponding approximation for member stresses. This is expected because
member forces change more slowly than member stresses when cross-sectional areas
are changed. In particular, for a statically determinate truss, force in each of the
members is constant, while member stresses are inversely proportional to member ar-
eas. This motivates the use of the member forces as intermediate response quantities.
Consider, for example, a typical stress constraint for a truss member of the form

g_i = 1 − σ_i/σ_all ≥ 0 .      (6.1.15)
A common approximation for member stresses uses the reciprocal design variables,
x_i = 1/A_i, where A_i is the cross-sectional area of the ith member. Using a linear
approximation for the member forces, and then dividing by the cross-sectional area
to obtain an approximation to the stress, as suggested by Vanderplaats, we obtain
(after multiplying the constraint through by A_i) a constraint of the form

A_i − (1/σ_all) [ F_i(A_0) + Σ_j (∂F_i/∂A_j)_A0 (A_j − A_0j) ] ≥ 0 ,      (6.1.16)

where F_i is the force in the ith member. This is linear in the cross-sectional area
design variables. Note that for a statically determinate truss, where the gradient of
the member forces with respect to the cross-sectional areas is zero, the force approx-
imation in Eq. (6.1.16) is a constant. Equation (6.1.16) has the dimension of area,
and it should be nondimensionalized by dividing
it by a reference area. A comparison of the performance of this linear force approxi-
mation with other approximations is given in Section 6.4.
The most common global approximation is the response surface approach. With
this approach the function is sampled at a number of points, and then an analytical
expression called the response surface (typically a polynomial) is fitted to the data.
Construction of a response surface often relies heavily on the theory of experiments [18]
and is an iterative process that begins with the assumption of the analytical form
of the response surface, for example, a quadratic polynomial. The approximation
contains a number of unknown parameters (such as polynomial coefficients) that
must be adjusted to match the function to be approximated. To do so, analyses are
performed at a number of carefully selected design points, and a least-squares solution
is typically used to extract the parameter values from the analysis results. Then the
approximate model (the response surface) is used to predict the function at a number
of selected test points, and statistical measures are used to assess the goodness-of-fit,
or the accuracy of the response surface. If the fit is not satisfactory, the process is
restarted, and further experiments are made, or the postulated model is improved by
removing and/or adding terms.
Response surface techniques have not been used extensively in structural opti-
mization (see Barthelemy and Haftka [19] for applications). This may be due to the
fact that the technique is practical only for problems with a small number of design
variables (less than 20). The number of analyses required to construct the response
surface increases dramatically with the number of design variables.
Example 6.1.3
To demonstrate the use of response surfaces we fit a linear response surface to the
stress constraint of Example 6.1.2,

g_rs = a + b x_1 + c x_2 .      (a)

We assume that the design space is

0.5 ≤ x_1 ≤ 1.5 ,   0.5 ≤ x_2 ≤ 1.5 .

To find a, b, and c we need to evaluate g at 3 or more points. For robustness we use
more points, so we select the following 4 points:

x_1^T = (0.5, 0.5) ,   x_2^T = (1.5, 0.5) ,   x_3^T = (0.5, 1.5) ,   x_4^T = (1.5, 1.5) .
Substituting each of these points into Eq. (a) we get 4 equations

[ 1   0.5   0.5 ]             { −1.0453 }
[ 1   1.5   0.5 ]  { a }      { −0.9008 }
[ 1   0.5   1.5 ]  { b }  =   {  0.9239 }
[ 1   1.5   1.5 ]  { c }      {  0.3182 } .

To get a least-squares solution of these 4 equations in 3 unknowns, we multiply both
sides by the transpose of the coefficient matrix and solve the resulting 3 × 3 system.
We obtain a = −1.5395, b = −0.2306, c = 1.5941, or

g_rs = −1.5395 − 0.2306 x_1 + 1.5941 x_2 .
We compare this with the linear approximation about (1, 1) that we found in Example
6.1.2,

g_L = −0.0227 − 0.2574(x_1 − 1) + 1.28(x_2 − 1) .

As expected, g_L is more accurate near (1, 1), and g_rs further away. For example, at
(0.75, 0.75) we get g = −0.3635, g_L = −0.2783, g_rs = −0.5169, while at (0.5, 0.5) we
get g = −1.0453, g_L = −0.5340, g_rs = −0.8578. •••
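The least-squares fit of this example takes only a few lines of code. The sketch below (plain Python, no libraries) forms the normal equations AᵀAc = Aᵀb and solves them by Gaussian elimination; because it evaluates g exactly rather than with the four-decimal values used in the text, the coefficients agree with a = −1.5395, b = −0.2306, c = 1.5941 to about three decimals.

```python
import math

def g(x1, x2):   # stress constraint of Example 6.1.2
    return 1.0 + math.sqrt(3.0) / (3.0 * x1) - 2.0 / (x2 + 0.25 * x1)

pts = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
A = [[1.0, x1, x2] for (x1, x2) in pts]          # 4x3 coefficient matrix
rhs = [g(x1, x2) for (x1, x2) in pts]

# Normal equations: (A^T A) c = A^T rhs
AtA = [[sum(A[k][i] * A[k][j] for k in range(4)) for j in range(3)]
       for i in range(3)]
Atb = [sum(A[k][i] * rhs[k] for k in range(4)) for i in range(3)]

# Gaussian elimination with partial pivoting, then back substitution
M = [row[:] + [r] for row, r in zip(AtA, Atb)]
for i in range(3):
    p = max(range(i, 3), key=lambda r: abs(M[r][i]))
    M[i], M[p] = M[p], M[i]
    for r in range(i + 1, 3):
        fac = M[r][i] / M[i][i]
        M[r] = [mr - fac * mi for mr, mi in zip(M[r], M[i])]
coef = [0.0, 0.0, 0.0]
for i in (2, 1, 0):
    coef[i] = (M[i][3] - sum(M[i][j] * coef[j] for j in range(i + 1, 3))) / M[i][i]

a, b, c = coef   # close to -1.5395, -0.2306, 1.5941
```

In practice one would call a library least-squares routine; the explicit normal equations are shown only to mirror the text's description.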
In response surface techniques the design space is sampled ahead of the opti-
mization process. However, because the optimization process requires the calculation
of constraints and their derivatives at more than one point, it makes sense to use
the information from previous calculations to construct wide ranging approximations
rather than approximations based on information at a single point. This leads to the
concept of multipoint approximations that qualify for the label midrange approxima-
tions. Haftka et al. [20] examined approximations based on two and three points.
Their experience was that the approximation worked well when it represented inter-
polation (for example, at points inside the triangle formed by three data points in
a three-point approximation), but gave only marginal improvement in accuracy for
extrapolation.
A two-point approximation that shows more promise was proposed by Fadel et
al. [21]. The approximation is a linear approximation in the variables y_i = x_i^p_i, where
the exponents are selected to match the data. We start by constructing a linear
approximation in y_i at the first point x_0. The approximation may be written in terms
of the original variables as

g_tp(x) = g(x_0) + Σ_{i=1}^n (x_i^p_i − x_0i^p_i) (x_0i^(1−p_i)/p_i) (∂g/∂x_i)_x0 .      (6.1.17)

Then the exponents p_i are found from the condition that the derivatives of g match
those of g_tp at a second point, x_1. It is easy to show that this leads to

p_i = 1 + log[ (∂g/∂x_i)_x1 / (∂g/∂x_i)_x0 ] / log(x_1i/x_0i) .      (6.1.18)
When p_i is larger in magnitude than 1 it is set to sign(p_i) so as to avoid large
exponents. Special provisions need to be made when the ratio in the numerator or
denominator in Eq. (6.1.18) is negative or if p_i is zero. In the first case p_i is taken
to be 1, while in the second case it can be shown by the use of a Taylor series expansion
that

lim_{p_i→0} [ (x_i/x_0i)^p_i − 1 ] / p_i = log(x_i/x_0i) .      (6.1.19)
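Equation (6.1.18) can be checked on a function that is exactly linear in a power of the design variable. Sketch (plain Python; the function g(x) = 1/x is an illustrative choice): since g is linear in y = 1/x, the formula should recover the reciprocal exponent p = −1 from two derivative values alone.

```python
import math

# Two-point exponent of Eq. (6.1.18) for a single variable:
# p = 1 + log[(dg/dx at x1) / (dg/dx at x0)] / log(x1 / x0)
def exponent(dg0, dg1, x0, x1):
    return 1.0 + math.log(dg1 / dg0) / math.log(x1 / x0)

dg = lambda x: -1.0 / x**2          # derivative of g(x) = 1/x
x0, x1 = 1.0, 2.0
p = exponent(dg(x0), dg(x1), x0, x1)
# g(x) = 1/x is exactly linear in y = x**p with p = -1, and the formula
# recovers that exponent.
```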
Another midrange approximation is the scaling or local-global approximation [22].
It is intended to improve a global approximation, available from a response surface
approach or from a simpler model of the problem, by injecting some local information
into it. The simplest approach for doing that is to use a scale factor based on the
value of the function at a point x_0. That is, the scale factor s_G is given as

s_G(x) = g(x)/g_G(x) ,      (6.1.20)

where g_G is the global approximation. Then the scaled global approximation, g_s, is
given as

g_s(x) = s_G(x_0) g_G(x) .      (6.1.21)
An improvement on this scale factor can be obtained by using the derivatives of
g to construct a linear scale factor s_Gl given as

s_Gl(x) = s_G(x_0) + Σ_{i=1}^n (x_i − x_0i) (∂s_G/∂x_i)_x0 ,      (6.1.22)

so that the approximation becomes

g_s(x) = s_Gl(x) g_G(x) .      (6.1.23)
The local-global approximation was applied by Chang et al. [23] for approximating
displacements, stresses and frequencies of a supersonic wing structure obtained by a
finite element model. The global approximation used was a plate model of the wing.
The discrete equations of equilibrium for linear static response (obtained, for example,
from a finite element analysis) at a design point Xo are
K_0 u_0 = f_0 ,      (6.2.1)

where K_0, u_0 and f_0 are the stiffness matrix, the displacement vector and the load
vector at x_0, respectively. Consider now a change Δx in the design which results in
a change ΔK in the stiffness matrix, and Δf in the load vector. The equations of
equilibrium at x_0 + Δx are

(K_0 + ΔK)(u_0 + Δu) = f_0 + Δf ,      (6.2.2)
Section 6.2: Fast Reanalysis Techniques
and we can obtain a first approximation Δu_1 to Δu by neglecting the ΔKΔu term,

K_0 Δu_1 = Δf − ΔK u_0 .      (6.2.4)

The approximation can be improved by writing Δu as a series,

Δu = Δu_1 + Δu_2 + Δu_3 + ··· ,      (6.2.6)

where the terms Δu_j in the series are obtained through the iterative process of solving

K_0 Δu_j = −ΔK Δu_{j−1} ,   j = 2, 3, ... .      (6.2.8)

Of course, the series is not guaranteed to converge, especially when Δx is not small.
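The first-order correction of Eq. (6.2.4) and the series of Eqs. (6.2.6) and (6.2.8) can be tried on a small system. Sketch (plain Python; the diagonal 2×2 matrices are an illustrative choice that makes every solve with K_0 a componentwise division, and Δf = 0):

```python
# Series reanalysis, Eqs. (6.2.4), (6.2.6), (6.2.8), on a diagonal 2x2 system.
K0 = [2.0, 1.0]              # diag(K0)
dK = [0.0, 0.2]              # diag(dK): 20 percent stiffening of one DOF
f  = [1.0, 1.0]
u0 = [f[i] / K0[i] for i in range(2)]

du = [-dK[i] * u0[i] / K0[i] for i in range(2)]   # Eq. (6.2.4): K0 du1 = -dK u0
u_series = [u0[i] + du[i] for i in range(2)]
for _ in range(20):                               # Eq. (6.2.8): K0 du_j = -dK du_(j-1)
    du = [-dK[i] * du[i] / K0[i] for i in range(2)]
    u_series = [u_series[i] + du[i] for i in range(2)]

u_exact = [f[i] / (K0[i] + dK[i]) for i in range(2)]   # true perturbed solution
```

Here the series is geometric with ratio 0.2 and converges quickly; for a large perturbation the ratio can exceed one and the series diverges, as the text warns.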
Another approach for improving on Δu_1 was suggested by Kirsch and Taye [24].
Their idea is that changes in the structure can be divided into overall scaling and
redistribution of material. That is, we write the perturbed stiffness matrix as

K_0 + ΔK = sK_0 + ΔK_s ,      (6.2.9)

where s is a scaling factor. Overall scaling can be dealt with in a simple manner, so
that we need to analyse only the redistribution part. We choose s so as to minimize
ΔK_s. That is, s is chosen so that sK_0 is as close as possible to K_0 + ΔK. Kirsch and
Taye suggested minimizing the sum of the squares of the elements of ΔK_s. Then it
can be shown (Exercise 7) that s is given as

s = 1 + ( Σ_{i,j} k_0ij Δk_ij ) / ( Σ_{i,j} k_0ij² ) .      (6.2.10)
Now we consider our nominal design to be the one with the matrix sKo instead of
Ko. For this design the displacement field is
u_s = (1/s) u_0 .      (6.2.11)
We consider only the case where there is no change in the force, Δf = 0. Then Eq.
(6.2.4) for this scaled design is

sK_0 Δu_s1 = −ΔK_s u_s = −[ΔK − (s − 1)K_0] (1/s) u_0 ,      (6.2.12)

where we used Eq. (6.2.9). Comparing this equation to Eq. (6.2.4) we get

Δu_s = u_s − u_0 + Δu_s1 = (1/s − 1) u_0 + (1/s²) Δu_1 + ((s − 1)/s²) u_0
     = (1/s²) Δu_1 − ((1 − s)²/s²) u_0 .      (6.2.13)
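The scale factor of Eq. (6.2.10) and the corrected first-order approximation of Eq. (6.2.13) can be verified on the same kind of small diagonal system. For a pure scaling perturbation, ΔK = (s − 1)K_0, the redistribution part ΔK_s vanishes and Eq. (6.2.13) should reproduce the exact displacements; the numbers below are illustrative (plain Python):

```python
# Kirsch-Taye scaling: Eq. (6.2.10) for s and Eq. (6.2.13) for du_s,
# on a diagonal 2x2 system with dK = 0.3*K0 (a pure scaling perturbation).
K0 = [2.0, 1.0]                      # diag(K0)
dK = [0.6, 0.3]                      # diag(dK) = 0.3*K0
f  = [1.0, 1.0]
u0 = [f[i] / K0[i] for i in range(2)]

# Eq. (6.2.10): s = 1 + sum(k0ij * dkij) / sum(k0ij**2)
s = 1.0 + sum(K0[i] * dK[i] for i in range(2)) / sum(K0[i]**2 for i in range(2))

du1 = [-dK[i] * u0[i] / K0[i] for i in range(2)]    # Eq. (6.2.4) with df = 0
# Eq. (6.2.13): du_s = du1 / s**2 - ((1 - s)**2 / s**2) * u0
du_s = [du1[i] / s**2 - (1.0 - s)**2 / s**2 * u0[i] for i in range(2)]

u_exact = [f[i] / (K0[i] + dK[i]) for i in range(2)]
err_first  = max(abs(u0[i] + du1[i] - u_exact[i]) for i in range(2))
err_scaled = max(abs(u0[i] + du_s[i] - u_exact[i]) for i in range(2))
```

For this perturbation Eq. (6.2.10) gives s = 1.3, the scaled estimate is exact, and the plain first-order correction is not.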
Example 6.2.1
Apply a first term correction, without and with scaling, to approximate the stress
constraint in member C of Example (6.1.2) when the area of member B is increased by
25 percent (Xl = 1, and X2 is increased from 1 to 1.25, in terms of the nondimensional
areas defined in Example 6.1.2).
The stiffness matrix for the three-bar truss is easily verified to be

K = (E/l) [ 0.75A_A      0             ]  =  (Ep/lσ_0) [ 0.75x_1    0             ]
          [ 0            A_B + 0.25A_A ]               [ 0          x_2 + 0.25x_1 ] ,

so that

K_0 = (Ep/lσ_0) [ 0.75   0    ]     and     ΔK = (Ep/lσ_0) [ 0   0    ]
                [ 0      1.25 ] ,                          [ 0   0.25 ] .

Also, from Example 6.1.2 we have

u_0^T = (lσ_0/E) (4/3, 6.4) ,

so that

Δu_1 = −K_0^{−1} ΔK u_0 = (lσ_0/E) { 0, −1.28 }^T .
The change in the constraint is then

Δg = −(E/4lσ_0)(Δv − √3 Δu) .      (a)
Fast reanalysis techniques are also available for eigenvalue problems. At the nominal
design the vibration eigenproblem is

K_0 u_0 = μ_0 M_0 u_0 ,      (6.2.14)

where K_0 and M_0 are the stiffness and mass matrices, respectively, and μ_0 and u_0 are
the eigenvalue (square of frequency) and eigenvector (vibration mode), respectively,
all evaluated at a nominal design point x_0. When μ_0 is a nonrepeated eigenvalue,
the effect of perturbing the design can be easily estimated. Rewriting the eigenvalue
problem at x_0 + Δx we have

(K_0 + ΔK)(u_0 + Δu) = (μ_0 + Δμ)(M_0 + ΔM)(u_0 + Δu) .      (6.2.15)

We subtract Eq. (6.2.14) from Eq. (6.2.15) and neglect quadratic and cubic terms
in the perturbation such as ΔKΔu to get

K_0 Δu + ΔK u_0 = μ_0 M_0 Δu + μ_0 ΔM u_0 + Δμ M_0 u_0 .      (6.2.16)

Premultiplying by u_0^T and using Eq. (6.2.14) and the symmetry of K_0 and of M_0 we
get

Δμ = u_0^T (ΔK − μ_0 ΔM) u_0 / (u_0^T M_0 u_0) .      (6.2.17)
Alternatively, we can premultiply Eq. (6.2.15) by (u_0 + Δu)^T and neglect some higher
order terms in the perturbation to get

μ_0 + Δμ ≈ u_0^T (K_0 + ΔK) u_0 / u_0^T (M_0 + ΔM) u_0 .      (6.2.18)
Equations (6.2.17) and (6.2.18) have been obtained by neglecting quadratic and
cubic terms, and it can be shown that their errors (which are not the same) are
proportional to the square of the perturbation in the design ~x, or that they are first
order approximations.
Another first-order approximation was suggested by Pritchard and Adelman [28].
It is based on integrating the derivative of the eigenvalue μ with respect to a design
variable x. Equation (7.3.5) for the eigenvalue derivative may be written as

dμ/dx = a − μb ,      (6.2.19)

where

a = u^T (dK/dx) u / (u^T M u)   and   b = u^T (dM/dx) u / (u^T M u) .      (6.2.20)
Assuming that a and b do not change and b ≠ 0, we obtain the solution of the
differential equation as a function of the design variable x as

μ(x) = a/b + (μ_0 − a/b) e^{−b(x − x_0)} .      (6.2.21)

This time the error in Eq. (6.2.22) is proportional to ||Δx||^4, see Murthy and Haftka
[29], so that Eq. (6.2.22) is a third-order approximation.
Example 6.2.2

Consider the two-degree-of-freedom spring-mass system with stiffness and mass
matrices

K = k [  2   −1 ]          M = m [ 1   0 ]
      [ −1    1 ] ,              [ 0   1 ] .

The lowest eigenvalue and the corresponding eigenvector are

μ_0 = 0.382 k/m ,   u_0^T = (1, 1.618) .

For the perturbed system there is no change in the stiffness matrix, and

M + ΔM = m [ 2   0 ]          or   ΔM = [ m   0 ]
           [ 0   1 ] ,                  [ 0   0 ] .

From Eq. (6.2.17) we get

Δμ = −μ_0 u_0^T ΔM u_0 / (u_0^T M u_0) = −0.382(k/m) (m/3.618m) = −0.106 k/m ,

or

μ_0 + Δμ ≈ 0.276 k/m .

Similarly, from Eq. (6.2.18) we get

μ_0 + Δμ ≈ u_0^T K u_0 / u_0^T (M + ΔM) u_0 = 1.382k / 4.618m = 0.299 k/m .

We now consider the DEB approximation of Eq. (6.2.21) with x being the change
in the left mass, so that

b = u_0^T (dM/dx) u_0 / (u_0^T M u_0) = 1/(3.618 m) ,   and   a = 0 .

For the nominal design x = 0, and for the perturbed design x = m, so that

μ_0 + Δμ ≈ μ_0 e^{−bm} = 0.382 e^{−0.276} k/m = 0.290 k/m .   •••
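The three eigenvalue estimates of Eqs. (6.2.17), (6.2.18) and (6.2.21) can be reproduced, and compared with the exact perturbed eigenvalue, in a few lines. Sketch in plain Python with k = m = 1 (the exact value follows from det(K − μ(M + ΔM)) = 0, i.e. 2μ² − 4μ + 1 = 0):

```python
import math

# Example 6.2.2 with k = m = 1: first-order eigenvalue estimates vs. exact.
mu0 = (3.0 - math.sqrt(5.0)) / 2.0          # lowest eigenvalue, about 0.382
u = [1.0, (1.0 + math.sqrt(5.0)) / 2.0]     # eigenvector (1, 1.618)

K  = [[2.0, -1.0], [-1.0, 1.0]]
M  = [[1.0, 0.0], [0.0, 1.0]]
dM = [[1.0, 0.0], [0.0, 0.0]]               # left mass doubled

def quad(A, v):                              # v^T A v for a 2x2 matrix
    return sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

uMu = quad(M, u)
# Eq. (6.2.17), with dK = 0: dmu = -mu0 * u^T dM u / (u^T M u)
mu_17 = mu0 - mu0 * quad(dM, u) / uMu
# Eq. (6.2.18): Rayleigh quotient with the perturbed mass matrix
MdM = [[M[i][j] + dM[i][j] for j in range(2)] for i in range(2)]
mu_18 = quad(K, u) / quad(MdM, u)
# Eq. (6.2.21) with a = 0 and b = u^T (dM/dx) u / (u^T M u), x from 0 to m
b = quad(dM, u) / uMu
mu_21 = mu0 * math.exp(-b)

mu_exact = 1.0 - math.sqrt(2.0) / 2.0        # root of 2*mu**2 - 4*mu + 1 = 0
```

For this perturbation the exponential estimate of Eq. (6.2.21) comes closest to the exact value of about 0.293 k/m.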
Section 6.3: Sequential Linear Programming
compared to the computational cost associated with the optimization operations, such
as the calculation of search directions. This is a typical situation when we employ
a finite element model with thousands of degrees of freedom to analyze a structural
design which is defined in terms of a handful of design variables. It then pays to reduce
the number of exact structural analyses required for the design process by applying
optimization algorithms to a model of the structure based on approximations.
The simplest and most popular approximation approach is that of sequential
linear programming (SLP). Consider an optimization problem of the form
minimize f(x) ,

subject to g_j(x) \ge 0 , \quad j = 1, \ldots, n_g .   (6.3.1)
The SLP approach starts with a trial design x_0, and replaces the objective function
and constraints by linear approximations obtained from a Taylor series expansion
about x_0

minimize f(x_0) + \sum_{i=1}^{n} (x_i - x_{0i}) \left(\frac{\partial f}{\partial x_i}\right)_{x_0} ,

subject to g_j(x_0) + \sum_{i=1}^{n} (x_i - x_{0i}) \left(\frac{\partial g_j}{\partial x_i}\right)_{x_0} \ge 0 , \quad j = 1, \ldots, n_g ,   (6.3.2)

and x_{0i} - a_{li} \le x_i \le x_{0i} + a_{ui} , \quad i = 1, \ldots, n .

The last set of constraints are called move limits, with a_{li} and a_{ui} being the lower
and upper bounds, respectively, on the allowed change in x_i.
Because of the approximation involved, and the move limits, it is rare that the
final design of the linearized problem, x_L, is acceptably close to the optimum design.
However, if the move limits are small enough to guarantee a good approximation
within these move limits, x_L will be closer to the optimum than x_0. We can, therefore,
replace x_0 by x_L, and repeat the linear optimization with Eq. (6.3.1) linearized about
the new starting point. This process is repeated, so that we replace the original
optimization problem by a sequence of linear programming (LP) problems (hence
the name SLP). Each linear optimization is called an optimization cycle. The nature
of the linearization of a nonlinear problem and the application of move limits are
demonstrated in the following example.
Example 6.3.1

Consider the linearization of the problem

minimize f = -2x_1 - x_2 ,
subject to g_1 = 25 - x_1^2 - x_2^2 \ge 0 ,
           g_2 = 7 - x_1^2 + x_2^2 \ge 0 ,

about the initial design x_0 = (1, 1), where

g_1(x_0) = 25 - 1 - 1 = 23 , \qquad g_2(x_0) = 7 - 1 + 1 = 7 ,

(\nabla g_1)_{x_0} = \begin{Bmatrix} -2 \\ -2 \end{Bmatrix} , \qquad (\nabla g_2)_{x_0} = \begin{Bmatrix} -2 \\ 2 \end{Bmatrix} ,

with move limits of \pm 1 imposed on both variables.

Figure 6.3.1 Linear approximations, original constraints, and move limits for Example 6.3.1.
These linear approximations are shown in Figure (6.3.1) together with the original
constraints represented by the dashed lines. Also shown in the figure are the move
limits which form a rectangular boundary around the initial design point.
The solution of this new linear programming problem is x_L = (2.0, 2.0) with
an objective function of f = -6, which corresponds to a 100% improvement in the
objective function. If there were no move limits, the solution of the problem would
have been at x_L = (8.5, 5.0) and the resulting value of the objective function would
be f = -22 (see Figure 6.3.1).

Although without move limits we achieve a much larger gain in the objective
function, the exact constraints are violated substantially, as shown in Figure (6.3.1).
A procedure for evaluating the acceptability of constraint violations is discussed later
in this section. • • •
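The SLP cycle described above can be sketched as follows, with scipy.optimize.linprog as the LP solver; the two-variable toy problem, the fixed move limit of 1.0, and the six cycles are illustrative assumptions, not the book's example:

```python
import numpy as np
from scipy.optimize import linprog

def slp_cycle(x0, grad_f, gs, grad_gs, move):
    """One SLP cycle: linearize the objective and the constraints
    g_j(x) >= 0 about x0 and solve the LP within the move limits."""
    c = grad_f(x0)                        # linearized objective direction
    A_ub, b_ub = [], []
    for g, dg in zip(gs, grad_gs):
        dgx = dg(x0)
        A_ub.append(-dgx)                 # -dg.x <= g(x0) - dg.x0
        b_ub.append(g(x0) - dgx @ x0)
    bounds = [(xi - move, xi + move) for xi in x0]
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=bounds, method="highs")
    return res.x

# Toy problem: minimize f = -2*x1 - x2 subject to 25 - x1^2 - x2^2 >= 0
f = lambda x: -2.0 * x[0] - x[1]
gf = lambda x: np.array([-2.0, -1.0])
g = lambda x: 25.0 - x[0]**2 - x[1]**2
gg = lambda x: np.array([-2.0 * x[0], -2.0 * x[1]])

x = np.array([1.0, 1.0])
for _ in range(6):                        # move limits held fixed on purpose
    x = slp_cycle(x, gf, [g], [gg], 1.0)
print(x, f(x))
```

With the move limits held fixed the iterates hover and oscillate near the constrained optimum at (2\sqrt{5}, \sqrt{5}) without settling; shrinking the limits as the design approaches the optimum, as discussed below, is what produces convergence.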
SLP is attractive because reliable LP packages are readily available to most com-
puter users through system library packages, while reliable nonlinear programming
packages are not so readily available. However, the SLP strategy has several problems
associated with it. First, it greatly increases the computational cost associated with
optimization operations, because the optimization process is repeated several times
(typically five to forty times). Thus, this strategy is reasonable only when the cost
of these optimization computations is small compared to the cost of analysis plus
the cost of sensitivity derivatives. The efficiency of the LP package used for the SLP
approach can, therefore, become an important consideration.
Second, without a proper choice of move limits, the process may never converge.
In general, move limits should be gradually shrunk as the design approaches the
optimum. Part of the reason for the need to shrink the move limits is that the
accuracy of the approximation is required to be higher when we get close to the
optimum. When we are far from the optimum design, the gains that are made during
each cycle are large, and we can tolerate significant errors and still make progress
towards the optimum. When we get close to the optimum, the gains are small and
can be swamped by approximation errors. However, reduction of the move limits
early in the process may unnecessarily slow down the convergence too, especially if
the initial design is far from the actual optimum. The need to reduce move limits is
indicated when the final design of a cycle proves, upon exact analysis, to be inferior
to the initial design of that cycle (which is the final design of the previous cycle), or
provides no gain in the function f. The move limits are typically shrunk by ten to
fifty percent of their previous values until the improvement in the objective function
for a given set of move limits becomes smaller than a given tolerance. Popular choices
for starting values of the move limits are in the range of ten to thirty percent of the
design variables. However, this choice is reasonable only if a design variable is not
exceedingly small because it may be on its way to changing its sign. In such a case,
it may be reasonable for the move limits to be ten to thirty percent of a typical value
(as opposed to the instantaneous value) of that design variable.
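The shrinking strategy can be sketched as a one-line update rule; the 30% shrink factor is an assumption within the 10-50% range quoted above, and the helper name is hypothetical:

```python
def update_move_limits(move, f_prev, f_new, shrink=0.7):
    """Shrink all move limits by 30% when a cycle, checked by an exact
    analysis, fails to improve the objective; otherwise keep them."""
    if f_new >= f_prev:                  # no gain over the previous cycle
        return [a * shrink for a in move]
    return move
```

In a full implementation this test would also account for constraint violations, as in the Lagrangian comparison discussed below.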
A third difficulty associated with SLP arises occasionally when the starting design
is infeasible. The combined effects of approximation and move limits can then result
in a situation where the linearized optimization problem does not have a feasible
solution. That is, if the initial point of a problem is infeasible with respect to the
normalized constraints and the move limits are small, the region formed by the move
limits may remain entirely inside the infeasible linearized design space leading to an
infeasible problem. In this case it is advisable to relax the constraints during the first
Chapter 6: Aspects of The Optimization Process in Practice
few cycles. This can be done, for example, by replacing the optimization problem
Eq. (6.3.2) by
Example 6.3.2
Figure 6.3.2 Four-bar truss.
We consider the minimum weight design of the four-bar statically determinate truss
shown in Figure (6.3.2). In the interest of simplicity we assume members 1 through
3 to have the same area A_1 and member 4 an area A_2. Under the specified loading
the member forces and the vertical displacement at joint 2 can be easily verified to
be

f_1 = 5p , \quad f_2 = -p , \quad f_3 = 4p , \quad f_4 = -2\sqrt{3}\,p ,

\delta_2 = \frac{6pl}{E} \left( \frac{3}{A_1} + \frac{\sqrt{3}}{A_2} \right) .
Figure 6.3.3 Graphical solution of the linearized four-bar truss problem: objective function contours f* = 40, 44.784, and 60; the linearized constraint 18x_1 + 6\sqrt{3}\,x_2 = 3; and the lines x_1 = 0.05, x_1 = 0.1546, x_2 = 0.05, and x_2 = 0.1395.
where L is the Lagrangian function. This suggests the following procedure: If the
objective function and the most critical constraints both improve, always accept the
new design. If the objective function improves and the constraints deteriorate or vice
versa, compare the values of the Lagrangians. If the Lagrangian at the end of a cycle
is smaller than its value at the beginning of the cycle, then accept the new design. If,
on the other hand, the Lagrangian increases, modify the move limits. We recommend
using only critical and violated constraints in the Lagrangians.
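The acceptance test above can be sketched as follows, assuming Lagrange multiplier estimates lam are available from the approximate optimization (all names are hypothetical):

```python
def accept_design(f_old, g_old, f_new, g_new, lam):
    """Accept a cycle's design if the objective and the worst constraint
    both improve, or if the Lagrangian L = f - sum(lam_j * g_j), built
    from the critical and violated constraints only, decreases."""
    def lagrangian(f, g):
        return f - sum(l * gj for l, gj in zip(lam, g))
    if f_new <= f_old and min(g_new) >= min(g_old):
        return True                      # both improved: always accept
    return lagrangian(f_new, g_new) <= lagrangian(f_old, g_old)
```

Here min(g) serves as a proxy for the most critical constraint; when the test fails, the move limits are modified instead of accepting the new design.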
Example 6.3.3
8.0 \le y_1 \le 20 , \qquad 8.0 \le y_2 \le 20 ,

where the lower bounds for the variables are increased to 8.0 for convenience. An initial
guess of y_1 = 12 and y_2 = 8 results in f = 49.856.

Linearizing the problem with 30% move limits leads us to the problem
6.4 Sequential Nonlinear Approximate Optimization

We can generalize SLP by using nonlinear approximations for some of the constraints
and objective function. For the application of SLP we need to linearize even
simple nonlinear functions. With the more general procedure we approximate only
expensive-to-calculate functions using either linear or nonlinear (such as quadratic)
approximations. Inexpensive constraints need not be approximated at all. We start
by identifying those constraints (and possibly the objective function) which require
large computational resources for evaluation. These constraints are singled out for
approximation, while the cheaper constraints are evaluated exactly. Given a trial
solution x_0 to the structural design problem, we construct approximations to the
expensive constraints about x_0. As in the case of SLP, we need to augment the ap-
proximate problem with move limits to guard against large changes in design variables
that can result in poor approximations.
The solution of the approximate problem with the move limits, obtained by any
optimization procedure, is denoted as x_1. We perform a new exact structural analysis
at x_1, use it to construct new approximations to the expensive constraints, and
perform a new optimization of the approximate problem. That is, the original opti-
mization problem Eq. (6.3.1) is replaced by
minimize f_a(x) ,

subject to g_{aj}(x) \ge 0 , \quad j = 1, \ldots, n_g ,   (6.4.1)

and |x_j - x_j^{(i)}| \le a_j , \quad j = 1, \ldots, n ,

for i = 0, 1, 2, \ldots ,

where f_a and g_{aj} denote the approximate objective function and constraints, respec-
tively, x^{(i)} is the solution of the ith minimization, and a_j is a suitably chosen move
limit.
Because most of the cost of the optimization is associated with the exact analysis
and sensitivity calculations, it is often not important what optimization procedure
is used for obtaining the optimum of approximate problems. In general, it is more
important to emphasize reliability and robustness in the choice of the optimization
procedure rather than computational efficiency.
The following example demonstrates the use of sequential nonlinear approximate
optimization with the standard approximations discussed in section 6.1 as well as one
which was tailored more to the problem at hand.
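For reference, the reciprocal approximation of Section 6.1 used in the comparison below can be sketched as follows; the single-bar stress constraint is an illustrative assumption, chosen because it is exactly linear in the reciprocal variable:

```python
import numpy as np

def reciprocal_approx(g0, dg0, x0):
    """Reciprocal approximation about x0 (Section 6.1):
    g_R(x) = g(x0) + sum_i (x_i - x0_i)*(x0_i/x_i)*dg/dx_i|_0 ."""
    def g_R(x):
        return g0 + np.sum((x - x0) * (x0 / x) * dg0)
    return g_R

# Stress-type constraint for one bar of area A under load P (a toy model):
# g(A) = 1 - P/(A*sigma_a) is linear in 1/A, so the reciprocal
# approximation reproduces it exactly at any design.
P, sigma_a = 10.0, 25.0
g = lambda A: 1.0 - P / (A[0] * sigma_a)
dg = lambda A: np.array([P / (A[0]**2 * sigma_a)])

A0 = np.array([1.0])
g_R = reciprocal_approx(g(A0), dg(A0), A0)
A1 = np.array([2.0])
print(g(A1), g_R(A1))    # identical for this constraint
```

A linear approximation about A0 would give 1.0 at A1, overshooting the exact value of 0.8; this is why the reciprocal form is the natural choice for stress constraints in trusses.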
Example 6.4.1
The ten-bar truss shown in Figure (6.4.1) is a standard example used by many
authors. The minimum weight design obtained by changing the cross-sectional ar-
eas of the truss members is sought subject to stress constraints and minimum gage
constraints of 0.1 in². The maximum allowable stress in each member is the same
in tension and compression. This allowable is set to 25 ksi for all members except
member 9. For member 9 the stress allowable is 75 ksi. The density of the truss
material is 0.1 lb/in³.
Figure 6.4.1 Ten-bar truss.
The five generic local approximations described in section 6.1 were used here,
together with the linear force approximation proposed by Vanderplaats and coworkers
[e.g., 15]. Table 6.4.1 shows the initial and optimum designs and the stresses in the
optimum truss members.
Table 6.4.2 compares the convergence history of twelve cycles of approximate op-
timization using the six approximations. To compare the performance of the various
approximations in Table 6.4.2 a useful measure of performance is the number of cycles
required to get to within one percent of the optimum weight (that is to 1514 lb). The
linear, reciprocal-quadratic, and linear force approximations required six cycles, the
quadratic approximation seven, the reciprocal approximation ten, and the conserva-
tive approximation never made it. The difference between the linear and reciprocal
approximations turns out to be an idiosyncrasy of this problem. For many truss
Table 6.4.2 Convergence of optimum weight (lb) using different approximations
Cycle Linear Reciprocal Conservative Quadratic Recip-quadratic Linear force
1 1845 1774 2361 2002 1931 1891
2 1637 1673 1960 1741 1684 1688
3 1601 1593 1722 1650 1595 1589
4 1558 1566 1641 1586 1548 1549
5 1531 1548 1587 1547 1522 1526
6 1514 1537 1566 1525 1509 1511
7 1507 1528 1555 1514 1506 1504
8 1502 1522 1546 1507 1502 1501
9 1500 1518 1540 1503 1500 1500
10 1500 1511 1538 1501 1500 1499
11 1500 1511 1535 1500 1499 1499
12 1499 1508 1532 1499 1499 1499
problems the reciprocal approximation does better than the linear one. As a group,
the second order approximations are slightly better than the first order ones, but the
difference does not appear to be significant enough to justify the cost of computing
second derivatives (see Section 7.2.2 for discussion of the cost of calculating second
derivatives).
The dismal performance of the conservative approximation is explained by the
fact that it is typically less accurate than either the linear or reciprocal approxima-
tion. It is useful in situations where we need the conservativeness (such as when it is
employed with interior penalty function algorithms), or the convexity (such as with
dual algorithms, see Chapter 9). However, for sequential approximate optimization
it is of little use. Finally, the linear force approximation due to Vanderplaats is com-
parable in performance to the second-order approximations even though it employs
only first derivatives. This is due to the fact that it approximates a "more linear"
quantity than the stress. In using this approximation we approximate an interme-
diate quantity-the member force, and compute the stress exactly from the force.
Similar physical insight leading to identification of quantities that are approximately
linear can afford comparable gains in other problems. , . ,
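The idea behind the linear force approximation can be sketched on an assumed toy model, two parallel bars sharing a load P (not the book's truss), where the member force F1 = P*A1/(A1+A2) depends on the areas; the force is approximated linearly and the stress is then recovered exactly as F/A:

```python
import numpy as np

# Toy model: two parallel bars sharing load P.
# F1 = P*A1/(A1+A2), sigma1 = F1/A1 = P/(A1+A2).
P = 10.0
F1 = lambda A: P * A[0] / (A[0] + A[1])
s1 = lambda A: P / (A[0] + A[1])

A0 = np.array([1.0, 1.0])
dF = np.array([P * A0[1], -P * A0[0]]) / (A0[0] + A0[1])**2  # grad of F1

A = np.array([1.5, 1.0])
# Force approximated linearly, stress computed exactly from the force
sigma_force = (F1(A0) + dF @ (A - A0)) / A[0]
# Direct linear approximation of the stress itself, for comparison
sigma_linear = s1(A0) + (np.array([-P, -P]) / (A0[0] + A0[1])**2) @ (A - A0)
print(sigma_force, sigma_linear, s1(A))
```

Here the exact stress is 4.0; the force-based estimate errs by about +0.17 while the direct linear stress approximation errs by -0.25, the same qualitative advantage seen in Table 6.4.2.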
6.5 Special Problems Associated with Shape Optimization

The term shape optimization is employed here in a very broad sense. In terms
of a finite element model we consider as shape optimization any problem where we
need to change the position of the nodes of the finite-element model or the element
connectivity (e.g., remove elements). Shape optimization problems are contrasted
with sizing optimization problems where we change only element stiffness properties,
such as bar cross-sectional areas or plate thicknesses. The term shape optimization
is often used in a narrow sense referring only to the optimal design of the shape
of the boundary of two- and three-dimensional structural components. The broad
usage includes also geometrical optimization of skeletal structures, and topological
optimization which decides the connectivity of the structure (for example, which
nodes are connected by elements).
Shape optimization problems are typically more difficult to tackle than sizing opti-
mization problems. Consider first the optimization of the boundary shape of two- and
three-dimensional bodies. The calculation of sensitivity derivatives for these shape
optimization problems is associated with accuracy problems discussed in Chapters 7
and 8. Another serious problem is mesh deformation. As the shape of the structure
changes we need to change the finite-element mesh. Simple remeshing rules that
translate node positions as the boundary changes usually lead to highly deformed
finite elements and concomitant loss of accuracy. This problem can be addressed by
manually remeshing during the optimization process (which is time consuming), or
employing sophisticated mesh generators. Work in shape optimization has indeed
spurred the development and usage of such mesh generators (e.g., [30,31]).
Another problem associated with boundary shape optimization is that of the
existence or creation of internal boundaries or holes. In many problems the optimal
design will have internal cavities. It is impossible to generate these cavities with a
standard optimization approach without prior knowledge of their existence. That
is, an optimization procedure can easily find for us the optimum shape of a cavity
once we assume there is one, but it cannot tell us that there should be one, two,
or three cavities. One approach for dealing with this problem is to assume that the
material is not homogeneous, but instead has an underlying microstructure. This
underlying microstructure can be of fibers and matrix composite material. However,
typically the assumed microstructure is more general than that of the fiber and matrix
components of a laminated plate, and includes microcavities in the material. This
type of microstructure was devised so as to probe the theoretical limits of strength
and stiffness that can be attained by a structure (see, e.g., Kohn and Strang [32],
or Rozvany et al. [33]). Bendsøe and Kikuchi [34] showed that it can be used to
determine the need for introducing cavities into the structure. Figure 6.5.1 shows
the type of structure obtained by Bendsøe and Kikuchi by permitting microcavities.
The structure under consideration is a bar in tension where the cross sections at the
two ends are given (solid areas in figure), and the cross section on the left is larger
than that on the right. The objective is to maximize the stiffness of the bar for a
given volume. The result shown in the figure, while not practical in itself, permits us
to identify regions where cavities exist. Standard optimization techniques can then
be used to find the optimal shape of these cavities.
An example of the application of this technique was reported by Rasmussen
[35] for the design of a floor beam in a civil transport aircraft. Figure 6.5.2 shows
the topology that was assumed by the designers and the topology identified by the
homogenization approach, which led to a substantially lighter design.
The problem of finding the cavities in two- and three-dimensional bodies belongs
to the realm of topological optimization. Topological optimization is a difficult prob-
lem which has received more attention in applications to skeletal structures such as
trusses and frames. There the optimum topology is typically defined by decisions as
to which joints are connected to each other by members. The basic approach followed
by most researchers is to create a ground structure where every joint is connected to
Figure 6.5.2 Shape design of floor beam for a civil transport aircraft: initial and final
geometries.
every other joint. If the design problem is minimum weight with constraints on the
plastic collapse load, then as shown in Chapter 3, the optimization problem is linear,
and the simplex method may be used to find the optimum design. The algorithm
also automatically removes all unnecessary members. This approach was first taken
by Dorn and co-workers [36].
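For plastic design the ground-structure approach indeed reduces to a linear program. A sketch, on an assumed three-bar ground structure at a single loaded node, with scipy.optimize.linprog standing in for the simplex solver: the variables are member areas A_i and forces q_i, minimizing total volume subject to nodal equilibrium and the yield condition |q_i| <= sigma*A_i.

```python
import numpy as np
from scipy.optimize import linprog

sigma = 1.0
# Unit vectors from the loaded node toward three candidate supports
e = np.array([[0.0, 1.0], [1/np.sqrt(2), 1/np.sqrt(2)], [-1.0, 0.0]]).T
l = np.array([1.0, np.sqrt(2), 1.0])     # member lengths
load = np.array([0.0, 1.0])              # equilibrium: e @ q = load

n = 3
c = np.concatenate([l, np.zeros(n)])     # minimize l.A; forces are free
A_eq = np.hstack([np.zeros((2, n)), e])  # equilibrium rows
# Yield: q - sigma*A <= 0 and -q - sigma*A <= 0
A_ub = np.vstack([np.hstack([-sigma*np.eye(n),  np.eye(n)]),
                  np.hstack([-sigma*np.eye(n), -np.eye(n)])])
b_ub = np.zeros(2*n)
bounds = [(0, None)]*n + [(None, None)]*n
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=load,
              bounds=bounds, method="highs")
areas = res.x[:n]
print(areas)        # unnecessary members receive zero area
```

The solver keeps only the vertical bar (area 1.0) and assigns zero area to the other two candidates, which is the automatic member removal described above.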
to the removal of members. This problem may be overcome by using simultaneous-
analysis-and-design techniques which do not require the inversion or factorization of
the stiffness matrix (see Section 10.6). The reader is referred to two survey papers by
Topping [39], and Kirsch [40J for additional information on topological optimization.
Geometrical optimization of skeletal structures refers to the search for the opti-
mum locations of the joints of the structures. The problem can be solved by standard
techniques, but there are often numerical advantages to treating the geometry vari-
ables differently from the sizing variables and employing a two-level optimization
approach. This topic is discussed in Chapter 10 in Section 10.5.
6.6 Optimization Packages

During the first few years of the development of structural optimization most an-
alysts developed special purpose finite-element programs with built-in optimization
procedures for their own use. When these programs were used by other analysts they
found them to be insufficiently documented and difficult to modify. In recent years it
has become more common to employ general purpose constrained optimization pack-
ages and interface them with general purpose structural analysis codes. Additionally,
the growing popularity of structural optimization as a tool for industrial applications
is generating demand for the introduction of optimization capabilities into general-
purpose analysis packages. The purpose of this section is to provide the reader with
a brief description of some of the more popular packages.
First consider integrated packages which combine structural analysis and opti-
mization procedures. One of the more popular programs of this class was the TSO
program (originally called WASP [41,42]) developed for the preliminary design of
aircraft wing and tail structures subject to aeroelastic constraints. The program
models the wing or tail structure as an orthotropic plate and employs simplified
plate analysis rather than a finite element model. Design variables are coefficients
of polynomials that describe the thickness distribution and ply orientations over the
surface. The optimization procedure is based on an interior penalty function formu-
lation (see Chapter 5). The program has been used extensively for design studies and
for some actual aircraft design problems (see [43]).
Many integrated structural optimization packages are based on special purpose
finite element programs. One of the better known is the ACCESS program developed
by Schmit and co-workers [44,45]. Other programs of this type include FASTOP
[46], OPSTAT [47], OPTCOMP [48], OPTIMUM [49], ASOP [50], STARS [51] and
DESAP [52].
Because of the lack of generality associated with special purpose finite-element
programs, there has been interest in structural optimization packages built around
a general purpose finite element program. Two early examples of this type are
PARS [53] and PROSSS [54]. These programs are based on the SPAR finite-element
package and its commercial derivative EAL. However, because the optimization soft-
ware was not supported by the developer of the finite-element package, the use of
PARS and PROSSS has been limited. The EAL program, however, lends itself to in-
terfacing with other programs, and has been used with optimization software; Walsh
[55] reports on the use of EAL together with the CONMIN [56] program.
Other finite-element programs have also been recently used to form structural op-
timization packages. The OPTSYS package [57] is based on the ASKA and ABAQUS
finite-element programs, the ASTROS system [58] evolved from the public domain
version of NASTRAN, and the NISAOPT package (including the programs SHAPE
[59] and STROPT [60]) is based on NISA II.
Until the day when optimization becomes a standard feature of such analysis
packages, and probably even after, there will be a continuing demand for
general purpose optimization software that can be coupled to structural analysis pro-
grams. Most finite-element packages lend themselves to the calculation of sensitivities
via finite-differences (see Section 7.1), so that the analyst can construct constraint
approximations based on these derivatives (see Section 6.1) and use the optimization
package on this approximation in a sequential-approximate-optimization mode. The
most commonly available general-purpose optimization packages are linear program-
ming (LP) solvers. These are usually available at most computer centers as part of
IMSL or similar subroutine libraries. While in some cases there are advantages to
using more general optimization algorithms, LP packages seem to work well in the
majority of applications.
At the other extreme of generality we find the ADS [64], DOT [65] and DOC [66]
packages from VMA Engineering which allow the user a wide menu of optimization
algorithms and strategies. These programs evolved from the very popular CONMIN
[56] package which was used extensively for structural optimization. DOT (Design
Optimization Tool) is a collection of Fortran subroutines for optimization, and DOC
(Design Optimization Control) is a control program that simplifies the use of opti-
mization (calling DOT subroutines). Another general-purpose optimization package,
commonly used in structural optimization, is NEWSUMT [67] developed by Miura
and Schmit which is based on a penalty function procedure (see Chapter 5), and
an updated version of the program NEWSUMT-A which incorporates constraint ap-
proximations [68]. Other packages of this type include OPT based on the reduced
gradient algorithm (see Chapter 5), and IDESIGN [69] based on sequential quadratic
programming (see Chapter 5). There are also several packages available from math-
ematical programming specialists. However, these programs do not enjoy as much
popularity in structural optimization applications as the aforementioned programs
which were developed by engineers.
6.7 Test Problems
Standard test problems are useful for the purpose of checking optimization algo-
rithms and software. The three test problems given in this section have been widely
used for this purpose.
6.7.1 Ten-bar Truss
The ten bar truss shown in Figure 6.4.1 is a classical example used to show the dif-
ference between a fully stressed design (FSD) and an optimum design. The material
properties and the minimum area are given in Table 6.7.1. When the truss is de-
signed subject to stress constraints only, the optimum and FSD designs are identical.
However, when the stress allowable for member 9 is increased above 37,500 psi the
optimum design and the FSD design are different. The three designs are given in
Table 6.7.2. The truss has also been optimized with displacement constraints (Table
6.7.3) and the final design is given in Table 6.7.4. For additional information, see
Ref. [70].
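The FSD mentioned above is obtained by stress-ratio resizing, A_i <- A_i|sigma_i|/sigma_allow. A sketch on an assumed toy problem (not the ten-bar truss): two parallel bars of lengths 1 and 2 forced to the same end displacement, so the stresses depend on the areas and member 2 can never reach the allowable:

```python
import numpy as np

P, s_allow, A_min = 10.0, 1.0, 0.1

def stresses(A):
    # Equal end displacement: s_i = P/(l_i * sum(A_j/l_j)), with l = (1, 2)
    denom = A[0] / 1.0 + A[1] / 2.0
    return np.array([P / (1.0 * denom), P / (2.0 * denom)])

A = np.array([1.0, 1.0])
for _ in range(30):   # stress-ratio resizing with a minimum-gage clamp
    A = np.maximum(A_min, A * np.abs(stresses(A)) / s_allow)
print(A)              # approaches (9.95, 0.1)
```

Member 2 is driven to the minimum gage while member 1 converges until it carries essentially the whole load at the allowable stress, illustrating how FSD handles members that cannot be fully stressed.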
Table 6.7.1 Data for ten bar truss
Material: aluminum          Specific mass: 0.1 lbm/in³
Young's modulus: 10⁷ psi    Allowable stress: ±25,000 psi
Minimum area: 0.1 in²
Table 6.7.2 Final designs for ten bar truss with stress constraints only
                                Increased allowable, member 9
Member   FSD and optimum        FSD            Optimum design
         areas (in²)            areas (in²)    areas (in²)
1 7.94 4.11 7.90
2 0.10 3.89 0.10
3 8.06 11.89 8.10
4 3.94 0.11 3.90
5 0.10 0.10 0.10
6 0.10 3.89 0.10
7 5.74 11.16 5.80
8 5.57 0.15 5.51
9 5.57 0.10 3.68
10 0.10 5.51 0.14
Mass (lbm)   1593.2   1725.2   1497.6
Table 6.7.4 Optimum designs for ten bar truss with displacement constraints
Cross-sectional areas (in2 )
Member Case A Case B Member Case A Case B
1 22.66 30.52 6 0.10 0.55
2 1.40 0.10 7 12.69 7.46
3 21.58 23.20 8 14.54 21.04
4 8.43 15.22 9 11.93 21.53
5 0.10 0.10 10 1.98 0.10
Mass (lbm)   4048.96   5060.85
6.7.2 Twenty-five-bar Truss

The twenty-five-bar truss is shown in Figure 6.7.1. The loading, material properties
and allowables are shown in Tables 6.7.5, 6.7.6, 6.7.7, and 6.7.8, and the final design
is shown in Table 6.7.9. For additional details see Ref. [70].
Figure 6.7.1 Twenty-five-bar truss.
Table 6.7.6 Allowable stresses for twenty-five-bar truss (psi)
Members Tension Compression Members Tension Compression
1 40000 -35092 12,13 40000 -35092
2-5 40000 -11590 14-17 40000 -6759
6-9 40000 -17305 18-21 40000 -6959
10,11 40000 -35092 22-25 40000 -11082
6.7.3 Seventy-two-bar Truss

The seventy-two-bar truss is shown in Figure 6.7.2. The loadings, material properties
and allowables are shown in Tables 6.7.10, 6.7.11, and 6.7.12, and the optimum design
is shown in Table 6.7.13. For additional details see Ref. [70].
Figure 6.7.2 Seventy-two-bar truss. Note: For the sake of clarity, not all elements are drawn in this figure.
6.8 Exercises
1. Show that the conservative approximation, Eq. (6.1.9) is concave, and the ap-
proximation of Eq. (6.1.11) is convex as long as the design variables do not change
their sign.
2. Derive Eq. (6.1.14).
3. Add to Table 6.1.1 a column representing an approximation to the constraint based
on a linear approximation of the force in member C (This linear-force approximation
is due to Vanderplaats and coworkers [15-17]).
4. The three-bar truss in Figure 6.8.1 has members with equal cross-sectional areas.
Calculate the five approximations discussed in Section 6.1 as well as the linear-force
approximation discussed in the previous problem for the stress in member A. Compare
the accuracy and conservativeness of the approximations for changes of ±25% in the
area of member C.
5. Obtain a good approximation to the stress in member A in the previous problem
in terms of the two angles of the truss.
6. The beam in Figure 6.1.1 has a mass density \rho, and cross-sectional area proportional
to the square root of the moment of inertia, A = \alpha\sqrt{I}. Use the global-local
approximation to obtain the lowest vibration frequency as I_2/I_1 is varied from 1 to
2. Use a two-element model as the exact solution, and a one-element model as a global
approximation. Note that this requires you to derive the stiffness matrix of a beam
with a variable cross section.
7. Prove Eq. (6.2.10).
8. Repeat Example 6.2.1 doubling the left spring constant instead of the mass.

9. Use sequential linear programming to design the three-bar truss of Figure 6.1.2
subject to a yield stress constraint of \sigma_0 and a minimum gage constraint on all
members of 0.1p/\sigma_0.

10. Repeat the previous problem with the reciprocal approximation.
6.9 References
[1] Schmit, L.A. Jr., and Farshi, B., "Some Approximation Concepts for Structural
Synthesis," AIAA Journal, 12, 5, 692-699, 1974.
[2] Mills-Curran, W.C., Lust, R.V., and Schmit, L.A. Jr., "Approximation Methods
for Space Frame Synthesis," AIAA Journal, 21 (11), 1571-1580, 1983.

[3] Storaasli, O.O., and Sobieszczanski, J., "On the Accuracy of the Taylor Approx-
imation for Structure Resizing," AIAA Journal, 12 (2), 231-233, 1974.

[4] Noor, A.K., and Lowder, H.E., "Structural Reanalysis via a Mixed Method,"
Computers and Structures, 5, 9-12, 1975.

[5] Fuchs, M.B., "Linearized Homogeneous Constraints in Structural Design," Int.
J. Mech. Sci., 22, pp. 33-40, 1980.

[6] Fuchs, M.B., and Haj Ali, R.M., "A Family of Homogeneous Analysis Models
for the Design of Scalable Structures," Structural Optimization, 2, pp. 143-152,
1990.

[7] Starnes, J.H. Jr., and Haftka, R.T., "Preliminary Design of Composite Wings for
Buckling, Stress and Displacement Constraints," Journal of Aircraft, 16, 564-570,
1979.

[8] Haftka, R.T., and Shore, C.P., "Approximate Methods for Combined Thermal-
Structural Analysis," NASA TP-1428, 1979.

[9] Prasad, B., "Explicit Constraint Approximation Forms in Structural Optimiza-
tion-Part 1: Analyses and Projections," Computer Methods in Applied Mechan-
ics and Engineering, 40 (1), 1-26, 1983.

[10] Braibant, V., and Fleury, C., "An Approximation Concept Approach to Shape
Optimal Design," Computer Methods in Applied Mechanics and Engineering, 53,
pp. 119-148, 1985.

[11] Prasad, B., "Novel Concepts for Constraint Treatments and Approximations in
Efficient Structural Synthesis," AIAA J., 22, 7, pp. 957-966, 1984.

[12] Woo, T.H., "Space Frame Optimization Subject to Frequency Constraints,"
AIAA J., 25, 10, pp. 1396-1404, 1987.

[13] Schmit, L.A., Jr., and Miura, H., "Approximation Concepts for Efficient Struc-
tural Synthesis," NASA CR-2552, 1976.

[14] Lust, R.V., and Schmit, L.A., Jr., "Alternative Approximation Concepts for Space
Frame Synthesis," AIAA J., 24, 10, pp. 1676-1684, 1986.

[15] Salajegheh, E., and Vanderplaats, G.N., "An Efficient Approximation Method for
Structural Synthesis with Reference to Space Structures," Space Struct. J., 2, pp.
165-175, 1986/7.

[16] Kodiyalam, S., and Vanderplaats, G.N., "Shape Optimization of 3D Continuum
Structures Via Force Approximation Technique," AIAA J., 27 (9), pp. 1256-1263,
1989.

[17] Hansen, S.R., and Vanderplaats, G.N., "Approximation Method for Configuration
Optimization of Trusses," AIAA J., 28 (1), pp. 161-168, 1990.

[18] Box, G.E.P., and Draper, N.R., Empirical Model-Building and Response Surfaces,
Wiley, New York, 1987.

[19] Barthelemy, J.-F., and Haftka, R.T., "Recent Advances in Approximation Con-
cepts for Optimum Structural Design," NASA TM 104032, 1991.
250
Section 6.9: References
[20] Haftka, R.T., Nachlas, J.A., Watson, L.T., Rizzo, T., and Desai, R., "Two-Point Constraint Approximation in Structural Optimization," Computer Methods in Applied Mechanics and Engineering, 60, pp. 289-301, 1989.
[21] Fadel, G.M., Riley, M.F., and Barthelemy, J.-F.M., "Two Point Exponential Approximation Method for Structural Optimization," Structural Optimization, 2, pp. 117-124, 1990.
[22] Haftka, R.T., "Combining Local and Global Approximations," AIAA Journal, 29 (9), pp. 1523-1525, 1991.
[23] Chang, K.-J., Haftka, R.T., Giles, G.L., and Kao, P.-J., "Sensitivity Based Scaling for Correlating Structural Response from Different Analytical Models," AIAA Paper 91-0925, Proceedings of the AIAA/ASME/ASCE/AHS/ASC 32nd Structures, Structural Dynamics and Materials Conference, Baltimore, MD, April 8-10, 1991.
[24] Kirsch, U., and Taye, S., "High Quality Approximations of Forces for Optimum Structural Design," Computers and Structures, 30 (3), pp. 519-527, 1988.
[25] Haley, S.B., "Solution of Modified Matrix Equations," SIAM J. Numer. Anal., 24 (4), pp. 946-951, 1987.
[26] Fuchs, M.B., and Steinberg, Y., "An Efficient Approximate Analysis Method Based on an Exact Univariate Model for the Element Loads," Structural Optimization, 3 (1), 1991.
[27] Holnicki-Szulc, J., Virtual Distortion Method, Springer-Verlag, Berlin, pp. 30-40, 1991.
[28] Pritchard, J.I., and Adelman, H.M., "Differential Equation Based Method for Accurate Approximation in Optimization," Proceedings of the AIAA/ASME/ASCE/AHS/ASC 31st Structures, Structural Dynamics and Materials Conference, Long Beach, CA, April 2-4, 1990, Part I, pp. 414-424.
[29] Murthy, D.V., and Haftka, R.T., "Approximations to Eigenvalues of Modified General Matrices," Computers and Structures, 29, pp. 903-917, 1988.
[30] Shephard, M.S., and Yerry, M.A., "Automatic Finite Element Modeling for Use with Three-Dimensional Shape Optimization," in The Optimum Shape (Bennett, J.A., and Botkin, M.E., eds.), Plenum Press, N.Y., 1986, pp. 113-135.
[31] Yang, R.J., and Botkin, M.E., "A Modular Approach for Three-Dimensional Shape Optimization of Structures," AIAA J., 25 (3), pp. 492-497, 1987.
[32] Kohn, R.V., and Strang, G., "Optimal Design and Relaxation of Variational Problems," Comm. Pure Appl. Math., 39, pp. 113-137 (Part I), pp. 139-182 (Part II), and pp. 353-377 (Part III), 1986.
[33] Rozvany, G.I.N., Ong, T.G., Szeto, W.T., Olhoff, N., and Bendsøe, M.P., "Least-Weight Design of Perforated Plates," Int. J. Solids Struct., 23, pp. 521-536 (Part I), and pp. 537-550 (Part II), 1987.
Chapter 6: Aspects of The Optimization Process in Practice
[34] Bendsøe, M.P., and Kikuchi, N., "Generating Optimal Topologies in Structural Design using a Homogenization Method," Comp. Meth. Appl. Mech. Engng., 71, pp. 197-224, 1988.
[35] Rasmussen, J., "Shape Optimization and CAD," SARA, 1, pp. 33-45, 1991.
[36] Dorn, W.S., Gomory, R.E., and Greenberg, H.J., "Automatic Design of Optimal Structures," J. Mecanique, 3, pp. 25-52, 1964.
[37] Sheu, C.Y., and Schmit, L.A., "Minimum Weight Design of Elastic Redundant Trusses under Multiple Static Loading Conditions," AIAA J., 10 (2), pp. 155-162, 1972.
[38] Reinschmidt, K.F., and Russel, A.D., "Applications of Linear Programming in Structural Layout and Optimization," Comput. Struct., 4, pp. 855-869, 1974.
[39] Topping, B.H.V., "Shape Optimization of Skeletal Structures-a Review," ASCE J. Struct. Enging., 109 (8), pp. 1933-1951, 1983.
[40] Kirsch, U., "Optimal Topologies of Structures," Appl. Mech. Rev., 42 (8), pp. 223-239, 1989.
[41] McCullers, L.A., and Lynch, R.W., "Composite Wing Design for Aeroelastic Tailoring Requirements," Air Force Conference on Fibrous Composites in Flight Vehicle Design, Dayton, Ohio, September, 1972.
[42] McCullers, L.A., and Lynch, R.W., "Dynamic Characteristics of Advanced Filamentary Composites Structures," AFFDL-TR-73-111, Vol. II, 1974.
[43] Haftka, R.T., "Structural Optimization with Aeroelastic Constraints-A Survey of US Applications," Int. J. Vehicle Design, 7, pp. 381-392, 1986.
[44] Schmit, L.A., and Miura, H., "A New Structural Analysis/Synthesis Capability - ACCESS 1," AIAA J., 14 (5), pp. 661-671, 1976.
[45] Fleury, C., and Schmit, L.A., "ACCESS 3-Approximation Concepts Code for Efficient Structural Synthesis-User's Guide," NASA CR-159260, September 1980.
[46] Wilkinson, K., et al., "An Automated Procedure for Flutter and Strength Analysis and Optimization of Aerospace Vehicles, Vol. I-Theory, Vol. II-Program User's Manual," AFFDL-TR-75-137, 1975.
[47] Venkayya, V.B., and Tischler, V.A., "OPSTAT-A Computer Program for Optimal Design of Structures Subjected to Static Loads," AFFDL-TR-79-67, 1979.
[48] Khot, N.S., "Computer Program (OPTCOMP) for Optimization of Composite Structures for Minimum Weight Design," AFFDL-TR-76-149, 1977.
[49] Gellatly, R.A., Dupree, D.M., and Berke, L., "OPTIMUM II: A MAGIC Compatible Large Scale Automated Minimum Weight Design Program," AFFDL-TR-74-97, Vols. I and II, 1974.
[50] Isakson, G., and Pardo, H., "ASOP-3: A Program for the Minimum Weight Design of Structures Subjected to Strength and Deflection Constraints," AFFDL-TR-76-157, 1976.
[51] Bartholomew, P., and Wellen, H.K., "Computer Aided Optimization of Aircraft Structures," J. Aircraft, 27 (12), pp. 1079-1086, 1990.
[52] Kiusalaas, J., and Reddy, G.B., "DESAP 2-A Structural Design Program with Stress and Buckling Constraints," NASA CR-2797 to 2799, 1977.
[53] Haftka, R.T., and Prasad, B., "Programs for Analysis and Resizing of Complex Structures," Comput. Struct., 10, pp. 323-330, 1979.
[54] Sobieszczanski-Sobieski, J., and Rogers, J.L., Jr., "A Programming System for Research and Applications in Structural Optimization," Int. Symposium on Optimum Structural Design, Tucson, Arizona, pp. 11-9-11-21, 1981.
[55] Walsh, J.L., "Application of Mathematical Optimization Procedures to a Structural Model of a Large Finite-Element Wing," NASA TM-87597, 1986.
[56] Vanderplaats, G.N., "CONMIN - A Fortran Program for Constrained Function Minimization: User's Manual," NASA TM X-62282, 1973.
[57] Brama, T., "Applications of Structural Optimization Software in the Design Process," in Computer Aided Optimum Design of Structures: Applications (Eds. C.A. Brebbia and S. Hernandez), Computational Mechanics Publications, Springer-Verlag, 1989, pp. 13-21.
[58] Neill, D.J., Johnson, E.H., and Canfield, R., "ASTROS-A Multidisciplinary Automated Structural Design Tool," J. Aircraft, 27 (12), pp. 1021-1027, 1990.
[59] Atrek, E., "SHAPE: A Program for Shape Optimization of Continuum Structures," in Computer Aided Optimum Design of Structures: Applications (Eds. C.A. Brebbia and S. Hernandez), Computational Mechanics Publications, Springer-Verlag, 1989, pp. 135-144.
[60] Hariran, M., Paeng, J.K., and Belsare, S., "STROPT-the Structural Optimization System," Proceedings of the 7th International Conference on Vehicle Structural Mechanics, Detroit, MI, April 11-13, 1988, SAE, pp. 27-38.
[61] Vanderplaats, G.N., Miura, H., Nagendra, G., and Wallerstein, D., "Optimization of Large Scale Structures using MSC/NASTRAN," in Computer Aided Optimum Design of Structures: Applications (Eds. C.A. Brebbia and S. Hernandez), Computational Mechanics Publications, Springer-Verlag, 1989, pp. 51-68.
[62] Ward, P., and Cobb, W.G.C., "Application of I-DEAS Optimization for the Static and Dynamic Optimization of Engineering Structures," in Computer Aided Optimum Design of Structures: Applications (Eds. C.A. Brebbia and S. Hernandez), Computational Mechanics Publications, Springer-Verlag, 1989, pp. 33-50.
[63] GENESIS User's Manual (version 1.00), VMA Engineering, Goleta, California, September, 1991.
[64] Vanderplaats, G.N., "ADS: A FORTRAN Program for Automated Design Synthesis," VMA Engineering, Inc., Goleta, California, May 1985.
[65] DOT User's Manual (version 2.0B), VMA Engineering, Inc., Goleta, California, Sept. 1990.
[66] DOC User's Manual (version 1.00), VMA Engineering, Inc., Goleta, California, March 1991.
[67] Miura, H., and Schmit, L.A., Jr., "NEWSUMT-A Fortran Program for Inequality Constrained Function Minimization-User's Guide," NASA CR-159070, June, 1979.
[68] Grandhi, R.V., Thareja, R., and Haftka, R.T., "NEWSUMT-A: A General Purpose Program for Constrained Optimization Using Constraint Approximations," ASME Journal of Mechanisms, Transmissions and Automation in Design, 107, pp. 94-99, 1985.
[69] Arora, J.S., and Tseng, C.H., "User Manual for IDESIGN: Version 3.5," Optimal Design Laboratory, College of Engineering, The University of Iowa, Iowa City, 1987.
[70] Fleury, C., and Schmit, L.A., Jr., "Dual Methods and Approximation Concepts in Structural Synthesis," NASA CR-3226, December, 1980.
Chapter 7: Sensitivity of Discrete Systems
The first step in the analysis of a complex structure is spatial discretization of the
continuum equations into a finite element, finite difference or a similar model. The
analysis problem then requires the solution of algebraic equations (static response),
algebraic eigenvalue problems (buckling or vibration) or ordinary differential equa-
tions (transient response). The sensitivity calculation is then equivalent to the math-
ematical problem of obtaining the derivatives of the solutions of those equations with
respect to their coefficients. This is the main subject of the present chapter.
In some cases it is advantageous to differentiate the continuum equations govern-
ing the structure with respect to design variables before the process of discretization.
One advantage is that the resulting sensitivity equations are equally applicable to
various analysis techniques, whether finite element, Ritz solution, collocation, etc.
This approach is discussed in the next chapter.
As noted in chapter 6, the calculation of the sensitivity of structural response to
changes in design variables is often the major computational cost of the optimization
process. Therefore, it is important to have efficient algorithms for evaluating these
sensitivity derivatives.
The sensitivity of structural response to problem parameters also has other ap-
plications. For example, it is usually impossible to know all the parameters of a
structural model, such as material properties, loads and dimensions exactly. The
sensitivity of the response to small variations in these parameters is essential for
calculating the statistical variation in the response of the structure.
The simplest technique for calculating derivatives of response with respect to a
design variable is the finite-difference approximation. This technique is often com-
putationally expensive, but is easy to implement and very popular. The efficiency of
the analytical methods discussed in the present chapter is measured by comparison
to the finite-difference alternative. Unfortunately, finite-difference approximations
often have accuracy problems. We begin this chapter with a discussion of these
approximations to sensitivity derivatives.
7.1 Finite Difference Approximations
The simplest finite-difference approximation is the first-order forward-difference approximation

du/dx ≈ Δu/Δx = [u(x + Δx) − u(x)]/Δx . (7.1.1)

A more accurate alternative is the central-difference approximation

du/dx ≈ [u(x + Δx) − u(x − Δx)]/(2Δx) . (7.1.2)

The truncation error of these approximations is the error due to neglecting the higher-order terms in the Taylor-series expansion

u(x + Δx) = u(x) + Δx(du/dx)(x) + [(Δx)²/2](d²u/dx²)(x + ζΔx) ,  0 ≤ ζ ≤ 1 . (7.1.3)
From Eq. (7.1.3) it follows that the truncation error for the forward-difference ap-
proximation is
e_T = (Δx/2)(d²u/dx²)(x + ζΔx) ,  0 ≤ ζ ≤ 1 . (7.1.4)
Similarly, by including one more term in the Taylor series expansion we find that the
truncation error for the central difference approximation is
e_T = [(Δx)²/6](d³u/dx³)(x + ζΔx) ,  −1 ≤ ζ ≤ 1 . (7.1.5)
The condition error is the difference between the numerical evaluation of the function and its exact value. One contribution to the condition error is round-off error in the evaluation of u. If ε_u is a bound on the absolute error in the computed values of u, the corresponding bound on the condition error of the forward-difference approximation of Eq. (7.1.1) is

e_C ≤ 2ε_u/Δx . (7.1.6)
Equations (7.1.4) and (7.1.6) present us with the so called "step-size dilemma." If we
select the step size to be small, so as to reduce the truncation error, we may have an
excessive condition error. In some cases there may not be any step size which yields
an acceptable error!
Example 7.1.1
Suppose the function u(x) is defined as the solution of the following two equations
101u+xv=10,
xu + 100v = 10,
and let us consider the derivative du/ dx evaluated at x = 100.
Figure 7.1.1 Effect of step size on derivative.
At x = 100 the solution is u = 0, v = 0.1, and the exact value of du/dx is −0.10. The forward-difference and central-difference derivatives are plotted in Figure 7.1.1 for a range of step sizes. Note that for the very small step sizes the error oscillates because the condition error is not a continuous function. For the higher step sizes the total error is dominated by the truncation error, which is a smooth function of the step size. We can change the problem slightly to make it more ill-conditioned, and increase the condition error, as follows
10001u + xv = 1000,
xu + 10000v = 1000 .
The values of the forward- and central-difference approximations at x = 10000 are
shown in Figure 7.1.2. Now the range of acceptable step sizes is narrowed and we have
to use the central-difference approximation if we want to have a reasonable range. •••

Figure 7.1.2 Forward- and central-difference derivatives versus step size for the modified problem.
A bound e on the total error (the sum of the truncation and condition errors) for the forward-difference approximation is obtained from Eqs. (7.1.4) and (7.1.6) as

e = (Δx/2)s_b + 2ε_u/Δx , (7.1.7)
where s_b is a bound on the second derivative of u in the interval [x, x + Δx]. When ε_u and s_b are available it is possible to calculate an optimum step size that minimizes e as

Δx_opt = 2√(ε_u/s_b) . (7.1.8)

Procedures for estimating s_b and ε_u are given in [1] and [2].
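A minimal sketch of Eq. (7.1.8), with u(x) = sin x as a stand-in function: s_b = 1 is then a valid bound on the second derivative, and ε_u is taken as double-precision machine epsilon (both choices are illustrative assumptions, not from the text):

```python
import math

# Sketch of Eq. (7.1.8): optimum forward-difference step for u(x) = sin(x),
# with s_b = 1 bounding |u''| and eps_u the round-off level of u.

def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

eps_u = 2.22e-16
s_b = 1.0
dx_opt = 2.0 * math.sqrt(eps_u / s_b)   # Eq. (7.1.8), about 3e-8

x = 1.0
exact = math.cos(x)
err_opt = abs(forward_diff(math.sin, x, dx_opt) - exact)
err_small = abs(forward_diff(math.sin, x, 1e-12) - exact)  # condition error dominates
err_large = abs(forward_diff(math.sin, x, 1e-2) - exact)   # truncation error dominates
print(err_opt, err_small, err_large)
```

The step from Eq. (7.1.8) gives a smaller total error than both a much smaller and a much larger step, illustrating the step-size dilemma.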
7.1.2 Iterative Methods
Condition errors can become important when iterative methods are used for per-
forming some of the calculations. Consider a simple example of a single displacement
component u which is obtained by solving a nonlinear algebraic equation which de-
pends on one design variable x
f(x,u)=O. (7.1.9)
The solution of Eq. (7.1.9) is obtained by an iterative process which starts with
some initial guess of u and terminates when the iterate ū is estimated to be within some tolerance ε of the exact u (note that ε is a bound on the condition error in u). To calculate the derivative du/dx, assume that we use the forward-difference approximation. That is, we perturb x by Δx and solve Eq. (7.1.9) for u_Δ

f(x + Δx, u_Δ) = 0 . (7.1.10)

The iterative solution of Eq. (7.1.10) yields an approximation ū_Δ, and then du/dx is approximated as

du/dx ≈ (ū_Δ − ū)/Δx . (7.1.11)
To start the iterative process for obtaining ū_Δ, we can use either of two initial guesses. The first is the same initial guess that was used to solve for u. If the convergence of the iterative process is monotonic there is a good chance that when we use Eq. (7.1.11) the errors in ū and ū_Δ will almost cancel out, and we will get a very small condition error. The other logical initial guess for ū_Δ is ū. This initial guess is good if Δx is small, and so we may get fast convergence. Unfortunately, this time we cannot expect the condition errors to cancel. As we iterate on ū_Δ, the original error (the difference between ū and u) will be reduced at the same time that the change due to Δx is taking effect. (Consider, for example, what happens if Δx is set to zero, or to an extremely small number.)
Reference [3] suggests a strategy which allows us to start the iteration for ū_Δ from ū without worrying about excessive condition errors. The approach is to pretend that ū is the exact rather than the approximate solution by changing the problem that we want to solve. Indeed, ū is the exact solution of

f(x, u) − f(x, ū) = 0 , (7.1.12)

which is only slightly different from our original problem (because f(x, ū) is almost zero). We now find the derivative du/dx from Eq. (7.1.12) by obtaining u_Δ as the solution of

f(x + Δx, u_Δ) − f(x, ū) = 0 . (7.1.13)

Because ū is the exact solution of this equation for Δx = 0, the iterative process will reflect only the effect of Δx.
Example 7.1.2
As a simple illustration, consider

f(u, x) = u² − x = 0 ,

solved for u = √x at x = 1000 by Newton's method; the exact derivative is du/dx = 1/(2√x) ≈ 0.0158.
Table 7.1.2 Effect of starting ū_Δ from u₀

            x + Δx = 1000.1         x + Δx = 1100
   ū        ū_Δ        Δu/Δx        ū_Δ        Δu/Δx
 41.2454    31.6436    -96.0181     33.1755    -0.08070
 32.7453    31.6244    -11.2093     33.1662     0.00421
 31.6420    31.6243     -0.1772     33.1663     0.01524
 31.6228    31.6243      0.01572    33.1663     0.01543
The derivative estimates improve as the nominal solution converges, except at very high accuracies (low ε). The effect of the finite-difference increment Δx is also evident. The errors for the small Δx are larger than for the larger Δx, except when ū has fully converged (so that there is no condition error).

We now use the approach of Eq. (7.1.13), replacing the original equation by

u² − x − f̄ = 0 ,

where f̄ is the residual of the last iterate of the nominal solution. That is, for the perturbed solution we try to calculate the square root of x + f̄ instead of x. The results of the modified calculation are shown in Table 7.1.3. We can now get a reasonable approximation to the derivative in two iterations. •••
Table 7.1.3 Modified derivative calculation

            x + Δx = 1100           x + Δx = 1000.1
   ū        ū_Δ        Δu/Δx        ū_Δ        Δu/Δx
 41.2454    42.4404    0.01195      41.2466    0.01205
 32.7453    34.2382    0.01493      32.7468    0.01511
 31.6420    33.1846    0.01543      31.6436    0.01572
 31.6228    33.1663    0.01543      31.6243    0.01572
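The behavior behind these tables can be reproduced with a short script in the spirit of Eqs. (7.1.12)-(7.1.13) (a sketch; the starting point and iteration counts are illustrative, so the numbers differ from the tabulated rows):

```python
# Newton iteration for f(u, x) = u**2 - x = 0, deliberately stopped early so
# the nominal solution u_bar is inexact, as in Table 7.1.2.

def newton_sqrt(target, u0, iters):
    u = u0
    for _ in range(iters):
        u = 0.5 * (u + target / u)
    return u

x, dx = 1000.0, 0.1
u_bar = newton_sqrt(x, 50.0, 3)           # inexact nominal solution
residual = u_bar**2 - x                   # f(x, u_bar), "almost zero"

# Naive approach: solve u**2 = x + dx, starting the iteration from u_bar.
u_naive = newton_sqrt(x + dx, u_bar, 2)
# Modified approach (Eq. (7.1.13)): shift the equation by the residual, so
# that u_bar is the exact solution of the modified problem at dx = 0.
u_mod = newton_sqrt(x + dx + residual, u_bar, 2)

exact = 0.5 / x**0.5                      # d(sqrt(x))/dx = 1/(2 sqrt(x))
print("naive   :", (u_naive - u_bar) / dx)
print("modified:", (u_mod - u_bar) / dx)
print("exact   :", exact)
```

With the residual correction, the condition error of the inexact nominal solution cancels, and two iterations on the perturbed problem already give an accurate derivative.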
Cost and accuracy considerations often dictate that we avoid the use of finite-difference derivatives. For static displacement and stress constraints, analytical derivatives are fairly easy to get, as discussed in the next section.
It is well known that small displacements and stresses are not calculated as accurately as large stresses and displacements. The same applies to derivatives. When both the function u and the variable x are positive, the relative magnitude of the derivative can be estimated from the logarithmic derivative

d_l u/dx = d(log u)/d(log x) = (du/u)/(dx/x) . (7.1.14)
The logarithmic derivative gives the percentage change in u due to a percent change in
x. Therefore, when the logarithmic derivative is larger than unity the relative change
in u is larger than the relative change in x and the derivative can be considered to
be large. When the logarithmic derivative is much smaller than unity, the relative
change in u is much smaller than the relative change in x. In this case the derivative
is considered to be small, and in general, it would be difficult to evaluate it accurately
using finite-difference differentiation (or any other procedure subject to condition or
truncation errors). Fortunately, when the logarithmic derivative is small it is usually
not important to evaluate it accurately, because its influence on the optimization
process is small.
The logarithmic derivative can be misleading when a variable is about to change
sign so that it is very small in magnitude. In that case we recommend using typical
values of u and x instead of local values. That is, we define a modified logarithmic
derivative d_lm u/dx as

d_lm u/dx = (du/u_t)/(dx/x_t) , (7.1.15)
where x_t and u_t are representative values of the variable and the function, respectively.
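As a sketch of Eq. (7.1.14) (the power-law function u(x) = x³ is an illustrative choice, not from the text), the logarithmic derivative of a power law is the exponent, independent of x:

```python
# Logarithmic derivative of Eq. (7.1.14) for u(x) = x**n: since
# log u = n log x, the logarithmic derivative equals n for every x,
# even where the ordinary derivative du/dx is very large or very small.

def log_derivative(f, dfdx, x):
    # (du/u)/(dx/x), computed from the ordinary derivative.
    return dfdx(x) * x / f(x)

n = 3.0
u = lambda x: x**n
dudx = lambda x: n * x**(n - 1.0)
print(log_derivative(u, dudx, 0.5), log_derivative(u, dudx, 20.0))
```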
Example 7.1.3
The increased error associated with small derivatives is demonstrated in the following
simple design problem. We consider the design of a submerged beam of rectangular
cross section so as to minimize the perimeter of the cross section (so as to reduce
corrosion damage). The beam is subject to a bending moment M and we require the maximum bending stress to be less than the allowable stress σ₀. The design variables are the width b and height h of the rectangular cross section. The problem can be formulated as

minimize 2(b + h)
such that 6M/(bh²) ≤ σ₀ .
We nondimensionalize the problem by defining a characteristic length l, with l³ = 6M/σ₀, and using it to define new design variables x₁ and x₂ as

x₁ = b/l ,  x₂ = h/l .

The problem then becomes

minimize u = x₁ + x₂
such that 1/(x₁x₂²) = 1 ,

where the inequality has been replaced by an equality because it is clear that the stress constraint will be active (otherwise the solution is b = h = 0). The equality can be used to eliminate x₁, so that the objective function can be written as

u = 1/x₂² + x₂ .
We now consider the calculation of the derivative du/dx₂ by finite differences at two points: at an initial design, x₂ = 1, and near the optimum, at x₂ = 1.29. In both cases we use forward differences with Δx₂ = 0.01. At x₂ = 1 the derivative is large (the exact value is −1) and the forward-difference estimate is in error by about 3 percent; near the optimum the derivative is small, and the same step size produces a relative error of about 16 percent. •••
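The numbers quoted for this example can be recomputed with a few lines (a sketch that evaluates the reduced objective of the example):

```python
# Example 7.1.3 numerics: reduced objective u(x2) = 1/x2**2 + x2,
# exact derivative du/dx2 = 1 - 2/x2**3, forward differences with dx2 = 0.01.

def u(x2):
    return 1.0 / x2**2 + x2

def dudx2(x2):
    return 1.0 - 2.0 / x2**3

dx2 = 0.01
for x2 in (1.0, 1.29):
    fd = (u(x2 + dx2) - u(x2)) / dx2
    exact = dudx2(x2)
    rel_err = abs(fd - exact) / abs(exact)
    print(f"x2={x2}: exact {exact:.4f}, forward diff {fd:.4f}, "
          f"relative error {rel_err:.1%}")
```

The absolute error of the forward difference is comparable at the two points, but near the optimum the derivative itself is small, so the relative error is several times larger.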
7.2 Sensitivity Derivatives of Static Displacement and Stress Constraints

The equations of equilibrium in terms of the nodal displacement vector u are generated from a finite element model in the form
Ku = f , (7.2.1)
where K is the stiffness matrix and f is a load vector. A typical constraint, involving
a limit on a displacement or a stress component, may be written as
g(u, x) ≥ 0 , (7.2.2)
where, for the sake of simplified notation, it is assumed that g depends on only a
single design variable x. Using the chain rule of differentiation, we obtain

dg/dx = ∂g/∂x + zᵀ(du/dx) , (7.2.3)

where z is the vector of derivatives of g with respect to the displacement components,
z_i = ∂g/∂u_i . (7.2.4)
Note that we use the notation dg/dx to denote the total derivative of g with respect to x. This total derivative includes the explicit part ∂g/∂x plus the implicit part through the dependence on u. The explicit part of the derivative is usually zero or easy to obtain, so we discuss only the computation of the implicit part. Differentiating Eq. (7.2.1) with respect to x we obtain

K(du/dx) = df/dx − (dK/dx)u . (7.2.5)
In the direct method we solve Eq. (7.2.5) for du/dx and substitute into Eq. (7.2.3). Premultiplying Eq. (7.2.5) by zᵀK⁻¹ we obtain

zᵀ(du/dx) = zᵀK⁻¹(df/dx − (dK/dx)u) . (7.2.6)

In the adjoint method we define an adjoint vector λ as the solution of

Kλ = z , (7.2.7)

selected to satisfy equations that lead to elimination of the derivative of the response. For the present case we rewrite Eq. (7.2.3) as

dg/dx = ∂g/∂x + λᵀ(df/dx − (dK/dx)u) , (7.2.8)

so that the derivative is obtained without calculating du/dx.
Example 7.2.1

Consider a cantilever beam built of two segments of lengths l₁ and l₂ and moments of inertia I₁ and I₂, loaded by a transverse force p at the tip, with a constraint g that depends on the tip displacement w_tip, so that ∂g/∂w_tip = −1. From beam theory,

w_tip = (p/3EI₂)l₂³ + (p/3EI₁)(l₁³ + 3l₁²l₂ + 3l₁l₂²) ,

so that

∂g/∂I₁ = (p/3EI₁²)(l₁³ + 3l₁²l₂ + 3l₁l₂²) ,

∂g/∂l₁ = −(p/3EI₁)(3l₁² + 6l₁l₂ + 3l₂²) = −(p/EI₁)(l₁ + l₂)² .
The finite element solution is based on a standard cubic beam element, with one element used for each section. We denote the displacement and rotation at the ith node by w_i and θ_i, respectively. The element stiffness matrix is

K_e = (EI/l³) [  12     6l    −12     6l
                  6l    4l²   −6l    2l²
                −12    −6l     12    −6l
                  6l    2l²   −6l    4l²  ] ,

so that the global stiffness matrix, corresponding to the degrees of freedom w₂, θ₂, w₃, θ₃, is

K = E [ 12(I₁/l₁³ + I₂/l₂³)   −6(I₁/l₁² − I₂/l₂²)   −12I₂/l₂³    6I₂/l₂²
        −6(I₁/l₁² − I₂/l₂²)    4(I₁/l₁ + I₂/l₂)     −6I₂/l₂²     2I₂/l₂
        −12I₂/l₂³              −6I₂/l₂²              12I₂/l₂³   −6I₂/l₂²
          6I₂/l₂²               2I₂/l₂              −6I₂/l₂²     4I₂/l₂  ] .
The load vector is f = [0, 0, p, 0]ᵀ, and the solution for the displacement vector is

u = {w₂, θ₂, w₃, θ₃}ᵀ = (p/EI₁) { l₁³/3 + l₁²l₂/2 ,  l₁²/2 + l₁l₂ ,  (I₁/3I₂)l₂³ + l₁³/3 + l₁²l₂ + l₁l₂² ,  (I₁/2I₂)l₂² + l₁²/2 + l₁l₂ }ᵀ .

Differentiating the stiffness matrix with respect to I₁,

∂K/∂I₁ = (E/l₁³) [  12   −6l₁   0   0
                   −6l₁   4l₁²  0   0
                     0     0    0   0
                     0     0    0   0  ] ,

so that

(∂K/∂I₁)u = (p/I₁) {1, l₂, 0, 0}ᵀ ,

where the solution for w₂ and θ₂ was used.
Similarly,

∂K/∂l₁ = (EI₁/l₁⁴) [ −36    12l₁   0   0
                      12l₁  −4l₁²  0   0
                       0      0    0   0
                       0      0    0   0  ] ,

so that

(∂K/∂l₁)u = (p/l₁) {−6(1 + l₂/l₁), 2(l₁ + l₂), 0, 0}ᵀ .
In the direct method we solve

K(du/dI₁) = −(∂K/∂I₁)u ,

or

(d/dI₁){w₂, θ₂, w₃, θ₃}ᵀ = −K⁻¹(p/I₁){1, l₂, 0, 0}ᵀ
  = −(p/EI₁²) { l₁³/3 + l₁²l₂/2 ,  l₁²/2 + l₁l₂ ,  l₁³/3 + l₁²l₂ + l₁l₂² ,  l₁²/2 + l₁l₂ }ᵀ ,

so that ∂g/∂I₁ = −∂w₃/∂I₁, which agrees with the beam-theory result. Similarly, solving with the right-hand side −(∂K/∂l₁)u yields the derivatives with respect to l₁.
In the adjoint method, zᵀ = −∂w_tip/∂u = [0, 0, −1, 0], and we solve Eq. (7.2.7) for the adjoint vector, λ = K⁻¹z. Then

∂g/∂I₁ = −λᵀ(∂K/∂I₁)u = (p/EI₁²)(l₁³/3 + l₁²l₂ + l₁l₂²) ,

and

∂g/∂l₁ = −λᵀ(∂K/∂l₁)u = −(p/EI₁)(l₁ + l₂)² ,

in agreement with the beam-theory results. •••
The difference between the computational effort associated with the direct
method and with the adjoint method depends on the relative number of constraints
and design variables. The direct method requires the solution of Eq. (7.2.5) once for
each design variable, while the adjoint method requires the solution of Eq. (7.2.7)
once for each constraint. Thus the direct method is the more efficient when the
number of design variables is smaller than the number of displacement and stress
constraints that need to be differentiated. The adjoint method is more efficient when
the number of design variables is larger than the number of these constraints.
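The two methods can be sketched side by side on a trivially small system (an illustrative two-spring example, not from the text: springs k₁ and k₂ in series, load p at the free end, constraint g = u₂):

```python
# Direct method (Eq. (7.2.5)) versus adjoint method (Eqs. (7.2.7)-(7.2.8))
# on a two-spring system: u2 = p/k1 + p/k2, so the exact sensitivity is
# du2/dk1 = -p/k1**2. The load vector does not depend on k1.

def solve2(A, b):
    # Cramer's rule for a 2x2 system A x = b.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def matvec(A, x):
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

k1, k2, p = 3.0, 5.0, 7.0
K = [[k1 + k2, -k2], [-k2, k2]]
u = solve2(K, [0.0, p])
dK_dk1 = [[1.0, 0.0], [0.0, 0.0]]       # only the (1,1) entry contains k1
dKu = matvec(dK_dk1, u)

# Direct: solve K (du/dk1) = -(dK/dk1) u, then dg/dk1 = z . (du/dk1).
du_dk1 = solve2(K, [-dKu[0], -dKu[1]])
direct = du_dk1[1]                      # z = [0, 1] for g = u2

# Adjoint: solve K lambda = z, then dg/dk1 = -lambda . (dK/dk1) u.
lam = solve2(K, [0.0, 1.0])
adjoint = -(lam[0] * dKu[0] + lam[1] * dKu[1])

print(direct, adjoint, -p / k1**2)      # all three agree
```

Note that both methods reuse the same matrix K: with many design variables and one constraint only the adjoint solve is repeated, and vice versa, which is the cost trade-off described above.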
In practical design situations we usually have to consider several load cases. The
effort associated with the direct method is approximately proportional to the number
of load cases. The number of critical constraints at the optimum design, on the other
hand, is usually less than the number of design variables. Therefore, in a multiple-
load-case situation the adjoint method becomes more attractive.
Both the direct and adjoint methods require the solution of a system of equations
as the major part of the computational effort. However, the factored form of the
matrix K of the equations is usually available from the solution of Eq. (7.2.1) for
the displacements. The solution for du/dx or A is therefore much cheaper than the
original solution of Eq. (7.2.1). This provides the major computational advantage of
these two analytical methods over the finite-difference calculation of the derivatives.
For example, the forward-difference approximation to du/dx requires the evaluation of u(x + Δx) by re-assembling the stiffness matrix and load vector at the perturbed design and solving

K(x + Δx)u(x + Δx) = f(x + Δx) .

Second derivatives of constraints are sometimes required as well. When g depends on the design variables only through u, differentiating Eq. (7.2.3) with respect to a second design variable y gives

d²g/dxdy = zᵀ(d²u/dxdy) + (du/dx)ᵀR(du/dy) , (7.2.13)

where R is the matrix of second derivatives of the constraint with respect to the displacement components, R_ij = ∂²g/∂u_i∂u_j.
We obtain the second derivative of the displacement field by differentiating Eq. (7.2.5):

K(d²u/dxdy) = d²f/dxdy − (d²K/dxdy)u − (dK/dx)(du/dy) − (dK/dy)(du/dx) . (7.2.15)
Solving Eq. (7.2.5) for du/dx, a similar equation for du/dy, and Eq. (7.2.15) for d²u/dxdy, we finally substitute into Eq. (7.2.13).
The adjoint method starts by differentiating Eq. (7.2.7) with respect to y:

K(dλ/dy) = R(du/dy) − (dK/dy)λ . (7.2.17)

Using Eqs. (7.2.5) and (7.2.17), Eq. (7.2.13) can then be evaluated without computing the second derivatives of the displacement field.
Both the direct and adjoint methods require the derivatives of the stiffness matrix
and load vectors with respect to design variables. These derivatives are often difficult
to calculate analytically, especially for shape design variables which change element
geometry. For this reason a semi-analytical approach, where the derivatives of the
stiffness matrix and load vector are approximated by finite differences, is popular.
Typically, these derivatives are calculated by the first-order forward-difference approximation, so that dK/dx is approximated as

dK/dx ≈ [K(x + Δx) − K(x)]/Δx .
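A one-degree-of-freedom sketch of the semi-analytical idea (the axial bar with stiffness k(L) = EA/L is an illustrative choice, not the stick model discussed below):

```python
# Semi-analytical derivative for a single-DOF example: an axial bar with
# stiffness k(L) = EA/L under load p, so u = pL/(EA) and du/dL = p/(EA)
# exactly. dK/dL is replaced by a forward difference of the assembled
# stiffness, and the displacement derivative follows from the single-DOF
# form of Eq. (7.2.5): k * (du/dL) = -(dk/dL) * u.
EA, p, L = 2.0e7, 1.0e4, 100.0
h = 1e-3 * L                          # relative step size of 1e-3

k = EA / L
u = p / k
dk_fd = (EA / (L + h) - EA / L) / h   # forward-difference approx. of dK/dL
dudL_sa = -(dk_fd * u) / k            # semi-analytical derivative
dudL_exact = p / EA
print(dudL_sa, dudL_exact)
```

Here the stiffness is a smooth function of the shape variable L, so the semi-analytical derivative carries a small truncation error of the order of the relative step size; the accuracy problems discussed next arise when this error interacts with mesh refinement.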
Figure 7.2.3 Errors in the derivative of the strain energy with respect to a length
variable of the stick model for overall-finite-differences (OFD) and semi-analytical
(SA) methods.
Figure 7.2.3 shows the dependence of the relative error of the derivative of the
strain energy of the model with respect to one length variable in the semi-analytical
(SA) method and the overall finite difference (OFD) approach. For large step sizes,
the OFD method has smaller error (mostly truncation error) than the SA method.
The step-size range for which the approximate derivative has an error less than 1%
is much larger for the OFD than for the SA approximation. For small step sizes the OFD method has a larger error (mostly condition error) than the SA method. Figure 7.2.3 shows that, for a relative step size of 10⁻⁷, the SA method approximates the derivative well. For some variables, however, there was no step size giving accurate
derivatives! To solve the accuracy problem the central difference approximation to
the derivative of the stiffness matrix had to be used, which increased substantially
the computational cost.
Figure 7.2.5 Error in the derivative of the tip displacement as a function of the number of elements.
This phenomenon is demonstrated in Fig. 7.2.5, which shows that the error in
the derivative of the tip displacement of a cantilever beam with respect to the length
of the beam greatly increases as the finite-element mesh is refined.
When a beam or a plate structure is modeled by more general elements, such as
three dimensional elements, mesh refinement is no problem. However, as the beam
becomes more slender or the plate thinner, the displacement-derivative field becomes
more and more incompatible with the geometry, and the same accuracy problems
ensue. Reference [6] reports very large errors for beams modeled by truss, plane-
stress and solid elements for slenderness ratios larger than ten.
Example 7.2.2
e_T = (ΔI₁/2)λᵀ(∂²K/∂I₁²)u = 0 ,

because K is a linear function of I₁. The situation is not as good for the truncation error in ∂g/∂l₁, which is approximately

e_T = (Δl₁/2)λᵀ(∂²K/∂l₁²)u = −(pΔl₁/EI₁l₁)(3l₁² + 7l₁l₂ + 4l₂²) .
Comparing the semi-analytical error to the one obtained by the finite-difference approach, we note that it is seven times larger when l₁ = l₂. As shown in Ref. [7], this larger error for the semi-analytical method increases as the mesh is refined. •••
f(u, x) = μp(x) , (7.2.20)
where f is the internal force generated by the deformation of the structure, and μp is the external applied load. The load scaling factor μ is used in nonlinear analysis procedures for tracking the evolution of the solution as the load is increased. This is useful because the equations of equilibrium may have several solutions for the same applied loads. By increasing μ gradually we make sure that we obtain the solution that corresponds to the structure being loaded from zero.
Differentiating Eq. (7.2.20) with respect to the design variable x we obtain
Jdu _ J.1d p _ af
(7.2.21)
dx - dx ax'
where J is the Jacobian of f at u,
(7.2.22)
(7.2.24)
where again z is the vector of derivatives of the constraint with respect to the dis-
placement components, Zi = agjaui. It is easy to check that we obtain
At a critical point, with the load value denoted as μ*, the tangential stiffness matrix J becomes singular, and we can have either a bifurcation point or a limit load. We can distinguish between the two by differentiating Eq. (7.2.20) with respect to a loading
parameter that increases monotonically throughout the loading history. The load parameter μ is not a good choice, because at a limit point it reaches a maximum and is not monotonic. Instead we often use a displacement component, known to increase monotonically, or the arc length in the (u, μ) space. We denote such a monotonic load parameter by a, and denote a derivative with respect to a by a prime. Differentiating Eq. (7.2.20) with respect to a we get
Ju′ = μ′p . (7.2.26)
At a critical point, J is singular, and we denote the left eigenvector associated with
the zero eigenvalue of J by v, that is
vᵀJ* = 0 , (7.2.27)
where the asterisk denotes quantities evaluated at the critical point. Premultiplying
Eq. (7.2.26) by v T , we get
μ′vᵀp = 0 . (7.2.28)
At a limit point this equation is satisfied because the load reaches a maximum, and then μ′ = 0. In that case, Eq. (7.2.26) indicates that the buckling mode, which is the right eigenvector of the tangential stiffness matrix J, is equal to the derivative of u with respect to the loading parameter. At a bifurcation point μ′ ≠ 0, and instead
vᵀp = 0 . (7.2.29)
For a symmetric tangential stiffness matrix v is also the buckling mode, and Eq.
(7.2.29) indicates that the buckling mode is orthogonal to the load vector.
To calculate the sensitivity of limit loads we need to consider a more general response path parameter ν, which can be a load parameter, a design variable, or a combination of both: a parameter that controls both the structural design and the loading simultaneously. We denote differentiation with respect to ν by a dot and differentiate Eq. (7.2.20) with respect to ν to get

J u̇ + (∂f/∂x)ẋ = μ̇p + μ(dp/dx)ẋ . (7.2.30)
We now want a parameter ν that controls the design variable x and the load parameter
μ so that we remain at a limit load, μ = μ*. We select ν = x, and then Eq. (7.2.30)
becomes

    J* u̇ + (∂f/∂x)* = (dμ*/dx) p + μ* (dp/dx) ,    (7.2.31)
where we used the fact that for our choice of parameter ẋ = 1. Premultiplying Eq.
(7.2.31) by the left eigenvector, v^T, and rearranging we get

    dμ*/dx = v^T [ (∂f/∂x)* − μ* (dp/dx) ] / (v^T p) .    (7.2.32)
The quantity in brackets in the numerator of Eq. (7.2.32) is the derivative of the
residual of the equations of equilibrium at the limit point. Thus we can use the
semi-analytical method to evaluate the limit load sensitivity as follows: We perturb
the design variable, calculate the change in the residual (for fixed displacements) and
take the dot product with the buckling mode to get the numerator. The denominator
is the dot product of the buckling mode with the load vector.
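The recipe above can be checked on a tiny model. The sketch below uses a hypothetical one-degree-of-freedom snap-through system (not from the text): with internal force f(u, x) = x(u − u²) and p = 1, the tangent stiffness J = x(1 − 2u) vanishes at u* = 1/2, where the limit load is μ* = x/4, so the exact sensitivity is dμ*/dx = 1/4.

```python
# Semi-analytical limit-load sensitivity, Eq. (7.2.32), for a hypothetical
# one-dof snap-through model: internal force f(u, x) = x*(u - u**2), p = 1.
# The tangent stiffness J = x*(1 - 2u) vanishes at the limit point u* = 1/2.

def internal_force(u, x):
    return x * (u - u**2)

x = 2.0
u_star = 0.5                         # limit-point displacement

# Perturb the design variable at FIXED displacement, take the change in the
# residual, and divide by v^T p (the left eigenvector v is 1 for one dof).
dx = 1e-6
dres = internal_force(u_star, x + dx) - internal_force(u_star, x)
dmu_dx = (dres / dx) / 1.0           # v^T p = 1; dp/dx = 0 here
print(dmu_dx)                        # close to 0.25
```

For one degree of freedom the left eigenvector is trivial; for a finite element model v is the buckling mode obtained from the singular tangent stiffness matrix.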
7.3 Sensitivity Calculations for Eigenvalue Problems
Undamped vibration and linear buckling analysis lead to eigenvalue problems of the
type
    Ku − μMu = 0 ,    (7.3.1)
where K is the stiffness matrix, M is the mass matrix (vibration) or the geometric
stiffness matrix (buckling), and u is the mode shape. For vibration problems μ is the
square of the frequency of free vibration, and for buckling problems it is the buckling
load factor. Both K and M are symmetric, and K is positive semidefinite. The mode
shape is often normalized with a symmetric positive definite matrix W such that

    u^T W u = 1 ,    (7.3.2)

where, for vibration problems, W is usually the mass matrix M. Equations (7.3.1)
and (7.3.2) hold for all eigenpairs (μ_k, u^k). Differentiating these equations with re-
spect to a design variable x we obtain

    (K − μM) (du/dx) − (dμ/dx) M u = −(dK/dx − μ dM/dx) u ,    (7.3.3)

and

    u^T W (du/dx) = −(1/2) u^T (dW/dx) u ,    (7.3.4)
where we have used the symmetry of W. Equations (7.3.3) and (7.3.4) are valid
only for the case of distinct eigenvalues (repeated eigenvalues are, in general, not
differentiable, and only directional derivatives may be obtained, see Haug et al. [8]).
In most applications we are interested only in the derivatives of the eigenvalues.
These derivatives may be obtained by premultiplying Eq. (7.3.3) by u^T to obtain

    dμ/dx = u^T (dK/dx − μ dM/dx) u / (u^T M u) .    (7.3.5)
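Equation (7.3.5) is easy to verify numerically. The sketch below assumes the two-mass system of Example 7.3.1, with K = [[k+1, −1], [−1, 2]] and M = I, and compares the formula with a central finite difference of the lowest eigenvalue:

```python
import numpy as np

def K(k):
    return np.array([[k + 1.0, -1.0], [-1.0, 2.0]])

M = np.eye(2)
dK = np.array([[1.0, 0.0], [0.0, 0.0]])   # dK/dk; dM/dk = 0

def lowest(k):
    w, V = np.linalg.eigh(K(k))           # M = I: ordinary symmetric problem
    return w[0], V[:, 0]

mu, u = lowest(1.0)
dmu = u @ dK @ u / (u @ M @ u)            # Eq. (7.3.5)

h = 1e-6                                  # finite-difference check
dmu_fd = (lowest(1.0 + h)[0] - lowest(1.0 - h)[0]) / (2 * h)
print(dmu, dmu_fd)                        # both close to 0.5
```

Note that the arbitrary sign of the computed eigenvector does not matter, because Eq. (7.3.5) is a ratio of quadratic forms.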
In some applications the derivatives of the eigenvectors are also required. For ex-
ample, in automobile design we often require that critical vibration modes have low
amplitudes at the front seats. For this design problem we need derivatives of the
mode shape. To obtain eigenvector derivatives we can use the direct approach and
combine Eqs. (7.3.3) and (7.3.4) as

    [ K − μM    −Mu ] { du/dx }   { −(dK/dx − μ dM/dx) u }
    [ −u^T W      0 ] { dμ/dx } = {  (1/2) u^T (dW/dx) u } .    (7.3.6)
The system (7.3.6) may be solved for the derivatives of the eigenvalue and the eigen-
vector. However, care must be taken in the solution process because the principal
minor K − μM is singular. Cardani and Mantegazza [9] and Murthy and Haftka [10]
discuss several solution strategies which address this problem.
One of the more popular solution techniques is due to Nelson [11]. Nelson's
method temporarily replaces the normalization condition, Eq. (7.3.2), by the re-
quirement that the largest component of the eigenvector be equal to one. Denoting
this re-normalized vector ū, and assuming that its largest component is the mth one,
we replace Eq. (7.3.2) by

    ū_m = 1 ,    (7.3.7)
and Eq. (7.3.4) by

    dū_m/dx = 0 .    (7.3.8)

Equation (7.3.3) is valid with u replaced by ū, but Eq. (7.3.8) is used to reduce its
order by deleting the mth row and the mth column. When the eigenvalue μ is distinct,
the reduced system is not singular, and may be solved by standard techniques.
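A minimal sketch of Nelson's method for the two-mass system of Example 7.3.1 (assuming K = [[k+1, −1], [−1, 2]] at k = 1, M = I, and fixing the second mode component, as in that example):

```python
import numpy as np

K = np.array([[2.0, -1.0], [-1.0, 2.0]])  # k = 1
M = np.eye(2)
dK = np.array([[1.0, 0.0], [0.0, 0.0]])   # dK/dk; dM/dk = 0

w, V = np.linalg.eigh(K)
mu, u = w[0], V[:, 0]                     # lowest eigenpair, mu = 1
m = 1                                     # component held at 1, Eq. (7.3.7)
ubar = u / u[m]                           # re-normalized mode, ubar = [1, 1]

dmu = ubar @ dK @ ubar / (ubar @ M @ ubar)        # Eq. (7.3.5)
rhs = dmu * (M @ ubar) - dK @ ubar                # right side of Eq. (7.3.3)

# Delete the mth row and column; the reduced system is nonsingular.
A = K - mu * M
keep = [i for i in range(2) if i != m]
dubar = np.zeros(2)
dubar[keep] = np.linalg.solve(A[np.ix_(keep, keep)], rhs[keep])
print(dmu, dubar)                         # 0.5 and [-0.5, 0.]
```

The deleted equation is satisfied automatically because the full singular system is consistent at a distinct eigenvalue.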
To retrieve the derivative of the eigenvector with the original normalization of
Eq. (7.3.2) we note that u = u_m ū, so that

    du/dx = (du_m/dx) ū + u_m (dū/dx) ,    (7.3.9)

and du_m/dx may be obtained by substituting Eq. (7.3.9) into Eq. (7.3.4) to obtain

    du_m/dx = −u_m² u^T W (dū/dx) − (u_m/2) u^T (dW/dx) u .    (7.3.10)
We can also use an adjoint or modal technique for calculating the derivatives of
the eigenvector by expanding that derivative as a linear combination of eigenvectors.
That is, denoting the ith eigenpair of Eq. (7.3.1) by (μ_i, u^i) we assume

    du^k/dx = Σ_{j=1}^{n} c_kj u^j ,    (7.3.11)

and the coefficients c_kj can be shown to be (see, for example, Rogers [12])

    c_kj = u^{jT} (dK/dx − μ_k dM/dx) u^k / [ (μ_k − μ_j) u^{jT} M u^j ] ,   k ≠ j .    (7.3.12)
On the other hand, if we use the normalization condition of Eq. (7.3.2) with W = M,
we get

    c_kk = −(1/2) (u^k)^T (dM/dx) u^k .    (7.3.14)
If all the eigenvectors are included in the sum, Eq. (7.3.11) is exact. For most
problems it is not practical to calculate all the eigenvectors, so that only a few of the
eigenvectors associated with the lowest eigenvalues are included. Wang [13] developed
a modified modal method that accelerates the convergence. Instead of Eq. (7.3.11)
we use

    du^k/dx = u_s^k + Σ_{j=1}^{m} d_kj u^j ,    (7.3.15)
where
(7.3.16)
    k ≠ j .    (7.3.17)
The coefficient d_kk is still given by Eq. (7.3.14) for the normalization condition of
u^T M u = 1. For the normalization condition of (7.3.7)

    d_kk = −u_{sm}^k − Σ_{j≠k} d_kj u_m^j .    (7.3.18)
Sutter et al. [14] present a study of the convergence of the derivative with increasing
number of modes using both the modal method and the modified modal method and
demonstrate the improved convergence of the modified modal method.
Example 7.3.1
The spring-mass-dashpot system shown in Fig. (7.3.1) is analysed here for the case
that the dashpot is inactivated, that is c = 0. Initially the two masses and the
three springs have values of 1, and we want to calculate the derivatives of the lowest
vibration frequency and the lowest vibration mode with respect to k for two possible
normalization conditions: one of the form Eq. (7.3.2) with W = M, and one of the
form Eq. (7.3.7) with the second component of the mode set to 1.
The stiffness and mass matrices of the system are

    K = [ k+1   −1 ]        M = [ 1  0 ]
        [ −1     2 ] ,          [ 0  1 ] .
For k = 1, the eigenvalue problem, Eq. (7.3.1), becomes

    [ 2 − ω²     −1   ] { u₁ }
    [  −1      2 − ω² ] { u₂ } = 0 .    (a)
Setting the determinant of the system to zero we get the two frequencies, ω₁ = 1,
and ω₂ = √3. Substituting back the lowest frequency into Eq. (a) we get for the first
vibration mode

    u₁ − u₂ = 0 ,
    −u₁ + u₂ = 0 .
As expected, the system is singular at a natural frequency, so that we need the nor-
malization condition to determine the eigenvector. For the normalization condition
(7.3.2) the additional equation is

    u^T M u = u₁² + u₂² = 1 .

For the normalization condition Eq. (7.3.7), the condition is

    ū₂ = 1 ,

where we use the bar to denote the vibration mode with the second normalization
condition. The solutions with the normalization conditions are

    u^T = (√2/2) [1, 1] ,    ū^T = [1, 1] .
Next we calculate the derivative of the lowest frequency from Eq. (7.3.5) using primes
to denote derivatives with respect to k. For our example

    K' = [ 1  0 ]
         [ 0  0 ] ,    M' = 0 .

We use the mode normalized by the mass matrix in Eq. (7.3.5), so that the denomi-
nator is equal to 1, and then

    μ' = u^T K' u = 0.5 .
We can also get the derivative of the frequency and the mode together by using Eq.
(7.3.6), whose coefficient matrix is here

    [   1       −1     −√2/2 ]
    [  −1        1     −√2/2 ]
    [ −√2/2   −√2/2      0   ] .

We solve this equation to get the same derivatives. Using instead the mode ū of the
second normalization, we note that

    μ' M ū = 0.5 ū = { 0.5 }          −(K' − μM') ū = −K' ū = { −1 }
                     { 0.5 } ,                                {  0 } .
Then Eq. (7.3.3), with ū replacing u, and the additional condition yield

    ū₁' − ū₂' = −0.5 ,
    −ū₁' + ū₂' = 0.5 ,
    ū₂' = 0 .

The solution is

    ū₁' = −0.5 ,    ū₂' = 0 .
We can show that u' can indeed be retrieved from ū' by using Eqs. (7.3.9) and
(7.3.10). Equation (7.3.10) becomes

    du₂/dk = −u₂² u^T M (dū/dk) = −(1/2)(√2/2)(−0.5) = √2/8 ,

and then from Eq. (7.3.9)

    u' = (√2/8) {1, 1}^T + (√2/2) {−0.5, 0}^T = {−√2/8, √2/8}^T . •••
When the eigenvalue μ is repeated with a multiplicity of m, there are m linearly
independent eigenvectors associated with it. Furthermore, any linear combination
of these eigenvectors is also an eigenvector, so that the choice of eigenvectors is not
unique. In this case the eigenvectors that are obtained from a structural analysis
program will be determined by the idiosyncrasies of the computational procedure
used for the solution of the eigenproblem. Assuming that u¹, …, u^m is a set of
linearly independent eigenvectors associated with μ, we may write any eigenvector
associated with μ as

    u = Σ_{i=1}^{m} q_i u^i = U q ,    (7.3.19)
Substituting Eq. (7.3.19) into Eq. (7.3.3), premultiplying by U^T, and noting that
U^T(K − μM) = 0, we obtain

    A q = (dμ/dx) B q ,    (7.3.20)

where

    A = U^T (dK/dx − μ dM/dx) U ,    (7.3.21)

and

    B = U^T M U .    (7.3.22)
Equation (7.3.20) is an m × m eigenvalue problem for dμ/dx. The m solutions
correspond to the derivatives of the m eigenvalues derived from μ as x is changed, and
the eigenvectors q give us, through Eq. (7.3.19), the eigenvectors associated with the
perturbed eigenvalues. A generalization of Nelson's method to obtain derivatives of
the eigenvectors was suggested by Ojalvo [15] and amended by Mills-Curran [16] and
Dailey [17]. Their procedure seems to contradict the earlier assertion that repeated
eigenvalues are not differentiable. However, while we can find derivatives with respect
to any individual variable, these are only good as directional derivatives, in that
derivatives with respect to x and y cannot be combined in a linear fashion. That is

    dμ ≠ (∂μ/∂x) dx + (∂μ/∂y) dy .    (7.3.23)
Example 7.3.2
Consider the eigenvalue problem of Eq. (7.3.1) with

    K = [ 2+y    x ]
        [  x     2 ] ,    W = M = I ,

whose two eigenvalues are

    μ_{1,2} = 2 + y/2 ± √(x² + y²/4) .    (a)
The two eigenvalues are identical for x = y = 0, and we will first demonstrate that
the eigenvectors are discontinuous at the origin. In fact for x = 0 the two eigenvectors
are

    u¹ = { 1 }        u² = { 0 }
         { 0 } ,           { 1 } ,

and for y = 0

    u¹ = (√2/2) { 1 }        u² = (√2/2) { −1 }
                { 1 } ,                  {  1 } .
Obviously, we can get either set of eigenvectors as close to the origin as we wish by
approaching it either along the x axis or along the y axis.
Next we calculate the derivatives of the two eigenvalues with respect to x and y
at the origin. At (0,0) any vector is an eigenvector, and we select the two coordinate
unit vectors as a basis, that is

    U = [ 1  0 ]
        [ 0  1 ] .

We first calculate derivatives with respect to x, and using Eqs. (7.3.21) and (7.3.22)
we get

    A = [ 0  1 ]        B = [ 1  0 ]
        [ 1  0 ] ,          [ 0  1 ] .

The solution of the eigenvalue problem, Eq. (7.3.20), is

    dμ/dx = ±1 ,    q¹ = (√2/2) { 1 }        q² = (√2/2) { −1 }
                                { 1 } ,                  {  1 } ,
and because U is the unit matrix, from Eq. (7.3.19) u^i = q^i. It is easy to check that
these are indeed the eigenvectors along the x axis (y = 0). Similarly, for derivatives
with respect to y we have

    A = [ 1  0 ]        B = [ 1  0 ]
        [ 0  0 ] ,          [ 0  1 ] ,

and the two eigenvalues of Eq. (7.3.20) are

    dμ/dy = 1 ,    dμ/dy = 0 .
To see that the above derivatives cannot be used to calculate the change in μ due to
a simultaneous change in x and y, consider an infinitesimal change dy = 2dx = 2dt.
From the solution for the two eigenvalues, Eq. (a), we have

    dμ = dt ± √2 dt .

On the other hand, Eq. (7.3.23) yields four values depending on which of two values
we use for the x and y derivatives. These are 3dt, dt, dt, and −dt. •••
The implications of the failure of calculating a derivative in an arbitrary direction
from derivatives in the coordinate directions are quite serious. Most optimization al-
gorithms rely on these calculations to choose move directions or to estimate objective
function and constraints. Therefore, these algorithms could experience serious dif-
ficulties for problems with repeated eigenvalues. On the bright side, computational
experience shows that even minute differences between eigenvalues are often sufficient
to prevent such difficulties. Furthermore, the coalescence of eigenvalues often has an
adverse effect on structural performance. In buckling problems it is associated with
imperfection sensitivity, and for structural control problems coalescence of vibration
frequencies can lead to control difficulties. Therefore, constraints are often used to
separate the eigenvalues in design problems.
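The reduced subproblem of Eqs. (7.3.20)-(7.3.22) is small and cheap to solve. A sketch for the matrices of Example 7.3.2 (K = [[2+y, x], [x, 2]], W = M = I, evaluated at x = y = 0):

```python
import numpy as np

U = np.eye(2)                      # basis of the repeated eigenspace
M = np.eye(2)
dK_dx = np.array([[0.0, 1.0], [1.0, 0.0]])   # dK/dx at the origin
dK_dy = np.array([[1.0, 0.0], [0.0, 0.0]])   # dK/dy at the origin

def eigenvalue_derivs(dK):
    A = U.T @ dK @ U               # Eq. (7.3.21); dM/dx = 0 here
    B = U.T @ M @ U                # Eq. (7.3.22); equals I for this basis
    assert np.allclose(B, np.eye(2))
    return np.linalg.eigvalsh(A)   # eigenvalues of Eq. (7.3.20) with B = I

print(eigenvalue_derivs(dK_dx))    # [-1., 1.]
print(eigenvalue_derivs(dK_dy))    # [0., 1.]
```

Combining these coordinate-direction values linearly, as in Eq. (7.3.23), reproduces the four values 3dt, dt, dt, and −dt of Example 7.3.2, none of which needs to equal the true directional derivative.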
Substituting into the damped equations of motion, Eq. (7.3.24), a solution of the form

    u(t) = u e^{μt} ,    (7.3.25)

we get

    [μ²M + μC + K] u = 0 .    (7.3.26)

Note that we have not defined the eigenvalue μ in the way we did for the undamped
vibration problem. There μ was the square of the frequency, while here, when C = 0,
we get μ = iω where ω is the vibration frequency. The derivative of the eigenvalue
μ with respect to a design variable x is obtained by differentiating Eq. (7.3.26) with
respect to x and premultiplying by u^T

    dμ/dx = −u^T (μ² dM/dx + μ dC/dx + dK/dx) u / [ u^T (2μM + C) u ] .    (7.3.27)
This equation can be used for estimating the effect of adding a small amount of
damping to an undamped system. For the undamped system C = 0, the eigenvalue
is μ = iω, and the eigenvector is the vibration mode that we will denote here as φ to
distinguish it from the damped mode u. Then Eq. (7.3.27) becomes

    dμ/dx = −φ^T (−ω² dM/dx + iω dC/dx + dK/dx) φ / ( 2iω φ^T M φ ) .    (7.3.28)
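The small-damping estimate can be checked numerically for the two-mass system (assuming K = [[2, −1], [−1, 2]], M = I, dC/dc = [[1, 0], [0, 0]]), where Example 7.3.3 gives dμ/dc = −0.25 at c = 0. A sketch using a state-space linearization of the quadratic eigenproblem:

```python
import numpy as np

K = np.array([[2.0, -1.0], [-1.0, 2.0]])
M = np.eye(2)

def first_eigenvalue(c):
    C = np.array([[c, 0.0], [0.0, 0.0]])
    # companion (state-space) form of mu^2 M + mu C + K = 0
    A = np.block([[np.zeros((2, 2)), np.eye(2)],
                  [-np.linalg.solve(M, K), -np.linalg.solve(M, C)]])
    w = np.linalg.eigvals(A)
    w = w[w.imag > 0]
    return w[np.argmin(w.imag)]        # eigenvalue near mu = i

h = 1e-5
dmu_dc = (first_eigenvalue(h) - first_eigenvalue(0.0)) / h
print(dmu_dc.real)                     # close to -0.25
```

The imaginary part of the finite difference is nearly zero, consistent with the observation in Example 7.3.3 that light damping barely changes the frequency.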
Example 7.3.3
Use linear extrapolation to estimate the effect of the dashpot in Figure (7.3.1) on the
first vibration mode, and then compare with the exact effect for c = 0.2, and c = 1.0.
For this example we take x = c and then (using K and M from Example 7.3.1)

    dM/dx = dK/dx = 0 ,    dC/dx = [ 1  0 ]
                                   [ 0  0 ] .

Using the first vibration mode from Example (7.3.1), which is normalized so that
φ^T M φ = 1, (φ¹)^T = (√2/2)[1, 1], we get

    dμ/dc = dμ/dx = −0.5 φ^T (dC/dx) φ = −0.25 .
From Example (7.3.1), the frequency of the first natural mode is ω₁ = 1 (which
corresponds to μ = i in the notation of this section). Then using linear extrapolation
to calculate an approximate eigenvalue μ_a we get

    μ_a = μ|_{c=0} + (dμ/dc) c = −0.25c + i .
For the two given values of c = 0.2, and c = 1.0, the approximate eigenvalues are
−0.05 + i, and −0.25 + i, respectively. We compare this approximation to the exact
result obtained by solving Eq. (7.3.26); this yields

    [ μ² + cμ + 2      −1     ] { u₁ }
    [     −1         μ² + 2   ] { u₂ } = 0 .    (a)
The eigenvalue μ is obtained by setting the determinant of this equation to zero. For
the two values of c we get

    c = 0.2 :  μ = −0.05025 + 1.0013i ,
    c = 1.0 :  μ = −0.29178 + 1.0326i .
We see that the prediction that c changes only the damping and not the frequency
is quite good, and that linear extrapolation worked quite well for predicting the
damping. •••
The order of the damped eigenproblem is commonly reduced by approximating
the damped mode as a linear combination of a small number of natural vibration
modes u^i, i = 1, …, m. This may be written as

    u = U q ,    (7.3.29)

where U is a matrix with u^i as columns, and q is a vector of modal amplitudes.
Substituting Eq. (7.3.29) into Eq. (7.3.26) and premultiplying by U^T we get

    [μ² M_R + μ C_R + K_R] q = 0 ,    (7.3.30)

where

    M_R = U^T M U ,    C_R = U^T C U ,    K_R = U^T K U .    (7.3.31)
After we solve for the reduced eigenvector q from Eq. (7.3.30), we can calculate
the derivative of the eigenvalue using two approaches. The first approach, called the
fixed-mode approach, employs Eq. (7.3.27) with μ calculated from Eq. (7.3.30) and
u given by Eq. (7.3.29). The second approach, called the updated-mode approach,
uses Eq. (7.3.27) for the reduced problem, that is

    dμ/dx = −q^T (μ² dM_R/dx + μ dC_R/dx + dK_R/dx) q / [ q^T (2μM_R + C_R) q ] .    (7.3.32)
Example 7.3.4
For the spring-mass-dashpot example shown in Fig. (7.3.1) construct a reduced model
based only on the first vibration mode. Calculate the fixed-mode and updated-mode
derivatives of the eigenvalue associated with the lowest frequency with respect to the
constant k of the leftmost spring. Compare with the exact derivatives for c = 0.2
and c = 1.0.
Full-model analysis:
The eigenvalue problem for this example is given by Eq. (a) of Example (7.3.3),
and the exact eigenvalue is solved in that example for the two required values of c.
For the eigenvector we use a normalization condition that the second component, u₂,
is equal to 1, and employ the second equation of the eigenproblem to obtain

    u = { μ² + 2 }
        {   1    } .

To calculate the derivative of μ with respect to the stiffness k of the leftmost spring
we use Eq. (7.3.27) with matrices calculated in Examples 7.3.1 and 7.3.3

    C = [ c  0 ]        dK/dk = [ 1  0 ]
        [ 0  0 ] ,               [ 0  0 ] .

For the two values of c we get (see Example 7.3.3 for values of μ) the corresponding
exact derivatives.
Since we use only one mode for the reduced basis, U = ū, and using Eq. (7.3.31)
with k = 1 we get

    M_R = 2 ,    C_R = c ,    K_R = 2 .
Equation (7.3.30) for the reduced system becomes

    2μ_R² + cμ_R + 2 = 0 ,

so that

    μ_R = −0.25c + i√(1 − 0.0625c²) ,

where the subscript R is used to denote the fact that this is the eigenvalue obtained
from the reduced system. The eigenvector, which has only one component, we select
as q = 1. For the two values of c we get the corresponding values of μ_R.
Fixed-mode derivative: Using Eq. (7.3.27) with μ_R and the mode ū we get

    μ' = −1 / [ ū^T (C + 2μ_R M) ū ] = −1 / (c + 4μ_R) ,
    (dμ_T/dk)/μ_T = 0.02525/(−0.05025) = −0.5025 ,
Updated-mode derivative:
In this case we need the derivative of the vibration mode with respect to k. This
was calculated in Example (7.3.1) as (remember that we use ū from that example)

    ū' = { −0.5 }
         {   0  } .

The derivatives of the reduced matrices then follow; for example,

    M_R' = 2 ū^T M ū' = 2 [1  1] { −0.5 }
                                 {   0  } = −1 ,
In many applications the damping matrix is not symmetric, and then it is con-
venient to transform the equations of motion, Eq. (7.3.24), to a first order system

    B ẇ + A w = 0 ,    (7.3.34)

where

    (7.3.35)

Setting

    w = w̄ e^{μt} ,    (7.3.36)

we get a first-order eigenvalue problem

    A w̄ + μ B w̄ = 0 .    (7.3.37)

For calculating the derivatives of the eigenvalues it is convenient to use the left eigen-
vector v which is the solution of the associated eigenproblem

    A^T v + μ B^T v = 0 .    (7.3.38)
The two eigenproblems defined in Eqs. (7.3.38) and (7.3.37) are easily shown to have
the same eigenvalues (e.g., [18]). Differentiating (7.3.37) with respect to a design
variable x

    (A + μB) dw̄/dx + (dA/dx + μ dB/dx) w̄ + (dμ/dx) B w̄ = 0 ,    (7.3.39)

and premultiplying by v^T we get

    dμ/dx = −v^T (dA/dx + μ dB/dx) w̄ / ( v^T B w̄ ) .    (7.3.40)
To obtain derivatives of the eigenvector we need a normalization condition. A
quadratic condition such as Eq. (7.3.2) is inappropriate because the eigenvector is
complex and w̄^T W w̄ can be zero. Even if we eliminate this possibility by replacing
the transpose with the hermitian transpose, the condition

    w̄^H W w̄ = 1    (7.3.41)

does not define the eigenvector uniquely because we can still multiply the eigenvector
by any complex number of modulus one without changing the product in Eq. (7.3.41).
Therefore, it is more reasonable to normalize the eigenvector by requiring that

    w̄_m = v_m = 1 ,    (7.3.42)

where m is chosen so that both w̄_m and v_m are not small compared to other compo-
nents of w̄ and v. The derivative of the normalization condition gives us

    dw̄_m/dx = 0 ,    dv_m/dx = 0 ,    (7.3.43)
and together with Eq. (7.3.39) we can solve for the derivative of the eigenvector. This
is the direct method for calculating the eigenvector derivatives. As in the symmetric
case, the adjoint method for calculating the same derivatives is based on expressing
the derivative of the eigenvector in terms of all the eigenvectors of the problem.
Denoting the ith eigenvalue as μ_i and the corresponding eigenvectors as w̄^i and v^i
we assume

    dw̄^k/dx = Σ_j c_kj w̄^j .    (7.3.44)
7.3.3 Sensitivity Derivatives for Nonlinear Eigenvalue Problems
7.4 Sensitivity of Constraints on Transient Response
One way of removing the time dependence of the constraint is to replace it with an
equivalent integrated constraint which averages the severity of the constraint over the
time interval. An example is the equivalent exterior constraint

    ḡ(u, x) = −∫₀^{t_f} ⟨−g(u, x, t)⟩ dt ,    (7.4.3)

where ⟨a⟩ denotes max(a, 0). The equivalent constraint ḡ is violated if the original
constraint is violated for any finite period of time. If, however, g(u, x, t) is not violated
anywhere, ḡ(u, x) is zero. The equivalent exterior constraint is identically zero in
the feasible domain, and so no indication is provided when the constraint is almost
critical. An equivalent constraint which is nonzero when the constraint is satisfied is
based on the Kreisselmeier-Steinhauser function [21, 22], and Eq. (7.4.2)

    ḡ(u, x) = −(1/p) ln [ Σ_{i=1}^{n_t} e^{−p g_i} Δt ] ,    (7.4.4)

where p is a parameter which determines the relation between ḡ and the most critical
value of g, g_min. Indeed, we can write Eq. (7.4.4) as

    ḡ = g_min − (1/p) ln [ Σ_{i=1}^{n_t} e^{−p(g_i − g_min)} Δt ] ,    (7.4.5)
so that ḡ is an envelope constraint in that it is always more critical than g. The
parameter p determines how much more critical ḡ is. However, if p is made too
large for the purpose of reducing the difference between ḡ and g_min, the problem can
become ill conditioned.
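The envelope property is easy to see numerically. A small sketch with illustrative constraint values g_i (the shifted form of Eq. (7.4.5) is used to avoid overflow, with Δt taken as 1):

```python
import numpy as np

g = np.array([0.5, 0.2, 0.9, 0.35])    # sample constraint values (illustrative)
gmin = g.min()

def ks(p):
    # shifted KS aggregate, Eq. (7.4.5), numerically stable for large p
    return gmin - np.log(np.sum(np.exp(-p * (g - gmin)))) / p

for p in (5.0, 50.0, 500.0):
    print(p, ks(p))                    # rises toward gmin = 0.2 as p grows
```

The aggregate is always below g_min, and the gap shrinks like (ln n_t)/p, which is why very large p values trade accuracy for conditioning problems.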
The savings obtained by replacing the discretized constraint, Eq. (7.4.2), by an
equivalent one may seem illusory because the integral in Eq. (7.4.3) or the sum in
Eq. (7.4.4) usually requires the evaluation of g(u, x, t) at many time points. The
savings are realized in the optimization effort and in the computation of constraint
derivatives discussed later.
Figure 7.4.1 Constraint function versus time for the nominal design (solid line) and a perturbed design (dashed line).
The disadvantage of equivalent constraints is that they may tend to blur design
trends. Consider, for example, a change in design which moves the constraint g from
the solid to the dashed line in Fig. (7.4.1). An equivalent constraint ḡ may become
more positive, indicating a beneficial effect, while the situation has become more
critical because we have moved closer to the constraint boundary (g = 0), at least at
some time point t_m1. To avoid this blurring effect we use the critical point constraint,
replacing the original constraint by

    g(u, x, t_mi) ≥ 0 ,    (7.4.7)

where t_mi are time points where the constraint has a local minimum. Figure (7.4.1)
shows a typical situation where the constraint function has two local minima: an
interior one at t_m1, and a boundary minimum at t_m2. The local minima are critical
points in the sense that they represent time points likely to be involved first in
constraint violations.
One attractive feature of the critical point constraint is that, for the purpose of
obtaining first derivatives, the location of the critical point may be assumed to be
fixed in time. This is shown by differentiating Eq. (7.4.7) with respect to the design
variable x

    dg(t_mi)/dx = ∂g/∂x + (∂g/∂u)(du/dx) + (∂g/∂t)(dt_mi/dx) .    (7.4.8)

The last term in Eq. (7.4.8) is always zero. At an interior minimum such as t_m1 in
Fig. (7.4.1) ∂g/∂t is zero. We get a boundary minimum when ∂g/∂t is positive at
the left boundary or negative at the right boundary. This boundary minimum cannot
move away from the boundary unless the slope ∂g/∂t becomes zero. This means that
as long as ∂g/∂t is nonzero at a boundary minimum, the minimum cannot move, so
that dt_mi/dx is zero.
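This envelope argument (the dt_mi/dx term dropping out) can be illustrated with a toy constraint g(t, x) = (t − x)² + 0.1x, whose interior minimum moves with x yet whose total derivative equals the partial derivative taken at the frozen critical time:

```python
import numpy as np

def g(t, x):
    return (t - x)**2 + 0.1 * x        # interior minimum at t_m = x

def g_critical(x):
    t = np.linspace(0.0, 2.0, 20001)
    return g(t, x).min()               # critical-point constraint value

x0, h = 1.0, 1e-5
total = (g_critical(x0 + h) - g_critical(x0 - h)) / (2 * h)
frozen = (g(x0, x0 + h) - g(x0, x0 - h)) / (2 * h)   # t held at t_m = x0
print(total, frozen)                   # both close to dg/dx = 0.1
```

The motion of t_m contributes nothing to first order because ∂g/∂t vanishes there.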
For the purpose of calculating derivatives of constraints we assume that the constraint
is of the form

    g(u, x) = ∫₀^{t_f} p(u, x, t) dt ≥ 0 .    (7.4.9)

This form represents most equivalent constraints, as well as the critical-point con-
straint, which can be obtained by defining

    p(u, x, t) = g(u, x, t) δ(t − t_mi) .    (7.4.10)

Differentiating Eq. (7.4.9) with respect to a design variable x we obtain

    dg/dx = ∫₀^{t_f} [ ∂p/∂x + (∂p/∂u)(du/dx) ] dt .    (7.4.11)
To evaluate the integral we need to differentiate the equations of motion with respect
to x. These equations are written in a general first-order form

    A u̇ = f(u, x, t) ,    (7.4.12)

where the matrix A may depend on the design, and J = ∂f/∂u denotes the Jacobian
of f. Differentiating Eq. (7.4.12) with respect to x we obtain

    A d u̇/dx = J du/dx − (dA/dx) u̇ + ∂f/∂x ,    du/dx (0) = 0 .    (7.4.13)
The direct method consists of solving for du/dx from Eq. (7.4.13), and then substi-
tuting into Eq. (7.4.11). The disadvantage of this method is that each design variable
requires the solution of a system of differential equations, Eq. (7.4.13). When we have
many design variables and few constraint functions we can, as in the static case,
use a vector of adjoint variables which depends only on the constraint functions and
not on the design variables. To obtain the adjoint method, we pursue the standard
procedure of multiplying the derivatives of the response equations, Eq. (7.4.13), by
an adjoint vector and adding them to the derivatives of the constraint

    dg/dx = ∫₀^{t_f} { ∂p/∂x − λ^T (∂f/∂x − (dA/dx) u̇)
            + [ ∂p/∂u − λ^T (Ȧ + J) − λ̇^T A ] du/dx } dt + λ^T A (du/dx) |₀^{t_f} .    (7.4.16)
Equation (7.4.16) indicates that the adjoint variable should satisfy

    A^T λ̇ + (Ȧ + J)^T λ = (∂p/∂u)^T ,    λ(t_f) = 0 ,    (7.4.17)

so that

    dg/dx = ∫₀^{t_f} [ ∂p/∂x − λ^T (∂f/∂x − (dA/dx) u̇) ] dt ,    (7.4.18)
where we used the fact that du/dx is zero at t = 0. Equation (7.4.17) is a system of
ordinary differential equations for λ which are integrated backwards (from t_f to 0).
This system has to be solved once for each constraint rather than once for each design
variable. As in the static case, the direct method is preferable when the number of
design variables is smaller than the number of constraints, and the adjoint method
is preferable otherwise. Equation (7.4.17) takes a simpler form for the critical-point
constraint

    A^T λ̇ + (Ȧ + J)^T λ = (∂g/∂u)^T δ(t − t_mi) ,    λ(t_f) = 0 .    (7.4.19)

By integrating Eq. (7.4.19) from t_mi − ε to t_mi + ε for an infinitesimal ε, we can easily
show that Eq. (7.4.19) is equivalent to

    A^T λ̇ + (Ȧ + J)^T λ = 0 ,  t < t_mi ;    λ = 0 ,  t > t_mi ;    A^T λ(t_mi) = −(∂g/∂u)^T .    (7.4.20)
A third method available for derivative calculation is the Green's function ap-
proach [23]. This method is useful when the number of degrees of freedom in Eq.
(7.4.12) is smaller than either the number of design variables or the number of con-
straints. This can happen when the order of Eq. (7.4.12) has been reduced by
employing modal analysis. The Green's function method will be discussed for the
case of A = I in Eq. (7.4.12) so that Eq. (7.4.13) becomes

    (du/dx)˙ = J (du/dx) + ∂f/∂x ,    du/dx (0) = 0 .    (7.4.21)

The solution of Eq. (7.4.21) may be written [23] in terms of Green's function K(t, τ)
as

    du/dx = ∫₀^{t_f} K(t, τ) (∂f/∂x)(τ) dτ ,    (7.4.22)

where K satisfies

    K̇(t, τ) − J(t) K(t, τ) = δ(t − τ) I ,    K(0, τ) = 0 ,    (7.4.23)

and where δ(t − τ) is the Dirac delta function. It is easy to check, by direct substi-
tution, that du/dx defined by Eq. (7.4.22) indeed satisfies Eq. (7.4.21).
If the elements of J are bounded then it can be shown that Eq. (7.4.23) is
equivalent to
    K(t, τ) = 0 ,  t < τ ,
    K(τ, τ) = I ,    (7.4.24)
    K̇(t, τ) − J(t) K(t, τ) = 0 ,  t > τ .
Therefore, the integration of Eq. (7.4.22) needs to be carried out only up to τ = t. To
see how du/dx is evaluated with the aid of Eq. (7.4.24), assume that we divide the
interval 0 ≤ t ≤ t_f into n subintervals with end points at τ₀ = 0 < τ₁ < … < τ_n = t_f.
The end points τ_i are dense enough to evaluate Eq. (7.4.22) by numerical integration
and to interpolate du/dx to other time points of interest with sufficient accuracy. We
now define the initial value problems

    K̇(t, τ_k) = J(t) K(t, τ_k) ,    K(τ_k, τ_k) = I ,    k = 0, 1, …, n − 1 .    (7.4.25)

Each of the equations in (7.4.25) is integrated from τ_k to τ_{k+1} to yield K(τ_{k+1}, τ_k).
The value of K for any other pair of points is given by (see [23] for proof)

    K(τ_j, τ_k) = K(τ_j, τ_{j−1}) K(τ_{j−1}, τ_{j−2}) ⋯ K(τ_{k+1}, τ_k) ,    j > k .    (7.4.26)
The solution for K is equivalent to solving n_m systems of the type of Eq. (7.4.13)
or (7.4.20), where n_m is the order of the vector u. Therefore, the Green's function
method should be considered for cases where the number of design variables and
constraints both exceed n_m. This is likely to happen when the order of the system
has been reduced by using some type of modal or reduced-basis approximation.
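The subinterval construction of Eqs. (7.4.25) and (7.4.26) can be sketched for a scalar system with J(t) = −t, where K(t, τ) = exp(−(t² − τ²)/2) in closed form; the chained product of the subinterval propagators reproduces the direct value:

```python
import numpy as np

def J(t):
    return -t                          # scalar system, chosen for illustration

def propagate(t0, t1, n=200):
    # integrate Kdot = J(t) K from K(t0, t0) = 1 with classical RK4
    h = (t1 - t0) / n
    k, t = 1.0, t0
    for _ in range(n):
        k1 = J(t) * k
        k2 = J(t + h / 2) * (k + h / 2 * k1)
        k3 = J(t + h / 2) * (k + h / 2 * k2)
        k4 = J(t + h) * (k + h * k3)
        k += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return k

taus = np.linspace(0.0, 1.0, 5)        # 4 subintervals
pieces = [propagate(taus[i], taus[i + 1]) for i in range(4)]
chained = np.prod(pieces)              # Eq. (7.4.26); a matrix product in general
print(chained, np.exp(-0.5))           # both close to K(1, 0) = exp(-1/2)
```

For a matrix system the pieces do not commute, so the product in Eq. (7.4.26) must be taken in the stated order.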
Example 7.4.1
Consider the single equation

    a u̇ = (u − b)² ,    u(0) = 0 ,

with the constraint

    g(u) = c − u(t) ≥ 0 .

The response has been calculated and found to be monotonically increasing, so that
the critical-point constraint takes the form

    g = c − u(t_f) ≥ 0 .

We want to use the direct, adjoint, and Green's function methods to calculate the
derivative of g with respect to a and b.
The problem may be integrated directly to yield

    u = b²t / (bt + a) .

In our notation

    A = a ,    J = ∂f/∂u = 2(u − b) .
Direct Method. The direct method requires us to write Eq. (7.4.13) for x = a and
x = b. For x = a we obtain

    a (du/da)˙ = 2(u − b)(du/da) − u̇ ,    du/da (0) = 0 .
In general the values for u and u̇ would be available only numerically, so that the
equation for du/da will also be integrated numerically. Here, however, we have the
closed-form solution for u, so that we can substitute it into the derivative equation

    a (du/da)˙ = −(2ab/(bt + a)) (du/da) − ab²/(bt + a)² ,    du/da (0) = 0 ,

which can be solved to give

    du/da = −b²t/(bt + a)² .

Then

    dg/da = −(du/da)(t_f) = b²t_f/(bt_f + a)² .
We now repeat the process for x = b. Equation (7.4.13) becomes

    a (du/db)˙ = 2(u − b)(du/db) − 2(u − b) ,    du/db (0) = 0 .

Solving for du/db we obtain

    du/db = (b²t² + 2abt)/(bt + a)² ,

and then

    dg/db = −(du/db)(t_f) = −(b²t_f² + 2abt_f)/(bt_f + a)² .
Adjoint Method. The adjoint method requires the solution of Eq. (7.4.20) which
becomes

    a λ̇ + 2(u − b) λ = 0 ,

or

    a λ̇ − (2ab/(bt + a)) λ = 0 ,

which can be integrated to yield

    λ = (1/a) [ (bt + a)/(bt_f + a) ]² .

Then dg/da is obtained from Eq. (7.4.18), which becomes

    dg/da = ∫₀^{t_f} λ u̇ dt = ∫₀^{t_f} b²/(bt_f + a)² dt = b²t_f/(bt_f + a)² .

Similarly, dg/db is

    dg/db = ∫₀^{t_f} 2λ(u − b) dt = −(2b/(bt_f + a)²) ∫₀^{t_f} (bt + a) dt = −(b²t_f² + 2abt_f)/(bt_f + a)² .
Green's Function Method. To use the Green's function method we first rewrite the
state equation in the form of Eq. (7.4.21),

    u̇ = (u − b)²/a ,

so that J = 2(u − b)/a = −2b/(bt + a), and Eq. (7.4.24) becomes

    k̇(t, τ) + (2b/(bt + a)) k(t, τ) = 0 ,    k(τ, τ) = 1 .
The solution for k is

    k = [ (bτ + a)/(bt + a) ]² ,    t ≥ τ ,

so that from Eq. (7.4.22)

    du/da = ∫₀^t k(t, τ) (∂f/∂a) dτ = −∫₀^t b²/(bt + a)² dτ = −b²t/(bt + a)² .

Similarly,

    du/db = ∫₀^t k(t, τ) (∂f/∂b) dτ = (b²t² + 2abt)/(bt + a)² ,

which lead to the same constraint derivatives as before. •••
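All three sets of results in Example 7.4.1 can be confirmed with a finite difference on the closed-form response u(t) = b²t/(bt + a):

```python
def u_tf(a, b, tf=1.0):
    return b**2 * tf / (b * tf + a)    # closed-form response at t = tf

a, b, tf = 1.0, 2.0, 1.0
h = 1e-7
dg_da_fd = -(u_tf(a + h, b) - u_tf(a - h, b)) / (2 * h)  # g = c - u(tf)
dg_db_fd = -(u_tf(a, b + h) - u_tf(a, b - h)) / (2 * h)

dg_da = b**2 * tf / (b * tf + a)**2                          # = 4/9 here
dg_db = -(b**2 * tf**2 + 2 * a * b * tf) / (b * tf + a)**2   # = -8/9 here
print(dg_da_fd, dg_da)
print(dg_db_fd, dg_db)
```

The numerical values a = 1, b = 2, t_f = 1 are chosen arbitrarily for the check; any positive values work.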
For the case of linear structural dynamics it may be advantageous to retain the second-
order equations of motion rather than reduce them to a set of first-order equations.
It is also common to use modal reduction for this case. In this section we discuss the
application of the direct and adjoint methods to this special case. The equations of
motion are written as
    M ü + C u̇ + K u = f(t) .    (7.4.27)
Most often the problem is reduced in size by expressing u in terms of m basis functions
u^i, i = 1, …, m, where m is usually much less than the number of degrees of freedom
of the original system, Eq. (7.4.27):

    u = U q ,    (7.4.28)

which leads to the reduced system

    M_R q̈ + C_R q̇ + K_R q = f_R ,    (7.4.29)

where

    M_R = U^T M U ,    C_R = U^T C U ,    K_R = U^T K U ,    f_R = U^T f .    (7.4.30)
When the basis functions are the first m natural vibration modes of the structure
scaled to unit modal masses, U satisfies the equation

    K U = M U Ω² ,    (7.4.31)

where Ω is a diagonal matrix with the ith natural frequency ω_i in the ith row. In that
case K_R = Ω² and M_R = I are diagonal matrices. For special forms of damping, the
damping matrix C_R is also diagonal so that the system Eq. (7.4.29) is uncoupled.
After q is calculated from Eq. (7.4.29) we can use Eq. (7.4.28) to calculate u. This
modal reduction method is known as the mode-displacement method.
When the load f has spatial discontinuities the convergence of the modal approx-
imation, Eq. (7.4.29) can be very slow [24, 25]. The convergence can be dramatically
accelerated by using the mode acceleration method, originally proposed by Williams
[26]. The mode acceleration method can be derived by rewriting Eq. (7.4.27) as

    u = K⁻¹ f − K⁻¹ C u̇ − K⁻¹ M ü .    (7.4.32)

The first term in Eq. (7.4.32) is called the quasi-static solution because it represents
the response of the structure if the loads are applied very slowly. The second and
third terms are approximated in terms of the modal solution. It can be shown
(e.g., Greene [27]) that K⁻¹ can be approximated as

    K⁻¹ ≈ U Ω⁻² U^T .    (7.4.33)

Using this approximation for the second and third terms of Eq. (7.4.32) we get

    u = K⁻¹ f − U Ω⁻² (C_R q̇ + q̈) .    (7.4.34)
This approximation is exact when U contains the full set of vibration modes. Note
that q and q in Eq. (7.4.34) are obtained from the mode-displacement solution, Eq.
(7.4.29). Therefore, there is no difference in velocities and accelerations between the
mode-displacement and the mode acceleration methods.
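The modal flexibility approximation behind Eq. (7.4.33) is exact when all modes are kept; a quick check for M = I (so the mass-normalized modes are orthonormal), with an arbitrary 3-dof stiffness matrix chosen for illustration:

```python
import numpy as np

K = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
omega2, U = np.linalg.eigh(K)          # M = I: U^T U = I, U^T K U = diag(omega2)
K_inv_modal = U @ np.diag(1.0 / omega2) @ U.T   # Eq. (7.4.33) with all modes
print(np.allclose(K_inv_modal, np.linalg.inv(K)))   # True
```

With a truncated U the relation is only approximate; the K⁻¹f term of Eq. (7.4.34) then restores the static flexibility of the omitted modes, which is the source of the improved convergence.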
In considering the calculation of sensitivities we treat first the mode-displacement
method. The direct method of calculating the response sensitivity is obtained by
differentiating Eq. (7.4.29) to obtain
    M_R (dq/dx)¨ + C_R (dq/dx)˙ + K_R (dq/dx) = r ,    (7.4.35)

where

    r = df_R/dx − (dM_R/dx) q̈ − (dC_R/dx) q̇ − (dK_R/dx) q .    (7.4.36)
The derivative of K_R with respect to x is given by Eq. (7.3.33), and similar expres-
sions are used for the derivatives of M_R, C_R, and f_R. The calculation is simplified
considerably by using a fixed set of basis functions U or neglecting the effect of the
change in the modes. In some cases (e.g., [28]) the error associated with neglecting
the effect of changing modes is small. When this error is unacceptable we have to
face the costly calculation of the derivatives of the modes needed for calculating the
derivatives of the reduced matrices, such as Eq. (7.3.33). Fortunately it was found
by Greene [27] that the cost of calculating the derivatives of the modes can be sub-
stantially reduced by using the modified modal method Eq. (7.3.15) keeping only the
first term in this equation. This approximation to the derivatives of the modes may
not always be accurate, but it appears to be sufficient for calculating the sensitivity
of the dynamic response.
For the adjoint method we consider a constraint in the form of Eq. (7.4.9)

    g(q, x) = ∫₀^{t_f} p(q, x, t) dt ≥ 0 ,    (7.4.37)
so that

    dg/dx = ∫₀^{t_f} [ ∂p/∂x + (∂p/∂q)(dq/dx) ] dt .    (7.4.38)
To avoid the calculation of dq/dx we multiply the response derivative equation, Eq.
(7.4.35), by an adjoint vector, λ, and add to the derivative of the constraint

    dg/dx = ∫₀^{t_f} ( ∂p/∂x + (∂p/∂q)(dq/dx) ) dt
            + ∫₀^{t_f} λ^T ( −M_R (dq/dx)¨ − C_R (dq/dx)˙ − K_R (dq/dx) + r ) dt .    (7.4.39)
We want to get rid of the response derivative terms by selecting λ appropriately.
We use integration by parts to remove the time derivatives in the response derivative
terms. We obtain

dg/dx = ∫₀^{t_f} {∂p/∂x + λᵀr + [∂p/∂q − λ̈ᵀM_R + λ̇ᵀC_R − λᵀK_R] dq/dx} dt
        − λᵀM_R dq̇/dx |₀^{t_f} + λ̇ᵀM_R dq/dx |₀^{t_f} − λᵀC_R dq/dx |₀^{t_f} .          (7.4.40)
If the initial conditions do not depend on the design variable x, Eq. (7.4.40) suggests
the following definition for λ

M_R λ̈ − C_R λ̇ + K_R λ = (∂p/∂q)ᵀ ,     λ(t_f) = λ̇(t_f) = 0 ,          (7.4.41)
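A minimal sketch of this adjoint procedure for a single degree of freedom (the numerical values and the constraint g = ∫q² dt are assumptions for illustration): in reversed time τ = t_f − t the adjoint equation of Eq. (7.4.41) becomes an ordinary damped oscillator with zero initial conditions, and the constraint derivative follows from the pseudo load of Eq. (7.4.36) without ever forming dq/dx:

```python
# Adjoint sensitivity of g = integral of q^2 for m q'' + c q' + k q = f0, t >= 0.
# The adjoint equation m*lam'' - c*lam' + k*lam = dp/dq = 2q with
# lam(tf) = lam'(tf) = 0 becomes, in reversed time tau = tf - t,
#     m*lam'' + c*lam' + k*lam = 2 q(tf - tau),  lam(0) = lam'(0) = 0.
# Then dg/dc = integral of lam * r dt with r = -q' (Eq. (7.4.36) for x = c).
def response(c, m=1.0, k=4.0, f0=2.0, dt=1.0e-3, n=5000):
    q = qd = 0.0
    hist = []                       # store (q, q') at each step
    for _ in range(n):
        qdd = (f0 - c * qd - k * q) / m
        qd += dt * qdd; q += dt * qd
        hist.append((q, qd))
    return hist

def g_of(hist, dt):
    return sum(q * q for q, _ in hist) * dt

c0, dt = 0.05, 1.0e-3
hist = response(c0)
# integrate the adjoint forward in reversed time against the stored response
lam = lamd = 0.0
dgdc = 0.0
for q_rev, qd_rev in reversed(hist):
    lamdd = 2.0 * q_rev - c0 * lamd - 4.0 * lam   # m = 1, k = 4 hard-coded
    lamd += dt * lamdd; lam += dt * lamd
    dgdc += lam * (-qd_rev) * dt                  # accumulate lam * r * dt
# finite-difference check on g itself
h = 1.0e-5
fd = (g_of(response(c0 + h), dt) - g_of(response(c0 - h), dt)) / (2.0 * h)
```

One backward (adjoint) integration serves any number of design variables, whereas the direct method of Eq. (7.4.35) needs one forward integration per variable; the discretizations of the forward and adjoint sweeps here are not exact transposes of one another, so the two derivative estimates agree only to integration accuracy.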
7.5 Exercises
[Figure: two-span beam with spans of lengths l and 2l, loaded axially at both ends by P]
Figure 7.5.2 Two-span beam.
7. The beam shown in Fig. (7.5.2) needs to be stiffened to increase its buckling load.
Calculate the derivative of the buckling load with respect to the moment of inertia of
the left and right segments, and decide what is the most economical way of stiffening
the beam. Assume that the cost is proportional to the mass, and the cross-sectional
area is proportional to the square root of the moment of inertia.
8. Obtain an expression for the second derivatives of the buckling load with respect
to structural parameters.
9. Repeat Example 7.3.4 for the derivative with respect to c instead of k.
10. Consider the equation of motion for a mass-spring-damper system

mẅ + cẇ + kw = f(t) ,

where f(t) = f₀H(t) is a step function, and w(0) = ẇ(0) = 0. Calculate the derivative
of the maximum displacement with respect to c for the case k/m = 4, c/m = 0.05,
f₀/m = 2 using the direct method.
11. Obtain the derivatives of the maximum displacement in Problem 10 with respect
to c, m, f₀ and k using the adjoint method.
12. Solve Problem 10 using the Green's function method.
13. Solve Problem 10 using the mode-displacement and mode-acceleration methods
with a single mode.
7.6 References
[1] Gill, P.E., Murray, W., Saunders, M.A., and Wright, M.H., "Computing Forward-
Difference Intervals for Numerical Optimization," SIAM J. Sci. and Stat. Comp.,
Vol. 4, No. 2, pp. 310-321, June 1983.
[2] Iott, J., Haftka, R.T., and Adelman, H.M., "Selecting Step Sizes in Sensitivity
Analysis by Finite Differences," NASA TM-86382, 1985.
[3] Haftka, R.T., "Sensitivity Calculations for Iteratively Solved Problems," Inter-
national Journal for Numerical Methods in Engineering, Vol. 21, pp. 1535-1546,
1985.
[4] Haftka, R.T., "Second-Order Sensitivity Derivatives in Structural Analysis,"
AIAA Journal, Vol. 20, pp. 1765-1766, 1982.
[5] Barthelemy, B., Chon, C.T., and Haftka, R.T., "Sensitivity Approximation of
Static Structural Response," paper presented at the First World Congress on
Computational Mechanics, Austin, Texas, Sept. 1986.
[6] Barthelemy, B., and Haftka, R.T., "Accuracy Analysis of the Semi-analytical
Method for Shape Sensitivity Calculations," Mechanics of Structures and Ma-
chines, Vol. 18, No. 3, pp. 407-432, 1990.
[7] Barthelemy, B., Chon, C.T., and Haftka, R.T., "Accuracy Problems Associated
with Semi-Analytical Derivatives of Static Response," Finite Elements in Analysis
and Design, Vol. 4, pp. 249-265, 1988.
[8] Haug, E.J., Choi, K.K., and Komkov, V., Design Sensitivity Analysis of Structural
Systems, Academic Press, 1986.
[9] Cardani, C., and Mantegazza, P., "Calculation of Eigenvalue and Eigenvector
Derivatives for Algebraic Flutter and Divergence Eigenproblems," AIAA Journal,
Vol. 17, pp. 408-412, 1979.
[10] Murthy, D.V., and Haftka, R.T., "Derivatives of Eigenvalues and Eigenvectors
of General Complex Matrix," International Journal for Numerical Methods in
Engineering, Vol. 26, pp. 293-311, 1988.
[11] Nelson, R.B., "Simplified Calculation of Eigenvector Derivatives," AIAA Journal,
Vol. 14, pp. 1201-1205, 1976.
[12] Rogers, L.C., "Derivatives of Eigenvalues and Eigenvectors," AIAA Journal, Vol.
8, No. 5, pp. 943-944, 1970.
[13] Wang, B.P., "Improved Approximate Methods for Computing Eigenvector Deriva-
tives in Structural Dynamics," AIAA Journal, Vol. 29, No. 6, pp. 1018-1020, 1991.
[14] Sutter, T.R., Camarda, C.J., Walsh, J.L., and Adelman, H.M., "Comparison of
Several Methods for the Calculation of Vibration Mode Shape Derivatives," AIAA
Journal, Vol. 26, No. 12, pp. 1506-1511, 1988.
[15] Ojalvo, I.U., "Efficient Computation of Mode-Shape Derivatives for Large Dy-
namic Systems," AIAA Journal, Vol. 25, No. 10, pp. 1386-1390, 1987.
[16] Mills-Curran, W.C., "Calculation of Eigenvector Derivatives for Structures with
Repeated Eigenvalues," AIAA Journal, Vol. 26, No. 7, pp. 867-871, 1988.
[17] Dailey, R.L., "Eigenvector Derivatives with Repeated Eigenvalues," AIAA Jour-
nal, Vol. 27, No. 4, pp. 486-491, 1989.
[18] Wilkinson, J.H., The Algebraic Eigenvalue Problem, Clarendon Press, Oxford,
1965.
[19] Bindolino, G., and Mantegazza, P., "Aeroelastic Derivatives as a Sensitivity Anal-
ysis of Nonlinear Equations," AIAA Journal, Vol. 25, No. 8, pp. 1145-1146, 1987.
Chapter 8. Introduction to Variational Sensitivity Analysis
The methods for discrete sensitivity analysis discussed in the previous chapter
are very general in that they may be applied to a variety of nonstructural sensitivity
analyses involving systems of linear equations, eigenvalue problems, etc. However, for
structural applications they have two disadvantages. First, not all methods of struc-
tural analysis lead to the type of discretized equations that are discussed in Chapter
7. For example, shell-of-revolution codes such as FASOR [1] directly integrate the
equations of equilibrium without first converting them to systems of algebraic equa-
tions. Second, operating on the discretized equations often requires access to the
source code of the structural analysis program which implements these equations.
Unfortunately, many of the popular structural analysis programs do not provide such
access to most users. It is desirable, therefore, to have sensitivity analysis methods
that are more generally applicable and can be implemented without extensive access
to and knowledge of the insides of structural analysis programs. Variational methods
of sensitivity analysis achieve this goal by differentiating the equations governing the
structure before they are discretized. The resulting sensitivity equations can then be
solved with the aid of a structural analysis program. It is not even essential that the
same program be used for the analysis and the sensitivity calculations.
As an example of this approach consider the Euler-Bernoulli plane beam governed
by the differential equation
(EIw,ₓₓ),ₓₓ = q ,          (8.1)
where w denotes the transverse displacement, EI is the flexural rigidity and q is the
load. Equation (8.1) is supplemented by appropriate boundary conditions. Imagine
that we have to design a class of structures that are modeled well by this beam equa-
tion with complex loading and boundary conditions corresponding to intermediate
supports. We have an old computer program, written to solve this problem, for which
we do not have any programming documentation. We now want to use this program
to calculate the sensitivity of the response to changes in the stiffness properties of the
beam. Finite difference sensitivity calculations are, of course, the first choice in this
type of situation. However, difficulties in finding good step-sizes for accurate deriva-
tives (see Section 7.1) force us to consider the calculation of analytical derivatives.
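The step-size difficulty can be seen even on a toy "analysis". In the sketch below the structural program is replaced by sin(p) — a hypothetical stand-in, not anything from the text. The forward-difference error falls roughly linearly with the step, and a central difference is far more accurate at the same step; for very small steps round-off eventually dominates every formula (see Section 7.1):

```python
# Truncation error of finite-difference derivative formulas applied to a
# black-box "analysis"; here analysis(p) = sin(p) so the exact derivative
# cos(p) is known.
import math

def analysis(p):
    return math.sin(p)

def forward(p, h):
    return (analysis(p + h) - analysis(p)) / h

def central(p, h):
    return (analysis(p + h) - analysis(p - h)) / (2.0 * h)

p, exact = 1.0, math.cos(1.0)
# forward-difference error at three steps: O(h) truncation behavior
err_f = {h: abs(forward(p, h) - exact) for h in (1e-1, 1e-3, 1e-5)}
# central difference at the middle step: O(h^2) truncation
err_c = abs(central(p, 1e-3) - exact)
```

With a real structural code the function values also carry solver noise, so below some step the differences lose all significant digits — which is the dilemma that motivates the analytical derivatives of this chapter.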
We start by differentiating Eq. (8.1) with respect to a parameter p (since x is used in
this chapter to denote a coordinate variable, we use p for the generic design variable)
which affects the moment of inertia of the beam over part of the span

(EIw,pₓₓ),ₓₓ = −((EI),p w,ₓₓ),ₓₓ .          (8.2)
Section 8.1: Linear Static Analysis
operator notation for these equations. The notation is compact and allows for ease in
algebraic manipulations. However, it is abstract, and it is not always easy to grasp.
The reader who has trouble with the notation may want to translate the abstract
equations for a specific case such as plane-stress or beam analysis. For linear analysis,
the strain displacement relation is written as
e = L₁(u) ,          (8.1.1)

where e is the generalized strain tensor, u is the displacement vector, and L₁ is a
linear differential operator. For example, for Euler-Bernoulli beam analysis the gen-
eralized strain tensor has one component, the curvature κ, and Eq. (8.1.1) translates
into

κ = w,ₓₓ .          (8.1.2)

The strain is obtained from the generalized strain κ as ε = −yκ, where y is the distance
from the neutral axis of the beam. However, in using the principle of virtual work
it is convenient to use the generalized strain and stress tensors rather than actual
strains and stresses.
For plane-stress analysis e has actual strain components εₓ, ε_y, and γₓy, while L₁
is given as

       ⎡ ∂/∂x    0   ⎤
L₁ =   ⎢  0    ∂/∂y  ⎥ .          (8.1.3)
       ⎣ ∂/∂y  ∂/∂x  ⎦
However, the constitutive equations are written in terms of generalized stresses which
are the stress resultants.
The linear constitutive equations are the appropriate version of Hooke's law, and
may be written as
σ = D(e − eⁱ) ,          (8.1.4)

where σ is the generalized stress tensor, D is the material stiffness matrix, and eⁱ
is the initial strain (e.g., due to an applied temperature field). For example, for the
plane-stress problem, σ includes the stress resultant components Nₓ, N_y, and Nₓy,
while for a beam in bending the stress resultant is the section bending moment M,
and the constitutive equation is

M = EI(κ − κⁱ) ,          (8.1.5)
where E is Young's modulus and I is the section moment of inertia.
The equations of equilibrium are written via the principle of virtual work as
σ • δe = f • δu ,          (8.1.6)
where f is the applied load field, and a bullet denotes a scalar product followed by
integration over the structural domain. For example, for the plane stress case
σ • δe = ∫ (Nₓ δεₓ + N_y δε_y + Nₓy δγₓy) dA

and

f • δu = ∫ f · δu dA = ∫ (fₓ δu + f_y δv) dA + ∫ (Tₓ δu + T_y δv) dΓ_T ,          (8.1.7)

where fₓ and f_y are body forces per unit area, and Tₓ, T_y are tractions on the loaded
boundary Γ_T.
The virtual displacement field δu must be differentiable and satisfy the kinematic
boundary conditions, but is otherwise arbitrary. The virtual strain field δe is obtained
from the virtual displacement field via Eq. (8.1.1) as

δe = L₁(δu) .          (8.1.8)
This operator notation for the equations is quite general, in that it applies equally
to continuum problems and to discrete formulations. It is also very
convenient for sensitivity calculations. In this section we consider only sensitivities
with respect to a stiffness parameter appearing in the material stiffness matrix D.
For one- or two-dimensional problems the parameter can include sizing variables, such
as rod cross-sectional areas or plate thicknesses, since these variables are incorporated
in D (as is the beam section moment of inertia in Eq. (8.1.5)).
The direct method for sensitivity calculation is obtained by differentiating the equa-
tions defining the response of the structure with respect to p. We then obtain a set
of equations for the response sensitivities u,p, e,p, and σ,p. The governing equations
for the sensitivity fields turn out to be the same as the equations for the response
itself, albeit with different loading terms, called pseudo loads. The implication
is that if we replace the loading in the original problem by the pseudo loads
our structural analysis package will compute the response sensitivities instead of the
response. We start by differentiating the strain-displacement relation

e,p = L₁(u,p) ,          (8.1.9)

then the constitutive equation (assuming that the initial strain eⁱ does not depend
on p)

σ,p = D,p(e − eⁱ) + De,p ,          (8.1.10)

and finally the virtual-work equation (the load f is likewise assumed independent of p)

σ,p • δe = 0 ,          (8.1.11)

where δe, given by Eq. (8.1.8), is not a function of p because δu is an arbitrary field.
Note that all the sensitivity fields have units of the original fields divided by units of
p. For example, if p represents the cross-sectional area of a truss member then σ,p
has units of stress divided by area.
We now compare the differentiated equations, Eqs. (8.1.9), (8.1.10), and (8.1.11),
to the original governing equations, Eqs. (8.1.1), (8.1.4), and (8.1.6). We see that the
sensitivity fields u,p, e,p, σ,p can be viewed as the solution of the original structure
under a different set of loads called the pseudo loads. These loads do not include any
mechanical components, but just an initial strain field eᵖ. This initial strain field is
obtained by rearranging Eq. (8.1.10) as

σ,p = D(e,p − eᵖ) ,     with  eᵖ = −D⁻¹D,p(e − eⁱ) .          (8.1.12)
For example, for truss members the relation between the generalized stress (member
force N) and the strain is
N = EA(ε − εⁱ) .          (8.1.13)

Differentiating this equation with respect to A we get

N,_A = EA[ε,_A + (ε − εⁱ)/A] ,          (8.1.14)

so that to implement the direct method we need to apply an initial strain of magnitude
−(ε − εⁱ)/A instead of the actual loads.
As another example consider the isotropic plane-stress case, where the constitu-
tive equations are
⎧ Nₓ  ⎫      Eh     ⎡ 1   ν      0     ⎤ ⎧ εₓ  ⎫
⎨ N_y ⎬ =  ──────   ⎢ ν   1      0     ⎥ ⎨ ε_y ⎬ .          (8.1.15)
⎩ Nₓy ⎭    1 − ν²   ⎣ 0   0   (1−ν)/2  ⎦ ⎩ γₓy ⎭
By differentiating Eq. (8.1.15) with respect to the thickness h we can show that to
find the sensitivity with respect to a change in thickness we need to apply a pseudo
initial strain of eᵖ = −e/h. To obtain the sensitivity with respect to Poisson's ratio ν we
note that
oo
2(1 + v)
1,Dv= Eh [2V
l+v2
1 + v2
2v o 1.
, (1 - v 2 )2 0 o _ (1_;)2
(8.1.16)
so that we need to apply a pseudo initial strain of

            1    ⎧  νεₓ + ε_y  ⎫
eᵖ = − ────────  ⎨  εₓ + νε_y  ⎬ .          (8.1.17)
        1 − ν²   ⎩ −(1−ν)γₓy   ⎭
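These pseudo strains can be checked numerically. The sketch below (the material numbers are arbitrary assumptions) builds the matrix D of Eq. (8.1.15), forms eᵖ = −D⁻¹D,p e by differencing D itself, and compares the result with −e/h for the thickness and with Eq. (8.1.17) for Poisson's ratio:

```python
# Check of the pseudo initial strains of Eqs. (8.1.15)-(8.1.17):
# e_p = -D^{-1} D,p e must reduce to -e/h for p = h and to Eq. (8.1.17)
# for p = nu.
import numpy as np

E, h, nu = 70e9, 2e-3, 0.3          # assumed material/thickness values

def D(h, nu):
    return E * h / (1 - nu ** 2) * np.array([[1.0, nu, 0.0],
                                             [nu, 1.0, 0.0],
                                             [0.0, 0.0, (1 - nu) / 2]])

e = np.array([1e-3, -2e-4, 5e-4])   # arbitrary strain state (eps_x, eps_y, gam_xy)

d = 1e-9                            # dD/dh by central differences (D is linear in h)
D_h = (D(h + d, nu) - D(h - d, nu)) / (2 * d)
e_p_h = -np.linalg.solve(D(h, nu), D_h @ e)

dn = 1e-7                           # dD/dnu by central differences
D_nu = (D(h, nu + dn) - D(h, nu - dn)) / (2 * dn)
e_p_nu = -np.linalg.solve(D(h, nu), D_nu @ e)

ex, ey, gxy = e                     # closed form of Eq. (8.1.17)
e_p_nu_closed = -np.array([nu * ex + ey,
                           ex + nu * ey,
                           -(1 - nu) * gxy]) / (1 - nu ** 2)
```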
When we analyze the structure using a finite-element model, the pseudo initial
strain is converted to a pseudo nodal force fᵖ such that

Deᵖ • δe = fᵖ • δu .          (8.1.18)
With other solution techniques the pseudo load is obtained from the initial strain
in a different manner. For example, in a three dimensional continuum formulation
the pseudo initial strain, eᵖ, can be replaced by pseudo body forces with components
fᵢ = (Deᵖ)ᵢⱼ,ⱼ and surface tractions with components Tᵢ = (Deᵖ)ᵢⱼnⱼ, where nⱼ are
the components of the vector normal to the boundary S, and a comma followed by
an index j denotes a derivative with respect to the coordinate xⱼ.
Figure 8.1.1 Three bar truss.
Example 8.1.1
Calculate the derivative of the stress in members A, B and C of the truss in Fig.
(8.1.1) with respect to the area of member B. At the nominal configuration all three
members have the same area A.
We assume that the areas of members A and C remain the same, A_A, and denote
the area of member B as A_B. Due to symmetry, the vertical force contributes only
to the vertical displacement, and the horizontal force only to the horizontal displace-
ment. Furthermore, member B does not influence the horizontal displacement. It
is easy then to check that the two displacements at the point of load application are
given as

u = 4P_H l/(3EA_A) ,     v = P_V l/[(A_B + 0.25A_A)E] .          (a)
The forces in members A, B and C are then calculated to be

N_A = 0.57735P_H + 0.25P_V A_A/(A_B + 0.25A_A) = 0.97735P ,
N_B = P_V A_B/(A_B + 0.25A_A) = 1.6P ,          (b)
N_C = −0.57735P_H + 0.25P_V A_A/(A_B + 0.25A_A) = −0.17735P .

We can calculate the derivatives of these forces with respect to A_B analytically for the
purpose of comparing them later with the derivatives we obtain using the direct method

dN_A/dA_B = dN_C/dA_B = −0.25P_V A_A/(A_B + 0.25A_A)² = −0.32P/A ,
dN_B/dA_B = 0.25P_V A_A/(A_B + 0.25A_A)² = 0.32P/A .
For our problem, we need to apply to member B a pseudo initial strain

εᵖ = −ε_B/A_B = −N_B/(EA_B²) = −1.6P/(EA²) ,
while for other members the pseudo initial strain is zero. Note that as with all
sensitivity fields the units of the pseudo initial strain are units of strain divided
by units of p (area here). The displacement field generated by this initial strain is
obtained by applying to member B a pair of opposite forces, with the force at the
bottom joint (having units of force over area) being (see Eq. (8.1.18))

fᵖ = EA_B εᵖ = −1.6P/A ,

and then

dN_B/dA_B = EA_B (dε_B/dA_B − εᵖ) = −1.28P/A + 1.6P/A = 0.32P/A .
We note that both derivatives agree with the expressions we obtained by explicit
differentiation.
To calculate the derivatives of the stresses from the derivatives of the loads we
note that
σ_A = N_A/A_A ,     σ_B = N_B/A_B ,     σ_C = N_C/A_A ,

and therefore

dσ_A/dA_B = (1/A_A) dN_A/dA_B = −0.32P/A² ,

dσ_C/dA_B = (1/A_A) dN_C/dA_B = −0.32P/A² ,

and

dσ_B/dA_B = (1/A_B) dN_B/dA_B − N_B/A_B² = 0.32P/A² − 1.6P/A² = −1.28P/A² . •••
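The numbers of this example can be reproduced in a few lines of Python. The load values P_H = P and P_V = 2P used below are inferred from the numbers quoted in the example (the figure defining the loads is not reproduced here), so treat them as assumptions:

```python
# Numerical check of Example 8.1.1 (loads P_H = P, P_V = 2P assumed, as
# implied by the quoted values).  Nondimensional E = A = l = P = 1.
P, A, E = 1.0, 1.0, 1.0
PV = 2.0 * P

def NB(AB, AA=A):                      # member force N_B from Eq. (b)
    return PV * AB / (AB + 0.25 * AA)

# exact derivative from the closed form
dNB_exact = 0.25 * PV * A / (A + 0.25 * A) ** 2          # = 0.32 P / A

# direct method: pseudo initial strain eps_p = -eps_B/A_B gives the pseudo
# load f_p = E*A*eps_p; the member strain it produces follows from Eq. (a)
eps_p = -NB(A) / (E * A * A)                             # = -1.6 P/(E A^2)
f_p = E * A * eps_p                                      # = -1.6 P / A
deps_dA = f_p / ((A + 0.25 * A) * E)                     # = -1.28 P/(E A^2)
dNB_direct = E * A * (deps_dA - eps_p)                   # Eq. (8.1.12)

# finite-difference check on the closed form
h = 1e-7
dNB_fd = (NB(A + h) - NB(A - h)) / (2 * h)
```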
8.1.2 The Adjoint Method
Often we do not need the derivatives of the entire displacement or stress fields, but
only a few quantities such as the derivative of the vertical displacement at a point, or
the von Mises stress at another point. In such cases it may be more economical to use
the adjoint method to calculate these derivatives. We therefore consider the adjoint
method for calculating derivatives of displacement and stress functionals. Consider
first a displacement functional defined by an integral over the structural domain V
H = ∫ h(u, p) dV .          (8.1.19)
This could also be used to represent the value of a displacement component at a point
by employing the Dirac delta function as part of h. The derivative of H with respect
to a design parameter p is

H,p = ∫ h,p dV + h,u • u,p ,          (8.1.20)

where h,u is a load-like vector field (recall that a bullet denotes a scalar product
followed by integration over the structure). For example, in a plane-stress case if
h = u² + v², then

h,u = {2u, 2v}ᵀ .          (8.1.21)
The calculation of h,p and h,u is typically easy; the main difficulty is to obtain the
derivative of the displacement field, u,p. We can use the direct method to calculate
u,p, but instead, as shown below, we can define an adjoint problem with h,u as
the load, and use it to eliminate u,p. Since we want the derivative of H with the
requirement that Eqs. (8.1.1), (8.1.4) and (8.1.6) are satisfied, we multiply these
equations by some appropriate Lagrange multipliers (called adjoint variables) and
add them to H. The Lagrange multipliers for Eqs. (8.1.1) and (8.1.4) are an adjoint
stress field and an adjoint strain field, respectively. Equation (8.1.6) represents the
equations of equilibrium written as the work done on a virtual displacement field δu,
and the corresponding virtual strain field δe = L₁(δu). Multiplying the equations of
equilibrium by a Lagrange multiplier is equivalent to calculating the work done when
this Lagrange multiplier is treated as a virtual displacement field. So we replace the
δu by the adjoint displacement field. Denoting the adjoint fields by a superscript a
we get

H* = H + σᵃ • [e − L₁(u)] + eᵃ • [σ − D(e − eⁱ)] + f • uᵃ − σ • L₁(uᵃ) .          (8.1.22)
Because Eqs. (8.1.1), (8.1.4) and (8.1.6) have to be satisfied for all values of p, we
have H* = H, and H*,p = H,p. We will now differentiate Eq. (8.1.22), and then define
the adjoint fields so as to get rid of the terms involving the (expensive) derivatives of
the response. The derivative of Eq. (8.1.22) with respect to p is
We can get rid of the terms involving u,p and e,p by requiring the adjoint fields to
satisfy the linear strain-displacement relationship and Hooke's law

eᵃ = L₁(uᵃ) ,          (8.1.24)

σᵃ = Deᵃ .          (8.1.25)
The terms involving u,p can be removed by requiring the adjoint field to satisfy the
equilibrium equations with a body force equal to h,u, so that from the principle of
virtual work

σᵃ • δe = h,u • δu .          (8.1.26)

Indeed, if we choose δu = u,p in Eq. (8.1.26) we get the desired elimination of the
u,p terms. Altogether we get

H,p = ∫ h,p dV − eᵃ • D,p(e − eⁱ) .          (8.1.28)
When we use the finite element method for the analysis we can transform the second
term further. To this end we set δe = eᵃ, δu = uᵃ in Eq. (8.1.18) to obtain

H,p = ∫ h,p dV + fᵖ • uᵃ .          (8.1.29)
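In discrete (finite-element) terms the same construction reads: for K(p)u = f and a functional H = hᵀu, one adjoint solve Kλ = h replaces a pseudo-load solve per design variable, and H,p = −λᵀK,p u, the matrix counterpart of Eq. (8.1.28). A sketch with a hypothetical two-degree-of-freedom system (all matrices invented for illustration):

```python
# Discrete analogue of the adjoint method for a displacement functional:
# K(p) u = f,  H = h^T u;  solve K lam = h (K symmetric), then
# H,p = -lam^T K,p u.
import numpy as np

def stiffness(p):
    # hypothetical 2-DOF stiffness; p scales one spring
    return np.array([[p + 2.0, -2.0],
                     [-2.0, 5.0]])

K_p = np.array([[1.0, 0.0], [0.0, 0.0]])   # dK/dp
f = np.array([1.0, 3.0])
h = np.array([0.0, 1.0])                   # H = u_2

p0 = 4.0
u = np.linalg.solve(stiffness(p0), f)      # state solve
lam = np.linalg.solve(stiffness(p0), h)    # adjoint solve (K symmetric)
H_p = -lam @ K_p @ u

# finite-difference check
d = 1e-6
u_plus = np.linalg.solve(stiffness(p0 + d), f)
u_minus = np.linalg.solve(stiffness(p0 - d), f)
H_p_fd = h @ (u_plus - u_minus) / (2 * d)
```

The direct method would instead solve K u,p = −K,p u once per design variable; the adjoint route wins when there are many variables and few functionals.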
The treatment of a generalized stress functional is similar. We limit the treatment
to the case where there is no initial strain in the structure (that is, mechanical loads
are allowed, but no temperature loading, dislocations, etc.) and consider the stress
functional

G = ∫ g(σ, p) dV          (8.1.31)

and its derivative

G,p = ∫ g,p dV + g,σ • σ,p ,          (8.1.32)

where g,σ is a strain-like tensor. Again, to get rid of the expensive derivative of
the response, σ,p, we add the adjoint terms as Lagrange multipliers on Eqs. (8.1.1),
(8.1.4) and (8.1.6)

G* = G + σᵃ • [e − L₁(u)] + eᵃ • [σ − De] + f • uᵃ − σ • L₁(uᵃ) .          (8.1.33)
We differentiate Eq. (8.1.33) with respect to p; from the resulting equation we can
see that we can eliminate the terms including derivatives of the response by using
an adjoint strain-displacement relation in the form of Eq. (8.1.24), and setting
Hooke's law for the adjoint field as

σᵃ = D(eᵃ − g,σ) ,          (8.1.36)

and equilibrium as

σᵃ • δe = 0 .          (8.1.37)

That is, in this case the adjoint loading is an initial strain g,σ with no mechanical
load. Then

G,p = ∫ g,p dV + (g,σ − eᵃ) • D,p e .          (8.1.38)
While Eq. (8.1.38) gives us G,p without the need to first calculate the design sensi-
tivity field, its second term involves calculations of stiffness matrix derivatives at the
element level, and may require some knowledge of the details of the finite-element
analysis. To overcome this problem we note that by using Eq. (8.1.36) we can
transform the second term of Eq. (8.1.38) into

(g,σ − eᵃ) • D,p e = −D⁻¹σᵃ • D,p e ,          (8.1.39)

so that using Eq. (8.1.12), which with eⁱ = 0 reduces to eᵖ = −D⁻¹D,p e, we can also
write G,p as

G,p = ∫ g,p dV + σᵃ • eᵖ .          (8.1.40)
In obtaining Eq. (8.1.40) we used the fact that if σ₁ and σ₂ are two stress tensors,
then σ₁ • D⁻¹σ₂ = D⁻¹σ₁ • σ₂. As with the displacement functional we can also write
G,p in terms of the pseudo load. We use Eq. (8.1.18) with (δu, δe) set to (uᵃ, eᵃ) and
Eq. (8.1.12) to obtain

fᵖ • uᵃ = Deᵖ • eᵃ = −eᵃ • D,p e ,          (8.1.41)

and then Eq. (8.1.38) becomes

G,p = ∫ g,p dV + fᵖ • uᵃ + g,σ • D,p e .          (8.1.42)
The last term in Eq. (8.1.42) still involves computations with displacements and
strains which may not be easy to implement in a general structural analysis code.
However, when G is simply the average stress (not generalized stress!) in an element,
the first and last terms often cancel. Consider, for example, the average stress in the
ith element of a truss. In a truss element the generalized stress is the member force
N, so

G = (1/(A_i l_i)) ∫_{l_i} N dl ,     g,_N = 1/(A_i l_i) ,     g,_{A_i} = −N/(A_i² l_i) .          (8.1.43)
When we need the derivative of G with respect to a design variable which does not
affect the ith element, both the first and third terms in Eq. (8.1.42) are zero. For the
derivative of G with respect to the area of the ith element we have from Eq. (8.1.42)
(using D,p = E and ε = N/AE) that the first and third terms cancel, so that

G,_{A_i} = fᵖ • uᵃ .          (8.1.44)
Note that, as in the discrete case, both the direct and adjoint methods use the pseudo
load fᵖ. In the direct method fᵖ is applied to the structure to stand for the pseudo
initial strain eᵖ of Eq. (8.1.12), and the response to that load is u,p. In the adjoint
method fᵖ is used to form a scalar product with the adjoint displacement field uᵃ, as
in Eq. (8.1.42).
Example 8.1.2
We solve Example (8.1.1) again using the adjoint method to obtain the derivatives
of the stresses in members A and B with respect to the cross-sectional areas of both
members.
Consider first the stress in member B written in terms of the generalized stress
(member force) N_B

G = σ_B = (1/(A_B l_B)) ∫_{l_B} N_B dl_B .

The adjoint load is an initial strain g,σ which is denoted here as g,_N because the
member force N is the only component of σ

g,_N = 1/(l_B A_B) = 1/(lA) .

Note that the adjoint initial strain is measured in units of 1/(volume), in contrast
to the dimensionless physical strains. As a result, all the units of the adjoint field
will be the original ones divided by volume. As in Example (8.1.1), the effect of this
initial strain is obtained by applying a pair of opposite forces to member B, with the
force at the bottom being EA_B g,_N = E/l. Using Eq. (a) of Example (8.1.1) we get

vᵃ = (E/l) l/[(A_B + 0.25A_A)E] = 0.8/A .
Following Eq. (8.1.44) we multiply this by the pseudo load fᵖ of −1.6P/A obtained
in Example (8.1.1) to get

dσ_B/dA_B = G,_{A_B} = (0.8/A)(−1.6P/A) = −1.28P/A² ,
which agrees with the result obtained in Example (8.1.1).
Next we calculate the derivative of σ_B with respect to A_A. As in Example (8.1.1)
we need to calculate the pseudo load due to a change in A_A. This change affects
both members A and C, leading to pseudo initial strains of −ε_A/A_A and −ε_C/A_A,
respectively. This in turn leads to pseudo loads of −N_A/A_A in the direction of member
A and −N_C/A_A in the direction of member C. The components of the pseudo load
are

fᵖ_H = −(N_A/A_A − N_C/A_A) sin 60° = −P/A ,
fᵖ_V = −(N_A/A_A + N_C/A_A) cos 60° = −0.4P/A ,

where the values of N_A and N_C are substituted from Eq. (b) of Example (8.1.1).
Multiplying the adjoint displacement by the pseudo load we obtain

dσ_B/dA_A = G,_{A_A} = (0.8/A)(−0.4P/A) = −0.32P/A² ,

which can be easily checked directly.
Next we calculate the derivatives of σ_A by considering the functional

G = (1/(A_A l_A)) ∫_{l_A} N_A dl_A .

We need to impose an adjoint initial strain of

g,_N = 1/(l_A A_A) = 1/(2lA) .

This is implemented by applying a pair of opposite forces at the two nodes of member
A of magnitude EA_A g,_N = E/2l collinear with member A. The horizontal and vertical
components of the adjoint force at the bottom node are

pᵃ_H = 0.433E/l ,     pᵃ_V = 0.25E/l .

Using Eq. (a) of Example 8.1.1 we get

vᵃ = (0.25E/l) l/[(A_B + 0.25A_A)E] = 0.2/A ,     uᵃ = 4(0.433E/l) l/(3EA_A) = 0.57735/A .
To get the derivative of the stress with respect to A_B we multiply the adjoint dis-
placements by the pseudo load associated with A_B to obtain

dσ_A/dA_B = (0.2/A)(−1.6P/A) = −0.32P/A² .

Similarly, to obtain the derivative with respect to A_A we multiply the adjoint dis-
placements by the pseudo loads associated with A_A

dσ_A/dA_A = (0.57735/A)(−P/A) + (0.2/A)(−0.4P/A) = −0.65735P/A² .

This last result can be checked directly by using the expression for N_A in Eq. (b) of
Example (8.1.1). •••
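All four derivatives obtained in this example can be verified by finite differences on the closed-form member forces. As before, the loads P_H = P and P_V = 2P in the sketch below are inferred from the numbers quoted in the examples, not stated in this excerpt:

```python
# Finite-difference check of the adjoint results of Example 8.1.2, using the
# closed-form member forces of Eq. (b) (loads P_H = P, P_V = 2P assumed).
P = 1.0

def sigma_A(AA, AB):
    NA = 0.57735 * P + 0.25 * (2.0 * P) * AA / (AB + 0.25 * AA)
    return NA / AA

def sigma_B(AA, AB):
    NB = (2.0 * P) * AB / (AB + 0.25 * AA)
    return NB / AB

h, A = 1e-7, 1.0                        # central differences at A_A = A_B = 1
d_sigB_dAB = (sigma_B(A, A + h) - sigma_B(A, A - h)) / (2 * h)
d_sigB_dAA = (sigma_B(A + h, A) - sigma_B(A - h, A)) / (2 * h)
d_sigA_dAB = (sigma_A(A, A + h) - sigma_A(A, A - h)) / (2 * h)
d_sigA_dAA = (sigma_A(A + h, A) - sigma_A(A - h, A)) / (2 * h)
```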
8.1.3 Implementation Notes
In general, the direct method is easier to implement than the adjoint method, par-
ticularly if the implementation is outside the structural analysis program. The direct
method will require a postprocessor that calculates the value of the pseudo initial
strain from the values of the actual strains based on Eq. (8.1.12). The derivative of
the material stiffness matrix D,p which needs to be evaluated in this postprocessor
requires knowledge of the form of Hooke's law used in the analysis program, but not
of any finite element implementation. Then the values of the pseudo strain can be
used as initial strain input to the same structural analysis package. The output of
the package will be the sensitivity field, instead of the response. If the structural
analysis package does not have the capability of accepting initial strain input it is
often possible to use a combination of a temperature field and anisotropic coefficients
of thermal expansion to get the required initial strains.
The implementation of the adjoint method for a stress functional, Eq. (8.1.31),
is more complicated. First we need to implement the calculation of an initial strain
field g,σ, which is usually fairly simple. We then need to implement Eq. (8.1.42),
which requires calculations at the element level for a finite element program. The
discussion following Eq. (8.1.42) shows that when the stress functional is just the
stress itself this difficulty can in many cases be bypassed.
8.2 Nonlinear Static Analysis and Limit Loads
In this section we generalize the results of the previous section to the case of geo-
metric nonlinearity. We consider only the case where the nonlinearity is adequately
represented by replacing Eq. (8.1.1) by

e = L₁(u) + ½L₂(u) ,          (8.2.1)
where L2 is a second order homogeneous operator. For example, for the nonlinear
deformation of a beam under lateral and axial loads, the generalized strain has one
component of axial strain εₓ and one component of curvature κ, and Eq. (8.2.1) is
written as

εₓ = u,ₓ + ½(w,ₓ)² ,     κ = w,ₓₓ ,          (8.2.2)

where L₁₁ is a symmetric bilinear operator, i.e. L₁₁(u, v) = L₁₁(v, u), defined by

L₂(u + v) = L₂(u) + 2L₁₁(u, v) + L₂(v) .          (8.2.5)
In solving nonlinear analysis problems it is customary to increase the load gradually
from zero to its final value. To accommodate this practice we assume that the load
f and the initial strain eⁱ depend on a load amplitude parameter μ, that is

f = μf̄ ,     eⁱ = μēⁱ .          (8.2.6)

The structural response can then be obtained by solving Eqs. (8.2.1), (8.1.4) and
(8.1.6) as a function of the load parameter μ.
Unfortunately, in the nonlinear regime the response is not always a single-valued
function of the load parameter μ. Figure (8.2.1) shows a typical load-displacement
curve for two values of the stiffness parameter p. At load levels near the maximum
(limit load), there are two solutions for each value of μ. Structural analysis packages
that solve for nonlinear response often use more general parameters for tracing the
response curve. A typical parameter is the arc length in the (u, μ) space. We call
any parameter that is used to trace an equilibrium path (that is, a path of solutions
to Eqs. (8.2.1), (8.1.4) and (8.1.6)) a path parameter.
[Figure 8.2.1: load-displacement curves for two values of the stiffness parameter p]
In terms of a finite element analysis, the load sensitivity equations are governed by
the tangent stiffness matrix. So the only difference between the linear and nonlinear
sensitivity calculation is that the pseudo initial strain is applied to the "tangent"
structure instead of the original structure. Finally, let us note that both the load
sensitivity equations and the design sensitivity equations are linear, even though the
analysis problem is nonlinear. This is a general property of sensitivity analysis of
nonlinear problems.
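A one-variable illustration of this linearity (the cubic "structure" below is invented for illustration, not taken from the text): the nonlinear state is found by Newton iteration, while its design sensitivity comes from a single linear solve with the converged tangent operator:

```python
# The design sensitivity of a nonlinear problem is governed by the (linear)
# tangent operator.  Hypothetical one-DOF equilibrium:
#     R(w, p) = p*w + w**3 - F = 0     (p plays the role of a stiffness).
# Newton's method solves the state; then (dR/dw) w_p = -dR/dp gives
#     (p + 3 w**2) w_p = -w.
def solve(p, F=1.0, w=0.0):
    for _ in range(50):                 # Newton iteration on R(w, p) = 0
        R = p * w + w ** 3 - F
        w -= R / (p + 3.0 * w ** 2)     # tangent (Jacobian) solve
    return w

p0 = 2.0
w = solve(p0)
w_p = -w / (p0 + 3.0 * w ** 2)          # one linear solve with the tangent

h = 1e-6                                # finite-difference check
w_p_fd = (solve(p0 + h) - solve(p0 - h)) / (2 * h)
```

Note that the sensitivity solve reuses the tangent already factored at the converged state, so it costs far less than the nonlinear analysis itself.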
It can be shown that the effect of nonlinearity on the adjoint method is similar
to its effect on the direct method. That is, in the case of a displacement functional
H of Eq. (8.1.19) the adjoint structure satisfies
(8.2.16)
(8.2.17)
(8.2.18)
The adjoint structure is therefore the tangent structure with h,u as the applied load
(see also [3]). To implement the calculation of the adjoint field in a structural analysis
package, we need only replace the actual load by h,u in the load sensitivity module.
It can be shown (Exercise 5) that Eq. (8.1.28) is still applicable. Similarly, in the
case of a stress functional G of Eq. (8.1.31) we apply an initial strain g,σ to the
tangent structure, and we can still obtain Eq. (8.1.42).
Example 8.2.1
The beam shown in Figure (8.2.2) has a cross-sectional area A₀, a moment of inertia
I = 0.001A₀L², and is subject to a constant applied temperature, T (measured
from the stress-free temperature), and a variable transverse load, μP. The applied
temperature T is selected so that the resulting axial load is close to the buckling load
limit, that is, EA₀εⁱ = EA₀αT = 7.5EI/L², where α is the coefficient of thermal
expansion, and the applied load is P = 1.2 × 10⁻⁴EA₀. We want to calculate the
derivative of the displacement under the load, w_m, with respect to the cross-sectional
area A (assuming that P and I remain constant).
[Figure 8.2.2: beam of span 2L with the transverse load μP applied at midspan]
For a beam under combined axial and bending actions the generalized strain
tensor has two components fx and "', and the generalized stress tensor includes the
axial load N and the bending moment M. The nonlinear strain-displacement relation
for the beam is given by Eq. (8.2.2), and Hooke's law is

N = EA(εₓ − εⁱ) ,     M = EIκ ,

where εⁱ = αT. The virtual work equation is

∫₀^{2L} (N δεₓ + M δκ) dx = μP δw_m ,

where

δεₓ = δu,ₓ + w,ₓ δw,ₓ ,     δκ = δw,ₓₓ .
First we solve the analysis problem in closed form based on a simple finite-element
model. Because of the symmetry we need analyze only the left half of the beam,
using half of the force, and symmetry conditions of u = 0 and w,ₓ = 0 in the middle.
We approximate the left half of the beam by a single beam finite element with linear
variation of u and cubic variation of w. Using the boundary conditions and the
finite-element shape functions we have

u = 0 ,     w = w_m(3x̄² − 2x̄³) ,     where x̄ = x/L , w̄ = w_m/L .
(b)
This equation can be verified by differentiating Eq. (a) with respect to p.
Direct method: Equations (8.2.13)-(8.2.15) become (remember that I is constant)

We note that the sensitivity equations are identical to the tangent state equations
except that instead of the load P we have the pseudo initial strain εᵖ = −(εₓ − εⁱ)/A.
Using Eq. (8.1.18), we find that the initial strain gives rise to a pseudo load defined
by
L
P P 6wm = - 1 E(f x - fi)&",dx = -E 1L[18w2(i; - i;2)2 - fi]36iv6w(i; - i;2)2dx
= -1.02857Ew 3 6w m + 1.2EfiW6w m .
(c)
The design sensitivity equation is obtained from the load sensitivity equation, Eq.
(b), by replacing the actual load (0.5P) with the pseudo load, pp and replacing w'
with W,A, so that the equation for W,A is
which is identical to the result obtained from the direct method .•••
Next we consider the calculation of a limit load; here the load sensitivity equations,
Eqs. (8.2.10)-(8.2.12), become singular. To circumvent the problem associated with
this singularity it is customary to define the response path in terms of a parameter
other than the load (e.g., a displacement component or an arc-length parameter).
We specialize Eqs. (8.2.7)-(8.2.9) to that case, where the parameter controls the
response and μ, but not the stiffness (that is, Ḋ = 0). At the limit point, μ̇ = 0,
and we denote the derivative of the response with respect to the path parameter by
a subscript 1. That is, Eqs. (8.2.7)-(8.2.9) become
(8.2.19)
This equation requires the derivatives of the prebuckling response. We can eliminate
these derivatives without using an adjoint field by noting the similarity of the
numerator to what we get by substituting δu = u*,p into Eq. (8.2.21)
(8.2.27)
To make Eq. (8.2.27) more similar to the numerator of Eq. (8.2.26) we use Eqs.
(8.2.21) and (8.2.24) to rewrite Eq. (8.2.27) as
σ*,p • e₁ − D,p(e* − eᵢ) • e₁ + μ*,p D eᵢ₁ • e₁ + σ* • L₁₁(u₁, u*,p) = 0 .   (8.2.28)
Finally, combining Eqs. (8.2.26) and (8.2.28) we get a form of the derivative of the
limit load with respect to a stiffness parameter
μ*,p = [D,p(e* − eᵢ) • e₁] / [f,μ • u₁ + D eᵢ₁ • e₁] ,   (8.2.29)
that does not require derivatives of the prebuckling response. This expression can be
simplified further for the case of finite-element calculations. Using Eqs. (8.1.12) and
(8.1.18) we get
μ*,p = −f^{p*} • u₁ / [(f,μ + f^{i1}) • u₁] ,   (8.2.30)
where f^{p*} is the pseudo load of Eq. (8.1.18) evaluated at the limit point, and f^{i1} is
the equivalent nodal load due to the initial strain eᵢ₁.
The above calculation appears to be applicable also to bifurcation buckling. However,
for bifurcation buckling μ̇ in Eq. (8.2.9) is not zero. The consistency condition
for this equation is that the right-hand side be orthogonal to the nonzero solution of
the homogeneous problem, u₁. That is, for the bifurcation problem (f,μ + f^{i1}) • u₁ = 0,
and we cannot use Eqs. (8.2.29) and (8.2.30). The sensitivity of bifurcation buckling
loads is discussed in the next section.
Example 8.2.2
The two-bar truss shown in Figure 8.2.3 is subject to a constant load P and a variable
negative applied temperature −μT. As the truss is cooled the displacement h under
the load will increase until a limit point is reached and the truss collapses. We want to
calculate the derivative of the limit load factor μ* with respect to the cross-sectional
area A for A = A0. The other parameters of the problem are P/EA0 = 0.001,
αT = 0.01, and θ = 10°, where E is Young's modulus, and α is the coefficient of
thermal expansion.
Because of symmetry we need analyze only one half of the truss, applying to
it one half of the mechanical load. We select a coordinate x that runs along the
truss member. The strain-displacement relation, Hooke's law, and the virtual work
equation are given as
Equation (a) can be used to trace the response of the truss as the temperature is
increased. For a given load parameter μ this requires the solution of a cubic equation.
However, it is possible instead to gradually increase the displacement h̄ and calculate
the resultant μ. Tracing the curve we find that the limit load factor is μ* = 0.56274,
corresponding to a displacement h̄ = 0.09424. Since this problem has only one degree
of freedom, the buckling mode has only one component, h₁, and we can take it to
have a unit value.
To calculate the sensitivity of μ* using Eq. (8.2.30) we also need e − eᵢ at the
limit point. Using the expression for the strain in terms of h we get
This force is also collinear with the element. We now use Eq. (8.2.30), noting that
for our case f,μ = 0
Section 8.3: Vibration and Buckling
A structural analysis package for nonlinear analysis will typically have facilities for
generating the derivatives of the applied loads with respect to the loading parameter
μ, and for solving the tangent equations of equilibrium at any value of that load.
For the sensitivity of static response using the direct method only the second is
needed. The procedure is identical to that used in the linear case (see Section 8.1.3).
The actual load is replaced by the initial strains associated with the stiffness change
(Eq. (8.1.12)), and the tangent equations of equilibrium are solved by the structural
analysis package. The output of the package will then be the sensitivity to the stiffness
variable.
The adjoint method is similar to that used in the linear case. The same adjoint
load is used, but it is applied to the tangent system. Equations (8.1.28) and (8.1.42)
are still applicable. However, for nonlinear analysis there is even less of a reason to
use the adjoint method than in the linear case. In nonlinear analysis the cost of the
analysis is much larger than the cost of the sensitivity calculations (which are always
linear). Therefore, even when the number of response functionals to be differentiated
is much smaller than the number of design variables, the direct method is still a
reasonable choice.
For sensitivity of limit loads, Eq. (8.2.30) is easy to implement. It requires
calculation of the pseudo load associated with the stiffness change, and the computation
of two scalar products: of the pseudo load and of the actual load (including both
mechanical and initial strain components) with the buckling mode.
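As an illustrative sketch (not code from the text), the evaluation of Eq. (8.2.30) reduces to two dot products once the finite-element vectors are assembled; the 3-DOF vectors below are hypothetical placeholders.

```python
import numpy as np

def limit_load_sensitivity(f_pseudo, f_mu, f_i1, u1):
    """Derivative of the limit load factor, Eq. (8.2.30):
    mu*_,p = -(f^p* . u1) / ((f_,mu + f^i1) . u1),
    where f_pseudo is the pseudo load at the limit point, f_mu the
    derivative of the applied load with respect to the load parameter,
    f_i1 the nodal load due to the initial-strain rate, and u1 the
    buckling (zero-frequency) mode."""
    return -np.dot(f_pseudo, u1) / np.dot(f_mu + f_i1, u1)

# hypothetical vectors for a 3-DOF model, for illustration only
f_pseudo = np.array([0.2, -0.1, 0.05])
f_mu = np.array([0.0, 1.0, 0.0])
f_i1 = np.array([0.1, 0.2, 0.0])
u1 = np.array([0.0, 1.0, 0.0])
print(limit_load_sensitivity(f_pseudo, f_mu, f_i1, u1))
```

Note that no derivatives of the prebuckling response enter; only the buckling mode and a handful of assembled load vectors are needed.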
(8.3.4)
Chapter 8: Introduction to Variational Sensitivity Analysis
(8.3.5)
(8.3.8)
Then subtracting Eq. (8.3.8) from Eq. (8.3.7) and using Eqs. (8.3.5) and (8.3.6) we
can get (Exercise 7)
The first and last terms in the numerator of Eq. (8.3.9) correspond to the derivatives
of the stiffness matrix and mass matrix, respectively, in Eq. (7.3.5). When we
calculate derivatives of natural frequencies the other terms in the numerator vanish.
However, for the vibration frequencies of a loaded structure we need the other terms,
which contain derivatives of the static field u, σ with respect to p. These derivatives
need to be calculated by solving Eqs. (8.2.13)-(8.2.15).
The derivative of the buckling load is obtained from the condition that ω² = 0
at buckling. As p changes, μ* must change with it so that ω² remains zero, that is,
d(ω²) = 0. Thus
(8.3.10)
where a prime denotes a derivative with respect to μ. The first term in Eq. (8.3.10) is
the change in ω² at a fixed load level, and the second is the change in ω² due to a
change in load level. These two changes add up to zero, so that the frequency remains
zero at the buckling load. Equation (8.3.10) gives
(8.3.11)
To calculate the derivative of the frequency with respect to the load parameter μ we
start by differentiating Eqs. (8.3.1)-(8.3.3) with respect to μ and then set δu = u₁
(8.3.12)
(8.3.13)
σ′₁ • e₁ + σ₁ • L₁₁(u′, u₁) + σ′ • L₂(u₁) + σ • L₁₁(u′₁, u₁) = (ω²)′ M u₁ • u₁ + ω² M u′₁ • u₁ .
(8.3.14)
Next, we eliminate the derivatives of the vibration field with respect to μ by setting
δu = u′₁ in Eq. (8.3.3) and using Eq. (8.2.3)
(8.3.15)
and then subtracting Eq. (8.3.15) from Eq. (8.3.14) and using Eqs. (8.3.2), (8.3.12),
and (8.3.13) to get
Finally, substituting Eqs. (8.3.9) and (8.3.16) evaluated at the buckling load into Eq.
(8.3.11) gives
where the asterisk denotes prebuckling quantities evaluated at the buckling load.
Note that the field u₁, σ₁ now denotes the zero-frequency or buckling mode.
Example 8.3.1
The beam in Example (8.2.1) has a mass density ρ. Calculate the derivative of the
lowest frequency of lateral vibration with respect to the cross-sectional area A with
the applied load parameter μ = 1 (assuming again that I and P do not change).
We use the same single finite-element approximation for half the beam that we
used in Example (8.2.1). Assuming a symmetric mode shape, we find the vibration
mode
To calculate the vibration frequency we use the Rayleigh quotient, Eq. (8.3.4). The
first term in the numerator is
Using Eqs. (8.3.1) and (8.3.2) and expressions from Example (8.2.1) we have
so that
ω² = (3.08571 E A w̄² + 12 E I/L² − 1.2 E A εi) / (0.3714 ρ A L²) = 0.01077 E/(ρL²) .
Note that for the unloaded beam, w̄ = 0, εi = 0, we get
ω = 5.68 √(EI/(ρ A L⁴)) ,
which is about 1.5% above the exact answer. We can differentiate ω² with respect to
A, for an analytical derivative that we can use later for comparison
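The quoted error can be checked arithmetically. We assume, consistent with the boundary conditions used above, that the exact solution is that of a clamped-clamped beam of total span 2L, whose standard first-mode coefficient is 22.373:

```python
import math

# one-element Rayleigh estimate for the unloaded beam:
# omega^2 = 12 EI/L^2 / (0.3714 rho A L^2), i.e. a coefficient
# sqrt(12/0.3714) in units of sqrt(EI/(rho A L^4))
approx = math.sqrt(12.0 / 0.3714)

# clamped-clamped beam of span 2L: omega = 22.373*sqrt(EI/(rho A (2L)^4)),
# which is 22.373/4 in the same units
exact = 22.373 / 4.0

print(round(approx, 3), round(100.0 * (approx / exact - 1.0), 1))
```

The one-element model overestimates the frequency by about 1.6%, close to the figure quoted in the text.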
For the direct method to calculate the same derivative we use Eq. (8.3.9). The
individual terms in this equation are calculated as follows:
σ,p • L₂(u₁) = ∫₀^L N,A w²₁,x dx ,
where
So
and
Altogether
The direct sensitivity approach requires the calculation of sensitivities of the static
field (prebuckling state), Eqs. (8.2.13)-(8.2.15). This calculation can become expensive
when we need sensitivities with respect to a large number of structural parameters.
In that case an adjoint technique that eliminates the need for static sensitivities
is appropriate. As usual, we multiply the equations that govern static equilibrium by
Lagrange multipliers (that we call the adjoint fields) and add them to ω²; thus
where m₀ is the value of M u₁ • u₁ for the nominal value of p (that is, m₀ does not
change with p). The constant m₀ is included to simplify the final expressions for the
adjoint field. We differentiate Eq. (8.3.18) and use Eq. (8.3.9) to get
σ^a • [L₁(δu) + L₁₁(u, δu)] + σ • L₁₁(u^a, δu) − 2σ₁ • L₁₁(u₁, δu) = 0 .   (8.3.23)
Then the derivative of the frequency is given as
(ω²),p = (…) / (M u₁ • u₁) .   (8.3.24)
The adjoint equations, which have a homogeneous part identical to that of Eqs.
(8.3.1)-(8.3.3) for ω = 0, may be considered to be the field equations of an adjoint
structure for which the term L₂(u₁) in Eq. (8.3.21) is an initial strain term, and the
last term in Eq. (8.3.23) corresponds to body-force loading. In a buckling problem
(ω = 0) the homogeneous part is singular, the adjoint fields are not unique, and
any multiple of the buckling mode u₁ can be added. Any convenient orthogonality
relation can be used to make the adjoint fields unique.
The derivative of the buckling eigenvalue is similarly given as
μ*,p = [D,p e₁ • e₁ − D,p e* • e^a] / [2σ₁ • L₁₁(u′*, u₁) + σ′* • L₂(u₁)] .   (8.3.25)
Equation (8.3.25) is based on the buckling mode and the prebuckling state calculated
at μ = μ*. The usual practice, however, is to estimate the buckling load by solving a
linearized eigenvalue problem based on a load μ̄ < μ*. It is shown in [4] that the error
introduced in the derivative μ*,p due to this approximation is of the order of (μ* − μ̄)².
Example 8.3.2
We repeat Example (8.3.1) using the adjoint approach. We need to recalculate the
two terms that depend on the derivative of the static solution. From that example
these are
Using the adjoint method these two terms are replaced by the term
in Eq. (8.3.24).
The adjoint state, defined by Eqs. (8.3.21)-(8.3.23), has an initial strain and a
body force. The initial strain is L₂(u₁) = w²₁,x. The corresponding equivalent nodal
force, f^i, is
f^i L δw̄ = ∫₀^L w²₁,x E A δεx dx .
Using expressions from Examples (8.2.1) and (8.3.1) for δεx and w₁ we get
f^i = (1296 E A w̄ / L) ∫₀^L (x̄ − x̄²)⁴ dx = 2.05714 E A w̄ .
f = f^i + f^b = 6.17142 E A w̄ .
This force has to be applied to the tangent structure. This means that if we use it to
replace the right-hand side of the tangent state equation, Eq. (b) of Example (8.2.1),
then we must use w̄^a to replace w̄′ on the left side. That is
(3.08571 E A w̄² + 12 E I/L² − 1.2 E A εi) w̄^a = 6.17142 E A w̄ .   (b)
For later use we compare Eq. (b) to Eq. (d) of Example (8.2.1) and note that
w̄^a = 6.17142 A w̄,A / (−1.02857 w̄² + 1.2 εi) .   (c)
The initial strain of the adjoint structure gives
ε^a_x = w^a,x w,x − w²₁,x = 36(x̄ − x̄²)²(w̄^a w̄ − 1) ,
so that
A = 36(1 − w̄^a w̄) E ∫₀^L (x̄ − x̄²)² [18 w̄²(x̄ − x̄²)² − εi] dx
= (1 − w̄^a w̄) E L (1.02857 w̄² − 1.2 εi) .   (d)
We can now calculate w̄^a from Eq. (b) and A from Eq. (d) to get the derivative of
ω² without calculating w̄,A. To check that Eq. (d) gives the same result as Eq. (a) we
use Eq. (c) to obtain
Substituting this expression into Eq. (d) we find Eq. (a). •••
8.4 Static Shape Sensitivity
Consider a shape variation field φ such that a material particle located at x is
moved to x_φ
x_φ = x + φ(x, p) ,   (8.4.1)
where p is a shape design variable. The coordinate x is typically referred to as the
material or Lagrangian coordinate in that it is associated with a material particle.
The variation changes the domain V and the boundary S of the structure as
shown in Figure 8.4.1.
[Figure 8.4.1: the domain Ω(p) with boundary Γ(p), and a material point x]
Consider a function f(x, p) defined on the changing structural domain V. We
denote the partial derivative ∂f/∂p of f with respect to p by f,p. This derivative
measures the change in f at a fixed position in the structure, and is often referred
to as the local derivative. The derivative that measures the change in f at a fixed
material point needs to take into account also the change in x as p changes. This
derivative is called the material derivative or the total derivative of f, and is denoted
here by f_p
f_p = f,p + ∇fᵀ x_φ,p = f,p + ∇fᵀ v ,   (8.4.2)
where ∇f denotes the gradient of f in space, and
Consider now a vector function such as the displacement field u. For each component
uᵢ of u we can use Eq. (8.4.2) to obtain the material derivative as
Typically the material derivative is more physically interesting than the local
derivative. For example, if we change the shape of a hole boundary to relieve stress
concentration at that boundary, we would want the derivative of the stress at the
boundary rather than at a point with fixed coordinates. Mathematically, the material
derivative is more complicated to handle than the local derivative. For example,
the local derivative commutes with differentiation with respect to coordinates while
the material derivative does not. Consider, for example, the strain field associated
with a displacement field u, and denote it as e(u). The strain is obtained from the
displacements by differentiation, and since we can change the order of differentiation
for local derivatives
e,p(u) = e(u,p) ,   (8.4.10)
while we cannot write a similar equation for the material derivative e_p.
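The material derivative formula, Eq. (8.4.2), can be checked numerically on a concrete one-dimensional case; the field f and the stretching variation below are arbitrary choices for illustration.

```python
import math

# shape variation x_phi = x + p*x, so the shape velocity is v = x
def f(x, p):
    return math.sin(x) + p * x**2   # an arbitrary field on the moving domain

x0, h = 0.7, 1e-6
# material derivative: follow the particle x0 -> x0*(1 + p) as p varies about 0
fd = (f(x0 * (1 + h), h) - f(x0 * (1 - h), -h)) / (2 * h)
# Eq. (8.4.2): local derivative f,p = x^2 plus the gradient cos(x) (at p = 0) times v = x
formula = x0**2 + math.cos(x0) * x0
print(abs(fd - formula) < 1e-6)  # True
```

The first term tracks the change of f at the fixed position x0, the second the convective contribution of the moving material point.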
In order to differentiate the virtual work equation with respect to p we need to
calculate derivatives of integrals over the volume and over the surface of the structure.
Let I_V denote an integral over the domain of the structure
I_V = ∫_V f(x, p) dV .   (8.4.11)
Its derivative is
(I_V)_p = ∫_V (f_p + f V_p) dV ,   (8.4.12)
where V_p is the relative change in the volume element. It can be shown (e.g., [2]) that
V_p = v_{i,i} .   (8.4.13)
Recall that repeated indices are summed over the dimensionality of the problem, so
that in the three-dimensional case
V_p = v_{1,1} + v_{2,2} + v_{3,3} .   (8.4.14)
Similarly, for an integral over the surface of the structure
I_S = ∫_S f(x, p) dS ,   (8.4.15)
(8.4.16)
where n is the vector normal to the boundary S, and H is the curvature of S in two
dimensions and twice the mean curvature in three dimensions.
8.4.2 Domain Parametrization
The discussion of the domain parametrization is based on the work of Haber and
coworkers, and in particular [10]. With this approach the material coordinate vector
x is given in terms of some reference domain as
x = x(r, p) ,   (8.4.18)
where r is a coordinate vector in the reference domain Ω with boundary Γ, and p is
a shape parameter (see Figure 8.4.2). When isoparametric elements are used, it is
convenient to use the parent element as the reference domain for the actual element.
Specifically, for isoparametric elements the coordinate vector x in the element is
written as
x = Σᵢ hᵢ(r) dᵢ(p) ,   (8.4.19)
where the sum extends over the element nodes, hᵢ are shape functions for the element,
r is a vector of intrinsic coordinates, and dᵢ are vectors of nodal coordinates. Variations
in geometry are represented by variations of the nodal coordinates, with the shape
functions held fixed.
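A sketch of Eq. (8.4.19) for a 4-node isoparametric quadrilateral; the element geometry and the choice of which nodes move with the design variable p are hypothetical.

```python
import numpy as np

def shape_functions(r, s):
    # bilinear shape functions h_i(r, s) on the parent element [-1, 1]^2
    return 0.25 * np.array([(1 - r) * (1 - s), (1 + r) * (1 - s),
                            (1 + r) * (1 + s), (1 - r) * (1 + s)])

def x_of_r(r, s, p):
    # nodal coordinates d_i(p): the right edge moves with p, shape functions fixed
    d = np.array([[0.0, 0.0], [1.0 + p, 0.0], [1.0 + p, 1.0], [0.0, 1.0]])
    return shape_functions(r, s) @ d

# shape velocity v = x,p at a fixed reference point (r, s), by central differences
h = 1e-6
v = (x_of_r(0.5, 0.0, h) - x_of_r(0.5, 0.0, -h)) / (2 * h)
print(v)
```

Because geometry variations enter only through dᵢ(p), the shape velocity field is interpolated by the same shape functions as the geometry itself.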
[Figure 8.4.2: mapping x = x(r, p) from the reference domain Ω (boundary Γ) to the material domain]
The transformation between the reference domain and the material domain is
characterized by the Jacobian of the transformation J^E, known as the Eulerian Jacobian,
and its inverse J^{-E}
J^E_{ij} = ∂rᵢ/∂xⱼ ,   and   J^{-E}_{ij} = ∂xᵢ/∂rⱼ = x_{i,j} .   (8.4.20)
(8.4.22)
where nᵢ are the components of the unit outward normal to the surface Γ of
the reference domain, and repeated indices are summed. The derivative of J^{-E} with
respect to p is obtained from its definition
(8.4.23)
while the derivative of J^E requires using the formula for the derivative of an inverse
to get
(8.4.25)
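The inverse-derivative formula invoked here, d(J⁻¹)/dp = −J⁻¹ (dJ/dp) J⁻¹, can be verified numerically on any smooth matrix family; the 2×2 family below is an arbitrary illustration.

```python
import numpy as np

def J(p):
    # an arbitrary smoothly varying, invertible 2x2 matrix family
    return np.array([[1.0 + p, 0.3 * p], [0.2, 2.0 - p]])

p0, h = 0.4, 1e-6
dJ = (J(p0 + h) - J(p0 - h)) / (2 * h)                                   # dJ/dp
dJinv_fd = (np.linalg.inv(J(p0 + h)) - np.linalg.inv(J(p0 - h))) / (2 * h)
Jinv = np.linalg.inv(J(p0))
dJinv_formula = -Jinv @ dJ @ Jinv                                        # the formula
print(np.allclose(dJinv_fd, dJinv_formula, atol=1e-6))  # True
```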
(8.4.27)
This produces an explicit dependence of the strain on the shape parameter, and to
reflect that we rewrite Eq. (8.1.1) as
(8.4.28)
Derivatives of integrals are handled in a similar way to the material derivative
approach. A volume integral is written in terms of the reference coordinates
where f̄ is the new form of the function when it is written in terms of the reference
coordinates. Then
(8.4.30)
338
Section 8.4: Static Shape Sensitivity
where
V_p = (det J^{-E}),p / det J^{-E} .   (8.4.31)
(8.4.32)
where
(8.4.33)
To apply the direct method to shape sensitivity calculation we need to differentiate the
strain-displacement relation, Eq. (8.1.1), Hooke's law, Eq. (8.1.4), and the equilibrium
equations, Eq. (8.1.6), with respect to p. We start with the strain-displacement
relation and with the material derivative approach. Using Eqs. (8.4.9) and (8.4.10)
the differentiated strain-displacement relation is
e_p = L₁(u_p) − ē ,   (8.4.35)
where
ē = L₁[(∇u)v] − (∇e)v   (8.4.36)
is an initial strain associated with the sensitivity field. Even though Eq. (8.4.36)
appears to include strain gradients, these gradients cancel out and ē includes only
first derivatives of the displacement and shape velocity fields. For example, for the
three-dimensional case we obtain
(8.4.37)
We assume that the elastic coefficients do not change with shape change, and
that there is no initial strain. Then the derivative of Hooke's law is
(8.4.40)
339
Chapter 8: Introduction to-Variational Sensitivity Analysis
The derivative of the equations of equilibrium is
(σ • δe)_p = (f • δu)_p .   (8.4.41)
The term on the left side of Eq. (8.4.41) is a volume integral which, according to Eqs.
(8.4.12) and (8.4.30), is the volume integral of the derivative of the integrand plus a
term which accounts for the change in the volume element. This translates to
(σ • δe)_p = σ_p • δe + σ • δe_p + σ • (V_p δe) .   (8.4.42)
The derivative of the virtual strain, δe_p, is obtained in a similar manner to Eq. (8.4.35)
as
δe_p = L₁(δu_p) − δē ,   (8.4.43)
where with the material derivative approach
δē = L₁[(∇δu)v] − (∇δe)v ,   (8.4.44)
while for the domain parametrization approach
δē_{ij} = −½(δu_{i,k} J^E_{kj,p} + δu_{j,k} J^E_{ki,p}) .   (8.4.45)
The derivative of the virtual work of the applied loads is more complicated because
this work is composed of volume and surface integrals
f • δu = f^b • δu + T • δu ,   (8.4.46)
where f^b denotes the body load vector, and T denotes the vector of applied tractions.
The first term on the right-hand side of Eq. (8.4.46) is a volume integral, while the
second term is a surface integral. Differentiating the body force integral is straightforward.
However, the traction term can be a problem if there are corners on the
boundary or if the loaded boundary is changing. We will assume that there are no
corners or changes in the loaded boundary. Then we can differentiate Eq. (8.4.46) to get
(f • δu)_p = f^b_p • δu + f^b • δu_p + f^b • (V_p δu) + T_p • δu + T • δu_p + T • (S_p δu) .   (8.4.47)
The virtual displacement δu is arbitrary except that it needs to satisfy the kinematic
boundary conditions, which are assumed to be independent of p. We make sure that
δu satisfies these boundary conditions as the shape changes by requiring that
δu_p = 0 .   (8.4.48)
Using Eq. (8.4.48), Eq. (8.4.43) becomes
δe_p = −δē .   (8.4.49)
Finally, using Eqs. (8.4.41), (8.4.42), (8.4.47) and (8.4.49) we get
σ_p • δe = f^b_p • δu + f^b • (V_p δu) + T_p • δu + T • (S_p δu) + σ • δē − σ V_p • δe .   (8.4.50)
The right-hand side of Eq. (8.4.50) represents the body forces that need to be
applied to the structure (along with the initial strain ē) in order for the solution to
be the sensitivity field. The pseudo load f^p that needs to be applied to the original
structure to produce the sensitivity field includes the terms on the right-hand side of
Eq. (8.4.50) as well as a pseudo force due to the initial strain ē
f^p • δu = f^b_p • δu + f^b V_p • δu + T_p • δu + T S_p • δu + σ • δē − σ V_p • δe + D ē • δe .   (8.4.51)
When the curve separating the loaded and unloaded boundaries is changing, and
when the boundary has corners, there are additional terms (see [6]). By using Eq.
(8.4.51), we may write Eq. (8.4.50) as
σ_p • δe = f^p • δu − D ē • δe .   (8.4.52)
Example 8.4.1
[Figure 8.4.3: (a) reference and (b) material configurations of the bar]
The bar shown in Figure (8.4.3) is loaded under its own weight. Calculate the
sensitivity of the solution to changes in the length of the bar (approximated by a
single finite element) using the direct method.
The loading in this case is a body force of constant magnitude f = ρAg. The
exact solution for the displacement u and the member force N is given in terms of
the density ρ, the area A and the acceleration due to gravity g as
N = ρAg(L − x) ,    u = (ρg/E)(Lx − x²/2) .   (a)
Using a single linear finite element we concentrate half of the body force at each node,
so that each node is loaded by ρAgL/2. The finite-element solution is
which also agrees with the material derivative result. The first term in the pseudo
load expression, Eq. (8.4.51), is zero, because the body load is constant. The second
term introduces a body force of f/L = ρAg/L, which accounts for the effect of the change
in the volume element on the resultant of the original body force. This is equivalent to
an end load of ρAg/2. The two terms associated with the tractions vanish because
we have no applied surface tractions. The next two terms are evaluated using
the fact that for the finite-element model the member force N and the strain ε are
constant in the element
Altogether
f^p • δu = ρAg δu₂ ,
which indicates that f^p is equal to a force of ρAg. Under this force we get
u₂,p = ρgL/E ,
which agrees with the results in Eq. (a) above. To calculate the derivative of the
member force, N_p, we first calculate ε_p from Eq. (8.4.35)
ε_p = L₁(u_p) − ē = u_p,x − ε/L = u₂,p/L − ε/L ,
so that
ε_p = ρg/2E ,    N_p = EAε_p = ρAg/2 ,
which agrees with the result in Eq. (a) above. •••
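The agreement can also be confirmed numerically by differentiating the one-element tip displacement u₂ = ρgL²/(2E) with respect to the length directly; the material values below are arbitrary.

```python
rho, g, E, L = 2700.0, 9.81, 70.0e9, 2.0   # arbitrary bar data

def u2(length):
    # one-element tip displacement of the bar hanging under its own weight
    return rho * g * length**2 / (2.0 * E)

h = 1e-6
fd = (u2(L + h) - u2(L - h)) / (2 * h)   # finite-difference shape sensitivity
analytic = rho * g * L / E               # the pseudo-load result u2,p from Eq. (a)
print(abs(fd - analytic) / analytic < 1e-6)  # True
```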
8.4.4 The Adjoint Method
(8.4.53)
From Eq. (8.4.54) we see that we can eliminate the response sensitivity terms by
defining the adjoint as we did in the stiffness variable case, Eqs. (8.1.24)-(8.1.26).
Then we get
H_p = ∫ (h_p + h V_p) dV + f^p • u^a .   (8.4.55)
Equation (8.4.55) requires the evaluation of f^p from the volume integrals in Eq.
(8.4.51). It is possible to transform f^p • u^a to surface integrals (e.g., [6], [7]). However,
there has been unfavorable computational experience with the surface version of the
adjoint method (e.g., [12]). Unfortunately, it is not always possible to tell which
method gives more accurate results, as demonstrated in the following example.
Example 8.4.2
The cantilever beam shown in Figure (8.4.4) is modeled with rectangular plane
stress elements. The beam is composed of two materials with different Young's moduli,
and the position of the interface between the two is the design parameter p. The
sensitivity of the tip displacement with respect to the position of the interface was
calculated using six methods: (i) overall finite differences (OFD); (ii) the semi-analytical
method (SA); (iii) the discrete direct method (DD); (iv) the direct variational method
(DV); (v) the adjoint variational domain method (AVD); and (vi) the adjoint variational
surface method (AVS). The first three methods are discussed in Chapter 7,
the next two in this chapter, and the last method in [6].
[Figure: P = 1 lb, W = 2 in, E₁ = 10⁴ psi, L = 20 in; nx = number of elements along x]
Figure 8.4.4 Geometry, loading and mesh definition of cantilever beam modeled by
plane stress elements (from [13]).
[Figure: convergence curves; legend: DD = direct discrete, DV = direct variational, AVD = adjoint variational domain, AVS = adjoint variational surface, SA = semi-analytical]
Figure 8.4.5 Convergence of the tip displacement and its derivative for ny = 8.
The convergence of the displacement and its derivative as the number of
elements along the axis of the beam is increased is shown in Figure (8.4.5). It is
clear that the derivatives converge more slowly than the displacement, which is to be
expected. It is also seen that, though several methods, including the direct methods
and the adjoint variational domain method, agree very well with the overall finite
difference method, they are not more accurate than the adjoint variational surface
method. They just converge to the correct value from a different direction. •••
8.5 Exercises
8.6 References
[1] Cohen, G.A., "FASOR-A Program for Stress, Buckling, and Vibration of Shells
of Revolution," Advances in Engineering Software, 3 (4), pp. 155-162, 1981.
[2] Haug, E.J., Choi, K.K., and Komkov, V., Design Sensitivity Analysis of Structural
Systems, Academic Press, 1986.
[3] Mróz, Z., Kamat, M.P., and Plaut, R.H., "Sensitivity Analysis and Optimal Design
of Nonlinear Beams and Plates," J. Struct. Mech., 13 (3/4), pp. 245-266, 1985.
[4] Cohen, G.A., "Effect of Nonlinear Prebuckling State on the Postbuckling Behavior
and Imperfection Sensitivity of Elastic Structures," AIAA Journal, 6 (8), pp.
1616-1619, 1968.
[5] Mróz, Z., "Sensitivity Analysis and Optimal Design with Account for Varying
Shape and Support Conditions," in Computer Aided Optimal Design: Structural
and Mechanical Systems (C.A. Mota Soares, Editor), Springer-Verlag, 1987, pp.
407-438.
[6] Choi, K.K., "Shape Design Sensitivity Analysis and Optimal Design of Structural
Systems," in Computer Aided Optimal Design: Structural and Mechanical
Systems (C.A. Mota Soares, Editor), Springer-Verlag, 1987, pp. 439-492.
[7] Yang, R.-J., "A Three Dimensional Shape Optimization System-SHOP3D,"
Computers and Structures, 31 (8), pp. 881-890, 1989.
[8] Dems, K., and Haftka, R.T., "Two Approaches to Sensitivity Analysis for Shape
Variation of Structures," Mechanics of Structures and Machines, Vol. 16, No. 4,
pp. 501-522, 1988/89.
[9] Dems, K., and Mróz, Z., "Variational Approach by Means of Adjoint Systems to
Structural Optimization and Sensitivity Analysis, Part I: Variation of Material
Parameters within Fixed Domain," Int. J. Solids Struct., 19 (8), pp. 677-692,
1983; "Part II: Structure Shape Variation," 20, pp. 527-552, 1984.
[10] Phelan, D.G., and Haber, R.B., "Sensitivity Analysis of Linear Elastic Systems
Using Domain Parametrization and a Mixed Mutual Energy Principle," Computer
Methods in Applied Mechanics and Engineering, Vol. 77, pp. 31-59, 1989.
[11] Arora, J.S., and Cardoso, J.B., "A Variational Principle for Shape Design Sensitivity
Analysis," AIAA Paper 91-1213-CP, Proceedings of the AIAA/ASME/ASCE/AHS/ASC
32nd Structures, Structural Dynamics and Materials Conference, Baltimore, MD,
April 8-10, 1991, Part 1, pp. 664-674.
[12] Choi, K.K., and Seong, H.G., "A Domain Method for Shape Design Sensitivity of
Built-Up Structures," Computer Methods in Applied Mechanics and Engineering,
Vol. 57, pp. 1-15, 1986.
[13] Haftka, R.T., and Barthelemy, B., "On the Accuracy of Shape Sensitivity Derivatives,"
in Eschenauer, H.A., and Thierauf, G. (eds), Discretization Methods and
Structural Optimization-Procedures and Applications, pp. 136-144, Springer-Verlag,
Berlin, 1989.
Chapter 9: Dual and Optimality Criteria Methods
In most of the analytically solved examples in Chapter 2, the key to the solution
is the use of an algebraic or a differential equation which forms the optimality
condition. For an unconstrained algebraic problem the simple optimality condition is
the requirement that the first derivatives of the objective function vanish. When the
objective function is a functional the optimality conditions are the Euler-Lagrange
equations (e.g., Eq. (2.2.13)). On the other hand, the numerical solution methods
discussed in Chapters 4 and 5 (known as direct search methods) do not use the
optimality conditions to arrive at the optimum design. The reader may have wondered
why we do not have numerical methods that mimic the solution process for the
problems described in Chapter 2. In fact, such numerical methods do exist, and they
are known as optimality criteria methods. One reason that the treatment of these
methods is delayed until this chapter is their limited acceptance in the optimization
community. While the direct search methods discussed in Chapters 4 and 5 are widely
used in many fields of engineering, science and management science, optimality criteria
methods have been used mostly for structural optimization, and even in this field
there are many practitioners who dispute their usefulness.
9.1 Intuitive Optimality Criteria Methods
The Fully Stressed Design (FSD) technique is probably the most successful optimality
criteria method, and has motivated much of the initial interest in these methods. The
FSD technique is applicable to structures that are subject only to stress and minimum
gage constraints. The FSD optimality criterion can be stated as follows:
For the optimum design each member of the structure that is not at its minimum
gage is fully stressed under at least one of the design load conditions.
This optimality criterion implies that we should remove material from members
that are not fully stressed unless prevented by minimum gage constraints. This
appears reasonable, but it is based on an implicit assumption that the primary effect
of adding or removing material from a structural member is to change the stresses in
that member. If this assumption is violated, that is if adding material to one part of
the structure can have large effects on the stresses in other parts of the structure, we
may want to have members that are not fully stressed because they help to relieve
stresses in other members.
Section 9.1: Intuitive Optimality Criteria Methods
Example 9.1.1
[Figure 9.1.1: two members of cross-sectional areas A₁ and A₂ supporting a platform loaded by P]
The stresses in the two members (based on the assumption that the platform remains
horizontal) are easily shown to be
σ₁ = σ₂ = P/(A₁ + A₂) .
Now assume that member one is made of a high-strength low-density alloy such that
σ₀₁ = 2σ₀₂ and ρ₁ = 0.9ρ₂. In this case the critical constraint is
so that A1 + A2 = P/σ02. The minimum mass design obviously will make maximum use of the superior alloy by reducing the area of the second member to its minimum gage value A2 = A0, so that A1 = P/σ02 − A0, provided that P/σ02 is larger than 2A0. This optimum design is not fully stressed as the stress in member 1 is only half of the allowable and member 1 is not at minimum gage. The fully stressed design (obtained by the stress-ratio technique which is described below) is A1 = A0 and A2 = P/σ02 − A1. In this design, member 2 is fully stressed and member 1 is at minimum gage. This is, of course, an absurd design because we make minimal use of the superior alloy and maximum use of the inferior one. For an illustration of the effect on mass assume that

P/σ02 = 20A0.

For this case the optimal design has A1 = 19A0, A2 = A0 and m = 18.1ρ2A0l. The fully stressed design, on the other hand, has A1 = A0, A2 = 19A0 and m = 19.9ρ2A0l.
•••
Beside the use of two materials, another essential feature of Example 9.1.1 is a structure which is highly redundant, so that changing the area of one member has a large effect on the stress in the other member. This example is simple enough so that the optimum and fully stressed designs can be found by inspection.
[Figure 9.1.2: Ten-bar truss; members at minimum size are marked in the optimum and fully stressed designs]
A more complex classical example (developed by Berke and Khot [9]) often used to demonstrate the weakness of the FSD is the ten-bar truss shown in Figure 9.1.2. The truss is made of aluminum (Young's modulus E = 10⁷ psi and density ρ = 0.1 lb/in³) with all members having a minimum gage of 0.1 in². The yield stress is ±25,000 psi for all members except member 9. Berke and Khot have shown that for σ09 ≤ 37,500 psi the optimum and FSD designs are identical, but for σ09 > 37,500 psi the optimum design weighs 1497.6 lb and member 9 is neither fully stressed nor at minimum gage. The FSD design weighs 1725.2 lb, 15% heavier than the optimum, with member 9 at minimum gage. The two designs are shown in Figure 9.1.2.
The FSD technique is usually complemented by a resizing algorithm based on the assumption that the load distribution in the structure is independent of member sizes. That is, the stress in each member is calculated, and then the member is resized to bring the stresses to their allowable values assuming that the loads carried by the members remain constant (this is logical since the FSD criterion is based on a similar assumption). For example, for truss structures, where the design variables are often cross-sectional areas, the force in any member is σA, where σ is the axial stress and A the cross-sectional area. Assuming that σA is constant leads to the stress-ratio resizing technique

Anew = Aold (σ/σ0),    (9.1.1)

which gives the resized area Anew in terms of the current area Aold, the current stress σ, and the allowable stress σ0. For a statically determinate truss, the assumption that member forces are constant is exact, and Eq. (9.1.1) will bring the stress in each member to its allowable value. If the structure is not statically determinate, Eq. (9.1.1) has to be applied repeatedly until convergence to any desired tolerance is achieved. Also, if Anew obtained by Eq. (9.1.1) is smaller than the minimum gage, the minimum gage is selected rather than the value given by Eq. (9.1.1). This so-called stress-ratio technique is illustrated by the following example.
Example 9.1.2
For the structure of Example 9.1.1 we use the stress-ratio formula and follow the iteration history. We assume that the initial design has A1 = A2 = A0, and that the applied load is P = 20A0σ02. The iteration history is given in Table 9.1.1.

Table 9.1.1

Iteration   A1/A0   A2/A0   σ1/σ01   σ2/σ02
1           1.00    1.00    5.00     10.00
2           5.00    10.00   0.67     1.33
3           3.33    13.33   0.60     1.20
4           2.00    16.00   0.56     1.11
5           1.11    17.78   0.53     1.059
6           1.00    18.82   0.504    1.009
7           1.00    18.99   0.500    1.0005

Convergence is fast, and if material 2 were lighter this would be the optimum design.•••
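The iteration of Table 9.1.1 can be reproduced in a few lines of code. The sketch below (Python; the normalization and names are ours) works in units of the minimum gage A0 and of σ02, so that P = 20, σ01 = 2 and σ02 = 1.

```python
# Stress-ratio resizing, Eq. (9.1.1), for the two-member structure of
# Examples 9.1.1 and 9.1.2.  Areas are in multiples of the minimum gage A0,
# stresses in multiples of sigma02, and the load is P = 20*A0*sigma02.

def stress_ratio_step(a1, a2, p=20.0, s01=2.0, s02=1.0, gage=1.0):
    """Analyze, then resize each area by sigma/sigma0, clamped at the gage."""
    s = p / (a1 + a2)                  # sigma1 = sigma2 = P/(A1 + A2)
    return max(a1 * s / s01, gage), max(a2 * s / s02, gage)

a1 = a2 = 1.0                          # initial design A1 = A2 = A0
for it in range(1, 8):
    s = 20.0 / (a1 + a2)
    print(f"{it}  A1/A0={a1:5.2f}  A2/A0={a2:6.2f}  "
          f"s1/s01={s/2.0:5.3f}  s2/s02={s:6.4f}")
    a1, a2 = stress_ratio_step(a1, a2)
```

The loop converges to the fully stressed design A1 = A0, A2 ≈ 19A0, reproducing the table row by row.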
As can be seen from Example 9.1.2, the convergence of the stress-ratio technique can be quite rapid, and this is a major attraction of the method. A more attractive feature is that it does not require derivatives of stresses with respect to design variables. When we have a structure with hundreds or thousands of members which need to be individually sized, the cost of obtaining derivatives of all critical stresses with respect to all design variables could be prohibitive. Practically all mathematical programming algorithms require such derivatives, while the stress-ratio technique
does not. The FSD method is, therefore, very efficient for designing truss structures
subject only to stress constraints.
For other types of structures the stress-ratio technique can be generalized by pursuing the assumption that member forces are independent of member sizes. For example, in thin-wall construction, where only membrane stresses are important, we would assume that the σij t are constant, where t is the local thickness and σij are the membrane stress components. In such situations the stress constraint is often expressed in terms of an equivalent stress σe as

σe ≤ σ0.    (9.1.2)

For example, in a plane-stress problem the von Mises stress constraint for an isotropic material is

σe² = σx² + σy² − σxσy + 3τxy² ≤ σ0².    (9.1.3)

The assumption that σij t is constant then leads to the resizing rule

tnew = told (σe/σ0).    (9.1.4)
In the presence of bending stresses the resizing equation is more complicated. This
is the subject of Exercise 3.
When the assumption that member forces remain constant is unwarranted the
stress ratio technique may converge slowly, and the FSD design may not be optimal.
This may happen when the structure is highly redundant (see Adelman et al. [10],
for example), or when loads depend on sizes (e.g., thermal loads or inertia loads).
The method can be generalized to deal with size-dependent loads (see, for example,
Adelman and Narayanaswami [11] for treatment of thermal loads), but not much can
be done to resolve the problems associated with redundancy. The combination of FSD
with the stress ratio technique is particularly inappropriate for designing structures
made of composite materials. Because composite materials are not isotropic the FSD
design may be far from optimum, and because of the redundancy inherent in the use
of composite materials, convergence can be very slow.
The success of FSD prompted extensions to optimization under displacement
constraints which became the basis of modern optimality criteria methods. Venkayya
[12] proposed a rigorous optimality criterion based on the strain energy density in the
structure. The criterion states that at the optimum design the strain energy of each
element bears a constant ratio to the strain energy capacity of the element. This was
the beginning of the more general optimality criteria methods discussed later.
The strain energy density criterion is rigorous under some conditions, but it has
also been applied to problems where it is not the exact optimality criterion. For
example, Siegel [13] used it for design subject to flutter constraints. Siegel proposed
that the strain energy density associated with the flutter mode should be constant
over the structure. In both [12] and [13] the optimality criterion was accompanied
by a simple resizing rule similar to the stress ratio technique.
Section 9.2: Dual Methods
The simultaneous failure mode approach was an early design technique similar to
FSD in that it assumed that the lightest design is obtained when two or more modes
of failure occur simultaneously. It is also assumed that the failure modes that are
active at the optimum (lightest) design are known in advance.
[Figure 9.1.3: Cross section of a blade-stiffened panel with dimensions b1, b2 and thicknesses t1, t2]
Consider, for example (from Stroud [14]) how this procedure is used to design
a metal blade-stiffened panel having the cross section shown in Figure 9.1.3. There
are four design variables b1, b2, t1, t2. Rules of thumb based on considerable experience are first used to establish proportions, such as plate width-to-thickness ratios.
The establishment of these proportions eliminates two of the design variables. The
remaining two variables are then calculated by setting the overall buckling load and
local buckling load equal to the applied load. This approach results in two equa-
tions for the two unknown design variables. The success of the method hinges on
the experience and insight of the engineer who sets the proportions and identifies
the resulting failure modes. For metal structures having conventional configurations,
insight has been gained through many tests. Limiting the proportions accomplishes
two goals: it reduces the number of design variables, and it prevents failure modes
that are difficult to analyze. This simplified design approach is, therefore, compatible
with simplified analysis capability.
As noted in the introduction to this chapter, dual methods have been used to
examine the theoretical basis of some of the popular optimality criteria methods.
Historically optimality criteria methods preceded dual methods in their application
to optimum structural design. However, because of their theoretical significance we
will reverse the historical order and discuss dual methods first.
The Lagrange multipliers are often called the dual variables of the constrained opti-
mization problem. For linear problems the primal and dual formulations have been
presented in Chapter 3, and the role of dual variables as Lagrange multipliers is not
difficult to establish (See Exercise 1). If the primal problem is written as
minimize cT x
subject to Ax - b ~ 0,
x ~ o.
Then the dual formulation in terms of the Lagrange multipliers is
maximize ATb
subject to AT A - c ~ 0,
A ~ o.
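Weak duality, the fact that every dual-feasible λ gives λᵀb ≤ cᵀx for every primal-feasible x, can be checked numerically. The sketch below (Python) uses arbitrary illustrative data for A, b and c, not values from the text.

```python
# Sample random points, keep the feasible ones, and verify that every
# sampled dual objective value lies below every sampled primal value.
import random

A = [[2.0, 1.0], [1.0, 3.0]]
b = [4.0, 6.0]
c = [3.0, 5.0]

def primal_feasible(x):
    return all(xi >= 0.0 for xi in x) and all(
        sum(A[i][j]*x[j] for j in range(2)) >= b[i] for i in range(2))

def dual_feasible(lam):
    return all(li >= 0.0 for li in lam) and all(
        sum(A[i][j]*lam[i] for i in range(2)) <= c[j] for j in range(2))

random.seed(1)
primal_vals = [c[0]*x[0] + c[1]*x[1]
               for x in ([random.uniform(0, 10), random.uniform(0, 10)]
                         for _ in range(2000)) if primal_feasible(x)]
dual_vals = [b[0]*l[0] + b[1]*l[1]
             for l in ([random.uniform(0, 5), random.uniform(0, 5)]
                       for _ in range(2000)) if dual_feasible(l)]
gap_ok = min(primal_vals) >= max(dual_vals)
print(gap_ok)
```

For this data the common optimum of the two problems is cᵀx* = bᵀλ* = 11.6, so all sampled primal values lie above it and all sampled dual values lie below it.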
There are several ways of generalizing the linear dual formulation to nonlinear problems. In applications to structural optimization, the most successful has been one due to Falk [15] as specialized to separable problems by Fleury [16].
The original optimization problem is called the primal problem and is of the form

minimize f(x)
subject to gj(x) ≥ 0,  j = 1, …, ng.    (9.2.1)

The necessary conditions for a local minimum of problem (9.2.1) at a point x* are that there exist a vector λ* (with components λ1*, …, λng*) such that
gj(x*) ≥ 0,  j = 1, …, ng,    (9.2.2)

λj* ≥ 0,  j = 1, …, ng,    (9.2.3)

λj* gj(x*) = 0,  j = 1, …, ng,    (9.2.4)

∂f/∂xi(x*) − Σj λj* ∂gj/∂xi(x*) = 0,  i = 1, …, n.    (9.2.5)
Equations (9.2.3)-(9.2.5) are the Kuhn-Tucker conditions (see Chapter 5). They naturally motivate the definition of a function L called the Lagrangian function

L(x, λ) = f(x) − Σj λj gj(x).    (9.2.6)
Equations (9.2.5) can then be viewed as stationarity conditions with respect to x for
the Lagrangian function. Falk's dual formulation is
maximize Lm(λ)
such that λj ≥ 0,  j = 1, …, ng,    (9.2.7)
where

Lm(λ) = min over x ∈ C of L(x, λ),    (9.2.8)

and where C is some closed convex set introduced to insure the well conditioning of the problem. For example, if we know that the solution is bounded, we may select C to be

C = {x : −r ≤ xi ≤ r,  i = 1, …, n},    (9.2.9)

where r is a suitably large number. Under some restrictive conditions the solution of (9.2.7) is identical to the solution of the original problem (9.2.1), and the optimum value of Lm is identical to the optimum value of f. One set of conditions is for the optimization problem to be convex (that is, f(x) bounded and convex, and gj(x) concave), f and gj to be twice continuously differentiable, and the matrix of second derivatives of L(x, λ) with respect to x to be nonsingular at x*.
Under these conditions the convexity requirement also guarantees that we have
only one minimum. For the linear case the Falk dual leads to the dual formulation
discussed in Section 3.7 (Exercise 1).
In general, it does not make sense to solve (9.2.7), which is a nested optimization problem, instead of (9.2.1), which is a single optimization problem. However, both the maximization of (9.2.7) and the minimization of (9.2.8) are virtually unconstrained. Under some circumstances these optimizations become very simple to execute. This is the case when the objective function and the constraints are separable functions. The optimization problem is called separable when both the objective function and constraints are separable, that is

f(x) = Σi fi(xi),   gj(x) = Σi gji(xi).    (9.2.10)

The primal formulation does not benefit much from the separability. However, the dual formulation does, because L(x, λ) is also a separable function and can, therefore, be minimized by a series of one-dimensional minimizations, and Lm(λ) is therefore easy to calculate.
Example 9.2.1

Consider the separable problem

minimize f(x) = x1² + x2² + x3²
such that g1(x) = x1 + x2 − 10 ≥ 0,
g2(x) = x2 + 2x3 − 8 ≥ 0.

The Lagrangian function is

L(x, λ) = L1(x1, λ) + L2(x2, λ) + L3(x3, λ) + 10λ1 + 8λ2,

where

L1(x1, λ) = x1² − λ1x1,   L2(x2, λ) = x2² − (λ1 + λ2)x2,   L3(x3, λ) = x3² − 2λ2x3.

Minimizing each term separately yields

x1 = λ1/2,   x2 = (λ1 + λ2)/2,   x3 = λ2,

so that

Lm(λ) = 10λ1 + 8λ2 − λ1²/4 − (λ1 + λ2)²/4 − λ2².

We now need to find the maximum of Lm(λ) subject to the constraints λ1 ≥ 0, λ2 ≥ 0. Differentiating Lm(λ) we obtain

λ1/2 + (λ1 + λ2)/2 = 10,   (λ1 + λ2)/2 + 2λ2 = 8,

or

λ1 = 9⅓,   λ2 = 1⅓,   Lm(λ) = 52.

We also have to check for a maximum along the boundary λ1 = 0 or λ2 = 0. If λ1 = 0,

Lm(0, λ2) = 8λ2 − (5/4)λ2²,

and this function attains its maximum for λ2 = 3.2, Lm(λ) = 12.8. For λ2 = 0 we get

Lm(λ1, 0) = 10λ1 − λ1²/2,

with the maximum attained at λ1 = 10, Lm = 50. We conclude that the maximum is inside the domain. From the expressions for x1, x2, x3 above we obtain

x1 = 4⅔,   x2 = 5⅓,   x3 = 1⅓,   f(x) = 52.

The equality of the maximum of Lm(λ) and the minimum of f(x) is a useful check that we obtained the correct solution.•••
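The dual maximum can also be confirmed by brute force. The sketch below (Python) uses our reading of the example's data, f = x1² + x2² + x3² with g1 = x1 + x2 − 10 ≥ 0 and g2 = x2 + 2x3 − 8 ≥ 0, and scans Lm(λ) on a grid over the positive quadrant.

```python
# Evaluate L_m(lam) using the closed-form coordinate-wise minimizers of the
# separable Lagrangian, then locate its maximum on a grid over lam >= 0.

def L_m(l1, l2):
    x1, x2, x3 = l1/2.0, (l1 + l2)/2.0, l2     # minimizers of L1, L2, L3
    f = x1*x1 + x2*x2 + x3*x3
    g1 = x1 + x2 - 10.0
    g2 = x2 + 2.0*x3 - 8.0
    return f - l1*g1 - l2*g2

best = max((L_m(0.05*i, 0.05*j), 0.05*i, 0.05*j)
           for i in range(400) for j in range(400))
lm_star, l1_star, l2_star = best
print(round(lm_star, 3), round(l1_star, 2), round(l2_star, 2))
```

The grid maximum lands next to λ = (9⅓, 1⅓) with Lm ≈ 52, matching the primal minimum.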
Because the dual method requires only one-dimensional minimizations in the design variable space, it has been used for cases where some of the design variables are discrete (see Schmit and Fleury [17]). To demonstrate this approach we suppose all the design variables are discrete. That is, the optimization problem may be written as

minimize f(x) = Σi fi(xi)
such that gj(x) = Σi gji(xi) ≥ 0,  j = 1, …, ng,    (9.2.12)
and xi ∈ Xi,  i = 1, …, n.

The set Xi = {di1, di2, …} is a set of discrete values that the ith design variable can take. The Lagrangian function is

L(x, λ) = Σi Li(xi, λ),    (9.2.13)

where

Li(xi, λ) = fi(xi) − Σj λj gji(xi),  i = 1, …, n,    (9.2.14)

so that the dual function is obtained from n one-dimensional discrete minimizations

Lm(λ) = Σi min over xi ∈ Xi of Li(xi, λ).    (9.2.15)

Note that for a given xi, Li is a linear function of λ. The minimum over Xi of Li is a piecewise linear function, with the pieces joined along lines where Li has the same value for two different values of xi. If the set Xi is ordered monotonically, and the discrete values of xi are close, we can expect that these lines will be at intersections where

Li(dik, λ) = Li(di,k+1, λ).    (9.2.16)

Equation (9.2.16) defines boundaries in λ-space, which divide this space into regions where x is fixed to a particular choice of the discrete values. The use of these boundaries in the solution of the dual problem is demonstrated in the following example from Ref. [18].
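The switching structure behind Eq. (9.2.16) is easy to see for a single variable. The sketch below (Python) uses the illustrative form Li(xi, λ) = xi² − λxi over a small discrete set; both are our choices, although the same form appears again in Example 9.2.3.

```python
# For L_i(x, lam) = x**2 - lam*x over an ordered discrete set, the
# minimizer changes where two consecutive values give equal L_i
# (Eq. 9.2.16), i.e. at lam = d_k + d_{k+1}.

values = [3, 4, 5, 6]

def minimizer(lam):
    return min(values, key=lambda x: x*x - lam*x)

boundaries = [a + b for a, b in zip(values, values[1:])]
print(boundaries)                      # [7, 9, 11]
eps = 1e-6
switches = [(minimizer(l - eps), minimizer(l + eps)) for l in boundaries]
print(switches)                        # [(3, 4), (4, 5), (5, 6)]
```

Between consecutive boundaries the minimizer is frozen at one discrete value, which is what produces the regions in λ-space used below.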
Example 9.2.2
For the two-bar truss shown in Figure 9.2.1, it is required to find the minimum weight structure by selecting each of the cross-sectional areas Ai, i = 1, 2, from the discrete set of areas

A = {1, 1.5, 2},

while at the same time satisfying constraints on the horizontal and vertical displacements u and v,

u = (FL/2E)(1/A1 + 1/A2) ≤ 1.5(FL/2E),
v = (FL/2E)(1/A1 − 1/A2) ≤ 0.5(FL/2E).

It is convenient to use yi = 1/Ai as design variables. Denoting the weight by W, and the weight density by ρ, we can formulate the optimization problem as

minimize W/(ρL) = 1/y1 + 1/y2
such that 1.5 − y1 − y2 ≥ 0,
0.5 − y1 + y2 ≥ 0,

and the terms of the separable Lagrangian function are

L1(y1, λ) = −1.5λ1 − 0.5λ2 + 1/y1 + (λ1 + λ2)y1,
L2(y2, λ) = 1/y2 + (λ1 − λ2)y2.
The boundaries for changes in the value of y1, from Eq. (9.2.16), are

2 + (1/2)(λ1 + λ2) = 3/2 + (2/3)(λ1 + λ2),
3/2 + (2/3)(λ1 + λ2) = 1 + (λ1 + λ2).

This yields λ1 + λ2 = 3 and λ1 + λ2 = 1.5. Similarly, the boundaries for changes in the value of y2 are λ1 − λ2 = 3 and λ1 − λ2 = 1.5.
Figure 9.2.2 Regions in (λ1, λ2) space for the two-bar truss problem
These lines divide the positive quadrant of the (λ1, λ2) plane into 6 regions, with the values of y1 and y2 in each region shown in Figure 9.2.2. We start the search for the maximum of Lm at the origin, λ = (0, 0). At this point L(x, λ) = 1/y1 + 1/y2, so that L(x, λ) is minimized by selecting the discrete values y1 = y2 = 1, as also indicated by the figure. For the region where these values are fixed

Lm(λ) = 2 + 0.5λ1 − 0.5λ2.

Obviously, to maximize Lm we increase λ1 (we cannot reduce λ2) until we get to the boundary of the region at (1.5, 0) with Lm = 2.75. We can now move into the region where the values of (y1, y2) are (2/3, 1), or into the region where these values are (2/3, 2/3). In the former region
Example 9.2.3

We now solve the problem of Example 9.2.1 again, with the additional requirement that the design variables take integer values. For x1, Eq. (9.2.16), applied to L1 for transitions between 3 and 4 and between 4 and 5, yields

9 − 3λ1 = 16 − 4λ1,   16 − 4λ1 = 25 − 5λ1,

or

λ1 = 7,   λ1 = 9.

For x2 we consider transitions between 4 and 5 and between 5 and 6. Equation (9.2.16), applied to L2, yields

λ1 + λ2 = 9   and   λ1 + λ2 = 11.

Similarly, for x3, Eq. (9.2.16) applied to L3 for transitions between 0 and 1 and between 1 and 2, gives

λ2 = 0.5   and   λ2 = 1.5.
[Figure 9.2.3: λ-plane for Example 9.2.3, showing the region boundaries λ1 = 7, 9, λ1 + λ2 = 9, 11, and λ2 = 0.5, 1.5, with the values of L(x, λ) in the regions near the continuous optimum, e.g., L = 65 − λ1 − 2λ2 in region (5, 6, 2) and L = 62 − λ1 in region (5, 6, 1)]
These boundaries, and the values of L(x, λ) in some of the regions near the continuous optimum, are shown in Figure 9.2.3. We start the search for the optimum at the continuous optimum values of λ1 = 9⅓, λ2 = 1⅓. For this region the values of the xi's that minimize L are (5, 5, 1), and Lm = 51 + λ2. This indicates that λ2 should be increased. For λ2 = 1.5 we reach the boundary of the region and Lm = 52.5. That value is attained for the entire boundary of the region marked in heavy line in Figure 9.2.3. There are six regions adjacent to that boundary, and using the expressions for Lm given in the figure we can check that Lm = 52.5 is the maximum. We now have six possible choices for the values of the xi's, as indicated by the six regions that touch on the segment where Lm is maximal. The two leftmost regions violate the first constraint, and the three bottom regions violate the second constraint. Of the two regions that correspond to feasible designs, (5, 5, 2) has the lower objective function f = 54. The optimum, however, is at (4, 6, 1) with f = 53.•••
While this example demonstrates that the method is not guaranteed to converge to the optimum, it has been found useful in many applications. In particular, the method has been applied extensively by Grierson and coworkers to the design of steel frameworks using standard sections [19-21]. The reader is directed to Ref. [18] for additional information on the implementation of automatic searches in λ-space for the maximum, and for the case of mixed discrete and continuous variables.
Many of the first order approximations discussed in Chapter 6 are separable. The linear and the conservative approximations are also concave, and the reciprocal approximation is concave in some cases. Therefore, if the objective function is convex and separable, the dual approach is attractive for the optimization of the approximate problem. Assume, for example, that the reciprocal approximation is employed
for the constraints, and the objective function is approximated linearly. That is, the approximate optimization problem is

minimize f(x) = f0 + Σi fi xi    (9.2.17)
such that gjR(x) = c0j − Σi cij/xi ≥ 0,  j = 1, …, ng,

where the constants in (9.2.17) are calculated from the values of f and gj and their derivatives at a point x0. That is

f0 = f(x0) − Σi x0i (∂f/∂xi)(x0),   fi = (∂f/∂xi)(x0),    (9.2.18)

c0j = gj(x0) + Σi cij/x0i,   cij = x0i² (∂gj/∂xi)(x0).    (9.2.19)
This approximate problem is convex if all the cij's are positive. Alternatively, the problem is convex in terms of the reciprocals of the design variables if all the fi's are positive. In either case we have a unique optimum. The Lagrangian function is now

L(x, λ) = f0 + Σi fi xi − Σj λj (c0j − Σi cij/xi).    (9.2.20)

The first step in the dual method is to find Lm(λ) by minimizing L(x, λ) over x. Differentiating L(x, λ) with respect to xi, we obtain

fi − Σj λj cij/xi² = 0,    (9.2.21)

so that

xi*(λ) = [(Σj λj cij)/fi]^(1/2),    (9.2.22)

and

Lm(λ) = L(x*(λ), λ),    (9.2.23)
where xi*(λ) is given by Eq. (9.2.22).

The maximization of Lm(λ) may be performed numerically, and then we need the derivatives of Lm(λ). Using Eq. (9.2.21) we find

∂Lm/∂λj = −c0j + Σi cij/xi*(λ),    (9.2.24)

and

∂²Lm/∂λj∂λk = −Σi (cij/xi*²) ∂xi*/∂λk.    (9.2.25)
[Figure: three-bar truss with design variables AA and AB; the load p produces a horizontal displacement u and a vertical displacement v]

Example 9.2.4

For the three-bar truss shown in the figure we take the objective function as

f = AA + 2AB,
which is based on the assumption that the cost of member B is high. The constraints
are
g1 = 1 − u/d ≥ 0,
g2 = 1 − v/d ≥ 0,

where u and v are the horizontal and vertical displacements, respectively, and d is the allowable displacement. Assuming that all three members have the same Young's modulus E, we may check that

u = 4pl/(3EAA),
v = 2pl/[E(AB + 0.25AA)].
We now use the reciprocal approximation for g2(x) about an initial design point x0 = (1, 1)ᵀ, where x1 and x2 are suitably normalized values of AA and AB for which g2(x) = 1 − 1.5/(x2 + 0.25x1). Then

g2(x0) = −0.2,

∂g2/∂x1(x0) = 0.375/(x2 + 0.25x1)²|x0 = 0.24,

∂g2/∂x2(x0) = 1.5/(x2 + 0.25x1)²|x0 = 0.96,

and the approximate problem is

minimize f = x1 + 2x2
subject to g1(x) = 1 − 1/x1 ≥ 0,
g2R(x) = 1 − 0.24/x1 − 0.96/x2 ≥ 0.
The second derivatives needed for Newton's method follow from Eq. (9.2.25); for example, with x1 = 1.113,

∂²Lm/∂λ1² = −(1/2)(1/1.113)³ = −0.3626.

A Newton step gives λ1 = 0.503, λ2 = 1.903, so that

x1 = (0.503 + 0.24 × 1.903)^(1/2) = 0.980,
x2 = (0.48 × 1.903)^(1/2) = 0.956.
One additional iteration of Newton's method yields λ1 = 0.356, λ2 = 2.05, x1 = 1.02, x2 = 1.17. We can check on the convergence by noting that the two Lagrange multipliers are positive, so that we expect both constraints to be critical. Setting g1(x) = 0 and g2R(x) = 0, we obtain x1 = 1, x2 = 1.263, f = 3.526 as the optimum design for the approximate problem. Newton's method appears to converge quite rapidly. The optimum of the original problem can be found by setting g1(x) = 0 and g2(x) = 0 to obtain x1 = 1, x2 = 1.25, f = 3.5.•••
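The approximate problem can also be solved entirely in the dual, as a check on the Newton iterates. The sketch below (Python) builds x*(λ) from Eq. (9.2.22) and the dual gradient from Eq. (9.2.24), and maximizes Lm by projected gradient ascent; the step size, iteration count, and starting point are our choices.

```python
# Dual solution of:  minimize x1 + 2*x2
#   s.t.  g1 = 1 - 1/x1 >= 0,  g2R = 1 - 0.24/x1 - 0.96/x2 >= 0.
# In the notation of Eqs. (9.2.17)-(9.2.24): f = (1, 2), c0j = (1, 1), and
# cij as below (rows: variables i, columns: constraints j).

f = [1.0, 2.0]
c0 = [1.0, 1.0]
c = [[1.0, 0.24],
     [0.0, 0.96]]

def x_of_lam(lam):
    # Eq. (9.2.22): xi* = [(sum_j lam_j cij)/fi]^(1/2)
    return [((c[i][0]*lam[0] + c[i][1]*lam[1]) / f[i])**0.5 for i in range(2)]

lam = [1.0, 1.0]
for _ in range(20000):
    x = x_of_lam(lam)
    grad = [-c0[j] + sum(c[i][j]/x[i] for i in range(2)) for j in range(2)]
    lam = [max(0.0, lam[j] + 0.05*grad[j]) for j in range(2)]  # keep lam >= 0

x = x_of_lam(lam)
fval = x[0] + 2.0*x[1]
print([round(v, 3) for v in x], round(fval, 3))
```

The iteration settles at x ≈ (1, 1.263) with f ≈ 3.526, the optimum of the approximate problem obtained above by setting both constraints to zero.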
Because dual methods operate in the space of Lagrange multipliers they are particularly powerful when the number of constraints is small compared to the number of design variables. The same is true for optimality criteria methods, which are discussed next. These methods are indeed exceptional when we have only a single critical constraint.
Optimality criteria methods originated with the work of Prager and his co-workers (e.g., [23]) for distributed parameter systems, and in the work of Venkayya, Khot, and Reddy (e.g., [24]) for discrete systems. They formulated optimality criteria such as the uniform energy distribution criterion discussed earlier. Later the discrete optimality criteria were generalized by Berke, Venkayya, Khot, and others (e.g., [25]-[27]) to
deal with general displacement constraints. The discussion here is limited to discrete optimality criteria, and it is based to a large extent on Refs. [28] and [29]. The reader interested in distributed optimality criteria is referred to a textbook by Rozvany [30], who has contributed extensively to this field.
Optimality criteria methods are typically based on a rigorous optimality criterion
derived from the Kuhn-Tucker conditions, and a resizing rule which is heuristic. Usu-
ally the resizing rule can be shown to be based on an assumption that the internal
loads in the structure are insensitive to the resizing process. This is the same assump-
tion that underlies the FSD approach and the accompanying stress-ratio resizing rule.
This assumption turns out to be equivalent in many cases to the assumption that the
reciprocal approximation is a good approximation for displacement constraints. This
connection between optimality criteria methods and the reciprocal approximation is
useful for a better understanding of the relationship between optimality criteria meth-
ods and mathematical programming methods, and is discussed in the next section.
We start by showing that for some structural design problems the assumption of
constant internal loads is equivalent to the use of the reciprocal approximation for
the displacements. The equations of equilibrium of the structure are written as
Ku=f, (9.3.1)
where K is the stiffness matrix of the structure, u is the displacement vector, and f
is the load vector.
Because the reciprocal approximation is used extensively in the following, we introduce a vector y of reciprocal variables, yi = 1/xi, i = 1, …, n. The displacement constraint is written in terms of the reciprocal design variables as

g(u, y) = z̄ − zᵀu ≥ 0,    (9.3.2)

where z̄ is a displacement allowable, and zᵀu is a linear combination of the displacement components. The reciprocal approximation is particularly appropriate for a special class of structures defined by a stiffness matrix which is a linear homogeneous function of the design variables xi (e.g., truss structures with cross-sectional areas being design variables)

K = Σi xi Ki = Σi Ki/yi.    (9.3.3)

We also assume that the load is independent of the design variables. Under the above conditions we will show that

g(u) = z̄ + Σi yi ∂g/∂yi.    (9.3.4)
Section 9.3: Optimality Criteria Methods for a Single Constraint
That is, what appears to be a first order approximation of g is actually exact. Equation (9.3.4) does not imply that the constraint is a linear function of the design variables because ∂g/∂yi depends on the design variables. To prove Eq. (9.3.4) we use Eq. (7.2.8) for the derivative of a constraint, replacing xi by yi. As the load vector is independent of the design variables, Eq. (7.2.8) yields

∂g/∂yi = λᵀ (∂K/∂yi) u,    (9.3.5)

where λ is the solution of Kλ = z (Eq. (7.2.7)), and, from Eq. (9.3.3),

∂K/∂yi = −Ki/yi².    (9.3.6)

Using Eqs. (7.2.7), (9.3.4), (9.3.5) and (9.3.6) and the symmetry of K we get

Σi yi ∂g/∂yi = −Σi λᵀ (Ki/yi) u = −λᵀKu = −zᵀu.    (9.3.7)

From Eqs. (9.3.7) and (9.3.2) we can see that Eq. (9.3.4) is indeed correct.
Equation (9.3.4) motivates the use of the reciprocal approximation for displacement constraints. For statically determinate structures, under the assumptions used to prove Eq. (9.3.4), we have an even stronger result: the derivatives ∂g/∂yi are constant, so that the reciprocal approximation is exact. We prove this assertion by showing that if internal loads in the structure are independent of the design variables then the ∂g/∂yi are constant. The internal loads in a statically determinate structure are, of course, independent of design variables which control stiffness but not geometry or loads.
We consider Ki/yi in Eq. (9.3.3) to be the contribution of the part of the structure controlled by the ith design variable to the total stiffness matrix K. The forces acting on that part of the structure are fi,

fi = (Ki/yi) u.    (9.3.8)

If the ith part of the structure is constrained against rigid body motion, the same forces will be obtained from a reduced stiffness matrix Ki′ and a reduced displacement vector ui′,

fi = (Ki′/yi) ui′,    (9.3.9)

where Ki′ is obtained from Ki by enforcing rigid body motion constraints, and ui′ is obtained from u by removing the components of rigid body motion from the part of u which pertains to the ith part of the structure. Under these conditions Ki′ is invertible, so that

ui′ = yi ūi,    (9.3.10)
where

ūi = (Ki′)⁻¹ fi.    (9.3.11)

Using Eqs. (9.3.6), (9.3.8) and (9.3.9), we now write Eq. (9.3.5) as

∂g/∂yi = −(λᵀ Ki/yi) ūi.    (9.3.12)

The vector λᵀKi/yi is the internal force vector due to the (dummy) load z (see Eq. (7.2.7)), and is constant if we assume that internal forces are independent of design variables. Also fi in Eq. (9.3.9) is constant, and so is ūi from Eq. (9.3.11). Therefore, finally, from Eq. (9.3.12) ∂g/∂yi is constant.
We will now consider the use of optimality criteria methods for a single displacement constraint, based on the reciprocal approximation.
For minimum weight design subject to the single constraint, the stationarity condition in terms of the reciprocal variables is

∂f/∂yi − λ ∂g/∂yi = 0,  i = 1, …, n.    (9.3.14)

In many cases the objective function is linear or almost linear in terms of the original design variables xi, and since yi = 1/xi, Eq. (9.3.14) is rewritten as

xi² ∂f/∂xi + λ ∂g/∂yi = 0,    (9.3.15)

so that

xi = [−λ (∂g/∂yi)/(∂f/∂xi)]^(1/2),  i = 1, …, n.    (9.3.16)

The Lagrange multiplier λ is obtained from the requirement that the constraint remains active (with a single inequality displacement constraint we can usually assume that it is active). Setting the reciprocal approximation of the constraint to zero we have

gR = g(y0) + Σi (∂g/∂yi)(yi − y0i) = c0 + Σi (∂g/∂yi)(1/xi) = 0,    (9.3.17)

where

c0 = g(y0) − Σi (∂g/∂yi) y0i.    (9.3.18)
Substituting from Eq. (9.3.16) into Eq. (9.3.17) we obtain

λ = [(1/c0) Σi (−(∂g/∂yi)(∂f/∂xi))^(1/2)]².    (9.3.19)

Equations (9.3.19) and (9.3.16) can now be used as an iterative resizing algorithm.
Example 9.3.1

We repeat Example 9.2.4 with only a single displacement constraint on the vertical displacement. Using the normalized design variables, we pose the mathematical formulation of the problem as

minimize f(x) = x1 + 2x2
subject to g(x) = 1 − 1.5/(x2 + 0.25x1) ≥ 0.
We also add minimum gage requirements that x1 ≥ 0.5 and x2 ≥ 0.5.

The derivatives required for the resizing process are

∂f/∂x1 = 1,   ∂f/∂x2 = 2,

∂g/∂y1 = −x1² ∂g/∂x1 = −0.375x1²/(x2 + 0.25x1)²,

∂g/∂y2 = −x2² ∂g/∂x2 = −1.5x2²/(x2 + 0.25x1)²,

and, from Eq. (9.3.18),

c0 = g(y) − (∂g/∂y1)y1 − (∂g/∂y2)y2
   = 1 − 1.5/(x2 + 0.25x1) + 0.375x1/(x2 + 0.25x1)² + 1.5x2/(x2 + 0.25x1)² = 1.
We start with an initial design x0 = (1, 1)ᵀ, and the iterative process is summarized in Table 9.3.1.
Table 9.3.1
                                                   resized, Eq. (9.3.16)
x1     x2      ∂g/∂y1    ∂g/∂y2    c0*     λ       x1      x2
1.00   1.00    −0.24     −0.96     1.0     3.518   0.92    1.30
0.92   1.30    −0.136    −1.083    1.0     3.387   0.68    1.35
0.68   1.35    −0.0751   −1.183    1.0     3.284   0.496   1.39
0.50   1.39    −0.0408   −1.263    0.918   2.997   0.350   1.376
0.50   1.376   −0.0416   −1.261    0.917   2.999   0.353   1.375
The design converged quickly to x1 = 0.5 (the lower bound) and x2 = 1.375 even though the derivative of the constraint with respect to y1 is far from constant. The large variation in the derivative with respect to y1 is due to the fact that the three-bar truss is highly redundant. This statement that one extra member constitutes high redundancy may seem curious, but what we have here is a structure with 50% more members than needed.•••
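The resizing algorithm of Eqs. (9.3.16) and (9.3.19) for this example can be written in a few lines. The sketch below (Python) handles the minimum gage by fixing a variable that falls below it and folding its contribution into c0, a simple variant of the adjustment reflected in the c0* column of Table 9.3.1.

```python
# Optimality-criteria resizing for:  minimize x1 + 2*x2
# subject to g = 1 - 1.5/(x2 + 0.25*x1) >= 0 and x1, x2 >= 0.5.

GAGE = 0.5
DF = [1.0, 2.0]                                 # df/dxi (linear objective)

def resize(x):
    d2 = (x[1] + 0.25*x[0])**2
    dgy = [-0.375*x[0]**2/d2, -1.5*x[1]**2/d2]  # dg/dyi = -xi^2 dg/dxi
    c0 = (1.0 - 1.5/(x[1] + 0.25*x[0])
          - dgy[0]/x[0] - dgy[1]/x[1])          # Eq. (9.3.18)
    free, new = [0, 1], [GAGE, GAGE]
    for _ in range(2):                          # retry once if a gage bound is hit
        lam = (sum((-dgy[i]*DF[i])**0.5 for i in free)/c0)**2   # Eq. (9.3.19)
        hit = False
        for i in list(free):
            new[i] = (-lam*dgy[i]/DF[i])**0.5                   # Eq. (9.3.16)
            if new[i] < GAGE:
                new[i] = GAGE
                free.remove(i)
                c0 += dgy[i]/GAGE               # fold the fixed variable into c0
                hit = True
        if not hit:
            break
    return new

x = [1.0, 1.0]
for _ in range(20):
    x = resize(x)
print([round(v, 4) for v in x])
```

Starting from x = (1, 1) the loop follows the behavior of Table 9.3.1 and settles at x = (0.5, 1.375).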
As can be seen from the example, the optimality criteria approach to the single constraint problem works beautifully. Indeed, it is difficult to find a more suitable method for dealing with this class of problems.
The optimality criteria approach discussed in the previous section is very similar to
the dual method. In particular, Eq. (9.2.22) is a special case of Eq. (9.3.16). While
the derivations in the previous section were motivated by the match between displace-
ment constraints and the reciprocal approximation, they are clearly suitable for any
constraint that is reasonably approximated by the reciprocal approximation. In the
present section we generalize the approach of the previous section, and demonstrate
its application to more general constraints.
In terms of the original design variables, the optimality criterion may be written as

λ = (∂f/∂xi) / (∂g/∂xi),  i = 1, …, n.    (9.3.24)
The right-hand side of Eq. (9.3.24) is a measure of the cost effectiveness of the ith design variable in affecting the constraint. The denominator measures the effect of xi on the constraint, and the numerator measures the cost associated with it. Equation (9.3.24) tells us that at the optimum all design variables are equally cost effective in changing the constraint. Away from the optimum some design variables may be more effective than others. A reasonable resizing technique is to increase the utilization of the more effective variables and decrease that of the less effective ones. For example, in the simple case where xi, ∂f/∂xi and ∂g/∂xi are all positive, a possible resizing rule is

xi_new = xi_old (λei)^(1/η),    (9.3.25)
where

ei = (∂g/∂xi)/(∂f/∂xi)    (9.3.26)

is the effectiveness of the ith variable and η is a step size parameter. A large value of η results in small changes to the design variables, which is appropriate for problems where derivatives change fast. A small value of η can accelerate convergence when derivatives are almost constant, but can cause divergence otherwise. To estimate the Lagrange multiplier we can require the constraint to be critical at the resized design. Using the reciprocal approximation, Eq. (9.3.17), and substituting into it xi from Eq. (9.3.25), we get

λ = [(1/c0) Σi xi (∂g/∂xi) ei^(−1/η)]^η,    (9.3.27)

with c0 obtained from Eq. (9.3.18). A resizing rule of this type is used in the FASTOP program [31] for the design of wing structures subject to a flutter constraint.
Example 9.3.2
A container with an open top needs to have a minimum volume of 125 m³. The cost of the sides of the container is $10/m², while the ends and the bottom cost $15/m². Find the optimum dimensions of the container.

We denote the width, length and height of the container as x1, x2, and x3, respectively. The design problem can then be formulated as

minimize f(x) = 15x1x2 + 30x1x3 + 20x2x3
such that g(x) = x1x2x3 − 125 ≥ 0.
The effectivenesses of the design variables, Eq. (9.3.26), are

e1 = x2x3/(30x3 + 15x2),
e2 = x1x3/(20x3 + 15x1),
e3 = x1x2/(20x2 + 30x1).

Starting from the initial design x0 = (5, 5, 5)ᵀ, for which e1 = 1/9, e2 = 1/7 and e3 = 1/10, and using η = 2, Eq. (9.3.27) gives λ = 8.62, so that Eq. (9.3.25) yields

x1 = 5(8.62/9)^(1/2) = 4.893,
x2 = 5(8.62/7)^(1/2) = 5.549,
x3 = 5(8.62/10)^(1/2) = 4.642.

For the new values of the design variables we obtain f = 1604, g = 1.04, e1 = 0.1158, e2 = 0.1366, e3 = 0.1053, and λ = 8.413. The next iteration is performed in the same way.
Finally, for these values of the design variables the effectivenesses are e1 = 0.1180,
e2 = 0.1320, and e3 = 0.1089, with g = 0.54 and f = 1584. We see that the maximum difference between the e_i's, which started at 43 percent, is now 21 percent. By continuing
the iterative process we find that the optimum design is x1 = 4.8075, x2 = 7.2112,
x3 = 3.6056, and f = 1560. At the optimum e1 = e2 = e3 = 0.120. Even though the
design variables changed considerably from the values we obtained after two iterations, the
objective function changed by less than two percent, which was expected in view of
the close initial values of the e_i's. •••
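The iteration of this example can be sketched in a few lines. The following script is a minimal sketch (function names are ours; the gradients follow from the cost and volume expressions of this example) that applies Eqs. (9.3.25)-(9.3.27) repeatedly with η = 2:

```python
import numpy as np

# Container of Example 9.3.2: cost f = 30*x1*x3 + 20*x2*x3 + 15*x1*x2,
# volume constraint g = x1*x2*x3 - 125 >= 0 (function names are ours).
def grad_f(x):
    return np.array([30*x[2] + 15*x[1], 20*x[2] + 15*x[0], 30*x[0] + 20*x[1]])

def grad_g(x):
    return np.array([x[1]*x[2], x[0]*x[2], x[0]*x[1]])

eta = 2.0
x = np.array([5.0, 5.0, 5.0])
for _ in range(500):
    g = x.prod() - 125.0
    dg = grad_g(x)
    e = dg / grad_f(x)                    # effectivenesses, Eq. (9.3.26)
    c0 = g + x @ dg                       # constant term of the reciprocal approximation
    lam = ((x * dg * e**(-1/eta)).sum() / c0) ** eta   # Eq. (9.3.27)
    x = x * (lam * e) ** (1/eta)          # resizing rule, Eq. (9.3.25)

f = 30*x[0]*x[2] + 20*x[1]*x[2] + 15*x[0]*x[1]
print(np.round(x, 3), round(f, 1))
```

The first pass reproduces λ = 8.62 and the design (4.893, 5.549, 4.642), and continued iteration approaches the optimum design quoted above.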
As noted in the previous section, Eq. (9.3.24) indicates that at the optimum all design
variables (which are not at their lower or upper bounds) are equally cost effective,
and that their cost effectiveness is equal to 1/λ. It is possible, therefore, to estimate
λ as an average of the reciprocals of the cost effectivenesses. Venkayya [29] proposed
to estimate λ as

λ = (Σ_{i=1}^{n} a_i)/(Σ_{i=1}^{n} a_i e_i),  (9.3.28)

where the a_i represent some suitable weights (such as ∂f/∂x_i). Equation (9.3.28) can
then be used in conjunction with a resizing rule, such as Eq. (9.3.25).
Unfortunately, the combination of Eq. (9.3.28) with a resizing rule does not
contain any mechanism for keeping the constraint active, and so the iterative process
will tend to drift either into the feasible or infeasible domains. Therefore, an estimate
Section 9.3: Optimality Criteria Methods for a Single Constraint
of λ from Eq. (9.3.28) must be accompanied by an additional step to ensure that the
design remains at the constraint boundary. One simple mechanism, used extensively
with optimality criteria formulations is that of design variable scaling. One reason
for the popularity of scaling is that for the simple case represented by Eq. (9.3.3)
it is very easy to accomplish. It is easy to check from Eqs. (9.3.1) and (9.3.3) that
scaling the design variable vector by a scalar α to αx scales the displacement vector
to (1/α)u. Venkayya [29] proposed the following procedure for the more general case.
Consider a constraint g of the form
g(x) = z̄ − z(x) ≥ 0,  (9.3.29)

where z(x) represents some response quantity, such as a displacement component,
and z̄ is an upper bound for z. If at the current design g ≠ 0, we would like to find
α so that

z(αx) = z̄.  (9.3.30)
Approximating z(ax) linearly about x we get
otherwise Eq. (9.3.33) is to be used. It can be readily checked that the scaling
equations for α in terms of g are valid also for lower bound constraints of the form
z(x) − z̲ ≥ 0.
In combining the resizing step, Eq. (9.3.25), with the scaling step we must con-
sider whether we calculate new derivatives for each of these two operations. If we
do, then the number of derivative calculations will increase to two per iteration. In
most cases this is unnecessary. Unless the scaling step results in large changes in the
design variables we can calculate the Lagrange multiplier using derivatives obtained
before scaling.
Example 9.3.3
Consider again the container problem of Example 9.3.2. We will solve it again using
Eq. (9.3.28) for estimating λ, and also employ scaling.

We start with the same initial design as in Example 9.3.2 of x1 = x2 = x3 = 5 m.
For this design g = 0, so that we do not need any scaling. We have e1 = 1/9, e2 = 1/7,
e3 = 1/10, so that Eq. (9.3.28) with all the weights set to one gives us

λ = 3/(1/9 + 1/7 + 1/10) = 8.475.
Then, using Eq. (9.3.25) with η = 2, we have

x1 = 5(8.475/9)^{1/2} = 4.852,
x2 = 5(8.475/7)^{1/2} = 5.502,
x3 = 5(8.475/10)^{1/2} = 4.603.
For the new values of the design variables g = −2.12, f = 1577, ∂g/∂x1 = 25.325,
∂g/∂x2 = 22.334, ∂g/∂x3 = 26.695, e1 = 0.1148, e2 = 0.1355, e3 = 0.1044. In our
case z = x1x2x3, and it is easy to check that Eq. (9.3.35) is satisfied, so that we use
Eq. (9.3.32) for scaling:

α = 1 − (−2.12)/(25.325 × 4.852 + 22.334 × 5.502 + 26.695 × 4.603) = 1.00576.
Scaling the design variables we get x1 = 4.880, x2 = 5.533, and x3 = 4.630. For these
scaled variables g = 0.015, indicating that the scaling worked. For this scaled design
f = 1595, which is a truer measure of improvement than the f = 1577 of the unscaled
design, because the constraint is not violated. We next obtain

λ = 3/(0.1148 + 0.1355 + 0.1044) = 8.457,

and resize to obtain

x1 = 4.880(8.457 × 0.1148)^{1/2} = 4.808,
x2 = 5.533(8.457 × 0.1355)^{1/2} = 5.923,
x3 = 4.630(8.457 × 0.1044)^{1/2} = 4.351.
For this design g = −1.08, f = 1570, ∂g/∂x1 = 25.772, ∂g/∂x2 = 20.921, ∂g/∂x3 =
28.481, e1 = 0.1175, e2 = 0.1315, e3 = 0.1084. The scaling factor α is

α = 1 − (−1.08)/(25.772 × 4.808 + 20.921 × 5.923 + 28.481 × 4.351) = 1.0029.

Scaling the design variables we get x1 = 4.822, x2 = 5.941, and x3 = 4.364. For these
values g = 0.018 and f = 1579. Note that convergence is faster than in Example
9.3.2. •••
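The procedure of this example can be sketched as follows (a minimal script with our own function names; for simplicity the effectivenesses are recomputed after each scaling step rather than reusing the pre-scaling derivatives as suggested above):

```python
import numpy as np

# Example 9.3.3: resize with lambda from Eq. (9.3.28) (unit weights), then
# scale back to the constraint boundary with the linearized factor alpha.
def grad_f(x):
    return np.array([30*x[2] + 15*x[1], 20*x[2] + 15*x[0], 30*x[0] + 20*x[1]])

def grad_g(x):                            # g = x1*x2*x3 - 125 >= 0
    return np.array([x[1]*x[2], x[0]*x[2], x[0]*x[1]])

eta = 2.0
x = np.array([5.0, 5.0, 5.0])
for _ in range(200):
    e = grad_g(x) / grad_f(x)
    lam = len(x) / e.sum()                # Eq. (9.3.28) with all a_i = 1
    x = x * (lam * e) ** (1/eta)          # resize, Eq. (9.3.25)
    g = x.prod() - 125.0
    x = (1.0 - g / (grad_g(x) @ x)) * x   # linearized scaling step

f = 30*x[0]*x[2] + 20*x[1]*x[2] + 15*x[0]*x[1]
print(np.round(x, 3), round(f, 1))
```

The first pass reproduces λ = 8.475 and the scaled design (4.880, 5.533, 4.630) computed above.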
9.4 Several Constraints
We start again by posing the optimization problem in terms of the reciprocal variables
minimize f(y)
subject to g_j(y) ≥ 0,  j = 1, …, n_g,  (9.4.1)
so that the Kuhn-Tucker conditions are
where
k = 1, ... ,n. (9.4.4)
k = 1, . .. ,n. (9.4.5)
However, several other possibilities for using Eq. (9.4.3) have been proposed and
used. One resizing rule, called the exponential rule, is based on rewriting Eq. (9.4.3)
as
x_k^new = x_k [(1/(x_k² f_k)) Σ_{j=1}^{n_g} λ_j c_kj]^{1/η},  k = 1, …, n,  (9.4.6)
where the old value of x_k is used on the right-hand side to produce a new estimate for
x_k. A linearized form of Eq. (9.4.6) can be obtained by using the binomial expansion
as
k = 1, ... ,n, (9.4.7)
where
k = 1, ... ,n. (9.4.8)
It is clear from the form of the last two equations that η is a damping or step-size
parameter. A high value of η reduces the correction to the present design and prevents
oscillations, but can slow down progress towards the final design. A value of η = 2
corresponds to Eq. (9.4.5).
The main difficulty in the case of multiple constraints is the calculation of the
Lagrange multipliers. It is possible to use the dual method and calculate the Lagrange
multipliers using Newton's method. A second approach is to calculate them from the
condition that the critical constraints remain critical, similar to Eq. (9.3.17). Assume,
for example, that the ng constraints are all critical. Then the Lagrange multipliers
are found from the condition that

g_l(x^new) = 0,  l = 1, …, n_g,  (9.4.9)
or
(9.4.10)
Σ_{j=1}^{n_g} Σ_{k=1}^{n} [c_kl c_kj/(x_k³ f_k)] λ_j = Σ_{k=1}^{n} c_kl/x_k − η g_l(x),  l = 1, …, n_g.  (9.4.11)
Equation (9.4.11) is a system of linear equations for the λ_j. Often the solution will yield
negative values for some of the Lagrange multipliers, which may indicate that the
corresponding constraints should not be considered active. Several iterations with
revised sets of active constraints may be needed before a set of positive Lagrange
multipliers is found. Equation (9.4.11) may also be used to find starting values for a
solution with the dual approach.
Stress constraints can be dealt with using the above approach. However, in
many optimality criteria procedures they are handled instead by using the stress
ratio technique. Member sizes obtained by the stress ratio technique are then used
as minimum gages for the next optimality criteria iteration. The two approaches are
compared in the following example.
Example 9.4.1
Find the minimum-mass design of the truss in Figure 9.4.1 subject to a limit
of d = 0.001l on the vertical displacement and a limit of σ0 on the stresses. The
design variables are the cross-sectional areas of the members, A_A, A_B, and A_C, and
because of symmetry it is required that A_A = A_C. All members are made from the
same material having Young's modulus E, density ρ, and yield stress σ0 = 0.002E.
After finding the optimum design we also want to estimate the effect of increasing
the displacement allowable to 1.25d.
Figure 9.4.1 Three-bar truss.
The truss was analyzed in Example 6.1.2, and the vertical displacement and the
stresses in the members were found to be

v = 8pl/[E(A_B + 0.25A_A)],
σ_A = p[√3/(3A_A) + 2/(A_B + 0.25A_A)],
σ_B = 8p/(A_B + 0.25A_A),
σ_C = p[−√3/(3A_A) + 2/(A_B + 0.25A_A)].

The design problem is then

minimize m = ρl(A_B + 4A_A)
subject to g1 = 1 − v/(0.001l) ≥ 0,
g2 = 1 − σ_A/σ0 ≥ 0,  g3 = 1 − σ_B/σ0 ≥ 0,
g4 = 1 − σ_C/σ0 ≥ 0,  g5 = 1 + σ_C/σ0 ≥ 0,

where the second constraint on σ_C is needed because σ_C could be negative. Defining
nondimensional design variables

x1 = A_Aσ0/p and x2 = A_Bσ0/p,

we may rewrite the problem as
minimize f(x) = 4x1 + x2
such that g1(x) = 1 − 16/(x2 + 0.25x1) ≥ 0,
g2(x) = 1 − √3/(3x1) − 2/(x2 + 0.25x1) ≥ 0,
g3(x) = 1 − 8/(x2 + 0.25x1) ≥ 0,
g4(x) = 1 + √3/(3x1) − 2/(x2 + 0.25x1) ≥ 0,
g5(x) = 1 − √3/(3x1) + 2/(x2 + 0.25x1) ≥ 0.
Obviously, g1 is always more critical than g3, and g2 is always more critical than either
g4 or g5, so that we need to consider only g1 and g2. We solve the problem first by using
the stress ratio technique coupled with the optimality criterion for the displacement
constraint.
Using the stress-ratio technique we resize the areas as

(x1)_new = x1σ_A/σ0,  (x2)_new = x2σ_B/σ0.
We start at x⁰ = (1, 10) and obtain

g2 = 0.2275,  g3 = 0.2195,  ∂g1/∂y1 = −0.03807,  ∂g1/∂y2 = −15.23.
Applying the stress ratio technique we get (x1)_new = 0.7725, (x2)_new = 7.805. Because
of the large difference in the derivatives of g1 with respect to y1 and y2 we expect the
optimality criteria approach to try to reduce x1 further, so that the value obtained
from the stress ratio technique will end up as a minimum gage constraint. Therefore,
we consider x1 to be a passive design variable (i.e., x1 ∈ I_p). Then from Eqs. (9.3.20)
and (9.3.21) we have
Finally from Eq. (9.3.16) we obtain x1 = 0.356, x2 = 15.83, confirming the assumption that x1 is controlled by the stress constraints. The iteration is continued in Table
9.4.1.
Table 9.4.1
Iteration  x1      x2     (x1)_new  (x2)_new  c0      λ      x1     x2
1          1.      10.    0.7725    7.805     0.9619  16.46  0.356  15.83
2          0.7725  15.83  0.6738    7.904     0.9880  16.00  0.193  15.81
3          0.6738  15.81  0.6617    7.916     0.9894  16.00  0.169  15.83
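The first row of the table can be checked with a short computation (our own function name; since g2 = 1 − σ_A/σ0 and g3 = 1 − σ_B/σ0, the stress ratios are simply 1 − g2 and 1 − g3):

```python
import numpy as np

# One stress-ratio step for the truss of Example 9.4.1: each area is
# multiplied by the ratio of its stress to the allowable stress.
def stress_ratios(x1, x2):
    denom = x2 + 0.25 * x1
    return np.array([np.sqrt(3)/(3*x1) + 2/denom,   # sigma_A / sigma_0
                     8/denom])                      # sigma_B / sigma_0

x = np.array([1.0, 10.0])
x_new = x * stress_ratios(*x)
print(np.round(x_new, 4))   # approximately (0.7725, 7.805), as in Table 9.4.1
```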
We next solve the same problem using the optimality criteria technique for both
constraints. We use Eq. (9.4.11) for calculating the Lagrange multipliers, and Eq.
(9.4.8) with TJ = 2 for updating the design variables. The iteration history is given in
Table 9.4.2.
Table 9.4.2
Iteration  x1      x2      g1       g2       λ1     λ2     Δx1      Δx2
1          1.      10.     -0.5610  0.2275   11.70  0.     -0.4443  3.906
2          0.5557  13.906  -0.1392  -0.1814  15.00  2.648  0.0897   1.694
3          0.6434  15.600  -0.0152  -0.0243  15.63  2.826  0.0160   0.231
Note that Tables 9.4.1 and 9.4.2 indicate convergence to the same design, with
λ in Table 9.4.1 and λ1 in Table 9.4.2 converging to 16.00. This value is the 'price'
of g1. At the optimum design g1 = 0, or v = d. If we increase the allowable
displacement to 1.25d, then g1 = 0.2, and the expected decrease in the objective
function is approximately 0.2 × 16 = 3.2. •••
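The 'price' estimate can be checked in closed form: at the optimum both g1 and g2 are active, which determines the design for any scaling of the displacement allowable (a sketch with our own function name):

```python
import numpy as np

# Example 9.4.1 optimum: with g1 and g2 active, x2 + 0.25*x1 = 16/c, where c
# scales the displacement allowable, and sqrt(3)/(3*x1) = 1 - 2/(16/c).
def optimum_objective(c=1.0):
    s = 16.0 / c                              # active g1: x2 + 0.25*x1 = 16/c
    x1 = (np.sqrt(3.0)/3.0) / (1.0 - 2.0/s)   # active g2
    x2 = s - 0.25 * x1
    return 4.0*x1 + x2                        # f = 4*x1 + x2

drop = optimum_objective(1.0) - optimum_objective(1.25)
print(round(drop, 3))   # about 3.11, close to the estimate 0.2 x 16 = 3.2
```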
where
e_ij = (∂g_j/∂x_i)/(∂f/∂x_i),  i = 1, …, n,  j = 1, …, n_g,  (9.4.13)
is the effectiveness parameter of the ith design variable with respect to the jth con-
straint. Equation (9.4.12) indicates that at the optimum the effectivenesses of all
design variables, weighted by the Lagrange multipliers, are the same. This form of
weighting makes sense, since the Lagrange multipliers measure the importance of the
constraints in terms of their effect on the optimum value of the objective function.
Venkayya [29] suggests the generalization of Eq. (9.3.25),

x_i^new = x_i (Σ_{j} λ_j e_ij)^{1/η},  (9.4.14)

for resizing the design variables. For the Lagrange multiplier evaluation he proposes
using estimates based on a single constraint, that is, Eq. (9.3.28), which gives

λ_j = (Σ_{i=1}^{n} a_i)/(Σ_{i=1}^{n} a_i e_ij).  (9.4.15)
However, Lagrange multipliers are calculated only for the most critical constraints,
and are set to zero for the other constraints. Finally, scaling is used, based on the most
critical design constraint. This approach is demonstrated by repeating the previous
example.
Example 9.4.2
g2(x) = 1 − √3/(3x1) − 2/(x2 + 0.25x1) ≥ 0.
We solve this problem assuming that a constraint is critical if after scaling its value
is less than 0.15. Starting with x1 = 1, x2 = 10, we get g1 = −0.5610, g2 = 0.2275,
so that we need to scale based on the first constraint. For this constraint we have
∂g1/∂x1 = 4/(x2 + 0.25x1)² = 0.03807,  ∂g1/∂x2 = 16/(x2 + 0.25x1)² = 0.1523.
e11 = (∂g1/∂x1)/(∂f/∂x1) = 0.009518,  e21 = (∂g1/∂x2)/(∂f/∂x2) = 0.1523.
For this case z = 1 − g, so that the scaling test, Eq. (9.3.35), yields

(1/z) Σ_{i=1}^{2} (∂z/∂x_i)x_i = −[1/(1 − g)] Σ_{i=1}^{2} (∂g/∂x_i)x_i = −(1/1.561)(0.03807 × 1 + 0.1523 × 10) ≤ 0.
Table 9.4.3
        Scaled                           Resized
x1      x2     g1      g2      λ1     λ2     x1      x2
1.5610  15.61  0       0.5051  12.36  0      0.9142  18.28
0.7901  15.80  0       0.1443  40.32  11.18  0.9457  18.67
0.8004  15.80  0       0.1537  42.04  0      0.4688  18.51
0.6277  24.78  0.3584  0       0      3.017  0.7448  9.000
1.2974  15.68  0       0.4300  9.927  0      0.7598  18.36
0.6593  15.93  0.006   0       40.49  7.807  0.7910  18.77
0.6672  15.83  0       0.0096  42.34  8.453  0.8003  18.66
0.6789  15.83  0       0.0246  41.85  8.646  0.8143  18.66
There are several other formulations of optimality criteria methods. These are of-
ten tailored to treat specific constraints. An example is the treatment of stability
constraints by Khot in [32]. The stability eigenvalue problem is typically written as

K u_k = μ_k K_G u_k,  (9.4.16)
where K is the stiffness matrix, K_G is the geometric stiffness matrix, μ_k is the buckling
eigenvalue, and u_k is the corresponding eigenvector or buckling mode. We assume
that the modes are normalized so that

u_kᵀ K_G u_k = 1,  (9.4.17)

and then the eigenvalue μ_k is given by

μ_k = u_kᵀ K u_k.  (9.4.18)
The constraints on eigenvalues considered in [32] are of the form
j = 1, ... ,ng. (9.4.19)
The derivative of gj with respect to a design variable Xi is obtained from Eq. (7.3.5)
as
(9.4.20)
The second term on the right-hand side of Eq. (9.4.20) is zero if the prebuckling
internal loads, and therefore K_G, do not depend on the design variables. Even when
the second term is not zero, there are many situations where it can be neglected.
Khot defines

b_ij = x_i² ∂μ_j/∂x_i = x_i² u_jᵀ(∂K/∂x_i)u_j.  (9.4.21)
then from Eqs. (9.4.18) and (9.4.21),

μ_j = Σ_{i=1}^{n} b_ij/x_i,  (9.4.23)
so that
where f_i = ∂f/∂x_i. We can use the more general form corresponding to Eq. (9.4.6)
(9.4.27)
The calculation of the Lagrange multipliers then follows one of the methods suggested
in this section. In [32] the method leading to Eq. (9.4.11) was employed. The method
converged well for the truss examples in [32] even though the coefficients b_ij can be
expected to change substantially with changes in the design.
To conclude this chapter we should note that it emphasized the relationship be-
tween optimality criteria methods, dual methods and approximation concepts. There
are other treatments of optimality criteria both for specific and for general constraints.
The reader is directed to Refs. [33] and [34] for surveys of other works on optimality criteria
methods.
9.5 Exercises
1. Show that for the linear case the Falk dual leads to the dual formulation discussed
in Chapter 3.
2. The truss of Figure 9.2.4 is to be designed subject to stress and Euler buckling
constraints for two load conditions: a horizontal load of magnitude p, and a vertical
load of magnitude 2p. The yield stress is σ0 = αE, where E is Young's modulus and
α a proportionality constant. Assume that the moment of inertia of each member is
I = βA², where β is a constant and A the cross-sectional area. Write a program to
obtain a fully-stressed design of the truss, assuming that member A and member C
are identical, for various α, β, p, E, and l. What is the design for α = 10⁻³, β = 1.0,
and σ0l²/p = 10⁵?
3. Obtain the FSD resizing rule for a panel of thickness t subject to in-plane loads
n_x, n_y, n_xy and bending moments m_x, m_y, m_xy (all per unit length) using the Tresca
(maximum shear stress) yield criterion.
4. Using the dual method find the minimum of f = x1 + x2x3 + x4² subject to the
constraint 10 − 1/x1 − 2x2x3 − 1/x4 ≥ 0 and x_i ≥ 0, i = 1, …, 4.
5. Write a computer program to solve Example 9.2.4. Perform enough iterations to
obtain the optimum design to three significant digits.
6. Repeat Example 9.2.3 when Xl and X2 can take only even integer values, and X3
can vary continuously.
7. Write a program to repeat Example 9.3.1 when the design is not symmetric, so
that we have three design variables. Member C is not subject to minimum gage
constraints, but members A and B are.
8. Find how small we can make η in Example 9.3.2 without causing divergence of
the solution.
9. Solve Example 9.4.1 with the additional constraint that the horizontal displace-
ment does not exceed d = 0.0005l.
10. Complete Tables 9.4.1 and 9.4.2 for Example 9.4.1.
11. Use an optimality criteria method to design the truss of Figure 9.2.4 so that the
fundamental frequency is about 1 Hertz, and the second frequency above 3 Hertz.
Assume that all members have the same material properties.
9.6 References
[5] Razani, R., "Behavior of Fully Stressed Design of Structures and its Relationship
to Minimum Weight Design," AIAA J., 3 (12), pp. 2262-2268, 1965.
[6] Dayaratnam, P. and Patnaik, S., "Feasibility of Full Stress Design," AIAA J., 7
(4), pp. 773-774, 1969.
[7] Lansing, W., Dwyer, W., Emerton, R. and Ranalli, E., "Application of Fully-Stressed Design Procedures to Wing and Empennage Structures," J. Aircraft, 8
(9), pp. 683-688, 1971.
[8] Giles, G.L., Blackburn, C.L. and Dixon, S.C., "Automated Procedures for Sizing
Aerospace Vehicle Structures (SAVES)," AIAA Paper 72-332, presented at the
AIAA/ASME/SAE 13th Structures, Structural Dynamics and Materials Conference, 1972.
[9] Berke, L. and Khot, N.S., "Use of Optimality Criteria for Large Scale Systems,"
AGARD Lecture Series No. 70 on Structural Optimization, AGARD-LS-70,
1974.
[10] Adelman, H.M., Haftka, R.T. and Tsach, U., "Application of Fully Stressed Design Procedures to Redundant and Non-isotropic Structures," NASA TM-81842,
July 1980.
[11] Adelman, H.M. and Narayanaswami, R., "Resizing Procedure for Structures under
Combined Mechanical and Thermal Loading," AIAA J., 14 (10), pp. 1484-1486,
1976.
[12] Venkayya, V.B., "Design of Optimum Structures," Comput. Struct., 1, pp. 265-
309, 1971.
[13] Siegel, S., "A Flutter Optimization Program for Aircraft Structural Design,"
Proc. AIAA 4th Aircraft Design, Flight Test and Operations Meeting, Los An-
geles, California, 1972.
[14] Stroud, W.J., "Optimization of Composite Structures," NASA TM-84544, August
1982.
[15] Falk, J.E., "Lagrange Multipliers and Nonlinear Programming," J. Math. Anal.
Appl., 19, pp. 141-159, 1967.
[16] Fleury, C., "Structural Weight Optimization by Dual Methods of Convex Programming," Int. J. Num. Meth. Engng., 14 (12), pp. 1761-1783, 1979.
[17] Schmit, L.A., and Fleury, C., "Discrete-Continuous Variable Structural Synthesis
using Dual Methods," AIAA J., 18 (12), pp. 1515-1524, 1980.
[18] Schmit, L.A., and Fleury, C., "Discrete-Continuous Variable Structural Synthesis
using Dual Methods," Paper 79-0721, Proceedings of the AIAA/ASME/AHS 20th
Structures, Structural Dynamics and Materials Conference, St. Louis, MO, April
4-6, 1979.
[19] Grierson, D.E., and Lee, W.H., "Optimal Synthesis of Steel Frameworks Using
Standard Sections," J. Struct. Mech., 12(3), pp. 335-370, 1984.
[20] Grierson, D.E., and Lee, W.H., "Optimal Synthesis of Frameworks under Elastic
and Plastic Performance Constraints Using Discrete Sections," J. Struct. Mech.,
14 (4), pp. 401-420, 1986.
[21] Grierson, D.E., and Cameron, G.E., "Microcomputer-Based Optimization of Steel
Structures in Professional Practice," Microcomputers in Civil Engineering, 4 (4),
pp. 289-296, 1989.
[22] Fleury C., and Braibant, V., "Structural Optimization: A New Dual Method
Using Mixed Variables," Int. J. Num. Meth. Eng., 23, pp. 409-428, 1986.
[23] Prager, W., "Optimality Criteria in Structural Design," Proc. Nat. Acad. Sci.
USA, 61 (3), pp. 794-796, 1968.
[24] Venkayya, V.B., Khot, N.S., and Reddy, V.S., "Energy Distribution in an Opti-
mum Structural Design," AFFDL-TR-68-156, 1968.
[25] Berke, L., "An Efficient Approach to the Minimum Weight Design of Deflection
Limited Structures," AFFDL-TM-70-4-FDTR, 1970.
[26] Venkayya, V.B., Khot, N.S., and Berke, L., "Application of Optimality Criteria
Approaches to Automated Design of Large Practical Structures," Second Sympo-
sium on Structural Optimization, AGARD-CP-123, pp. 3-1 to 3-19, 1973.
[27] Gellatly, R.A., and Berke, L., "Optimality Criteria Based Algorithm," Optimum
Structural Design, R.H. Gallagher and O.C. Zienkiewicz, eds., pp. 33-49, John
Wiley, 1972.
[28] Khot, N.S., "Algorithms Based on Optimality Criteria to Design Minimum
Weight Structures," Eng. Optim., 5, pp. 73-90, 1981.
[29] Venkayya, V.B., "Optimality Criteria: A Basis for Multidisciplinary Optimization," Computational Mechanics, Vol. 5, pp. 1-21, 1989.
[30] Rozvany, G.I.N., Structural Design via Optimality Criteria: The Prager Approach
to Structural Optimization, Kluwer Academic Publishers, Dordrecht, Holland,
1989.
[31] Wilkinson, K. et al., "An Automated Procedure for Flutter and Strength Analysis
and Optimization of Aerospace Vehicles," AFFDL-TR-75-137, December 1975.
[32] Khot, N.S., "Optimal Design of a Structure for System Stability for a Specified
Eigenvalue Distribution," in New Directions in Optimum Structural Design (E.
Atrek, R.H. Gallagher, K.M. Ragsdell and O.C. Zienkiewicz, eds.), pp. 75-87,
John Wiley, 1984.
[33] Venkayya, V.B., "Structural Optimization Using Optimality Criteria: A Review
and Some Recommendations," Int. J. Num. Meth. Engng., 13, pp. 203-228, 1978.
[34] Berke, L., and Khot, N.S., "Structural Optimization Using Optimality Criteria,"
Computer Aided Structural Design: Structural and Mechanical Systems (C.A.
Mota Soares, Editor), Springer Verlag, 1987.
Chapter 10. Decomposition and Multilevel Optimization
The resources required for the solution of an optimization problem typically increase
with the dimensionality of the problem at a rate which is more than linear. That is, if
we double the number of design variables in a problem, the cost of solution will typi-
cally more than double. Large problems may also require excessive computer memory
allocations. For these reasons we often seek ways of breaking a large optimization
problem into a series of smaller problems.
One of the more popular methods for achieving such a break-up is decomposition.
The process of decomposition consists of identifying relationships between design
variables and constraints that permit us to separate them into groups that are only
weakly interconnected. Once we have accomplished the process of decomposition we
need to identify an optimization method that would take advantage of the grouping
and replace the overall design with a series of optimizations of the individual groups,
coordinated so as to optimize the entire system.
The coordination process is often achieved by an optimization algorithm, and then
the overall optimization becomes a two-level optimization process. The coordination
level is usually referred to as the top level, and the small optimization problems are
called the subordinate level. Of course, it may be possible to break each one of the
groups in the subordinate level to further subgroups, so that we obtain a three-level
optimization, and so on. The multilevel structure generated through the process of
decomposition is usually characterized by a large number of daughter subproblems
in successive levels. When the decomposition process is depicted schematically (see
Figure 10.1.1a), the diagram has a wide-tree (or multiple-branching) structure.
Figure 10.1.1 Multilevel-problem structures.

Multilevel optimization is not only generated through decomposition. Some problems
have a natural multilevel structure with only one or a few daughter sublevels, that is,
they have a narrow-tree structure (see Figure 10.1.1b).
10.2 Decomposition
and each constraint depends only on variables from a single group. That is, if we
denote the vector of constraints associated with Xi as gi, the constraints may be
written as
gi(Xi) 2:: 0, i = 1, ... , s. (10.2.3)
Figure 10.2.1 Block-diagonal problem structure.
This simple problem structure is diagrammed in Figure 10.2.1a. The rows in the
diagram represent the objective function and constraints, and the columns represent
the design variables. An 'x' in a block indicates that the objective function or the
constraint corresponding to the row of the block depends on the vector of design
variables associated with the column of that block. For a block-diagonal problem the
solution naturally breaks down to a series of problems
minimize f_i(x_i)
such that g_i(x_i) ≥ 0,  (10.2.4)
which can be solved independently for i = 1, ... ,s (that is, the problem is separable,
see Section 9.2.2). This is an ideal situation because we replace the solution of the
large problem with a series of smaller problems without the need for any coordination
between subproblems. This is also the simplest example of problem decomposition.
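A block-diagonal problem can therefore be solved one group at a time; a toy sketch (entirely hypothetical data) with two single-variable groups:

```python
# Each subproblem: minimize (x - c)^2 subject to x >= lower. Because neither
# the objective terms nor the constraints couple the groups, solving the
# subproblems independently solves the full problem, as in Eq. (10.2.4).
def solve_group(c, lower):
    return max(c, lower)                 # closed-form constrained minimizer

groups = [(1.0, 2.0), (5.0, 3.0)]        # hypothetical (c, lower bound) pairs
solution = [solve_group(c, lo) for c, lo in groups]
print(solution)                          # [2.0, 5.0]
```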
It is extremely rare to encounter problems that have a simple block-diagonal
structure, but in many cases we have optimization problems where the coupling be-
tween groups of variables is very weak. The coupling between groups of variables
means that some of the blank off-diagonal squares in Fig. 10.2.1 fill up. A weak
coupling means that the derivatives in these off-diagonal squares are small compared
with the derivatives in the diagonal squares. In cases of weak coupling it may be
possible to proceed as if the problem form were block diagonal. However, instead of
optimizing each group of variables only once, we have to repeat the process several
times to account for the weak coupling between groups. For example, consider the
design of truss structures subject to stress and local buckling constraints. We can
design the cross-sectional parameters of each member of the truss separately to satisfy the stress and local buckling constraints, assuming that member forces remain
where nⁱ denotes the member force vector for the ith load case, and E is a matrix
of direction cosines. For the limit design problem of the truss we need to enforce the
yield constraints under each load case as

A_jσ_C ≤ n_jⁱ ≤ A_jσ_T,  (10.2.7)

where σ_T and σ_C denote the yield stresses in tension and compression, respectively, A_j
is the cross-sectional area of the jth member, and n_jⁱ denotes the force in member j
under the ith load case. The limit design problem for minimum weight design of the
truss can then be formulated as
minimize m = Σ_{j=1}^{r} ρA_jL_j
subject to Enⁱ = pⁱ
and A_jσ_C ≤ n_jⁱ ≤ A_jσ_T,  (10.2.8)
where ρ and L_j denote the density and length of the jth member, respectively. In
this problem the member forces and cross-sectional areas are the design variables. In
this case the member forces for the ith load case, nⁱ, play the role of the local variable
vectors Xi since ni appears only in the constraints associated with the ith load case.
The cross-sectional areas play the role of the coupling vector y since they appear in
the objective function and in the constraints for all load conditions.
Example 10.2.1
The three-bar truss in Figure 10.2.2 is to be designed for minimum mass so as not to
collapse, under two load systems: a vertical load of magnitude 8p and a horizontal
load of magnitude p. We assume that the truss can collapse not only due to yield, but
also due to Euler buckling of the compression members. The post-buckling behavior
is assumed to be fiat (that is constant load with increasing deformation), so that the
buckling stress can be substituted for the yield stress in Eq. (10.2.7) for members
in compression. The design variables are the cross-sectional areas and moments of
inertia of the members (assumed to be independent).
Figure 10.2.2 Three-bar truss: overall geometry and member cross section (section a-a).
The horizontal load can act either to the right or to the left, and so we require a
symmetric design, A_A = A_C and I_A = I_C. We assume that the material properties
of the members are identical, and that under the horizontal load member B will not
be critical in tension. Denoting the two load cases by superscripts H and V, we
have, for the vertical load case,

0.866(n_A^V − n_C^V) = 0,
n_B^V + 0.5(n_A^V + n_C^V) = −8p,

and, for example, the Euler buckling constraint on member C,

−n_C^V ≤ π²EI_A/(4l²).
The block diagram for the problem is shown in Figure 10.2.3, with a detailed
variable-by-variable diagram in (a), and a variable-group diagram in (b). The diagram shows that the optimization problem has a block angular form, with the cross-sectional properties being the coupling variables, and the member forces for each load
case being the local variables. •••
A block angular form can be used in various ways, discussed later, to replace
the overall optimization problem by a series of smaller problems. Aside from its
value in decomposition, a block angular form also has other computational benefits.
The main advantage is that derivative calculation is inexpensive because constraints
depend only on a limited number of design variables. Therefore, it is worthwhile to try
and induce such a block angular structure by proper choice of design variables, even
if we use a standard optimization algorithm to solve the problem. This is illustrated
in the following example.
Example 10.2.2
The three-bar truss in Figure 10.2.2 is now to be designed for minimum weight in the
elastic range by varying the radius and the thickness of the members. The two loads
are now assumed to act simultaneously, so that we consider only a single load case.
Because of symmetry we assume that members A and C are identical, so that the
design variables are r_A, t_A, r_B, and t_B. We assume that the thicknesses of the tubes
are small compared to the radii, so that the cross-sectional areas are approximated
as

A_A = 2πr_At_A,  A_B = 2πr_Bt_B.
Displacement, stress, and buckling constraints are applied. The vertical displacement
v is restricted to be less than 0.001l. The stress in each member should be less than
σ0 = 0.002E, where E is Young's modulus and σ0 is the yield stress in tension
and compression, σ0 = 10⁵p/l². Additionally, the members should not buckle. This
means that the stress in each member is limited to be below the shell-buckling stress
of 0.605Et/r, where r is the radius of the member and t its thickness, and the stress
Figure 10.2.3 Block diagrams for Example 10.2.1: (a) variable-by-variable diagram (rows: mass, horizontal equilibrium, vertical equilibrium, yielding A, buckling B, buckling C; columns: A_A, I_A, A_B, I_B, and the member forces for the two load cases); (b) variable-group diagram (cross-sectional variables, horizontal-load member forces, vertical-load member forces).
must also be below the Euler buckling stress of π²Er²/(2L²), where L is the length of
the member.
The truss was analyzed in Example 6.1.2 for a vertical tensile force, and it is easy
to change the sign of that force and obtain
v = −8pl/[E(A_B + 0.25A_A)],
σ_A = p[√3/(3A_A) − 2/(A_B + 0.25A_A)],
σ_B = −8p/(A_B + 0.25A_A),
σ_C = −p[√3/(3A_A) + 2/(A_B + 0.25A_A)].
We assume that the yield stress is the same in compression and in tension, and then
member C will always be more critical than member A, so that the design problem
may be written as

minimize m = ρl(A_B + 4A_A)
such that 1 + v/(0.001l) ≥ 0,
1 + σ_B/σ0 ≥ 0,
0.605Et_B/(r_Bσ0) + σ_B/σ0 ≥ 0,  π²Er_B²/(2l²σ0) + σ_B/σ0 ≥ 0,
1 + σ_C/σ0 ≥ 0,
0.605Et_A/(r_Aσ0) + σ_C/σ0 ≥ 0,  π²Er_A²/(8l²σ0) + σ_C/σ0 ≥ 0.
As posed the problem is fully coupled in that each constraint depends on all four
design variables (note that the stresses in each member depend on the area and
hence on the thickness and radius of the other member). However, it is simple to
decouple the members and construct a block angular problem structure by changing
design variables. We select the cross-sectional areas as the coupling variables (y),
and then either the radii or the thicknesses of the members can be the local or
subsystem variables. In this example, let us use the two radii as the local variables.
The thicknesses may then be obtained from the radii and cross-sectional areas. We
define nondimensional area variables as

y_1 = σ_0A_A/p ,  y_2 = σ_0A_B/p ,

and then the mass, the displacement, and the stresses may be written in terms of y_1
and y_2 only. The buckling constraints also require the radii. Defining the nondimensional
radii as

x_1 = r_A/l ,  x_2 = r_B/l ,

the shell buckling stress of member B, for example, becomes 0.605Et_B/r_B = 4.814×10⁻⁴σ_0y_2/x_2², and

π²Er_B²/2l² = (π²E/2σ_0)σ_0x_2² = 2467σ_0x_2² .
Using similar expressions for member C, we can now write the design problem as
g_3(y) = 1 − √3/(3y_1) − 2/(y_2 + 0.25y_1) ≥ 0 ,  (stress in C)

and

g_11(x_1, y) = 4.814×10⁻⁴y_1/x_1² − √3/(3y_1) − 2/(y_2 + 0.25y_1) ≥ 0 ,  (shell buckling C)
g_12(x_1, y) = 616.9x_1² − √3/(3y_1) − 2/(y_2 + 0.25y_1) ≥ 0 ,  (Euler buckling C)
g_21(x_2, y) = 4.814×10⁻⁴y_2/x_2² − 8/(y_2 + 0.25y_1) ≥ 0 ,  (shell buckling B)
g_22(x_2, y) = 2467x_2² − 8/(y_2 + 0.25y_1) ≥ 0 .  (Euler buckling B)

The problem now has the requisite block-angular structure.•••
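With the constraints in this nondimensional form, each subsystem can be treated independently once the coupling variables y are fixed. The sketch below (Python; the bounds placed on the nondimensional radius are assumptions made only for illustration) evaluates the member-C constraint functions and finds, for a given y, the radius variable x_1 that maximizes the smaller of the two buckling margins:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Constraint functions of the decomposed three-bar truss problem:
# y = (y1, y2) are the nondimensional areas, x1 the nondimensional radius.
def g3(y):
    y1, y2 = y
    return 1.0 - np.sqrt(3.0)/(3.0*y1) - 2.0/(y2 + 0.25*y1)

def g11(x1, y):  # shell buckling of member C
    y1, y2 = y
    return 4.814e-4*y1/x1**2 - np.sqrt(3.0)/(3.0*y1) - 2.0/(y2 + 0.25*y1)

def g12(x1, y):  # Euler buckling of member C
    y1, y2 = y
    return 616.9*x1**2 - np.sqrt(3.0)/(3.0*y1) - 2.0/(y2 + 0.25*y1)

def best_x1(y, lo=1e-3, hi=1.0):
    """For fixed coupling variables y, subsystem 1 involves only x1:
    pick the x1 that maximizes the smaller of its two buckling margins."""
    res = minimize_scalar(lambda x1: -min(g11(x1, y), g12(x1, y)),
                          bounds=(lo, hi), method='bounded',
                          options={'xatol': 1e-10})
    return res.x
```

Because g_11 decreases and g_12 increases with x_1, the subsystem optimum sits where the two margins cross, i.e., where g_11 = g_12.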
Now consider the case of a more complex truss structure composed of s tubular
members, designed for minimum mass and subject to stress, displacement, and local
buckling constraints. The stresses will be calculated from a finite element model.
For optimization we will need the derivatives of the stresses with respect to design
variables, and this derivative calculation can be the major cost in the optimization
process, especially if derivatives are calculated by finite differences. If the radii and
thicknesses of the members are used as design variables, then the problem is fully
coupled, in that a change in each design variable may affect the stresses in all members. We will need to calculate derivatives of the stresses in the members with respect to 2s design variables. If, on the other hand, we use the decomposition approach employed for the three-bar truss, the cross-sectional areas and the radii are the design
variables. The partial derivatives of the stresses with respect to the member radii are
taken for fixed values of the corresponding areas (this is, of course, possible because
the thicknesses are not specified). So these derivatives of stresses with respect to
radii are zero, and we need to calculate only the s partial derivatives of stresses with
respect to areas.
A similar approach may be used for frame type structures. The portal frame shown
in Figure 10.2.4, for example, was introduced by Sobieszczanski-Sobieski et al. [4] for demonstrating
multilevel optimization concepts. Each one of the three beams has an I cross-section
defined by 6 design variables. Constraints are imposed on stresses and displacements
under the loads shown in the figure. If the detail (local) design variables are used,
[Figure 10.2.4: Portal frame example. A horizontal load P = 50000 N acts on the frame; the columns are 500 cm tall and the beam spans 1000 cm. Section A-A of each of the three members is an I-section described by the flange widths b_1 and b_2, the flange thicknesses t_1 and t_3, the web thickness t_2, and the height h (not to scale).]
Consider, for example, a generalization of the truss and frame cases where each
subsystem has a set of global variables that are used to eliminate a number of the
subsystem variables. For the sake of simplicity we will consider a single subsystem,
and omit the subscript associated with it. That is, let x be the vector of subsystem
variables (such as the radius and thickness for the truss tube member), and let y be
the part of the global variable vector associated with that subsystem (such as the
cross-sectional area for that truss member).
We assume that we can identify a subset of x that can be eliminated in terms of
y and denote it as x_E, and denote the rest of the local variables (to be retained) as
x_R. The relationship between y, x_E, and x_R is given as
h(y, x_E, x_R) = 0 . (10.2.9)
This relationship cannot always be solved analytically to yield an expression for x_E
in terms of y and x_R, but it can be solved numerically (e.g., by Newton's method).
The numerical solution for x_E is usually inexpensive, because Eq. (10.2.9) is a small
system of algebraic equations. It is important, however, to choose x_E such that the
system has a solution; that is, the Jacobian ∂h/∂x_E must be nonsingular.
If we replace x by y and x_R as design variables without having an analytical
expression for the eliminated variables, our main difficulty will be in calculating
derivatives of the objective function and constraints with respect to the new set of design
variables. Consider, for example, a constraint function

g(x) = g(x_R, x_E) = ḡ(x_R, y) . (10.2.10)
We need to calculate the derivatives of ḡ without having an explicit expression for
it. This is easily accomplished using implicit differentiation. Differentiating Eq.
(10.2.10) we get

∂ḡ/∂x_R = ∂g/∂x_R + (∂g/∂x_E)(∂x_E/∂x_R) ,
∂ḡ/∂y = (∂g/∂x_E)(∂x_E/∂y) . (10.2.11)
Note the difference between ∂g/∂x_R and ∂ḡ/∂x_R. The first is a derivative of the
constraint with x_E held constant, while the second is a derivative of the constraint
with y held constant.
To be able to evaluate the derivatives from Eq. (10.2.11) we need the derivatives
∂x_E/∂x_R and ∂x_E/∂y. These are obtained by differentiating Eq. (10.2.9) as

∂h/∂y + (∂h/∂x_E)(∂x_E/∂y) = 0 ,
∂h/∂x_R + (∂h/∂x_E)(∂x_E/∂x_R) = 0 , (10.2.12)

which can be solved to yield

∂x_E/∂y = −[∂h/∂x_E]⁻¹(∂h/∂y) ,
∂x_E/∂x_R = −[∂h/∂x_E]⁻¹(∂h/∂x_R) . (10.2.13)
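Equations (10.2.12)-(10.2.13) can be applied even when h is available only as a procedure. The sketch below (Python) uses a hypothetical two-equation coupling h, standing in for relations such as those of the following example, solves it for x_E with a root finder, and builds ∂x_E/∂y from finite-difference Jacobians of h; any smooth h with a nonsingular ∂h/∂x_E block would serve:

```python
import numpy as np
from scipy.optimize import fsolve

# Hypothetical coupling relation h(y, xE, xR) = 0 with two eliminated
# variables xE (an illustration, not the book's portal-frame equations).
def h(y, xE, xR):
    return np.array([xE[0]**3 + xE[1] - y[0],
                     xE[0] + 2.0*xE[1] - y[1]*xR[0]])

def solve_xE(y, xR, guess=(1.0, 1.0)):
    # numerical solution of h = 0 for the eliminated variables
    return fsolve(lambda xE: h(y, xE, xR), guess)

def dxE_dy(y, xR, eps=1e-7):
    xE = solve_xE(y, xR)
    # forward-difference Jacobians dh/dxE and dh/dy
    dh_dxE = np.column_stack(
        [(h(y, xE + eps*np.eye(2)[i], xR) - h(y, xE, xR))/eps for i in range(2)])
    dh_dy = np.column_stack(
        [(h(y + eps*np.eye(2)[i], xE, xR) - h(y, xE, xR))/eps for i in range(2)])
    # Eq. (10.2.13): dxE/dy = -[dh/dxE]^{-1} dh/dy
    return -np.linalg.solve(dh_dxE, dh_dy)
```

The result can be verified, as in the example that follows, by perturbing y and re-solving for x_E.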
This process is illustrated in the following example.
Example 10.2.3
Consider again the portal frame of Figure 10.2.4. The natural global variables are the
cross-sectional areas and moments of inertia. Denoting the area and moment of inertia
of a typical member by A and I, respectively, and assuming that the thicknesses are
much smaller than the other dimensions we have for Eq. (10.2.9)
Assume that we have a local constraint which requires (say, to avoid unreasonable
geometries) that the web accounts for at least 20 percent of the total area, that is
(b)
Assume further that we use the area and moment of inertia to eliminate the variables
t_1 and t_3. That is, here t_1 and t_3 are the components of x_E, and b_1, b_2, t_2, and H are
the components of x_R. After the elimination of the two local variables the constraint
may be written as

ḡ(A, I, b_1, b_2, t_2, H) ≥ 0 .
We want to demonstrate that we do not need to have an explicit form for ḡ to be
able to evaluate it and its derivatives. To evaluate ḡ for a given set of its arguments
we first solve Eqs. (a) for t_1 and t_3, and then we evaluate g from (b) and note that
Consider now, for example, the derivative of g with respect to the area A.
and then

∂ḡ/∂A = 1.3 .
As a check we can change the area by a small amount ΔA without changing the
other arguments of ḡ. This can be accomplished by changing t_1 by (∂t_1/∂A)ΔA =
−0.5ΔA/H and changing t_3 by (∂t_3/∂A)ΔA = 1.5ΔA/H. We then check that the
moment of inertia I does not change (to first order in ΔA), and that g changes by
approximately 1.3ΔA.•••
In many cases, however, the optimization process at the two levels has to be
coordinated. For linear problems Dantzig and Wolfe ([9] and [10]) and Rosen ([11]
and [12]) developed two-level algorithms for the block-angular problem, Eq. (10.2.5).
For nonlinear problems, one possible approach is known as the model-coordination
method. Here we describe a version based on derivatives of the optima of subsys-
tems with respect to upper-level variables. Consider the block-angular problem, Eq.
Consider again the three-bar truss of Example 10.2.1. As shown in that example,
the problem has a block-angular form, with the areas and moments of inertia being
the global design variables, and member forces the local variables. The upper level
optimization in a two-level approach for this problem can be formulated as follows:
minimize m = ρl(4A_A + A_B)
such that p_c^H − p ≥ 0 ,
p_c^V − p ≥ 0 ,

where p_c^H and p_c^V denote the collapse values of p for the horizontal and vertical load
cases, respectively. These collapse values are obtained from the solution of two sub-level
optimization problems. For the horizontal load we solve

maximize p_c^H
such that 0.866(n_A^H − n_C^H) = p_c^H ,
n_B^H + 0.5(n_A^H + n_C^H) = 0 ,
−n_C^H ≤ π²EI_A/4l² .
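Each such sub-level collapse problem is linear in the member forces, so it can be solved by linear programming. A minimal sketch (Python/scipy; the numerical tension limit N_T and compression limit N_C below are hypothetical placeholders for the yield and buckling force limits):

```python
from scipy.optimize import linprog

# Hypothetical member-force limits standing in for the yield and
# buckling limits of the truss members (illustrative numbers only).
N_T, N_C = 10.0, 6.0

# Variables z = [p, nA, nB, nC]; maximizing p means minimizing -p.
c = [-1.0, 0.0, 0.0, 0.0]
A_eq = [[-1.0, 0.866, 0.0, -0.866],  # horizontal equilibrium: 0.866(nA - nC) = p
        [0.0, 0.5, 1.0, 0.5]]        # vertical equilibrium: nB + 0.5(nA + nC) = 0
b_eq = [0.0, 0.0]
bounds = [(0.0, None)] + [(-N_C, N_T)]*3

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
p_collapse = res.x[0]  # collapse value of p for this load case
```

At the optimum the critical members sit at their force limits, exactly as in the by-inspection solution discussed next.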
Section 10.4: Penalty and Envelope Function Approaches
For the vertical load we similarly maximize p_c^V subject to the corresponding
equilibrium equations and member force limits.
To optimize the upper-level problem we will need derivatives of the two collapse loads
with respect to the cross-sectional areas and moments of inertia. We will consider
only the derivatives of the horizontal collapse load p_c^H. The problem is simple enough
that the solution for the collapse load can be found by inspection. If I_B is large
enough, so that member B is not critical, then collapse will be reached when members
A and C reach their maximum (yield or buckling) loads, and from the horizontal
equation of equilibrium we get
From the vertical equation of equilibrium we can then check that at this load member
B will indeed be below its failure load if
If, on the other hand, I_B < I_B0, then members C and B will reach their maximum
load first, and using the two equations of equilibrium we find that

p_c^H = (0.866π²E/l²)(2I_B + 0.5I_A) .
It is easy to check that when I_B = I_B0 both expressions for the collapse load give
identical results, so that p_c^H is a continuous function of I_B. The derivative of p_c^H with
respect to I_B, on the other hand, is not continuous. When I_B > I_B0 this derivative
is zero, as the collapse load is independent of the properties of member B when that
member is not critical. For I_B < I_B0 we get

∂p_c^H/∂I_B = 1.732π²E/l² .

This discontinuity in the derivative can pose difficulties to most optimization
algorithms, especially if the optimum design is in the vicinity of I_B = I_B0.•••
One way of avoiding the difficulties of the two-level approach discussed above is
to use an exterior or extended interior penalty-function method (see Section 5.7)
for the objective function at the lower levels. The penalty function approach allows
us to accept upper-level designs (y variables) that do not have feasible lower-level
(x_i variables) solutions. Indeed, the penalty associated with constraint violation at the
lower levels will eventually drive the upper-level design variables away from regions
with no lower-level feasible solutions. Also, the extended penalty function smooths
the discontinuities associated with the derivatives of the lower-level optima, especially
when the lower-level optimization is not performed with extreme values of the penalty
parameter. Finally, the use of a penalty function resolves the difficulty that occurs when
the lower-level variables do not contribute to the objective function.
Consider the block-angular problem described by Eq. (10.2.5). Using a penalty
function approach we replace the constrained problem with the penalized problem of
Eq. (10.4.1), with penalty terms of the extended interior form

p(g_j, r) = r/g_j  for g_j ≥ g_0 ,
p(g_j, r) = r(2g_0 − g_j)/g_0²  for g_j < g_0 , (10.4.3)

with the transition value

g_0 = g_00 r^{1/2} , (10.4.4)

where g_00 is a constant. The problem described by Eq. (10.4.1) is solved for a series
of values of r such that r → 0. A multilevel version of this formulation assigns each
of the s subsystems its own penalty parameter r_i.
The series of values for the subsystem penalty parameters r_i tend to zero together
with the global penalty parameter r.
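A sketch of one common realization of this extended interior penalty (Python; the reciprocal interior part and the linear extension below g_0 follow the usual form of Section 5.7, and are assumptions to the extent that the scanned display is incomplete):

```python
import numpy as np

def p_ext(g, r, g00=1.0):
    """Extended interior penalty for one constraint g >= 0.
    Transition value g0 = g00*sqrt(r); below g0 the reciprocal barrier
    is continued linearly, so infeasible points are admissible but
    increasingly penalized."""
    g0 = g00*np.sqrt(r)
    if g >= g0:
        return r/g
    return r*(2.0*g0 - g)/g0**2

def phi(f_value, gs, r):
    # penalized objective: original objective plus penalty terms
    return f_value + sum(p_ext(g, r) for g in gs)
```

The two branches match in value and slope at g = g_0, so the penalty is once continuously differentiable; as r → 0 the penalized optima approach the constrained optimum.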
The method of varying the penalty parameters of the subsystems defines the
particular multilevel algorithm. One attractive approach is to perform each sublevel
optimization for only a single value of r_i, arguing that there is no point in striving
for an exact sublevel optimum before the upper-level variables have settled close to
their final values. That single value of the penalty parameter for each subsystem
can then be gradually reduced towards zero as the optimization proceeds. Reference
[15] shows that when all subsystems use the same penalty parameter, the multilevel
optimization is completely equivalent to the single-level approach. This means that
the same series of intermediate designs are obtained on the way to the final optimum,
and the calculations performed could be made to be identical. The process can be
viewed as a two-level optimization, or a single-level optimization where the block-
angular form is utilized to reduce the amount of computation and permit parallel
operations.
Note that even when other techniques are used to solve multilevel optimization
problems it is common practice to use approximate or partially converged solutions
of the sub-level optimizations.
Example 10.4.1
Consider the two-level formulation of the elastic design of the three-bar truss in
Example 10.2.2. For this simple example it is convenient to use a vector penalty
function P_v which is equal to the penalty associated with the most critical constraint,

P_v(g, r) = p[min_i(g_i), r] .

In general this penalty approach may create discontinuity problems when the most
critical constraint changes identity. For our problem, though, this does not happen.
The penalty function formulation is then
where the mass and constraint functions are given in Example 10.2.2. Note that the
local variables Xl and X2 do not contribute to the mass, so that the formulation of
Eq. (10.3.1) would not have any objective function at the lower level, and the lower
level problems would only require finding a feasible solution.
With this penalty function formulation, the lower-level objectives φ_1 and φ_2 each
contain the contributions of two constraints. Because the penalty is based on the
most critical constraint, the lower-level optimum occurs when these two constraints
are equally critical. For the first subsystem we get g_11 = g_12, which yields

x_1² = 8.834×10⁻⁴√y_1 ,

and similarly g_21 = g_22 yields x_2² = 4.418×10⁻⁴√y_2.
With these relationships we can now solve the upper level problem as a single-level
optimization problem.•••
where g_i are the components of g, ρ is a factor that plays the same role as the penalty
parameter, and g_min is the most critical constraint. It is easy to show that
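The envelope function referred to here is typically the Kreisselmeier-Steinhauser (KS) function, which folds all constraints into a single smooth, conservative estimate of the most critical one; a sketch (Python), written about g_min for numerical stability:

```python
import numpy as np

def ks(g, rho):
    """Kreisselmeier-Steinhauser envelope of constraints g_i >= 0.
    Factoring out g_min keeps the exponentials from overflowing."""
    g = np.asarray(g, dtype=float)
    gmin = g.min()
    return gmin - np.log(np.sum(np.exp(-rho*(g - gmin))))/rho
```

The value is bracketed by g_min − ln(m)/ρ ≤ KS ≤ g_min for m constraints, so the single constraint KS(g) ≥ 0 conservatively enforces all of them, with the approximation tightening as ρ grows.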
Section 10.5: Narrow-Tree Multilevel Problems
that was formulated as a single-level problem in Example 10.2.1. The single-level for-
mulation had cross-sectional areas (structural sizes) and member forces (structural
response) as design variables.
In the case of limit design the single-level formulation, that is, the SAND approach,
is the method of choice in engineering practice. However, in the elastic range the
nested approach is the rule. The problem of minimum-weight design subject to
displacement and stress constraints in the elastic range can be formulated as

minimize W(x)
such that g_j(u, x) ≥ 0 ,  j = 1, …, m , (10.5.1)

where the displacement field u can be obtained as the solution to the minimization
of the potential energy U given in terms of the stiffness matrix K and the load vector
f

minimize U = (1/2)uᵀK(x)u − uᵀf . (10.5.2)
The common approach is to solve this problem as a two-level optimization, since
the solution to the energy minimization problem is obtained simply by solving the
equations of equilibrium Ku = f.
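The nested character of Eqs. (10.5.1)-(10.5.2) can be sketched with a toy model: the lower level is simply a linear equilibrium solve buried inside each constraint evaluation of the upper-level optimizer. The two-spring stiffness model, the load, and the displacement limit below are all hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

f = np.array([1.0, 0.0])  # hypothetical load vector

def K(x):
    # stiffness of two springs in series with stiffnesses x[0], x[1]
    return np.array([[x[0] + x[1], -x[1]],
                     [-x[1], x[1]]])

def weight(x):
    return x[0] + x[1]  # stand-in for the structural weight W(x)

def g(x):
    u = np.linalg.solve(K(x), f)  # lower level: equilibrium Ku = f
    return 0.5 - u[0]             # hypothetical displacement limit

res = minimize(weight, [5.0, 5.0],
               constraints={'type': 'ineq', 'fun': g},
               bounds=[(0.1, None), (0.1, None)])
```

Every constraint evaluation triggers a full reanalysis; a SAND formulation would instead carry u as additional design variables and impose Ku = f as equality constraints.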
The SAND approach of using the equations of equilibrium as equality constraints
and treating both structural sizes and displacements as design variables was
attempted in the 1960's by Fox and Schmit [17] using a conjugate gradient (CG)
technique for the optimization. However, the CG method could not deal effectively with
the equality constraints associated with the equations of equilibrium, because the stiffness
matrix generated by a finite-element model is typically ill-conditioned. Gaussian
elimination techniques lose accuracy when applied to ill-conditioned equations, but
this can be tolerated if the number of digits used in the computer arithmetic is high
enough (most finite-element computations are done in double precision). The effect
of ill-conditioning on iterative methods such as the CG method is to slow down
convergence.
Recent advances in optimization methods such as preconditioned CG methods,
however, improve the efficiency of the SAND approach, and make it competitive
for three-dimensional problems that result in poorly banded stiffness matrices. As
a result there has been a revival of interest in SAND approaches (see Haftka [18],
Smaoui and Schmit [19], Ringertz [20], and Haftka and Kamat [21]). Overall, the
SAND method eliminates the need for continually reanalyzing the structure, at the
expense of solving a larger optimization problem (including displacements as design
variables). It is, therefore, most appropriate to use SAND in problems with a very
large number of structural design variables, where the addition of displacement
variables has a small effect on the total number of design variables.
The SAND method is not the method of choice when there are many load cases,
because in that case the number of displacement design variables becomes very large.
However, Chibani [22] employed SAND in this case using a two-level approach and
geometric programming to alleviate the computational burden. The method is also
very useful in topology optimization, where the traditional nested approach runs into
trouble when the elimination of parts of the structure can render the stiffness matrix
singular (see Bendsøe et al. [23]).
It is not always possible to transform a two-level problem into a single-level
one. Consider, for example, the problem of maximizing the lowest frequency ω_1 of a
structure subject to the constraint that its weight W does not exceed a limit W_u. A
two-level formulation of the problem is

maximize ω_1(x)
such that W_u − W(x) ≥ 0 , (10.5.3)

ω_1² = min over u of (uᵀK(x)u)/(uᵀMu) , (10.5.4)

with M being the mass matrix and u the eigenvector corresponding to ω_1. It is not
possible to replace this two-level problem by the single-level problem

find x and u to maximize ω_1² = (uᵀKu)/(uᵀMu) (10.5.5)
such that W_u − W(x) ≥ 0 ,

because in the above formulation the optimization will choose the eigenvector
corresponding to the highest rather than the lowest frequency. It is still possible to convert
this frequency maximization problem to a SAND single-level approach [24] by using
the Kuhn-Tucker conditions of the problem, but the process is more complex and
more computationally costly than the nested approach of Eqs. (10.5.3) and (10.5.4).
Section 10.6: Decomposition in Response and Sensitivity Calculations
analysis) of the individual subsystems. That is, the equations governing the response
of the system can be written as

We can take advantage of this block-angular structure in the solution procedure.
For example, consider the use of Newton's method for solving the system. Given an
initial estimate for the solution we compute a correction to that estimate from a first-order
Taylor series expansion

(10.6.4)

That is, the problem can be reduced to the solution of a system, Eq. (10.6.4), of the
order of w, and then the individual subsystem responses, u_i, can be calculated, as
needed, from Eq. (10.6.3).
The same procedure can be used to calculate the sensitivity of the response with
respect to design variables. Assume now that the system depends also on a design
parameter x. That is, we have

∂r_0/∂x + r_{0,1}(∂u_1/∂x) + ⋯ + r_{0,s}(∂u_s/∂x) + r_{0,0}(∂w/∂x) = 0 ,
∂r_i/∂x + r_{i,i}(∂u_i/∂x) + r_{i,0}(∂w/∂x) = 0 ,  i = 1, …, s . (10.6.6)

We can now express ∂u_i/∂x in terms of ∂w/∂x and reduce the problem to a system
of the same order as that of w.
That is, r_i is a procedure for calculating w_i given the response of the other disciplines
and a vector x of design variables. Similarly, t_i represents a procedure for calculating
the response u_i. Equation (10.6.7) represents a system of coupled nonlinear equations
in the w_i, i = 1, …, s. The solution of this system can proceed, for example, by the
use of Newton's method, so that given an initial estimate w_i^0 for the w_i's we can find
a correction Δw_i by solving

JΔw = Δr , (10.6.8)

where J has identity blocks on its diagonal and the blocks −r_{i,j} off it,

J = [I, −r_{1,2}, ⋯; −r_{2,1}, I, ⋯; ⋯] ,  w = (w_1, …, w_s)ᵀ ,  Δr = (Δr_1, …, Δr_s)ᵀ , (10.6.9)

and where

Δr_i = r_i(w_1^0, …, w_s^0, x) − w_i^0 . (10.6.10)

After we converge to the solution for w we can then find the u_i from Eq. (10.6.7).
The calculation of sensitivity with respect to a design parameter proceeds in a similar
manner. Differentiating Eq. (10.6.7) with respect to a component of x we get

J(∂w/∂x) = ∂r/∂x , (10.6.11)

where ∂r/∂x is taken with the w_i held fixed.
The special structure of the Jacobian J permits us to reduce the order of the equations
by eliminating one of the w_i's, as illustrated in the example below.
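The Newton iteration of Eqs. (10.6.8)-(10.6.10) can be sketched for two coupled subsystems w_1 = r_1(w_2) and w_2 = r_2(w_1); the two maps below are hypothetical stand-ins for disciplinary black boxes with analytically known cross-derivatives:

```python
import numpy as np

# Hypothetical coupled "disciplines"
def r1(w2):
    return 1.0 + 0.3*np.tanh(w2)

def r2(w1):
    return 0.5 + 0.2*w1**2

def newton_coupled(w=(0.0, 0.0), tol=1e-12, itmax=25):
    w1, w2 = w
    for _ in range(itmax):
        # residuals: Delta r_i = r_i(w^0) - w_i^0, as in Eq. (10.6.10)
        dr = np.array([r1(w2) - w1, r2(w1) - w2])
        if np.abs(dr).max() < tol:
            break
        # Jacobian as in Eq. (10.6.9): identity on the diagonal,
        # -dr_i/dw_j in the off-diagonal blocks
        J = np.array([[1.0, -0.3/np.cosh(w2)**2],
                      [-0.4*w1, 1.0]])
        dw = np.linalg.solve(J, dr)   # Eq. (10.6.8)
        w1, w2 = w1 + dw[0], w2 + dw[1]
    return w1, w2
```

For large systems the same block structure lets one condense the coupled correction onto the narrow interaction variables, which is the point of the aeroelastic example.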
The GSE approach requires the derivatives of the individual disciplinary responses
with respect to the input of all the other disciplines. The cost of these calculations can
be very large when the front of interaction between disciplines is large. In comparing
the cost of the GSE approach to that of finite-difference calculation of the derivatives,
a key parameter is the number of design variables. For a large number of design
variables, the GSE method tends to be more efficient than the finite-difference method,
while for a small number of design variables finite differences are less expensive. For
a more detailed discussion of the cost issues, as well as the pathological cases when
the GSE matrix may be singular, the reader is referred to [32].
As noted before, the major difficulty associated with using multilevel techniques
is in finding a way to decompose the problem so that it has the requisite
hierarchical structure. Successful decomposition breaks the problem into elements that
have only narrow fronts of interaction. For multidisciplinary analysis and sensitivity
we seek ways to narrow the front of interaction between disciplines. The following
example of integrated aerodynamic-structural wing analysis and sensitivity calculations
illustrates the use of a reduced-basis technique for achieving this goal.
Example 10.6.1
Consider the aeroelastic analysis of an aircraft wing. The flow field around the wing
is calculated based on the shape of the wing. Then pressures and loads are calculated
from flow velocities, and these are used to calculate structural displacements which
in turn change the shape of the wing. The solution for this coupled problem is often
performed iteratively, starting with the flow field around a rigid wing, continuing
with the loads and displacements associated with this flow field, updating the shape
of the wing based on these displacements, and so on. This approach, called fixed-point
iteration, may be preferable to Newton's method if the calculation of the Jacobian
is expensive. However, if we need also the sensitivity of the aeroelastic response to
design parameters, it may be advantageous to use Newton's method instead of the
fixed-point iteration. The feasibility of using Newton's method depends on the width
of the front of interaction. To focus on the question of the front of interaction we start
without consideration of design variables and examine the solution of the aeroelastic
interaction.
We assume that we have an aerodynamic 'black box' that solves for the flow
field represented, say, by the velocity vector, v, given the shape of the wing which is
represented by a shape vector, s,
(a)
where b a denotes the application of the aerodynamic black box. Next we have a force
black box which translates the flow velocities into aerodynamic loads fa that can be
used in the structural analysis
(b)
The next black box is the structural analysis package, which combines the aerodynamic
loads with inertia loads and calculates the displacement vector u

u = b_s(f_a) . (c)
Finally, we have an interpolation black box, b_i say, which updates the shape of the wing
based on the displacement field

s = b_i(u) . (d)
At first glance, the system described by Eqs. (a)-(d) appears to be fully coupled.
Solving this system by Newton's method appears to be impractical because of the
huge size of the Jacobian. The flow field vector v and the displacement vector u
usually have thousands or tens of thousands of components. However, the vectors
f_a and s can have a fairly small number of components, and we can reduce the
problem size enormously by combining the first two and the last two black boxes. The
first combination gives us the aerodynamic forces in terms of the shape of the wing

f_a = b_f(b_a(s)) , (e)

and the second combination gives us the shape of the wing as a function of the
aerodynamic forces

s = b_i(b_s(f_a)) . (f)
We note that the variables f_a and s play the role of w_1 and w_2 in Eq. (10.6.7), while
v and u play the role of u_1 and u_2.
The above approach of using only f_a and s as interaction variables leads to a
great reduction in the number of cross-derivatives that need to be calculated.
However, the number of components of f_a and s is often several dozen, and calculating
the Jacobian can still be prohibitively expensive. Further reduction in the number of
required derivatives is achieved by using a reduced-basis technique to represent the
displacements for the purpose of describing the aeroelastic interaction. The
displacement vectors are assumed to be adequately represented by a linear combination of
mode shapes (often vibration modes) as

u = Uq , (g)

where U is a matrix of modes and q a vector of modal amplitudes. The order of the
vector q is typically much smaller than that of u or even f_a. Furthermore, for the
reduced-basis structural analysis we now do not need f_a but instead the generalized
load vector f_a^g given as (see Eq. (7.4.30))

f_a^g = Uᵀf_a . (h)

The reduced-basis (or modal) structural analysis black box is now described
schematically as

q = b̄_s(f_a^g) . (i)
It is now most efficient to group our four black boxes in a slightly different order to
make f_a^g and q the interaction variables. That is, the generalized aerodynamic forces
are given in terms of the modal amplitudes as

f_a^g = r_1(q) , (j)

and, similarly, the modal amplitudes are given in terms of the generalized aerodynamic
forces as q = r_2(f_a^g).
For the Newton iteration, Eq. (10.6.8), we need to calculate J_12 = ∂r_1/∂q
and J_21 = ∂r_2/∂f_a^g. These are cross-derivatives, in that they are derivatives of the
aerodynamic forces with respect to the shape changes due to structural displacements,
and derivatives of shape change due to structural displacements with respect to the
aerodynamic loads. J_12 and J_21 are matrices, and it is convenient to label them as
A and S. The component a_ij of the matrix A is the derivative of the ith component
of f_a^g with respect to the jth component of q, q_j. Similarly, the component s_ij of
the matrix S is the derivative of q_i with respect to the jth component of f_a^g. These derivatives are often
calculated by finite differences. For example, if we perturb q_j and recalculate f_a^g from
Eq. (j), we can estimate the jth column of the matrix A as the difference in f_a^g divided
by the perturbation in q_j. The Newton iteration then requires the solution of

[I  −A; −S  I] {Δf_a^g; Δq} = {Δr_1; Δr_2} . (k)

Because of the special structure of Eq. (k) we can eliminate either Δq or Δf_a^g. For
example, if Δf_a^g has more components than Δq, it may be advantageous to eliminate
Δf_a^g by using the first row of Eq. (k),

Δf_a^g = AΔq + Δr_1 , (l)

which leaves a system of the order of q,

(I − SA)Δq = Δr_2 + SΔr_1 . (m)
The calculation of sensitivity with respect to a design parameter x will proceed
along the same lines. Equation (10.6.11) will become

[I  −A; −S  I] {∂f_a^g/∂x; ∂q/∂x} = {∂r_1/∂x; ∂r_2/∂x} . (n)

While the reduced-basis technique approximates the aeroelastic interaction, it does
not require that we also approximate the calculation in each individual discipline.
After we find f_a^g and q from the coupled analysis, we do not need to use Eq. (g) to
calculate the displacements. Instead we can calculate the actual aerodynamic forces
f_a corresponding to the displacements Uq, and then calculate the displacements from the
full structural analysis, Eq. (c).•••
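The dimension bookkeeping behind Eqs. (g)-(h) can be sketched directly: the full load and displacement vectors stay large, but everything the coupled iteration must exchange shrinks to the modal amplitudes and generalized forces. The sizes, the random orthonormal mode matrix, and the generalized stiffnesses below are all hypothetical:

```python
import numpy as np

n, m = 200, 4  # hypothetical full and modal dimensions
rng = np.random.default_rng(0)
U = np.linalg.qr(rng.standard_normal((n, m)))[0]  # orthonormal "mode shapes"

fa = rng.standard_normal(n)      # full aerodynamic load vector
fg = U.T @ fa                    # generalized loads, Eq. (h): only m numbers
q = fg / np.array([1.0, 4.0, 9.0, 16.0])  # modal amplitudes for hypothetical
                                          # generalized stiffnesses
u = U @ q                        # recovered displacement field, Eq. (g)
```

The cross-derivative matrices A and S of the Newton iteration are then only m-by-m, instead of coupling the full n-component vectors.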
10.7 Exercises
1. Consider the three-bar truss of Figure 10.2.2. The cross-sectional areas and moments
of inertia of the three members are given, and we want to optimize the geometry of the
truss to minimize the weight subject to the constraint that the truss does not collapse
under either load case (consider both yielding and Euler buckling). Formulate the
problem in a block-angular form.
2. Consider the portal frame of Figure 10.2.4. Formulate the minimum weight design
of the frame subject to stress constraints and a horizontal displacement limit of 10 cm.
The design variables are the cross-sectional dimensions for each of the three beams.
Define global design variables to reduce the problem to a block-angular form.
3. Calculate the derivatives of ḡ in Example 10.2.3 with respect to its other five
arguments.
4. Obtain the solution of Example 10.4.1.
5. Solve Example 10.4.1 using the KS function.
6. Formulate the elastic design problem of the three-bar truss (Example 10.2.2) as a
simultaneous-analysis-and-design problem.
10.8 References
[1] Giles, G.L., "Procedure for Automating Aircraft Wing Structural Design," J. of
the Structural Division, ASCE, 97 (ST1), pp. 99-113, 1971.
[2] Sobieszczanski, J., and Loendorf, D., "A Mixed Optimization Method for
Automated Design of Fuselage Structures," J. of Aircraft, 9 (12), pp. 805-811, 1972.
[3] Barthelemy, J.-F.M., "Engineering Design Applications of Multilevel Optimization
Methods," in Computer-Aided Optimum Design of Structures: Applications
(eds. C.A. Brebbia and S. Hernandez), Springer-Verlag, pp. 113-122, 1989.
[4] Sobieszczanski-Sobieski, J., James, B.B., and Dovi, A.R., "Structural Optimization
by Multilevel Decomposition," AIAA J., 23 (11), pp. 1775-1782, 1985.
[5] Thareja, R.R., and Haftka, R.T., "Efficient Single-Level Solution of Hierarchical
Problems in Structural Optimization," AIAA J., 28 (3), pp. 506-514, 1990.
[6] Thareja, R., and Haftka, R.T., "Numerical Difficulties Associated with Using
Equality Constraints to Achieve Multilevel Decomposition in Structural
Optimization," AIAA Paper No. 86-0854CP, Proceedings of the AIAA/ASME/ASCE/
AHS 27th Structures, Structural Dynamics and Materials Conference, San Antonio,
Texas, May 1986, pp. 21-28.
[7] Schmit, L.A., and Mehrinfar, M., "Multilevel Optimum Design of Structures with
Fiber-Composite Stiffened Panel Components," AIAA J., 20 (1), pp. 138-147, 1982.
[8] Kirsch, U., "Multilevel Optimal Design of Reinforced Concrete Structures,"
Engineering Optimization, 6, pp. 207-212, 1983.
[9] Dantzig, G.B., and Wolfe, P., "The Decomposition Algorithm for Linear
Programs," Econometrica, 29 (4), pp. 767-778, 1961.
[10] Dantzig, G.B., "A Decomposition Principle for Linear Programs," in Linear
Programming and Extensions, Princeton University Press, 1963.
[11] Rosen, J.B., "Primal Partition Programming for Block Diagonal Matrices,"
Numerische Mathematik, 6, pp. 250-260, 1964.
[12] Geoffrion, A.M., "Elements of Large-Scale Mathematical Programming," in
Perspectives on Optimization (A.M. Geoffrion, editor), Addison-Wesley, pp. 25-64,
1972.
[13] Kirsch, U., "An Improved Multilevel Structural Synthesis Method," J. Structural
Mechanics, 13 (2), pp. 123-144, 1985.
[14] Barthelemy, J.-F.M., and Sobieszczanski-Sobieski, J., "Extrapolation of Optimum
Designs Based on Sensitivity Derivatives," AIAA J., 21, pp. 797-799, 1983.
[15] Haftka, R.T., "An Improved Computational Approach for Multilevel Optimum
Design," J. of Structural Mechanics, 12 (2), pp. 245-261, 1984.
[16] Sobieszczanski-Sobieski, J., James, B.B., and Riley, M.F., "Structural Sizing by
Generalized, Multilevel Optimization," AIAA J., 25 (1), pp. 139-145, 1987.
[17] Fox, R.L., and Schmit, L.A., "Advances in the Integrated Approach to Structural
Synthesis," J. of Spacecraft and Rockets, 3 (6), pp. 858-866, 1966.
[18] Haftka, R.T., "Simultaneous Analysis and Design," AIAA J., 23 (7), pp. 1099-1103,
1985.
[19] Smaoui, H., and Schmit, L.A., "An Integrated Approach to the Synthesis of
Geometrically Non-linear Structures," International Journal for Numerical Methods
in Engineering, 26, pp. 555-570, 1988.
[20] Ringertz, U.T., "Optimization of Structures with Nonlinear Response,"
Engineering Optimization, 14, pp. 179-188, 1989.
[21] Haftka, R.T., and Kamat, M.P., "Simultaneous Nonlinear Structural Analysis
and Design," Computational Mechanics, 4 (6), pp. 409-416, 1989.
[22] Chibani, L., Optimum Design of Structures, Springer-Verlag, Berlin, Heidelberg,
1989.
[23] Bendsøe, M.P., Ben-Tal, A., and Haftka, R.T., "New Displacement-Based
Methods for Optimal Truss Topology Design," AIAA Paper 91-1215, Proceedings,
AIAA/ASME/ASCE/AHS/ASC 32nd Structures, Structural Dynamics and
Materials Conference, Baltimore, MD, April 8-10, 1991, Part 1, pp. 684-696.
[24] Shin, Y., Haftka, R.T., and Plaut, R.H., "Simultaneous Analysis and Design for
Eigenvalue Maximization," AIAA J., 26 (6), pp. 738-744, 1988.
413
Chapter 10: Decomposition and Multilevel Optimization
[25] Pedersen, P., "On the :\linimum Mass Layout of Trusses" , AGARD Conference
Proceedings, No. 36 on Symposium on Structural Optimization, Turkey, October,
1969, pp. 1Ll-1l.l7, 1970.
[26] Vanderplaats, G.N., and Moses, F., "Automated Design of Trusses for Optimum
Geometry" . J. of the Structural Divisioll, ASCE, 98, ST3, pp. 671-690, 1972.
[27] Spillers, \V.R., "Iterative Design for Optimal Geometry", J. of the Structural
Division, ASCE, 101, ST7, pp.1435-1442, 1975.
[28] Kirsch, u., "Synthesis of Structural Geometry using Approximation Concepts",
Computers and Structures. 15, 3, pp. 303-314, 1982.
[29] Ginsburg, S., and Kirsch, U .. "Design of Protective Structures against Blast", .I.
of the Structural Division, ASCE, 109 (6), pp. 1490-1506,1983.
[30] Kirsch, l;., "Nlultilevel Synthesis of Standard Building Structures," Engineering
Optimization, 7, PI'. 105-120,1984.
[31] Kirsch, U., "A Bounding Procedure for Synthesis of Prestressed Systems," Com-
puters alld Structures, 20 (5), pp. 885-895,1985.
[32] Sobieszczanski-Sobieski, ,I., "Sensitivity of Complex, Illternally Coupled Sys-
tems," AIAA .TournaI, 28 (I), pp. 153-160,1990.
414
11 Optimum Design of Laminated Composite Structures
While laminated composite materials are attractive replacements for metallic materials for many structural applications that require high stiffness-to-weight and high strength-to-weight ratios, the analysis and design of these materials are considerably
more complex than those of metallic structures. One of the complexities in formu-
lating the analysis of a laminated composite material is due to material anisotropy
that requires an increased number of material constants for characterization of the
mechanical response of the laminate. The generalized Hooke's law for an anisotropic
material is given in terms of 21 independent stiffness coefficients. It is this aspect
of composite materials which makes them attractive for optimal design and tailoring
purposes. However, for a general structure with a three-dimensional stress state it is
very difficult to solve the governing equations. Fortunately, most composite structures
are plate-type structures which are composed of layers or plies of orthotropic material
which can be characterized in terms of a smaller number of stiffness constants. In
the following section, the basic equations that govern the mechanical response of an
orthotropic lamina are summarized.
For an orthotropic material with the axes of orthotropy 1-2 aligned with the x-y coordinate axes (θ = 0 in Fig. 11.1.1), the stress-strain relation in the principal material directions is given by the following set of equations with 9 independent constants

\begin{Bmatrix}\sigma_1\\ \sigma_2\\ \sigma_3\\ \tau_{23}\\ \tau_{31}\\ \tau_{12}\end{Bmatrix} =
\begin{bmatrix}C_{11} & C_{12} & C_{13} & 0 & 0 & 0\\
C_{12} & C_{22} & C_{23} & 0 & 0 & 0\\
C_{13} & C_{23} & C_{33} & 0 & 0 & 0\\
0 & 0 & 0 & C_{44} & 0 & 0\\
0 & 0 & 0 & 0 & C_{55} & 0\\
0 & 0 & 0 & 0 & 0 & C_{66}\end{bmatrix}
\begin{Bmatrix}\epsilon_1\\ \epsilon_2\\ \epsilon_3\\ \gamma_{23}\\ \gamma_{31}\\ \gamma_{12}\end{Bmatrix}. \qquad (11.1.1)
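The block structure of Eq. (11.1.1), with its 9 independent constants and no coupling between normal stresses and shear strains, can be illustrated with a short sketch (the numeric values below are placeholders, not material data):

```python
import numpy as np

def orthotropic_C(C11, C22, C33, C12, C13, C23, C44, C55, C66):
    """Assemble the 6x6 stiffness matrix of Eq. (11.1.1) for an
    orthotropic material in its principal directions: 9 independent
    constants, shear terms uncoupled from the normal terms."""
    C = np.zeros((6, 6))
    C[0, 0], C[1, 1], C[2, 2] = C11, C22, C33
    C[0, 1] = C[1, 0] = C12
    C[0, 2] = C[2, 0] = C13
    C[1, 2] = C[2, 1] = C23
    C[3, 3], C[4, 4], C[5, 5] = C44, C55, C66
    return C

C = orthotropic_C(11, 12, 13, 2, 3, 4, 5, 6, 7)  # placeholder values
assert np.allclose(C, C.T)        # the matrix is symmetric
assert np.count_nonzero(C) == 12  # 9 constants populate 12 entries
```

A fully anisotropic material would populate all 21 independent entries of the symmetric 6x6 matrix; the zero blocks above are what orthotropy buys.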
Furthermore, by assuming a plane stress state in each of the layers in the 1-2 principal
material plane, we have
Section 11.1: Mechanical Response of a Laminate
\begin{Bmatrix}\sigma_1\\ \sigma_2\\ \tau_{12}\end{Bmatrix} =
\begin{bmatrix}Q_{11} & Q_{12} & 0\\ Q_{12} & Q_{22} & 0\\ 0 & 0 & Q_{66}\end{bmatrix}
\begin{Bmatrix}\epsilon_1\\ \epsilon_2\\ \gamma_{12}\end{Bmatrix}, \qquad (11.1.3)

where the Q_{ij}'s are called the reduced stiffnesses and are given in terms of four independent engineering material constants in principal material directions as

Q_{11} = \frac{E_1}{1-\nu_{12}\nu_{21}}, \quad Q_{22} = \frac{E_2}{1-\nu_{12}\nu_{21}}, \quad Q_{12} = \frac{\nu_{12}E_2}{1-\nu_{12}\nu_{21}}, \quad Q_{66} = G_{12}. \qquad (11.1.4)

When the principal material directions are rotated by an angle θ from the laminate x-y axes, the stress-strain relation keeps the same form with transformed reduced stiffnesses \bar Q_{ij}, Eq. (11.1.5), given by
\bar Q_{11} = Q_{11}\cos^4\theta + 2(Q_{12} + 2Q_{66})\sin^2\theta\cos^2\theta + Q_{22}\sin^4\theta,
\bar Q_{12} = (Q_{11} + Q_{22} - 4Q_{66})\sin^2\theta\cos^2\theta + Q_{12}(\sin^4\theta + \cos^4\theta),
\bar Q_{22} = Q_{11}\sin^4\theta + 2(Q_{12} + 2Q_{66})\sin^2\theta\cos^2\theta + Q_{22}\cos^4\theta,
\bar Q_{16} = (Q_{11} - Q_{12} - 2Q_{66})\sin\theta\cos^3\theta + (Q_{12} - Q_{22} + 2Q_{66})\sin^3\theta\cos\theta, \qquad (11.1.6)
\bar Q_{26} = (Q_{11} - Q_{12} - 2Q_{66})\sin^3\theta\cos\theta + (Q_{12} - Q_{22} + 2Q_{66})\sin\theta\cos^3\theta,
\bar Q_{66} = (Q_{11} + Q_{22} - 2Q_{12} - 2Q_{66})\sin^2\theta\cos^2\theta + Q_{66}(\sin^4\theta + \cos^4\theta).
Equations (11.1.6) are the basic building blocks of the classical lamination theory
which will be discussed next. These equations, however, can be put into a simpler
form in terms of the angular orientation of the principal axis of orthotropy with
respect to the reference x-y coordinate system. Tsai and Pagano [2] defined the following material properties that are invariant with respect to ply orientation
U_1 = \tfrac18(3Q_{11} + 3Q_{22} + 2Q_{12} + 4Q_{66}),
U_2 = \tfrac12(Q_{11} - Q_{22}),
U_3 = \tfrac18(Q_{11} + Q_{22} - 2Q_{12} - 4Q_{66}), \qquad (11.1.7)
U_4 = \tfrac18(Q_{11} + Q_{22} + 6Q_{12} - 4Q_{66}),
U_5 = \tfrac18(Q_{11} + Q_{22} - 2Q_{12} + 4Q_{66}).
Using various trigonometric identities, we can write the transformed reduced stiff-
nesses of Eq. (11.1.6) as
\bar Q_{11} = U_1 + U_2\cos 2\theta + U_3\cos 4\theta,
\bar Q_{12} = U_4 - U_3\cos 4\theta,
\bar Q_{22} = U_1 - U_2\cos 2\theta + U_3\cos 4\theta,
\bar Q_{16} = -\tfrac12 U_2\sin 2\theta - U_3\sin 4\theta, \qquad (11.1.8)
\bar Q_{26} = -\tfrac12 U_2\sin 2\theta + U_3\sin 4\theta,
\bar Q_{66} = U_5 - U_3\cos 4\theta.
Figure 11.1.2 Laminate stacking convention.
Classical lamination theory (CLT) assumes that the N orthotropic layers described above are perfectly bonded together, as in Fig. 11.1.2, with a non-shear-deformable, infinitely thin bondline. Kirchhoff plate theory is used, which assumes a linear through-the-thickness variation of the in-plane displacements,

u = u_0 - z\frac{\partial w_0}{\partial x}, \qquad v = v_0 - z\frac{\partial w_0}{\partial y}, \qquad (11.1.9)

so that the strains vary linearly through the thickness,

\begin{Bmatrix}\epsilon_x\\ \epsilon_y\\ \gamma_{xy}\end{Bmatrix} =
\begin{Bmatrix}\epsilon_x^0\\ \epsilon_y^0\\ \gamma_{xy}^0\end{Bmatrix} +
z\begin{Bmatrix}\kappa_x\\ \kappa_y\\ \kappa_{xy}\end{Bmatrix}, \qquad (11.1.10)
where the superscript 0 indicates the mid-plane strains, and the curvatures κ are the mid-plane curvatures. Therefore, the stresses in the kth ply can be expressed in terms of the reduced stiffnesses of that particular ply by substituting Eq. (11.1.10) into the stress-strain relationship, Eq. (11.1.5),

\begin{Bmatrix}\sigma_x\\ \sigma_y\\ \tau_{xy}\end{Bmatrix}_k =
\begin{bmatrix}\bar Q_{11} & \bar Q_{12} & \bar Q_{16}\\ \bar Q_{12} & \bar Q_{22} & \bar Q_{26}\\ \bar Q_{16} & \bar Q_{26} & \bar Q_{66}\end{bmatrix}_k
\left(\begin{Bmatrix}\epsilon_x^0\\ \epsilon_y^0\\ \gamma_{xy}^0\end{Bmatrix} +
z\begin{Bmatrix}\kappa_x\\ \kappa_y\\ \kappa_{xy}\end{Bmatrix}\right). \qquad (11.1.11)
[Figure 11.1.3: stress and moment resultants, N_x, N_xy, acting on a laminate element]
The net stress resultant and moment resultant (stress couple) per unit length of
the cross section acting at a point in the laminate, see Fig. 11.1.3, are obtained by
through-the-thickness integration of the stresses in each ply,
\begin{Bmatrix}N_x\\ N_y\\ N_{xy}\end{Bmatrix} =
\int_{-h/2}^{h/2}\begin{Bmatrix}\sigma_x\\ \sigma_y\\ \tau_{xy}\end{Bmatrix}dz =
\sum_{k=1}^{N}\int_{z_{k-1}}^{z_k}\begin{Bmatrix}\sigma_x\\ \sigma_y\\ \tau_{xy}\end{Bmatrix}_k dz, \qquad (11.1.12)

and

\begin{Bmatrix}M_x\\ M_y\\ M_{xy}\end{Bmatrix} =
\int_{-h/2}^{h/2}\begin{Bmatrix}\sigma_x\\ \sigma_y\\ \tau_{xy}\end{Bmatrix}z\,dz =
\sum_{k=1}^{N}\int_{z_{k-1}}^{z_k}\begin{Bmatrix}\sigma_x\\ \sigma_y\\ \tau_{xy}\end{Bmatrix}_k z\,dz. \qquad (11.1.13)
Substituting Eq. (11.1.11) into Eqs. (11.1.12) and (11.1.13) and carrying out the through-the-thickness integration yields

\{N\} = [A]\{\epsilon^0\} + [B]\{\kappa\}, \qquad (11.1.14)
\{M\} = [B]\{\epsilon^0\} + [D]\{\kappa\}, \qquad (11.1.15)

where

A_{ij} = \sum_{k=1}^{N}(\bar Q_{ij})_k\,(z_k - z_{k-1}), \qquad (11.1.16)
B_{ij} = \frac12\sum_{k=1}^{N}(\bar Q_{ij})_k\,(z_k^2 - z_{k-1}^2), \qquad (11.1.17)
D_{ij} = \frac13\sum_{k=1}^{N}(\bar Q_{ij})_k\,(z_k^3 - z_{k-1}^3). \qquad (11.1.18)
The A and D matrices are the extensional and flexural stiffness matrices, respec-
tively. The A matrix relates the in-plane stress resultants to the mid-plane strains,
and the D matrix relates the moment resultants to the curvatures. The B matrix,
on the other hand, relates the in-plane stress resultants to the curvatures and mo-
ment resultants to the mid-plane strains, and hence is called the bending-extension
coupling matrix. This coupling matrix can be a useful tool in designing laminates
for certain structural applications. If it is undesirable, the B matrix can be avoided
by a symmetric placement of the plies with different orientations with respect to the
mid-plane of a laminate. However, as noted by Caprino and Crivelli-Visconti [3] and by Gunnink [4], symmetry is a sufficient but not a necessary condition to avoid coupling. It is shown by Kandil and Verchery [5] that a certain class of laminates, such as laminates consisting of two symmetric sub-laminates with equal numbers of plies and equal but arbitrary fiber orientations, θ1 and θ2, for which the minimum number of layers is eight [θ1/θ2/θ2/θ1/θ2/θ1/θ1/θ2], possesses no bending-extension coupling. This may be important for design optimization purposes because symmetric placement of the plies may restrict certain combinations of the in-plane and bending stiffnesses.
In addition to the bending-extension coupling, certain elements of the A, B, and D matrices result in coupling response. When the A16 and A26 terms are not zero, there is a shear-extension coupling. The existence of D16 and D26 terms induces bending-twisting coupling, and bending-shear coupling as well as extension-twisting coupling results from non-zero B16 and B26 terms. Again, by proper selection of the laminate, these coupling terms can be eliminated. For example, by using negative angle plies for every positive angle ply used in the laminate one can eliminate the shear-extension coupling. Such laminates are referred to as balanced laminates. However, these same terms can also be manipulated to tailor the response of a laminate to the needs of a specified design application, as in the case of aeroelastic tailoring (see Section 11.4.2).
The A, B, and D matrices are commonly used in the literature in the form defined in Eqs. (11.1.16)-(11.1.18) together with the definitions of the \bar Q_{ij} given by Eq. (11.1.6). However, for some design procedures, the use of sines and cosines of multiple angles (see Eqs. 11.1.8) proved to be more useful, especially for derivation
of the sensitivities of these matrices with respect to the angular orientation design
variables. Starting with the integral form of Eqs. (11.1.16)-(11.1.18), for example,

\{A_{11}, B_{11}, D_{11}\} = \int_{-h/2}^{h/2}\bar Q_{11}\,\{1, z, z^2\}\,dz, \qquad (11.1.19)

and assuming each layer to be of the same material, we have

\{A_{11}, B_{11}, D_{11}\} = U_1\{h, 0, \tfrac{h^3}{12}\} + U_2\int_{-h/2}^{h/2}\cos 2\theta\,\{1, z, z^2\}\,dz + U_3\int_{-h/2}^{h/2}\cos 4\theta\,\{1, z, z^2\}\,dz. \qquad (11.1.20)
Similar expressions can be found for the other stiffness terms, and are summarized
in Table 11.1.1 where the expressions for the V's are the following
Table 11.1.1: A, B, D Matrices in Terms of Lamina Invariants

                     V0{A,B,D}   V1{A,B,D}   V2{A,B,D}   V3{A,B,D}   V4{A,B,D}
{A11, B11, D11}      U1          U2          0           U3          0
{A22, B22, D22}      U1          -U2         0           U3          0
{A12, B12, D12}      U4          0           0           -U3         0
{A66, B66, D66}      U5          0           0           -U3         0
2{A16, B16, D16}     0           0           -U2         0           -2U3
2{A26, B26, D26}     0           0           -U2         0           2U3
V_{0\{A,B,D\}} = \{h,\ 0,\ \tfrac{h^3}{12}\},
V_{1\{A,B,D\}} = \int_{-h/2}^{h/2}\cos 2\theta\,\{1, z, z^2\}\,dz,
V_{2\{A,B,D\}} = \int_{-h/2}^{h/2}\sin 2\theta\,\{1, z, z^2\}\,dz, \qquad (11.1.21)
V_{3\{A,B,D\}} = \int_{-h/2}^{h/2}\cos 4\theta\,\{1, z, z^2\}\,dz,
V_{4\{A,B,D\}} = \int_{-h/2}^{h/2}\sin 4\theta\,\{1, z, z^2\}\,dz.

The above set of integrals can again be replaced by summations.
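As a sketch of how Eqs. (11.1.4), (11.1.6), and (11.1.16)-(11.1.18) fit together, the A, B, and D matrices of a laminate can be accumulated ply by ply (the material constants below are typical graphite/epoxy values in MPa, assumed here for illustration):

```python
import numpy as np

def Qbar(E1, E2, G12, nu12, theta_deg):
    """Transformed reduced stiffness matrix, Eqs. (11.1.4) and (11.1.6)."""
    nu21 = nu12 * E2 / E1
    den = 1.0 - nu12 * nu21
    Q11, Q22, Q12, Q66 = E1 / den, E2 / den, nu12 * E2 / den, G12
    c, s = np.cos(np.radians(theta_deg)), np.sin(np.radians(theta_deg))
    Qb = np.empty((3, 3))
    Qb[0, 0] = Q11*c**4 + 2*(Q12 + 2*Q66)*s**2*c**2 + Q22*s**4
    Qb[1, 1] = Q11*s**4 + 2*(Q12 + 2*Q66)*s**2*c**2 + Q22*c**4
    Qb[0, 1] = Qb[1, 0] = (Q11 + Q22 - 4*Q66)*s**2*c**2 + Q12*(s**4 + c**4)
    Qb[2, 2] = (Q11 + Q22 - 2*Q12 - 2*Q66)*s**2*c**2 + Q66*(s**4 + c**4)
    Qb[0, 2] = Qb[2, 0] = (Q11 - Q12 - 2*Q66)*s*c**3 + (Q12 - Q22 + 2*Q66)*s**3*c
    Qb[1, 2] = Qb[2, 1] = (Q11 - Q12 - 2*Q66)*s**3*c + (Q12 - Q22 + 2*Q66)*s*c**3
    return Qb

def laminate_ABD(angles_deg, t_ply, E1, E2, G12, nu12):
    """A, B, D by the ply summations of Eqs. (11.1.16)-(11.1.18)."""
    N = len(angles_deg)
    h = N * t_ply
    z = np.linspace(-h/2, h/2, N + 1)  # ply interface coordinates z_k
    A, B, D = np.zeros((3, 3)), np.zeros((3, 3)), np.zeros((3, 3))
    for k, th in enumerate(angles_deg):
        Qb = Qbar(E1, E2, G12, nu12, th)
        A += Qb * (z[k+1] - z[k])
        B += Qb * (z[k+1]**2 - z[k]**2) / 2.0
        D += Qb * (z[k+1]**3 - z[k]**3) / 3.0
    return A, B, D

# balanced symmetric [0/+45/-45/90]s laminate: B should vanish
A, B, D = laminate_ABD([0, 45, -45, 90, 90, -45, 45, 0], 0.125,
                       181e3, 10.3e3, 7.17e3, 0.28)
assert np.allclose(B, 0.0, atol=1e-3)
```

The vanishing B matrix confirms numerically the symmetry argument made above.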
The laminate stiffness matrices described in the previous section can be manipulated
both by changing the number of layers and their orientations. Therefore, use of
these quantities as design variables enables us to change the material properties of a
laminate as well as its thickness. In many practical applications, bending-extension
and shear-extension coupling is undesirable. Consequently, most laminates in use
today are symmetric and balanced to eliminate these couplings. Balanced symmetric
laminates are also much easier to analyze. For example, analysis of a laminate with
bending-extension coupling is difficult because out-of-plane deformations associated
with in-plane loads may be large and, therefore, require nonlinear analysis capability.
Therefore, most of the optimization work to date has been limited to balanced symmetric
laminates. In the remainder of this chapter only such laminates are considered.
Most commercially available composite materials come in fixed ply thicknesses. Furthermore, most of the data available for laminate behavior is limited to ply orientations of 0-, 90-, and ±45-deg. For these reasons, laminate design is primarily an integer programming problem. However, most of the available optimization software is for continuous-valued design variables, and the past work on laminate optimization is based on the use of such variables. The total thicknesses of contiguous plies of the same orientation, referred to as the ply thickness variables, were commonly used as design variables. Ply orientations were also occasionally used as design variables, with orientations taking any value between 0- and 90-deg. The final ply thicknesses (or orientations) can be rounded off to integer multiples of the commercially available ply thickness (or conventional ply orientations). However, for a large number of design variables, finding a rounded-off design that does not violate any constraint is often difficult. Also, the problem must be formulated with a given stacking sequence, rather than letting the optimization obtain the best stacking sequence. For these reasons, there is a growing interest in the application of integer programming methods to laminate design. We start this chapter with a description of approaches that implement traditional continuous-valued variables, with integer programming applications described in Section 11.3.
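One simple conservative round-off rule is to round each continuous thickness up to a whole number of plies, so that the rounded design is never thinner than the continuous optimum; a minimal sketch (the ply thickness value is illustrative, and strength constraints should still be re-checked after rounding):

```python
import math

def round_up_to_plies(thicknesses, t_ply):
    """Round continuous thickness variables up to integer multiples of
    the available ply thickness. The small offset guards against values
    that are already exact multiples being pushed up a full ply."""
    return [math.ceil(t / t_ply - 1e-9) * t_ply for t in thicknesses]

rounded = round_up_to_plies([0.018793, 0.023048, 0.0], t_ply=0.005)
print(rounded)  # thicknesses rounded up to 4, 5, and 0 plies
```

Rounding up cannot violate lower limits on the membrane stiffnesses, but, as noted above, with many variables even a rounded design may sit far from the true integer optimum.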
There are a number of design considerations for optimization of laminated plates
depending on the intended application. One of the key considerations in terms of
analysis and design is whether the plate is designed for in-plane or out-of-plane re-
sponse. For the sake of simplicity we review these two cases separately.
11.2.1 Design of Laminates for In-plane Response
Ply Thickness Variables: One of the earliest efforts in designing laminates for in-plane strength and stiffness requirements is due to Schmit and Farshi [6], who considered a symmetric balanced laminate with fixed ply orientations. The thicknesses of the individual layers t_i, i = 1, ..., I, with different prescribed orientations were used as design variables. Because of the symmetric laminate restriction, only the thicknesses of one half, I, of the total number of layers, N, are used. The laminate is under the action of combined membrane force resultants, N_xk, N_yk, N_xyk, k = 1, ..., K, where K is the number of load cases.
Section 11.2: Laminate Design
The optimization problem is formulated as the following:
where P_j^{(i)}, Q_j^{(i)}, and R_j^{(i)} are coefficients which define the jth boundary of a failure envelope for each layer (i) in the strain space, and the ε_{1ik}, ε_{2ik}, and γ_{12ik} are the principal material-direction strains in the ith layer under the kth load condition. For a simple maximum strain criterion, which puts bounds on the maximum values of the strains in the principal material directions, the failure envelope has 6 facets with P, Q, and R defined as the inverse of the normal and shearing failure strains in the longitudinal and transverse directions to the fibers in tension and compression. Equations (11.2.3) prescribe lower limits A_{11l}, A_{22l}, and A_{66l} on the membrane stiffnesses of the laminate.
The approach used by Schmit and Farshi transforms the nonlinear programming
problem described in Eqs. (11.2.1)-(11.2.5) into a sequence of linear programs (see
section 6.1). The inequality constraint Eq. (11.2.2) representing the strength criterion
is a nonlinear function of the thickness variables and, therefore, is linearized as
~,
g,)kL
(t) = S(t)
9 0
+ ~(t
L...J 1
_ t ) (p(i)af 1ik
01 ) at + Q(i)af
) at + R(i)a/
2ik
) at
12ik )
' ( 11.2.6)
1=1 I I I
where the derivatives of the principal strains in the ith layer are related to the derivatives of the laminate strains through the transformation relations

\frac{\partial \mathbf e_{ik}}{\partial t_l} = \mathbf T_i\,\frac{\partial \mathbf e_k}{\partial t_l}, \qquad (11.2.7)

where \mathbf e_{ik} = (\epsilon_{1ik}, \epsilon_{2ik}, \gamma_{12ik})^T, and \mathbf T_i is the transformation matrix for the ith layer defined by

\mathbf T_i = \begin{bmatrix}
\cos^2\theta_i & \sin^2\theta_i & \cos\theta_i\sin\theta_i\\
\sin^2\theta_i & \cos^2\theta_i & -\cos\theta_i\sin\theta_i\\
-2\cos\theta_i\sin\theta_i & 2\cos\theta_i\sin\theta_i & \cos^2\theta_i - \sin^2\theta_i
\end{bmatrix}. \qquad (11.2.8)
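A quick numerical check of the transformation matrix of Eq. (11.2.8); a minimal sketch (the function name is ours, not from the text):

```python
import numpy as np

def T_strain(theta_deg):
    """Strain transformation matrix of Eq. (11.2.8), mapping laminate
    strains (eps_x, eps_y, gamma_xy) to principal material-direction
    strains (eps_1, eps_2, gamma_12) for a ply at angle theta."""
    c, s = np.cos(np.radians(theta_deg)), np.sin(np.radians(theta_deg))
    return np.array([[c*c,    s*s,    c*s],
                     [s*s,    c*c,   -c*s],
                     [-2*c*s, 2*c*s,  c*c - s*s]])

assert np.allclose(T_strain(0), np.eye(3))                   # 0-deg ply: identity
assert np.allclose(T_strain(30) @ T_strain(-30), np.eye(3))  # inverse = opposite rotation
```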
Since the A matrix is a linear function of the thickness variables (see Eq. 11.1.16), the derivative is simply equal to the transformed reduced stiffnesses of the ith layer,

\frac{\partial \mathbf A}{\partial t_i} = \bar{\mathbf Q}_i, \qquad (11.2.10)

so that the derivatives of the laminate strains are

\frac{\partial \mathbf e_k}{\partial t_l} = -\mathbf A^{-1}\bar{\mathbf Q}_l\,\mathbf e_k. \qquad (11.2.11)

Equation (11.2.6) together with (11.2.7) and (11.2.11) can be used to form the linear approximations at any stage of the design optimization.
In addition to the constraint approximation, Schmit and Farshi also used a con-
straint deletion technique by including only those constraints that are potentially
critical at each stage of the constraint approximations.
Table 11.2.1: Minimum weight laminates with stiffness constraints loaded in axial compression.

Layup [0/±45/90]s

Layer    Orientation    Initial Design    Final Design    Final Design    Number of Plies
Number   Angle (deg)    t_i (in.)         t_f (in.)       (%)             (rounded)
1        0              0.032281          0.018793        28.96           4
2        +45            0.032281          0.023048        35.52           6
3        -45            0.032281          0.023048        35.52           6
4        90             0.032281          0.000000        0               0
Sum t_i                 0.102583          0.059438
Results of optimal designs for various conventional laminates with 0-deg, ±45-deg, and 90-deg ply orientations under various combinations of in-plane normal and shear loads presented in Ref. [6] demonstrate the importance of the choice of laminate stacking sequence on the optimum design. For example, for a laminate under uniaxial stress and limits on shear stiffness, it does make a difference whether we select a [0/±45/90]s laminate or a [0/±45]s laminate, even though at the end of the design iterations the thickness of the 90-deg plies of the first laminate vanishes. Results for these laminates obtained from Ref. [6] are summarized in Table 11.2.1. The final design of the first laminate has a critical strength constraint for the 90-deg
ply. Compared to the second laminate, [0/±45]s, it is about 9% thicker due to an additional 0-deg ply required for the first laminate to prevent violation of the strength constraint in the 90-deg layers. In order to achieve a true optimal solution, therefore, the designer has to repeat the optimization process with different laminate definitions, especially by removing the layer(s) that converge to their lower bounds. However, the fact that a layer assumes a value different from its lower bound may not mean that the particular layer is essential for the optimal design. That is, it is quite possible that once a layer with a thickness different from its lower bound is removed, the optimization procedure can resize the remaining layers to achieve a weight lower than the one achieved before. This can make the design procedure difficult, because of the need to try all possible combinations of preselected angles. However, for most practical applications the presence of plies with fibers running in prescribed directions (such as fibers transverse to the load direction) is desirable. Therefore, lower limits which are generally different from zero are imposed, and ply removal is not an option. Multiple load conditions also tend to produce designs where ply removal may not be possible.
Ply Orientation Variables: In order to find the laminate stacking sequence which is best suited to the load condition under consideration, the ply orientations of the laminate as well as the ply thicknesses need to be used as design variables. Indeed many design codes treat both as design variables. In order to demonstrate the use of ply orientations as design variables, however, we concentrate on examples with only ply orientation variables. For optimization problems formulated as minimum-weight designs, the objective function is independent of the ply orientations. This might cause difficulties in converging to an optimum solution with some optimization algorithms. An alternative to the weight objective function minimization is the maximization of the laminate strength, as demonstrated by Park [7] and Massard [8].
A quadratic first-ply failure (FPF) criterion based on an approximate failure envelope in the strain space [9] is used by Park [7] for laminates under various in-plane loading conditions (N_x, N_y, N_xy). This approximate failure envelope is given by

\epsilon_x^2 + \epsilon_y^2 + \tfrac12\gamma_{xy}^2 = b_0^2, \qquad (11.2.12)

where b_0 is defined solely in terms of the stiffness and strength properties in the principal material directions. The objective function to be minimized is defined as

f = \epsilon_x^2 + \epsilon_y^2 + \tfrac12\gamma_{xy}^2, \qquad (11.2.13)

which represents the square of the norm of the strain vector. The smaller the objective function value, the larger the loads that can be applied to the laminate before the failure envelope is violated and, therefore, the stronger the laminate in FPF. One key feature of this approximate strain failure envelope is that it applies to laminate strains and does not require ply-level strain calculations. Only balanced symmetric laminates are considered in reference 7, and six different laminates were studied, five of which were the following conventional layups: [-θ, +θ]s, [-θ, 0, +θ]s, [-θ, 90, +θ]s, [-θ, 0, 90, +θ]s, and [-θ, -45, +45, +θ]s. The sixth laminate was called a continuous laminate, and was assumed to have fiber orientation changing linearly from the top surface to the mid-plane of the laminate, covering a range from -θ to +θ deg.
Results in [7] showed that under combined loading the best laminate, according to the FPF criterion, for large longitudinal loading without shear is the [-θ, 0, +θ]s type, and for large shear loading without the longitudinal load, the best is the [-θ, -45, +45, +θ]s laminate. The optimum angle for the [-θ, 0, +θ]s laminate depended on the magnitude of the transverse load N_y, and was equal to 0-deg for N_y = 0. As the transverse load is increased, the optimum angle reached 45-deg for N_y = N_x/2, and was equal to 60-deg for N_y = N_x. Similarly, for the [-θ, -45, +45, +θ]s laminate (with shear loading and no axial loading), the optimal angle was 45-deg for N_y = 0. As the transverse load N_y increased, the optimal angle increased and reached a value of about 73-deg for N_y = N_xy. The continuous laminate proved to have the best overall performance under combined longitudinal and shear loadings, with a range of ±65-deg for N_y = 0.
The above results were intuitively appealing in that the fibers were mostly placed in a direction parallel to the applied loads. But such intuition may not always lead to optimal designs when working with composite materials. Consider, for example, using Hill's yield stress criterion interpreted for composite materials by Tsai [10] for the strength prediction of a unidirectional composite,

\left(\frac{\sigma_1}{X}\right)^2 - \frac{\sigma_1\sigma_2}{X^2} + \left(\frac{\sigma_2}{Y}\right)^2 + \left(\frac{\tau_{12}}{S}\right)^2 = 1. \qquad (11.2.14)

The quantities X, Y are the normal strengths in directions parallel and transverse to the fibers, and S is the shear strength of a ply. Brandmaier showed [11] that if the transverse normal strength Y is less than the shear strength S, optimal placement of the fibers is not along the principal stress directions, but depends on the values of the strength quantities as well as the applied stresses. This can be demonstrated (see Exercise 1) by expressing the principal stresses in terms of the applied stresses σ_x, σ_y, τ_xy, and the fiber orientation θ, and equating the derivative of Eq. (11.2.14) with respect to the fiber orientation to zero.
A Graphical Tool for Optimum Design: A graphical procedure introduced by Miki [12, 13] for the design of laminates with prescribed in-plane stiffness properties is a highly practical tool for design optimization. The procedure is suitable for multiple balanced angle-ply laminates of the type [(±θ_I)_{N_I}/(±θ_{I-1})_{N_{I-1}}/ ... /(±θ_1)_{N_1}]s, where the total number of plies in the laminate is N = 2 Σ_{i=1}^{I} N_i. In addition to the balanced angle-ply sub-laminates, one unidirectional lamina with principal material axes aligned with the axes of the laminate can be included in the stacking sequence.
The major effort of this design procedure is the construction of a lamination parameter diagram which describes the allowable region of the lamination parameters V_1^* and V_3^*. These parameters are obtained by normalizing the in-plane components V_{1A} and V_{3A} of Eq. (11.1.21) by the total laminate thickness. For a laminate of total thickness h, in which the volume fraction of the plies with ±θ_k orientation angles is v_k, the lamination parameters are given as

V_1^* = \frac1h V_{1A} = \sum_{k=1}^{I} v_k\cos 2\theta_k, \quad\text{and}\quad V_3^* = \frac1h V_{3A} = \sum_{k=1}^{I} v_k\cos 4\theta_k, \qquad (11.2.15)
where

v_i = \frac{2(z_i - z_{i-1})}{h}, \quad\text{and}\quad \sum_{i=1}^{I} v_i = 1. \qquad (11.2.16)
Because of the normalization, the values of the lamination parameters are always bounded, -1 ≤ V_1^*, V_3^* ≤ 1. For a laminate with only one fiber orientation angle, the lamination parameters are

V_1^* = \cos 2\theta, \quad\text{and}\quad V_3^* = \cos 4\theta, \qquad (11.2.17)

which satisfy

V_3^* = 2V_1^{*2} - 1. \qquad (11.2.18)

The effective engineering constants of the laminate are given by

E_x = \frac{A_{11}A_{22} - A_{12}^2}{h\,A_{22}}, \quad E_y = \frac{A_{11}A_{22} - A_{12}^2}{h\,A_{11}}, \quad G_{xy} = \frac{A_{66}}{h}, \quad \nu_{xy} = \frac{A_{12}}{A_{22}}, \qquad (11.2.19)

where the elements of the extensional stiffness matrix of the laminate are determined from the following equations from Table 11.1.1

\frac{A_{11}}{h} = U_1 + U_2V_1^* + U_3V_3^*, \quad \frac{A_{22}}{h} = U_1 - U_2V_1^* + U_3V_3^*, \quad \frac{A_{12}}{h} = U_4 - U_3V_3^*, \quad \frac{A_{66}}{h} = U_5 - U_3V_3^*, \qquad (11.2.20)
and where the Ui are the orientation-invariant material properties, Eq. (11.1.7).
If the laminate consists of two or more fiber orientations, then it is shown by Miki [12] that Eq. (11.2.18) becomes an inequality

V_3^* \geq 2V_1^{*2} - 1. \qquad (11.2.21)
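Equations (11.2.15), (11.2.16), and (11.2.21) are easy to exercise numerically; a minimal sketch (function names are ours, not from the text):

```python
import numpy as np

def lamination_parameters(angles_deg, vol_fracs):
    """In-plane lamination parameters V1*, V3* of Eq. (11.2.15)."""
    th = np.radians(np.asarray(angles_deg))
    v = np.asarray(vol_fracs)
    assert abs(v.sum() - 1.0) < 1e-12  # Eq. (11.2.16)
    return float(v @ np.cos(2*th)), float(v @ np.cos(4*th))

def feasible(V1, V3, tol=1e-9):
    """Miki's feasible region: Eq. (11.2.21) plus the unit bounds."""
    return abs(V1) <= 1 + tol and V3 <= 1 + tol and V3 >= 2*V1**2 - 1 - tol

# single-angle laminates sit exactly on the boundary curve
for th in (0.0, 30.0, 60.0, 90.0):
    V1, V3 = lamination_parameters([th], [1.0])
    assert abs(V3 - (2*V1**2 - 1)) < 1e-9

# an equal mix of 0- and 90-deg plies falls strictly inside the region
V1, V3 = lamination_parameters([0.0, 90.0], [0.5, 0.5])
assert feasible(V1, V3) and V3 > 2*V1**2 - 1
```

The convexity of the boundary parabola is what guarantees that any mixture of orientations lands on or above it.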
The allowable region of the lamination parameters is the area bounded by the curve ABC in Fig. 11.2.1, independent of the number of different ply orientations. Any point inside the lamination parameter diagram, therefore, corresponds to laminates with two or more fiber orientations. Because a point is defined by two parameters, this means that only two orientation angles θ1 and θ2 are sufficient for designing laminates for prescribed stiffness requirements. For balanced angle-ply laminates with more than two orientations, there will be many combinations of the ply orientations that will produce the same lamination parameters and, therefore, the same stiffness properties. Each point inside the design space is called a lamination point, and corresponds to a laminate with specific stiffness properties. It is also possible to restrict permissible values of the various effective engineering stiffnesses (E_x, E_y, G_xy, and ν_xy) graphically. This is achieved by introducing contours of constant effective engineering stiffnesses, obtained from Eqs. (11.2.19) and (11.2.20), for each of the engineering constants
E_x contours: (U_1 + U_2V_1^* + U_3V_3^*)(U_1 - U_2V_1^* + U_3V_3^*) - (U_4 - U_3V_3^*)^2 = E_x(U_1 - U_2V_1^* + U_3V_3^*), \qquad (11.2.22)

E_y contours: (U_1 + U_2V_1^* + U_3V_3^*)(U_1 - U_2V_1^* + U_3V_3^*) - (U_4 - U_3V_3^*)^2 = E_y(U_1 + U_2V_1^* + U_3V_3^*), \qquad (11.2.23)

\nu_{xy} contours: V_3^* = \frac{\nu_{xy}U_2V_1^* - \nu_{xy}U_1 + U_4}{(1+\nu_{xy})U_3}, \qquad (11.2.24)

G_{xy} contours: V_3^* = \frac{U_5 - G_{xy}}{U_3}. \qquad (11.2.25)
[Figure 11.2.1: Lamination parameter diagram]

The choice of lamination angle [±θ]s that maximizes the effective Poisson's ratio is not straightforward, and is a function of the lamina properties via Eqs. (11.1.7) and (11.1.4). For example, for T300/5208 graphite/epoxy and Scotchply 1002 glass/epoxy materials the laminates that produce the maximum Poisson's ratio are [±25]s and [±31]s, respectively.
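The maximum-Poisson's-ratio result quoted above can be reproduced by a direct scan over the lamination angle (material constants are typical T300/5208 graphite/epoxy values, assumed here; for a balanced symmetric [±θ]s laminate with A16 = A26 = 0, ν_xy = A12/A22):

```python
import numpy as np

def nu_xy(theta_deg, E1, E2, G12, nu12):
    """Effective Poisson's ratio of a [±θ]s laminate, nu_xy = A12/A22."""
    nu21 = nu12 * E2 / E1
    den = 1.0 - nu12 * nu21
    Q11, Q22, Q12, Q66 = E1/den, E2/den, nu12*E2/den, G12
    c, s = np.cos(np.radians(theta_deg)), np.sin(np.radians(theta_deg))
    Qb12 = (Q11 + Q22 - 4*Q66)*s**2*c**2 + Q12*(s**4 + c**4)
    Qb22 = Q11*s**4 + 2*(Q12 + 2*Q66)*s**2*c**2 + Q22*c**4
    return Qb12 / Qb22  # the laminate thickness h cancels in the ratio

thetas = np.arange(0.0, 90.5, 0.5)
nus = [nu_xy(t, 181.0, 10.3, 7.17, 0.28) for t in thetas]  # moduli in GPa
theta_opt = thetas[int(np.argmax(nus))]
print(theta_opt)  # near the [±25]s laminate quoted in the text
```

Notably, the effective Poisson's ratio of such angle-ply laminates can exceed 1, well above typical isotropic values.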
For design problems where one or more of the effective engineering constants are
constrained, appropriate contours can be superimposed to identify the feasible design
space and the lamination point that maximizes (or minimizes) the desired stiffness
property (see Exercise 2).
Ply Thickness Variables: For rectangular laminated plates under in-plane compressive loads, the strength constraint becomes unimportant if the size of the plate is large compared to the thickness. For such plates, elastic stability and vibration, which are governed by the flexural rigidities of the plate, must be considered. One of the earliest studies that included the elastic stability constraint during the optimal design of composite plates is by Schmit and Farshi [14].
[Figure: rectangular plate of width b under in-plane loads N_x and N_xy]
where N_x, N_y, and N_xy are equal to the applied design loads. Substituting Eq. (11.2.27) into the equilibrium equation and applying Galerkin's method leads to an eigenvalue problem of the form

\mathbf K\mathbf w = \lambda\mathbf K_G\mathbf w, \qquad (11.2.29)

where the eigenvector is composed of the unknown coefficients of the displacement function, \mathbf w = \{W_{11} \ldots W_{1N}\ W_{21} \ldots W_{2N}\ \ldots\ W_{MN}\}^T. The elements of the matrices K and K_G are given by Eqs. (11.2.30) and (11.2.31), with m, p = 1, ..., M and n, q = 1, ..., N.
The critical load factor λ_cr follows from Eq. (11.2.33), where m and n are the number of half waves in the x and y directions, respectively, that minimize λ_cr.
As they did for the strength constraint, Eq. (11.2.2), Schmit and Farshi used a linear approximation for the buckling constraints in the form

g_b(\mathbf t) = 1 - \lambda_b(\mathbf t_0) - \sum_{i=1}^{I}(t_i - t_{0i})\left.\frac{\partial\lambda_b}{\partial t_i}\right|_{\mathbf t=\mathbf t_0}. \qquad (11.2.34)
Noting from Eqs. (11.2.32) that the matrix K_G is independent of the design variables, and using Eq. 7.3.5, we can show that the derivatives of the kth buckling load factor are given by

\frac{\partial\lambda_k}{\partial t_i} = \frac{\mathbf w_k^T\,(\partial\mathbf K/\partial t_i)\,\mathbf w_k}{\mathbf w_k^T\,\mathbf K_G\,\mathbf w_k}. \qquad (11.2.35)
Since the matrix K is a function of the flexural stiffnesses, an explicit expression for the derivatives of K with respect to the design variables can be written as

\frac{\partial k_{pq}}{\partial t_i} = \frac{ab}{4}\,\delta_{mp}\,\delta_{nq}\,\frac{\partial\theta_{mn}}{\partial t_i}, \qquad (11.2.36)

where

\frac{\partial\theta_{mn}}{\partial t_i} = \pi^4\left[\frac{\partial D_{11}}{\partial t_i}\left(\frac{m}{a}\right)^4 + 2\left(\frac{\partial D_{12}}{\partial t_i} + 2\frac{\partial D_{66}}{\partial t_i}\right)\left(\frac{m}{a}\right)^2\left(\frac{n}{b}\right)^2 + \frac{\partial D_{22}}{\partial t_i}\left(\frac{n}{b}\right)^4\right]. \qquad (11.2.37)
The partial derivatives of the flexural stiffnesses can be related to the partial derivatives of the in-plane stiffness matrix A. For a quasi-homogeneous laminate in which the bending-twisting coupling terms, D16 and D26, are ignored (these terms vanish as the number of ply groups increases), the in-plane and flexural moduli are related by (see page 204 of Ref. 9)

D_{ij} = \frac{h^2}{12}A_{ij}, \qquad (11.2.38)
where h is the laminate thickness. The partial derivatives of the flexural stiffnesses are, therefore, given by

\frac{\partial D_{rs}}{\partial t_i} = \frac{1}{12}\left[\frac{\partial A_{rs}}{\partial t_i}h^2 + 2A_{rs}h\right], \quad r,s = 1,2,6, \qquad (11.2.39)
Miki shows [15] that a relation of the same form as Eq. (11.2.21) is obtained for the flexural lamination parameters,

W_3^* \geq 2W_1^{*2} - 1. \qquad (11.2.42)
Therefore, any balanced symmetric angle-ply laminate with multiple orientations can be represented as a point in a region bounded by

W_3^* = 2W_1^{*2} - 1, \qquad (11.2.43)

where the designs on the boundaries correspond to designs with only one lamination angle, [±θ]s, and

W_1^* = \cos 2\theta, \quad\text{and}\quad W_3^* = \cos 4\theta. \qquad (11.2.44)
The diagram for the flexural lamination parameters can be used for designing laminates for maximum buckling load under uniaxial and biaxial loads. For prescribed values of m and n, and a fixed ratio of applied transverse load to axial load, it can be shown, by manipulating Eq. (11.2.33), that the contours of the critical load parameter λ_cr are straight lines in the flexural lamination diagram. However, a difficulty in using the flexural lamination parameters in designing laminates with maximum buckling load is that m and n are seldom known a priori. Since these two numbers depend on the design variables, as well as the plate aspect ratio and the applied loads, it is not always possible to predict them accurately. For further discussion of the use of the flexural lamination parameter diagram for buckling maximization see Ref. [15]. Also, the following analytical discussion of the use of ply orientation variables for the buckling problem explains the role of m and n.
Ply Orientation Variables: A number of researchers carried out analytical investigations of the optimization of various flexural response quantities such as vibration frequency [16-18], structural compliance [19], and buckling response [20] of simply supported laminated plates. For a plate with length a and width b, Pedersen [20] defined a parameter φ which is proportional to the square of the natural frequency and the buckling load, and inversely proportional to the out-of-plane displacements. The quantity φ, composed of a linear combination of the non-dimensional bending stiffnesses d_ij (i,j = 1,2,6), is defined as

\phi = d_{11} + 2\eta^2(d_{12} + 2d_{66}) + \eta^4 d_{22}, \qquad (11.2.45)
where η is a mode parameter defined as the ratio of the longitudinal and transverse half-wave lengths by

\eta = \frac{na}{mb}, \qquad (11.2.46)
with m and n being the modal half-wave numbers in the x and y directions, respectively (see Eq. 11.2.27). The non-dimensional bending stiffnesses, d_ij, are defined in terms of the flexural stiffnesses by Eq. (11.2.47).
For a laminate with fixed ply thicknesses, the maximization of the buckling load or the natural frequency, or minimization of the displacements, is achieved by obtaining the stationary value of φ with respect to the ply orientations. That is,

\frac{\partial\phi}{\partial\theta} = \frac{\partial d_{11}}{\partial\theta} + 2\eta^2\left(\frac{\partial d_{12}}{\partial\theta} + 2\frac{\partial d_{66}}{\partial\theta}\right) + \eta^4\frac{\partial d_{22}}{\partial\theta} = 0. \qquad (11.2.48)
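The role of the mode parameter η in Eqs. (11.2.45)-(11.2.48) can be seen from a direct grid search over θ for a single-angle laminate (the reduced stiffnesses below are illustrative graphite/epoxy values, not from the text; the d_ij are taken proportional to the transformed reduced stiffnesses of Eq. (11.1.8), which is enough to locate the optimum angle):

```python
import numpy as np

# illustrative graphite/epoxy reduced stiffnesses (GPa), not from the text
Q11, Q22, Q12, Q66 = 181.8, 10.35, 2.90, 7.17

def phi(theta_deg, eta):
    """phi = d11 + 2*eta^2*(d12 + 2*d66) + eta^4*d22, Eq. (11.2.45),
    with d_ij proportional to Qbar_ij(theta) for a single-angle laminate."""
    c, s = np.cos(np.radians(theta_deg)), np.sin(np.radians(theta_deg))
    Qb11 = Q11*c**4 + 2*(Q12 + 2*Q66)*s**2*c**2 + Q22*s**4
    Qb22 = Q11*s**4 + 2*(Q12 + 2*Q66)*s**2*c**2 + Q22*c**4
    Qb12 = (Q11 + Q22 - 4*Q66)*s**2*c**2 + Q12*(s**4 + c**4)
    Qb66 = (Q11 + Q22 - 2*Q12 - 2*Q66)*s**2*c**2 + Q66*(s**4 + c**4)
    return Qb11 + 2*eta**2*(Qb12 + 2*Qb66) + eta**4*Qb22

thetas = np.arange(0.0, 90.5, 0.5)
for eta in (0.2, 1.0, 5.0):
    best = thetas[int(np.argmax([phi(t, eta) for t in thetas]))]
    print(eta, best)  # 0-deg for small eta, 45-deg at eta = 1, 90-deg for large eta
```

The jump of the optimum between 0, an intermediate angle, and 90 deg as η grows is exactly the behavior Eq. (11.2.53) delimits.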
Restricting the laminate to be a balanced, symmetric angle-ply laminate and ignoring the bending-twisting coupling terms, we can put the bending stiffness matrix from Table 11.1.1 into a summation form

\begin{Bmatrix} d_{11}\\ d_{22}\\ d_{12}\\ d_{66}\end{Bmatrix} =
\begin{bmatrix} U_1 & U_2 & U_3\\ U_1 & -U_2 & U_3\\ U_4 & 0 & -U_3\\ U_5 & 0 & -U_3\end{bmatrix}
\begin{Bmatrix} 1\\ W_1^*\\ W_3^*\end{Bmatrix}, \qquad (11.2.49)

where W_1^* and W_3^* are defined by Eq. (11.2.40). Using Eqs. (11.2.40), (11.2.48), and (11.2.49), we have the stationarity condition, Eq. (11.2.50).
The existence of multiple values of the fiber orientation that yield stationary values
for the quantity φ indicates local optima. The first two roots are independent of the
material properties and the geometry. The solution in Eq. (11.2.52), on the other
hand, contains the material properties and the mode parameter η, and is valid in a
range (see Muc [21]) η²_min < η² < η²_max, where

η²_min,max = [6 ∓ √(36 + 4((U2/4U3)² − 1))] / [2(1 ∓ U2/4U3)] ,   (11.2.53)

the two limits being reached when θ reaches 0 and 90-deg, respectively.
The optimal values of the fiber angles for two different values of the ratio U2/4U3
are presented in Figure 11.2.4, computed from Eq. (11.2.52). The range of U2/4U3
values used in the figure practically covers many commercially available composites,
including Graphite-Epoxy, Boron-Epoxy, Glass-Epoxy, and Aramid-Epoxy. Clearly
the optimal fiber orientation is insensitive to the material properties, but strongly
influenced by the mode shape parameter. For small or large values of the mode
parameter η the optimal orientation is either θ_k = 0-deg or θ_k = 90-deg, and the
optimal orientation is independent of the position of the layer in the laminate.
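The trends just described can be checked numerically. The sketch below evaluates φ for a laminate with all fibers at one angle and locates the maximizing ply angle by grid search; the graphite/epoxy-like invariants U1–U5 are illustrative assumed values, not data from the text.

```python
import math

# Representative graphite/epoxy-like material invariants in GPa; these
# particular numbers are illustrative assumptions, not values from the text.
U1, U2, U3, U4, U5 = 76.37, 85.73, 19.71, 22.61, 26.88

def phi(theta_deg, eta):
    """phi = d11 + 2*eta**2*(d12 + 2*d66) + eta**4*d22 for a laminate with
    all fibers at angle theta; the d_ij are proportional to the invariant
    expressions below."""
    t = math.radians(theta_deg)
    c2, c4 = math.cos(2 * t), math.cos(4 * t)
    d11 = U1 + U2 * c2 + U3 * c4
    d22 = U1 - U2 * c2 + U3 * c4
    d12 = U4 - U3 * c4
    d66 = U5 - U3 * c4
    return d11 + 2 * eta ** 2 * (d12 + 2 * d66) + eta ** 4 * d22

def theta_opt(eta):
    """Ply angle in [0, 90] deg maximizing phi, by 0.1-deg grid search."""
    return max((phi(0.1 * i, eta), 0.1 * i) for i in range(901))[1]
```

With these data, `theta_opt(0.2)` returns 0-deg and `theta_opt(2.0)` returns 90-deg, while `theta_opt(1.0)` returns 45-deg, matching the small-η, large-η, and η = 1 behavior described above.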
The influence of the mode parameter η on the value of the optimal fiber orientation
needs to be investigated further. The minimum value of φ which corresponds to the
buckling mode shape with lowest buckling load is obtained for transverse wavelength
parameter n = 1, but it is not always clear what value of the longitudinal wavelength
parameter m leads to the lowest value of the parameter ¢. For plate aspect ratios
r = a/b less than a critical value (r_cr)_1 the wave number m = 1 gives the lowest value.
For r > (r_cr)_1 the wave number is determined such that it minimizes φ. The points
Section 11.2: Laminate Design
Figure 11.2.4 Optimal ply orientation as a function of the mode parameter η = na/mb,
for U2/4U3 = 0.96 and U2/4U3 = 1.16.
of intersection of the curves of φ for wave numbers m and m + 1 give the
critical values of the plate aspect ratio [15]
(11.2.54)
cos 2θ = [U2(r⁴ + m⁴) ± √(U2²(r⁴ + m⁴)² − 8U3(U1 − U3)(r⁴ − m⁴)²)] / [4U3(r⁴ − m⁴)]
(11.2.55)
The optimal orientation of the fibers, including the interaction of the adjacent
modes, for a T300/5208 Graphite/Epoxy laminate as a function of the plate aspect
ratio is shown in Figure 11.2.5. For aspect ratios greater than unity, the optimal
angle oscillates around 45-deg. The amplitude of the oscillations decreases as the
aspect ratio r is increased; therefore, for all practical purposes, for aspect ratios
r > 4 the optimal angle can be assumed to be θ_opt = 45-deg.
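The oscillation around 45-deg can be reproduced by minimizing a single-angle buckling measure over the longitudinal half-wave number m and maximizing over the ply angle. All numerical values below (the invariants and the unit-free load measure) are illustrative assumptions.

```python
import math

# Illustrative graphite/epoxy-like invariants (GPa); assumed values.
U1, U2, U3, U4, U5 = 76.37, 85.73, 19.71, 22.61, 26.88

def load_factor(theta_deg, m, r):
    """Buckling-load measure (up to a positive factor) of a single-angle,
    simply supported plate in uniaxial compression, transverse half-wave
    n = 1, aspect ratio r = a/b:
    f = d11*(m/r)**2 + 2*(d12 + 2*d66) + d22*(r/m)**2."""
    t = math.radians(theta_deg)
    c2, c4 = math.cos(2 * t), math.cos(4 * t)
    d11 = U1 + U2 * c2 + U3 * c4
    d22 = U1 - U2 * c2 + U3 * c4
    d3 = (U4 - U3 * c4) + 2 * (U5 - U3 * c4)      # d12 + 2*d66
    return d11 * (m / r) ** 2 + 2 * d3 + d22 * (r / m) ** 2

def theta_opt(r, m_max=8):
    """Ply angle maximizing the buckling measure minimized over m."""
    i_best = max(range(901),
                 key=lambda i: min(load_factor(0.1 * i, m, r)
                                   for m in range(1, m_max + 1)))
    return 0.1 * i_best
```

With these data `theta_opt(0.5)` returns 0-deg and `theta_opt(1.0)` and `theta_opt(4.0)` return 45-deg, consistent with the damped oscillation toward 45-deg described above.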
If the laminate is loaded under biaxial compression [20], for small aspect ratios,
r < 1.5, the optimal fiber angle is similar to the case of uniaxial compression. For
aspect ratios larger than 1.5, the value of the optimal angle increases rapidly as the
ratio of the transverse load to the axial load (Ny/Nx) increases. For Ny ≥ 4Nx, the
optimal fiber orientation is 90-deg.
Importance of Laminate Stacking Sequence: When ply thickness design variables
are used, the stacking sequence is selected ahead of time. As for in-plane loads, the
optimum design can be influenced by the choice of whether or not to include
a particular ply orientation. However, for flexural response, the stacking sequence
is more important because it strongly affects the D matrix while it has no effect on
the A matrix. Fortunately, as shown below, the optimum design is insensitive to the
choice of stacking sequence.
If the relative positions of the boundaries between the plies are ξ_k = z_k/h for a
laminate with N plies, then
the derivative of φ with respect to the ply boundary variable is
Here D̄_k depends only on the properties and orientation of the kth layer and (assuming
the adjacent layers to be made of the same material so that the constant U
terms are omitted) is defined by

D̄_k = h³ [ U2 cos 2θ_k + U3 cos 4θ_k      −U3 cos 4θ_k                     0
           −U3 cos 4θ_k                    −U2 cos 2θ_k + U3 cos 4θ_k       0
           0                               0                    −U3 cos 4θ_k ]   (11.2.59)
Then, as shown by Cheng [19], the derivative of the function φ can be
expressed as
(11.2.62)
If the total thickness of the two plies is kept constant, the derivative is zero for
these angles whatever the location of the boundary between the plies. Therefore,
the buckling load is independent of the thickness distribution of the adjacent plies.
Moreover, for a square laminate, η = 1, φ is constant for
(11.2.63)
Stacking Sequence†   Ply-group thickness fractions (rounded values in parentheses)
[0/90/±45]s    0.0366 (0.04)   0.1539 (0.16)   0.8095 (0.80)
[0/±45/90]s    0.0366 (0.04)   0.2496 (0.24)   0.7139 (0.72)
[±45/0/90]s    0.2228 (0.20)   0.0634 (0.08)   0.7139 (0.72)
[±45/90/0]s    0.2228 (0.20)   0.3044 (0.32)   0.4729 (0.48)
[90/±45/0]s    0.1399 (0.12)   0.3872 (0.40)   0.4729 (0.48)
[90/0/±45]s    0.1399 (0.12)   0.0506 (0.04)   0.8095 (0.84)
† Ply thicknesses are rounded such that each laminate has a total of 50 plies.
The insensitivity of the design to the choice of stacking sequence disappears when
strength is also a consideration. In such cases the choice of stacking sequence is
critical, and this topic is discussed in the next section.
11.3 Stacking Sequence Design

The methods presented in the previous section yield results that are valuable for un-
derstanding the basic trends in laminate design. However, one of the major difficulties
of a realistic design situation is the need for a practical laminate which is generally
made up of plies with only 0-deg, 90-deg and ±45-deg orientations (or occasionally
orientations with 15-deg increments between 0- and 90-deg), and thicknesses which
are integer multiples of the ply thickness. Of course, deciding the number of plies of a
specified orientation is not sufficient to define a laminate; the through-the-thickness
location of each ply must be decided as well. This means that the basic design problem
is to determine the stacking sequence of the composite laminate, a problem which
calls for discrete programming techniques. In the following, we introduce various
approaches that address this problem.
The lamination parameter diagrams introduced in section 11.2 can be used for
designing laminates with predetermined ply orientation angles. It is shown by Miki
and Sugiyama [23] that the feasible region for laminates with fixed ply angles is a
polygon with vertices located on the envelope of the lamination parameter diagram.
If the design point is on the periphery of the diagram, the laminate is an angle ply
Section 11.3: Stacking Sequence Design
laminate with one fiber orientation. Therefore, given a set of permissible integer ply
orientations, vertices of the polygons are placed at those locations that correspond
to the selected angles. For example, the design spaces for laminates made up of plies
with 0-deg, ±45-deg, and 90-deg orientations and 0-deg, ±30-deg, ±60-deg, and 90-
deg orientations are shown in Fig. 11.3.1-a and 11.3.1-b, respectively. For laminates
with 0, ±45, and 90-deg plies, the design space is a triangle with vertices at A, B,
and C as shown in the figure. For ply orientations of O-deg, ±30-deg, ±60-deg, and
90-deg, the design space is a trapezoid.
a) 0-, ±45-, and 90-deg plies      b) 0-, ±30-, ±60-, and 90-deg plies
Figure 11.3.1 In-plane lamination diagram for laminates with integer ply orientations.
Points along the edges and interior points of the polygons correspond to laminates
with combinations of two or more ply orientations, and their number is determined
by the total number of layers in the laminate. If the total number of layers is N
and I = N /2, then in addition to the vertices, we obtain I - 1 equally spaced design
points along the edges and along the internal lines that join two vertices. From the
nodes we obtained along the edges, we also draw lines parallel to the lines that join
vertices. If such a line terminates at another discrete design point at the opposite
end of the polygon, then it is easy to label the design that would be in the interior
by looking at the designs at the two end points. For example, for an eight-ply (total)
laminate with O-deg, ±45-deg, and 90-deg angles (triangular design space), there are
five equally spaced design points with fiber orientations varying incrementally from
one vertex to another as shown in Fig. 11.3.1-a. Note that the design points inside
the triangular region also follow an incremental pattern, but are combinations of the
three available angles. Design points for a laminate with a total of six layers are shown
in Fig. 11.3.1-b. Labeling of those designs is left to the reader (see exercise 4).
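The counting of design points can be reproduced by brute force. The sketch below enumerates the in-plane lamination-parameter points of an 8-ply symmetric laminate (I = N/2 = 4 layer positions, each assigned one of the three permitted orientations); exact rational arithmetic is used so coincident points collapse, and treating ±45 as a single orientation is a simplification for the sketch.

```python
from fractions import Fraction
from itertools import product

# (V1*, V3*) = (cos 2θ, cos 4θ) for the three permitted orientations,
# kept as exact rationals so coincident design points compare equal.
COORDS = {0: (Fraction(1), Fraction(1)),
          90: (Fraction(-1), Fraction(1)),
          45: (Fraction(0), Fraction(-1))}

def design_points(n_half):
    """Distinct in-plane lamination-parameter points reachable by a symmetric
    laminate with n_half = N/2 layer positions, one angle per position."""
    pts = set()
    for angles in product(COORDS, repeat=n_half):
        v1 = sum(COORDS[a][0] for a in angles) / n_half
        v3 = sum(COORDS[a][1] for a in angles) / n_half
        pts.add((v1, v3))
    return pts

pts = design_points(4)        # 8-ply symmetric laminate, I = N/2 = 4
```

The 3⁴ = 81 orderings collapse to 15 distinct points: the 3 vertices, I − 1 = 3 interior points on each of the three edges, and 3 interior points, so the edge joining the 0- and 90-deg vertices carries the five equally spaced points mentioned above.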
Just as for the in-plane lamination diagram, it is possible to construct the flexural
lamination diagram for a laminate with prescribed fiber orientations. The boundaries
of the design space are the same as for the in-plane parameters; the prescribed angles are on
the envelope of the lamination diagram and form the vertices of a polygon. However
in this case the design points, which are combinations of the given angles, are not
equally spaced (although combinations of the angles corresponding to two vertices
are still located along the edge that connects these vertices) but are located through
the use of Eq. (11.2.40).
Buckling Design: The procedure described in section 5.7.4 for the use of a penalty
function to achieve designs with discrete valued variables is demonstrated in this
section for buckling maximization of laminates with fiber orientation variables. In
order to establish results that can be used to compare with integer orientation designs,
a series of results was generated for the continuous problems, see Gurdal and Haftka
[24]. This was achieved by turning off the penalty terms for the non-discrete values
of the design variables.
The problems solved are for a = 20 in by b = 10 in (50.8 cm × 25.4 cm) rectangular
plates of specified numbers of plies and fiber orientation design variables. The critical
eigenvalues are maximized for applied compressive load of N x = 1 with varying N y / N x
ratios.
Figure 11.3.2 Optimum continuous fiber orientations for maximum buckling load.
Plates with four different thicknesses corresponding to 8, 12, 16, and 24 ply lam-
inates were designed. The optimal orientations of the surface layer fibers (indicated
by dashed lines) and the layer adjacent to the mid-plane (solid lines) are shown in
Fig. 11.3.2 for each of the four laminates. For uniaxial compression, Ny = 0 or
Nx = 0 (or Ny/Nx > 2.5), the laminates have the same fiber orientation through
the entire thickness, namely ±45-deg and 90-deg, respectively. For intermediate
load ratios, the fiber angles at the surface layer are larger than the mid-plane layers
with the difference being largest for the thick 24-ply laminates. However, the fiber
orientation of the surface layers appears to depend only on the load ratio, and not
on the laminate thickness.
Figure 11.3.3 Buckling load reduction for laminates with 0-deg, ±45-deg, and 90-deg
plies.
Next, the same design cases were repeated using discrete fiber orientations of 0-,
±45-, and 90-deg. Solutions were obtained with the penalty function approach, and
checked by the branch-and-bound approach described in section 11.3.3. Plies with
+45-deg orientation were required to be adjacent to -45-deg plies so as to minimize
bending-twisting coupling. For the penalty function approach, it was convenient to
require also the plies with 0- and 90-deg orientations to appear in pairs. Plots of
the percentage reduction in buckling load due to the restrictions to discrete
orientations are shown in Fig. 11.3.3 for the four laminates. Discrete valued designs are
accompanied with a substantial buckling load reduction over at least a portion of
the load ratio range considered. The largest penalty was for Ny/Nx = 0.5 (about
22% reduction), and the thin 8-ply and 12-ply laminates. However, buckling load
reductions associated with different thicknesses appeared to be quite random.
The laminate stacking sequences obtained for the discrete valued designs are
Table 11.3.1: Optimum stacking sequence for 8-ply laminates under biaxial compression.
Continuous Optima Penalty Approach Global Optima
Table 11.3.2 : Optimum stacking sequence for 16-ply laminates under biaxial compression.
Continuous Optima Penalty Approach Global Optima
presented in Table 11.3.1 and 11.3.2 for the 8-ply and the 16-ply laminates. Included
in the table are the laminate stacking sequences for the continuous valued designs,
the discrete designs obtained by using the modified penalty method, and the global
optimal designs. If the design obtained by the penalty function approach is the same as
the global optimal design, the entry under the Global Optima column is left blank.
The penalty approach is unable to reach the global optimum in some cases, especially
for laminates with large numbers of plies. In every case, the discrete designs obtained
by the penalty function approach followed a pattern such that the orientations of the
outer plies were larger than those plies close to the mid-plane; this was similar to
the trend observed for the continuous designs. Global optimal designs, on the other
hand, had orientations that were more random. The differences in buckling loads
ranged up to 14%, and illustrate the danger of looking for the discrete optimum near
the continuous one.
11.3.3 Integer Linear Programming Formulation
The normalized integrals used as design variables for the graphical procedure, see Eqs.
(11.2.14) and (11.2.39), may not be a good choice for more general design problems.
In order to define the integrals that are needed for characterizing the laminate, a new
set of variables that define the existence of a given orientation layer or the orientation
of a specified layer was proposed by Haftka and Walsh [25]. Such variables are referred
to as ply-identity design variables. For example, if we have four possible orientations
and N plies, we can use N design variables that take the values of 1 to 4 to define
the stacking sequence. If symmetry is used this number can be reduced to N /2.
It is also possible to use zero-one ply-identity design variables. For example, if
the laminate is made up of 0-deg, 90-deg, and ±45-deg plies, the stacking sequence
can be defined in terms of four sets of ply-orientation-identity variables o_i, n_i, f_i^p and
f_i^m, i = 1, ..., N/2, that are zero-one integer variables. The variable o_i, n_i, f_i^p or
f_i^m is equal to one if there is a 0-deg, 90-deg, 45-deg or −45-deg ply, respectively, in
the ith layer.
The advantage of these zero-one ply-identity variables is that the integrals, and
therefore the A and D matrices, are linear functions of these variables. The integrals
V0A, V1A and V3A are given in terms of the ply-identity variables and the thickness
of a single ply t as

V0A = ∫_{−h/2}^{h/2} dz = 2t Σ_{k=1}^{N/2} (o_k + n_k + f_k^p + f_k^m),

V1A = ∫_{−h/2}^{h/2} cos 2θ dz = 2t Σ_{k=1}^{N/2} (o_k − n_k),   (11.3.1)

V3A = ∫_{−h/2}^{h/2} cos 4θ dz = 2t Σ_{k=1}^{N/2} (o_k + n_k − f_k^p − f_k^m).
For the flexural response, the integrals V0D, V1D and V3D are expressed as

V1D = (2t³/3) Σ_{k=1}^{N/2} p_k cos 2θ_k [(z_k/t)³ − (z_{k−1}/t)³]
    = (2t³/3) Σ_{k=1}^{N/2} [k³ − (k−1)³] (o_k − n_k),   (11.3.2)

V3D = (2t³/3) Σ_{k=1}^{N/2} p_k cos 4θ_k [(z_k/t)³ − (z_{k−1}/t)³]
    = (2t³/3) Σ_{k=1}^{N/2} [k³ − (k−1)³] (o_k + n_k − f_k^p − f_k^m),
where f_k^p and f_k^m do not appear in the expressions for V1A and V1D since the cosine of
90 degrees is equal to zero. The variable p_k in Eq. (11.3.2) is unity if the kth ply is
occupied and zero if it is empty. Constraints are applied during the optimization to
ensure that p_k can be zero only for the outermost plies.
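As a consistency check on the ply-identity form, the sketch below evaluates the membrane integrals of Eq. (11.3.1) both from the zero-one variables and directly as ply-by-ply sums of cos 2θ and cos 4θ; the two must agree for any half-stack. The ply thickness is an arbitrary assumed value.

```python
import math

def v_integrals(stacking, t=0.005):
    """V0A, V1A, V3A (Eq. 11.3.1) from zero-one ply-identity variables of a
    symmetric laminate. `stacking` lists the angle of each layer in one half;
    t is the ply thickness (units arbitrary)."""
    o = [1 if a == 0 else 0 for a in stacking]
    n = [1 if a == 90 else 0 for a in stacking]
    fp = [1 if a == +45 else 0 for a in stacking]
    fm = [1 if a == -45 else 0 for a in stacking]
    v0 = 2 * t * sum(ok + nk + pk + mk for ok, nk, pk, mk in zip(o, n, fp, fm))
    v1 = 2 * t * sum(ok - nk for ok, nk in zip(o, n))
    v3 = 2 * t * sum(ok + nk - pk - mk for ok, nk, pk, mk in zip(o, n, fp, fm))
    return v0, v1, v3

def v_direct(stacking, t=0.005):
    """The same integrals evaluated directly as sums over the plies."""
    v0 = 2 * t * len(stacking)
    v1 = 2 * t * sum(math.cos(2 * math.radians(a)) for a in stacking)
    v3 = 2 * t * sum(math.cos(4 * math.radians(a)) for a in stacking)
    return v0, v1, v3
```

For a [90/45/−45/0] half-stack both routes give (V0A, V1A, V3A) = (8t, 0, 0): the 0- and 90-deg contributions cancel in V1A, and the ±45 pair cancels the 0/90 pair in V3A.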
Stacking Sequence for Buckling Design: Since the buckling load for symmetric
laminates under biaxial loads is a linear function of the flexural lamination
parameters, which are linear functions of the ply-identity variables (see Eqs. (11.2.32) and
The minimization over m and n is performed by checking all values of m between
1 and m_f, and all values of n between 1 and n_f. The last constraint in Eq. (11.3.3)
ensures that the number of 45-deg and −45-deg plies is the same, so that the laminate
is balanced. The optimization problem of Eq. (11.3.3) is an integer linear programming
problem, and the methods described in section 3.9 can be applied.
For the dual problem of weight minimization of a laminate capable of sustaining
a specified load without buckling, the total number of layers must be variable. This
seems to contradict the use of ply-identity variables which requires N to be known
in advance. A remedy for this contradiction is to start with a number of layers large
enough so that the initial design does not buckle, but permit some of the plies to be
empty (OJ + ni + Jr + Ijm ~ 1). Of course, plies that are permitted to be empty must
be the outer plies of the laminate in order to maintain integrity of the laminate. The
formulation takes the form

find o_i, n_i, f_i^p, f_i^m,  i = 1, ..., N/2,

to minimize Σ_{i=1}^{N/2} (o_i + n_i + f_i^p + f_i^m)

such that λ_cr(m, n) ≥ 1,  m = 1, ..., m_f,  n = 1, ..., n_f,

o_i + n_i + f_i^p + f_i^m ≤ 1,  i = 1, ..., N/2,   (11.3.4)

Σ_{i=1}^{N/2} (f_i^p − f_i^m) = 0,

and o_i + n_i + f_i^p + f_i^m ≤ o_{i−1} + n_{i−1} + f_{i−1}^p + f_{i−1}^m,

where the last constraint ensures that the empty plies are on the outside.
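The structure of this formulation can be illustrated by brute force on a tiny instance: every assignment of four half-laminate ply positions to {empty, 0, 90, +45, −45} is screened against the balance and empty-plies-outside constraints and a classical-lamination buckling constraint λ_cr ≥ 1, and the lightest feasible design is kept. Geometry, load and the graphite/epoxy-like invariants are illustrative assumptions; a realistic problem would use integer programming (e.g. branch and bound) instead of enumeration.

```python
import math
from itertools import product

# Illustrative material invariants (Pa), geometry (m) and load (N/m);
# all numbers are assumptions for this sketch, not data from the text.
U1, U2, U3, U4, U5 = 76.37e9, 85.73e9, 19.71e9, 22.61e9, 26.88e9
A_LEN, B_LEN, T_PLY, NX = 0.5, 0.25, 0.002, 1.0e6

def lam_cr(half_stack):
    """Buckling load factor of a simply supported symmetric plate under NX,
    minimized over half-wave numbers m, n = 1..4 (bending-twisting coupling
    ignored). half_stack lists ply angles (None = empty), innermost first."""
    D11 = D22 = D12 = D66 = 0.0
    for k, ang in enumerate(half_stack, start=1):
        if ang is None:
            continue
        w = (2.0 / 3.0) * ((k * T_PLY) ** 3 - ((k - 1) * T_PLY) ** 3)
        t = math.radians(ang)
        c2, c4 = math.cos(2 * t), math.cos(4 * t)
        D11 += w * (U1 + U2 * c2 + U3 * c4)
        D22 += w * (U1 - U2 * c2 + U3 * c4)
        D12 += w * (U4 - U3 * c4)
        D66 += w * (U5 - U3 * c4)
    return min(math.pi ** 2 * (D11 * (m / A_LEN) ** 4
               + 2 * (D12 + 2 * D66) * (m / A_LEN) ** 2 * (n / B_LEN) ** 2
               + D22 * (n / B_LEN) ** 4) / ((m / A_LEN) ** 2 * NX)
               for m in range(1, 5) for n in range(1, 5))

def feasible(half_stack):
    occ = [a is not None for a in half_stack]
    if occ != sorted(occ, reverse=True):      # empty plies only on the outside
        return False
    if sum(a == 45 for a in half_stack) != sum(a == -45 for a in half_stack):
        return False                          # balanced laminate
    return lam_cr(half_stack) >= 1.0

best = min((s for s in product((None, 0, 90, 45, -45), repeat=4) if feasible(s)),
           key=lambda s: sum(a is not None for a in s))
```

For these assumed numbers one occupied ply cannot carry the load, while several two-ply designs (e.g. a ±45 pair) can, so the enumeration returns a design with two occupied plies.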
In general, the solution of the weight minimization problem is not unique. For a
minimum weight design with N* layers, it is possible to change the orientations of the
fibers and come up with designs that will have the same weight but different buckling
loads. Out of those feasible designs, ideally, one would like to choose the one that has
the largest margin for the buckling constraint. This can be achieved by subtracting
a small fraction of λ_cr from the objective function, so that the modified objective
function serves the dual purpose of minimizing weight while maximizing the buckling
load. For results on weight minimization designs, the reader is referred to Haftka and
Walsh [25]. In the following paragraphs, results for buckling maximization will be
presented.
Figure 11.3.4 Buckling load reduction for globally optimal laminates with 0-deg, ±45-
deg, and 90-deg plies.
For the results presented in this section the solution of Eq. (11.3.3) is
generated with the LINDO program [26] which employs the branch-and-bound algorithm
described in section 3.9.1. First we present the biaxial load cases that were reported
earlier in Tables 11.3.1 and 11.3.2 as global optima. A plot similar to the plot shown
in Fig. 11.3.3, this time for the global optimum designs obtained through the use of
the linear integer programming approach, is shown in Fig. 11.3.4 for comparison. In
general, there is a small amount of improvement in the buckling load reduction for
most of the laminates. For example, the worst buckling load reduction (compared to
the continuous designs) is still for the 8-ply laminate for a load ratio of Ny/Nx = 0.5,
but it is only about 18% as compared to 22%. Also, there is an orderly progres-
sion with increasing laminate thickness. The smallest and the largest buckling load
reductions are associated with the 24-ply and the 8-ply laminates, respectively.
When the number of contiguous plies with the same orientation angle is large,
composite laminates are known to experience matrix cracking. Therefore, it is
desirable to limit the number of such contiguous plies. We demonstrate the use of such
a constraint on the design obtained for Ny/Nx = 2. We start with the design that
was presented in Table 11.3.2, [90_4/±45/90_2]_s, for which we imposed the constraint that
the plies with different orientations appear in pairs. The critical load factor for this
optimal design was λ_cr = 36.19. Next, we relax this requirement and redesign the
plate so that we can have single plies with different orientations adjacent to one an-
other. This yields a design which has 5 contiguous 90-deg plies, [90_5/+45/−45/90]_s.
The critical load factor for this design is λ_cr = 36.84, a 1.8% increase compared to
the design which restricts each orientation to be in pairs. The fact that the 45-deg plies
appear in a pair is of course coincidental. We then implement the contiguous ply
requirement by adding the constraint
The design obtained with this constraint is [90_4/+45/90_2/−45]_s and has a slightly
smaller load factor, λ_cr = 36.59, compared to the previous design. However, it still
has a slightly larger load factor compared to the design from Table 11.3.2, but violates
the requirement that off-axis angles appear in pairs. By introducing a constraint of
the form

f_i^p − f_{i+1}^m = 0,  i = 1, 2, ..., (I − 1),   (11.3.6)

where f_1^m = 0, and f_I^p = 0,

designs that have the 45-deg plies in plus and minus pairs can be achieved, without
requiring the 0- and the 90-deg plies to be in pairs, and without exceeding 4 contiguous
plies with the same orientation. In this particular case we obtain again the design
presented in Table 11.3.2.
Stiffness and Buckling Design: In some cases it may be desirable to impose
constraints on the stiffness of the plate. For example, a constraint requiring A_11 to
have a minimum value A_11^0 can be written as
(11.3.7)
As shown in [25] this constraint can be expressed as a linear function of the ply-
identity design variables, similar to the buckling constraint. Therefore, it can be used
as a constraint in the problem formulated by Eq. (11.3.3). The effect of introducing
a minimum stiffness requirement is checked for Ny/Nx = 2. The optimum laminate
for this case was dominated by 90-deg plies, and has only 16 percent of the axial
stiffness A_11 of an all 0-deg laminate. A requirement that A_11 be at least 50 percent
of that of the unidirectional laminate was added, with and without the requirement of no
more than four contiguous plies. The results are compared to the original design in
Fig. 11.3.5. It is seen that the stiffness requirement is satisfied by putting O-deg plies
near the plane of symmetry where they have only a minimal effect on the bending
stiffnesses, and hence on the buckling load. The reduction in the buckling load is
about 8 percent. For this design the effect of adding the requirements of no more
and (11.3.8)
The strains for the kth ply may be calculated from the transformation
Even though the extensional stiffnesses A_ij are linear functions of the design variables,
the strains calculated by Eq. (11.3.8) are nonlinear functions of these variables. These
strains can be linearized, as shown by Nagendra et al. [27], by a linear Taylor series
in A_ij. We have
(11.3.10)
where ε is a typical strain component (computed at λ = 1), ε_L is its linear approximation,
and A_ij0 and A_ij are the extensional stiffnesses calculated at the nominal design point x_0
and at neighboring designs, respectively. The derivatives of the strain with respect
to the extensional stiffnesses at the nominal design point are calculated in terms of
the midplane strains and the extensional stiffnesses at the nominal design. The linear
strain approximation can thus be constructed along a particular fiber orientation and
transverse to it by evaluating the strains ε_1, ε_2 and γ_12 for each orientation (since the
orientation is chosen a priori, either 0° or 45°) in terms of the midplane strains using
Eq. (11.3.9). For example, the strains along and transverse to the 45° fibers and in
shear can be derived as
(11.3.11)
The derivatives needed for the strain approximation of Eq.(11.3.10) can then be
obtained by differentiating Eq.(I1.3.11). For example, the derivative of the strain
along the 45° fiber with respect to All can be written as
∂ε_1/∂A_11 = (1/2) (A_12 − A_22) ε_x / (A_11 A_22 − A_12²) ,   (11.3.12)
where Aij are the extensional stiffnesses at the nominal design point. Similar strain
derivatives with respect to A22 and A12 can be derived. The extensional stiffnesses are
a linear function of the ply-identity design variables, thus the strain approximation
is a linear function of the ply-identity variables. It is also important to note that the
strains are initially calculated based on some reference value of the load. In order
to implement the strain constraint they have to be multiplied by the value of the
buckling load multiplier λ_c, which is also a function of the design variables,

λ_c ε_i ≤ ε_ia ,   (11.3.13)

where ε_ia is the strain allowable. The strain constraint of Eq. (11.3.13) can be
linearized by moving λ_c to the right-hand side, and expanding 1/λ_c in a linear Taylor
series to obtain
(11.3.14)
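The linearization can be illustrated on a minimal two-component membrane problem (shear uncoupled and omitted; all stiffness and load values are arbitrary assumptions): the exact strains ε = A⁻¹N are nonlinear in the A_ij, but the first-order Taylor expansion about the nominal stiffnesses, built from dε = −A⁻¹(dA)ε₀, is linear in the A_ij and accurate for moderate stiffness changes.

```python
# A 2x2 membrane example: exact strains are eps = A^{-1} N.
NX, NY = 1.0e5, 0.5e5                        # applied loads (N/m), assumed

def strains(a11, a22, a12):
    det = a11 * a22 - a12 * a12
    return ((a22 * NX - a12 * NY) / det, (a11 * NY - a12 * NX) / det)

A11_0, A22_0, A12_0 = 8.0e7, 4.0e7, 2.0e7    # nominal stiffnesses (N/m), assumed
ex0, ey0 = strains(A11_0, A22_0, A12_0)
DET0 = A11_0 * A22_0 - A12_0 * A12_0

def strains_linear(a11, a22, a12):
    """First-order Taylor expansion of the strains about the nominal design,
    using d(eps) = -A0^{-1} (dA) eps0 (so e.g. d ex/dA11 = -A22*ex0/det0)."""
    d11, d22, d12 = a11 - A11_0, a22 - A22_0, a12 - A12_0
    rx = d11 * ex0 + d12 * ey0               # components of (dA) eps0
    ry = d12 * ex0 + d22 * ey0
    ex = ex0 - (A22_0 * rx - A12_0 * ry) / DET0
    ey = ey0 - (A11_0 * ry - A12_0 * rx) / DET0
    return ex, ey
```

For a 3% change in A11 the linear approximation reproduces the exact strains to a relative error of roughly 0.1%, which is the accuracy that makes the sequential linear programming of the strain constraints workable.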
Figure 11.3.6 Stacking sequence design results for Ny = 0.25 lb/in and Ny = 0.5 lb/in
(symmetry plane of each laminate indicated).
Designs with strength constraints were obtained for laminates that are thicker
than those considered in the previous cases so that the buckling loads are likely to
violate the strain failure constraints. Design results for 48-ply laminates under two
different combinations of biaxial loads (Ny/Nx = 0.25 and Ny/Nx = 0.5), for Nx =
1 lb/in (175 N/m), are presented in Fig. 11.3.6, along with the results for designs
with no strain constraint. Since the method used involves local approximations, the
final design may be a locally optimal design. Designs with a higher confidence of
being globally optimum can be generated by using one of the probabilistic search
algorithms for nonlinear programming problems with discrete valued design variables
(see chapter 4). The last design in each of the load cases presented in Fig. 11.3.6
is generated using the genetic algorithm discussed in section 4.4.2 and verified to be
actually the global optimum design. Compared to the design without strength failure
constraint, the failure load factor decreased by 6.05% for Ny/Nx = 0.25. Although the
design for this load case was only a local maximum, the load factor differed from the
global optimum design only by a fraction of a percent. For the load ratio of 0.5, the
design without the strain constraint violated the shear strength by 7%. The design
obtained from the sequential integer linear programming approach was also the global
optimum.
Probabilistic search methods such as simulated annealing and genetic algorithms have
a number of parameters that can be tuned to tailor the method to the problem at
hand. For simulated annealing these parameters include the initial temperature and
the rate of cooling. For genetic algorithms the tuning parameters are the probabilities
of the various genetic operators, such as mutation, as well as population size and
convergence criteria. The design of unstiffened laminates using Classical Lamination
theory is a good problem for tuning such parameters because it is so computationally
inexpensive to optimize.
For simulated annealing Lombardi [28] studied the effect of initial temperature
and cooling rate on the performance of the algorithm for the buckling load maximiza-
tion problem described in the previous section. The performance of the algorithm was
judged by two criteria: computational cost and reliability in finding the global opti-
mum. The problem tends to have a large number of solutions (stacking sequences)
with very similar buckling loads. For this reason, a success was defined as a solution
which is within 0.1% of the maximum buckling load. Results were obtained for 32-ply
plates where plies were grouped in stacks of two 0-deg, 90-deg or ±45-deg plies.
For symmetric laminates this requires defining the angles of 8 stacks, for a total
of 3⁸ = 6561 possibilities. The simulated annealing algorithm required about 1000
analyses for high reliability, which is a sizable fraction of the design space. However,
when the number of plies was increased from 32 to 64, the number of required
analyses increased only to about 3000, while the number of possible designs increased to
3¹⁶ ≈ 43 million.
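A toy version of such a tuning experiment is sketched below: simulated annealing over two-ply stack angles, maximizing a plate buckling load, with the initial temperature and the cooling rate exposed as the tuning parameters. The material invariants, geometry, and parameter values are illustrative assumptions, not the values studied in Ref. [28].

```python
import math, random

# Illustrative graphite/epoxy-like invariants (Pa) and geometry (m); assumed.
U1, U2, U3, U4, U5 = 76.37e9, 85.73e9, 19.71e9, 22.61e9, 26.88e9
A_LEN, B_LEN, T_STACK = 0.5, 0.25, 0.0025   # plate a, b and 2-ply stack thickness
ANGLES = (0, 90, 45)                        # 45 denotes a +/-45 stack

def buckling(stacks):
    """SS-plate buckling measure, minimized over half-waves m, n = 1..4."""
    D11 = D22 = D3 = 0.0
    for k, ang in enumerate(stacks, 1):     # innermost stack first
        w = (2 / 3) * ((k * T_STACK) ** 3 - ((k - 1) * T_STACK) ** 3)
        t = math.radians(ang)
        c2, c4 = math.cos(2 * t), math.cos(4 * t)
        D11 += w * (U1 + U2 * c2 + U3 * c4)
        D22 += w * (U1 - U2 * c2 + U3 * c4)
        D3 += w * (U4 + 2 * U5 - 3 * U3 * c4)      # D12 + 2*D66
    return min(math.pi ** 2 * (D11 * (m / A_LEN) ** 4
               + 2 * D3 * (m / A_LEN) ** 2 * (n / B_LEN) ** 2
               + D22 * (n / B_LEN) ** 4) / (m / A_LEN) ** 2
               for m in range(1, 5) for n in range(1, 5))

def anneal(n_stacks=8, t0=0.2, cooling=0.95, steps=400, seed=1):
    """t0 is the initial temperature as a fraction of the starting objective."""
    rng = random.Random(seed)
    x = [rng.choice(ANGLES) for _ in range(n_stacks)]
    fx = buckling(x)
    best, fbest, temp = list(x), fx, t0 * fx
    for _ in range(steps):
        y = list(x)
        y[rng.randrange(n_stacks)] = rng.choice(ANGLES)   # perturb one stack
        fy = buckling(y)
        if fy >= fx or rng.random() < math.exp((fy - fx) / temp):
            x, fx = y, fy                  # Metropolis acceptance rule
        if fx > fbest:
            best, fbest = list(x), fx      # track the best design seen
        temp *= cooling
    return best, fbest
```

Varying `t0` and `cooling` while counting `buckling` evaluations reproduces, in miniature, the cost/reliability trade-off described above.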
Le Riche and Haftka [29] solved the same buckling maximization problem for 48-
and 64-ply laminates using genetic algorithms. Tuning the probabilities of the genetic
operators as well as the population size could substantially reduce the number of
required analyses. For 48-ply laminates, for example, the number of required analyses
was found to be about 200-300. One advantage of the genetic algorithm is that
Section 11.4: Design Applications
it yields several near-optimal designs, rather than one optimum. For example, for a
plate with a = 20 in, b = 5 in, Nx = 1 lb/in, and Ny = 0.5 lb/in, two of the best designs
were [90_2, ±45_2, 90_2, ±45, 90_2, ±45_6]_s and [±45, 90_4, ±45, 90_2, ±45_5, 90_2, ±45]_s. The
first laminate has a buckling load of λ_c = 9998, while the second buckles at λ_c = 9976.
For a designer, the differences between the laminates, such as the presence of ±45-deg
plies on the outside, or the reduced percentage of 90-deg plies in the second laminate
may be more important than the 0.2% difference in buckling loads.
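A minimal genetic algorithm over stack angles, in the spirit of the tuning study just described, is sketched below with truncation selection, one-point crossover, and per-gene mutation; the buckling objective, plate data, population size and mutation probability are all illustrative assumptions rather than the 48-ply setup of the text.

```python
import math, random

# Illustrative graphite/epoxy-like invariants (Pa) and geometry (m); assumed.
U1, U2, U3, U4, U5 = 76.37e9, 85.73e9, 19.71e9, 22.61e9, 26.88e9
A_LEN, B_LEN, T_STACK, ANGLES = 0.5, 0.25, 0.0025, (0, 90, 45)

def buckling(stacks):
    """SS-plate buckling measure, minimized over half-waves m, n = 1..4."""
    D11 = D22 = D3 = 0.0
    for k, ang in enumerate(stacks, 1):     # innermost stack first
        w = (2 / 3) * ((k * T_STACK) ** 3 - ((k - 1) * T_STACK) ** 3)
        t = math.radians(ang)
        c2, c4 = math.cos(2 * t), math.cos(4 * t)
        D11 += w * (U1 + U2 * c2 + U3 * c4)
        D22 += w * (U1 - U2 * c2 + U3 * c4)
        D3 += w * (U4 + 2 * U5 - 3 * U3 * c4)      # D12 + 2*D66
    return min(math.pi ** 2 * (D11 * (m / A_LEN) ** 4
               + 2 * D3 * (m / A_LEN) ** 2 * (n / B_LEN) ** 2
               + D22 * (n / B_LEN) ** 4) / (m / A_LEN) ** 2
               for m in range(1, 5) for n in range(1, 5))

def ga(n_stacks=8, pop_size=10, p_mut=0.1, gens=40, seed=1):
    rng = random.Random(seed)
    pop = [[rng.choice(ANGLES) for _ in range(n_stacks)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=buckling, reverse=True)
        parents = pop[:pop_size // 2]              # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n_stacks)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n_stacks):              # per-gene mutation
                if rng.random() < p_mut:
                    child[i] = rng.choice(ANGLES)
            children.append(child)
        pop = parents + children
    return max(pop, key=buckling)
```

Because the parent half survives each generation, the best design found is never lost, and rerunning with different `p_mut` and `pop_size` shows how the analysis count needed for a given reliability depends on the tuning, as reported by Le Riche and Haftka [29].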
11.4 Design Applications

Laminated plates stiffened by longitudinal and transverse members are one of the
most common structural components. Use of stiffeners makes it possible to resist
highly directional loads, and to introduce multiple load paths that may provide pro-
tection against damage and crack growth under both compressive and tensile loads.
The biggest advantage of the stiffeners, though, is the increased bending stiffness
of the panel with a minimum of additional material, which makes these structures
highly desirable for out-of-plane loads and destabilizing compressive loads. In addi-
tion to placement of the stiffeners to resist directional loads, the use of composite
materials makes it possible to further tailor the stiffness and strength characteristics
of the individual elements (such as webs, flanges, and skin) of a stiffened plate to meet
various structural requirements. This local tailoring is achieved through selection of
ply orientations and thicknesses for the different sections of the plate. Also the use
of composite materials makes it possible to adopt stiffener cross-sectional geometries
which may be expensive to manufacture using metallic materials.
However, the complex behavior of stiffened composite plates makes it difficult
to adopt the simplifying assumptions used for the analysis of flat laminates which
often lead to closed-form solutions. Therefore, design optimization of such plates
typically requires use of numerical algorithms. In this section we will discuss the
design of stiffened composite plates under compressive and shear loadings, and subject
to mainly buckling constraints.
In one of the early studies of optimum design of stiffened plates, Stroud and
Agranoff [30] considered a longitudinally stiffened plate composed of an assembly
of orthotropic plate elements. The plate configurations were limited to corrugated
and hat-stiffened plates, but the same procedure used in Ref. 30 can be extended to
other geometries such as the ones shown in Figure 11.4.1. The simplified analysis was
based on buckling of orthotropic plates with simply supported boundary conditions.
Both global and local modes of buckling were considered. The global buckling analysis
modeled the stiffened plate as an orthotropic plate with smeared stiffeners, assumed to
buckle as a wide column. For local buckling, each element of the plate was considered
separately as a narrow strip of orthotropic plate with simply supported boundary
Chapter 11: Optimum Design of Laminated Composite Structures
conditions along the lines of attachment to adjacent elements. That is, the rotational
restraint between panel elements such as stiffener and skin was ignored, and the
continuity of the buckling mode shapes between different elements was not accounted
for. Equations for the buckling loads resulting from these assumptions are presented
in Table 11.4.1 for plates loaded by compressive and shear loads.
The local buckling equations in the table are applied to each of the plate elements
of width b and length L. The length L of each element is assumed to be much larger
than the width of the elements for both longitudinal compression and shear loadings.
The Dij's are the bending stiffness coefficients (Eq. 11.1.18) of the respective plate
elements. For global buckling under longitudinal compression, the panel is treated
as a wide column with the loaded edges simply supported and the unloaded edges
free. The longitudinal stiffness of the column is equal to the smeared longitudinal
stiffness of the panel, EI. For the shear loading case, the stiffened panel is modeled
as a uniform thickness orthotropic laminate (with smeared orthotropic properties,
D1, D2, and D3) infinitely long in the transverse direction and simply supported
along the loaded edges. The smeared stiffness terms (EI, D1, D2, and D3) in the
global buckling relations strongly depend on the cross-sectional configuration of the
stiffeners. The calculation of these smeared stiffnesses for complicated stiffened panel
geometries is quite involved and requires various kinematic assumptions depending
on the applied loads. The derivation of some of the smeared stiffness terms is demon-
strated in Ref. 30 for corrugated and hat-stiffened panels.
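The local buckling relation of Table 11.4.1 for a long, simply supported orthotropic plate element under longitudinal compression is simple enough to sketch in code; the stiffness and width values in the example call below are illustrative, not taken from Ref. 30:

```python
import math

def local_buckling_nx(d11, d22, d12, d66, b):
    """Critical running load Nx,cr (force per unit width) for a long,
    simply supported orthotropic plate element of width b:
    Nx,cr = (2*pi^2/b^2) * (sqrt(D11*D22) + D12 + 2*D66)."""
    return (2.0 * math.pi ** 2 / b ** 2) * (math.sqrt(d11 * d22) + d12 + 2.0 * d66)

# Illustrative bending stiffnesses (lb-in) for a 2.5-in-wide element:
nx_cr = local_buckling_nx(d11=120.0, d22=40.0, d12=15.0, d66=20.0, b=2.5)
```

Note that for a fixed total stiffness budget the square-root term rewards balanced D11 and D22, one reason an optimizer trades ply orientations between web and cap elements.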
The design problem of Ref. 30 was formulated as a mathematical programming
problem with panel mass per unit width being the objective function. Design variables
Section 11.4: Design Applications
Table 11.4.1: Overall and Local Buckling Equations from Reference 30

Global buckling, longitudinal compression: Eq. (92), [31]; Eq. (3), [32]

Local buckling, longitudinal compression (plate element of width b):

    Nx,cr = (2π²/b²) [ (D11 D22)^(1/2) + D12 + 2 D66 ]        Eq. (92), [31]; Eq. (3), [32]
were the element widths and thicknesses of the layers that make up the elements. The
design constraints were buckling load, strength and stiffness requirements, and lower
and upper bounds on some of the panel dimensions. A general-purpose optimization
code, AESOP [35], which is based on an exterior penalty function formulation, was
used for the design optimization.
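The exterior penalty idea behind a code like AESOP can be illustrated on a toy one-variable problem; the grid-refinement minimizer, the growth factor, and the toy constraint below are our own illustrative choices, not AESOP's actual algorithm:

```python
def exterior_penalty_minimize(f, g, x0, r0=1.0, growth=10.0, outer_iters=6):
    """Minimize f(x) subject to g(x) <= 0 by repeatedly minimizing the
    penalized function f(x) + r*max(0, g(x))**2 for increasing r.
    The inner minimizer is a crude 1-D grid refinement."""
    x, r = x0, r0
    for _ in range(outer_iters):
        def phi(xx):
            return f(xx) + r * max(0.0, g(xx)) ** 2
        step = 1.0
        while step > 1e-6:
            candidates = [x - step, x, x + step]
            best = min(candidates, key=phi)
            if best == x:
                step *= 0.5      # center is best: refine the grid
            x = best
        r *= growth              # tighten the penalty
    return x

# Toy analogue of mass minimization: minimize thickness t subject to a
# buckling requirement t >= 1, i.e. g(t) = 1 - t <= 0; the optimum is t = 1.
t_opt = exterior_penalty_minimize(f=lambda t: t, g=lambda t: 1.0 - t, x0=0.0)
```

Because the exterior penalty approaches the optimum from the infeasible side, intermediate designs slightly violate the constraint until r becomes large.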
A more rigorous design procedure [36], based on the stiffened panel buckling and
vibration analysis code VIPASA [37, 38] and the mathematical programming code
CONMIN [39], which uses the method of feasible directions algorithm (see Section 5.6),
was introduced to improve some of the assumptions made in Ref. 30. The analysis
code VIPASA is capable of computing buckling loads of structures comprised of flat
rectangular plate elements connected together along their longitudinal edges. As
opposed to the procedure used in Ref. 30, the analysis accounts for the physical
connection between the adjacent elements by maintaining the continuity of the buckle
patterns across the intersection of neighboring plate elements. Buckling solutions are
based on exact thin-plate equations with D16 and D26 anisotropic stiffness terms so
that bending-twisting coupling is allowed. Individual plate elements may be isotropic,
orthotropic, or anisotropic. However, the laminates that make up the elements are
limited to balanced symmetric layups such that bending-extension and extension-
shearing couplings are eliminated. Another limitation of the analysis is the buckling
boundary conditions. Although the unconnected longitudinal edges may take various
boundary conditions, the boundary conditions along the loaded edges are limited
to simply supported conditions. Any combination of longitudinal, transverse, and
shearing loads that are constant along the length of the panel may be applied (see
Fig. 11.4.2). However, as will be discussed later, in the case of applied shear loads
the limitation of the simply supported boundary condition at the loaded edge may
result in inaccuracies in the buckling load calculations.
The VIPASA analysis program was eventually used by Stroud and Anderson as
the basis of a design code PASCO [40,41] which is commonly used for preliminary de-
sign of uniaxially-stiffened panel structures. PASCO uses the nonlinear mathematical
programming code CONMIN [39] for optimization. The design problem is formulated
so as to minimize the panel mass for a given set of loadings. Constraints include up-
per and lower bounds on design variables, lower bounds on material strength and
buckling loads, lower and upper bounds on overall bending, extensional, and shear
stiffnesses, and lower bounds on vibration frequencies. In addition to the design con-
dition described for VIPASA analysis (Nx, Ny, Nxy), PASCO includes applied bending
moment (Mx), lateral pressure (p), overall bow-type initial imperfections, and tem-
perature loadings. The effects of the bending strains, resulting from the applied
bending moment, pressure, initial imperfection, or the temperature, are included in
the strain failure analysis by superimposing them on the uniform strains resulting
from the in-plane loads. The bending strains resulting from the applied pressure
and bow-type imperfections are calculated based on a beam-column approach [42] by
calculating the corresponding bending moment at the panel midlength. This maxi-
mum bending moment is conservatively assumed to act over the entire panel length.
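The beam-column estimate described above can be sketched for the simplest case of a simply supported panel with a sinusoidal initial bow; this is the classical amplification result, offered as an illustration rather than the exact expression of Refs. 40 and 42:

```python
import math

def midlength_moment(P, e0, EI, L):
    """Midlength bending moment of a simply supported beam-column of
    length L and bending stiffness EI, carrying axial load P with a
    sinusoidal initial bow of amplitude e0:
    M = P*e0 / (1 - P/P_E), with P_E = pi^2*EI/L^2 the Euler load.
    Valid for P < P_E."""
    P_E = math.pi ** 2 * EI / L ** 2
    return P * e0 / (1.0 - P / P_E)
```

As P approaches the Euler load the amplification factor grows without bound, which is why this moment, conservatively applied over the whole panel length, can dominate the strain check.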
This approach is in line with the VIPASA requirement that the prebuckling stress
distribution be constant along the panel length. For more detailed discussion of the
bending moments see Ref. 40. Use of multiple sets of design conditions is also allowed
in PASCO. The set of design variables consists of the widths, b, the ply thicknesses,
t, and orientations, θ, of any of the plate elements that make up the panel. Re-
ducing the number of design variables by linking of some of the element dimensions
or ply orientations through linear relations is also possible. PASCO is also capable
of implementing approximations for the buckling and vibration constraints through
first-order Taylor series expansions of those constraints, with move limits imposed on
the design variables. This aspect of the code makes it computationally efficient and very
attractive for preliminary design purposes, and lets the designer compare various
design concepts in a cost-effective manner.
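The approximation strategy just described, linearize the constraints and then restrict the step, can be sketched generically; the function names and the 20% move limit are illustrative choices, not PASCO's internals:

```python
def linearized_constraint(g0, grad, x0):
    """Return a first-order Taylor approximation of a constraint:
    g(x) ~= g(x0) + grad . (x - x0)."""
    def g_approx(x):
        return g0 + sum(gi * (xi - x0i) for gi, xi, x0i in zip(grad, x, x0))
    return g_approx

def apply_move_limits(x, x0, fraction=0.2):
    """Clamp each design variable to within +/-fraction of its current
    value, the usual guard that keeps the linearization trustworthy."""
    return [min(max(xi, x0i * (1.0 - fraction)), x0i * (1.0 + fraction))
            for xi, x0i in zip(x, x0)]

# A buckling constraint approximated about x0 can now be evaluated cheaply
# at any candidate design inside the move limits:
g = linearized_constraint(g0=-0.1, grad=[0.5, -0.3], x0=[1.0, 2.0])
```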
Example 11.4.1
[Figure 11.4.3: Panel configurations with element widths b1, b2, b3 and height H; (b) corrugated panel with continuous laminate, (c) hat-stiffened panel, (d) blade-stiffened plate.]
This example by Swanson and Gürdal [43] is a comparison of the structural ef-
ficiencies of optimally designed composite wing rib panel configurations typical of a
center-wing-box fuel-cell closeout rib of large transport-type aircraft. Rib dimensions
of 28 inches high by 80 inches wide are used. The panel configurations are cho-
sen to be practical and applicable to cost-effective manufacturing techniques. These
configurations are shown in Fig. 11.4.3, and include a tailored corrugated panel, a
corrugated panel with a continuous laminate throughout its length and width, and a
hat-stiffened panel. A corrugated panel is relatively easy to manufacture since it has
continuous plies that run throughout the configuration and form integral stiffeners
without requiring fasteners. It is also suited to the thermoforming process, which is
a potentially economical manufacturing technique for thermoplastic materials. Also
included are a blade-stiffened panel, which is the most commonly used concept for
wing rib applications, and a flat unstiffened plate which is used as a baseline config-
uration for comparison.
The constraints considered in this example include those associated with material
strength, buckling, and geometric limits. The material failure criterion chosen is the
maximum strain failure criterion. The buckling criterion implemented is based on a
common design practice used for wing structures that does not allow the components
to buckle at design limit loads. Thus, the design of the wing rib does not consider
any post buckling-load-carrying capability of the panel.
The design variables are the thicknesses of plies with different ply orientations in
the different sections of the panels. Conventional ply angles of ±45-deg, 0-deg, and
90-deg orientations are chosen. Also, detailed cross-sectional dimensions are used as
sizing variables to determine the best cross-sectional geometry. Hercules AS4/3502
preimpregnated graphite-epoxy tape is chosen as a typical graphite-epoxy material.
[Figure 11.4.4: Cap laminate cross section; 0° plies of thickness 2t2 sandwiched between ±45° ply layers of thickness 2t1.]
The geometry of the repeating elements is typically defined by the plate element
width design variables b1 through b4 as shown in Fig. 11.4.3. For the corrugated
panels, for example, both the upper and lower corrugation caps are assumed to be
of equal width due to symmetry. The plate element widths, b2 and b3 , define the
corrugated panel web angle. The panel webs are made of only ±45-deg plies, Fig.
11.4.4, that run continuously across the width of the cross section. Such continuous
plies help reduce manufacturing costs and eliminate stress concentrations that could
occur at the ±45-deg ply termination points. In the plate elements which make up the
caps, 0-deg plies are included between the layers of ±45-deg fibers. Thus, the entire
laminate is defined by two thickness design variables, t1 and t2, relating to the ±45-deg
and 0-deg plies, respectively. Cross-sectional details of the other configurations can
be obtained from Ref. 43.
The loads considered in Ref. 43 are combined in-plane axial compression (Nx),
shear (Nxy), and pressure (p) loads with magnitudes typical of an inboard wing rib
fuel closeout cell for a large transport aircraft. In the present example a load index of
Nx/L, where L is the panel length, is used with values ranging from 0.3 to 1000 lb/in².
This range includes loadings above and below typical rib loads so that design trends
for panels for other subcomponents, such as a wing skin, are covered.
The effect of axial compression load intensity on the structural efficiency and
geometry of all the panel configurations considered in the present study is shown in
Fig. 11.4.5. The tailored corrugated panel concept with different laminates in the
corrugation crowns and webs is the most structurally efficient configuration. The
corrugated panel concept with a continuous laminate is the next most structurally
efficient concept, followed by the blade-stiffened panel concept, the hat-stiffened panel
concept, and the unstiffened flat panels (see Fig. 11.4.5). The weight differences in
this load range are due largely to the modeling of the laminates that define the
panel geometry. Each configuration is modeled such that a minimum number of plies
necessary to define the geometry is used, and that number differs for each model.
For low axial load intensity, all configurations, excluding the unstiffened plate, are
constrained by the same minimum gage ply thickness of 0.005 inches on all the plies.
Therefore, the weight of a panel is almost directly proportional to the number of
layers in the cross section and is independent of the intensity of the load.
The design requirement that does not allow buckling of the panels at the limit
load is appropriate for wing and empennage cover panels because of nonstructural
considerations such as maintaining a good aerodynamic surface. However, fuselage
panels of metallic aircraft structures are commonly designed to buckle below their
ultimate loads. The lack of sufficient information on the postbuckling response of
composite panels hindered the application of such a design philosophy in the past.
Realization of the possible weight savings has kindled interest in designing postbuckled
panels in recent years (see Dickson et al. [48, 49] and Shin et al. [50]). A nonlinear
theory for the prediction of behavior of locally imperfect stiffened panels has
been incorporated by Bushnell into the design optimization program PANDA2 [51].
However, because of the complexity and serious computational cost involved in
postbuckling analysis of stiffened panel structures, optimal design of such panels is still
far from being a routine practice.
for optimization purposes. The TSO program was used in several design studies of
aeroelastic tailoring applications to existing aircraft [54, 58, 59].
Another popular program for the design of lifting surfaces subject to strength
and aeroelastic constraints is the finite element based program FASTOP developed
by Grumman [60]. The program employs optimality criteria methods (see Chapter
9), and is also capable of handling flutter constraints. Optimality criteria meth-
ods are very efficient for designs subject to a single constraint. Thus, despite the
costly finite element analysis involved, the cost of optimization was kept manageable
through the use of sequential treatment of constraints. First the stress constraints
are treated by the non-optimal Fully Stressed Design (FSD, see Section 9.1), fol-
lowed by a 'uniform-cost-effectiveness' optimality criterion (Section 9.3) for each of
the aeroelastic constraints. The process is repeated with the strength and aeroelastic
constraints until convergence is achieved. Design variables are limited to thickness
or cross-sectional areas, and ply orientations are not allowed to change during the
design.
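The FSD phase can be sketched with the standard stress-ratio resizing rule of Section 9.1; this is a generic illustration for a structure whose member forces are known, not FASTOP code:

```python
def fsd_step(areas, member_forces, sigma_allow):
    """One Fully Stressed Design pass: scale each member area by the
    ratio of its stress to the allowable, a_new = a * |sigma|/sigma_allow.
    With sigma = F/a this reduces to a_new = |F|/sigma_allow, so a
    statically determinate structure converges in one step."""
    return [a * (abs(force) / a) / sigma_allow
            for a, force in zip(areas, member_forces)]
```

For indeterminate structures the member forces redistribute after each resize, so the pass is repeated to convergence; as the text notes, the result is generally near-optimal rather than optimal.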
A more recent finite element based design program is ASTROS (Automated
Structural Optimization Systems) [61] developed by Northrop under an Air Force
contract. ASTROS is designed as an automated procedure to address interdisci-
plinary requirements during preliminary design of aerospace structures. The struc-
tural analysis module of ASTROS is derived from the public domain version of the
NASTRAN finite element code and forms the core of the procedure. The structural
analysis module is used to obtain structural response to applied mechanical, gravita-
tional, aerodynamic, induced thermal, and time dependent loads. Design constraints
include limits on stresses, strains, displacements, modal frequencies, flutter response,
aeroelastic lift effectiveness, and aileron effectiveness. Design variables that can be
used in the process are element areas and thicknesses, structural inertias and con-
centrated masses. Membrane and bending elements used in the structural analysis
provide full-composite modeling capability. Individual ply thicknesses of the mate-
rial can be used as design variables, but the ply orientation design variables are not
allowed. In order to reduce the number of design variables and to assure physically
meaningful dimensions, design variable linking is used. The design variable linking
is implemented together with a procedure that divides the design variables into two
groups that are identified as global and local variables. A global design variable can
be specified as a weighted sum of a number of local design variables. Similar to
TSO, shape function type of linking can be used to define shapes such as a smooth
thickness variation along the span direction. The design optimization module used
in ASTROS is the ADS (Automated Design Synthesis) [62] program. All sensitivities
of the objective function and of the constraints are calculated based on analytical
derivatives. Both direct and adjoint-variable methods (see Chapter 7) are available.
11.5 Design Uncertainties
structures are known to be sensitive to changes in load conditions and imperfections.
Because of the increased number of variables which enable designers to tailor the design
closer to the desired specifications, this sensitivity may be heightened for composite
structures. The simplest example of sensitivity to changes in the load condition is the
case of a laminate designed to carry uniaxial loads [63]. For this application, it can
easily be demonstrated that the best design is the one that has all the layers oriented
along the load direction. It is also well known that this design is extremely poor for
carrying loads transverse to the fiber direction. Therefore, any change in the direc-
tion of the applied design load is likely to result in a failure, whereas a similar design
made of a conventional isotropic material would be capable of carrying a transverse
load of magnitude equal to the original design load.
Another complication in designing optimal composite structures is sometimes
the difficulty in identifying and imposing proper strength constraints. Not only are the
load and stress distributions functions of the ply thickness and fiber orientation
variables, but the strength properties also depend on these variables. Failure of
composite laminates is largely due to highly localized stresses. The number of possible
local failure modes is large, and these failure modes are generally micromechanically
governed and complex. Fiber breaking, matrix cracking, fiber-matrix debonding, and
separation of individual layers can result in surface and through-the-thickness cracks,
splits, and delaminations. Under compressive loads, even the instability of fibers on
a microscopic scale (often referred to as fiber microbuckling) was proposed as a failure
mechanism, although based on more recent studies compression failures for high-
performance composites are believed to be strength-related failures. Furthermore,
failure modes can interact with one another, making the strength prediction even
more difficult.
Some of the basic assumptions used for simplification of the laminate stress anal-
ysis that reduce the three-dimensional nature of the laminated composites to two
dimensions may also cause loss of information important for failure predictions. It is
well known that laminated composite plates can locally display a three-dimensional
stress state. The most common examples of these three-dimensional effects are free-
edge stresses, and interlaminar stresses at the stiffener-skin interface of stiffened pan-
els. It is important that designers be aware of such local effects during the formulation
of the optimization problem and include appropriate constraints to account for them.
It is only fair to say that some of the design-related issues of composite failures
are not well understood. Sometimes strength quantities that are needed for imple-
mentation of a certain stress constraint may not be available. For example, based
on their experience with metallic materials, designers often look for a compressive
material strength limit that they can include in an optimization problem. It can be
argued that the compressive failure strength is a highly problem-dependent quantity,
rather than a material strength parameter. In some applications, the lack of under-
standing and availability of predictive models for certain design considerations may
hamper the design effort. For example, unlike metallic materials, composites have
been found to be sensitive to low-velocity impact loadings. Currently, there is no
predictive model that can realistically be used for designing laminates under impact
damage conditions. Some of these topics are still under development and constitute
a major effort in the area of mechanics of composite materials.
Given these difficulties, designers sometimes resort to practical guidelines. Rather
than using ply orientation angles as design variables, designers often fix them to
prescribed practical angles such as 0-deg, ±45-deg, and 90-deg. Even if the applied
loading is highly directional, such as panels under uniaxial loadings, presence of plies
other than the ones aligned along the load direction provides increased safety for off-
design load conditions such as unexpected transverse loads. In order to assure that
the thickness design variables associated with those plies that are placed based on
intuitive guidelines do not disappear, either lower bounds on those thicknesses are used
or additional loads are specified. For example, application of a certain percentage of
the axial load as shear load leads to non-zero thickness for ±45-deg layers even if the
lower bound on those layers is zero.
The selection of a stacking sequence for a laminate is also guided by intuitive
considerations. For example, use of ±45-deg plies as the outside layers of a laminate
is preferred because of damage tolerance considerations. Another practical guideline
is not to allow more than four identical contiguous plies. This guideline helps to reduce
the interlaminar stresses between plies with different orientations. In order to satisfy
such ply stacking sequence rules, an iterative procedure may be used as outlined in
Ref. [64]. If the branch-and-bound algorithm with ply identity variables is used, this
requirement can easily be implemented through the use of Eq. (11.3.5) as described
earlier.
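The contiguous-ply rule is easy to state as a predicate over a stacking sequence; a minimal sketch follows (the function name is ours, not from Ref. 64):

```python
def violates_contiguity(stack, max_contiguous=4):
    """Return True if more than max_contiguous identical ply angles
    appear consecutively in the stacking sequence."""
    run = 1
    for prev, cur in zip(stack, stack[1:]):
        run = run + 1 if cur == prev else 1   # extend or reset the run
        if run > max_contiguous:
            return True
    return False
```

In a genetic or branch-and-bound search, such a check can either reject candidate stacks outright or be attached as a penalty.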
11.6 Exercises
1. For a unidirectional laminate under uniform applied stresses, σx, σy, and τxy,
show that the stationary values of the Tsai-Hill function
(11.6.1)
graphite-epoxy laminate with maximum effective shear stiffness Gxy. The laminate
must also meet the following stiffness requirements

Ex ≥ 17.5 × 10⁶ psi, Ey ≥ 5.8 × 10⁶ psi, and 0.1 ≤ νxy ≤ 0.3 .

Engineering properties of T300-5208 graphite-epoxy material along its principal
material directions are

E1 = 26.25 × 10⁶ psi, E2 = 1.49 × 10⁶ psi, G12 = 1.04 × 10⁶ psi, and ν12 = 0.28 .
3. Show that a quasi-isotropic laminate [0ᵢ, 90ᵢ, −45ᵢ, +45ᵢ] can be replaced by
[90ⱼ, 0ₖ, −45ᵢ, +45ᵢ] with an identical D matrix by suitably selecting j and k so
that j + k = 2i (note: j and k may be non-integers).
4. For a laminate that is made up of an integer number of plies with 0-, ±30-, ±60-,
and 90-deg orientations, the design space is shown in Fig. 11.3.1-b.
a) Complete the figure by putting the stacking sequences of laminates next to the
appropriate discrete design points on the figure.
b) If the laminate is required to have a Poisson's ratio νxy greater than 0.3,
determine the stacking sequence that maximizes the transverse modulus Ey.
5. The skin laminate of a simply supported blade-stiffened panel shown in Figure
11.6.1 is a [±45ₙ]ₛ construction, and the stiffeners are made of unidirectional laminae.
Determine the longitudinal smeared stiffness EI which can be used for the global
buckling load calculation presented in Table 11.4.1. Assuming the thicknesses of
individual plies to be continuously variable, determine the minimum weight design
for an axial compression of Nx = 10000 lb/in. Consider only buckling constraints.
11.7 References
[1] Jones, R. M., Mechanics of Composite Materials, McGraw-Hill Book Co., New
York, pp. 45-57, 1975.
[2] Tsai, S. W., and Pagano, N. J., "Invariant Properties of Composite Materials,"
in Composite Materials Workshop, (Eds. Tsai, S. W., Halpin, J. C., Pagano, N. J.)
Technomic Publishing Co., Westport, pp. 233-253, 1968.
[3] Caprino, G., and Crivelli Visconti, I., "A Note on Specially Orthotropic Lami-
nates," J. Comp. Matls., 16, pp. 395-399, 1982.
[4] Gunnink, J. W., "Comment on 'A Note on Specially Orthotropic Laminates'," J.
Comp. Matls., 17, pp. 508-510, 1983.
[5] Kandil, N., and Verchery, G., "New Methods of Design for Stacking Sequences of
Laminates," Proceedings of the International Conference on "Computer Aided
Design in Composite Material Technology," Eds. Brebbia, C. A., de Wilde, W.
P., and Blain, W. R., pp. 243-257, 1988.
[6] Schmit, L. A., and Farshi, B., "Optimum Laminate Design for Strength and
Stiffness," Int. J. Num. Meth. Engng., 7, pp. 519-536, 1973.
[7] Park, W. J., "An Optimal Design of Simple Symmetric Laminates Under the
First Ply Failure Criterion," J. Comp. Matls., 15, pp. 341-355, 1982.
[8] Massard, T. K., "Computer Sizing of Composite Laminates for Strength," J.
Reinf. Plastics and Composites, 3, pp. 300-345, 1984.
[9] Tsai, S. W., and Hahn, H. T., Introduction to Composite Materials, Technomic
Publishing Co., Inc., Lancaster, Pa., pp. 315-325, 1980.
[10] Tsai, S. W., "Strength Theories of Filamentary Structures," in R. T. Schwartz
and H. S. Schwartz (eds.), Fundamental Aspects of Fiber Reinforced Plastic,
Wiley Interscience, New York, pp. 3-11, 1968.
[11] Brandmaier, H. E., "Optimum Filament Orientation Criteria," J. Composite Ma-
terials, 4, pp. 422-425, 1970.
[12] Miki, M., "Material Design of Composite Laminates with Required In-Plane Elas-
tic Properties," Progress in Science and Engineering of Composites, Eds., T.
the AIAA/ASME/ASCE/AHS/ASC 33rd Structures, Structural Dynamics,
and Materials Conference, Dallas, TX, April 1992.
[28] Lombardi, M., "Ottimizzazione di Lastre in Materiale Composito con l'uso di
un Metodo di Annealing Simulato," Tesi di Laurea, Department of Structural
Mechanics, University of Pavia, 1990.
[29] Le Riche, R., and Haftka, R. T., "Optimization of Laminate Stacking Sequence for
Buckling Load Maximization by Genetic Algorithm," submitted for presentation
at the AIAA/ASME/ASCE/AHS/ASC 33rd Structures, Structural Dynamics,
and Materials Conference, Dallas, TX, April 1992.
[30] Stroud, W. J., and Agranoff, N., "Minimum-mass Design of Filamentary Com-
posite Panels Under Combined Loadings: Design Procedure Based on Simplified
Buckling Equations," NASA TN D-8257, 1976.
[31] Timoshenko, S., Theory of Elastic Stability, McGraw-Hill, New York, 1936.
[32] Stein, M., and Mayers, J., "Compressive Buckling of Simply Supported Curved
Plates and Cylinders of Sandwich Construction," NACA TN 2601, 1952.
[33] Advanced Composites Design Guide, Vols. I-V, Third Edition, U.S. Air Force,
Jan. 1973.
[34] Lekhnitskii, S. G., Anisotropic Plates, Translated by Tsai, S. W., and Cheron,
T., Gordon and Breach Sci. Publ., Inc., New York, 1968.
[35] Hague, D. S., and Glatt, C. R., "A Guide to the Automated Engineering and
Scientific Optimization Program, AESOP," NASA CR-73201, April 1968.
[36] Stroud, W. J., Agranoff, N., and Anderson, M. S., "Minimum-Mass Design of
Filamentary Composite Panels Under Combined Loads: Design Procedure Based
on a Rigorous Buckling Analysis," NASA TN D-8417, July 1977.
[37] Wittrick, W. H., and Williams, F. W., "Buckling and Vibration of Anisotropic
or Isotropic Plate Assemblies Under Combined Loadings, " Int. J. Mech. Sci., 16,
4, pp. 209-239, April 1974.
[38] Plank, R. J., and Williams, F. W., "Critical Buckling of Some Stiffened Panels in
Compression, Shear and Bending, " Aeronautical Q., XXV, Part 3, pp. 165-179,
August 1974.
[39] Vanderplaats, G. N., "CONMIN - A Fortran Program for Constrained Function
Minimization, User's Manual," NASA TM X-62,282, 1973.
[40] Stroud, W. J., and Anderson, M. S., "PASCO: Structural Panel Analysis and Siz-
ing Code, Capability and Analytical Foundations, " NASA TM 80181, November
1981.
[41] Anderson, M. S., Stroud, W. J., Durling, B. J., and Hennessy, K. W., "PASCO:
Structural Panel Analysis and Sizing Code, User's Manual, " NASA TM 80182,
November 1981.
[42] Giles, G. L., and Anderson, M. S., "Effects of Eccentricities and Lateral Pressure
on the Design of Stiffened Compression Panels," NASA TN D-6784, June 1972.
[43] Swanson, G. D., and Gürdal, Z., "Structural Efficiency Study of Graphite-Epoxy
Aircraft Rib Structures," J. Aircraft, 27 (12), pp. 1011-1020, 1990.
[44] Stroud, W.J., Greene, W.H., and Anderson, M.S., "Buckling Loads of Stiffened
Panels Subjected to Combined Longitudinal Compression and Shear: Results Ob-
tained With PASCO, EAL, and STAGS Computer Programs," NASA TP 2215,
January 1984.
[45] Williams, F.W., and Kennedy, D., "User's Guide to VICON, VIPASA with Con-
straints," Department of Civil Engineering and Building Technology, University
of Wales Institute of Science and Technology, August, 1984.
[46] Williams, F.W., and Anderson, M.S., "Incorporation of Lagrangian Multipliers
into an Algorithm for Finding Exact Natural Frequencies or Critical Buckling
Loads," Int. J. Mech. Sci., 25, 8, pp. 579-584, 1983.
[47] Butler, R., and Williams, F.W., "Optimum Design Features of VICONOPT, an
Exact Buckling Program for Prismatic Assemblies of Anisotropic Plates," Pro-
ceedings of the AIAA/ ASME/ ASCE/ AHS/ ASC 31st Structures, Structural Dy-
namics, and Materials Conference, Long Beach, CA, Part 2, pp. 1289-1299,1990.
[48] Dickson, J. N., Cole, R. T., and Wang, J. T. S., "Design of Stiffened Composite
Panels in the Postbuckling Range," in Fibrous Composites in Structural Design,
Eds. Lenoe, E. M., Oplinger, D. W., and Burke, J. J., Plenum Press, New York,
pp. 313-327, 1980.
[49] Dickson, J. N., and Biggers, S. B., "Design and Analysis of a Stiffened Composite
Fuselage Panel, " NASA CR-159302, August 1980.
[50] Shin, D. K., Gürdal, Z., and Griffin, O. H., Jr., "Minimum-Weight Design of
Laminated Composite Plates for Postbuckling Performance," Proceedings of the
AIAA/ASME/ASCE/AHS/ASC 32nd Structures, Structural Dynamics, and Ma-
terials Conference, Baltimore, Maryland, Part 1, pp. 257-266, 1991.
[51] Bushnell, D., "PANDA2 - Program for Minimum Weight Design of Stiffened,
Composite, Locally Buckled Panels," Comput. Struct., 25 (4), pp. 469-605, 1987.
[52] Shirk, M. H., Hertz, T. J., and Weisshaar, T. A., "Aeroelastic Tailoring - Theory,
Practice, and Promise," J. Aircraft, 23 (1), pp. 6-18, 1986.
[53] Lynch, R. W., and Rogers, W. A., "Aeroelastic Tailoring of Composite Materials
to Improve Performance," Proceedings of the AIAA/ASME/SAE 17th Struc-
tures Structural Dynamics and Materials Conference, King of Prussia, PA., May
5-7, pp. 61-68, 1976.
[54] McCullers, L. A., "Automated Design of Advanced Composite Structures, " Pro-
ceedings of the ASME Structural Optimization Symposium, AMD-7, pp. 119-
133, 1974.
[55] McCullers, L. A., and Lynch, R. W., "Dynamic Characteristics of Advanced Fil-
amentary Composite Structures, " AFFDL-TR-73-111, vol. II, Sept. 1974.
[56] Haftka, R. T., "Structural Optimization with Aeroelastic Constraints: A Survey
of US Applications, " Int. J. of Vehicle Design, 7 (3/4), pp. 381-392, 1986.
[57] McCullers, L. A., and Lynch, R. W., "Composite Wing Design for Aeroelastic
Tailoring Requirements, " Air Force Conference on Fibrous Composites in Flight
Vehicle Design, Sept. 1972.
[58] Fant, J. A., "An Advanced Composite Wing for the F-16," paper presented at
the 22nd National SAMPE Symposium and Exhibition, San Diego, pp. 773-783,
April 1977.
[59] Gimmestad, D., "Aeroelastic Tailoring of a Composite Winglet for KC-135,"
AIAA Paper No. 81-0607, presented at the AIAA/ASME/ASCE/AHS 22nd
Structures, Structural Dynamics and Materials Conference, Atlanta, GA, Part
2, pp. 373-376, April 1981.
[60] Wilkinson, K., Markowitz, J., Lerner, E., George, D., and Batill, S. M., "FASTOP:
A Flutter and Strength Optimization Program for Lifting Surface Structures,"
J. Aircraft, 14 (6), pp. 581-587, 1977.
[61] Neill, D. J., Johnson, E. H., and Canfield, R., "ASTROS - A Multidisciplinary
Automated Structural Design Tool," J. Aircraft, 27 (12), pp. 1021-1027, 1990.
[62] Vanderplaats, G. N., "ADS - A Fortran Program for Automated Design Syn-
thesis," NASA-CR-177985, Sept. 1985.
[63] Stroud, W. J., "Optimization of Composite Structures," NASA TM 84544, Au-
gust 1982.
[64] Nagendra, S., Haftka, R. T., Gürdal, Z., and Starnes, J. H., Jr., "Design of a
Blade-Stiffened Composite Panel with a Hole," Composite Structures, Vol. 18
(3), pp. 195-219, 1991.
Name Index
Subject Index