
ELEMENTS OF STRUCTURAL OPTIMIZATION

SOLID MECHANICS AND ITS APPLICATIONS


Volume 11

Series Editor: G.M.L. GLADWELL


Solid Mechanics Division, Faculty of Engineering
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1

Aims and Scope of the Series


The fundamental questions arising in mechanics are: Why?, How?, and How much?
The aim of this series is to provide lucid accounts written by authoritative research-
ers giving vision and insight in answering these questions on the subject of
mechanics as it relates to solids.
The scope of the series covers the entire spectrum of solid mechanics. Thus it
includes the foundation of mechanics; variational formulations; computational
mechanics; statics, kinematics and dynamics of rigid and elastic bodies; vibrations
of solids and structures; dynamical systems and chaos; the theories of elasticity,
plasticity and viscoelasticity; composite materials; rods, beams, shells and
membranes; structural control and stability; soils, rocks and geomechanics;
fracture; tribology; experimental mechanics; biomechanics and machine design.
The median level of presentation is the first year graduate student. Some texts are
monographs defining the current state of the field; others are accessible to final
year undergraduates; but essentially the emphasis is on readability and clarity.

For a list of related mechanics titles, see final pages.


Elements of
Structural Optimization
Third revised and expanded edition

by

RAPHAEL T. HAFTKA
Department of Aerospace and Ocean Engineering,
Virginia Polytechnic Institute and State University,
Blacksburg, Virginia, U.S.A.

and
ZAFER GÜRDAL
Department of Engineering Science and Mechanics,
Virginia Polytechnic Institute and State University,
Blacksburg, Virginia, U.S.A.

SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.
Library of Congress Cataloging-in-Publication Data

Haftka, Raphael T.
Elements of structural optimization / by Raphael T. Haftka and
Zafer Gürdal. -- 3rd rev. and expanded ed.
p. cm. -- (Solid mechanics and its applications ; v. 11)
Includes bibliographical references and indexes.
ISBN 978-0-7923-1505-6 ISBN 978-94-011-2550-5 (eBook)
DOI 10.1007/978-94-011-2550-5
1. Structural optimization. I. Gürdal, Zafer. II. Title.
III. Series.
TA658.8.H34 1991
624.1'7--dc20 91-37690
CIP

ISBN 978-0-7923-1505-6

Printed on acid-free paper

All Rights Reserved
© 1992 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1992
Softcover reprint of the hardcover 1st edition 1992
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
This book is dedicated to

Rose

PIllar and Erin


Contents

Preface
Chapter 1. Introduction
1.1 Function Optimization and Parameter Optimization
1.2 Elements of Problem Formulation
    Design Variables
    Objective Function
    Constraints
    Standard Formulation
1.3 The Solution Process
1.4 Analysis and Design Formulations
1.5 Specific Versus General Methods
1.6 Exercises
1.7 References
Chapter 2. Classical Tools in Structural Optimization
2.1 Optimization Using Differential Calculus
2.2 Optimization Using Variational Calculus
    Introduction to the Calculus of Variations
2.3 Classical Methods for Constrained Problems
    Method of Lagrange Multipliers
    Function Subjected to an Integral Constraint
    Finite Subsidiary Conditions
2.4 Local Constraints and the Minmax Approach
2.5 Necessary and Sufficient Conditions for Optimality
    Elastic Structures of Maximum Stiffness
    Optimal Design of Euler-Bernoulli Columns
    Optimum Vibrating Euler-Bernoulli Beams
2.6 Use of Series Solutions in Structural Optimization
2.7 Exercises
2.8 References
Chapter 3. Linear Programming
3.1 Limit Analysis and Design of Structures Formulated as LP Problems
3.2 Prestressed Concrete Design by Linear Programming
3.3 Minimum Weight Design of Statically Determinate Trusses
3.4 Graphical Solutions of Simple LP Problems
3.5 A Linear Program in a Standard Form
    Basic Solution
3.6 The Simplex Method
    Changing the Basis
    Improving the Objective Function
    Generating a Basic Feasible Solution: Use of Artificial Variables
3.7 Duality in Linear Programming
3.8 An Interior Method: Karmarkar's Algorithm
    Direction of Move
    Transformation of Coordinates
    Move Distance
3.9 Integer Linear Programming
    Branch-and-Bound Algorithm
3.10 Exercises
3.11 References
Chapter 4. Unconstrained Optimization
4.1 Minimization of Functions of One Variable
    Zeroth Order Methods
    First Order Methods
    Second Order Method
    Safeguarded Polynomial Interpolation
4.2 Minimization of Functions of Several Variables
    Zeroth Order Methods
    First Order Methods
    Second Order Methods
    Applications to Analysis
4.3 Specialized Quasi-Newton Methods
    Exploiting Sparsity
    Coercion of Hessians for Suitability with Quasi-Newton Methods
    Making Quasi-Newton Methods Globally Convergent
4.4 Probabilistic Search Algorithms
    Simulated Annealing
    Genetic Algorithms
4.5 Exercises
4.6 References
Chapter 5. Constrained Optimization
5.1 The Kuhn-Tucker Conditions
    General Case
    Convex Problems
5.2 Quadratic Programming Problems
5.3 Computing the Lagrange Multipliers
5.4 Sensitivity of Optimum Solution to Problem Parameters
5.5 Gradient Projection and Reduced Gradient Methods
5.6 The Feasible Directions Method
5.7 Penalty Function Methods
    Exterior Penalty Function
    Interior and Extended Interior Penalty Functions
    Unconstrained Minimization with Penalty Functions
    Integer Programming with Penalty Functions
5.8 Multiplier Methods
5.9 Projected Lagrangian Methods (Sequential Quadratic Prog.)
5.10 Exercises
5.11 References
Chapter 6. Aspects of the Optimization Process in Practice
6.1 Generic Approximations
    Local Approximations
    Global and Midrange Approximations
6.2 Fast Reanalysis Techniques
    Linear Static Response
    Eigenvalue Problems
6.3 Sequential Linear Programming
6.4 Sequential Nonlinear Approximate Optimization
6.5 Special Problems Associated with Shape Optimization
6.6 Optimization Packages
6.7 Test Problems
    Ten-Bar Truss
    Twenty-Five-Bar Truss
    Seventy-Two-Bar Truss
6.8 Exercises
6.9 References
Chapter 7. Sensitivity of Discrete Systems
7.1 Finite Difference Approximations
    Accuracy and Step Size Selection
    Iterative Methods
    Effect of Derivative Magnitude on Accuracy
7.2 Sensitivity Derivatives of Static Displacement and Stress Constraints
    Analytical First Derivatives
    Second Derivatives
    The Semi-Analytical Method
    Nonlinear Analysis
    Sensitivity of Limit Loads
7.3 Sensitivity Calculations for Eigenvalue Problems
    Sensitivity Derivatives of Vibration and Buckling Constraints
    Sensitivity Derivatives for Non-Hermitian Eigenvalue Problems
    Sensitivity Derivatives for Nonlinear Eigenvalue Problems
7.4 Sensitivity of Constraints on Transient Response
    Equivalent Constraints
    Derivatives of Constraints
    Linear Structural Dynamics
7.5 Exercises
7.6 References
Chapter 8. Introduction to Variational Sensitivity Analysis
8.1 Linear Static Analysis
    The Direct Method
    The Adjoint Method
    Implementation Notes
8.2 Nonlinear Static Analysis and Limit Loads
    Static Analysis
    Limit Loads
    Implementation Notes
8.3 Vibration and Buckling
    The Direct Method
    The Adjoint Method
8.4 Static Shape Sensitivity
    The Material Derivative
    Domain Parametrization
    The Direct Method
    The Adjoint Method
8.5 Exercise
8.6 References
Chapter 9. Dual and Optimality Criteria Methods
9.1 Intuitive Optimality Criteria Methods
    Fully Stressed Design
    Other Intuitive Methods
9.2 Dual Methods
    General Formulation
    Application to Separable Problems
    Discrete Design Variables
    Application with First Order Approximations
9.3 Optimality Criteria Methods for a Single Constraint
    The Reciprocal Approximation for a Displacement Constraint
    A Single Displacement Constraint
    Generalization for Other Constraints
    Scaling-based Resizing
9.4 Several Constraints
    Reciprocal-Approximation Based Approach
    Scaling-based Approach
    Other Formulations
9.5 Exercises
9.6 References
Chapter 10. Decomposition and Multilevel Optimization
10.1 The Relation between Decomposition and Multilevel Formulation
10.2 Decomposition
10.3 Coordination and Multilevel Optimization
10.4 Penalty and Envelope Function Approaches
10.5 Narrow-Tree Multilevel Problems
    Simultaneous Analysis and Design
    Other Applications
10.6 Decomposition in Response and Sensitivity Calculations
10.7 Exercises
10.8 References
Chapter 11. Optimum Design of Laminated Composite Materials
11.1 Mechanical Response of a Laminate
    Orthotropic Lamina
    Classical Laminated Plate Theory
    Bending, Extension, and Shear Coupling
11.2 Laminate Design
    Design of Laminates for In-plane Response
    Design of Laminates for Flexural Response
11.3 Stacking Sequence Design
    Graphical Stacking Sequence Design
    Penalty Function Formulation
    Integer Linear Programming Formulation
    Probabilistic Search Methods
11.4 Design Applications
    Stiffened Plate Design
    Aeroelastic Tailoring
11.5 Design Uncertainties
11.6 Exercises
11.7 References
Name Index
Subject Index

Preface

The field of structural optimization is still a relatively new field undergoing rapid
changes in methods and focus. Until recently there was a severe imbalance between
the enormous amount of literature on the subject, and the paucity of applications
to practical design problems. This imbalance is being gradually redressed. There is
still no shortage of new publications, but there are also exciting applications of the
methods of structural optimization in the automotive, aerospace, civil engineering,
machine design and other engineering fields. As a result of the growing pace of
applications, research into structural optimization methods is increasingly driven by
real-life problems.
Most engineers who design structures employ complex general-purpose software
packages for structural analysis. Often they do not have any access to the source
program, and even more frequently they have only scant knowledge of the details of
the structural analysis algorithms used in these software packages. Therefore the major
challenge faced by researchers in structural optimization is to develop methods that
are suitable for use with such software packages. Another major challenge is the high
computational cost associated with the analysis of many complex real-life problems.
In many cases the engineer who has the task of designing a structure cannot afford
to analyze it more than a handful of times.
This environment motivates a focus on optimization techniques that call for mini-
mal interference with the structural analysis package, and require only a small number
of structural analysis runs. A class of techniques of this type, pioneered by Lucien
Schmit, and which are becoming widely used, are referred to in this book as sequential
approximate optimization techniques. These techniques use the analysis package
for the purpose of constructing an approximation to the structural design problem,
and then employ various mathematical optimization techniques to solve the approx-
imate problem. The optimum of the approximate problem is then used as a basis for
performing one or more structural analyses for the purpose of updating or refining
the approximate design problem. Most of the approximate design problems are based
on derivatives of the structural response with respect to design parameters.
In the new environment the structural designer is typically called upon to provide
the interface between a commercially available analysis program, and a commercially
available optimization software package. The three most important ingredients of
the interface are: sensitivity derivative calculation, construction of an approximate
problem, and evaluation of results for the purpose of fine-tuning the approximate
problem or the optimization method for maximum efficiency and reliability.
This textbook is organized so that its middle part, Chapters 6, 7 and 8, deals with
the two issues of constructing the approximate problem and obtaining sensitivity
derivatives. Evaluating the results of the optimization calls for a basic understanding
of optimality conditions and optimization methods. This is dealt with in Chapters
1 through 5. The last three chapters deal with the specialized topics of optimality
criteria methods, multi-level optimization, and applications to composite materials.
The material in the textbook can be used in various ways in teaching a graduate
course in structural optimization, depending on the available amount of time, and
whether students have prior preparation in optimization techniques.
Without prior preparation in optimization techniques it is suggested that the
minimum time requirement is one semester. It is suggested to cover Chapter 1,
Sections 2.1, 2.2 and 2.3 of Chapter 2, Sections 3.1 and 3.4 of Chapter 3, some
material from Chapters 4 and 5 depending on the instructor's favorite optimization
methods, most of Chapter 6 and the first two sections of Chapter 7. With a
two-quarter sequence it is suggested to cover Chapters 1 and 2, selected topics of Chapters
3 to 5 and Chapter 6 in the first quarter, and Chapters 7, 9, 11 and either Chapter
8 or Chapter 10 in the second quarter. Finally, in a two-semester sequence it is
recommended to cover Chapters 1 through 6 in the first semester, and Chapters 7
through 11 in the second semester.
With a preparatory course in mathematical optimization, one-quarter and
one-semester versions of the course can be considered. A one-quarter version could
include Chapters 1 and 2, Sections 3.1, 3.2, 3.3 and 3.7 of Chapter 3, Chapter
6, the first two sections of Chapter 7, and Chapter 9 or 11. A one-semester version
could include the same parts of Chapters 1 through 7 and then Chapters 9 through
11.
The authors gratefully acknowledge the assistance of Drs. H. Adelman, B.
Barthelemy, J-F. Barthelemy, L. Berke, R. Grandhi, D. Grierson, E. Haug, R. Plaut,
J. Sobieski, and J. Starnes in reviewing parts of the manuscript and offering critical
comments.

Chapter 1. Introduction

Optimization is concerned with achieving the best outcome of a given operation
while satisfying certain restrictions. Human beings, guided and influenced by their
natural surroundings, almost instinctively perform all functions in a manner that
economizes energy or minimizes discomfort and pain. The motivation is to exploit
the available limited resources in a manner that maximizes output or profit. The
early inventions of the lever or the pulley mechanisms are clear manifestations of
man's desire to maximize mechanical efficiency. Innumerable other such examples
abound in the saga of human history. Douglas Wilde [1] provides an interesting
account of the origin of the word optimum and the definition of an optimal design.
We will paraphrase Wilde and offer the definition of an optimal design as being 'the
best feasible design according to a preselected quantitative measure of effectiveness'.
As it is beyond the scope of this text to trace the historical development of
optimization, we list a few of the more recent references on the subject of structural
optimization. These references [2-19] trace the development of the field of structural
optimization dating back to the eighteenth century. The importance of minimum
weight design of structures was first recognized by the aerospace industry where
aircraft structural designs are often controlled more by weight than by cost
considerations. In other industries dealing with civil, mechanical and automotive engineering
systems, cost may be the primary consideration although the weight of the system
does affect its cost and performance. A growing realization of the scarcity of raw ma-
terials and a rapid depletion of our conventional energy sources is being translated
into a demand for lightweight, efficient and low cost structures. This demand in turn
emphasizes the need for engineers to be cognizant of techniques for weight and cost
optimization of structures. The objective of this text is to acquaint students and
practicing engineers with these techniques.

1.1 Function Optimization and Parameter Optimization

Before the advent of high speed computation most of the solutions of structural
analysis problems were based on formulations employing differential equations. These
differential equations were solved analytically (e.g., by using infinite series) with
occasional use of numerical methods at the very end of the solution process. The
unknowns were functions (representing displacements, stresses, etc.) defined over a
continuum.

Figure 1.1.1 Beam example.

The early beginning of structural optimization followed the same route, in that
the unknowns were functions defining the optimal structural properties. Consider,
for example, the beam shown in Figure 1.1.1. Structural analysis is concerned with
finding the displacement $w(x)$ of the beam by solving the well-known governing equation

$$\frac{d^2}{dx^2}\left(EI\,\frac{d^2 w}{dx^2}\right) = q(x). \tag{1.1.1}$$

The structural designer may want to find the optimum distribution of the moment of
inertia $I(x)$ of the beam along its length. Of course, the notion of optimality requires
that we have an objective function that we wish to maximize or to minimize. For
example, the objective function may be the mass of the beam. For many common
beam cross sections the mass $m$ is given as

$$m = c\int_0^l I^p(x)\,dx, \tag{1.1.2}$$

where the exponent $p$ is usually between 0.4 and 0.5, and $c$ is a known constant. An
optimization problem typically involves a number of constraints. Without any
constraint the optimum beam would have zero moment of inertia and zero mass. In the
design of a beam, a typical constraint would be to limit the maximum displacement
of the beam to some specified allowable $w_0$,

$$w_{\max} = \max_{0 \le x \le l} w(x) \le w_0. \tag{1.1.3}$$

It is possible to obtain the necessary conditions for optimality in the form of
a differential equation in $I(x)$ and $w(x)$. The mathematical discipline that deals
with this type of problem is called the calculus of variations, and is briefly discussed
in Chapter 2. The class of structural optimization problems that seeks an optimum
structural function is called function or distributed parameter structural optimization.
In the late fifties and early sixties high speed electronic computers had a profound
effect on structural analysis solution procedures. Techniques that were well suited to
computer implementation, in particular the finite element method (FEM), became
dominant. The finite element method discretizes the structure at the very beginning
of the analysis, so that the unknowns in the analysis are discrete values of displace-
ments and stresses at nodes of the finite element model, rather than functions. The
differential equations solved by earlier analysts are replaced by systems of algebraic
equations for the variables that describe the discretized system.
The same transformation began to take hold in the early sixties in the field of
structural optimization. When optimizing a structure discretized by finite elements
it is natural to discretize the structural properties which are optimized. Consider
again the beam example of Figure 1.1.1. A finite element solution for the displace-
ments starts by dividing the beam into a number of constant-property segments or
finite elements. An optimization of the same beam would naturally use the moments
of inertia of the segments as design parameters. Thus, instead of searching for an
optimum function, we will be looking for the optimum values of a number of
parameters. The mathematical discipline that deals with parameter optimization is called
mathematical programming. The bulk of this text (Chapters 3-7, 9-11) is concerned,
therefore, with mathematical programming techniques and their application to
structural optimization problems defined by discretized models. In particular, it is often
implicitly assumed that the structural analysis is based on the finite element method.
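
As a minimal sketch of this change of viewpoint, the Python fragment below evaluates a discretized beam design. It assumes equal-length segments, each with a constant moment of inertia, so that the mass integral of Equation (1.1.2) becomes a finite sum. The displacement check further assumes a cantilever clamped at x = 0 and loaded by a single tip force, which is an illustrative choice and not necessarily the beam of Figure 1.1.1. All function names and numerical values are hypothetical.

```python
def beam_mass(inertias, length, c=1.0, p=0.5):
    """Mass of a beam made of equal-length, constant-property segments.

    Discrete counterpart of m = c * integral_0^l I(x)^p dx.
    """
    dx = length / len(inertias)
    return c * sum(I ** p for I in inertias) * dx


def cantilever_tip_deflection(inertias, length, tip_load, youngs_modulus):
    """Tip deflection of a cantilever (assumed support and load case).

    Unit-load method: w_tip = integral_0^l P (l - x)^2 / (E I(x)) dx,
    integrated exactly over each constant-property segment.
    """
    dx = length / len(inertias)
    w_tip = 0.0
    for i, I in enumerate(inertias):
        a, b = i * dx, (i + 1) * dx
        # Exact integral of (l - x)^2 over the segment [a, b].
        integral = ((length - a) ** 3 - (length - b) ** 3) / 3.0
        w_tip += tip_load * integral / (youngs_modulus * I)
    return w_tip


# Design parameters: one moment of inertia per segment (hypothetical values).
design = [4.0, 4.0, 1.0, 1.0]
mass = beam_mass(design, length=2.0, c=1.0, p=0.5)
w_max = cantilever_tip_deflection(design, length=2.0, tip_load=1.0,
                                  youngs_modulus=1.0)
feasible = w_max <= 1.0  # displacement limit w_0 = 1.0 (assumed)
```

For this load case the largest displacement occurs at the tip, so checking the tip value is enough to test the constraint of Equation (1.1.3). A mathematical programming method would then search over the entries of `design` rather than over a function.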

1.2 Elements of Problem Formulation

1.2.1 Design Variables

The notion of improving or optimizing a structure implicitly presupposes some
freedom to change the structure. The potential for change is typically expressed in terms
of ranges of permissible changes of a group of parameters. Such parameters are
usually called design variables in structural optimization terminology and denoted by
a vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ in this book. Design variables can be cross-sectional
dimensions or member sizes, they can be parameters controlling the geometry of the
structure, its material properties, etc. Design variables may take continuous or dis-
crete values. Continuous design variables have a range of variation, and can take any
value in that range. For example, in the design problem of Figure 1.1.1 the moment
of inertia of any segment of the beam may be considered a continuous design variable.
Discrete design variables can take only isolated values, typically from a list of permis-
sible values. Material design variables are often discrete. If we consider five materials
in the design of the beam, then we can define a design variable that can take any
integer value from one to five to represent the material choice. Design variables that
are commonly treated as continuous are often made discrete due to manufacturing
considerations. For example, if the beam of Figure 1.1.1 is designed to minimize cost,
then we may need to limit ourselves to commercially available cross sections. The
moment of inertia would then cease to be a continuous design variable, and would
become a discrete one.
In most structural design problems we tend to disregard the discrete nature of
the design variables in the solution of the optimization problem. Once the optimum
design is obtained, we then adjust the values of the design variables to the nearest
available discrete value. This approach is taken because solving an optimization
problem with discrete design variables is usually much more difficult than solving a
similar problem with continuous design variables. However, rounding off the design
to the closest integer solution works well when the available values of the design
variables are spaced reasonably close to one another, so that changing the value of a
design variable to the nearest integer does not change the response of the structure
substantially. In some cases the discrete values of the design variables are spaced too
far apart, and we have to solve the problem with discrete variables. This is done by
employing a branch of mathematical programming called integer programming. In
this text it is assumed that design variables are continuous unless otherwise stated.
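
The rounding step described above can be sketched in a few lines of Python. The catalog of available values and the continuous optimum are hypothetical, and the helper name `round_to_catalog` is introduced here only for illustration.

```python
def round_to_catalog(continuous_design, catalog):
    """Replace each design variable by the nearest permissible value."""
    return [min(catalog, key=lambda value: abs(value - x))
            for x in continuous_design]


def largest_relative_change(continuous_design, rounded_design):
    """Largest relative change caused by rounding, as a rough spacing check."""
    return max(abs(r - x) / abs(x)
               for x, r in zip(continuous_design, rounded_design))


catalog = [1.0, 1.5, 2.0, 3.0]    # commercially available values (assumed)
continuous_optimum = [1.2, 1.8, 2.6]
rounded = round_to_catalog(continuous_optimum, catalog)
change = largest_relative_change(continuous_optimum, rounded)
```

The sketch does not re-analyze the structure. A large value of `change` suggests that the permissible values are spaced too far apart for rounding to be trusted.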

Figure 1.2.1 Optimal thickness distribution of a plate.

The choice of design variables can be critical to the success of the optimization
process. In particular it is important to make sure that the choice of design variables is
consistent with the analysis model. Consider, for example, the process of discretizing
a structure by a finite element model and applying the optimization procedure to the
model. If the design variable distribution has a one-to-one correspondence with the
finite element model we can encounter serious accuracy problems. For example, the
plate shown in Figure 1.2.1 was analyzed [20] by a 7 × 7 finite element mesh, with
most design variables specifying the thickness of individual elements. While the 7 × 7

Section 1.2: Elements of Problem Formulation

model was adequate for the initial design which had uniform thickness, it was not
adequate for the final design shown in the Figure.


Figure 1.2.2 Optimized shape of a hole in a plate, (a) initial design, (b) final design.

A similar problem may be encountered when the coordinates of nodes of the finite
element model are used as design variables. For example, the shape of the hole in the
plate shown in Figure 1.2.2 was optimized [21] to reduce the stress concentration near
the hole with the coordinates of the boundary nodes serving as design parameters.
Again, the finite element model was adequate for the analysis of the initial circular
shape of the hole, but not the "optimal shape" obtained. In general, the distribution
of design variables should be much coarser than the distribution of finite elements
(except for skeletal structures where often each element corresponds to a physical
member of the structure).

1.2.2 Objective Function

The notion of optimization also implies that there is some merit function $f(\mathbf{x})$ or
functions $\mathbf{f}(\mathbf{x}) = [f_1(\mathbf{x}), f_2(\mathbf{x}), \ldots, f_p(\mathbf{x})]$ that can be improved and can be used as a
measure of effectiveness of the design. The common terminology for such functions is
objective functions. Optimization with more than one objective is generally referred
to as multicriteria optimization. For structural optimization problems, weight,
displacements, stresses, vibration frequencies, buckling loads, and cost, or any
combination of these, can be used as objective functions. Consider, for example, the three-bar
truss of Figure 1.2.3. Our design problem may be to vary the horizontal locations of
the three support points so as to minimize the mass of the truss and the stresses in
its members. We have four objective functions: the mass and the three stresses.
Dealing with multiple objective functions is complicated and is usually avoided.
There are two intuitive ways commonly used for reducing the number of objective
functions to one. The first way is to generate a composite objective function that
replaces all the objectives. For example, if the mass of the structure is denoted m and

Chapter 1: Introduction

[Figure 1.2.3 labels: support points 2, 3; vertical dimension 100 in; load P = 10,000 lb; axes x, u and y, v.]

Figure 1.2.3 Three-bar truss example.

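the stresses in the three bars as $\sigma_i$, $i = 1, 2, 3$, then a composite objective function $f$
could be
$$f = \alpha_0 m + \alpha_1 \sigma_1 + \alpha_2 \sigma_2 + \alpha_3 \sigma_3 , \qquad (1.2.1)$$
where the $\alpha_i$ are weighting coefficients selected to reflect the relative importance of
the four objective functions.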
The second intuitive way to reduce the number of objective functions is to select
the most important as the only objective function and to impose limits on the others.
Thus we can formulate the three-bar truss design problem as minimization of mass,
subject to upper limits on the values of the three stresses.
When it is not intuitively clear how to weight or choose between the objective
functions, a systematic approach to the problem is through a branch of mathematical
programming called Edgeworth-Pareto optimization that deals with multiple objective
functions [22-24]. Stadler [25,26] was probably the first to apply Edgeworth-Pareto
optimality to structural design. More recent applications can be found in Refs. 27-31.
A vector of design variables $\mathbf{x}^*$ is said to be Edgeworth-Pareto optimal if, for any
other vector $\mathbf{x}$, either the values of all the objective functions remain the same, or
at least one of them worsens compared to its value at $\mathbf{x}^*$. When it is not possible to
specify intuitively the relative importance of the objective functions in an equation
such as (1.2.1), the values of the weights $\alpha_i$, $i = 0, 1, 2, 3$ in Eq. (1.2.1) can be decided
by studying various Edgeworth-Pareto optimal designs. Thus the design process is an
interactive process, and the imposition of constraints is postponed until knowledge of
the optimum performance is gained by studying Edgeworth-Pareto optimal designs.
One of the approaches for generating a Pareto-optimal solution to multiple
objective function optimization problems is based on the minimization of the deviation
of the individual objective functions from their individual minimum values. If the
independent minimizations of each of the objective functions result in function
values of $f_1^*, f_2^*, \ldots, f_p^*$ associated with design points $\mathbf{x}_1^*, \mathbf{x}_2^*, \ldots, \mathbf{x}_p^*$, then for an arbitrary
value of the design variable vector $\mathbf{x}$ the normalized distance of each of the objective


functions from its individual optimum is given by

$$d_i(\mathbf{x}) = \frac{f_i(\mathbf{x}) - f_i^*}{f_i^*} , \qquad i = 1, \ldots, p . \qquad (1.2.2)$$

It is then possible to pose the problem either as the minimization of the largest
deviation of the objective functions from their individual minima ($\ell_\infty$ norm),

$$\text{minimize} \quad \max_{i=1,\ldots,p} \left[ d_i(\mathbf{x}) \right] , \qquad (1.2.3)$$

or of the distance (i.e., the $\ell_2$ or Euclidean norm) from the reference point
$\mathbf{f}^* = (f_1^*, f_2^*, \ldots, f_p^*)$ to $\mathbf{f} = (f_1, f_2, \ldots, f_p)$;

$$\text{minimize} \quad \sum_{i=1}^{p} d_i^2 . \qquad (1.2.4)$$

It is also possible to use weighting coefficients in Eq. (1.2.4) for the contributions
of the individual objective functions. A more detailed discussion of the methods for
solving multicriteria optimization problems and their design applications is given by
Eschenauer et al. [31].

Example 1.2.1

Consider the design of cross-sectional dimensions of a rectangular beam so as to
minimize the area. At the same time it is desired to minimize the maximum shear
stress in the beam corresponding to a unit shear force. Based on some physical
constraints, the two variables, $w$ and $h$, which are the width and height of the
cross-section, are limited to be in the range $0.5 \le w, h \le 5$ units.

[Figure 1.2.4 panels: area contours; shear stress contours; horizontal axis w.]
Figure 1.2.4 Design of a beam cross-section for minimum area and minimum shear
stress.

The contour lines for the two objective functions,

$$f_1 = A = wh , \quad \text{and} \quad f_2 = \tau = \frac{3}{2wh} , \qquad (a)$$
are shown in Figure 1.2.4. The individual minima for the two functions are at the
opposite corners of the design space, $w_1^* = h_1^* = 0.5$ and $w_2^* = h_2^* = 5.0$, with
associated function values of $f_1^* = 0.25$ in$^2$ and $f_2^* = 0.06$ lb/in$^2$.
The weighted objective function approach with equal weights results in
minimization of the function
$$F = wh + \frac{3}{2wh} . \qquad (b)$$
Since design variables $w$ and $h$ appear everywhere in the form of a product, we
can treat this product as a single variable. Minimization of Eq. (b) with respect
to the product results in $w^* h^* = \sqrt{3/2} = 1.225$ with objective function values of
$f_1^* = f_2^* = 1.225$. If, on the other hand, we use the minimization of the Euclidean
norm of the distance from the individual minima, the function that needs to be
minimized is

$$F = \left( \frac{wh - 0.25}{0.25} \right)^2 + \left( \frac{\frac{3}{2wh} - 0.06}{0.06} \right)^2 . \qquad (c)$$

The resulting design is $w^* h^* = 2.5$ with objective function values of $f_1^* = 2.5$ and $f_2^* = 0.6$.
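The two minimizations in this example can be checked numerically. The sketch below treats the product $wh$ as the single variable, as the text does, and minimizes Eqs. (b) and (c) over the admissible range of the product, 0.25 to 25. The golden-section search is a choice made for this illustration and is not a method prescribed by the text; all function names are hypothetical.

```python
import math


def golden_section_minimize(func, lo, hi, tol=1e-10):
    """Minimize a unimodal function of one variable on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while (b - a) > tol:
        if func(c) < func(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


def area(p):
    """Objective f1 of Eq. (a), with p = w*h."""
    return p


def shear_stress(p):
    """Objective f2 of Eq. (a), with p = w*h."""
    return 1.5 / p


F1_MIN, F2_MIN = 0.25, 0.06   # individual minima quoted in the example


def weighted_sum(p):
    """Equal-weight composite objective, Eq. (b)."""
    return area(p) + shear_stress(p)


def squared_distance(p):
    """Sum of squared normalized deviations, Eq. (c)."""
    d1 = (area(p) - F1_MIN) / F1_MIN
    d2 = (shear_stress(p) - F2_MIN) / F2_MIN
    return d1 ** 2 + d2 ** 2


p_weighted = golden_section_minimize(weighted_sum, 0.25, 25.0)
p_distance = golden_section_minimize(squared_distance, 0.25, 25.0)
print(p_weighted, p_distance)
```

Both functions are convex in the product over this interval, which is what makes a one-dimensional bracketing search adequate here.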

[Figure 1.2.5 labels: horizontal axis f1 (area), vertical axis f2; marked designs w*h* = 0.25, 0.5, 1.225, 2.5, 3.0 and 25.0.]

Figure 1.2.5 Pareto-optimum solutions for the beam design problem.

The two designs obtained above and the designs corresponding to the minimization
of the individual functions are all Pareto-optimal. There are other
solutions that satisfy the condition for Pareto-optimality. These solutions can be
obtained either by varying the weighting coefficients of the individual objectives, or by
imposing one of the objectives as a constraint and varying the desired level of this
constraint. For example, if the second objective function is turned into a constraint
by imposing a condition that $f_2 \le 0.5$ while minimizing the area, we would obtain a
design $w^* h^* = 3.0$ with objective function values of $f_1^* = 3.0$ and $f_2^* = 0.5$. Similarly,
if we minimize $f_2$ by imposing a constraint that $f_1 \le 0.5$, we obtain $w^* h^* = 0.5$ with
$f_1^* = 0.5$ and $f_2^* = 3$. All of these solutions lie on a curve in the function space that
connects the two individual minima as shown in Figure 1.2.5. This curve is usually
called the efficiency curve. • • •
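The Pareto-optimality condition and the constraint approach of this example can be illustrated with a short sketch. It samples the product $wh$ over its admissible range, checks that no sampled design dominates another, and reproduces the two constrained designs quoted above. The helper names and the sampling density are assumptions made for this illustration, and the constraint limits are assumed to be attainable within the range of the product.

```python
def objectives(product):
    """Return (f1, f2) = (area, shear stress) for a given product w*h."""
    return product, 1.5 / product


def dominates(fa, fb):
    """True if fa is no worse than fb in every objective and better in one."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))


P_MIN, P_MAX = 0.25, 25.0   # range of w*h for 0.5 <= w, h <= 5

products = [P_MIN + i * (P_MAX - P_MIN) / 99 for i in range(100)]
points = [objectives(p) for p in products]

# Keep the designs that no other sampled design dominates.
non_dominated = [fa for fa in points
                 if not any(dominates(fb, fa) for fb in points)]


def min_area_given_stress_limit(stress_limit):
    """Minimize f1 subject to f2 <= stress_limit; returns the product w*h."""
    return min(P_MAX, max(P_MIN, 1.5 / stress_limit))


def min_stress_given_area_limit(area_limit):
    """Minimize f2 subject to f1 <= area_limit; returns the product w*h."""
    return max(P_MIN, min(P_MAX, area_limit))


print(len(non_dominated))
print(objectives(min_area_given_stress_limit(0.5)))
print(objectives(min_stress_given_area_limit(0.5)))
```

Because the area increases and the shear stress decreases with the product, every sampled design survives the dominance filter, which is consistent with the whole curve of Figure 1.2.5 being Pareto-optimal.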

1.2.3 Constraints

The formulation of the three-bar truss example where the stresses are subject to
upper limits, and the beam cross-section design problem where the height and width
variables are limited to take values only in a certain range, introduces the notion
of limits on the design variables. Because of their simplicity, these upper and lower
limit constraints on the values of the design variables are often treated in a special
way by solution procedures, and are referred to as side constraints. Constraints
which impose upper or lower limits on quantities are by their very nature inequality
constraints. Sometimes we need equality constraints. For example, the three-bar
truss may be designed subject to a requirement that the vertical component of the
displacement at the point of application of the force be zero. Another example of
equality constraints is provided by the equations of equilibrium that a structure must
satisfy in terms of its design variables.
Some strategies for the solution of nonlinear optimization problems are unable
to handle equality constraints, but are limited to inequality constraints only. In
such instances it is possible to replace the equality constraint with two inequality
constraints that form upper and lower bound constraints with the same limiting value.
However, it is usually undesirable to increase the number of constraints. Another way
of handling equality constraints in such situations will be discussed later in Chapter
5.
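As a brief illustration of the replacement described above, written in the greater-than-or-equal-to-zero convention that this text adopts for inequality constraints (the symbols $g_a$ and $g_b$ are introduced here only for the illustration), a single equality constraint becomes a pair of inequalities:

```latex
h(\mathbf{x}) = 0
\quad \Longleftrightarrow \quad
g_a(\mathbf{x}) = h(\mathbf{x}) \ge 0
\quad \text{and} \quad
g_b(\mathbf{x}) = -h(\mathbf{x}) \ge 0 .
```

For the three-bar truss requirement of zero vertical displacement $v$, this would give $v \ge 0$ and $-v \ge 0$, at the cost of one additional constraint.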

1.2.4 Standard Formulation

The notation adopted in this text for design variables, objective function and con-
straints is summarized in the following formulation of the optimization problem. In
this text we deal only with problems formulated to have a single objective function.

$$\begin{aligned} \text{minimize} \quad & f(\mathbf{x}) \\ \text{such that} \quad & g_j(\mathbf{x}) \ge 0 , \quad j = 1, \ldots, n_g , \\ & h_k(\mathbf{x}) = 0 , \quad k = 1, \ldots, n_e , \end{aligned} \qquad (1.2.5)$$
where $\mathbf{x}$ denotes a vector of design variables with components $x_i$, $i = 1, \ldots, n$. The
equality constraints $h_k(\mathbf{x})$ and the inequality constraints $g_j(\mathbf{x})$ are assumed to be
transformed into the form (1.2.5). The fact that the optimization problem is assumed
to be a minimization rather than a maximization problem is not restrictive

since instead of maximizing a function it is always possible to minimize its negative.
Similarly, if we have an inequality of opposite type, that is

$$g_j(\mathbf{x}) \le 0 , \qquad (1.2.6)$$

we can transform it to a greater-than-zero type by multiplying Eq. (1.2.6) by $-1$.


However, while most optimization texts deal with minimization rather than maxi-
mization problems, many of them prefer less-than inequalities to greater-than ones.
This choice affects the sign convention in some of the results obtained in this text-
book, and the reader should be alert to this fact when comparing results with texts
that use the opposite inequality convention.
An optimization problem is said to be linear when both the objective function
and the constraints are linear functions of the design variables $x_i$, i.e., they can be
expressed in the form

$$f(\mathbf{x}) = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = \mathbf{c}^T \mathbf{x} . \qquad (1.2.7)$$

Linear optimization problems are solved by a branch of mathematical programming
called linear programming. The optimization problem is said to be nonlinear if
either the objective function or the constraints are nonlinear functions of the design
variables.

Example 1.2.2

Consider the three-bar truss of Figure 1.2.3. Assume that it is made of steel (density
0.29 lb/in$^3$), and that we want to minimize the mass subject to the constraint that the
stress in any member does not exceed 30,000 psi in tension or compression. We also
impose a side constraint that the minimum area of any member is 0.1 in$^2$. The design
variables are the member cross-sectional areas $A_1$, $A_2$, and $A_3$, and the horizontal
coordinates $x_1$, $x_2$ and $x_3$ of the support points. The point of application of the
force is assumed to be fixed. We seek to formulate this optimization problem in the
standard form of (1.2.5).
The objective function is easy to write in terms of the design variables.

$$m = 0.29 (A_1 L_1 + A_2 L_2 + A_3 L_3) ,$$
where
$$L_i = \sqrt{x_i^2 + 100^2} , \qquad i = 1, 2, 3 .$$
To calculate the stress constraint it is convenient to introduce the displacements $u$
and $v$ at the point of application of the force as intermediate variables. It can be
verified that the equations governing $u$ and $v$ are

$$\begin{aligned} k_{11}(\mathbf{x}) u + k_{12}(\mathbf{x}) v &= 10{,}000 , \\ k_{12}(\mathbf{x}) u + k_{22}(\mathbf{x}) v &= 0 , \end{aligned}$$

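where
$$k_{11}(\mathbf{x}) = E \sum_{i=1}^{3} \frac{A_i x_i^2}{L_i^3} , \qquad k_{12}(\mathbf{x}) = -E \sum_{i=1}^{3} \frac{100 A_i x_i}{L_i^3} , \qquad k_{22}(\mathbf{x}) = E \sum_{i=1}^{3} \frac{10{,}000 A_i}{L_i^3} ,$$

and where $E$ is Young's modulus for steel ($30 \times 10^6$ psi). In terms of $u$ and $v$, the
stresses in the members are given as

$$\sigma_i = E \left( -u x_i / L_i^2 + 100 v / L_i^2 \right) , \qquad i = 1, 2, 3 .$$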

Based on the above analysis, one way of formulating the optimization problem in the
standard form is to add u and v to the list of design variables. The formulation is

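$$\text{minimize} \quad m = 0.29 (A_1 L_1 + A_2 L_2 + A_3 L_3)$$
such that
$$h_1 = k_{11} u + k_{12} v - 10{,}000 = 0 , \qquad h_2 = k_{12} u + k_{22} v = 0 ,$$
and
$$\begin{aligned} \text{(tension constraints)} \quad & g_i = 30{,}000 - E(-u x_i + 100 v)/L_i^2 \ge 0 , \\ \text{(compression constraints)} \quad & g_{i+3} = E(-u x_i + 100 v)/L_i^2 + 30{,}000 \ge 0 , \\ \text{(minimum gage constraints)} \quad & g_{i+6} = A_i - 0.1 \ge 0 , \qquad i = 1, 2, 3 . \end{aligned}$$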

We then have a problem with eight design variables ($A_i$, $x_i$, $i = 1, 2, 3$ and $u$, $v$), two
equality constraints and nine inequality constraints. This formulation including the
response variables u and v together with the structural dimensions as design variables
is called simultaneous analysis and design. Most structural optimization formulations
eliminate the response variables by using the equations of equilibrium. In this problem
we can solve for u and v from the equality constraints, thus eliminating two equality
constraints and two design variables. The new formulation, which does not include the
displacements as design variables, is much more common in structural optimization.
As a result it is rare to encounter formulations of structural optimization problems
which include equality constraints. • • •
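The elimination of the response variables described in this example can be sketched as follows. The function solves the two equilibrium equations for $u$ and $v$ and then evaluates the mass, the member stresses and the nine inequality constraints for a given set of areas and support coordinates. The function name and the trial design (unit areas, supports at $x = -100, 0, 100$ in) are assumptions made for this illustration only.

```python
import math

E = 30.0e6              # Young's modulus for steel, psi
DENSITY = 0.29          # lb/in^3
LOAD = 10000.0          # applied force, lb
HEIGHT = 100.0          # vertical dimension of the truss, in
STRESS_LIMIT = 30000.0  # allowable stress, psi
MIN_AREA = 0.1          # minimum gage, in^2


def analyze_truss(areas, xs):
    """Return mass, member stresses and the constraints g_1..g_9."""
    lengths = [math.sqrt(x * x + HEIGHT ** 2) for x in xs]
    members = list(zip(areas, xs, lengths))

    # Stiffness coefficients of the two equilibrium equations.
    k11 = E * sum(a * x * x / L ** 3 for a, x, L in members)
    k12 = -E * sum(HEIGHT * a * x / L ** 3 for a, x, L in members)
    k22 = E * sum(HEIGHT ** 2 * a / L ** 3 for a, x, L in members)

    # Solve k11*u + k12*v = LOAD and k12*u + k22*v = 0 by Cramer's rule.
    det = k11 * k22 - k12 * k12
    u = LOAD * k22 / det
    v = -LOAD * k12 / det

    stresses = [E * (-u * x + HEIGHT * v) / L ** 2 for a, x, L in members]
    mass = DENSITY * sum(a * L for a, x, L in members)

    tension = [STRESS_LIMIT - s for s in stresses]       # g_1..g_3
    compression = [s + STRESS_LIMIT for s in stresses]   # g_4..g_6
    gage = [a - MIN_AREA for a in areas]                 # g_7..g_9
    return mass, stresses, tension + compression + gage


# Hypothetical trial design, not the optimum.
mass, stresses, constraints = analyze_truss([1.0, 1.0, 1.0],
                                            [-100.0, 0.0, 100.0])
feasible = all(g >= 0.0 for g in constraints)
print(mass, stresses, feasible)
```

With the displacements computed inside the analysis, an optimizer would see only the six design variables $A_i$ and $x_i$ and the nine inequality constraints.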
While the above formulation of Example 1.2.2 conforms to our standard formu-
lation, we may expect to encounter numerical difficulties when we solve this example
using many standard solution techniques. The reason for the expected numerical
difficulties is the large discrepancy between the magnitudes of the different design
variables and constraints. Consider first the design variables. The area design
variables may be expected to be of the order of the ratio of the applied force to the
allowable stress, that is between 0.1 and 1 in$^2$. The coordinate design variables, on
the other hand, may be expected to be of the order of 100 in.
Next consider the constraints. If the displacements $u$ and $v$ are about ten percent
below or above their optimal values we can expect the equality constraints $h_1$ and
$h_2$ to be of the order of magnitude of ten percent of the applied load. Similarly the
inequality constraints $g_1$ through $g_6$ will be of the order of ten percent of the allowable
stress, 30,000 psi. However, the minimum gage constraints $g_7$ through $g_9$ will be of
the order of 0.1 in$^2$.
Because many optimization software packages are not numerically robust, it is
a good idea to eliminate such wide variations in the magnitudes of design variables
and constraints by normalization. Design variables may be normalized to order 1
by scaling. In Example 1.2.2 the coordinate design variables may be normalized by
the given vertical distance (100 in), and the area design variables by a nominal area,
$A_0 = 1/3$ in$^2$, which is the ratio of the applied load to the allowable stress.
The constraints may be similarly normalized. Usually, inequality constraints can
be normalized by the allowable value which is used to form them. Thus a constraint
that a stress component $\sigma$ be smaller than an allowable stress $\sigma_{al}$ is often written as

$$g = \sigma_{al} - \sigma \ge 0 . \qquad (1.2.8)$$

The value of the constraint depends on the units used, and can be large or small.
Instead the constraint can be normalized as
$$g = 1 - \frac{\sigma}{\sigma_{al}} \ge 0 . \qquad (1.2.9)$$

Now the constraint values are of order one, and do not depend on the units used.
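A small numerical sketch of the normalization discussed above follows. It evaluates the same stress state in psi and in ksi to show that the raw constraint of Eq. (1.2.8) changes with the units while the normalized constraint of Eq. (1.2.9) does not, and it scales the design variables of Example 1.2.2 by the nominal values mentioned in the text. The stress value of 27,000 psi and the trial design are hypothetical.

```python
def raw_stress_constraint(stress, allowable):
    """Eq. (1.2.8): value depends on the units of stress."""
    return allowable - stress


def normalized_stress_constraint(stress, allowable):
    """Eq. (1.2.9): value is of order one and unit independent."""
    return 1.0 - stress / allowable


# The same hypothetical stress state expressed in psi and in ksi.
g_raw_psi = raw_stress_constraint(27000.0, 30000.0)
g_raw_ksi = raw_stress_constraint(27.0, 30.0)
g_norm_psi = normalized_stress_constraint(27000.0, 30000.0)
g_norm_ksi = normalized_stress_constraint(27.0, 30.0)

# Scaling of design variables to order one: coordinates by the 100 in
# vertical distance, areas by the nominal area A0 = 1/3 in^2.
NOMINAL_LENGTH = 100.0
NOMINAL_AREA = 10000.0 / 30000.0

trial_areas = [0.5, 0.3, 0.2]          # hypothetical values, in^2
trial_coords = [-80.0, 10.0, 120.0]    # hypothetical values, in
scaled_design = ([a / NOMINAL_AREA for a in trial_areas]
                 + [x / NOMINAL_LENGTH for x in trial_coords])

print(g_raw_psi, g_raw_ksi, g_norm_psi, g_norm_ksi)
print(scaled_design)
```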

1.3 The Solution Process

The optimization methods discussed in this text are mostly numerical search tech-
niques. These techniques start from an initial design and proceed in small steps to
improve the value of the objective function, or the degree of compliance with the
constraints, or both. The search is terminated when no progress can be made in
improving the objective function without violating some of the constraints. Some
optimization methods terminate when progress in improving the objective function
becomes very slow. Others check for optimality by employing the necessary
conditions, called the Kuhn-Tucker conditions (see Chapter 5), that must be satisfied at a
minimum. We will typically use $n$ to denote the number of design variables, so that
the search for the optimum is carried out in the $n$-dimensional space of real variables
$R^n$. Every point in this space constitutes a possible design.
In structural optimization problems the constraints imposed on the design, such
as stress, displacements or frequency constraints, are important. That is, such con-
straints will affect the final design and force the objective function to assume a higher
value than it would take without the constraints. For example, in Example 1.2.2, if
the stress constraints were removed all the cross-sectional areas would be reduced to
their minimum-gage values of 0.1 in$^2$, and the coordinates of points 1, 2 and 3, would


lie directly above point 4, so that the lengths of all three members would take the
minimum value, 100 in., corresponding to a total mass of 8.7 lb. The resulting stresses
in the members would tend to infinity. Since we cannot tolerate infinite stresses, we
impose stress constraints, and we may expect that the optimum mass will be heavier
than 8.7 lb., and that, at the optimum design, the stress in at least one member will
be equal to the maximum allowable stress of 30,000 psi.
In general, we divide the space of design variables into a feasible domain and
infeasible domain. The feasible domain contains all possible design points that sat-
isfy all the constraints. The infeasible domain is the collection of all design points
that violate at least one of the constraints. Because we expect that the constraints
influence the optimum design, we expect that some constraints will be critical at the
optimum design. This is equivalent to the optimum being on the boundary between
the feasible and infeasible domains. Inequality constraints in our standard formu-
lation, Eq. (1.2.5), are critical when they are equal to zero. These constraints are
also called active constraints, while the rest of the constraints are inactive or
passive. For example, consider the minimum gage constraint $g_7$ of Example 1.2.2. For
$A_1 = 0.1$ in$^2$ the constraint is active, for $A_1 = 0.11$ in$^2$ the constraint is passive, and
for $A_1 = 0.09$ in$^2$ the constraint is violated.
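The three cases above can be reproduced with a short classification routine. The tolerance used to decide when a constraint value counts as zero is an assumption of this sketch; the text itself defines an active constraint as one that is exactly equal to zero.

```python
def classify_constraint(g_value, tol=1e-9):
    """Classify a constraint of the form g >= 0 by its current value."""
    if g_value < -tol:
        return "violated"
    if g_value <= tol:
        return "active"
    return "passive"


def minimum_gage_constraint(area, min_area=0.1):
    """g_7 of Example 1.2.2 for member 1: g = A_1 - 0.1."""
    return area - min_area


for area in (0.1, 0.11, 0.09):
    print(area, classify_constraint(minimum_gage_constraint(area)))
```

In practice an active set strategy would typically use a larger tolerance so that nearly active constraints are also retained.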
It may be intuitively assumed that all the constraints which are active at the
optimum design influence it; that is, if they were removed the objective function
could be improved. This is not always true. It is possible to have constraints that
are active and can be removed without any impact on the optimum design. Many
optimization procedures calculate, along with the optimum design, a set of numbers,
one for each active constraint, called the Lagrange multipliers (see Chapter 5) which
measure the sensitivity of the optimum design to changes in each constraint. When
the Lagrange multiplier associated with a constraint is zero, it indicates that, to
a first order approximation, removing this constraint will not have any effect on
the optimum value of the objective function. These multipliers also provide very
important design information because in many structural optimization applications
there is some degree of arbitrariness in the choice of parameters that determine the
constraints such as stress limits or minimum gage values. For example, when we
impose stress constraints on a steel structure we typically select ahead of time the
grade of steel to be used. We can use the Lagrange multipliers to estimate the effect
of a change in the stress limit on the objective function. If we find that the optimum
design is very sensitive to this value we may consider using a better grade of steel.
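The use of a Lagrange multiplier to estimate the effect of a change in the stress limit can be illustrated on a deliberately simple case: a single bar of fixed length carrying a given load, designed for minimum mass under a normalized stress constraint. The sketch relies on the standard first-order relation $df^*/dp = -\lambda \, \partial g / \partial p$ for a constraint written as $g \ge 0$, which is stated here as an assumption because the text defers the derivation to Chapter 5. The single-bar problem and all names are hypothetical.

```python
DENSITY = 0.29    # lb/in^3
LENGTH = 100.0    # in
LOAD = 10000.0    # lb


def optimum_mass(allowable_stress):
    """Minimum mass of one bar: the stress constraint is active at the optimum."""
    area = LOAD / allowable_stress
    return DENSITY * LENGTH * area


allowable = 30000.0
area_opt = LOAD / allowable

# Constraint g = 1 - LOAD / (A * allowable) >= 0 and objective f = DENSITY * LENGTH * A.
# Stationarity at the optimum: df/dA = lam * dg/dA.
dg_dA = LOAD / (area_opt ** 2 * allowable)
lam = DENSITY * LENGTH / dg_dA

# Assumed first-order sensitivity of the optimum mass to the stress limit.
dg_dallowable = LOAD / (area_opt * allowable ** 2)
predicted = -lam * dg_dallowable

# Check against a central finite difference of the re-optimized mass.
step = 1.0
finite_difference = (optimum_mass(allowable + step)
                     - optimum_mass(allowable - step)) / (2.0 * step)

print(lam, predicted, finite_difference)
```

The multiplier-based estimate requires no re-optimization, which is what makes it attractive when judging whether a better grade of steel is worthwhile.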
One of the major problems in almost all optimization solution procedures is the
determination of the set of active constraints. If the solution procedure attempts
to consider all constraints during the search process the computational cost of the
optimization may be significantly increased. If, on the other hand, the procedure
deals only with constraints that are active or near active for the trial design, the
convergence of the optimization process may be endangered due to oscillation in the
set of active constraints. Most optimization procedures are usually complemented by
an active set strategy used to determine the set of constraints to be considered at
each trial design.
During the optimization process we move from one design point to another. While

there are many optimization techniques, most of them proceed through four basic
steps in performing the move. The first step is the selection of the active constraint
set discussed above. The second step is the calculation of a search direction based
on the objective function and the active constraint set. Some methods (such as
the gradient projection method) look for a direction which is tangent to the active
constraint boundaries. Other methods, such as the feasible direction or the interior
penalty function method seek to move away from the constraint boundaries. The
third step is to determine how far to go in the direction found in the previous step.
This is often done by a process called a one dimensional line search because it seeks
the magnitude of a single scalar which is the distance to be travelled along the given
direction. The last step is a convergence step which determines whether additional
moves are required.
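The four steps can be made concrete with a minimal sketch for a problem that has only lower-bound side constraints. It uses a steepest-descent direction with the components of active bounds removed and a backtracking line search; this is a simple stand-in chosen for illustration, not one of the specific methods named above, and the test function, starting point and tolerances are hypothetical.

```python
import math


def minimize_with_lower_bounds(f, grad, x0, lower, tol=1e-8, max_iter=200):
    """Projected steepest descent for: minimize f(x) subject to x >= lower."""
    x = list(x0)
    n = len(x)
    for iteration in range(max_iter):
        g = grad(x)

        # Step 1: active set. A bound is active if the variable sits on it
        # and the descent direction would push the variable below it.
        active = [x[i] - lower[i] <= 1e-12 and g[i] > 0.0 for i in range(n)]

        # Step 2: search direction. Steepest descent on the free variables.
        d = [0.0 if active[i] else -g[i] for i in range(n)]

        # Step 4: convergence check (done here, before taking a step).
        d_norm_sq = sum(di * di for di in d)
        if math.sqrt(d_norm_sq) < tol:
            return x, iteration

        # Step 3: one-dimensional line search. Limit the step so that no
        # bound is crossed, then backtrack until f decreases sufficiently.
        alpha = 1.0
        for i in range(n):
            if d[i] < 0.0:
                alpha = min(alpha, (lower[i] - x[i]) / d[i])
        fx = f(x)
        while (f([x[i] + alpha * d[i] for i in range(n)])
               > fx - 1e-4 * alpha * d_norm_sq):
            alpha *= 0.5
        x = [max(lower[i], x[i] + alpha * d[i]) for i in range(n)]
    return x, max_iter


def example_objective(x):
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2


def example_gradient(x):
    return [2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)]


x_opt, iterations = minimize_with_lower_bounds(
    example_objective, example_gradient, [0.5, 1.0], [0.0, 0.0])
print(x_opt, iterations)
```

In this example the unconstrained minimum lies outside the feasible domain, so the search ends with the lower bound on the second variable active.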

1.4 Analysis and Design Formulations

In a practical design situation it is not always clear which mathematical
formulation of the structural design problem should be used. Consider, for example, the
beam of Figure 1.1.1, and assume that the designer wants to achieve a high-stiffness,
low-mass design. One option that the designer may elect is to employ a multiple
objective function formulation where both the mass $m$ in Eq. (1.1.2) and the
maximum displacement $w_{max}$, Eq. (1.1.3), are to be minimized simultaneously. A second
approach is to assign some weights $\alpha_1$ and $\alpha_2$ to the two objectives and use them to
form a composite objective function $\alpha_1 m + \alpha_2 w_{max}$. Third, it is possible to set the
mass as the objective function and constrain the magnitude of $w_{max}$. Finally, it is
possible to prescribe an upper limit on the mass and use the maximum displacement
as the objective function.
All of the above formulations may be acceptable for the design goal of producing
a stiff, lightweight design. However, the mathematical formulation and the solution
difficulties may be quite different. For example, if $p = 1$ in Eq. (1.1.2), the mass is
a linear function of the design variable $I(x)$ while the maximum displacement $w_{max}$
is not. Some nonlinear optimization procedures work better when the objective is
linear and the constraints are nonlinear, and others work better when the situation
is reversed. The choice of the formulation may be decided, therefore, on the basis of
the optimization software available to the designer.
The formulation and the solution of the structural analysis problem are also
important. First, because the analysis has to be repeated many times during the op-
timization process it may be crucial to use a solution method that is computationally
inexpensive. Thus a detailed finite element model that is typically used for a single
analysis of the structure may not be affordable for optimization, and it may have to
be replaced with a cruder model.
The choice of the structural analysis solution process may be similarly influenced
by the optimization environment. For example, vibration frequencies and modes of a
structure are typically calculated by an eigenvalue solution procedure. Some of these
procedures benefit from good initial approximations for the eigenvectors and some

do not. For applications in structural optimization the former procedure gains an
advantage because the eigenvectors (vibration modes) change only gradually as the
design is modified. Therefore the eigenvectors from an earlier design can serve as
good initial approximations for the current eigenvectors.
Finally, in some cases it may be worthwhile to integrate the analysis and design
procedures. This happens when structural analysis is iterative in nature, as in the
case of nonlinear structural behavior. The analysis and design iterations may be then
integrated so that the analysis iteration is only partially converged for each design
iteration (e.g. [32,33]). In some cases it may be worthwhile to combine the analysis
and design iterations into a single iterative process. This simultaneous analysis and
design approach is discussed in Chapter 10.

1.5 Specific Versus General Methods

The solution methods commonly used for obtaining optimum designs in structural
optimization may be divided into different categories. An important classification
of solution methods considers specific versus general methods. Specific methods are
used exclusively in structural optimization (even if they could be applicable also in
other disciplines). General methods apply to optimization problems in several other
fields. In the early stages of the development of structural optimization, specific
methods enjoyed great popularity. These included methods tailored to some special
structural optimization problems which they could solve more efficiently than any
general method.
The most successful of these specific methods was the fully stressed design tech-
nique described in Chapter 9. It is a method applicable to the design of a structure
subject to stress constraints only, and it works well for lightly-redundant single-
material structures.
The popularity of specific methods is currently waning as their limitations be-
come increasingly apparent. The approach taken in this text is to emphasize general
methods rather than specific ones. General methods not only have the advantage of
wider applicability but also a wider base of resources. Researchers in many disciplines
are constantly improving these methods and developing efficient and reliable software
implementations.
Besides playing down the role of specialized methods for structural design we also
do not discuss some mathematical programming methods applicable to problems of
specialized form such as dynamic programming, geometric programming and optimal
control techniques. These methods have been applied successfully to structural design
problems, but because of space considerations they are not covered here. The reader
is referred to Refs. 34-36 for information on the application of these methods to
structural design.
The important considerations for a structural analyst using general optimization
methods have to do with providing an interface between structural analysis software
and optimization software. This interface includes the three major components of

formulation, sensitivity and approximation, and is one of the major thrusts of this
text.
The formulation of a structural design problem is of crucial importance for the
success of the design process. A poor formulation can lead to poor results or pro-
hibitive computational cost. Chapter 3, for example, describes various structural
design problems that can be formulated with a linear objective function and lin-
ear constraints. The reason for the usefulness of a linear formulation is the highly
advanced state of methodology and software for solving such linear problems.
The efficient calculation of derivatives of the constraints and objective function
with respect to design variables, often referred to as sensitivity derivatives, is dis-
cussed in Chapters 7 and 8. Most general purpose optimization algorithms require
such derivatives, and their calculation is often the major computational expense in
the optimization of structures modeled by complex finite element models. These
derivatives can also be used to form constraint approximations which can then be
employed instead of costly exact constraint evaluations during portions of the op-
timization process. The use of constraint approximations is discussed in Chapter
6.
The importance of efficient and accurate calculation of sensitivity derivatives
and of employing constraint approximations is now recognized by most structural
optimization specialists. We believe that it affects the success and overall computa-
tional cost of the optimization process even more than the choice of the optimization
method.

1.6 Exercises

[Figure 1.6.1 labels: load P = 25 kips; dimensions d and h.]

Figure 1.6.1 A tripod under a vertical load.



1. A tripod is made from three steel pipes as shown in Figure 1.6.1. The ends of these
pipes are placed 120° apart on a circle of radius 6 ft. A vertical downward force of
25 kips is applied at the top. It is required to minimize the weight of the tripod such
that the tripod is safe with respect to Euler buckling, local buckling and yielding.
Assume $E = 30 \times 10^6$ psi, $\sigma_{yield} = 60 \times 10^3$ psi and calculate the local buckling stress
in psi by the formula
$\sigma_{cr} = 36 \times 10^6$ (~).

Sketch the constraints in the two-dimensional design space of $d$ (mean diameter of
pipe) and $h$. Identify the feasible and infeasible domains; plot the contours of the
objective function and locate the optimum solution.
2. A narrow rectangular beam with cross-sectional dimensions $b$ and $h$ is cantilevered
over 20 ft and subjected to an end load $P = 10$ kips (Figure 1.6.2). In addition to a
flexural failure such beams can collapse through lateral instability by twisting. The
critical load for such a beam of length $l$ is given by

[Figure 1.6.2 labels: end load P = 10 kips; length l = 20 ft.]
Figure 1.6.2 A narrow rectangular cantilever beam.

where $E$ is Young's modulus, $I_{least}$ is the smallest moment of inertia and $c$ is the
torsional rigidity of the beam given by $0.312 h b^3 G$, $G$ being the shear modulus. Design
a minimum weight beam so as to prevent failure in both flexure and twisting. Assume
$E = 30 \times 10^6$ psi, $G = 12 \times 10^6$ psi and $\sigma_{al} = 75$ ksi in tension and compression.
Locate the optimum solution graphically.
3. Consider the design of the cross-section of an I-beam shown in Figure 1.6.3 with the
objectives of minimizing the cross-sectional area and minimizing the normal stresses
resulting from bending about the horizontal neutral axis. The thicknesses of the
flange and the web of the cross-section are fixed at $t = 0.1$ in. The design variables
are the width $w$ and the height $h$ of the cross-section. Determine graphically the
designs which minimize the individual objectives if the width and the height are


Figure 1.6.3 Cross-sectional design of an I-beam.

constrained to remain in the range $0.1 \le w, h \le 10$. Also find the designs by using
the weighting function approach with equal weights, and by using Eq. (1.2.4).
4. The elastic grillage of Figure 1.6.4 consists of two uniform beams with
cross-sectional areas $A_1$ and $A_2$. Both beams are subjected to a uniformly distributed load
of 1000 lb/in. The minimum weight design of such a structure was first proposed by
Moses and Onada [37]. Develop expressions for the maximum stresses in tension and
compression at sections 1, 2 and 3 in terms of $A_1$ and $A_2$. Assume that the section
modulus $z$ and moment of inertia $I$ are related to the cross-sectional area as

$$z = \left( \frac{A}{1.48} \right)^{1.82} , \qquad I = 1.007 \left( \frac{A}{1.48} \right)^{2.65} .$$

Figure 1.6.4 An elastic grillage under uniform load.

18
Section 1.7: References

Assuming an allowable stress of 20,000 psi in tension and compression, formulate


the five constraints and the objective function. Plot the constraints and the objective
function. Identify the feasible and infeasible domains. Comment on the character-
istics of the feasible domain in contrast with those of the previous two problems.
Determine the best design for the grillage.

1.7 References

[1] Wilde, D.J., Globally Optimal Design, John Wiley and Sons, New York, 1978.
[2] Wasiutynski, Z., and Brandt, A., "The Present State of Knowledge in the Field
of Optimum Design of Structures," Appl. Mech. Rev., 16 (5), pp. 341-348, May
1963.
[3] Sheu, C.Y., and Prager, W., "Recent Developments in Optimum Structural De-
sign," Appl. Mech. Rev., 21 (10), pp. 985-992, Oct. 1968.
[4] Schmit, L.A. Jr., "Structural Synthesis 1959-1969: A Decade of Progress," in Re-
cent Advances in Matrix Methods of Structural Analysis and Design, University
of Alabama Press, Huntsville, pp. 565-634, 1971.
[5] Pierson, B.L., "A Survey of Optimal Structural Design Under Dynamic Con-
straints," Int. J. Num. Meth. Eng., 4, pp. 491-499, 1972.
[6] Niordson, F.I., and Pedersen P., "A Review of Optimal Structural Design," in
Theoretical and Applied Mechanics, Proceedings of the Thirteenth International
Congress of Theoretical and Applied Mechanics, E. Becker and G. K. Mikhalov
(eds.), pp. 264-278, Springer-Verlag, Berlin, 1973.
[7] Rao, S.S., "Optimum Design of Structures under Shock and Vibration Environ-
ment," Shock Vibr. Digest, 7 (12), pp. 61-70, Dec. 1975.
[8] Olhoff, N. J., "A Survey of Optimal Design of Vibrating Structural Elements,
Parts I and II," Shock Vibr. Digest, 8 (8&9), pp. 3-10, 1976.
[9] Venkayya, V. B., "Structural Optimization: A Review and Some Recommenda-
tions," Int. J. Num. Meth. Eng., 13, pp. 203-228, 1978.
[10] Lev, O. E., (ed.), Structural Optimization-Recent Developments and Applica-
tions, ASCE Committee on Electronic Computation, New York, 1981.
[11] Schmit, L.A., "Structural Synthesis-its Genesis and Development," AIAA J., 19
(10), pp. 1249-1263, 1981.
[12] Haug, E.J., "A Review of Distributed Parameter Structural Optimization Liter-
ature," in Optimization of Distributed Parameter Structures, E.J. Haug and J.
Cea (eds.), Vol. 1, pp. 3-74, Sijthoff and Noordhoff, Alphen aan den Rijn, the
Netherlands, 1981.
[13] Ashley, H., "On Making Things the Best-Aeronautical Uses of Optimization,"
J. Aircraft, 19 (1), pp. 5-28, 1982.
[14] Kruzelecki, J., and Zyczkowski, M., "Optimal Structural Design of Shells-A
Survey," SM Archives, 10, pp. 101-170, 1985.
[15] Haftka, R. T., and Grandhi, R. V., "Structural Shape Optimization-A Survey,"
Computer Methods in Applied Mechanics and Engineering, 57, pp. 91-106, 1986.
[16] Bushnell, D., Holmes, A. M. C., Flaggs, D. L., and McCormick, P. J., "Optimum
Design, Fabrication and Test of Graphite-Epoxy, Curved, Stiffened, Locally
Buckled Panels Loaded in Axial Compression," in Buckling of Structures (ed. I.
Elishakoff et al.), Elsevier Science Publishers B. V., Amsterdam, pp. 61-131, 1988.
[17] Kirsch, U., "Optimal Topologies of Structures," Appl. Mech. Rev., 42, No. 8, pp.
223-239, 1989.
[18] Friedmann, P. P., "Helicopter Vibration Reduction Using Structural Optimization
with Aeroelastic/Multidisciplinary Constraints-A Survey," J. Aircraft, 28, No.
1, pp. 8-21, 1991.
[19] Sobieszczanski-Sobieski, J., "Structural Optimization: Challenges and Opportu-
nities," Int. J. Vehicle Design, 7, pp. 242-263, 1986.
[20] Prasad, B., and Haftka, R. T., "Optimal Structural Design with Plate Finite
Elements," ASCE J. Structural Division, 105, pp. 2367-2382, 1979.
[21] Braibant, V., Fleury, C., and Beckers, P., "Shape Optimal Design: An Approach
Matching C.A.D. and Optimization Concepts," Report SA-109, Aerospace Lab-
oratory of the University of Liege, Belgium, 1983.
[22] Edgeworth, F. Y., Mathematical Physics, London, England, 1881.
[23] Pareto, V., Manuale di Economia Politica, Societa Editrice Libraria, Milano, Italy,
1906. Translated into English by A.S. Schwier as Manual of Political Economy,
MacMillan, New York, 1971.
[24] Zeleny, M., Multiple Criteria Decision Making, McGraw-Hill Book Company, New
York, 1972.
[25] Stadler, W., "Natural Structural Shapes of Shallow Arches," J. Appl. Mech., 44,
pp. 291-298, 1977.
[26] Stadler, W., "Natural Structural Shapes (The Static Case)," Q. J. Mech. Appl.
Math., 31, pp. 169-217, 1978.
[27] Adali, S., "Pareto Optimal Design of Beams Subjected to Support Motions,"
Computers and Structures, 16, pp. 297-303, 1983.
[28] Bendsøe, M.P., Olhoff, N., and Taylor, J.E., "A Variational Formulation for
Multicriteria Structural Optimization," J. Struct. Mech., 11 (4), pp. 523-544,
1984.
[29] Stadler, W., "Applications of Multicriterion Optimization in Engineering and
the Sciences," in MCDM-Past decade and Future Trends, (Zeleny M., ed.), JAI
Press, Greenwich, Conn., 1984.
[30] Stadler, W., (ed.), Multicriteria Optimization in Engineering and in the Sciences,
Plenum Press, New York, 1988.
[31] Eschenauer, H., Koski, J., and Osyczka, A., Multicriteria Design Optimization:
Procedures and Applications, Springer-Verlag, New York, 1990.
[32] Wu, C.C., and Arora, J.S., "Simultaneous Analysis and Design Optimization of
Nonlinear Response," Engineering with Computers, 2, pp. 53-63, 1987.
[33] Haftka, R.T., "Integrated Analysis and Design," AIAA J., 27 (11), pp. 1622-1627,
1989.
[34] Carmichael, D.G., Structural Modeling and Optimization, Halstead Press, Eng-
land, 1981.
[35] Palmer, A.C., "Optimal Structural Design by Dynamic Programming," J. Struct.
Div. ASCE, 94, No. ST8, pp. 1887-1906, 1968.
[36] Hajela, P., "Geometric Programming Strategies in Large-Scale Structural
Synthesis," AIAA J., 24 (7), pp. 1173-1178, 1986.
[37] Moses, F., and Onoda, S., "Minimum Weight Design of Structures with Applications
to Elastic Grillages," Int. J. Num. Meth. Eng., 1, pp. 311-331, 1969.

Chapter 2: Classical Tools in Structural Optimization

Classical optimization tools used for finding the maxima and minima of functions
and functionals have direct applications in the field of structural optimization. The
term 'classical tools' is used here to encompass the classical techniques of ordinary
differential calculus and the calculus of variations. Exact solutions to a few
relatively simple unconstrained or equality constrained problems have been obtained
in the literature using these two techniques. It must be pointed out, however, that
such problems are often the result of simplifying assumptions which at times lack
realism and result in unreasonable configurations. Still, the consideration of such
problems is not a purely academic exercise, but is very helpful in the process of
solving more realistic problems.

In recent years there has been an increased interest in the application of classical
tools, especially variational methods, in structural optimization. Mathematical for-
mulations of broad classes of structures as optimization problems have been achieved
by adopting variational methods. In addition, the study of classical problems not only
serves to portray the underlying principles of the techniques of classical methods, but
it serves an even more basic need in structural optimization. Closed form exact so-
lutions to classical problems serve to validate solutions obtained using more general
but approximate numerical techniques. More importantly, classical optimization is
perhaps the best vehicle for letting a student of structural optimization appreciate
fully the questions of the existence and uniqueness of the optimum designs, and the
establishment of the necessity and sufficiency of the optimality conditions. Such
questions can be rigorously answered for only the simplest problems of optimization
similar to those considered in this chapter.

2.1 Optimization Using Differential Calculus

In the absence of constraint equations a continuously differentiable objective function
$f(x_1, x_2, \ldots, x_n)$ of n independent design variables attains a maximum or a minimum
value in the interior of the design space $R^n$ only at those values of the design variables
x* for which the n partial derivatives

\[ \frac{\partial f}{\partial x_1}, \quad \frac{\partial f}{\partial x_2}, \quad \ldots, \quad \frac{\partial f}{\partial x_n} \tag{2.1.1} \]

vanish simultaneously. This is the necessary condition for the point x* to be a
stationary point. We will see in later chapters that this property proves to be a
valuable tool in locating the optimum solution. For a scalar valued function, the
vector of first derivatives is referred to as the gradient vector $\nabla f$ and is used for
finding search directions in optimization algorithms.
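
The necessary condition of Eq. (2.1.1) can be checked numerically. The following is a minimal sketch, assuming a hypothetical quadratic objective and a central-difference approximation of the gradient; the helper name `numerical_gradient`, the test function, and the step size are illustrative choices and do not come from the text.

```python
def numerical_gradient(f, x, step=1e-6):
    """Approximate the gradient of f at x by central differences."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += step
        backward[i] -= step
        grad.append((f(forward) - f(backward)) / (2.0 * step))
    return grad


def objective(x):
    # Hypothetical smooth objective whose only stationary point is (1, -0.5).
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


grad_at_stationary = numerical_gradient(objective, [1.0, -0.5])
grad_elsewhere = numerical_gradient(objective, [0.0, 0.0])

print(grad_at_stationary)  # both components vanish to rounding error
print(grad_elsewhere)      # nonzero, so (0, 0) is not a stationary point
```

A vanishing gradient only identifies a stationary point; whether it is a minimum, a maximum, or a saddle point depends on the second derivatives.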
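Development of a sufficient condition for a stationary point x* to be an extreme
point requires the evaluation of the matrix of second derivatives H of the objective
function. The matrix of second derivatives is also referred to as the Hessian matrix and
is defined as

\[ H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}. \tag{2.1.2} \]
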
It can be proved that if the matrix of second derivatives evaluated at x* is
positive-definite then the stationary point is a minimum, and if it is negative-definite then the
stationary point is a maximum [1]. A symmetric matrix H is said to be positive
(negative)-definite if the quadratic form $Q = x^T H x$ is positive (negative) for every
nonzero x, so that Q is equal to zero if and only if x = 0. A computational check for the
positive and negative definiteness of a matrix involves determinants of the principal minors,
$H_i$ (i = 1, ..., n). A principal minor $H_i$ is a square sub-matrix of H of order i whose
principal diagonal lies along the principal diagonal of the matrix H. The matrix H
is positive-definite if the determinants of all the principal minors located at the top
left corner of the matrix are positive, and negative-definite if -H is positive-definite.
Alternatively, -H is positive-definite if the determinant of $H_1$ is negative and the
determinants of the following principal minors, $H_2, H_3, \ldots, H_n$, are alternately
positive and negative [1]. Another property
of positive (negative)-definite matrices can be used as a test. A symmetric matrix is
positive (negative)-definite if and only if all its eigenvalues are positive (negative).
A symmetric matrix H is called positive semi-definite if the quadratic form
$Q = x^T H x$ is non-negative for every x. This happens when the eigenvalues of the matrix
are non-negative. Unfortunately, the expected condition that the determinants of the
principal minors at the top left corner are non-negative is not sufficient for positive
semi-definiteness. If a matrix is positive
semi-definite but not positive-definite, then there exists at least one $x \ne 0$ such
that the quadratic form is zero, at least one of the principal minors is zero, the
matrix is singular, and at least one of the eigenvalues is zero. In that case higher
order derivatives of the function f are needed to establish sufficient conditions for
a minimum. Similarly, when -H is positive semi-definite then H is negative
semi-definite. If H is negative semi-definite but not negative-definite, we need higher
order derivatives to establish sufficient conditions for a maximum. Finally, when H
is neither positive semi-definite nor negative semi-definite, it is called indefinite. In
that case the stationary point is neither a minimum nor a maximum but a saddle
point.
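
The principal-minor test described above can be sketched in a few lines of Python. This is a minimal illustration, not a robust numerical routine: the determinant is computed by Laplace expansion, which is only sensible for small matrices, and the three matrices are hypothetical examples. The last one shows why non-negative top-left minors do not guarantee positive semi-definiteness.

```python
def determinant(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def leading_minor_determinants(matrix):
    """Determinants of the top-left sub-matrices H_1, ..., H_n."""
    return [determinant([row[:k] for row in matrix[:k]])
            for k in range(1, len(matrix) + 1)]


def is_positive_definite(matrix):
    return all(d > 0 for d in leading_minor_determinants(matrix))


def is_negative_definite(matrix):
    # H is negative-definite exactly when -H is positive-definite.
    negated = [[-entry for entry in row] for row in matrix]
    return is_positive_definite(negated)


def quadratic_form(matrix, x):
    n = len(x)
    return sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))


H_min = [[2.0, -1.0], [-1.0, 2.0]]    # leading minors 2 and 3: positive-definite
H_max = [[-2.0, 1.0], [1.0, -2.0]]    # leading minors -2 and 3: negative-definite
H_trap = [[0.0, 0.0], [0.0, -1.0]]    # leading minors are both zero

print(is_positive_definite(H_min), is_negative_definite(H_max))
print(leading_minor_determinants(H_trap))   # non-negative, yet ...
print(quadratic_form(H_trap, [0.0, 1.0]))   # ... the quadratic form is negative here
```

For `H_trap` the quadratic form takes a negative value even though no top-left minor is negative, so the matrix is not positive semi-definite.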

Two simple examples demonstrate the use of differential calculus in finding opti-
mum structural configurations.

Example 2.1.1

The symmetric statically determinate truss structure shown in Figure (2.1.1) is
to be designed for minimum weight by varying the heights $h_1$ and $h_2$ of the vertical
members. Because the truss is statically determinate the forces in the members are
independent of the cross-sectional area, so that the areas can be reduced until each
member is fully stressed (its stress is equal to the allowable stress $\sigma_0$).

Figure 2.1.1 Fully stressed minimum weight truss.

For the loading shown in the figure, the forces in each of the members can be
expressed in terms of the geometry of the structure as

\[ F_2 = -\frac{P}{2}, \tag{2.1.3} \]

\[ F_5 = \frac{[(h_1 - h_2)^2 + L^2]^{1/2}}{2 h_1} P. \tag{2.1.4} \]

If each member is to be fully stressed the cross-sectional areas of the members $A_i$
can be related to the forces carried by the members as

\[ A_i = \frac{|F_i|}{\sigma_0}, \qquad i = 1, \ldots, 9. \tag{2.1.5} \]

From Eq. (2.1.3) the cross-sectional area $A_3$ of the horizontal members vanishes.
However, based on stability considerations these members may be assumed to have
a minimum area of $A_{\min}$. The contribution of the weight of these members to the
total weight of the structure is independent of the design variables $h_1$ and $h_2$, and
will be ignored for the minimization problem. The total volume of material in the
remaining truss structure is the sum of the products of the cross-sectional areas and
the member lengths that can be expressed in terms of the unknown variables. It can
be shown that the remaining total volume is

(2.1.6)

Differentiating the volume with respect to the unknown variables we obtain

(2.1.7)

The resulting optimum values for the heights are

h~ = ~L, (2.1.8)

and the cross sectional areas of the members are equal to

(2.1.9)

The matrix of second derivatives of the objective function for the problem is

(2.1.10)

which, evaluated at the optimum values of the design variables, is

\[ H^* = \frac{2 P \sqrt{3}}{\sigma_0 L} \begin{bmatrix} 1 & -1/2 \\ -1/2 & 1 \end{bmatrix}. \tag{2.1.11} \]

The matrix H* is positive definite (check the principal minors), thereby proving the
sufficiency condition for the optimality of the design. •••
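
The principal-minor check suggested above is short enough to carry out explicitly. The sketch below works only with the bracketed matrix of Eq. (2.1.11), since the factor in front of it is positive and does not affect definiteness. The closed-form eigenvalue helper for a symmetric 2 x 2 matrix is an illustrative addition and not part of the original example.

```python
import math


def symmetric_2x2_eigenvalues(a, b, c):
    """Eigenvalues of [[a, b], [b, c]] from the closed-form quadratic formula."""
    mean = 0.5 * (a + c)
    radius = math.sqrt((0.5 * (a - c)) ** 2 + b ** 2)
    return mean - radius, mean + radius


# Entries of the bracketed matrix in Eq. (2.1.11).
a, b, c = 1.0, -0.5, 1.0

first_minor = a                # determinant of the 1 x 1 top-left block
second_minor = a * c - b * b   # determinant of the full 2 x 2 matrix
eigenvalues = symmetric_2x2_eigenvalues(a, b, c)

print(first_minor, second_minor)  # 1.0 0.75
print(eigenvalues)                # (0.5, 1.5)
```

Both leading minors and both eigenvalues are positive, which agrees with the statement that H* is positive definite.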

Example 2.1.2

Consider an inextensible structural cable with zero bending stiffness. The cable is
stretched by applying a horizontal force $F_h$ at the ends of the cable, two points
separated by a distance L, and carries a vertical distributed load of intensity p(x),
Figure (2.1.2).
If the cross-sectional area of the cable is allowed to vary along its length so that
the axial stress is equal to the allowable stress $\sigma_0$, determine the optimum value of
the horizontal pull $F_h$ that will minimize the total volume of material of the cable for
a uniform load of $p(x) = p_0$.

Figure 2.1.2 Structural cable design.

Neglecting the weight of the cable, we obtain the equilibrium equations in the
horizontal and vertical directions of a cable as

\[ F \cos\theta = F_h = \text{constant}, \tag{2.1.12} \]

where $\theta$ is the angle between the horizontal coordinate axis x and the tangent to the
arc length coordinate s such that $\cos\theta = dx/ds$. For a uniform loading, the second
equilibrium equation can be solved for the vertical displacement along the length of
the cable by integrating twice and making use of the zero displacement conditions at
the two ends to yield
(2.1.13)

The total volume of material in the cable to be minimized is

\[ V = \int_0^L dV, \tag{2.1.14} \]

where

\[ dV = A(s)\, ds. \tag{2.1.15} \]

With the assumption that the cross-sectional area is to be fully stressed, $A(s) = F/\sigma_0$,
the total volume can be expressed as

(2.1.16)

Since
(2.1.17)
Eq. (2.1.16) can be written as

\[ V = \frac{F_h}{\sigma_0} \int_0^L \left[ 1 + \left( \frac{dy}{dx} \right)^2 \right] dx. \tag{2.1.18} \]

Substituting the first derivative of the displacement function of Eq. (2.1.13) into
the above equation, we can show that the volume of the material is related to the
horizontal pull as

\[ V = \frac{F_h L}{\sigma_0} + \frac{p_0^2 L^3}{12\, \sigma_0 F_h}. \tag{2.1.19} \]

If the horizontal pull is small, the volume increases because the cable becomes longer.
If, on the other hand, the horizontal pull is very large the cross-sectional area has
to be large in order to keep the stress level at $\sigma_0$, although the length of the cable
approaches the minimum distance between the support points.

The optimum value of the horizontal pull can be obtained from

\[ \frac{dV}{dF_h} = 0, \tag{2.1.20} \]

which produces

\[ F_h^* = \frac{p_0 L}{\sqrt{12}}. \tag{2.1.21} \]

This corresponds to a minimum total volume of

\[ V^* = \frac{2 F_h^* L}{\sigma_0} = \frac{p_0 L^2}{\sqrt{3}\, \sigma_0}, \tag{2.1.22} \]

and an optimal cross-sectional area distribution of

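\[ A^*(x) = \frac{F_h^*}{\sigma_0} \left[ 1 + \left( \frac{dy}{dx} \right)^2 \right]^{1/2} = \frac{p_0 L}{\sigma_0} \left[ \frac{1}{12} + \left( \frac{1}{2} - \frac{x}{L} \right)^2 \right]^{1/2}. \tag{2.1.23} \]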

•••
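
The trade-off between cable length and cable area can be explored numerically. The sketch below is only an illustration under stated assumptions: the slope magnitude $p_0 (L - 2x) / (2 F_h)$ of the parabolic cable shape is assumed here (it is consistent with Eq. (2.1.23)), the closed-form volume follows from carrying out the integral of Eq. (2.1.18) with that slope, and the load, span, and stress values are hypothetical numbers in consistent units.

```python
import math


def slope_magnitude(x, pull, load, span):
    """|dy/dx| of the parabolic cable shape under a uniform load (assumed form)."""
    return abs(load * (span - 2.0 * x) / (2.0 * pull))


def volume_by_quadrature(pull, load, span, stress, intervals=2000):
    """Evaluate Eq. (2.1.18) with the composite midpoint rule."""
    dx = span / intervals
    total = 0.0
    for k in range(intervals):
        x = (k + 0.5) * dx
        total += (1.0 + slope_magnitude(x, pull, load, span) ** 2) * dx
    return pull / stress * total


def volume_closed_form(pull, load, span, stress):
    """Result of integrating Eq. (2.1.18) analytically for the assumed slope."""
    return pull * span / stress + load ** 2 * span ** 3 / (12.0 * stress * pull)


# Hypothetical values of p0, L, and sigma0.
load, span, stress = 2.0, 10.0, 100.0
best_pull = load * span / math.sqrt(12.0)   # Eq. (2.1.21)

for factor in (0.5, 0.9, 1.0, 1.1, 2.0):
    pull = factor * best_pull
    print(factor,
          volume_closed_form(pull, load, span, stress),
          volume_by_quadrature(pull, load, span, stress))
```

The printed volumes are smallest for the factor 1.0, and the quadrature and closed-form values agree closely, which supports the algebra leading to the optimum pull.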
Although applications of classical calculus can be demonstrated for many other
structures such as beams and arches, it is appropriate to mention the aspects and
assumptions which make these problems tractable using ordinary calculus. The truss
example discussed above could be treated by ordinary calculus because
of several simplifying assumptions. First, some of the potential design variables, such
as the cross-sectional areas of the truss members, were eliminated by assuming the
stresses in each member to be equal to the maximum allowable value. Second, the
analysis was simplified by neglecting the effect of the self-weight of the truss members on
structural response, and by ignoring possible buckling of those members loaded in
compression. Most realistic structural optimization problems cannot be simplified to
the point where they can be solved by ordinary calculus.


2.2 Optimization Using Variational Calculus

Some structural design problems, when formulated as optimization problems,
have an objective function in the form of a definite integral involving an unknown
function and some of its derivatives. Such forms, called functionals, assume a specific
numerical value for each function that is substituted into them. The task of the designer is
to find a suitable function that minimizes the functional. The branch of mathematics
that deals with the maxima and minima of functionals is called the Calculus of
Variations. Certain aspects of the methods used in the calculus of variations are
analogous to procedures used for differential calculus, and are discussed in this section.

2.2.1 Introduction to the Calculus of Variations

Consider the problem of determining a function y(x) given at two points, $y(a) = y_a$
and $y(b) = y_b$, for which the integral

\[ J = \int_a^b F(x, y, y')\, dx \tag{2.2.1} \]

assumes a minimum or a maximum value ($y' \equiv dy/dx$). The end conditions on y(a)
and y(b) are referred to as kinematic boundary conditions for the problem. In a more
general case F can be a function of more than one function $(y_1, y_2, \ldots, y_p)$, and each
of these functions can depend on n independent variables $(x_1, x_2, \ldots, x_n)$. Also, higher
order derivatives of these functions with respect to the independent variables may be
included in F. This brief introduction, however, is limited to a functional expressed in
terms of a single function with one independent variable. A more general discussion
of the methods of variational calculus is available in many textbooks (e.g., [2-4]).
Assuming y*(x) to be the function that minimizes our integral, consider another
function y(x) obtained by a small variation $\delta y$ from y*(x),

\[ y(x) = y^*(x) + \delta y = y^*(x) + \epsilon\, \eta(x), \tag{2.2.2} \]

where $\epsilon$ is a small amplitude parameter and $\eta(x)$ a shape function. The function $\eta(x)$
must satisfy the kinematic boundary conditions

\[ \eta(a) = 0, \quad \text{and} \quad \eta(b) = 0, \tag{2.2.3} \]

so that y(a) and y(b) will remain unchanged. We substitute Eq. (2.2.2) into the
integral (2.2.1), so that J becomes a function of only the perturbation parameter $\epsilon$

\[ J(\epsilon) = \int_a^b F(x,\; y^* + \epsilon\eta,\; y^{*\prime} + \epsilon\eta')\, dx. \tag{2.2.4} \]


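Knowing that the value of the integral J attains an extremum for $\epsilon = 0$, one can
use ordinary calculus to write the necessary condition

\[ \left. \frac{dJ}{d\epsilon} \right|_{\epsilon = 0} = \int_a^b \left( \frac{\partial F}{\partial y} \frac{dy}{d\epsilon} + \frac{\partial F}{\partial y'} \frac{dy'}{d\epsilon} \right) dx = 0. \tag{2.2.5} \]

Using Eq. (2.2.2) and defining $\epsilon\, (dJ/d\epsilon)|_{\epsilon = 0}$ to be the first variation of the functional
J, denoted by $\delta J$, we obtain

\[ \delta J = \int_a^b \left( \frac{\partial F}{\partial y} \delta y + \frac{\partial F}{\partial y'} \delta y' \right) dx = 0. \tag{2.2.6} \]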

The variational operator $\delta$ is analogous to the differential operator in ordinary
calculus, and the same rules that apply to the differential operator apply to the
variational operator. The property of interchangeability of the two operators

\[ \epsilon \eta' = \epsilon \frac{d\eta}{dx} = \frac{d}{dx}(\epsilon \eta) = \frac{d}{dx} \delta y = \delta \left( \frac{dy}{dx} \right) = \delta y', \tag{2.2.7} \]

has been used in order to arrive at Eq. (2.2.6).


In the more general case F depends on more than one function and on higher
order derivatives of these functions with respect to the independent variable x. For
example, if

\[ J = \int_a^b F(x, y_1, y_2, y_1', y_2', y_2'')\, dx, \tag{2.2.8} \]

then the condition that the variation of the functional is zero may be written as

\[ \delta J = \int_a^b \left( \frac{\partial F}{\partial y_1} \delta y_1 + \frac{\partial F}{\partial y_1'} \delta y_1' + \frac{\partial F}{\partial y_2} \delta y_2 + \frac{\partial F}{\partial y_2'} \delta y_2' + \frac{\partial F}{\partial y_2''} \delta y_2'' \right) dx = 0. \tag{2.2.9} \]

The necessary condition for an extremum expressed in the form of Eq. (2.2.6) or
(2.2.9) is usually not very useful. The terms that involve variation of derivatives
can be integrated by parts in order to obtain more useful conditions. For example,
integrating the second term of Eq. (2.2.6) by parts and rearranging we write

\[ \delta J = \left. \frac{\partial F}{\partial y'} \delta y \right|_a^b + \int_a^b \left[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \right] \delta y\, dx = 0. \tag{2.2.10} \]

For our problem the first term on the right hand side of Eq. (2.2.10) vanishes due
to the fact that the arbitrary function $\eta(x)$ satisfies the boundary conditions,
$\eta(a) = \eta(b) = 0$. By the definition of the variation it follows that

\[ \delta y(a) = \delta y(b) = 0. \tag{2.2.11} \]

Thus, the necessary condition for the extremum of J reduces to

\[ \int_a^b \left[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) \right] \delta y\, dx = 0. \tag{2.2.12} \]

Finally, since $\delta y$ is arbitrary, we conclude that the coefficient of $\delta y$ in Eq. (2.2.12)
must vanish identically over the interval of integration. Therefore, if y(x) is to
minimize (or maximize) J, it must satisfy the following condition, known as the
Euler-Lagrange equation,

\[ \frac{\partial F}{\partial y} - \frac{d}{dx} \left( \frac{\partial F}{\partial y'} \right) = 0. \tag{2.2.13} \]
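
The role of the perturbation parameter can be illustrated on a functional simple enough to check by hand. The sketch below assumes the hypothetical integrand $F = y'^2$ on the interval [0, 1] with y(0) = 0 and y(1) = 1, for which the Euler-Lagrange equation gives y'' = 0 and hence the straight line y* = x. The shape function sin(pi x) is an arbitrary choice that vanishes at both ends.

```python
import math


def functional(epsilon, intervals=1000):
    """J(eps) = integral over [0, 1] of (y')^2 for y = x + eps * sin(pi * x)."""
    dx = 1.0 / intervals
    total = 0.0
    for k in range(intervals):
        x = (k + 0.5) * dx
        slope = 1.0 + epsilon * math.pi * math.cos(math.pi * x)
        total += slope ** 2 * dx
    return total


for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(eps, functional(eps))

# Central-difference estimate of dJ/d(eps) at eps = 0.
step = 1e-4
rate_at_zero = (functional(step) - functional(-step)) / (2.0 * step)
print(rate_at_zero)
```

The values of J grow on either side of zero and the estimated derivative at zero is negligible, so the straight line is a stationary function for this family of perturbations.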

If the value of the unknown function is not specified at either or both ends, then
the variation of y(x) need not vanish at those points. However, the first term on
the right hand side of Eq. (2.2.10) must still vanish independently, in order for the
relation to hold. That is, if y(x) is not prescribed at the end points, the following
conditions, often called the natural boundary conditions, must be satisfied:

\[ \left[ \frac{\partial F}{\partial y'} \right]_{x=a} = 0, \quad \text{and} \quad \left[ \frac{\partial F}{\partial y'} \right]_{x=b} = 0. \tag{2.2.14} \]

Example 2.2.1

Figure 2.2.1 Supported cable under its own weight.

Consider the problem of determining the equilibrium configuration y(x) of a flexible,
constant cross-section cable hanging under its own weight between two points, a
distance l apart, as shown in Figure (2.2.1). This is a rather well-known fixed-end-point
problem of the calculus of variations.
The cable assumes a position that is consistent with its potential energy being a
minimum. Hence, to determine the equilibrium shape y(x) we need to minimize the
potential energy functional which can be expressed in terms of the unknown shape
function as

\[ J = \int \rho g\, y\, ds, \tag{2.2.15} \]

where $\rho g$ is the weight per unit length and ds is an element of arc length of the cable.
Relating the arc length to the horizontal coordinate x, with the origin at the center,
we rewrite Eq. (2.2.15) as

\[ J = \rho g \int_{-l/2}^{l/2} y \sqrt{1 + y'^2}\, dx. \tag{2.2.16} \]

At this point one can either take the variation of Eq. (2.2.16) or, since this is a
fixed-end-point problem, apply the Euler-Lagrange equation of Eq. (2.2.13) derived
previously. The resulting necessary condition for the potential energy to be minimum
reduces to the following ordinary differential equation

\[ \sqrt{1 + y'^2} - \frac{d}{dx} \left( \frac{y y'}{\sqrt{1 + y'^2}} \right) = 0. \tag{2.2.17} \]

Expanding the second term and rearranging terms, we simplify Eq. (2.2.17) to

\[ y y'' - y'^2 - 1 = 0. \tag{2.2.18} \]

Introducing $dy/dx = t$ and $d^2 y/dx^2 = t\, dt/dy$, we rewrite Eq. (2.2.18) as

\[ \frac{t\, dt}{t^2 + 1} = \frac{dy}{y}. \tag{2.2.19} \]

Integrating Eq. (2.2.19) once we obtain

\[ t = \frac{dy}{dx} = \sqrt{\frac{y^2}{c_1^2} - 1}. \tag{2.2.20} \]

Finally, one more integration yields

\[ y = c_1 \cosh \left( \frac{x}{c_1} + c_2 \right). \tag{2.2.21} \]

The condition

\[ \left. \frac{dy}{dx} \right|_{x=0} = 0 \tag{2.2.22} \]

yields $c_2 = 0$, while $c_1$ can be determined from the condition

\[ y(-l/2) = y(l/2) = c_1 \cosh \left( \frac{l}{2 c_1} \right). \tag{2.2.23} \]

Equation (2.2.21) is the equation of a catenary. •••
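
As a check on the result, the catenary with $c_2 = 0$ can be substituted into Eq. (2.2.18), and $c_1$ can be found numerically from Eq. (2.2.23). The sketch below is illustrative only: the span, the end height, and the bracketing interval for the bisection are hypothetical values, and the bracket was chosen because the end condition can admit more than one root.

```python
import math


def catenary(x, c1):
    """Catenary shape y = c1 * cosh(x / c1), i.e. the solution with c2 = 0."""
    return c1 * math.cosh(x / c1)


def ode_residual(x, c1):
    """Residual y*y'' - y'^2 - 1 of Eq. (2.2.18) evaluated on the catenary."""
    y = catenary(x, c1)
    slope = math.sinh(x / c1)
    curvature = math.cosh(x / c1) / c1
    return y * curvature - slope ** 2 - 1.0


def solve_c1(span, end_height, lower, upper, tolerance=1e-12):
    """Bisection on c1 * cosh(span / (2 * c1)) = end_height within [lower, upper]."""
    def mismatch(c1):
        return c1 * math.cosh(span / (2.0 * c1)) - end_height

    if mismatch(lower) * mismatch(upper) > 0.0:
        raise ValueError("bracket does not contain a root")
    while upper - lower > tolerance:
        middle = 0.5 * (lower + upper)
        if mismatch(lower) * mismatch(middle) <= 0.0:
            upper = middle
        else:
            lower = middle
    return 0.5 * (lower + upper)


span, end_height = 2.0, 2.0           # hypothetical l and y(l/2)
c1 = solve_c1(span, end_height, lower=0.9, upper=5.0)

print(c1)
print(catenary(span / 2.0, c1))       # reproduces the prescribed end height
print(max(abs(ode_residual(x, c1)) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)))
```

The residual is zero to rounding error at every sampled point, because $\cosh^2 - \sinh^2 = 1$.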


2.3 Classical Methods for Constrained Problems

Most practical structural optimization problems have constraints on design variables
in the form of limits or algebraic relations in terms of these design variables.
These constraints may be related to the functional requirements of the design, geometry,
availability of the resources, or appearance and esthetic appeal. In this section
we will consider problems with equality constraints on the design variables. Although
the constraints mentioned above most often appear as inequality constraints, they
can be converted into equivalent equality constraints as will be discussed later on.
The general form of the equality constrained problem can be expressed in the
following form.

\[ \begin{aligned} &\text{Minimize} && f(x), \quad x = (x_1, \ldots, x_n)^T, \\ &\text{subject to} && h_j(x) = 0, \quad j = 1, \ldots, n_e, \end{aligned} \tag{2.3.1} \]

where the number of independent equality constraints $n_e$ is less than or equal to the
number of design variables n. If the number of constraints is larger than the number
of design variables, then the problem is overconstrained and, in general, there is no
solution.
There is more than one approach to solving problems posed in the form of Eq.
(2.3.1). If the equality constraints can be solved explicitly for $n_e$ design variables in
terms of a set of $n - n_e$ independent design variables, then the objective function can
be written in terms of the $n - n_e$ independent design variables. The new objective
function will not be subject to any constraints and can be minimized using techniques
discussed in the previous section.
For example, for a minimization problem with two design variables subject to a
single equality constraint

\[ \begin{aligned} &\text{Minimize} && f(x_1, x_2) \\ &\text{subject to} && h(x_1, x_2) = 0, \end{aligned} \tag{2.3.2} \]

we can solve for one of the design variables from the constraint relation,

\[ x_1 = h_c(x_2), \tag{2.3.3} \]

and substitute into the objective function. The resulting new objective function,

\[ f_r(x_2) = f(h_c(x_2), x_2), \tag{2.3.4} \]

can be differentiated with respect to the independent design variable $x_2$, and $df_r/dx_2$
can be set to zero to determine the optimum value of $x_2$. The optimum value of
$x_1$ can then be obtained from Eq. (2.3.3).
The procedure outlined above is called the variable-elimination or direct substitution
method. For problems in which the constraint equations cannot be solved explicitly,
for example, when the constraints are defined in terms of integrals, another method
called the method of Lagrange multipliers is used.
2.3.1 Method of Lagrange Multipliers

In essence, the method of Lagrange multipliers in calculus of variations is a direct
extension of the method for constrained minimization in differential calculus. We
start by reviewing the method as used in differential calculus. For an objective
function f(x) of n design variables to be a minimum, the differential change in the
objective function must still vanish,

\[ df = \frac{\partial f}{\partial x_1} dx_1 + \frac{\partial f}{\partial x_2} dx_2 + \cdots + \frac{\partial f}{\partial x_n} dx_n = 0. \tag{2.3.5} \]

However, now the derivative terms cannot be set to zero individually because the
differential changes in the design variables $(dx_1, dx_2, \ldots, dx_n)$ are dependent on one
another through the constraint equations.
For simplicity, assume only a single constraint relation h(x) = 0, so that the differential
changes in the design variables are related through

\[ dh = \frac{\partial h}{\partial x_1} dx_1 + \frac{\partial h}{\partial x_2} dx_2 + \cdots + \frac{\partial h}{\partial x_n} dx_n = 0. \tag{2.3.6} \]

We can multiply Eq. (2.3.6) by an arbitrary (for the time being) constant, $\lambda$, and
add it to Eq. (2.3.5) to obtain (see [4])

\[ \left( \frac{\partial f}{\partial x_1} + \lambda \frac{\partial h}{\partial x_1} \right) dx_1 + \left( \frac{\partial f}{\partial x_2} + \lambda \frac{\partial h}{\partial x_2} \right) dx_2 + \cdots + \left( \frac{\partial f}{\partial x_n} + \lambda \frac{\partial h}{\partial x_n} \right) dx_n = 0. \tag{2.3.7} \]

Let $\lambda$ be determined so that the quantities inside each of the parentheses vanish
to satisfy the previous equation. This leads to n equations for the n + 1 unknowns,
the n design variables, and the unknown multiplier $\lambda$ called the Lagrange multiplier.
The constraint relation h(x) = 0 provides the requisite (n + 1)th relation. Equations
(2.3.7) and (2.3.2) are exactly what one would obtain by an unconstrained minimization
of an auxiliary function $f + \lambda h$ with respect to the design variables and the
Lagrange multiplier $\lambda$.
For multiple constraint functions, one has to introduce a Lagrange multiplier for
each of the constraint functions. Therefore, in general an optimization problem with
an objective function with n design variables plus $n_e$ equality constraints stated in
Eq. (2.3.1) is equivalent to an unconstrained problem with an auxiliary function

\[ \mathcal{L}(x, \lambda) = f(x) + \sum_{j=1}^{n_e} \lambda_j h_j. \tag{2.3.8} \]

The optimum values of the design variables can be obtained by solving a system of
$n + n_e$ equations

\[ \begin{aligned} \frac{\partial \mathcal{L}}{\partial x_i} &= 0, \qquad i = 1, \ldots, n, \\ \frac{\partial \mathcal{L}}{\partial \lambda_j} &= 0, \qquad j = 1, \ldots, n_e, \end{aligned} \tag{2.3.9} \]

for $n + n_e$ unknowns.
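
For a quadratic objective with a linear constraint the stationarity conditions form a linear system, which makes a compact illustration possible. The sketch below assumes the hypothetical problem of minimizing $x_1^2 + x_2^2$ subject to $x_1 + x_2 - 1 = 0$. The auxiliary function is then $x_1^2 + x_2^2 + \lambda (x_1 + x_2 - 1)$, and its three partial derivatives give the three equations assembled in the code. The small Gaussian elimination routine is an illustrative helper, not a production solver.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


# Unknowns ordered as (x1, x2, lambda):
#   2*x1 + lambda = 0     (derivative with respect to x1)
#   2*x2 + lambda = 0     (derivative with respect to x2)
#   x1 + x2       = 1     (derivative with respect to lambda, i.e. the constraint)
stationarity_matrix = [[2.0, 0.0, 1.0],
                       [0.0, 2.0, 1.0],
                       [1.0, 1.0, 0.0]]
stationarity_rhs = [0.0, 0.0, 1.0]

x1, x2, lam = solve_linear(stationarity_matrix, stationarity_rhs)
print(x1, x2, lam)  # 0.5 0.5 -1.0
```

The multiplier appears as one more unknown and is determined together with the design variables, so no explicit solution of the constraint for one of the variables is needed.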

Example 2.3.1

Figure 2.3.1 Design of trusses with displacement constraint.

Consider a general truss structure with n members under concentrated loads
acting at the junctions of the members. The objective of the design is to minimize
the total volume of material used for the members while specifying the displacement
$\Delta$ in a given direction, and at a given point of the truss. In general, displacement
constraints at more than one location can be imposed, but for simplicity only one
displacement constraint is considered. Since the overall topology of the structure is
fixed, the only design variables are the cross-sectional areas $A_i$ (i = 1, ..., n) of the
members.
In order to pose the problem as a constrained minimization problem, we need to
express the displacement constraint in terms of the design variables. The dummy-load
method (Method of Virtual Load), which is a special case of the principle of virtual
forces or the principle of complementary virtual work, is used for this purpose. The
principle of complementary virtual work states that the strains and displacements in
a deformable body are compatible and consistent with the support conditions if and
only if the total complementary virtual work is zero [5]

\[ \delta W_I^* + \delta W_E^* = 0. \tag{2.3.10} \]

Here $\delta W_I^*$ is the internal complementary virtual work and $\delta W_E^*$ is the complementary
virtual work of the external forces. The dummy-load method starts by applying
a unit virtual load at the point of unknown displacement along the displacement
component of interest. The internal complementary virtual work under this loading
can be expressed as

\[ \delta W_I^* = -\delta U^* = -\int_V \delta\sigma_{ij}\, \epsilon_{ij}\, dV, \tag{2.3.11} \]

where $\delta U^*$ is the complementary strain energy, $\epsilon_{ij}$ is the strain field under the actual
loads, and $\delta\sigma_{ij}$ is the virtual stress field due to the dummy-load. In absence of the

body forces the external complementary virtual work is

\[ \delta W_E^* = \int_S u_i\, \delta t_i\, dS, \tag{2.3.12} \]

where $u_i$ are the components of the surface displacements and $\delta t_i$ are the components
of the applied virtual tractions. For a two-dimensional truss structure with n constant
cross-section members, Eqs. (2.3.10), (2.3.11), and (2.3.12) yield

\[ \Delta \times 1 = \sum_{i=1}^{n} \delta\sigma_i\, \epsilon_i L_i A_i, \tag{2.3.13} \]

where $L_i$ is the length of the ith member, $\epsilon_i$ is the strain due to the actual loads, and
$\delta\sigma_i$ is the dummy stress in the ith member. Relating the stresses and strains to the
design variables, we can rewrite Eq. (2.3.13) as

\[ \Delta = \sum_{i=1}^{n} \frac{f_i F_i}{A_i E_i} L_i, \tag{2.3.14} \]

where $f_i$ and $F_i$ are the dummy and actual internal forces in the ith member, respectively,
and $E_i$ is the elastic modulus of the ith member.
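We can now formulate the design problem in the standard form of Eq. (2.3.1) as

\[ \begin{aligned} &\text{Minimize} && \sum_{i=1}^{n} A_i L_i \\ &\text{subject to} && \sum_{i=1}^{n} \frac{f_i F_i}{A_i E_i} L_i - \Delta = 0. \end{aligned} \tag{2.3.15} \]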

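Introducing the Lagrange multiplier, we write the auxiliary objective function as

\[ \mathcal{L}(A, \lambda) = \sum_{i=1}^{n} A_i L_i + \lambda \left( \sum_{i=1}^{n} \frac{f_i F_i}{A_i E_i} L_i - \Delta \right). \tag{2.3.16} \]

Then the necessary conditions for an extremum are given by the following set of equations.

\[ \frac{\partial \mathcal{L}}{\partial A_i} = L_i - \lambda \frac{f_i F_i}{A_i^2 E_i} L_i = 0, \tag{2.3.17} \]

\[ \frac{\partial \mathcal{L}}{\partial \lambda} = \sum_{i=1}^{n} \frac{f_i F_i}{A_i E_i} L_i - \Delta = 0. \tag{2.3.18} \]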

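Solving for the cross-sectional areas from Eq. (2.3.17) in terms of the Lagrange
multiplier and substituting back into Eq. (2.3.18), we can determine the value of the
Lagrange multiplier in terms of the specified displacement $\Delta$ by

\[ \sqrt{\lambda} = \frac{1}{\Delta} \sum_{i=1}^{n} L_i \left( \frac{f_i F_i}{E_i} \right)^{1/2}. \tag{2.3.19} \]

Then, the optimum values of the cross-sectional areas are

\[ A_i^* = \frac{1}{\Delta} \left[ \sum_{j=1}^{n} L_j \left( \frac{f_j F_j}{E_j} \right)^{1/2} \right] \left( \frac{f_i F_i}{E_i} \right)^{1/2}. \tag{2.3.20} \]

Note that the term inside the square brackets is a constant. We determine the
corresponding total volume of material by substituting Eq. (2.3.20) into the objective
function to obtain

\[ V^* = \frac{1}{\Delta} \left[ \sum_{i=1}^{n} L_i \left( \frac{f_i F_i}{E_i} \right)^{1/2} \right]^2. \tag{2.3.21} \]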

•••
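The closed-form sizing rule of this example can be checked numerically. The following is a minimal sketch, not part of the original text: the member data are hypothetical, the function names `size_truss` and `displacement` are introduced here for illustration, and the closed form is assumed to apply only when the product $f_i F_i$ is positive in every member. It solves Eq. (2.3.17) for the areas, fixes the multiplier from Eq. (2.3.18), and then evaluates Eq. (2.3.14) for the resulting design.

```python
import math


def size_truss(f, F, L, E, delta):
    """Return (areas, volume, lam) from the stationarity conditions (2.3.17)-(2.3.18).

    f, F  : dummy and actual member forces (f[i] * F[i] is assumed positive)
    L, E  : member lengths and elastic moduli
    delta : prescribed displacement
    """
    roots = []
    for fi, Fi, Ei in zip(f, F, E):
        if fi * Fi <= 0.0:
            raise ValueError("closed form assumes f_i * F_i > 0 for every member")
        roots.append(math.sqrt(fi * Fi / Ei))
    weighted_sum = sum(Li * r for Li, r in zip(L, roots))
    sqrt_lam = weighted_sum / delta            # from the displacement constraint
    areas = [sqrt_lam * r for r in roots]      # A_i = sqrt(lam * f_i * F_i / E_i)
    volume = sum(Ai * Li for Ai, Li in zip(areas, L))
    return areas, volume, sqrt_lam ** 2


def displacement(f, F, L, E, areas):
    """Displacement from the dummy-load expression, Eq. (2.3.14)."""
    return sum(fi * Fi * Li / (Ai * Ei)
               for fi, Fi, Li, Ei, Ai in zip(f, F, L, E, areas))


# Hypothetical three-member data in consistent units.
f = [0.5, 1.0, 0.8]
F = [1.0e4, 2.0e4, 1.5e4]
L = [1.0, 1.5, 2.0]
E = [2.0e11, 2.0e11, 7.0e10]
delta = 1.0e-3

areas, volume, lam = size_truss(f, F, L, E, delta)
print(areas, volume, displacement(f, F, L, E, areas))
```

With these inputs the computed displacement should reproduce `delta`, and the volume should equal `lam * delta`, which is another way of writing the optimum volume of this example.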

2.3.2 Function Subjected to an Integral Constraint

For problems in which the unknown design variables are functions constrained by functionals, variational calculus also employs Lagrange multipliers. Recall that for the supported cable problem the Euler-Lagrange equation was obtained by allowing the variation of the cable shape function $\delta y$ to be arbitrary, or in other words by allowing $y(x)$ to be completely unconstrained except for the kinematic boundary conditions. However, if the function $y(x)$ is required to satisfy a subsidiary integral constraint of the form

$$\int_a^b g[y(x)]\,dx = c, \tag{2.3.22}$$

then the extremum of the functional $J[y(x)]$ can be determined by the use of the Lagrange multiplier technique. In this case the necessary condition for an extremum is the vanishing of the first variation of an auxiliary functional

$$\mathcal{L} = J[y(x)] + \lambda\left[\int_a^b g[y(x)]\,dx - c\right]. \tag{2.3.23}$$

In the following example we illustrate the use of this technique for determination
of the cross-sectional area distribution of minimum weight beams for a specified
displacement at a point along the span.

Example 2.3.2

Figure 2.3.2 Design of beams for a specified displacement [6].

Consider a statically determinate beam of variable cross section $A(x)$ loaded by concentrated and/or distributed loads and moments which produce a moment distribution $M(x)$ along the beam. We want to minimize the volume $V$ of the beam subject to the requirement that the displacement at a point $x = \xi$ is equal to a specified value $\Delta$. This problem, studied by Barnett [6], is formulated as

$$\text{Minimize}\quad V = \int_0^l A(x)\,dx \qquad \text{subject to}\quad w(\xi) - \Delta = 0. \tag{2.3.24}$$

A convenient expression for the displacement of a beam at a point $x = \xi$ is obtained again by using the method of virtual load discussed in the previous example problem. That is

$$w(\xi) = \int_0^l \frac{M(x)\,m(x)}{EI(x)}\,dx, \tag{2.3.25}$$

where $m(x)$ is the moment distribution generated by a unit load applied at $x = \xi$, $E$ is the elastic modulus of the beam material, and $I(x)$ is the cross-sectional moment of inertia. Since the cross-sectional area distribution function of the beam is the design variable, the moment of inertia term has to be expressed in terms of the area. Commonly, the beam moment of inertia function is related to the cross-sectional area function as

$$I(x) = \alpha[A(x)]^n, \tag{2.3.26}$$

where $\alpha$ is a constant related to some physical dimension of the cross-section, and $n$ is a constant that depends on the physical relation between the two functions.

Here we limit the constant n to the integer values of 1,2, or 3. The case of n = 1
is for a rectangular cross-section beam of constant depth whose width varies along
the length. Such a beam is sometimes referred to as a plane-tapered beam. The
case n = 2 is obtained when both the width and the depth of the cross-section vary
without changing its aspect ratio, and finally the case n = 3 is for a cross-section
with a variable depth and a constant width. The latter may be referred to as the
depth-tapered beam.
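The auxiliary functional for the minimization problem, Eq. (2.3.24), takes the following form:

$$\mathcal{L} = \int_0^l A(x)\,dx + \lambda\left[\int_0^l \frac{M(x)\,m(x)}{E\alpha A^n(x)}\,dx - \Delta\right]. \tag{2.3.27}$$

The necessary condition for the constrained minimum is the vanishing of the first variation of this auxiliary functional. At this point we set $n = 1$ in order to simplify the following derivation. The first variation of Eq. (2.3.27) becomes

$$\delta\mathcal{L} = \int_0^l \left[1 - \lambda\frac{M(x)\,m(x)}{\alpha E A^2(x)}\right]\delta A\,dx = 0. \tag{2.3.28}$$

The corresponding Euler-Lagrange equation is

$$1 - \lambda\frac{M(x)\,m(x)}{\alpha E A^2(x)} = 0, \quad\text{or}\quad A(x) = \sqrt{\frac{\lambda M(x)\,m(x)}{\alpha E}}. \tag{2.3.29}$$

The unknown Lagrange multiplier in Eq. (2.3.29) must be determined from the displacement constraint in Eq. (2.3.24). That is, using Eqs. (2.3.25), (2.3.26), and (2.3.29) in Eq. (2.3.24) we can extract

$$\lambda = \frac{1}{\Delta^2}\left[\int_0^l \sqrt{\frac{M(x)\,m(x)}{\alpha E}}\,dx\right]^2. \tag{2.3.30}$$

Then, the optimal area distribution and the corresponding volume are given by

$$A^*(x) = \frac{1}{\Delta}\left[\int_0^l \sqrt{\frac{M(x)\,m(x)}{\alpha E}}\,dx\right]\sqrt{\frac{M(x)\,m(x)}{\alpha E}} \tag{2.3.31}$$

and

$$V^* = \frac{1}{\Delta}\left[\int_0^l \sqrt{\frac{M(x)\,m(x)}{\alpha E}}\,dx\right]^2, \tag{2.3.32}$$

respectively. •••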
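As a numerical illustration of the $n = 1$ result, the sketch below evaluates the area distribution that follows from the Euler-Lagrange equation (2.3.29) and the displacement constraint for a hypothetical cantilever with a tip load and a prescribed tip displacement. The load case, the numerical values, and the helper names `midpoint` and `optimal_area` are assumptions made here for illustration; the product $M(x)m(x)$ is assumed non-negative along the span. A midpoint rule is used so that the free end, where the optimal area vanishes, is never sampled.

```python
import math


def midpoint(func, a, b, n=1000):
    """Composite midpoint rule; never evaluates func at the end points."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def optimal_area(M, m, length, alpha, E, delta, n=1000):
    """Area function and volume for I = alpha * A (n = 1), assuming M * m >= 0."""
    def root(x):
        return math.sqrt(M(x) * m(x) / (alpha * E))

    S = midpoint(root, 0.0, length, n)   # integral of sqrt(M m / (alpha E))

    def area(x):
        return (S / delta) * root(x)     # sqrt(lam) = S / delta

    return area, S * S / delta


# Hypothetical cantilever: fixed at x = 0, tip load P and unit dummy load at x = length.
P, length, alpha, E, delta = 500.0, 2.0, 1.0e-4, 7.0e10, 5.0e-3


def M(x):
    return -P * (length - x)


def m(x):
    return -(length - x)


area, volume = optimal_area(M, m, length, alpha, E, delta)

# Displacement of the optimal design from the virtual-load integral, Eq. (2.3.25).
w_tip = midpoint(lambda x: M(x) * m(x) / (E * alpha * area(x)), 0.0, length)
print(volume, w_tip)
```

For this particular load case the integrals can also be done by hand, giving a linear taper and a volume of $P l^4/(4\alpha E\Delta)$, which the numerical result should match.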

2.3.3 Finite Subsidiary Conditions

The problems discussed in the previous section involve a rather simple integral
constraint that requires a constant Lagrange multiplier in the auxiliary functional. In
a more general case, as mentioned earlier, we are interested in extremizing functionals
of several functions and their derivatives with respect to more than one independent
variable [see Eq. (2.2.8)]. In addition, there may be m finite subsidiary constraints
of the form

i = 1, ... ,m, (2.3.33)

imposed on the problem. These constraints may range from simple algebraic equa-
tions to highly complicated differential equations that must be satisfied at every point
over the entire domain of the problem.

The Lagrange multiplier method, in this case, still reduces to extremizing an auxiliary functional of the form

(2.3.34)

The Lagrange multipliers, however, are no longer constants but functions of the coordinates $x_1, \ldots, x_n$.

Example 2.3.3

The problem described above can be best illustrated by the design of a cantilever beam of prescribed volume and prescribed loads for minimum deflection. Except for a slight change of notation, this example is based upon Makky and Ghalib's solution [7].

Figure 2.3.3 Optimum Design of a Beam for Minimum Deflection.
Figure 2.3.3 shows an elastic cantilever beam fixed at the end $x = 0$, free at the end $x = l$, and acted upon by a specified distribution of transverse loading $q(x)$ per unit of length. The objective is to minimize some norm of the transverse displacement of the beam for a given total volume, $V_0$. The norm we choose is the integral of the transverse displacement $w$ over the length of the beam. The loading $q(x)$ is restricted to be unidirectional in order to render the norm appropriate.
The functional to be minimized, in this case, is an integral of the displacement field $w(x)$ which must satisfy the equation of equilibrium of the beam as well as the constraint on the total volume of material. The equation of equilibrium is expressed as

$$[s(x)w'']'' - q(x) = 0, \tag{2.3.35}$$

with boundary conditions

$$\text{at } x = 0:\quad w = 0 \ \text{ and } \ w' = 0, \tag{2.3.36}$$

$$\text{at } x = l:\quad sw'' = 0 \ \text{ and } \ s'w'' + sw''' = 0, \tag{2.3.37}$$

$s(x)$ being the bending stiffness of the beam that can be related, through Eq. (2.3.26), to the cross-sectional area of the beam by

$$s(x) = EI(x) = \alpha E A^n(x), \quad n = 1, 2, \text{ or } 3. \tag{2.3.38}$$

In addition to the subsidiary condition of Eq. (2.3.35), we must specify an integral constraint on the total volume, namely

$$\int_0^l A(x)\,dx = V_0. \tag{2.3.39}$$

The auxiliary functional is

$$\mathcal{L}\left(w(x), s(x), A(x), \lambda_1, \lambda_2(x)\right) = \int_0^l w(x)\,dx + \lambda_1\left[\int_0^l A(x)\,dx - V_0\right] - \int_0^l \lambda_2(x)\left[sw'''' + 2s'w''' + s''w'' - q\right]dx, \tag{2.3.40}$$

which must be stationary with respect to the functions $w(x)$, $s(x)$, $A(x)$, $\lambda_2(x)$, and the parameter $\lambda_1$. We note, however, that $A(x)$ depends on $s(x)$ through Eq. (2.3.38). Hence,

$$\delta A = \left(\frac{dA}{ds}\right)\delta s. \tag{2.3.41}$$

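The first variation of $\mathcal{L}$,

$$\begin{aligned}
\delta\mathcal{L} ={}& \int_0^l \delta w\,dx + \lambda_1\left[\int_0^l \delta A\,dx\right] + \delta\lambda_1\left[\int_0^l A\,dx - V_0\right] \\
& - \int_0^l \delta\lambda_2(x)\left[sw'''' + 2s'w''' + s''w'' - q\right]dx \\
& - \int_0^l \lambda_2(x)\left[\delta s\,w'''' + s\,\delta w'''' + 2\delta s'\,w''' + 2s'\,\delta w''' + \delta s''\,w'' + s''\,\delta w''\right]dx = 0,
\end{aligned} \tag{2.3.42}$$

can be simplified by several integrations by parts. Collecting the terms multiplied by arbitrary variations $\delta w$, $\delta s$, $\delta\lambda_1$, and $\delta\lambda_2$ and equating them to zero independently we obtain the following Euler-Lagrange equations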

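$$\delta w:\quad 1 - (\lambda_2'' s)'' = 0, \tag{2.3.43}$$

$$\delta s:\quad \lambda_1\frac{dA}{ds} - \lambda_2'' w'' = 0, \tag{2.3.44}$$

$$\delta\lambda_1:\quad \int_0^l A(x)\,dx - V_0 = 0, \tag{2.3.45}$$

$$\delta\lambda_2:\quad sw'''' + 2s'w''' + s''w'' - q(x) = 0, \tag{2.3.46}$$

together with the associated boundary conditions at $x = 0$ and $x = l$

Either                  Or
$\delta s = 0$,         $\lambda_2 w''' - \lambda_2' w'' = 0$,      (2.3.47)
$\delta s' = 0$,        $\lambda_2 w'' = 0$,                        (2.3.48)
$\delta w = 0$,         $\lambda_2''' s + \lambda_2'' s' = 0$,      (2.3.49)
$\delta w' = 0$,        $\lambda_2'' s = 0$,                        (2.3.50)
$\delta w'' = 0$,       $-\lambda_2 s' + \lambda_2' s = 0$,         (2.3.51)
$\delta w''' = 0$,      $\lambda_2 s = 0$.                          (2.3.52)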

Equations (2.3.43) through (2.3.46) together with the associated boundary conditions are general enough that they apply to simply supported as well as to clamped beams. For the cantilever beam the boundary conditions are Eqs. (2.3.36) and (2.3.37). Since the bending moment and the shear force at $x = 0$ cannot vanish because of the unidirectional nature of the applied loading, the above conditions reduce to

$$\lambda_2(0) = 0, \quad \lambda_2'(0) = 0, \tag{2.3.53}$$

$$\lambda_2''(l)\,s(l) = 0, \quad \lambda_2'''(l)\,s(l) + \lambda_2''(l)\,s'(l) = 0. \tag{2.3.54}$$

We can integrate Eqs. (2.3.43) and (2.3.46) twice and make use of both boundary conditions of Eqs. (2.3.37) and (2.3.54) to get

$$s\lambda_2'' = \frac{1}{2}(x - l)^2, \tag{2.3.55}$$

and

$$sw'' = p(x), \tag{2.3.56}$$

where $p(x)$ denotes the result of integrating the loading $q(x)$ twice subject to the free-end conditions of Eq. (2.3.37), from which

$$\lambda_2'' w'' = \frac{1}{2}\frac{(x - l)^2\,p(x)}{(s(x))^2}. \tag{2.3.57}$$

Combining the last equation with the second Euler-Lagrange equation (2.3.44), we obtain

$$s^2(x)\frac{dA}{ds} = \frac{(x - l)^2\,p(x)}{2\lambda_1}. \tag{2.3.58}$$

Specialization to Plane-Tapered Beams. The remainder of this problem will be specialized to plane-tapered beams, $n = 1$, under a uniformly distributed load of intensity $q(x) = q_0$. Evaluating the distribution of $p(x)$, we find that Eq. (2.3.56) becomes

$$sw'' = q_0\frac{(l - x)^2}{2}. \tag{2.3.59}$$

Also, for a plane-tapered beam

$$\frac{dA}{ds} = \frac{1}{\alpha E} = c^2 = \text{constant}. \tag{2.3.60}$$

Hence, Eq. (2.3.58) becomes

$$s(x) = \frac{(x - l)^2}{2c}\sqrt{\frac{q_0}{\lambda_1}}, \tag{2.3.61}$$

and, therefore, the optimum distribution of the cross-sectional area is

$$A^*(x) = \frac{c\,(x - l)^2}{2}\sqrt{\frac{q_0}{\lambda_1}}. \tag{2.3.62}$$

The unknown Lagrange multiplier can be evaluated from the volume constraint of Eq. (2.3.45) to be

$$\lambda_1 = \frac{c^2 q_0 l^6}{36 V_0^2}. \tag{2.3.63}$$

The resulting optimal area and bending stiffness distributions are

$$A^*(x) = \frac{3V_0(x - l)^2}{l^3}, \quad\text{and}\quad s^*(x) = \frac{3V_0(x - l)^2}{c^2 l^3}. \tag{2.3.64}$$

Substituting Eq. (2.3.64) into Eq. (2.3.59) and integrating it twice, we obtain the deflection function corresponding to the optimal beam as

$$w(x) = \frac{c^2 q_0 l^3}{12 V_0}x^2, \tag{2.3.65}$$

where the boundary conditions in Eq. (2.3.36) were used. The constant $c$ for a rectangular plane-tapered beam with constant thickness $h$ and varying width $b(x)$ is

$$c^2 = \frac{12}{Eh^2}. \tag{2.3.66}$$

The resulting deflection function is

$$w(x) = \frac{q_0 l^3}{E h^2 V_0}x^2. \tag{2.3.67}$$

For comparison, consider an equivalent uniform beam of the same total volume $V_0$, length $l$, constant thickness $h$, but a constant width

$$b_0 = \frac{V_0}{hl}. \tag{2.3.68}$$

It is easy to verify that its deflection $w_0(x)$ satisfies

$$\frac{\int_0^l w(x)\,dx}{\int_0^l w_0(x)\,dx} = \frac{5}{9}. \tag{2.3.69}$$

That is, the optimal beam is 1.8 times stiffer than the uniform beam of the same volume.
Several other cases of loading with different types of beams, $n = 1, 2,$ and 3, may be found in Ref. 7. Some of these cases form a part of the exercises at the end of the chapter. •••
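The ratio of 5/9 can be confirmed with a few lines of numerical integration. The sketch below is an illustration under stated assumptions: the numerical values are hypothetical, the optimal deflection is taken as the parabola obtained by inserting the rectangular-section constant into Eq. (2.3.65), and the uniform-beam deflection is the standard cantilever result for a uniformly distributed load, $w_0 = q_0 x^2(6l^2 - 4lx + x^2)/(24EI_0)$ with $I_0 = b_0 h^3/12$. The helper `simpson` is introduced here and is not from the source.

```python
def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * func(a + k * h)
    return total * h / 3.0


# Hypothetical data in consistent units.
q0, length, E, h, V0 = 1000.0, 2.0, 7.0e10, 0.02, 4.0e-4

# Uniform beam of the same volume: width b0 = V0 / (h * length), I0 = b0 * h**3 / 12.
I0 = V0 * h ** 2 / (12.0 * length)


def w_optimal(x):
    """Deflection of the optimal plane-tapered beam (parabolic in x)."""
    return q0 * length ** 3 * x ** 2 / (E * h ** 2 * V0)


def w_uniform(x):
    """Standard cantilever deflection under a uniformly distributed load q0."""
    return q0 * x ** 2 * (6.0 * length ** 2 - 4.0 * length * x + x ** 2) / (24.0 * E * I0)


ratio = simpson(w_optimal, 0.0, length) / simpson(w_uniform, 0.0, length)
print(ratio)
```

The printed ratio should agree with 5/9 to within the quadrature error, independently of the particular values chosen for the load, dimensions, and modulus.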

2.4 Local Constraints and the Minmax Approach

In many problems of structural optimization we have constraints that are local in nature, such as stress constraints. In a beam design problem, for example, we may require that the stresses do not exceed the yield limit anywhere in the beam. Such constraints can be expressed as subsidiary conditions similar to Eq. (2.3.33), except that the equalities are replaced by inequalities

$$g_i(x_1, \ldots, x_n) \ge 0, \quad i = 1, \ldots, m. \tag{2.4.1}$$

We can transform the inequalities back to equalities by subtracting slack functions, $t_i$'s, and rewrite Eq. (2.4.1) as

$$g_i(x_1, \ldots, x_n) - t_i^2(x_1, \ldots, x_n) = 0, \quad i = 1, \ldots, m. \tag{2.4.2}$$

The auxiliary functional of Eq. (2.3.34) becomes

$$\mathcal{L} = \int_V \left[F + \sum_{i=1}^{m} \lambda_i\left(g_i - t_i^2\right)\right]dv. \tag{2.4.3}$$

When we take the variation of $\mathcal{L}$, the variation of $t_i$ will contribute $-2\int_V \lambda_i t_i\,\delta t_i\,dv$. Setting the coefficient of $\delta t_i$ to zero we get

$$t_i\lambda_i = 0, \quad i = 1, \ldots, m. \tag{2.4.4}$$

This equation implies that the Lagrange multipliers are equal to zero when the slack variables are not zero. That is, the Lagrange multipliers are zero at points in the design space where the corresponding constraint is not critical. Equation (2.4.4) may also be written as

$$\lambda_i g_i = 0, \quad i = 1, \ldots, m, \tag{2.4.5}$$

because $t_i = 0$ if and only if $g_i = 0$. It can be shown that if we use Eq. (2.4.5), which is called a constraint qualification equation, we can dispense with the slack functions in the auxiliary functional. When we do that, we also dispense with the variation of the auxiliary functional with respect to the Lagrange multiplier, and instead add the inequality constraints to the optimality conditions. This treatment of inequality constraints is demonstrated in the following example.

Example 2.4.1

Figure 2.4.1 Hanging cable: (a) general cross section; and (b) two constant-area
segments.

The cable in Figure 2.4.1(a) is loaded by a hanging weight $W$ plus its own self-weight. The cross-sectional area $A(x)$ of the cable is to be designed for minimum volume, subject to the constraint that the stress does not exceed an allowable value of $\sigma_0$, and the cross-sectional area is not less than a minimum, $A_0$. We assume that the load $W$ is small enough to be supported by a minimum-area cable if the self-weight is neglected. That is

$$W \le A_0\sigma_0. \tag{2.4.6}$$

We also assume that the cable is long enough so that self-weight requires the top part of the cable to have a cross-sectional area larger than $A_0$.
The problem is statically determinate, with the axial load in the cable satisfying

$$P' + \rho A = 0, \quad P(l) = W, \tag{2.4.7}$$

where $\rho$ is the weight density. The problem can then be formulated as

$$\begin{aligned}
\text{minimize}\quad & \int_0^l A(x)\,dx \\
\text{such that}\quad & A(x)\sigma_0 - P(x) \ge 0, \\
& A - A_0 \ge 0, \\
\text{and}\quad & P' + \rho A = 0.
\end{aligned} \tag{2.4.8}$$

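The Lagrangian functional is given as

$$\mathcal{L}\left(A(x), P(x), \lambda_1(x), \lambda_2(x), \lambda_3(x)\right) = \int_0^l A\,dx + \int_0^l \lambda_1(A\sigma_0 - P)\,dx + \int_0^l \lambda_2(A - A_0)\,dx + \int_0^l \lambda_3(P' + \rho A)\,dx. \tag{2.4.9}$$

We take the variation of $\mathcal{L}$ to obtain

$$\delta\mathcal{L} = \int_0^l \delta A\,dx + \int_0^l \lambda_1(\sigma_0\,\delta A - \delta P)\,dx + \int_0^l \lambda_2\,\delta A\,dx + \int_0^l \lambda_3(\delta P' + \rho\,\delta A)\,dx = 0. \tag{2.4.10}$$

We integrate the term including $\delta P'$ by parts to convert it to $\delta P$, and then set the coefficients of $\delta A$ and $\delta P$ to zero to obtain

$$1 + \sigma_0\lambda_1 + \lambda_2 + \rho\lambda_3 = 0, \tag{2.4.11}$$

$$\lambda_1 + \lambda_3' = 0. \tag{2.4.12}$$

These equations are augmented with the two inequalities

$$A\sigma_0 - P \ge 0, \tag{2.4.13}$$

$$A - A_0 \ge 0, \tag{2.4.14}$$

the constraint qualification equations

$$\lambda_1(A\sigma_0 - P) = 0, \tag{2.4.15}$$

$$\lambda_2(A - A_0) = 0, \tag{2.4.16}$$

and with Eq. (2.4.7).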


To solve the above equations we note that near $x = 0$ we assumed that $A > A_0$, so that from Eq. (2.4.16) $\lambda_2 = 0$. We can substitute $\lambda_1$ from Eq. (2.4.12) into Eq. (2.4.11) to get

$$1 - \sigma_0\lambda_3' + \rho\lambda_3 = 0. \tag{2.4.17}$$

Since $P(0)$ is not prescribed, the boundary term of the integration by parts requires $\lambda_3(0) = 0$, and the equation is easily solved to yield

$$\lambda_3 = \left(e^{\rho x/\sigma_0} - 1\right)/\rho, \tag{2.4.18}$$

and then from Eq. (2.4.12)

$$\lambda_1 = -e^{\rho x/\sigma_0}/\sigma_0. \tag{2.4.19}$$

These two equations are valid as long as $A > A_0$. From Eq. (2.4.19) we see that $\lambda_1$ is nonzero, so that from Eq. (2.4.15)

$$A(x) = P(x)/\sigma_0 \quad\text{when } A > A_0. \tag{2.4.20}$$

We can now construct the entire solution for $A(x)$. At the bottom of the cable $A = A_0$, and from Eq. (2.4.7)

$$P = W + \rho(l - x)A_0. \tag{2.4.21}$$

This solution becomes invalid when $P$ exceeds $A_0\sigma_0$, which from Eq. (2.4.21) happens at $x = x_t$,

$$x_t = l - \frac{A_0\sigma_0 - W}{\rho A_0}. \tag{2.4.22}$$

For $x < x_t$ we have $A > A_0$, so that $P = A\sigma_0$, and Eq. (2.4.7) can be replaced by

$$A'\sigma_0 + \rho A = 0. \tag{2.4.23}$$

This equation is easily solved to yield

$$A(x) = A_0\,e^{\rho(x_t - x)/\sigma_0}, \quad x < x_t. \tag{2.4.24}$$

•••
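The piecewise solution can be checked by integrating the self-weight numerically and confirming that the stress constraint is active above the transition point and inactive below it. This is a minimal sketch with hypothetical numbers; the function names and the midpoint quadrature are choices made here, and $x$ is measured downward from the support as in Eq. (2.4.7).

```python
import math


def transition_point(W, A0, sigma0, rho, length):
    """x_t from Eq. (2.4.22); zero if the whole cable can use the minimum area."""
    if W > A0 * sigma0:
        raise ValueError("Eq. (2.4.6) requires W <= A0 * sigma0")
    return max(length - (A0 * sigma0 - W) / (rho * A0), 0.0)


def cable_area(x, W, A0, sigma0, rho, length):
    """Optimal area: exponential taper above x_t, minimum area below it."""
    x_t = transition_point(W, A0, sigma0, rho, length)
    if x >= x_t:
        return A0
    return A0 * math.exp(rho * (x_t - x) / sigma0)


def axial_load(x, W, A0, sigma0, rho, length, n=4000):
    """P(x) = W + rho * (integral of A from x to the free end), midpoint rule."""
    h = (length - x) / n
    hanging = sum(cable_area(x + (k + 0.5) * h, W, A0, sigma0, rho, length)
                  for k in range(n)) * h
    return W + rho * hanging


# Hypothetical data: transition at mid-length.
W, A0, sigma0, rho, length = 50.0, 1.0, 100.0, 1.0, 100.0
x_t = transition_point(W, A0, sigma0, rho, length)

stresses = {}
for x in (0.0, 25.0, 50.0, 75.0, 100.0):
    args = (x, W, A0, sigma0, rho, length)
    stresses[x] = axial_load(*args) / cable_area(*args)
print(x_t, stresses)
```

In the tapered part the computed stress should equal the allowable value to within the quadrature error, while in the minimum-area part it falls from the allowable value at $x_t$ to $W/A_0$ at the free end.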
Another formulation of the problem in Example 2.4.1 is to find a cable with a given volume that has the lowest possible stress. The objective function is

$$\min_{A(x)}\ \max_{0 \le x \le l}\ \sigma(x). \tag{2.4.25}$$

This is an example of the so-called min-max problems that are common in structural optimization. Min-max problems present a difficulty in that the maximum of a function does not have continuous derivatives. This can be seen by considering even the simplest case of the maximum of a function at two points. Consider, for example, the case when the cross-sectional area of the cable has to be piecewise constant to keep down manufacturing cost. Figure 2.4.1(b) shows a case where the number of segments is limited to two, and the design variables are the two cross-sectional areas $A_1$ and $A_2$.

Figure 2.4.2 Discontinuity of maximum function.

Figure 2.4.2 shows a possible variation of the maximum stresses at each segment as a function of $A_2$. Increasing $A_2$ reduces the maximum stress in the bottom segment, but increases it in the top segment. It is seen that the maximum of the stress over the cable has a discontinuity in slope at the point where the location of the maximum jumps from one segment to the other.
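The kink can be reproduced with a small two-segment model. The sketch below is an illustration only, under assumptions not stated in the text: both segments have equal length, $A_1$ is the top segment and $A_2$ the bottom one, and the total volume is held fixed so that $A_1$ follows from $A_2$. The peak stress of each segment occurs at its upper end. Because one peak stress decreases with $A_2$ and the other increases, the minimum of their maximum lies at the crossing point, which is located here by bisection.

```python
def segment_stresses(A2, W, rho, length, V0):
    """Peak stresses (top, bottom) for two equal-length segments at fixed volume V0."""
    half = 0.5 * length
    A1 = V0 / half - A2                          # top area from the volume constraint
    sigma_bottom = (W + rho * A2 * half) / A2    # at the upper end of the bottom segment
    sigma_top = (W + rho * V0) / A1              # at the support: W plus all self-weight
    return sigma_top, sigma_bottom


def minimize_max_stress(W, rho, length, V0, iterations=200):
    """Bisection on sigma_top - sigma_bottom, which increases monotonically with A2."""
    total = V0 / (0.5 * length)
    lo, hi = 1e-9 * total, (1.0 - 1e-9) * total
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        top, bottom = segment_stresses(mid, W, rho, length, V0)
        if top > bottom:
            hi = mid
        else:
            lo = mid
    A2 = 0.5 * (lo + hi)
    return A2, max(segment_stresses(A2, W, rho, length, V0))


# Hypothetical data.
W, rho, length, V0 = 50.0, 1.0, 100.0, 150.0
A2_opt, beta = minimize_max_stress(W, rho, length, V0)

# Brute-force scan of max(sigma_top, sigma_bottom) for comparison.
grid = [0.01 * k for k in range(1, 300)]
scan_min = min(max(segment_stresses(a, W, rho, length, V0)) for a in grid)
print(A2_opt, beta, scan_min)
```

For these numbers the two peak stresses cross where both equal 100, and the one-sided slopes of the maximum with respect to $A_2$ have opposite signs there, which is the slope discontinuity described above.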
For the cable of Example 2.4.1, the two formulations of minimizing volume or
minimizing the maximum stress for a given volume Vo are equivalent. That is, we
can guess a stress allowable and minimize the volume for this stress allowable. If the
resulting optimal volume is larger than Vo we increase the stress allowable and repeat
the optimization. Similarly, if the optimal volume is smaller than Vo we reduce the
stress allowable and reoptimize. However, for many problems it is not possible to find
an equivalent formulation that does not involve the minimum of a maximum. For
example, when we optimize the shape of a hole so as to reduce stress concentration,
we often do not have any constraint on volume. Therefore, we cannot transform the
problem to one of minimizing volume with a constraint on stress. Taylor and Bendsøe [8] suggested an elegant solution to the problem of discontinuous derivatives. They suggest replacing the objective function of Eq. (2.4.25) with another objective plus a constraint equation

$$\min_{A(x),\,\beta}\ \beta \qquad\text{such that}\quad \sigma(x) \le \beta. \tag{2.4.26}$$

The additional design variable $\beta$ is the unknown stress limit that we want to keep as low as possible. Now the objective function is equal to one of the design variables, so that it is perfectly smooth. This tactic for converting a min-max problem to a smooth problem by using an additional variable is very useful in many applications. We demonstrate it for the cable problem of Example 2.4.1.

Example 2.4.2

Formulate the problem of designing the cable of Figure 2.4.1(a) for minimum maximum stress, subject to a limit $V_0$ on the available volume, and a lower limit $A_0$ on the cross-sectional area.

The problem is first formulated as

$$\begin{aligned}
\text{minimize}\quad & \max_{0 \le x \le l}\ \sigma(x) \\
\text{such that}\quad & A - A_0 \ge 0, \\
& \int_0^l A(x)\,dx = V_0, \\
\text{and}\quad & P' + \rho A = 0.
\end{aligned} \tag{2.4.27}$$

Note that we have formulated the volume constraint as an equality rather than an inequality constraint, because common sense tells us that the volume will be fully utilized in order to minimize the stress. Next we replace the min-max formulation with the Taylor-Bendsøe 'beta' formulation

$$\begin{aligned}
\text{minimize}\quad & \beta \\
\text{such that}\quad & A(x)\beta - P(x) \ge 0, \\
& A - A_0 \ge 0, \\
& \int_0^l A(x)\,dx = V_0, \\
\text{and}\quad & P' + \rho A = 0.
\end{aligned} \tag{2.4.28}$$

The solution of this problem is left as an exercise to the reader (Exercise 4). •••

2.5 Necessary and Sufficient Conditions for Optimality

In the absence of inequality constraints, the Euler-Lagrange equations provide
the necessary conditions for optimality (Optimality Conditions). As opposed to the
use of Differential Calculus it is, however, not an easy task to verify whether such
necessary conditions are also sufficient conditions for optimality.
Sufficiency of optimality conditions can sometimes be established on the basis
of the variational principles of continuum mechanics. For a discretized model, using
the techniques of mathematical programming for optimization (Chapter 5), we can
establish sufficiency of the optimality conditions on the basis of convexity of the
objective function and the constraints. In general, however, establishing convexity of
the objective function and the constraints is again not an easy task. A vast majority
of the optimization problems are non-convex.
Thus, establishing the sufficiency of the optimality conditions or identifying local
and global optima is a question that cannot be answered fully for most situations.
Often we have to rely either upon engineering intuition or upon trial and error tech-
niques for answering such questions.
Using the variational principles of mechanics, we illustrate how the sufficiency of
the optimality conditions can be established for a select class of classical optimization
problems.

2.5.1 Elastic Structures of Maximum Stiffness

The development in this section is based on the work of Prager and his collaborators
(see Refs. 9,10).
Consider an elastic structure being acted upon by a load $2P$ at a point $X$. Assume the load is such that it produces a unit displacement in its direction. Then by the principle of conservation of energy [11]

External work = Internal energy stored,

or

$$\tfrac{1}{2}(2P \times 1) = P = \int_V s(X)\,e[Q(X)]\,dv, \tag{2.5.1}$$

where $e[Q(X)]$ is the specific elastic strain energy or the strain energy in a structure of unit stiffness due to a strain field $Q(X)$ produced by the prescribed unit displacement at $X$, and $s(X)$ is the specific stiffness of the structure at $X$. That is, $s(X)$ is the stiffness per unit length for one-dimensional structures and stiffness per unit of area for two-dimensional structures.
Thus $s(X)$ specifies the design of the structure while the function $e[Q(X)]$ is independent of design parameters. For instance $s(X)$ and $e[Q(X)]$ for a one-dimensional beam element would be $EI(x)$ and $\tfrac{1}{2}(\text{curvature})^2$, respectively.
We wish to design a structure of a given total stiffness so as to maximize the magnitude $P$ of the load producing the prescribed unit displacement at $X$. From Eq. (2.5.1) it is clear that maximizing $P$ subject to the integral constraint on specific stiffness

$$\int_V s(X)\,dv = s_0, \tag{2.5.2}$$

can be performed by seeking a stationary point of the auxiliary functional

$$\mathcal{L} = \int_V s(X)\,e[Q(X)]\,dv - \lambda\left[\int_V s(X)\,dv - s_0\right]. \tag{2.5.3}$$

The necessary condition for $\mathcal{L}$ to be stationary is given by

$$\delta\mathcal{L} = \int_V \left(e[Q(X)]\,\delta s + s(X)\,\delta e[Q(X)]\right)dv - \lambda\int_V \delta s\,dv = 0. \tag{2.5.4}$$

Since the structure is required to satisfy the equations of equilibrium for every structural design, then by the principle of minimum strain energy (which is a special case of the principle of minimum potential energy for prescribed displacements) the second term within the first integral vanishes, yielding

$$\int_V \left(e[Q(X)] - \lambda\right)\delta s\,dv = 0. \tag{2.5.5}$$

Thus for arbitrary variation $\delta s$ we have

$$e[Q(X)] = \lambda = \text{constant}. \tag{2.5.6}$$

Equation (2.5.6) is the necessary condition for optimality. That is, the stiffness of an elastic structure is stationary for a given structural design if the specific elastic strain energy is constant throughout the structure. We wish to examine if it is also sufficient. To answer this question we assume two distinct designs $s$ and $\bar{s}$ with associated specific strain energies $e[Q(X)]$ and $e[\bar{Q}(X)]$, both satisfying the constant total stiffness constraint

$$\int_V s(X)\,dv = \int_V \bar{s}(X)\,dv = s_0. \tag{2.5.7}$$

The loads $P$ and $\bar{P}$ that correspond to $s$ and $\bar{s}$ are

$$P = \int_V s(X)\,e[Q(X)]\,dv, \quad\text{and}\quad \bar{P} = \int_V \bar{s}(X)\,e[\bar{Q}(X)]\,dv, \tag{2.5.8}$$

respectively. Subtracting $\bar{P}$ from $P$, we have

$$P - \bar{P} = \int_V s(X)\,e[Q(X)]\,dv - \int_V \bar{s}(X)\,e[\bar{Q}(X)]\,dv. \tag{2.5.9}$$

Since $Q(X)$ is also a kinematically admissible strain field for the design $\bar{s}$, if we replace $\bar{Q}(X)$ in the definition of $\bar{P}$ with $Q(X)$ we are guaranteed by the principle of minimum potential energy that

$$\int_V \bar{s}(X)\,e[\bar{Q}(X)]\,dv \le \int_V \bar{s}(X)\,e[Q(X)]\,dv. \tag{2.5.10}$$

Thus

$$P - \bar{P} \ge \int_V s(X)\,e[Q(X)]\,dv - \int_V \bar{s}(X)\,e[Q(X)]\,dv. \tag{2.5.11}$$

If the design $s$ satisfies the optimality condition, Eq. (2.5.6), then

$$P - \bar{P} \ge \lambda\int_V \left[s(X) - \bar{s}(X)\right]dv. \tag{2.5.12}$$

Finally, in view of Eq. (2.5.7) we have

$$P - \bar{P} \ge 0, \quad\text{or}\quad P \ge \bar{P}. \tag{2.5.13}$$

This implies that condition (2.5.6) is not only a necessary but also a sufficient condition for optimality.
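A discrete analogue can make the sufficiency argument concrete. The sketch below is an illustration introduced here, not part of the original development: it assumes a chain of springs in series whose stiffnesses $s_i$ play the role of the specific stiffness, with their sum fixed at $s_0$ and the end displacement prescribed as unity. The load $2P$ is then the common spring force, $e_i$ is half the squared elongation of spring $i$, and $P = \sum s_i e_i$ as in Eq. (2.5.1). The design with equal specific energies is compared against randomly generated designs of the same total stiffness.

```python
import random


def load_for_unit_displacement(stiffnesses):
    """P such that the load 2P stretches the series chain by one unit."""
    tension = 1.0 / sum(1.0 / s for s in stiffnesses)   # force carried by every spring
    return 0.5 * tension


def specific_energies(stiffnesses):
    """e_i = (elongation_i)**2 / 2, the energy of a unit-stiffness spring."""
    tension = 1.0 / sum(1.0 / s for s in stiffnesses)
    return [0.5 * (tension / s) ** 2 for s in stiffnesses]


def random_design(n, total, rng):
    """Positive stiffnesses that sum to the prescribed total."""
    weights = [rng.uniform(0.1, 1.0) for _ in range(n)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


n, s0 = 5, 10.0
uniform = [s0 / n] * n                      # equal stiffness gives equal e_i
P_uniform = load_for_unit_displacement(uniform)
e_uniform = specific_energies(uniform)

rng = random.Random(0)
P_random = [load_for_unit_displacement(random_design(n, s0, rng)) for _ in range(1000)]
print(P_uniform, max(P_random))
```

None of the random designs should carry a larger load than the uniform-energy design, whose load is $s_0/(2n^2)$ for this chain.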


2.5.2 Optimal Design of Euler-Bernoulli Columns

We consider the problem of maximizing the buckling load of an Euler-Bernoulli column of a given volume or weight with cross-sections obeying $I(x) = \alpha[A(x)]^n$, $n = 1, 2,$ or 3. It is well known that the buckling load of a structure is given by the minimum value of the Rayleigh quotient over all kinematically admissible displacement fields [11]. For an optimum column we want to maximize this minimum value by varying the distribution of material along the length of the column. Hence the present problem can be posed as one of maximizing the minimum value of the Rayleigh quotient for the buckling load

$$p = \max_{I(x)}\ \min_{w(x)}\ \frac{\int_0^l EI(x)\,w''^2\,dx}{\int_0^l w'^2\,dx} = \max_{A(x)}\ \min_{w(x)}\ \frac{\int_0^l E\alpha[A(x)]^n w''^2\,dx}{\int_0^l w'^2\,dx}, \tag{2.5.14}$$

subject to the constant volume constraint

$$\int_0^l A(x)\,dx = V_0. \tag{2.5.15}$$

Using the Lagrange multiplier technique, we have

$$\mathcal{L} = \max_{A(x)}\ \min_{w(x)}\ \frac{\int_0^l E\alpha[A(x)]^n w''^2\,dx}{\int_0^l w'^2\,dx} - \lambda\left[\int_0^l A(x)\,dx - V_0\right]. \tag{2.5.16}$$

The necessary conditions for stability and optimality can be determined by requiring the first variation of the Lagrangian to vanish, that is

$$\begin{aligned}
\delta\mathcal{L} ={}& \frac{2\int_0^l E\alpha[A(x)]^n w''\,\delta w''\,dx}{\int_0^l w'^2\,dx} - \frac{2\int_0^l E\alpha[A(x)]^n w''^2\,dx}{\left[\int_0^l w'^2\,dx\right]^2}\left[\int_0^l w'\,\delta w'\,dx\right] \\
& + \frac{\int_0^l nE\alpha[A(x)]^{n-1} w''^2\,\delta A\,dx}{\int_0^l w'^2\,dx} - \lambda\left[\int_0^l \delta A(x)\,dx\right] = 0.
\end{aligned} \tag{2.5.17}$$

The terms involving the variations of derivatives of $w$ need to be integrated by parts. After a rearrangement of terms, the coefficients of $\delta w$ and $\delta w'$ yield the stability equation and the associated boundary conditions for every design $A(x)$ while the coefficient of $\delta A$ yields the optimality condition.

Stability Equation:
$$[E\alpha A^n(x)\,w'']'' + p\,w'' = 0. \tag{2.5.18}$$

Boundary Conditions:
$$\delta w = 0, \quad\text{or}\quad [E\alpha A^n(x)\,w'']' + p\,w' = 0, \tag{2.5.19}$$
$$\delta w' = 0, \quad\text{or}\quad E\alpha A^n(x)\,w'' = 0. \tag{2.5.20}$$

Optimality Condition:
$$nE\alpha A^{n-1} w''^2 - \lambda\int_0^l w'^2\,dx = 0. \tag{2.5.21}$$

Since the second term in equation (2.5.21) is a constant, the equation can be simplified to

$$A^{n-1} w''^2 = c^2, \tag{2.5.22}$$

with $c$ a constant, and this can be verified to be a statement of constant strain energy density in the buckled mode shape of the optimum column.
The sufficiency of the optimality condition can be very easily established for the case $n = 1$. For this case Eq. (2.5.22) reduces to

$$w''^2 = c^2. \tag{2.5.23}$$

We begin by assuming two distinct designs $A(x)$ and $\bar{A}(x)$, both of which satisfy the constant volume constraint (2.5.15), to yield

$$\int_0^l (A - \bar{A})\,dx = 0. \tag{2.5.24}$$

The corresponding buckling loads $P_{cr}$ and $\bar{P}_{cr}$ with associated buckling modes $w$ and $\bar{w}$ are given by

$$P_{cr} = \frac{\int_0^l E\alpha A\,w''^2\,dx}{\int_0^l w'^2\,dx}, \qquad \bar{P}_{cr} = \frac{\int_0^l E\alpha\bar{A}\,\bar{w}''^2\,dx}{\int_0^l \bar{w}'^2\,dx}. \tag{2.5.25}$$

Since the buckling mode $w$ is also kinematically admissible for design $\bar{A}(x)$, by the Rayleigh quotient Eq. (2.5.14), the magnitude of the quantity $\bar{p}$ defined by

$$\bar{p} = \frac{\int_0^l E\alpha\bar{A}\,w''^2\,dx}{\int_0^l w'^2\,dx} \tag{2.5.26}$$

has the property that

$$\bar{p} \ge \bar{P}_{cr}. \tag{2.5.27}$$

Subtracting $P_{cr}$ from both sides of Eq. (2.5.27) and rearranging we have

$$P_{cr} - \bar{P}_{cr} \ge P_{cr} - \bar{p}. \tag{2.5.28}$$

Thus,

$$P_{cr} - \bar{P}_{cr} \ge \frac{\int_0^l E\alpha\,w''^2 (A - \bar{A})\,dx}{\int_0^l w'^2\,dx}. \tag{2.5.29}$$

If the design $A(x)$ satisfies the optimality condition (2.5.23), then by virtue of Eq. (2.5.24)

$$P_{cr} - \bar{P}_{cr} \ge 0, \tag{2.5.30}$$

meaning that of all the designs with different cross-sectional shapes the one that satisfies the optimality condition has the largest value of the critical load, thereby establishing the sufficiency of the optimality condition.
Prager and Taylor [9] provide a similar sufficiency proof for the dual problem, namely the case of minimizing the volume or weight of an Euler-Bernoulli column for a given buckling load.
Although it is difficult to prove the sufficiency of the optimality condition for values of $n$ other than 1, explicit solutions for the optimum designs for all classical boundary conditions are well known and are available from Refs. 12-16. Approximate numerical solutions using the finite element displacement models have also been reported by Refs. 17-20 for elastically supported columns with a very general distributed axial loading and for portal frames.
Earlier works, especially those of Tadjbaksh and Keller [13], assumed unimodal behavior and did not allow for discontinuity in the slope and the shear force at places where the area of cross-section vanished. Olhoff and Rasmussen [21] have shown that the design of Tadjbaksh and Keller [13] for the clamped column is non-optimal and have outlined more accurate bimodal numerical solutions with a constraint on the minimum cross-sectional area. Olhoff and Rasmussen identify a threshold value for the minimum area constraint below which the optimum clamped columns exhibit a bimodal behavior. Papers by Masur [22,23], Olhoff [24], and by Plaut, Johnson, and Olhoff [25] outline less approximate and properly formulated multi-modal solutions for the elastically supported columns.

Example 2.5.1

By way of illustration we outline the solution for one of the classical cases here while relegating others to the exercises. Consider maximizing the critical load of a simply-supported column of length $l$ subject to the constant volume constraint, Eq. (2.5.15). An explicit solution to this problem was first outlined in [19]. We begin by listing the governing equations and boundary conditions of the problem.

Stability Equation:
$$[E\alpha A^3 w'']'' + p\,w'' = 0. \tag{2.5.31}$$

Boundary Conditions:
$$w(0) = w(l) = 0, \tag{2.5.32}$$
$$A^3(0)\,w''(0) = A^3(l)\,w''(l) = 0. \tag{2.5.33}$$

Optimality Condition:
$$A^2 w''^2 = c^2, \quad\text{or}\quad w'' = \pm c/A. \tag{2.5.34}$$

A consequence of the boundary condition (2.5.33) and the optimality condition (2.5.34) is

$$A(0) = A(l) = 0. \tag{2.5.35}$$

Substituting the optimality condition into the stability equation we obtain

$$(A^2)'' + \frac{\beta^2}{A} = 0, \tag{2.5.36}$$

where

$$\beta^2 = \frac{p}{E\alpha}. \tag{2.5.37}$$

The differential equation (2.5.36) and the associated boundary conditions can be solved by using a change of variables. Letting

$$A = u^{1/2}, \tag{2.5.38}$$

we can integrate the differential equation once to obtain

$$(u')^2 = c_1 - 4\beta^2 u^{1/2}, \tag{2.5.39}$$

$c_1$ being a constant of integration. The above equation can be integrated once more giving

$$|x - c_2| = -\int \frac{du}{\left(c_1 - 4\beta^2 u^{1/2}\right)^{1/2}}. \tag{2.5.40}$$

Using another change of variables with $c_1 - 4\beta^2 u^{1/2} = t$ we can integrate the right-hand side of this equation once more to give

$$|x - c_2| = \frac{\left(c_1 - 4\beta^2 u^{1/2}\right)^{1/2}\left(c_1 + 2\beta^2 u^{1/2}\right)}{6\beta^4}. \tag{2.5.41}$$

The two constants of integration, namely $c_1$ and $c_2$, can be determined by using the boundary condition given in Eq. (2.5.35), which yields

$$|c_2| = |l - c_2| = \frac{c_1^{3/2}}{6\beta^4}, \tag{2.5.42}$$

consequently

$$c_2 = \frac{l}{2}, \qquad c_1 = \left(3\beta^4 l\right)^{2/3}. \tag{2.5.43}$$

The optimal value of the cross-sectional area at any point along the length of the column can, therefore, be determined from Eq. (2.5.41).
To determine the critical load parameter $\beta$ we use the volume constraint

$$\int_0^l A(x)\,dx = 2\int_0^{l/2} A(x)\,dx = 2\int_0^{l/2} u^{1/2}\,dx = 2\int_0^{u(l/2)} u^{1/2}\,\frac{dx}{du}\,du = V_0, \tag{2.5.44}$$

or from Eq. (2.5.39)

$$\int_0^l A(x)\,dx = V_0 = 2\int_0^{u(l/2)} \frac{u^{1/2}}{\left(c_1 - 4\beta^2 u^{1/2}\right)^{1/2}}\,du. \tag{2.5.45}$$

The right-hand side of this equation can be integrated to obtain

$$V_0 = -\left[\frac{\left(c_1 - 4\beta^2 u^{1/2}\right)^{1/2}}{8\beta^6}\left(c_1^2 - \frac{2}{3}c_1\left(c_1 - 4\beta^2 u^{1/2}\right) + \frac{1}{5}\left(c_1 - 4\beta^2 u^{1/2}\right)^2\right)\right]_0^{u(l/2)}. \tag{2.5.46}$$
55
Chapter 2: Classical Tools in Structuml Optimization

Recalling the definition of u, we can find the value of u(I/2) from Eqs. (2.5.41) and
(2.5.43) as
(2.5.47)

Substituting Eq. (2.5.47) and the value of the constant CI from Eq. (2.5.43) into Eq.
(2.5.46) we determine the optimum value of the load parameter and the critical load
to be
2 (15Vo)3 125 EaV03
f30p t = 2431 5 ' and (Per )opt = 9 - [ 5 - . (2.5.48)

Al
Vo
1.5

1.0

0.5

0.0
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.6 0.7 0.8 0.9 1.0 X 11

0.0 0.05 0.10 0.20 0.30 0.40 0.50


xll 1.0 0.95 0.90 0.80 0.70 0.60

Al 0.0 0.58540 1.02345 1.22751


0.78730 1.15651 1.25000
Vo

Figure 2.5.1 Area distribution for the column.

In comparison to a constant area beam with Ao = Vo/l and 10 = al~3 /1 3

125 EIo 7r 2 EIo


(Pcr)opt = 9 r = 1.41-12- = 1.41POcr· (2.5.49)

That is, the optimum depth-tapered column is 41% stronger than the corresponding
uniform column of the same volume. With $C_1$, $C_2$, and $\beta$ known, $A(x)$ is completely
known from Eq. (2.5.41). Figure 2.5.1 shows this area distribution along the length
of the column. Notice the undesirable feature of zero cross-sectional area at the
two ends of the column. This is a consequence of not having specified a lower bound
constraint on the area distribution. •••
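
The values tabulated with Figure 2.5.1 can be checked numerically. The sketch below is not taken from the text: it assumes a parametric form obtained by integrating $u'^2 = C_1 - 4\beta^2 u^{1/2}$ with $u(0) = 0$ and $u'(l/2) = 0$, using the parameter $s = u'/\sqrt{C_1}$, which gives $x/l = 1/2 - 3s/4 + s^3/4$ and $Al/V_0 = (5/4)(1 - s^2)$. The function names `s_from_x` and `area_ratio` are hypothetical.

```python
import math

def s_from_x(xi, tol=1e-12):
    """Solve xi = 1/2 - 3s/4 + s^3/4 for s in [0, 1] by bisection (0 <= xi <= 1/2)."""
    lo, hi = 0.0, 1.0  # xi(s) decreases monotonically from 1/2 at s = 0 to 0 at s = 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 0.5 - 0.75 * mid + 0.25 * mid**3 > xi:
            lo = mid   # computed position still too large: the root lies at larger s
        else:
            hi = mid
    return 0.5 * (lo + hi)

def area_ratio(xi):
    """Assumed closed form for A*l/V0 at x/l = xi (symmetric about midspan)."""
    xi = min(xi, 1.0 - xi)
    s = s_from_x(xi)
    return 1.25 * (1.0 - s * s)

for xi in (0.0, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
    print(f"x/l = {xi:4.2f}   A l / V0 = {area_ratio(xi):.5f}")

# Ratio of the optimum critical load to that of the uniform column, Eq. (2.5.49)
gain = (125.0 / 9.0) / math.pi**2
print(f"(P_cr)_opt / P_0cr = {gain:.3f}")
```

Under these assumptions the midspan value is exactly 5/4, and the other computed values can be compared directly with the tabulated ones.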
Optimum Design of Thin Plates for Stability. For a column, the axial stress-
resultant in the prebuckling state is independent of changes in cross-sectional areas
along the length of the column. However, this is not true for thin plates. The in-
plane stress-resultants in the prebuckled state of a thin plate are indeed functions
of the thickness distribution. The problem of optimizing thin plates for stability is,
therefore, significantly more complicated than that for a column.
The situation is not as bad for thin circular plates, for which the resulting govern-
ing equations (stability and optimality) are ordinary nonlinear differential equations
which can be solved approximately by some numerical schemes like those proposed
by Frauenthal [26].
The problem is more complicated for thin rectangular plates which are gov-
erned by nonlinear partial differential equations. For instance, questions about the
uniqueness of solutions are not as easily answered. Under the assumption of in-
extensional prebuckling deformations, which lead to thickness-independent in-plane
stress-resultants in the pre-buckled state, a condition of uniform strain energy density
has been established as being the optimality condition for such plates [27]. Even so,
optimization of plates on the basis of such assumptions has led to unsatisfactory
solutions for plates with aspect ratios close to unity.
Armand and Lodier [28] have attempted to explain this difficulty in optimizing
plates by linking it to the existence of infinitely many local extrema rather than
a single global optimum. According to this explanation, the solution obtained by
Frauenthal [26] is only a local optimum in the class of continuous thickness distri-
butions. Simitses [29] has shown that for the same volume, stiffened circular plates
yield much higher buckling loads than Frauenthal's optimum plate. Similarly, Kamat
[27], who optimized finite element models of rectangular plates, observed discontin-
uous thickness distributions that exhibit a tendency toward formation of ribs, and
suspected that stiffened plates would be superior. Haftka and Prasad [30], in their
survey paper on optimum structural design with plate bending elements, explain the
radically different designs obtained for the same problem by different researchers by
offering the conjecture that rib-stiffened plates are better than optimum plates with
continuous thickness distributions. Olhoff [31] provides a mathematical justification
for this behavior and for the questions of singularities and local optima in plates. The
reader is referred to the monograph by Gajewski and Zyczkowski [32] for additional
references on this topic.

2.5.3 Optimum Vibrating Euler-Bernoulli Beams

The fundamental frequency of free vibration of a beam is given by the minimum
value of the Rayleigh quotient

$$\omega^2 = \max_{A(x)}\,\min_{w(x)}\,\frac{\int_0^l Ea[A(x)]^n\,w''^2\,dx}{\int_0^l \rho A(x)\,w^2\,dx}, \tag{2.5.50}$$

over all kinematically admissible displacement fields [11]. However, even though both
stability and vibration of Euler-Bernoulli beams are governed by a similar eigenvalue
system, the criteria for optimization of freely vibrating Euler-Bernoulli beams are
different from those for Euler-Bernoulli columns. Unlike the case of columns, the de-
nominator of the Rayleigh quotient for a freely vibrating beam involves the structural
mass, which is a function of the cross-sectional area.
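
As a small numerical illustration of the Rayleigh quotient, the sketch below evaluates it for a uniform simply supported beam with unit properties, which is an assumed test case and not one from the text. The exact mode $\sin(\pi x/l)$ returns $\omega^2 = \pi^4$, while a polynomial trial shape that is only kinematically admissible returns a slightly larger value, as the minimum principle requires. The helper `rayleigh_quotient` and its midpoint-rule integration are illustrative choices.

```python
import math

def rayleigh_quotient(area, w, w_xx, E=1.0, a=1.0, rho=1.0, n=1, length=1.0, m=2000):
    """Midpoint-rule estimate of int(E a A^n w''^2) / int(rho A w^2) on [0, length]."""
    h = length / m
    num = 0.0
    den = 0.0
    for i in range(m):
        x = (i + 0.5) * h
        num += E * a * area(x)**n * w_xx(x)**2 * h
        den += rho * area(x) * w(x)**2 * h
    return num / den

def uniform_area(x):
    return 1.0

# Exact fundamental mode of the uniform simply supported beam (l = 1)
def mode(x):
    return math.sin(math.pi * x)

def mode_xx(x):
    return -math.pi**2 * math.sin(math.pi * x)

# Polynomial trial shape with w = 0 and w'' = 0 at both ends
def trial(x):
    return x - 2.0 * x**3 + x**4

def trial_xx(x):
    return -12.0 * x + 12.0 * x**2

omega2_exact = rayleigh_quotient(uniform_area, mode, mode_xx)
omega2_trial = rayleigh_quotient(uniform_area, trial, trial_xx)
print(omega2_exact, omega2_trial)
```

Replacing `uniform_area` with a varying distribution changes both integrals, which is the reason the optimality criterion for vibration differs from that for buckling.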
Consider the problem of the maximization of the fundamental frequency of a
freely vibrating Euler-Bernoulli beam of a specified volume $V_0$ and specific mass $\rho$.
We again assume that $I(x) = a[A(x)]^n$, n = 1, 2, or 3.
The equation of motion of the beam and the necessary optimality condition are
then obtained by maximizing the minimum value of the Rayleigh quotient, $\omega^2$, subject
to the constant volume constraint. In other words, starting with the Lagrangian

$$\mathcal{L} = \max_{A(x)}\,\min_{w(x)}\,\frac{\int_0^l Ea[A(x)]^n\,w''^2\,dx}{\int_0^l \rho A(x)\,w^2\,dx} - \lambda\left[\int_0^l A(x)\,dx - V_0\right], \tag{2.5.51}$$

which is a functional of the functions w(x) and A(x), and setting its total variation
with respect to both functions to zero we get

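$$\begin{aligned}
\delta\mathcal{L} = {} & \frac{2\int_0^l Ea[A(x)]^n\,w''\,\delta w''\,dx}{\int_0^l \rho A(x)\,w^2\,dx} - \frac{2\int_0^l Ea[A(x)]^n\,w''^2\,dx}{\left[\int_0^l \rho A(x)\,w^2\,dx\right]^2}\left[\int_0^l \rho A(x)\,w\,\delta w\,dx\right] \\
& + \frac{\int_0^l nEa[A(x)]^{n-1}\,w''^2\,\delta A\,dx}{\int_0^l \rho A(x)\,w^2\,dx} - \frac{\int_0^l Ea[A(x)]^n\,w''^2\,dx}{\left[\int_0^l \rho A(x)\,w^2\,dx\right]^2}\left[\int_0^l \rho\,w^2\,\delta A(x)\,dx\right] \\
& - \lambda\left[\int_0^l \delta A(x)\,dx\right] = 0.
\end{aligned} \tag{2.5.52}$$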

Integrating by parts the first term on the right-hand side of the above equation and
collecting the coefficients of $\delta w$ and $\delta A$, we obtain the

Equation of Motion: $[EaA^n w'']'' - \omega^2 \rho A w = 0$. (2.5.53)

Boundary Conditions: $\delta w = 0$, or $[EaA^n w'']' = 0$, (2.5.54)
$\delta w' = 0$, or $[EaA^n w''] = 0$. (2.5.55)

Optimality Condition: $nEaA^{n-1} w''^2 - \omega^2 \rho w^2 = \text{constant}$. (2.5.56)

Equation (2.5.56) can be interpreted to imply that the Lagrangian energy density
must be uniform in the fundamental mode of an optimum vibrating beam.
As with columns, the sufficiency of this optimality condition can be easily demon-
strated for the case n = 1. For this case Eq. (2.5.56) reduces to

$$Ea\,w''^2 - \omega^2 \rho\,w^2 = c. \tag{2.5.57}$$

We begin by assuming two designs $A(x)$ and $\bar{A}(x)$, both of which satisfy the constant
volume constraint, Eq. (2.5.15), and hence also Eq. (2.5.24). Assume $\omega$ and $\bar{\omega}$ to be
the fundamental frequencies and $w$ and $\bar{w}$ to be the associated fundamental modes
corresponding to the two designs $A(x)$ and $\bar{A}(x)$, respectively. Thus

(2.5.58)

Since $w$ is kinematically admissible for $\bar{\omega}^2$, by the Rayleigh quotient [11] we are
guaranteed that

$$\tilde{\omega}^2 = \frac{\int_0^l Ea\bar{A}\,w''^2\,dx}{\int_0^l \rho\bar{A}\,w^2\,dx} \ge \bar{\omega}^2. \tag{2.5.59}$$

But

(2.5.60)

and

$$\omega^2\int_0^l \rho A\,w^2\,dx = \int_0^l EaA\,w''^2\,dx. \tag{2.5.61}$$

Subtracting Eq. (2.5.61) from Eq. (2.5.60) we get

(2.5.62)

Now assume that the design A(x) is one that satisfies the optimality condition, Eq.
(2.5.57). Equation (2.5.62) can then be written as

$$\tilde{\omega}^2\int_0^l \rho\bar{A}\,w^2\,dx - \omega^2\int_0^l \rho A\,w^2\,dx = \int_0^l (\bar{A} - A)\,(c + \omega^2\rho\,w^2)\,dx, \tag{2.5.63}$$

which upon simplification and use of Eq. (2.5.25) yields

(2.5.64)
In light of Eq. (2.5.59) it follows that

(2.5.65)
thereby establishing the sufficiency of the optimality condition of Eq. (2.5.57).
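
The chain of relations in this sufficiency argument can be written out compactly. The block below is a reconstruction from the surrounding text rather than a copy of the numbered equations. It assumes that $\tilde{\omega}^2$ denotes the Rayleigh quotient of the design $\bar{A}$ evaluated with the mode $w$ of the design $A$, that $c$ is the constant in the optimality condition for $n = 1$ (so that $Ea\,w''^2 = c + \omega^2\rho\,w^2$), and that both designs have the same volume $V_0$.

```latex
\begin{align*}
\tilde{\omega}^2 &= \frac{\int_0^l Ea\bar{A}\,w''^2\,dx}{\int_0^l \rho\bar{A}\,w^2\,dx} \;\ge\; \bar{\omega}^2, \\[4pt]
\tilde{\omega}^2\!\int_0^l \rho\bar{A}\,w^2\,dx - \omega^2\!\int_0^l \rho A\,w^2\,dx
  &= \int_0^l Ea(\bar{A} - A)\,w''^2\,dx
   = \int_0^l (\bar{A} - A)\,(c + \omega^2\rho\,w^2)\,dx, \\[4pt]
(\tilde{\omega}^2 - \omega^2)\int_0^l \rho\bar{A}\,w^2\,dx
  &= c\int_0^l (\bar{A} - A)\,dx = c\,(V_0 - V_0) = 0, \\[4pt]
\omega^2 &= \tilde{\omega}^2 \;\ge\; \bar{\omega}^2 .
\end{align*}
```

The third line follows from the second by moving the term $\omega^2\int_0^l \rho(\bar{A} - A)\,w^2\,dx$ to the left-hand side. The last line states that no other design of the same volume has a higher fundamental frequency than the design satisfying the optimality condition.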
It should be noted that the same optimality condition can be shown to hold for
the dual problem of the minimum weight design of the beam for a specified frequency.
Several similar examples of optimization with frequency constraints may be found in
Refs. 33-38. In particular, Turner [34] and Taylor [35] provide exact solutions for
axially vibrating minimum mass structures at specified natural frequencies.
As in the case of columns, several approximate numerical solutions using the finite
element displacement method are available for maximum fundamental frequency of
elastically supported vibrating beams of fixed weight carrying a combination of con-
centrated and distributed non-structural masses and subjected to upper and lower
bounds on cross-sectional areas. For examples of this kind of approximate designs,
see Refs. 39 and 40. By comparison, published literature on the more practical dual
problem of minimizing the weight of beams for specified lower bounds on natural
frequencies and upper and lower bounds on design variables appears to be limited
[41]. It is not clear whether the primal and dual problems in this case are always
equivalent [41,42].
In closing this topic of vibrating beams, it is appropriate to point out that the
same optimality condition, Eq. (2.5.57) also applies to the optimum design of sand-
wich beams under the constraint of prescribed deflection at the point of application of
a single concentrated periodic load (e.g. see Icerman [43]). A more general optimal-
ity condition for the constraint of a prescribed deflection at a specified point under
a general distributed loading has been provided by Plaut [44]. For sandwich beams,
Plaut has shown that it is possible to establish the sufficiency of the optimality con-
dition on the basis of the principle of stationary mutual potential energy introduced
by Shield and Prager [45]. A mathematically more rigorous study of this problem
using the dynamic compliance of the structure as a constraint has been provided by
Mroz [46]. A very extensive bibliography on the topic of optimization for dynamic
response may be found in the survey papers referenced in Chapter 1.
Optimum design of Thin Plates for Vibration. The problem of the optimum
design of thin plates for vibration is not beset with the difficulty (encountered in
the design for buckling) associated with the dependence of the prebuckling stress-
resultants on the thickness distribution. That may explain why the problem of the
optimum design of thin plates for vibration appears to have received greater
attention than the corresponding problem for stability. Haftka and Prasad [30] have
provided an extensive bibliography on the optimum design of plate bending elements
for vibration.
The solution to the problem of the optimum design of a circular plate for vi-
bration was first provided by Olhoff [47]. Olhoff showed (see Exercise 8) that under
the assumption of a rotationally symmetric lowest mode, the problem reduces to
an ordinary, fourth-order, nonlinear, singular but homogeneous eigenvalue problem.
An approximate numerical solution to this problem was generated, but the solution
so obtained is only a local optimum belonging to the class of continuous thickness
distributions. For the same volume, it is easy to devise stiffened circular plate con-
figurations that possess far higher fundamental vibration frequencies than that of
Olhoff's original solution [47].

For rectangular plates, the optimum designs of finite element models that allow
discontinuous thickness distributions again exhibit a tendency to distribute the ma-
terial of the plate along discrete ribs [48-50]. For the same volume, a stiffened rectan-
gular plate can be expected to have a much higher fundamental frequency of vibration
than that of a plate optimized on the basis of a continuous thickness distribution.

2.6 Use of Series Solutions in Structural Optimization

The methods of calculus of variations discussed in the previous sections are ide-
ally suited for simple problems where the unknowns are design functions such as
area distributions. These problems are called distributed parameter optimization
problems.
Another approach for solving distributed parameter problems which are not sim-
ple enough to be attacked by the methods of Variational Calculus is the use of series
solutions. The basic idea is to assume a series representation of the unknown design
function within the domain of the structure along with the assumed response func-
tions such as displacements. In general, therefore, the series solution method reduces
continuous mathematical programming problems to discrete ones with a finite number
of design variables. These variables are the coefficients of the series representation of
the unknown design function. This idea was initially presented by Balasubramanyam
and Spillers [51], who solved various vibration and buckling problems using Fourier
series representation of the cross-sectional area of beam and column structures. A
similar procedure was recently used by Parbery [52] to obtain minimum-area shapes
for desired torsional and flexural rigidity. The method will be demonstrated by the
following example.

Example 2.6.1

The optimum design of a buckling-critical simply supported column is repeated in this
example [51] to demonstrate the use of the Fourier series approach. As in the examples
discussed earlier there is a fixed material volume constraint, see Eq. (2.5.15). The
objective is to find the cross-sectional area distribution of a plane-tapered column
that maximizes the buckling load. That is, the cross-sectional area distribution is as-
sumed to be related to a change in width (direction perpendicular to the deformation
direction) of a rectangular section with constant depth. This corresponds to n = 1
in Eq. (2.3.26).
We start with the governing stability equation for the problem

$$EaA(x)\,w'' + p\,w = 0. \tag{2.6.1}$$

Expanding the unknown quantities in two-term truncated Fourier series we have

$$w = a_1\sin\frac{\pi x}{L} + a_3\sin\frac{3\pi x}{L}, \tag{2.6.2}$$

$$A(x) = \beta_0 - \beta_2\cos\frac{2\pi x}{L}. \tag{2.6.3}$$

Note that the boundary conditions of the column eliminate the need for a constant
term in Eq. (2.6.2). Because of the expected symmetry of the mode shape and the
cross-sectional area distribution, the $a_2$ and $\beta_1$ terms are omitted from the Fourier series.
The selection of the cosine representation for the cross-sectional area makes it possible
to reduce the products of Fourier series ($Aw''$) directly to a single series.
The key strategy in this application is to reduce the number of unknown terms
by substituting these assumed forms into the appropriate equations. Equating the
coefficients of the similar trigonometric functions one obtains algebraic equations that
must be satisfied by these coefficients. For example, using Eq. (2.5.15) we can show
that

$$\beta_0 = \frac{V_0}{L}. \tag{2.6.4}$$
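Substituting $\beta_0$ back into Eq. (2.6.3) and then using Eq. (2.6.2) we obtain the following
product

$$aA(x)\,w'' = -\left(\frac{\pi}{L}\right)^2\left[\left(\frac{aV_0}{L}a_1 + \frac{a\beta_2}{2}a_1 - \frac{9a\beta_2}{2}a_3\right)\sin\frac{\pi x}{L} + \left(-\frac{a\beta_2}{2}a_1 + \frac{9aV_0}{L}a_3\right)\sin\frac{3\pi x}{L} - \frac{9a\beta_2}{2}a_3\sin\frac{5\pi x}{L}\right], \tag{2.6.5}$$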
where the trigonometric identity

2 cos A sin B = sin(A + B) - sin(A - B) , (2.6.6)

has also been used. Using Eq. (2.6.5) in the equilibrium equation and equating the
coefficients of the sine terms we obtain the following algebraic equations.

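$$-E\left(\frac{\pi}{L}\right)^2\left(\frac{aV_0}{L}a_1 + \frac{a\beta_2}{2}a_1 - \frac{9a\beta_2}{2}a_3\right) + p\,a_1 = 0, \tag{2.6.7}$$

$$-E\left(\frac{\pi}{L}\right)^2\left(\frac{9aV_0}{L}a_3 - \frac{a\beta_2}{2}a_1\right) + p\,a_3 = 0, \tag{2.6.8}$$

$$E\left(\frac{\pi}{L}\right)^2\left(\frac{9a\beta_2}{2}a_3\right) = 0. \tag{2.6.9}$$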

For a nontrivial solution, the determinant of the coefficient matrix for the un-
known mode shape $(a_1, a_3)^T$ must vanish. This results in the following quadratic
relation for the buckling load $p$ in terms of the only unknown coefficient $\beta_2$ left in
the problem

(2.6.10)

The expression for the critical load is given by

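$$p = E\left(\frac{\pi}{L}\right)^2\frac{a}{4}\left[\frac{20V_0}{L} + \beta_2 \pm \left(\frac{256V_0^2}{L^2} - \frac{32V_0\beta_2}{L} + 37\beta_2^2\right)^{1/2}\right]. \tag{2.6.11}$$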

In order to determine the value of $\beta_2$ that maximizes the buckling load we take the
derivative of Eq. (2.6.11) with respect to the unknown parameter $\beta_2$ and equate it
to zero. The resulting optimum value of $\beta_2$ is

$$\beta_2^* = \frac{32}{37}\,\frac{V_0}{L}, \tag{2.6.12}$$

and the corresponding optimum value of the buckling load is

$$p_{cr}^* = \frac{45}{37}\,\frac{\pi^2 EaV_0}{L^3} = \frac{45}{37}\,P_{0cr}, \tag{2.6.13}$$

where $P_{0cr}$ is the buckling load of the constant cross-section column of volume $V_0$.
Although 22% stronger than the constant cross-section column of the same volume,
this design is inferior to the design obtained in Example 2.5.1. In that example the
change in area was achieved by varying the depth of the cross section keeping the
width constant (n = 3). Clearly modifying the depth of the cross section is a more
effective way of achieving increased buckling resistance. Example 2.5.1 repeated with
n = 1 results in a quadratic distribution of the cross-sectional area with a critical load
of $P_{cr}^* = (12/\pi^2)P_{0cr}$, which is almost identical to the result obtained in Eq. (2.6.13).
Moreover, the advantage of this method over other classical methods is in its ability
to deal with more general structural problems under a variety of load conditions that
may not be possible to solve using variational calculus. •••
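
The one-parameter maximization in this example can be checked numerically. The sketch below takes the smaller root of Eq. (2.6.11), non-dimensionalized by $Ea(\pi/L)^2$, and maximizes it over $\beta_2$ with a golden-section search. The search routine, the variable names, and the normalization $V_0/L = 1$ are illustrative assumptions rather than part of the original example.

```python
import math

def load_factor(b2, v=1.0):
    """Smaller root of Eq. (2.6.11) divided by E*a*(pi/L)^2; v = V0/L, b2 = beta_2."""
    disc = 256.0 * v**2 - 32.0 * v * b2 + 37.0 * b2**2
    return 0.25 * (20.0 * v + b2 - math.sqrt(disc))

def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - g * (hi - lo)
    d = lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            lo = c   # the maximum lies in [c, hi]
        else:
            hi = d   # the maximum lies in [lo, d]
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
    return 0.5 * (lo + hi)

# beta_2 is restricted to [0, V0/L] so that A(x) = beta_0 - beta_2*cos(2*pi*x/L) >= 0
b2_opt = golden_max(load_factor, 0.0, 1.0)
p_opt = load_factor(b2_opt)
print(b2_opt, 32.0 / 37.0)   # optimum beta_2 in units of V0/L
print(p_opt, 45.0 / 37.0)    # optimum load in units of P_0cr
```

With `b2 = 0` the function returns 1, the uniform-column value, so the printed load factor is directly the gain over the constant cross-section design.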

The success of the series solution in optimization is closely related to the form
of the series chosen for the representation of the unknown function. In order to keep
the number of design variables to a minimum, only a few terms in the series represen-
tation should be used. But, with a small number of terms used in the series, the
approximation of the solution of the governing differential equations of the problem
may be poor. Selection of the two-term approximation for the mode shape in the
example just covered makes it possible to come up with a one parameter solution for
the maximum buckling load in a closed form. However, it is important to note that
the two-term solution shown above does not satisfy the equilibrium equation exactly.
The last term in Eq. (2.6.5), when substituted into the equilibrium equation does
not vanish. If, on the other hand, one uses too many terms in the series finding the
optimum values of the coefficients of the terms becomes difficult, and may require
the use of a formal search technique. A simple way of reducing the number of design
variables without the loss of accuracy is to use possible symmetry inherent to the
problem so that only a part of the geometry needs to be modelled. A good example
of this approach is demonstrated in [52] where three-fold symmetry is used for the
cross-sectional shape of a bar in torsion.

2.7 Exercises

1. The equations of equilibrium and the associated boundary conditions of an elastic
structure can be obtained by requiring that the potential energy of the structure be
a minimum. Illustrate this for a cantilever Euler-Bernoulli beam. Comment on the
types of boundary conditions at the two ends of the beam. Assume that the potential
energy of such a beam is given by

$$\Pi = \frac{1}{2}\int_V \sigma\varepsilon\,dV - \int_0^l q(x)\,w\,dx,$$

where

$$\varepsilon = -y\,\frac{d^2w}{dx^2}, \quad\text{and}\quad \sigma = E\varepsilon,$$

and $q(x)$ is the distributed external transverse loading acting along the beam.
2. Solve Example Problem 2.3.2 for q = qo, ~ = 1/2, assuming n = 2 and n = 3.
3. Solve Example Problem 2.3.3 for the following cases:
a) n = 1; $q(x) = q_0(l - x)/l$.
b) n = 1; $q(x) = 4q_0(lx - x^2)/l^2$.
c) n = 2; $q(x) = q_0$.
d) n = 2; $q(x) = q_0(l - x)/l$.
e) n = 3; $q(x) = q_0$.
f) n = 3; $q(x) = q_0(l - x)/l$.
4. Solve Example 2.4.2.
5. Determine the optimum area distribution and corresponding buckling loads of the
following Euler-Bernoulli columns subject to the constant volume constraint;
a) cantilever column, n = 1, 2, and 3.
b) simply-supported column, n = 1,2.

6. The Rayleigh quotient for an axially vibrating bar with an attached non-structural
mass $m$ at the free end $x = l$ is given by

$$\omega^2 = \frac{\int_0^l EA(x)\,u'^2\,dx}{\int_0^l \rho A(x)\,u^2\,dx + m[u(l)]^2}.$$

a) Derive the equation of motion and the optimality condition for the minimum
mass design of the bar with a specified fundamental frequency $\omega_0$.

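b) Verify Turner's solution [34] that for such a bar

$$A(x) = \frac{\beta m}{\rho}\tanh\beta l\left[\frac{\cosh\beta l}{\cosh\beta x}\right]^2, \quad\text{and}\quad V_0 = \frac{m}{\rho}\sinh^2\beta l,$$

$$u(x) = \sinh\beta x/\sinh\beta l,$$

where $\beta = \sqrt{\omega_0^2\rho/E}$.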
7. Begin with a Rayleigh quotient similar to that of the previous problem for a
vibrating cantilever beam of sandwich construction. Assume that the beam carries a
distributed non-structural mass $m(x) \gg \rho A(x)$. Verify Taylor's solution [35] that the
area distribution $A(x)$ for the optimum beam with a specified fundamental frequency
$\omega_0$ is given by

-
I

A(x) = ;~2 J(~ x)em(~)dC


x

where 2c2 = I(x)/A(x), and ~ = x/l.


8. Show that the fundamental frequency for a thin circular plate of radius $a$ and
thickness distribution function $t(r)$ is given by

$$\omega^2 = \min_{w}\,\frac{\int_0^1 h^3(\xi)\left[w''^2 + 2\nu\,w''w'/\xi + (w'/\xi)^2\right]\xi\,d\xi}{\int_0^1 h(\xi)\,w^2\,\xi\,d\xi},$$

where

$$\xi = \frac{r}{a}, \quad\text{and}\quad V_0 = \int_0^a 2\pi t r\,dr,$$

with primes denoting differentiation with respect to the non-dimensional radial
coordinate $\xi$.
Also, show that the optimality condition for maximizing the fundamental fre-
quency of such a plate with a specified volume Vo,

can be reduced to imply a constant Lagrangian density in the fundamental mode.


9. Solve Example Problem 2.6.1 assuming n = 2.
10. The governing equation of motion for the steady-state forced vibration of a
simply-supported Euler-Bernoulli beam is given by

$$(EIw'')'' - \rho A\omega^2 w = q(x, t),$$

where the applied transverse load $q(x, t) = q_0\sin\omega t$ and the area distribution is
related to the moment of inertia of the cross-section by $I(x) = a[A(x)]^n$. Determine
the optimal distribution of the cross-sectional area for n = 1 and 2 such that the
center displacement, $w(l/2)$, is minimized subject to a specified constant volume, $V_0$,
constraint. Assume a two-term symmetric solution for the displacement and the area
distribution.

2.8 References

[1] Hancock, H., Theory of Maxima and Minima. Ginn and Company, New York,
1917.
[2] Gelfand, I.M. and Fomin, S.V., Calculus of Variations. Prentice Hall, Inc.,
Englewood Cliffs, NJ, 1963.
[3] Pars, L.A., An Introduction to the Calculus of Variations. Heinemann, London,
1962.
[4] Hildebrand, F.B., Methods of Applied Mathematics. Prentice-Hall, New Jersey,
1965.
[5] Reddy, J.N., Energy and Variational Methods in Applied Mechanics. John Wiley
and Sons, New York, 1984.
[6] Barnett, R.L., "Minimum Weight Design of Beams for Deflection," J. EM Divi-
sion, ASCE, Vol. EM1, 1961, pp. 75-95.
[7] Makky, S.M. and Ghalib, M.A., "Design for Minimum Deflection," Eng. Opt., 4,
pp. 9-13, 1979.
[8] Taylor, J.E. and Bendsøe, M.P., "An Interpretation for Min-Max Structural
Design Problems Including a Method for Relaxing Constraints," International
Journal of Solids and Structures, 30, 4, pp. 301-314, 1984.
[9] Prager, W. and Taylor, J.E., "Problems of Optimal Structural Design," J. Appl.
Mech., 35, pp. 102-106, 1968.
[10] Prager, W., "Optimization of Structural Design," J. Optimization Theory and
Applications, 6, pp. 1-21, 1979.
[11] Washizu, K., Variational Methods in Elasticity and Plasticity. 2nd ed. Pergamon
Press, 1975.
[12] Keller, J.B., "The Shape of the Strongest Column," Arch. Rat. Mech. Anal., 5,
pp. 275-285, 1960.
[13] Tadjbaksh, I. and Keller, J.B., "Strongest Columns and Isoperimetric Inequalities
for Eigenvalues," J. Appl. Mech., 29, pp. 159-164, 1962.

[14] Keller, J.B. and Niordson, F.I., "The Tallest Column," J. Math. Mech., 29, pp.
433-446, 1966.
[15] Huang, N.C. and Sheu, C.Y., "Optimal Design of an Elastic Column of Thin-
Walled Cross Section," J. Appl. Mech., 35, pp. 285-288, 1968.
[16] Taylor, J.E., "The Strongest Column: An Energy Approach," J. Appl. Mech.,
34, pp. 486-487, 1967.
[17] Salinas, D., On Variational Formulations for Optimal Structural Design. Ph.D.
Dissertation, University of California, Los Angeles, 1968.
[18] Simitses, G.J., Kamat, M.P. and Smith, C.V., Jr., "The Strongest Column by the
Finite Element Displacement Method," AIAA Paper No: 72-141, 1972.
[19] Hornbuckle, J.C., On the Automated Optimal Design of Constrained Structures.
Ph.D. Dissertation, University of Florida, 1974.
[20] Turner, H.K. and Plaut, R.H., "Optimal Design for Stability under Multiple
Loads," J. EM Div. ASCE 12, pp. 1365-1382, 1980.
[21] Olhoff, N.J. and Rasmussen, H., "On Single and Bimodal Optimal Buckling
Modes of Clamped Columns," Int. J. Solids and Structures, 13, pp. 605-614,
1977.
[22] Masur, E.F., "Optimal Structural Design under Multiple Eigenvalue Con-
straints," Int. J. Solids Structures, 20, pp. 211-231, 1984.
[23] Masur, E.F., "Some Additional Comments on Optimal Structural Design under
Multiple Eigenvalue Constraints," Int. J. Solids Structures, 21, pp. 117-120, 1985.
[24] Olhoff, N.J., "Structural Optimization by Variational Methods," in Computer
Aided Structural Design: Structural and Mechanical Systems (C.A. Mota Soares,
Editor), Springer Verlag, pp. 87-164, 1987.
[25] Plaut, R.H., Johnson, L.W. and Olhoff, N., "Bimodal Optimization of Com-
pressed Columns on Elastic Foundations," J. Appl. Mech., 53, pp. 130-134,1986.
[26] Frauenthal, J.C., "Constrained Optimal Design of Circular Plates against Buck-
ling," J. Struct. Mech., 1, pp. 159-186,1972.
[27] Kamat, M.P., Optimization of Structural Elements for Stability and Vibration.
Ph.D. Dissertation, Georgia Institute of Technology, Atlanta, GA, 1972.
[28] Armand, J.L. and Lodier, B., "Optimal Design of Bending Elements," Int. J.
Num. Meth. Eng., 13, pp. 373-384, 1978.
[29] Simitses, G.J., "Optimal Versus the Stiffened Circular Plate," AIAA J., 11, pp.
1409-1412, 1973.
[30] Haftka, R.T. and Prasad, B., "Optimum Structural Design with Plate Bending
Elements: A Survey," AIAA J., 19, pp. 517-522, 1981.


[31] Olhoff, N., "On Singularities, Local Optima and Formation of Stiffeners in Op-
timal Design of Plates," In: Optimization in Structural Design, A. Sawczuk and
Z. Mroz (eds.). Springer-Verlag, 1975, pp. 82-103.
[32] Gajewski, A., and Zyczkowski, M., Optimal Structural Design under Stability
Constraints, Kluwer Academic Publishers, 1988.
[33] Niordson, F.I., "On the Optimal Design of a Vibrating Beam," Quart. Appl.
Math., 23, pp. 47-53, 1965.
[34] Turner, M.J., "Design of Minimum-Mass Structures with Specified Natural Fre-
quencies," AIAA J., 5, pp. 406-412, 1967.
[35] Taylor, J.E., "Minimum-Mass Bar for Axial Vibration at Specified Natural Fre-
quency," AIAA J., 5, pp. 1911-1913, 1967.
[36] Zarghamee, M.S., "Optimum Frequency of Structures," AIAA J., 6, pp. 749-750,
1968.
[37] Brach, R.M., "On Optimal Design of Vibrating Structures," J. Optimization The-
ory and Applications, 11, pp. 662-667, 1973.
[38] Miele, A., Mangiavacchi, A., Mohanty, B.P. and Wu, A.K., "Numerical Determi-
nation of Minimum Mass Structures with Specified Natural Frequencies," Int. J.
Num. Meth. Engng., 13, pp. 265-282, 1978.
[39] Kamat, M.P. and Simitses, G.J., "Optimum Beam Frequencies by the Finite
Element Displacement Method," Int. J. Solids and Structures, 9, pp. 415-429,
1973.
[40] Kamat, M.P., "Effect of Shear Deformations and Rotary Inertia on Optimum
Beam Frequencies," Int. J. Num. Meth. Engng., 9, pp. 51-62, 1975.
[41] Pierson, B.L., "A Survey of Optimal Structural Design under Dynamic Con-
straints," Int. J. Num. Meth. Engng., 4, pp. 491-499, 1972.
[42] Kiusalaas, J., "An Algorithm for Optimal Structural Design with Frequency Con-
straints," Int. J. Num. Meth. Engng., 13, pp. 283-295, 1978.
[43] Icerman, L.J., "Optimal Structural Design for given Dynamic Deflection," Int. J.
Solids and Structures, 5, pp. 473-490, 1969.
[44] Plaut, R.H., "Optimal Structural Design for given Deflection under Periodic
Loading," Quart. Appl. Math., 29, pp. 315-318, 1971.
[45] Shield, R.T. and Prager, W., "Optimal Structural Design for given Deflection,"
Z. Angew. Math. Phys., 21, pp. 513-523, 1970.
[46] Mroz, Z., "Optimal Design of Elastic Structures subjected to Dynamic, Harmon-
ically Varying Loads," Z. Angew. Math. Mech., 50, pp. 303-309, 1970.
[47] Olhoff, N., "Optimal Design of Vibrating Circular Plates," Int. J. Solids and
Structures, 6, pp. 139-156, 1970.

[48] Olhoff, N., "Optimal Design of Vibrating Rectangular Plates," Int. J. Solids and
Structures, 10, pp. 93-109, 1974.
[49] Kamat, M.P., "Optimal Thin Rectangular Plates for Vibration," Recent Advances
in Engineering Science, Vol. 3. Proceedings of the 10th Annual Meeting of the
Society of Engineering Science, pp. 101-108, 1973.
[50] Armand, J.L., Lurie, K.A. and Cherkaev, A.V., "Existence of Solutions of
the Plate Optimization Problem," Proceedings of the International Symposium
Structural Design, Tucson, AZ, pp. 3.1-3.2, 1981.
[51] Balasubramanyam, K. and Spillers, W.R., "Examples of the Use of Fourier Series
in Structural Optimization," Quart. of Appl. Math., 3, pp. 559-566, 1986.
[52] Parbery, R.D., "On Minimum-Area Convex Shapes of given Torsional and Flex-
ural Rigidity," Eng. Opt., 13, pp. 189-196, 1988.

3 Linear Programming

Mathematical programming is concerned with the extremization of a function f
defined over an n-dimensional design space $R^n$ and bounded by a set S in the de-
sign space. The set S may be defined by equality or inequality constraints, and
these constraints may assume linear or nonlinear forms. The function f together
with the set S in the domain of f is called a mathematical program or a mathemat-
ical programming problem. This terminology is in common usage in the context of
problems which arise in planning and scheduling, which are generally studied under
operations research, the branch of mathematics concerned with decision making pro-
cesses. Mathematical programming problems may be classified into several different
categories depending on the nature and form of the design variables, constraint func-
tions, and the objective function. However, only two of these categories are of interest
to us, namely linear and nonlinear programming problems (commonly designated as
LP and NLP, respectively).
The term linear programming (LP) describes a particular class of extremization
problems in which the objective function and the constraint relations are linear func-
tions of the design variables. Because the necessary condition for an interior minimum
is the vanishing of the first derivative of the function with respect to the design vari-
ables, linear programming problems have a special feature. That is, the derivatives
of the objective function with respect to the variables are constants which are not
necessarily zeroes. This implies that the extremum of a linear programming problem
cannot be located in the interior of the feasible design space and, therefore, must lie on
the boundary of the design space described by the constraint relations. Since the con-
straint relations are also linear functions of the design variables the optimum design
must lie at the intersection of two or more constraint functions, unless the bounding
constraint is parallel to the contours of the objective function. This special feature of
the linear programming problems makes it possible to devise effective algorithms that
are suitable for reaching optimum solutions. Linear programming problems involving
a large number of design variables and constraints are usually solved by an extremely
efficient and reliable method known as the simplex method.
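
The observation that the optimum of an LP lies at an intersection of constraint boundaries can be illustrated with a brute-force sketch for two design variables. The routine below is not the simplex method: it simply enumerates the intersections of all pairs of constraint boundaries, keeps the feasible ones, and picks the best objective value. The problem data and the function name `solve_lp_2d` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def solve_lp_2d(c, constraints, tol=1e-9):
    """Minimize c[0]*x + c[1]*y subject to a1*x + a2*y <= b for each (a1, a2, b).

    Enumerates intersections of constraint pairs; usable only for tiny, bounded
    problems with two variables.
    """
    best = None
    for (a1, a2, b), (d1, d2, e) in combinations(constraints, 2):
        det = a1 * d2 - a2 * d1
        if abs(det) < tol:      # parallel boundaries do not define a vertex
            continue
        x = (b * d2 - a2 * e) / det     # Cramer's rule for the 2x2 system
        y = (a1 * e - b * d1) / det
        if all(p * x + q * y <= r + tol for p, q, r in constraints):
            value = c[0] * x + c[1] * y
            if best is None or value < best[0]:
                best = (value, x, y)
    return best

# Hypothetical problem: minimize f = -x - y
# subject to x + 2y <= 8, 3x + y <= 9, x >= 0, y >= 0
constraints = [(1.0, 2.0, 8.0), (3.0, 1.0, 9.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
f_min, x_opt, y_opt = solve_lp_2d((-1.0, -1.0), constraints)
print(f_min, x_opt, y_opt)
```

For this data the feasible vertices are (0, 0), (3, 0), (0, 4), and (2, 3), and the minimum is attained where the two inequality boundaries intersect. The number of candidate intersections grows combinatorially with the number of constraints, which is why a systematic vertex-to-vertex procedure such as the simplex method is used in practice.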
Unfortunately, however, very few physically meaningful problems in structural
design, if any, can be formulated directly as LP problems without involving a degree
of simplification. Most structural design problems involve highly nonlinear objective
functions and constraint relations. Nevertheless, the category of LP is of interest to
us for several reasons. First of all, many nonlinear constrained problems can
be approximated by linear ones which can be solved efficiently by using standard
LP algorithms. Using such approximations opens up a possibility for solving NLP
problems. That is, almost all NLP problems can be solved as a sequence of repetitive
approximate LP problems which converge to the exact solution of the original NLP
problem provided that the procedure is repeated a sufficient number of times. This
powerful procedure is called sequential linear programming (SLP) and is discussed in
Chapter 6. Also, methods intended for nonlinear constrained problems often utilize
linear programming as an intermediate step. For example, Zoutendijk's method of
feasible directions (see Chapter 5) employs an LP to generate a search direction.

Whether a given nonlinearly constrained problem in structural optimization can
be replaced by an approximately equivalent linearly constrained problem depends to
a great extent on the intuition of the designer and his knowledge and experience with
the given problem. Such approximations must usually be made so as not to alter the
overall character of the problem radically. The trade-off between a higher value of the
objective function attained because of the approximation and a lower computational
cost must be weighed carefully. Fortunately, there are a few classes of problems in
structural analysis and design in which such approximations have been found to be
reasonable. In the following sections some of those problems will be presented as
LP problems, and the graphical solution of a simple LP problem will be demonstrated.
Next, the standard formulation of the mathematical LP problems will be presented,
and solution techniques for LP problems will be discussed. Finally, we will discuss
a special case of LP problems that require the design variables to assume values from
a set of discrete or integer values.

3.1 Limit Analysis and Design of Structures Formulated as LP Problems

In many structural design problems the initiation of yielding somewhere in the


structure is considered to be a criterion for failure, but this is not always reasonable.
In many cases we are not interested in the initiation of failure but in the maximum
load, called the limit load or the collapse load, that a structure may carry without
losing its functionality. The collapse load can be defined as the load required to
generate enough local plastic yield points (referred to as plastic hinges for bending type
members) to cause the structure to become a mechanism and develop excessive deflections.
While the exact calculation of the collapse load of a structure requires the
solution of a costly nonlinear problem, for ductile materials it is possible to obtain
a conservative estimate of that load by making the assumption that the material
behaves as an elastic-perfectly plastic material. That is, the material is assumed
to follow the stress-strain diagram shown in Fig. 3.1.1, yielding at stress level $\sigma_0$
but functioning as a constant stress carrying medium beyond the elastic limit. It is
this important assumption that allows the limit analysis and design problems to be
formulated as LP problems.


Figure 3.1.1 The stress-strain curve for an elastic-perfectly plastic material.

A three bar truss is used in the following example to illustrate
the difference between the calculation of the load which initiates yielding and the
estimate of the collapse load.

Example 3.1.1

Figure 3.1.2 Collapse of a three bar truss subject to a single load.

We perform the collapse analysis of a three bar pin jointed truss under a vertical
load as shown in Fig. 3.1.2. All three bars have the same cross-sectional area A, and
are made of material having Young's modulus E and yield stress $\sigma_0$. We start by
calculating the load p at which the first bar yields. Denoting the vertical displacement
at the common joint D by v, we obtain the strains in the three members

$$ \epsilon_B = \frac{v}{l}, \qquad \epsilon_A = \epsilon_C = \frac{v}{4l} . \tag{3.1.1} $$

The corresponding member forces are

$$ n_B = \frac{EA}{l} v, \qquad n_A = n_C = \frac{EA}{4l} v . \tag{3.1.2} $$

Using the two equations of equilibrium at joint D, we get

$$ p = n_B + \frac{1}{2}(n_A + n_C) = 1.25 \frac{EA}{l} v , \tag{3.1.3} $$


and the internal forces in the three members are determined as

$$ n_A = n_C = 0.2p, \qquad n_B = 0.8p . \tag{3.1.4} $$

Clearly, as the load is increased from zero, member B yields first, when

$$ n_B = 0.8p = A\sigma_0, \qquad \text{or} \qquad p = 1.25A\sigma_0 . \tag{3.1.5} $$

The structure does not collapse, however, at $p = 1.25A\sigma_0$ since members A and
C can still carry the applied load without experiencing excessive deformations. We
may increase the load until member A or C yields. Since we have assumed elastic-perfectly
plastic material behavior, the stress in member B will remain at $\sigma_0$ as we
increase the load beyond the initial yield load. Due to the symmetry in this problem,
the next yielding takes place simultaneously in members A and C. Therefore, at
collapse all three members will be at the yield point so that

$$ n_A = n_B = n_C = A\sigma_0 , \tag{3.1.6} $$

and from the equations of equilibrium at joint D we have

$$ p = n_B + \frac{1}{2}(n_A + n_C) = 2A\sigma_0 . \tag{3.1.7} $$

This is a 60% increase over the load when first yielding starts. •••
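The two loads of Example 3.1.1 follow from a few lines of arithmetic. The sketch below is illustrative only: it assumes, consistently with the strains of Eq. (3.1.1), that bars A and C are inclined at 60 degrees to the vertical bar B, and it expresses loads in units of $A\sigma_0$. The function name `three_bar_truss_loads` is hypothetical.

```python
import math


def three_bar_truss_loads(theta_deg=60.0):
    """Return (first_yield, collapse) loads in units of A*sigma_0.

    theta_deg is the assumed angle between the inclined bars A, C and the
    vertical bar B; 60 degrees reproduces the strains of Eq. (3.1.1).
    """
    c = math.cos(math.radians(theta_deg))

    # Elastic stage. Bar B has length l; bars A and C have length l/c and
    # elongate by v*c, so their axial stiffness relative to bar B is c**2.
    k_b = 1.0
    k_ac = c * c

    # Vertical equilibrium: p = n_B + 2*c*n_A, with n_B = k_b*v, n_A = k_ac*v.
    share_b = k_b / (k_b + 2.0 * c * k_ac)  # fraction n_B / p

    # Bar B carries the largest force, so it yields first when n_B = A*sigma_0.
    first_yield = 1.0 / share_b

    # At collapse all three bars carry A*sigma_0 (elastic-perfectly plastic).
    collapse = 1.0 + 2.0 * c
    return first_yield, collapse


first_yield, collapse = three_bar_truss_loads()
print(round(first_yield, 6), round(collapse, 6), round(collapse / first_yield, 6))
```

For the 60 degree geometry this reproduces the first-yield load of $1.25A\sigma_0$, the collapse load of $2A\sigma_0$, and their ratio of 1.6.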

In Example 3.1.1 it was easy to identify the sequence of yielding of the members
and determine the state of stress in the members at collapse. This fact permitted us to
determine the collapse load without difficulty. In general, it is not easy to determine
the combination of members that will yield at collapse, and the stress distribution at
collapse is not known. Fortunately, it is possible to cast the problem as an LP
problem in order to determine the collapse load [1] based on a general theorem of
the theory of plasticity. This theorem is the lower bound theorem, and it is quoted
below from Calladine [2].

The Lower Bound Theorem: If any stress distribution throughout the structure
can be found which is everywhere in equilibrium internally and balances the external
loads, and at the same time does not violate the yield conditions, these loads will be
carried safely by the structure.

The application of this theorem will now be demonstrated for a problem where
the choice of stress at collapse is not as trivial as it was in Example 3.1.1. We use the
same structure as in the previous example, but with an added horizontal load at
point D.

Example 3.1.2

Figure 3.1.3 Limit analysis of a three bar truss subjected to two loads.

Consider the limit analysis of the three bar truss of Figure 3.1.3 under the combined
vertical and horizontal loads of equal magnitude, p. The equations of equilibrium
in this case are

$$ n_B + \frac{1}{2}(n_A + n_C) - p = 0, \qquad \frac{\sqrt{3}}{2}(n_A - n_C) - p = 0 , \tag{3.1.8} $$

and we have the constraints

$$ -A\sigma_0 \le n_i \le A\sigma_0, \qquad i = A, B, C . \tag{3.1.9} $$
It is no longer easy to know which two of the three bars yield at collapse. However,
we may try different combinations of $n_A$, $n_B$, and $n_C$ that satisfy the equations of
equilibrium in order to obtain a lower bound to the collapse load. For example, if we
try $n_C = 0$, we obtain from the equilibrium relations (3.1.8)

$$ n_A = \frac{2}{\sqrt{3}} p = 1.155p, \qquad \text{and} \qquad n_B = 0.423p . \tag{3.1.10} $$

Clearly in this case $n_A$ reaches its yield value of $A\sigma_0$ before $n_B$ so that

$$ n_A = A\sigma_0, \qquad n_B = 0.366A\sigma_0, \qquad \text{and} \qquad p = \frac{\sqrt{3}}{2} A\sigma_0 = 0.866A\sigma_0 . \tag{3.1.11} $$

Having satisfied all the requirements of the lower bound theorem, we thus know
that the collapse load is bounded below by $0.866A\sigma_0$. We can now try different
member force distributions until we obtain a higher value of p than
the one obtained in Eq. (3.1.11). To get the best estimate, we cast the problem as a
maximization problem

maximize p
such that Eqs. (3.1.8) and (3.1.9) are satisfied. (3.1.12)

This is clearly an LP problem in the variables $n_A$, $n_B$, $n_C$ and p, and may be solved
using any LP algorithm. It is also simple enough to admit a graphical solution if
required (see Exercise 1). •••
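The LP of Eq. (3.1.12) is small enough to check numerically. The following is a minimal sketch, not the procedure of the text: it measures forces in units of $A\sigma_0$, uses the two equilibrium equations (3.1.8) to eliminate $n_B$ and p, and then inspects every corner of the remaining two-variable feasible polygon in ($n_A$, $n_C$). It assumes yield limits of $\pm A\sigma_0$ for all three bars, and the helper name `maximize_2d` is hypothetical.

```python
import itertools
import math


def maximize_2d(cost, rows, rhs, tol=1e-9):
    """Maximize cost.x subject to rows[i].x <= rhs[i] for a bounded 2-D region.

    Every pair of constraint lines is intersected; the feasible intersection
    with the largest objective value is returned as (value, point).
    """
    best = None
    for (a1, b1), (a2, b2) in itertools.combinations(list(zip(rows, rhs)), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:          # parallel lines, no corner
            continue
        x = ((b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - b1 * a2[0]) / det)
        if all(r[0] * x[0] + r[1] * x[1] <= b + tol for r, b in zip(rows, rhs)):
            value = cost[0] * x[0] + cost[1] * x[1]
            if best is None or value > best[0]:
                best = (value, x)
    return best


s = math.sqrt(3.0) / 2.0
# From Eq. (3.1.8): p = s*(n_A - n_C) and n_B = p - (n_A + n_C)/2.
nb_row = (s - 0.5, -(s + 0.5))        # n_B written in terms of (n_A, n_C)

rows = [(1, 0), (-1, 0), (0, 1), (0, -1), nb_row, (-nb_row[0], -nb_row[1])]
rhs = [1.0] * 6                       # |n_A|, |n_C|, |n_B| <= 1 (units of A*sigma_0)

p_max, (n_a, n_c) = maximize_2d((s, -s), rows, rhs)
n_b = nb_row[0] * n_a + nb_row[1] * n_c
print(round(p_max, 4), round(n_a, 4), round(n_b, 4), round(n_c, 4))
```

Under these assumptions the sketch gives a collapse load of about $1.268A\sigma_0$, with bars A and B at tensile yield and bar C in compression below yield, which improves on the trial value $0.866A\sigma_0$ of Eq. (3.1.11).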

The general formulation of the calculation of the limit load for truss structures
is similar to the procedure used in Example 3.1.2. It is assumed that no part of the
truss structure fails by buckling before the plastic collapse load is reached. If we have
a truss structure with r members loaded by a system of loads $\lambda\mathbf{p}$, where $\mathbf{p}$ is a given
load vector and $\lambda$ is a scalar, the limit load can be determined by finding the largest
value of $\lambda$ that the structure can support. The equations of equilibrium are written
as

$$ \sum_{j=1}^{r} e_{ij} n_j = \lambda p_i, \qquad i = 1, \ldots, m, \tag{3.1.13} $$

where $n_j$ (j = 1, ..., r) are the forces in the truss members, $e_{ij}$ are direction
cosines, and m is the number of equilibrium equations. The yield constraints are
written as

$$ A_j\sigma_{Cj} \le n_j \le A_j\sigma_{Tj}, \qquad j = 1, \ldots, r, \tag{3.1.14} $$

where $A_j$, $\sigma_{Cj}$, and $\sigma_{Tj}$ are the cross-sectional areas, and the yield stresses in compression
(taken as negative) and tension, respectively. The limit or collapse load is then the solution to
the following linear programming problem:

maximize $\lambda$
such that Eq. (3.1.13) and Eq. (3.1.14) are satisfied, (3.1.15)

where $\lambda$ and the member forces $n_j$ are treated as the design variables.
A related problem is the problem of limit design where the collapse load is specified
and the optimal cross-sectional areas are sought. Often, the objective is to
minimize the total mass of the structure

$$ \text{minimize} \quad m = \sum_{j=1}^{r} \rho_j A_j l_j , \tag{3.1.16} $$

where $\rho_j$ and $l_j$ are the mass density and the length of member j, respectively. The
minimization problem of Eq. (3.1.16) has the same set of constraints, Eqs. (3.1.13)
and (3.1.14), that applies to the limit analysis problem, but both $n_j$ and $A_j$ are
treated as design variables. This time, however, the load amplitude $\lambda$ is specified.

Example 3.1.3

Formulate the limit analysis and design of the five bar truss shown in Figure 3.1.4
as linear programs. Assume that all bars are made of the same material and that
$\sigma_C = -\sigma_T = -\sigma_0$.
The vertical and horizontal equations of equilibrium at the unrestrained nodes of
the structure are

$$ n_{13} + \frac{\sqrt{2}}{2} n_{23} = 0, \qquad n_{24} + \frac{\sqrt{2}}{2} n_{14} = 0, \tag{3.1.17a} $$

$$ n_{34} + \frac{\sqrt{2}}{2} n_{23} = 0, \qquad n_{34} + \frac{\sqrt{2}}{2} n_{14} = p . \tag{3.1.17b} $$

Figure 3.1.4 Limit analysis and design of a five bar truss.

The yield constraints are

$$
\begin{aligned}
-A_{13}\sigma_0 \le n_{13} \le A_{13}\sigma_0, &\qquad -A_{23}\sigma_0 \le n_{23} \le A_{23}\sigma_0, \\
-A_{14}\sigma_0 \le n_{14} \le A_{14}\sigma_0, &\qquad -A_{24}\sigma_0 \le n_{24} \le A_{24}\sigma_0, \\
-A_{34}\sigma_0 \le n_{34} \le A_{34}\sigma_0 . &
\end{aligned}
\tag{3.1.18}
$$

The limit load problem is specified as defined previously: maximize p, by varying
the member forces, such that the equations of equilibrium and the yield constraints
are satisfied. The limit design problem is

minimize $\frac{m}{\rho l} = A_{13} + A_{24} + A_{34} + \sqrt{2}(A_{14} + A_{23})$
such that Eq. (3.1.17) and Eq. (3.1.18) are satisfied. (3.1.19)

For the limit design problem both the cross-sectional areas and the member forces
are treated as design variables. •••

The analysis and design of structures that include members under bending may
be formulated as LP problems as in Refs. 3-5. Cohn, Ghosh, and Parimi [3] provide
an excellent unified approach to both the analysis and design of beams, frames, and
arches of given configurations under fixed, alternating, and variable repeated or shakedown
loadings. We focus our attention here only on simple examples in this class of
problems.

The basic hypothesis regarding the material is that the beam or frame is elastic-perfectly
plastic. The fully plastic moment, $m_p$, of a beam cross-section is defined as
the bending moment, m, required to make the entire cross-section yield so as to form
a hinge with constant bending resistance.

Example 3.1.4


Figure 3.1.5 Limit analysis of a two-span beam.

Limit analysis of bending members is illustrated by using a two-span continuous
beam under the loading shown in Figure 3.1.5. Following the general formulation
presented earlier, the limit load is the largest value of $\lambda$ that the structure can support
without forming a mechanism. As in the case of Example 3.1.2, the sequence of
hinge formation leading to a beam mechanism and the distribution of the bending
moments along the span of the beam are not obvious. In fact, there are infinitely
many statically admissible bending moment distributions that satisfy the equilibrium
equations. However, there are only two possible collapse mechanisms. The two
elementary mechanisms and the moment distribution for the beam are presented in
Figure 3.1.5.
The LP problem for the plastic analysis is

$$
\begin{aligned}
\text{maximize} \quad & \lambda \\
\text{subject to} \quad & -m_p \le m_i \le m_p, \qquad i = 1, 2, 3,
\end{aligned}
\tag{3.1.20}
$$

where $m_1$, $m_2$, and $m_3$ are the magnitudes of the bending moment at those points
along the beam which have the potential to form plastic hinges; at these points
the bending moments have local maxima. These three moments are also unknowns
of the problem and need to be determined. At the onset of either of the collapse
mechanisms shown in Figure 3.1.5, we can write down two equations of equilibrium
by using the principle of virtual displacements. The basic assumption in writing the
virtual displacements is that the hinges in the figure are not plastic hinges, but are
introduced to permit the small displacements that are assumed to take place while
the members between them remain straight. The resulting equilibrium relations are

(3.1.21)

(3.1.22)
where $\theta_1$, $\theta_2$ are the virtual rotations of the member at the expected plastic joints and
$\delta_1$, $\delta_2$ the virtual displacements of the beam under the load points. The virtual displacements
and the rotations are related to one another through kinematic relations,
and can be eliminated from the equations. Furthermore, using the two equilibrium
equations, we can eliminate the two variables, $m_1$ and $m_3$, to reduce the LP problem
of Eq. (3.1.20) to finding the $\lambda$ and $m_2$ such that

$$
\begin{aligned}
\text{maximize} \quad & \lambda \\
\text{subject to} \quad & -m_p \le \frac{pl}{4}\lambda - \frac{1}{2}m_2 \le m_p, \\
& -m_p \le m_2 \le m_p, \\
& -m_p \le \frac{pl}{2}\lambda - \frac{1}{2}m_2 \le m_p .
\end{aligned}
\tag{3.1.23}
$$

This is a simple two variable ($m_2$ and $\lambda$) LP problem that can be solved graphically.
•••
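Because the reduced problem has only two variables, its solution can also be worked out by hand. The following is a sketch of that reasoning, not taken from the text; it reads the three constraints of Eq. (3.1.23) as restated in the first line of the block.

```latex
% Constraints of Eq. (3.1.23) as read here:
-m_p \le \frac{pl}{4}\lambda - \tfrac{1}{2}m_2 \le m_p , \qquad
-m_p \le m_2 \le m_p , \qquad
-m_p \le \frac{pl}{2}\lambda - \tfrac{1}{2}m_2 \le m_p .

% Upper bounds on \lambda from the first and third constraints:
\frac{pl}{4}\lambda \le m_p + \tfrac{1}{2}m_2 , \qquad
\frac{pl}{2}\lambda \le m_p + \tfrac{1}{2}m_2 .

% Since m_2 \ge -m_p, the right-hand side is positive and the second bound governs:
\lambda \le \frac{2m_p + m_2}{pl} .

% The bound grows with m_2, so m_2 takes its largest admissible value, m_2 = m_p:
\lambda_{\max} = \frac{3m_p}{pl} .

% Check of the two-sided constraints at (\lambda, m_2) = (3m_p/pl, \; m_p):
\frac{pl}{4}\lambda - \tfrac{1}{2}m_2 = \tfrac{1}{4}m_p , \qquad
\frac{pl}{2}\lambda - \tfrac{1}{2}m_2 = m_p .
```

At this point two of the three moments have reached the fully plastic value $m_p$ while the remaining one is well below it, which is what a single-span collapse mechanism would look like.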
Example 3.1.5

As an illustration of limit design for bending type problems, consider the well-known
problem of minimizing the weight of a plane frame to resist a given set of ultimate
loads. A single bay, single story portal frame is loaded by a horizontal and a vertical
load of magnitude p as shown in Figure 3.1.6. For this design problem the top horizontal
member is assumed to be different from the two vertical columns. Accordingly,
we assume the beam and the column cross-sections to have associated fully plastic
moments $m_{pB}$ and $m_{pC}$, respectively. These two plastic moments depend on the
cross-sectional properties of their respective members and, therefore, are the design
variables for the problem.

Figure 3.1.6 Portal frame design against plastic collapse.

In order to pose the problem as a weight minimization design problem, we need
to relate the design variables and the structural weight. Massonet and Save [6] have
shown that for beam sections in bending there is an approximately linear relation
between the weight per running foot, w, and the plastic section modulus, $m_p/\sigma_0$.
Over the relevant range of sections that may be expected to be used for a given
frame the error involved in this linearization is of the order of 1%. It is this single
assumption which renders the plastic design problem linear.
We will, therefore, assume that the problem of minimizing the weight of a frame
for a set of ultimate loads reduces to minimizing a function

(3.1.24)

In the interest of non-dimensionalization we divide both sides of Eq. (3.1.24) by $2pl^2$
to obtain the weight proportional objective function

$$ f(x_1, x_2) = 2x_1 + x_2 , \tag{3.1.25} $$

with $x_1 = m_{pC}/(pl)$ and $x_2 = m_{pB}/(pl)$.

Figure 3.1.7 Collapse mechanisms for the portal frame of Figure 3.1.6. The six mechanisms shown, with their virtual work conditions, are:

1. $4m_b \ge pl$
2. $2m_b + 2m_c \ge pl$
3. $2m_b + 2m_c \ge 2pl$
4. $4m_c \ge 2pl$
5. $4m_b + 2m_c \ge 3pl$
6. $2m_b + 4m_c \ge 3pl$

The equations of equilibrium can be obtained by using the same approach used
in the previous example. Figure 3.1.7 shows all possible collapse mechanisms for the
frame. The ultimate load carrying capacity of the structure for any given collapse
mechanism is obtained by the virtual work equivalence between the external work
of the applied loads and the internal work of the fully plastic moments experienced
while undergoing virtual rotations of the plastic hinges. Thus a permissible design
is one for which the capacity for internal virtual work is greater than or equal to
the external work. It is left as an exercise (see Exercise 4) to verify that the behavioral
constraints associated with the collapse mechanisms of Figure 3.1.7 reduce to

$$
\begin{aligned}
4x_2 &\ge 1, && \text{beam mechanism 1}, && (3.1.26) \\
2x_1 + 2x_2 &\ge 1, && \text{beam mechanism 2}, && (3.1.27) \\
x_1 + x_2 &\ge 1, && \text{sway mechanism 1}, && (3.1.28) \\
2x_1 &\ge 1, && \text{sway mechanism 2}, && (3.1.29) \\
2x_1 + 4x_2 &\ge 3, && \text{combined mechanism 1}, && (3.1.30) \\
4x_1 + 2x_2 &\ge 3, && \text{combined mechanism 2}. && (3.1.31)
\end{aligned}
$$

Furthermore, since $x_1$ and $x_2$ represent cross-sectional variables it is required that

$$ x_1 \ge 0 \qquad \text{and} \qquad x_2 \ge 0 . \tag{3.1.32} $$

Thus the problem of weight minimization under a set of ultimate loads has been
reduced to the determination of those non-negative values of $x_1$ and $x_2$ for which f
as given by Eq. (3.1.25) is minimized subject to the constraints of Eqs. (3.1.26) through (3.1.32).
The problem is clearly an LP problem. We will defer the analytical solution of this
problem until later. •••

3.2 Prestressed Concrete Design by Linear Programming

Since concrete is weak in tension, prestressing helps to eliminate undesirable tensile
stresses in concrete and thereby improve its resistance in bending. A prestressing
cable or a tendon exerts an eccentrically applied compressive load on the beam cross-section,
giving rise to an axial load and possibly a bending moment due to an offset in
the cable. In evaluating the total stresses at any given cross-section we must superimpose
the stresses due to dead and live loads on the stresses due to the eccentrically
applied prestressing forces of the tendons. For a beam of fixed cross-sectional dimensions,
the total cost of the beam may be assumed to be approximately proportional
to the cost of building in a desired prestressing force. The optimization problem for
the design of a prestressed beam thus reduces to minimizing the magnitude of the
prestressing force $f_0$.
Consider the following simple problem of the optimum design of the simply-supported
beam shown in Figure 3.2.1. The initial value of the prestressing force
$f_0$ and the eccentricity e are to be determined such that $f_0$ is a minimum subject to
constraints on normal stress, transverse displacement, and upper and lower bound
constraints on the design variables. Additionally, in designing a prestressed concrete
beam which is expected to remain in service for a number of years, we must allow for
the loss of prestressing force through time dependent shrinkage and creep effects in
concrete.

Figure 3.2.1 Simply supported post-tensioned beam (elevation and Section A-A, showing the cable eccentricity e).

To simplify design considerations it is frequently assumed that the realizable
prestressing force in service is a constant fraction $\alpha$ of the initial prestressing force $f_0$.
In calculating the bending moment distribution or the deflected shape of a prestressed
beam, in addition to the usual dead and live loads, we must allow for the equivalent
distributed loading (see Exercise 6a) and the end loads resulting from the curved
profile of the eccentrically placed tendons. It can be shown [7,8] that for parabolic
profiles of the cables (see Figure 3.2.1) the induced moments and deflections are
linearly related to the quantity $f_0 e$ with the constant of proportionality k being a
function of the known material and cross-sectional properties. With this assumption
the maximum stresses and deflections of a simply supported beam occur at the center
of the beam. If the maximum positive bending moment and maximum deflection at
the center of the simply-supported beam of Figure 3.2.1 due to external loads in
the ith loading condition are denoted by $m_{ei}$ and $\delta_{ei}$, respectively, then the beam
optimization problem reduces to

$$
\begin{aligned}
\text{minimize} \quad & f(f_0, e) = f_0 && (3.2.1) \\
\text{subject to} \quad & \sigma^{li} \le -\frac{\alpha f_0}{a} \pm \frac{m_{ei} - \alpha f_0 e}{z} \le \sigma^{ui} , && (3.2.2) \\
& \delta^{li} \le \delta_{ei} + \alpha k f_0 e \le \delta^{ui} , && (3.2.3) \\
& e^l \le e \le e^u , && (3.2.4) \\
& f_0 \ge 0, \qquad i = 1, \ldots, n_l . && (3.2.5)
\end{aligned}
$$

Here $n_l$ denotes the number of different loading conditions; $\sigma^l$, $\sigma^u$, $\delta^l$, $\delta^u$, $e^l$, and $e^u$
denote lower and upper bounds on stress, deflections and the tendon eccentricity;
a and z denote the effective area and the section modulus of the cross-section.
The problem as formulated by Eqs. (3.2.1) through (3.2.5) is not an LP problem
because it includes the product $f_0 e$ of the two variables. However, it can be easily
cast as one by letting

$$ m = f_0 e , \tag{3.2.6} $$

and expressing the problem in terms of the new design variables $f_0$ and m. The
transformed problem thus reduces to the following LP problem

$$
\begin{aligned}
\text{minimize} \quad & f(f_0, m) = f_0 && (3.2.7) \\
\text{subject to} \quad & \sigma^{li} \le -\frac{\alpha f_0}{a} \pm \frac{m_{ei} - \alpha m}{z} \le \sigma^{ui} , && (3.2.8) \\
& \delta^{li} \le \delta_{ei} + \alpha k m \le \delta^{ui} , && (3.2.9) \\
& m^l \le m \le m^u , && (3.2.10) \\
& f_0 \ge 0, \qquad i = 1, \ldots, n_l , && (3.2.11)
\end{aligned}
$$

with $m^l$ and $m^u$ being the lower and upper bounds on $f_0 e$.
Morris [9] has treated a similar problem, but with additional constraints on ultimate
moment capacity. He also modified the constraint (3.2.11) to allow for the American
Concrete Institute's limit on the prestressing force intended to prevent premature
failure of the beam by pure crushing of the concrete. Morris linearizes part of the
problem by using the reciprocal of the prestressing force as one of the design variables;
this transformation, however, fails to linearize the constraint on the ultimate moment
capacity. In the interest of linearization, this nonlinear constraint is replaced by a
series of piecewise linear connected chords with true values at chord intersections.
Kirsch [10] has shown that appropriate transformations can also be used to reduce
the design of continuous prestressed concrete beams to equivalent linear programming
problems. These problems involve not only the optimization of the prestressing
force and the tendon configuration, but also the optimization of the cross-sectional
dimensions of the beam.

3.3 Minimum Weight Design of Statically Determinate Trusses

As another example of the design problems that can be turned into LP problems
we consider the minimum weight design of statically determinate trusses under stress
and deflection constraints. The difficulty in these problems arises due to the nonlinear
nature of the deflections as a function of the design variables which are the cross-
sectional areas of the truss members. This type of problem, however, belongs to
the class of what is known as separable programming [11] problems. In this class of
programming the objective function and the constraints can be expressed as a sum
of functions of a single design variable. Each such function can be approximated by
a piecewise linear function or a set of connected line segments or chords interpolating
the actual function at the chord intersections.
A nonlinear separable function of n design variables,

$$ f = f_1(x_1) + f_2(x_2) + \cdots + f_n(x_n) , \tag{3.3.1} $$

can be linearized as

$$ f = \sum_{k=0}^{m} \eta_{1k} f_{1k} + \sum_{k=0}^{m} \eta_{2k} f_{2k} + \cdots + \sum_{k=0}^{m} \eta_{nk} f_{nk} , \tag{3.3.2} $$

with

$$ x_1 = \sum_{k=0}^{m} \eta_{1k} x_{1k}, \quad \ldots, \quad x_n = \sum_{k=0}^{m} \eta_{nk} x_{nk} , \tag{3.3.3} $$

$$ \sum_{k=0}^{m} \eta_{1k} = \sum_{k=0}^{m} \eta_{2k} = \cdots = \sum_{k=0}^{m} \eta_{nk} = 1 , \tag{3.3.4} $$

$$ \eta_{jk} \ge 0, \qquad j = 1, \ldots, n, \quad \text{and} \quad k = 0, 1, \ldots, m . \tag{3.3.5} $$

Here $f_{jk}$ and $x_{jk}$ are the values of the functions $f_1, f_2, \ldots, f_n$ and the design variables
$x_1, x_2, \ldots, x_n$ at m + 1 preselected points along each of the design variables,
and the $\eta_{jk}$ are the interpolation functions for the design variables. Note that the
number of intervals, m, selected for each design variable can, in general, be different
($m_1, m_2, \ldots, m_n$), but for the sake of simplicity they are taken to be equal here.

The purpose of using m intervals with m + 1 values of the design variables is to
cover the entire range of the possible design space accurately. The number of segments
m decides the degree of approximation to the original problem; the larger the m,
the closer the solution of the linear problem will be to the true solution. However, at
any given design point, a linear approximation to a nonlinear function requires only
the value of the function at two values of a design variable. We, therefore, require
that for every design variable j (j = 1, ..., n), at most two adjacent $\eta_{jk}$ be positive.
This implies that if, for example, $\eta_{pq}$ and $\eta_{p(q+1)}$ are non-zero with all other $\eta_{pk}$ zero,
then the value of $x_p$ is in the interval between $x_{pq}$ and $x_{p(q+1)}$ and is given by

$$ x_p = \eta_{pq} x_{pq} + \eta_{p(q+1)} x_{p(q+1)}, \qquad \text{with} \quad \eta_{pq} + \eta_{p(q+1)} = 1 . \tag{3.3.6} $$

The variables, $(x_1, \ldots, x_n)$, of the function have thus been replaced by the interpolation
functions, $\eta_{jk}$, at most two of which are allowed to be non-zero for each of the
design variables. Therefore, we have a linear approximation to the function at every
design variable.

Example 3.3.1

As an illustration we consider a problem similar to the one solved by Majid [12]. The
objective is the minimum weight design of the four bar statically determinate truss
shown in Figure 3.3.1 with stress constraints in the members and a displacement
constraint at the tip joint of the truss. In order to simplify the problem we assume
members 1 through 3 to have the same cross-sectional area $A_1$, and member 4 the
area $A_2$. Under the specified loading, the member forces and the vertical displacement
at joint 2 can easily be verified to be

(3.3.7)

(3.3.8)


Figure 3.3.1 Four bar statically determinate truss.


where negative values for the forces denote compression. Allowable stresses in tension
and compression are assumed to be $7.73 \times 10^{-4}E$ and $4.833 \times 10^{-4}E$, respectively,
and the vertical tip displacement is constrained to be no greater than $3 \times 10^{-3}l$. The
problem of the minimum weight design subject to stress and displacement constraints
can be formulated in terms of the non-dimensional variables

and (3.3.9)

as

$$
\begin{aligned}
\text{minimize} \quad & f(x_1, x_2) = \frac{3}{x_1} + \frac{\sqrt{3}}{x_2} && (3.3.10) \\
\text{subject to} \quad & 18x_1 + 6\sqrt{3}x_2 \le 3, && (3.3.11) \\
& 0.05 \le x_1 \le 0.1546, && (3.3.12) \\
& 0.05 \le x_2 \le 0.1395, && (3.3.13)
\end{aligned}
$$

where lower bound limits on $x_1$ and $x_2$ have been assumed to be 0.05. Except for the
objective function which is a separable nonlinear function, the rest of the problem is
linear. The objective function can be put in a piecewise linear form by using Eqs.
(3.3.2) and (3.3.3). For the purpose of demonstration, we divide the design variable
intervals of Eqs. (3.3.12) and (3.3.13) into two equal segments (m = 2) resulting in

$$ x_{10} = 0.05, \qquad x_{11} = 0.1023, \qquad x_{12} = 0.1546, $$
$$ x_{20} = 0.05, \qquad x_{21} = 0.09475, \qquad x_{22} = 0.1395 . $$

Objective function values corresponding to these points are

$$ f_{10} = 20, \qquad f_{11} = 9.76, \qquad f_{12} = 6.47, $$
$$ f_{20} = 34.64, \qquad f_{21} = 18.28, \qquad f_{22} = 12.42 . $$

Therefore, the linearized objective function is

$$ f(x_1, x_2) = 20\eta_{10} + 9.76\eta_{11} + 6.47\eta_{12} + 34.64\eta_{20} + 18.28\eta_{21} + 12.42\eta_{22} . $$

After substituting

$$ x_1 = 0.05\eta_{10} + 0.1023\eta_{11} + 0.1546\eta_{12}, $$
$$ x_2 = 0.05\eta_{20} + 0.09475\eta_{21} + 0.1395\eta_{22}, $$

into the constraint equations (3.3.11) through (3.3.13), a standard LP algorithm
can be applied with the additional stipulation that at most two adjacent $\eta_{ik}$ for every
design variable $x_i$ be positive. •••
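The piecewise linear interpolation behind Eqs. (3.3.2) through (3.3.6) can be tried out on one term of the objective. The sketch below is illustrative only and its function names are hypothetical. It builds the m + 1 breakpoints for $x_2$, evaluates the term $\sqrt{3}/x_2$ there, and interpolates with at most two adjacent non-zero weights. Only the $x_2$ term is checked, because the tabulated values $f_{2k}$ match $\sqrt{3}/x_{2k}$, whereas the tabulated $f_{1k}$ (20, 9.76, 6.47) are close to $1/x_{1k}$ rather than to the $3/x_{1k}$ suggested by Eq. (3.3.10) as printed.

```python
import math


def breakpoints(lower, upper, m):
    """Split [lower, upper] into m equal segments, giving m + 1 breakpoints."""
    step = (upper - lower) / m
    return [lower + k * step for k in range(m + 1)]


def interpolation_weights(x, points):
    """Weights eta_k with at most two adjacent non-zero entries.

    The weights satisfy sum(eta) = 1 and sum(eta_k * points[k]) = x,
    as required by Eqs. (3.3.3), (3.3.4) and (3.3.6).
    """
    eta = [0.0] * len(points)
    for q in range(len(points) - 1):
        lo, hi = points[q], points[q + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            eta[q], eta[q + 1] = 1.0 - t, t
            return eta
    raise ValueError("x lies outside the breakpoint range")


def piecewise_linear(x, points, values):
    """Chord approximation sum(eta_k * f_k) of one separable term."""
    eta = interpolation_weights(x, points)
    return sum(e * v for e, v in zip(eta, values))


x2_points = breakpoints(0.05, 0.1395, 2)
f2_values = [math.sqrt(3.0) / x for x in x2_points]
print([round(x, 5) for x in x2_points], [round(v, 2) for v in f2_values])

x_trial = 0.07
approx = piecewise_linear(x_trial, x2_points, f2_values)
exact = math.sqrt(3.0) / x_trial
print(round(approx, 2), round(exact, 2))
```

Because $\sqrt{3}/x_2$ is convex, the chords lie above the curve, so the linearized objective overestimates the true one between breakpoints; a larger m narrows the gap.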

3.4 Graphical Solutions of Simple LP Problems

For simple problems with no more than two design variables a graphical solution
technique may be used to find the solution of an LP problem. A graphical method
not only gives a solution, but also helps us to understand the nature of LP problems.
The following example is included in order to illustrate the nature of the design space
and the optimal solution.

Example 3.4.1

Consider the portal frame limit design problem of Example 3.1.5. The problem was
reduced to minimizing the objective function

$$ f(x_1, x_2) = 2x_1 + x_2 , \tag{3.4.1} $$

subject to inequality constraints Eqs. (3.1.26) through (3.1.32).
Since the problem is an LP problem in two-dimensional space it is possible to obtain
a graphical solution. Constraints (3.1.32) imply that we can restrict ourselves to the
non-negative quadrant of the $x_1$-$x_2$ plane in Figure 3.4.1. We plot all the straight
lines corresponding to Eqs. (3.1.26) through (3.1.31) as strict equalities (these lines
identify the constraint boundaries). To identify the feasible and the infeasible portions
on either side of a given constraint line we choose a point on one side and substitute
its coordinates in the inequality. If the inequality is satisfied then the portion on the
side of the constraint line which contains this point is the feasible portion; if not, it is
infeasible. For example, if the coordinates $x_1 = 0$ and $x_2 = 0$ are substituted into the
inequality (3.1.27), the inequality is violated, implying that the origin does not belong
to the feasible domain. If we continue this process for all the inequality constraints we
will soon end up with a feasible region that is a convex polygon; the corners are called
extreme points. The feasible region corresponding to the constraints is illustrated in
Figure 3.4.1.

Figure 3.4.1 Graphical solution of the portal frame LP problem.

Next, we plot the contours of the objective function by setting the function $2x_1 + x_2$
equal to a constant and plotting the lines corresponding to various values of this
constant. The optimum point is obtained by finding the contour of the objective
function which just barely touches the feasible region. The direction of decreasing f
is shown in Figure 3.4.1 with the optimum solution identified as

$$ x_1 = x_2 = 1/2 , \tag{3.4.2} $$

with $f_{min} = 1.5$. •••
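The graphical procedure of Example 3.4.1 can be mimicked numerically by listing the corners of the feasible region and comparing the objective at each of them. The sketch below is illustrative, with hypothetical function names; it reads the mechanism constraints (3.1.26) through (3.1.31) as "greater than or equal to" inequalities, adds $x_1, x_2 \ge 0$, and uses exact rational arithmetic. The feasible region is unbounded, so checking only the corners is justified here by the fact that both cost coefficients are positive and the objective therefore cannot decrease along any unbounded direction of the region.

```python
from fractions import Fraction
from itertools import combinations

# Each row (a1, a2, b) encodes a1*x1 + a2*x2 >= b.
CONSTRAINTS = [
    (0, 4, 1),  # beam mechanism 1,     Eq. (3.1.26)
    (2, 2, 1),  # beam mechanism 2,     Eq. (3.1.27)
    (1, 1, 1),  # sway mechanism 1,     Eq. (3.1.28)
    (2, 0, 1),  # sway mechanism 2,     Eq. (3.1.29)
    (2, 4, 3),  # combined mechanism 1, Eq. (3.1.30)
    (4, 2, 3),  # combined mechanism 2, Eq. (3.1.31)
    (1, 0, 0),  # x1 >= 0
    (0, 1, 0),  # x2 >= 0
]


def extreme_points(constraints):
    """Intersect every pair of constraint lines and keep the feasible points."""
    points = set()
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:  # parallel boundaries have no intersection
            continue
        x1 = Fraction(b * c2 - a2 * d, det)
        x2 = Fraction(a1 * d - b * c1, det)
        if all(p * x1 + q * x2 >= r for p, q, r in constraints):
            points.add((x1, x2))
    return points


def objective(point):
    """Weight-proportional objective f = 2*x1 + x2 of Eq. (3.4.1)."""
    x1, x2 = point
    return 2 * x1 + x2


vertices = extreme_points(CONSTRAINTS)
best = min(vertices, key=objective)
for v in sorted(vertices):
    print(v, objective(v))
print("optimum:", best, objective(best))
```

With these constraints the region has two corners, and the smaller objective value occurs at $x_1 = x_2 = 1/2$ with $f = 3/2$, in agreement with Eq. (3.4.2).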


Barring degeneracy, the optimum solution in an LP problem will always lie at a
corner or an extreme point. The degenerate case may occur when the gradient of the
objective function is a constant multiple of the gradient of one of the constraints along
which the optimum solution lies. Then, every point along this constraint including
the extreme points constitutes an optimum solution. For example if the problem just
discussed had an objective function of the type

(3.4.3)

with c being a constant, then every point along the line [a,b] in Figure 3.4.1 would
constitute an optimum solution.
The concept of a convex polygon with corners or vertices in two dimensions
generalizes to a convex polytope with extreme points in $R^n$. For example, a convex
polytope [11] is defined to be the set which is obtained by the intersection of a finite
number of closed half-spaces. Similarly, an extreme point of a set is defined to be a
point x in $R^n$ which cannot be expressed as a convex combination $\theta x_1 + (1 - \theta)x_2$
($0 < \theta < 1$) of two distinct points $x_1$ and $x_2$ belonging to the set. Finally, as in the
two-dimensional case of Figure 3.4.1, barring degeneracy, a linear objective function
in $R^n$ achieves its minimum only at an extreme point of a bounded convex polytope.

Interested readers are advised to consult either Ref. 11 or 13 for a comprehensive


treatise on this subject.
It is obvious that the above graphical procedure cannot be used for linear programming
problems involving more than two variables. We have to look at alternative
means of solving such problems. The simplex method first proposed by Dantzig [13]
is an efficient method for solving problems with a large number of variables and constraints.
We will study the simplex method next and to this end we outline a few
definitions and some very important concepts in linear programming.

3.5 A Linear Program in a Standard Form

A linear program is said to be in a standard form if it is posed as

$$
\begin{aligned}
\text{minimize} \quad & f = \mathbf{c}^T\mathbf{x} && (3.5.1) \\
\text{subject to} \quad & \mathbf{A}\mathbf{x} = \mathbf{b}, && (3.5.2) \\
& \mathbf{x} \ge \mathbf{0}, && (3.5.3)
\end{aligned}
$$

where c is an n × 1 vector, A is an m × n matrix, and b is an m × 1 vector. Any
linear program including inequality constraints can be put into the standard form by
the use of what are known as slack and surplus variables. Consider, for example, the
linear program defined by Eqs. (3.1.26) through (3.1.32). We can transform those
inequalities into strict equalities as

$$
\begin{aligned}
4x_2 - x_3 &= 1, && (3.5.4) \\
2x_1 + 2x_2 - x_4 &= 1, && (3.5.5) \\
x_1 + x_2 - x_5 &= 1, && (3.5.6) \\
2x_1 - x_6 &= 1, && (3.5.7) \\
2x_1 + 4x_2 - x_7 &= 3, && (3.5.8) \\
4x_1 + 2x_2 - x_8 &= 3, && (3.5.9)
\end{aligned}
$$

by the introduction of the surplus variables $x_3$ through $x_8$, provided that these variables
are restricted to be non-negative, that is

$$ x_i \ge 0, \qquad i = 1, \ldots, 8 . \tag{3.5.10} $$

If the inequalities in Eqs. (3.1.26) through (3.1.31) were of the opposite kind we
would add non-negative variables $x_3$ through $x_8$ to achieve equality constraints. In
this case the variables $x_3$ through $x_8$ would be referred to as slack variables. If the
original design variables are not required to be non-negative we can still
convert the problem to the standard form of Eqs. (3.5.1) through (3.5.3) by defining
either
$$x_1 = u_1 - v_1, \quad \text{and} \quad x_2 = u_2 - v_2, \tag{3.5.11}$$
where $u_1, u_2, v_1, v_2 \ge 0$, or by adding a large enough positive constant $M$ to each design
variable
$$x_i' = x_i + M, \tag{3.5.12}$$


so that the new variable never becomes negative during the design. Such artificial
variables are often used in structural design problems where quantities such as stresses
are used as design variables. Stresses can be either positive or negative depending upon
the loading condition. It is clear from Eq. (3.5.11) that putting an LP problem in
standard form may increase the dimension of the design space. Using
Eq. (3.5.12) does not increase the dimension of the problem, but it may be difficult to
know a priori a value of the constant $M$ that will keep the design variable non-negative
(the choice of a very large number may result in numerical ill-conditioning).
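The conversion to equality constraints described above can be sketched in a few lines of Python. This is only an illustration under our own naming: the helper `to_standard_form` and its `senses` argument are hypothetical, not part of any library. It appends a column of +1 (slack) for each "<=" row and a column of -1 (surplus) for each ">=" row, and is applied here to the coefficients of Eqs. (3.5.4) through (3.5.9).

```python
def to_standard_form(A, b, senses):
    """Append one column per inequality row: +1 (slack) for '<=', -1 (surplus) for '>='.
    Rows marked '=' are left unchanged. All variables are assumed non-negative."""
    ineq_rows = [i for i, s in enumerate(senses) if s in ("<=", ">=")]
    A_std = []
    for i, row in enumerate(A):
        extra = [0] * len(ineq_rows)
        if i in ineq_rows:
            extra[ineq_rows.index(i)] = 1 if senses[i] == "<=" else -1
        A_std.append(list(row) + extra)
    return A_std, list(b)

# Coefficients and right-hand sides as they appear in Eqs. (3.5.4)-(3.5.9)
A = [[0, 4], [2, 2], [1, 1], [2, 0], [2, 4], [4, 2]]
b = [1, 1, 1, 1, 3, 3]
A_std, b_std = to_standard_form(A, b, [">="] * 6)
```

Each row of `A_std` now has eight entries, two for the original design variables and six for the surplus variables $x_3$ through $x_8$.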
Going back to Eq. (3.5.2) we notice that if m = n and all the equations are
linearly independent, we have a unique solution to the system of equations, whereas
with m > n we have, in general, an inconsistent system of equations. It is only when
m < n that we have many possible solutions. Of all these solutions we seek the one
which satisfies the non-negativity constraints and minimizes the objective function
f.
3.5.1 Basic Solution

We assume the rank of the matrix $\mathbf{A}$ to be $m$ and select from the $n$ columns of $\mathbf{A}$ a
set of $m$ linearly independent columns. We denote this $m \times m$ matrix by $\mathbf{D}$. Then
$\mathbf{D}$ is non-singular and we can obtain the solution

$$\underset{m \times 1}{\mathbf{x}_D} = \underset{m \times m}{\mathbf{D}^{-1}} \; \underset{m \times 1}{\mathbf{b}_D}, \tag{3.5.13}$$

where $\mathbf{x}_D$ is the vector of independent variables and $\mathbf{b}_D$ is the corresponding
right-hand vector. Thus it can easily be verified that

$$\mathbf{x} = \begin{Bmatrix} \mathbf{x}_D \\ \mathbf{0} \end{Bmatrix} \tag{3.5.14}$$

is a solution of the system of Eqs. (3.5.2). Such a solution is known as a basic
solution, and $\mathbf{x}_D$ is called the vector of basic variables. A basic solution, however,
need not satisfy the non-negativity constraints (3.5.3). Those basic solutions which
do indeed satisfy these constraints are known as basic feasible solutions and can be
shown to be extreme points. In other words all basic feasible solutions to Eqs. (3.5.2)
will correspond to corners or extreme points of the convex polytope [13].
The total number of possible basic solutions to Eqs. (3.5.2) can be estimated
by identifying the number of possibilities for selecting m variables arbitrarily from a
group of n variables. From the theory of permutations and combinations we know
this number to be

$$\binom{n}{m} = \frac{n!}{m!\,(n-m)!}. \tag{3.5.15}$$

Not all of these possibilities will however be feasible.
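The counting argument can be illustrated by brute force. The following sketch enumerates every choice of $m$ columns, solves the corresponding square system exactly, and sets the remaining variables to zero. The $2 \times 4$ system used here is a made-up example of ours, and the helper names `solve_square` and `basic_solutions` are likewise assumptions introduced only for illustration.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(D, b):
    """Solve D x = b by Gauss-Jordan elimination; return None if D is singular."""
    m = len(D)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(D, b)]
    for col in range(m):
        piv = next((r for r in range(col, m) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

def basic_solutions(A, b):
    """Return (column indices, x) for every non-singular choice of m columns of A."""
    m, n = len(A), len(A[0])
    out = []
    for cols in combinations(range(n), m):
        D = [[A[i][j] for j in cols] for i in range(m)]
        xD = solve_square(D, b)
        if xD is None:
            continue                      # columns are linearly dependent
        x = [Fraction(0)] * n             # non-basic variables stay at zero
        for j, v in zip(cols, xD):
            x[j] = v
        out.append((cols, x))
    return out

# Hypothetical system: x1 + x2 + x3 = 4,  x1 + 3 x2 + x4 = 6
A = [[1, 1, 1, 0], [1, 3, 0, 1]]
b = [4, 6]
solutions = basic_solutions(A, b)
feasible = [(cols, x) for cols, x in solutions if all(v >= 0 for v in x)]
```

For this small system all six column pairs are linearly independent, so there are six basic solutions, of which four are non-negative and therefore basic feasible solutions.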

3.6 The Simplex Method

The idea of the simplex method is to continuously decrease the value of the
objective function by going from one basic feasible solution to another until the
minimum value of the objective function is achieved. We will postpone the discussion
of how to generate a basic feasible solution and assume that we have a basic feasible
solution to start the algorithm. Indeed, if we had the following inequality constraints

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n \le b_i, \quad i = 1, \ldots, m, \tag{3.6.1}$$

$$x_j \ge 0, \quad j = 1, \ldots, n, \tag{3.6.2}$$
where $b_i \ge 0$ for every constraint, then the process of converting the constraint set
to the standard form yields the following

$$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n + y_i = b_i, \quad i = 1, \ldots, m, \tag{3.6.3}$$

$$x_j \ge 0, \quad j = 1, \ldots, n, \tag{3.6.4}$$
$$y_i \ge 0, \quad i = 1, \ldots, m, \tag{3.6.5}$$
and we immediately recognize

$$y_i = b_i, \quad i = 1, \ldots, m, \quad \text{and} \quad x_j = 0, \quad j = 1, \ldots, n, \tag{3.6.6}$$


as a basic feasible solution. A formal scheme for generating a basic feasible solution
will be discussed later in this section. The question of immediate interest at this
point is how to go from one basic feasible solution to another basic feasible solution.
Without loss of generality let us assume that we have a system of equations in the
canonical form shown below (such forms can always be obtained through the well-
known Gauss elimination scheme for a matrix A with rank m).

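$$\begin{array}{rcl}
x_1 + 0 + \cdots + 0 + \cdots + 0 + a_{1,m+1}\,x_{m+1} + \cdots + a_{1,n}\,x_n &=& b_1 \\
0 + x_2 + \cdots + 0 + \cdots + 0 + a_{2,m+1}\,x_{m+1} + \cdots + a_{2,n}\,x_n &=& b_2 \\
\vdots & & \vdots \\
0 + 0 + \cdots + x_s + \cdots + 0 + a_{s,m+1}\,x_{m+1} + \cdots + a_{s,n}\,x_n &=& b_s \\
\vdots & & \vdots \\
0 + 0 + \cdots + 0 + \cdots + x_m + a_{m,m+1}\,x_{m+1} + \cdots + a_{m,n}\,x_n &=& b_m
\end{array} \tag{3.6.7}$$
with a basic feasible solution

$$x_1 = b_1, \quad x_2 = b_2, \quad \ldots, \quad x_s = b_s, \quad \ldots, \quad x_m = b_m, \qquad x_{m+1} = x_{m+2} = \cdots = x_n = 0. \tag{3.6.8}$$
The variables $x_1$ through $x_m$ are called basic and $x_{m+1}$ through $x_n$ are called
non-basic variables.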

3.6.1 Changing the Basis

The simplex procedure changes the set of basic variables while improving the ob-
jective function at the same time. However, for the purpose of clarity we will first
demonstrate the approach for going from one basic feasible solution to another. The
objective function improvement will be discussed in the following section.
We wish to make one of the current non-basic variables of Eq. (3.6.7), say $x_t$ ($m < t \le n$),
basic and in the process cause a basic variable, $x_s$ ($1 \le s \le m$), to become
non-basic. At this point we assume that we know the variable $x_t$ which we will bring
into the basic set. We only need to decide which variable to drop from the basic set.
Consider the selected terms shown below for the coefficients of the $s$th equation and
an additional arbitrary $i$th equation.

$$\begin{array}{c|ccc|c}
 & x_i & x_s & x_t & \\ \hline
i & 1 & 0 & a_{it} & = b_i \\
s & 0 & 1 & a_{st} & = b_s
\end{array} \tag{3.6.9}$$

Since we want to make $x_t$ basic, we need to eliminate it from all of the equations
except the $s$th one by reducing the coefficients $a_{it}$ ($i = 1, \ldots, m$; $i \ne s$) to zero, and
to make the coefficient $a_{st}$ unity by dividing the $s$th equation throughout by $a_{st}$. We
can do this only if $a_{st}$ is non-zero. Also, unless $a_{st}$ is positive, dividing
the $s$th equation by $a_{st}$ will produce a negative term on the right-hand side, since
$b_s$ is positive because the current solution is a basic feasible solution. To eliminate
the new basic variable $x_t$ from the $i$th equation ($i = 1, \ldots, m$; $i \ne s$) we
multiply the $s$th equation by the factor $(a_{it}/a_{st})$ and subtract the resulting equation
from the $i$th equation. The resulting right-hand side of the
$i$th equation will be
$$b_i' = b_i - b_s \left( \frac{a_{it}}{a_{st}} \right). \tag{3.6.10}$$
To guarantee that the resulting solution is a basic feasible solution we must require
that $b_i' \ge 0$, or, rearranging Eq. (3.6.10) for the rows with $a_{it} > 0$, we have

$$\left( \frac{b_s}{a_{st}} \right) \le \left( \frac{b_i}{a_{it}} \right). \tag{3.6.11}$$

Equation (3.6.11) together with the condition
$$a_{st} > 0, \tag{3.6.12}$$
are the two conditions which identify possible $s$th rows in changing from one basic
feasible solution to another. Thus for a given non-basic variable $x_t$ that is to be
made basic we check the coefficients of all the terms in the $t$th column. We eliminate
from consideration all elements in the $t$th column with non-positive coefficients as
violating condition (3.6.12). Among those with positive coefficients we compute the
ratios $b_i / a_{it}$ ($i = 1, \ldots, m$). We select the row, $s$, for which the ratio $b_i / a_{it}$ has the
smallest value and call it $b_s / a_{st}$, Eq. (3.6.11). It is the basic variable corresponding
to that row which will become non-basic in the process of making $x_t$ basic.


Example 3.6.1

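We illustrate the foregoing discussion with an example. Consider the system of
equations
$$\begin{aligned}
2x_1 + 2x_2 + x_3 &= 6, \\
3x_1 + 4x_2 + x_4 &= 10, \\
x_1 + 2x_2 + x_5 &= 4.
\end{aligned} \tag{3.6.13}$$

The system is already in the canonical form with a basic feasible solution being

$$x_1 = x_2 = 0, \quad x_3 = 6, \quad x_4 = 10, \quad x_5 = 4. \tag{3.6.14}$$

The variables $x_1$ and $x_2$ are the non-basic variables, whereas $x_3$, $x_4$, and $x_5$ are the
basic variables. Now, let us assume that we want to make $x_1$ basic. Rewriting Eqs.
(3.6.13) in matrix form we have

$$\begin{bmatrix} 2 & 2 & 1 & 0 & 0 \\ 3 & 4 & 0 & 1 & 0 \\ 1 & 2 & 0 & 0 & 1 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{Bmatrix} =
\begin{Bmatrix} 6 \\ 10 \\ 4 \end{Bmatrix}. \tag{3.6.15}$$

Since $x_1$ is to be made basic we consider the first column. To choose the variable to be
made non-basic we form the ratios $(b_i / a_{i1})$, $i = 1, 2, 3$:

$$\frac{b_1}{a_{11}} = 3, \qquad \frac{b_2}{a_{21}} = 3\tfrac{1}{3}, \qquad \frac{b_3}{a_{31}} = 4.$$

The smallest ratio is $b_1 / a_{11}$ and so we pivot on $a_{11}$. Thus the new system of equations
is

$$\begin{bmatrix} 1 & 1 & 1/2 & 0 & 0 \\ 0 & 1 & -3/2 & 1 & 0 \\ 0 & 1 & -1/2 & 0 & 1 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{Bmatrix} =
\begin{Bmatrix} 3 \\ 1 \\ 1 \end{Bmatrix}, \tag{3.6.16}$$

and the process of making $x_1$ basic has resulted in the variable $x_3$ becoming non-basic.
The new basic feasible solution is

$$x_1 = 3, \quad x_4 = 1, \quad x_5 = 1.$$

It may be verified by the reader that by using a pivot other than $a_{11}$ we would end
up with an infeasible basic solution. For example, if $a_{31}$ is the pivot we obtain

$$x_1 = 4, \quad x_3 = -2, \quad x_4 = -2,$$

which is not feasible since $x_3 < 0$ and $x_4 < 0$. •••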
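The ratio test and the pivoting operation of this example can be sketched in Python as follows. The function names `choose_pivot_row` and `pivot` are our own illustrative choices, and exact fractions are used so that the numbers can be compared directly with the hand calculation.

```python
from fractions import Fraction

def choose_pivot_row(T, t):
    """Ratio test: among rows with a positive entry in column t,
    return the row with the smallest ratio b_i / a_it (None if no entry is positive)."""
    best_row, best_ratio = None, None
    for i, row in enumerate(T):
        if row[t] > 0:
            ratio = row[-1] / row[t]
            if best_ratio is None or ratio < best_ratio:
                best_row, best_ratio = i, ratio
    return best_row

def pivot(T, s, t):
    """Return a new augmented matrix after pivoting on element (s, t)."""
    p = T[s][t]
    new_s = [v / p for v in T[s]]                 # make the pivot element unity
    out = []
    for i, row in enumerate(T):
        if i == s:
            out.append(new_s)
        else:                                     # eliminate x_t from the other rows
            out.append([v - row[t] * w for v, w in zip(row, new_s)])
    return out

# Augmented matrix [A | b] of Eqs. (3.6.13); the last column is the right-hand side
T = [[Fraction(v) for v in row] for row in (
    [2, 2, 1, 0, 0, 6],
    [3, 4, 0, 1, 0, 10],
    [1, 2, 0, 0, 1, 4],
)]

s = choose_pivot_row(T, 0)                        # bring x1 (column index 0) into the basis
rhs_good = [row[-1] for row in pivot(T, s, 0)]    # right-hand sides after the proper pivot
rhs_bad = [row[-1] for row in pivot(T, 2, 0)]     # pivoting on a_31 instead
```

With the ratio test the right-hand sides stay non-negative, whereas pivoting on the third row produces negative entries and hence an infeasible basic solution.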


3.6.2 Improving the Objective Function

In the preceding section we considered making a particular non-basic variable $x_t$ basic
without losing feasibility. We also need to decide which variable to make basic.
We should bring into the basis only a variable which will decrease the
objective function while yielding at the same time a basic feasible solution. Notice
that the objective function is a linear equation just like the other equations and hence
it can be included with the others. The objective function equation may be written
as
$$c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = f. \tag{3.6.17}$$
Assume the system of equations (3.5.2) is in the canonical form, and append Eq.
(3.6.17) at the end of all other equations. The form of the equations that includes
the objective function is often referred to as the simplex tableau. We now eliminate all
the basic variables from this last equation by subtracting $c_i$ times the $i$th equation
of the canonical form, for $i = 1, \ldots, m$. Then the right-hand side of Eq. (3.6.17) becomes
$(f - c_1 b_1 - c_2 b_2 - c_3 b_3 - \cdots - c_m b_m)$. Thus if we ignore the presence of $f$, the right-hand side represents
the negative of the value of the objective function since $x_{m+1} = x_{m+2} = \cdots = x_n = 0$.
The left-hand side of this last equation will contain only non-basic variables. Next,
assume that the coefficient of one of the non-basic variables on the left-hand side of
the last equation is negative. If we make this variable basic then we will increase
the value of this variable from its present value of zero to some positive value. Since
the last equation is just one of the equations, when we pivot on one of the equations
(sth) and eliminate the corresponding variable (xs) from the basic set we perform
the operations described in the previous section on all the m + 1 equations. When
the particular variable with the negative coefficient in the last equation is eliminated,
the right-hand side of this equation will increase since the variable has increased in
value from zero to a positive value. Since the right-hand side represents the negative
of the value of the objective function, a function decrease is therefore guaranteed.
Thus the criterion for guaranteeing an improvement of the objective function is to
bring into the basis a variable that has a negative coefficient in the objective function
equation after it has been cleared of all the basic variables. This can be verified by
the following example.
Example 3.6.2

$$\text{minimize} \quad f = x_1 + x_2 + x_3 \tag{3.6.18}$$
$$\text{subject to} \quad 2x_1 + 2x_2 + x_3 = 6, \tag{3.6.19}$$
$$3x_1 + 4x_2 + x_4 = 10, \tag{3.6.20}$$
$$x_1 + 2x_2 + x_5 = 4. \tag{3.6.21}$$
As mentioned above we rewrite the constraint equations (3.6.19) through (3.6.21) in matrix form
together with the objective function appended as the last row of the matrix

$$\begin{bmatrix} 2 & 2 & 1 & 0 & 0 \\ 3 & 4 & 0 & 1 & 0 \\ 1 & 2 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{Bmatrix} =
\begin{Bmatrix} 6 \\ 10 \\ 4 \\ 0 \end{Bmatrix}. \tag{3.6.22}$$

A basic solution is

$$x_1 = x_2 = 0, \quad x_3 = 6, \quad x_4 = 10, \quad x_5 = 4. \tag{3.6.23}$$

The variable $x_3$ is a basic variable that appears in the last equation of Eqs. (3.6.22)
and must be eliminated from it so that its right-hand side yields the negative of the
current value of the objective function.

$$\begin{bmatrix} 2 & 2 & 1 & 0 & 0 \\ 3 & 4 & 0 & 1 & 0 \\ 1 & 2 & 0 & 0 & 1 \\ -1 & -1 & 0 & 0 & 0 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{Bmatrix} =
\begin{Bmatrix} 6 \\ 10 \\ 4 \\ -6 = -f \end{Bmatrix}. \tag{3.6.24}$$
We can pivot either on column (1) or column (2). That is to say the objective function
will decrease in value by bringing either $x_1$ or $x_2$ into the basis. If we pivot on column
(1) (bringing $x_1$ into the basis) the pivot element is $a_{11}$ because it yields the smallest
$(b_i / a_{i1})$ ratio. The new simplex tableau becomes

$$\begin{bmatrix} 1 & 1 & 0.5 & 0 & 0 \\ 0 & 1 & -1.5 & 1 & 0 \\ 0 & 1 & -0.5 & 0 & 1 \\ 0 & 0 & 0.5 & 0 & 0 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{Bmatrix} =
\begin{Bmatrix} 3 \\ 1 \\ 1 \\ -3 = -f \end{Bmatrix}. \tag{3.6.25}$$

The value of the objective function has been reduced from 6 to 3. Since the last
equation contains no non-basic variable with a negative coefficient, it is no longer
possible to decrease the value of the objective function further. Thus the minimum
value of the objective function is 3 and corresponds to the basic solution

$$x_1 = 3, \quad x_4 = 1, \quad x_5 = 1. \tag{3.6.26}$$

If we had decided to bring $x_2$ into the basis first, we would have reduced the objective
function from 6 to 4, and there would have been a negative number in the last equation
in the first column indicating the need for another round of pivoting to bring $x_1$ into
the basis. •••
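The complete iteration, choosing the entering column from the last row and the leaving row from the ratio test, can be sketched as follows. This is a minimal illustration rather than a robust solver: the function name `simplex` is ours, the entering column is taken as the one with the most negative coefficient, and no safeguard against cycling in degenerate problems is included.

```python
from fractions import Fraction

def simplex(T):
    """Tableau simplex iteration. T holds the m constraint rows followed by the
    objective row, already cleared of basic variables; the last column is the
    right-hand side. Returns the final tableau."""
    T = [[Fraction(v) for v in row] for row in T]
    m = len(T) - 1
    while True:
        obj = T[-1][:-1]
        t = min(range(len(obj)), key=lambda j: obj[j])     # most negative coefficient
        if obj[t] >= 0:
            return T                                        # no negative coefficient left
        rows = [i for i in range(m) if T[i][t] > 0]
        if not rows:
            raise ValueError("objective is unbounded below")
        s = min(rows, key=lambda i: T[i][-1] / T[i][t])     # ratio test, Eq. (3.6.11)
        p = T[s][t]
        T[s] = [v / p for v in T[s]]
        for i in range(m + 1):                              # includes the objective row
            if i != s:
                f = T[i][t]
                T[i] = [v - f * w for v, w in zip(T[i], T[s])]

# Tableau of Eq. (3.6.24): three constraint rows and the cleared objective row
final = simplex([
    [2, 2, 1, 0, 0, 6],
    [3, 4, 0, 1, 0, 10],
    [1, 2, 0, 0, 1, 4],
    [-1, -1, 0, 0, 0, -6],
])
f_min = -final[-1][-1]                  # the last right-hand side holds -f
rhs = [row[-1] for row in final[:-1]]   # values of the basic variables
```

Started from the tableau of Eq. (3.6.24), the sketch performs a single pivot on $a_{11}$ and stops with $f = 3$.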
This would have completed the discussion of the simplex method except for the
fact that we need a basic feasible solution to start the simplex method and we may
not have one readily available. This is our next topic.

3.6.3 Generating a Basic Feasible Solution-Use of Artificial Variables

In the process of converting an LP problem given in the form of Eqs. (3.6.1) and
(3.6.2),
$$\mathbf{A}\mathbf{x} \le \mathbf{b}, \quad \text{where } \mathbf{b} > \mathbf{0}, \text{ and } \mathbf{x} \ge \mathbf{0}, \tag{3.6.27}$$
into the standard form by adding slack variables we obtained a basic feasible solution
to start the simplex method. However, when we have a linear program which is
already in the standard form of Eqs. (3.5.2) and (3.5.3) we cannot, in general,
identify a basic feasible solution. The following technique can be used in such cases.
Consider the following minimization problem

$$\text{minimize} \quad \sum_{i=1}^{m} y_i \tag{3.6.28}$$
$$\text{subject to} \quad \mathbf{A}\mathbf{x} + \mathbf{y} = \mathbf{b}, \tag{3.6.29}$$
$$\mathbf{x} \ge \mathbf{0}, \quad \text{and} \quad \mathbf{y} \ge \mathbf{0}, \tag{3.6.30}$$
where $\mathbf{y}$ is a vector of artificial variables. There is no loss of generality in assuming
that $\mathbf{b} > \mathbf{0}$, so that the LP problem of Eqs. (3.6.28) through (3.6.30) has a known basic feasible solution
$$\mathbf{y} = \mathbf{b}, \quad \text{and} \quad \mathbf{x} = \mathbf{0}, \tag{3.6.31}$$
and the simplex method can be easily applied to solve it. Note that if a basic feasible
solution to the original LP problem, Eqs. (3.5.2) and (3.5.3),
exists then the optimum solution to the modified problem must have the $y_i$'s
as non-basic variables ($\mathbf{y} = \mathbf{0}$). However, if no basic feasible solution to the original
problem exists then the minimum value of the objective function (3.6.28) will be greater than zero.

Example 3.6.3

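We illustrate the use of artificial variables with the following example for which we
seek a basic feasible solution to the system

$$\begin{aligned}
2x_1 + x_2 + 3x_3 &= 13, \\
x_1 + 2x_2 + x_3 &= 7,
\end{aligned} \tag{3.6.32}$$
$$x_i \ge 0, \quad i = 1, 2, 3.$$

Introduce the artificial variables $y_1$ and $y_2$ and pose the following minimization problem.
$$\text{minimize} \quad f = y_1 + y_2 \tag{3.6.33}$$
$$\begin{aligned}
\text{subject to} \quad 2x_1 + x_2 + 3x_3 + y_1 &= 13, \\
x_1 + 2x_2 + x_3 + y_2 &= 7,
\end{aligned} \tag{3.6.34}$$
$$x_i \ge 0, \quad i = 1, 2, 3, \quad \text{and} \quad y_j \ge 0, \quad j = 1, 2.$$
With the basic feasible solution, $y_1 = 13$, $y_2 = 7$, and $x_1 = x_2 = x_3 = 0$ known, we
append the objective function (3.6.33) and clear the basic variables $y_1$ and $y_2$
from it to obtain the initial simplex tableau

$$\begin{bmatrix} 2 & 1 & 3 & 1 & 0 \\ 1 & 2 & 1 & 0 & 1 \\ -3 & -3 & -4 & 0 & 0 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ y_1 \\ y_2 \end{Bmatrix} =
\begin{Bmatrix} 13 \\ 7 \\ -20 \end{Bmatrix}. \tag{3.6.35}$$

Since it has the most negative coefficient we choose column (3) for pivoting, with $a_{13}$
as the pivot element since $13/3 < 7/1$,

$$\begin{bmatrix} 2/3 & 1/3 & 1 & 1/3 & 0 \\ 1/3 & 5/3 & 0 & -1/3 & 1 \\ -1/3 & -5/3 & 0 & 4/3 & 0 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ y_1 \\ y_2 \end{Bmatrix} =
\begin{Bmatrix} 13/3 \\ 8/3 \\ -8/3 \end{Bmatrix}. \tag{3.6.36}$$

Next we choose $a_{22}$ as the pivot element to obtain

$$\begin{bmatrix} 9/15 & 0 & 1 & 6/15 & -3/15 \\ 1/5 & 1 & 0 & -1/5 & 3/5 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \\ y_1 \\ y_2 \end{Bmatrix} =
\begin{Bmatrix} 19/5 \\ 8/5 \\ 0 \end{Bmatrix}. \tag{3.6.37}$$
The process has converged to the basic feasible solution
$$x_1 = 0, \quad x_2 = 8/5, \quad \text{and} \quad x_3 = 19/5 \tag{3.6.38}$$
to the original problem. •••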
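The artificial-variable technique can be sketched compactly in Python. The function `phase_one` below is an illustrative name of ours; it builds the tableau for minimizing the sum of the artificial variables, clears them from the objective row, and iterates with the most-negative-coefficient rule. It assumes non-negative right-hand sides and does not guard against degenerate cycling.

```python
from fractions import Fraction

def phase_one(A, b):
    """Minimise the sum of artificial variables y subject to A x + y = b, x, y >= 0
    (b >= 0 assumed). Returns (x, minimum value of the sum of the y's)."""
    m, n = len(A), len(A[0])
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    # Objective row sum(y) with the basic y's eliminated: minus the column sums.
    T.append([-sum(T[i][j] for i in range(m)) for j in range(n)]
             + [Fraction(0)] * m
             + [-sum(T[i][-1] for i in range(m))])
    basis = list(range(n, n + m))          # the artificial variables start as basic
    while True:
        obj = T[-1][:-1]
        t = min(range(n + m), key=lambda j: obj[j])
        if obj[t] >= 0:
            break
        rows = [i for i in range(m) if T[i][t] > 0]
        s = min(rows, key=lambda i: T[i][-1] / T[i][t])
        p = T[s][t]
        T[s] = [v / p for v in T[s]]
        for i in range(m + 1):
            if i != s:
                f = T[i][t]
                T[i] = [v - f * w for v, w in zip(T[i], T[s])]
        basis[s] = t                        # x_t replaces the variable of row s
    values = [Fraction(0)] * (n + m)
    for i, j in enumerate(basis):
        values[j] = T[i][-1]
    return values[:n], -T[-1][-1]

# System of Example 3.6.3
x, w = phase_one([[2, 1, 3], [1, 2, 1]], [13, 7])
```

A zero minimum indicates that all artificial variables have left the solution, so `x` is a basic feasible solution of the original equations; a positive minimum would indicate that none exists.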

3.7 Duality in Linear Programming

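It was shown by Dantzig [13] that the primal problem of minimization of a linear
function over a set of linear constraints is equivalent to the dual problem of the
maximization of another linear function over another set of constraints. Both the
dual objective function and constraints of the dual problem are obtained from the
objective function and constraints of the primal problem. Thus if the primal problem
is defined to be
$$\text{minimize} \quad f_p = c_1 x_1 + \cdots + c_n x_n = \mathbf{c}^T \mathbf{x} \qquad (n \text{ variables})$$
$$\text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j \ge b_i, \quad i = 1, \ldots, m, \qquad (m \text{ constraints})$$
$$x_j \ge 0, \quad j = 1, \ldots, n, \tag{3.7.1}$$
then the dual problem is defined to be
$$\text{maximize} \quad f_d = b_1 \lambda_1 + \cdots + b_m \lambda_m = \mathbf{b}^T \boldsymbol{\lambda} \qquad (m \text{ variables})$$
$$\text{subject to} \quad \sum_{i=1}^{m} a_{ij} \lambda_i \le c_j, \quad j = 1, \ldots, n, \qquad (n \text{ constraints})$$
$$\lambda_i \ge 0, \quad i = 1, \ldots, m. \tag{3.7.2}$$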
The choice of the primal or dual formulation depends on the number of design vari-
ables and the number of constraints. The computational effort in solving an LP

problem increases as the number of constraints increases. Therefore, if the number
of constraint relations is large compared to the number of design variables then it
may be desirable to solve the dual problem which will require less computational
effort. The classification of problems into the primal and dual categories is, however,
arbitrary since if the maximization problem is defined as the primal then the
minimization problem is its dual. It can be shown [13] that the optimal values of the
basic variables of the primal can be obtained from the solution of the dual and that
$(f_p)_{\min} = (f_d)_{\max}$. Thus if $x_j$ is a basic variable in the primal problem, then
the $j$th constraint of the dual problem is active and vice versa.
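If the primal problem is stated in its standard form, namely with equality
constraints,
$$\text{minimize} \quad f_p = c_1 x_1 + \cdots + c_n x_n = \mathbf{c}^T \mathbf{x} \qquad (n \text{ variables})$$
$$\text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j = b_i, \quad i = 1, \ldots, m, \qquad (m \text{ constraints})$$
$$x_j \ge 0, \quad j = 1, \ldots, n, \tag{3.7.3}$$
then the corresponding dual problem is
$$\text{maximize} \quad f_d = b_1 \lambda_1 + \cdots + b_m \lambda_m = \mathbf{b}^T \boldsymbol{\lambda} \qquad (m \text{ variables})$$
$$\text{subject to} \quad \sum_{i=1}^{m} a_{ij} \lambda_i \le c_j, \quad j = 1, \ldots, n, \qquad (n \text{ constraints}) \tag{3.7.4}$$
with the variables $\lambda_i$ being unrestricted in sign [11].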
It should be noted that, in practice, it is rare for an LP problem to be solved either
as a primal or as a dual problem. Most state-of-the-art LP software employs what is
known as a primal-dual algorithm. This algorithm begins with a feasible solution to
the dual problem that is successively improved by optimizing an associated restricted
primal problem. The details of this algorithm are beyond the scope of this book and
interested readers should consult Ref. [11].
Example 3.7.1

As an example of the simplex method for solving an LP problem via the dual
formulation we use the portal frame problem formulated in Example 3.1.5 with a slightly
different loading condition. The new loading condition is assumed to correspond to
a 25% increase in the magnitude of the horizontal load while keeping the magnitude
of the vertical load the same. The corresponding constraint equations have different
right-hand sides than those given in Eqs. (3.5.4) through (3.5.9), namely
$$\begin{aligned}
4x_2 &\ge 1, \\
2x_1 + 2x_2 &\ge 1, \\
x_1 + x_2 &\ge 1.25, \\
2x_1 &\ge 1.25, \\
2x_1 + 4x_2 &\ge 3.5, \\
4x_1 + 2x_2 &\ge 3.5.
\end{aligned} \tag{3.7.5}$$

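However, when put into the standard form, not only does the problem involve a total
of 8 variables, but also a basic feasible solution to the problem is not immediately
obvious. Because the objective function (3.1.25) involves only two variables $x_1$ and $x_2$
the solution of the dual problem may be more efficient. The dual problem is
$$\text{maximize} \quad f_d = \lambda_1 + \lambda_2 + 1\tfrac{1}{4}\lambda_3 + 1\tfrac{1}{4}\lambda_4 + 3\tfrac{1}{2}\lambda_5 + 3\tfrac{1}{2}\lambda_6 \tag{3.7.7}$$
$$\begin{aligned}
\text{subject to} \quad 2\lambda_2 + \lambda_3 + 2\lambda_4 + 2\lambda_5 + 4\lambda_6 &\le 2, \\
4\lambda_1 + 2\lambda_2 + \lambda_3 + 4\lambda_5 + 2\lambda_6 &\le 1,
\end{aligned} \tag{3.7.8}$$
$$\lambda_i \ge 0, \quad i = 1, \ldots, 6.$$
Maximizing $f_d$ is the same as minimizing $-f_d$, and the process of converting the above
linear problem to the standard form yields
$$\text{minimize} \quad -f_d = -\lambda_1 - \lambda_2 - 1\tfrac{1}{4}\lambda_3 - 1\tfrac{1}{4}\lambda_4 - 3\tfrac{1}{2}\lambda_5 - 3\tfrac{1}{2}\lambda_6 \tag{3.7.9}$$
$$\begin{aligned}
\text{subject to} \quad 2\lambda_2 + \lambda_3 + 2\lambda_4 + 2\lambda_5 + 4\lambda_6 + \lambda_7 &= 2, \\
4\lambda_1 + 2\lambda_2 + \lambda_3 + 4\lambda_5 + 2\lambda_6 + \lambda_8 &= 1,
\end{aligned} \tag{3.7.10}$$
$$\lambda_i \ge 0, \quad i = 1, \ldots, 8,$$
with the basic feasible solution
$$\lambda_i = 0, \quad i = 1, \ldots, 6, \quad \text{and} \quad \lambda_7 = 2, \quad \lambda_8 = 1.$$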
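We can begin with the initial simplex tableau with the basic variables cleared
from the last equation which represents the objective function.
$$\begin{bmatrix}
0 & 2 & 1 & 2 & 2 & 4 & 1 & 0 \\
4 & 2 & 1 & 0 & 4 & 2 & 0 & 1 \\
-1 & -1 & -5/4 & -5/4 & -7/2 & -7/2 & 0 & 0
\end{bmatrix}
\begin{Bmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_8 \end{Bmatrix}^T =
\begin{Bmatrix} 2 \\ 1 \\ 0 \end{Bmatrix}. \tag{3.7.11}$$
Although we should perhaps choose the fifth or sixth column for pivoting, since
each has the most negative coefficient, pivoting on the third column produces the same final
answer with one less simplex tableau. Pivoting on element $a_{23}$ we have
$$\begin{bmatrix}
-4 & 0 & 0 & 2 & -2 & 2 & 1 & -1 \\
4 & 2 & 1 & 0 & 4 & 2 & 0 & 1 \\
4 & 3/2 & 0 & -5/4 & 3/2 & -1 & 0 & 5/4
\end{bmatrix}
\begin{Bmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_8 \end{Bmatrix}^T =
\begin{Bmatrix} 1 \\ 1 \\ 5/4 \end{Bmatrix}. \tag{3.7.12}$$
Because of the presence of negative terms in the last equation, it is clear that the
objective function can still be decreased further. Pivoting on element $a_{14}$ we obtain

$$\begin{bmatrix}
-2 & 0 & 0 & 1 & -1 & 1 & 1/2 & -1/2 \\
4 & 2 & 1 & 0 & 4 & 2 & 0 & 1 \\
3/2 & 3/2 & 0 & 0 & 1/4 & 1/4 & 5/8 & 5/8
\end{bmatrix}
\begin{Bmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_8 \end{Bmatrix}^T =
\begin{Bmatrix} 1/2 \\ 1 \\ 15/8 \end{Bmatrix}. \tag{3.7.13}$$
Hence we conclude that $(-f_d)_{\min} = -15/8$ or $(f_d)_{\max} = (f_p)_{\min} = 15/8$ with the
solution
$$\lambda_3 = 1, \quad \lambda_4 = 1/2, \quad \lambda_1 = \lambda_2 = \lambda_5 = \lambda_6 = \lambda_7 = \lambda_8 = 0. \tag{3.7.14}$$
The non-zero $\lambda$'s indicate that the active constraints in the primal problem are the
third and fourth, namely
$$2x_1 = 1.25, \quad \text{and} \quad x_1 + x_2 = 1.25. \tag{3.7.15}$$
Solution of Eqs. (3.7.15) yields $x_1 = x_2 = 5/8$. •••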
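The dual solution of this example can be cross-checked numerically. The sketch below is an illustration under our own naming (`maximize`, `G`, `h` are not from the text): it forms the dual from the primal data, solves it with a small tableau simplex using exact fractions, and compares the result with the primal objective $f = 2x_1 + x_2$ evaluated at $x_1 = x_2 = 5/8$. The pivoting rule is the most-negative-coefficient rule, so the sequence of tableaus differs from the one shown above, and no anti-cycling safeguard is included.

```python
from fractions import Fraction as F

def maximize(c, G, h):
    """Maximise c.lam subject to G lam <= h (h >= 0), lam >= 0, by the tableau
    simplex method applied to -c.lam. Returns (lam, maximum value)."""
    m, n = len(G), len(c)
    T = [[F(v) for v in G[i]] + [F(int(i == k)) for k in range(m)] + [F(h[i])]
         for i in range(m)]
    T.append([-F(v) for v in c] + [F(0)] * (m + 1))   # slack variables start as basic
    basis = list(range(n, n + m))
    while True:
        obj = T[-1][:-1]
        t = min(range(n + m), key=lambda j: obj[j])
        if obj[t] >= 0:
            break
        rows = [i for i in range(m) if T[i][t] > 0]
        s = min(rows, key=lambda i: T[i][-1] / T[i][t])
        p = T[s][t]
        T[s] = [v / p for v in T[s]]
        for i in range(m + 1):
            if i != s:
                f = T[i][t]
                T[i] = [v - f * w for v, w in zip(T[i], T[s])]
        basis[s] = t
    lam = [F(0)] * (n + m)
    for i, j in enumerate(basis):
        lam[j] = T[i][-1]
    return lam[:n], T[-1][-1]          # last right-hand side equals -(-f_d) = f_d

# Primal data: constraints (3.7.5) and the cost coefficients implied by Eqs. (3.7.8)
A = [[0, 4], [2, 2], [1, 1], [2, 0], [2, 4], [4, 2]]
b = [1, 1, F(5, 4), F(5, 4), F(7, 2), F(7, 2)]
c = [2, 1]

G = [list(col) for col in zip(*A)]     # dual constraint matrix is the transpose of A
lam, fd_max = maximize(b, G, c)

x = [F(5, 8), F(5, 8)]                 # primal solution quoted in the example
fp = sum(ci * xi for ci, xi in zip(c, x))
primal_feasible = all(sum(a * xi for a, xi in zip(row, x)) >= bi
                      for row, bi in zip(A, b))
```

The equality of `fd_max` and `fp` is the statement $(f_d)_{\max} = (f_p)_{\min}$ for this problem.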
In closing this section, it is interesting to point out that the dual variables can be
interpreted as the prices of the constraints. For a given variation on the right hand
side b of the constraint relations of Eq. (3.7.5), the change in the optimum value of
the objective function can be determined from

$$\Delta f^* = \boldsymbol{\lambda}^T \Delta \mathbf{b}. \tag{3.7.16}$$


For Eq. (3.7.16) to hold, however, the changes in the b vector must be such that it
does not result in a change in the active constraint set. The dual problem can also
be viewed as one of maximization of a profit subject to limitations on availability of
resources. It is clear then that the non-negative dual variables can be interpreted as
increased costs which would ensue from a violation of given constraints on resource
availabilities. Similarly a primal problem can be viewed as one of minimization of
total cost while satisfying demand. The full significance of dual variables, however,
can be brought out more clearly only in the context of the Kuhn-Tucker conditions
and the sensitivity of the optimum solutions to changes in design parameters which
will be discussed in Chapter 5. The following example demonstrates the use of dual
variables to find the sensitivity of the optimal solution to a change in a problem
parameter.

Example 3.7.2

Consider the portal frame design problem solved in Example 3.7.1 using dual
variables. We will determine the change in the value of the optimum objective function
$f^* = 1.875$ corresponding to a 25% reduction in the value of the horizontal force,
keeping the vertical force at $p$. These loads correspond to the problem formulated in
Example 3.1.5 and solved graphically in Example 3.4.1.
From Eqs. (3.7.5) and (3.1.26) through (3.1.31) the change in the right-hand side
is $\Delta b_3 = \Delta b_4 = -\tfrac{1}{4}$, and $\Delta b_5 = \Delta b_6 = -\tfrac{1}{2}$. Using the values of the dual variables
from Example 3.7.1 in Eq. (3.7.16) we obtain

$$\Delta f^* = -\left(\tfrac{1}{4}\right)(1) - \left(\tfrac{1}{4}\right)\left(\tfrac{1}{2}\right) = -0.375.$$


Therefore the optimum value of the objective function under this new loading
configuration would be $f^* = 1.5$, assuming, of course, that the active constraints (the ones
associated with non-zero dual variables) remain active. Fortunately, that assumption
is correct for the present example. However, besides the two constraints that are active
initially there are two more constraints which become active at the new design point
(see Fig. 3.4.1). Any reduction larger than 25% in the value of the horizontal load
would have caused a change in the active constraint set and resulted in an incorrect
answer.
We, therefore, emphasize the fact that in applying Eq. (3.7.16) one has to be
cautious not to perturb the design parameter to an extent that the active constraint
set changes. This is generally achieved by limiting the parameter perturbations to be
small. However, if we had used the design in Example 3.4.1 as our nominal design, no
matter how small the perturbation of the magnitude of the horizontal force, the active
constraint set would have changed. This is due to the redundancy of the constraints
at the optimal solution of Example 3.4.1. •••
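The sensitivity calculation of this example amounts to a single inner product, which can be checked with the short sketch below. The dual variables are those found in Example 3.7.1, and the two right-hand-side vectors are those of Eqs. (3.7.5) and of Eqs. (3.5.4) through (3.5.9); the variable names are ours.

```python
from fractions import Fraction as F

lam = [F(0), F(0), F(1), F(1, 2), F(0), F(0)]              # dual variables, Example 3.7.1
b_old = [F(1), F(1), F(5, 4), F(5, 4), F(7, 2), F(7, 2)]   # right-hand sides of Eqs. (3.7.5)
b_new = [F(1), F(1), F(1), F(1), F(3), F(3)]               # right-hand sides of Eqs. (3.5.4)-(3.5.9)

delta_b = [new - old for new, old in zip(b_new, b_old)]
delta_f = sum(l * d for l, d in zip(lam, delta_b))          # change in the optimum objective
f_new = F(15, 8) + delta_f                                  # predicted optimum for the new loads
```

Only the third and fourth entries of `delta_b` contribute, because the remaining dual variables are zero. The prediction is valid only as long as the active constraint set does not change.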

3.8 An Interior Method - Karmarkar's Algorithm

In using the simplex algorithm discussed in section 3.6, we operate entirely along
the boundaries of the polytope in $R^n$, moving from one extreme point (vertex) to
another following the shortest path between them, an edge of the polytope. Of all
the possible vertices adjacent to the one at which we start, the selection of the next
vertex is based on the maximum reduction in the objective function. With these
basic premises, the simplex algorithm is only a systematic approach for identifying
and examining candidate solutions to the LP problem. The number of operations
needed for convergence grows exponentially with the number of variables. In the
worst case, the number of operations for convergence for an n variable problem with
a set of $s$ constraints can be $s!/[n!\,(s - n)!]$. However, it is possible to choose a move
direction different from an edge of the polytope, be consistent with the constraint
relations, and attain larger gains in the objective function. Although such a choice can
lead to a rapid descent toward the optimal vertex, it will do so through intermediate
points which are not vertices.
Interior methods of solving LP problems have drawn serious attention only since
the dramatic introduction of Karmarkar's algorithm [14] by AT&T Bell Laboratories.
This new algorithm was originally claimed to be 50 times faster than the simplex
method. Since then, much work has been invested in improvements and extensions
of Karmarkar's algorithm. Developments include demonstration of how dual solutions
can be generated during the course of this algorithm [15], and extension of
Karmarkar's algorithm to treat upper and lower bounds more efficiently [16] by
eliminating the slack variables which are commonly used for such bounds in the simplex
algorithm.
Because some of the recent developments of the algorithm are mathematically
involved and beyond the scope of this book, only a general outline of Karmarkar's
algorithm is presented in the following sections. At this point we would like to
warn the reader that the tools used in the algorithm were originally introduced for
minimization of constrained and unconstrained nonlinear functions, which are covered
in Chapters 4 and 5. Therefore, the reader is advised to read these chapters before
proceeding to the next section.
3.8.1 Direction of Move

The direction of maximum reduction in the objective function is the direction of
steepest descent, which is the direction of the negative of the gradient of the objective
function $\nabla f$ (see section 4.2.2). For an LP problem posed in its standard form, see
Eq. (3.5.1), the gradient direction is
$$\nabla f = \mathbf{c}. \tag{3.8.1}$$
Although we are not limiting the move direction to be an edge of the polytope formed
by the constraint surfaces, for an LP problem the move direction cannot be selected
simply as the negative of the gradient direction. The direction must be chosen such
that the move leads to a point in the feasible region. This can be achieved by using
the projection matrix $\mathbf{P}$
$$\mathbf{P} = \mathbf{I} - \mathbf{N}(\mathbf{N}^T \mathbf{N})^{-1} \mathbf{N}^T \tag{3.8.2}$$
derived in section 5.3, where the columns of the matrix $\mathbf{N}$ correspond to the gradients
of the constraint equations. Since the constraints are linear functions of the variables,
we have $\mathbf{N} = \mathbf{A}^T$. Operating on the gradient vector $-\mathbf{c}$, $\mathbf{P}$ projects the steepest
descent direction onto the nullspace of the matrix $\mathbf{A}$. That is, if we start with an
initial design point $\mathbf{x}_0$ which satisfies the constraint equation $\mathbf{A}\mathbf{x}_0 = \mathbf{b}$, and move in
a direction $-\mathbf{P}\mathbf{c}$ we will remain in the subspace defined by that constraint equation.
Note that in numerical application of this projection the matrix product $\mathbf{A}\mathbf{A}^T$ need
not actually be inverted; rather, the linear system $\mathbf{A}\mathbf{A}^T \mathbf{y} = \mathbf{A}\mathbf{c}$ may be solved
and then the projected gradient may be calculated by using $\mathbf{P}\mathbf{c} = \mathbf{c} - \mathbf{A}^T \mathbf{y}$. A
more efficient and better conditioned procedure based on QR factorization of the
matrix $\mathbf{A}$ for the solution of the projection matrix is described in section 5.5. The
following simple example by Strang from reference [17] illustrates graphically the
move direction for a three dimensional design space.
Example 3.8.1

Consider the following minimization problem in three design variables,

$$\text{minimize} \quad f = -x_1 - 2x_2 - 3x_3 \tag{3.8.3}$$
$$\text{subject to} \quad x_1 + x_2 + x_3 = 1, \tag{3.8.4}$$
$$\mathbf{x} \ge \mathbf{0}. \tag{3.8.5}$$

Starting at an initial point $\mathbf{x}^{(0)} = (1/3, 1/3, 1/3)^T$ determine the direction of move.

Figure 3.8.1 Design space and move direction (vectors marked in the figure: $-\mathbf{c} = (1, 2, 3)^T$ and $-\mathbf{P}\mathbf{c} = (-1, 0, 1)^T$).

The design space and the constraint surface for the problem are shown in Figure
3.8.1. The direction corresponding to the negative of the gradient vector is marked
as $-\mathbf{c}$. The projection matrix for the problem can be obtained from Eq. (3.8.2) where
$\mathbf{A} = [1 \;\; 1 \;\; 1]$. The system $\mathbf{A}\mathbf{A}^T y = \mathbf{A}\mathbf{c}$ produces a scalar for $y$,

$$\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{Bmatrix} 1 \\ 1 \\ 1 \end{Bmatrix} y =
\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} \begin{Bmatrix} -1 \\ -2 \\ -3 \end{Bmatrix}, \qquad y = -2. \tag{3.8.6}$$
The projected direction $\mathbf{P}\mathbf{c}$ is then given by

$$\mathbf{P}\mathbf{c} = \mathbf{c} - y\mathbf{A}^T, \tag{3.8.7}$$

$$\mathbf{P}\mathbf{c} = \begin{Bmatrix} -1 \\ -2 \\ -3 \end{Bmatrix} - \begin{Bmatrix} -2 \\ -2 \\ -2 \end{Bmatrix} = \begin{Bmatrix} 1 \\ 0 \\ -1 \end{Bmatrix}. \tag{3.8.8}$$

Moving in a direction $-\mathbf{P}\mathbf{c}$ guarantees maximum reduction in the objective function
while remaining in the plane PQR formed by the constraint equation. The minimum
value of the objective function for this problem is achieved at the vertex R which,
clearly, cannot be reached in one iteration. Therefore, the move has to be terminated
before the non-negativity requirement is violated (which is at $\mathbf{x}^{(1)} = (0, 1/3, 2/3)^T$),
and the procedure has to be repeated until a reasonable convergence to the minimum
point is achieved. •••
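The projection used in this example can be sketched in Python by solving $\mathbf{A}\mathbf{A}^T \mathbf{y} = \mathbf{A}\mathbf{c}$ and forming $\mathbf{c} - \mathbf{A}^T \mathbf{y}$, as suggested in the text, rather than inverting $\mathbf{A}\mathbf{A}^T$. The helper names `solve` and `projected_gradient` are illustrative assumptions of ours, and the cost vector is taken as $\mathbf{c} = (-1, -2, -3)^T$ so that $-\mathbf{c}$ agrees with the vector marked in Figure 3.8.1.

```python
from fractions import Fraction as F

def solve(M, r):
    """Gauss-Jordan solution of the square system M y = r (M assumed non-singular)."""
    k = len(M)
    W = [[F(v) for v in M[i]] + [F(r[i])] for i in range(k)]
    for col in range(k):
        piv = next(i for i in range(col, k) if W[i][col] != 0)
        W[col], W[piv] = W[piv], W[col]
        W[col] = [v / W[col][col] for v in W[col]]
        for i in range(k):
            if i != col:
                f = W[i][col]
                W[i] = [v - f * w for v, w in zip(W[i], W[col])]
    return [row[-1] for row in W]

def projected_gradient(A, c):
    """Return (P c, y), where y solves (A A^T) y = A c and P c = c - A^T y."""
    m, n = len(A), len(c)
    AAt = [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(m)] for i in range(m)]
    Ac = [sum(A[i][k] * c[k] for k in range(n)) for i in range(m)]
    y = solve(AAt, Ac)
    Pc = [c[k] - sum(A[i][k] * y[i] for i in range(m)) for k in range(n)]
    return Pc, y

A = [[1, 1, 1]]
c = [-1, -2, -3]                       # so that -c = (1, 2, 3)^T
Pc, y = projected_gradient(A, c)
move = [-v for v in Pc]                # direction of move, -Pc
in_nullspace = sum(a * d for a, d in zip(A[0], move)) == 0
```

Because the move direction lies in the nullspace of $\mathbf{A}$, any step along it leaves $x_1 + x_2 + x_3$ unchanged.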
In the preceding example no explanation is provided for the selection of the initial
design point, and for the distance travelled in the chosen direction. Karmarkar [14]
stops the move before hitting the polytope boundary, say at $\mathbf{x}^{(1)} = (1/30, 1/3, 19/30)^T$
in the previous example, so that there will be room left to move in the next iteration.
That is, starting either at the polytope boundary or close to it increases the chances of hitting
another boundary before making real gains in the objective function. This difficulty
is resolved by the transformation of the design space discussed in the
next section.
3.8.2 Transformation of Coordinates

In order to focus on the ideas which are important for his algorithm, Karmarkar
[14] makes several assumptions with respect to the form of the LP problem. In his
canonical representation, the LP problem takes the following form,
$$\text{minimize} \quad f = \mathbf{c}^T \mathbf{x} \tag{3.8.9}$$
$$\text{subject to} \quad \mathbf{A}\mathbf{x} = \mathbf{0}, \tag{3.8.10}$$
$$\mathbf{e}^T \mathbf{x} = 1, \tag{3.8.11}$$
$$\mathbf{x} \ge \mathbf{0}, \tag{3.8.12}$$
where $\mathbf{e}$ is an $n \times 1$ vector, $\mathbf{e} = (1, \ldots, 1)^T$. The variable $\mathbf{x}$ represents the transformed
coordinate such that the initial point is the center, $\mathbf{x}^{(0)} = \mathbf{e}/n$, of a unit simplex,
and is a feasible point, $\mathbf{A}\mathbf{x}^{(0)} = \mathbf{0}$. A simplex is a generalization to $n$ dimensions of
a 2-dimensional triangle and 3-dimensional tetrahedron. A unit simplex has edges
of unit length along each of the coordinate directions. Karmarkar also assumes that
$\mathbf{c}^T \mathbf{x} \ge 0$ for every point that belongs to the simplex, and the target minimum value of
the objective function is zero. Conversion of the standard form of an LP problem into
this new canonical form can be achieved through a series of operations that involve
combining the primal and dual forms of the standard formulation, introducing
slack and artificial variables, and transforming coordinates. The combination of the
primal and dual formulations is needed to accommodate the assumption that the
target minimum value of the objective function be zero. Details of the formation of
this new canonical form are provided in Ref. [14]. In this section we will demonstrate
the coordinate transformation, which is referred to as the projective rescaling transformation.
This is the same transformation that helps to create room to move as we proceed
from one iteration to another.
Consider an arbitrary initial point $\mathbf{x}^{(a)}$ in the design space, and let
$$\mathbf{D}_x = \mathrm{Diag}\left(x_1^{(a)}, \ldots, x_n^{(a)}\right). \tag{3.8.13}$$
The transformation, $T_x$, used by Karmarkar maps each facet of the simplex given by
$x_i = 0$ onto the corresponding facet $\hat{x}_i = 0$ in the transformed space, and is given by

$$\hat{\mathbf{x}} = \frac{1}{\mathbf{e}^T \mathbf{D}_x^{-1} \mathbf{x}} \, \mathbf{D}_x^{-1} \mathbf{x}. \tag{3.8.14}$$
While mapping the unit simplex onto itself, this transformation moves the point
$\mathbf{x}^{(a)}$ to the center of the simplex, $\hat{\mathbf{x}}^{(0)} = (1/n)\mathbf{e}$. Karmarkar showed that repeated
application of this transformation, in the worst case, leads to convergence to the
optimal corner in less than $O(n^{7/2})$ arithmetic operations.
Karmarkar's transformation is nonlinear and a simpler form of this transformation
has been suggested. A linear transformation,

$$\hat{\mathbf{x}} = \mathbf{D}_x^{-1} \mathbf{x}, \tag{3.8.15}$$

has been shown to perform as well as Karmarkar's algorithm in practice and to
converge in theory [18].

3.8.3 Move Distance

Following the transformation, Karmarkar optimizes the transformed objective function
over an inscribed sphere of radius $r = 1/\sqrt{n(n-1)}$ centered at $x^{(0)}$. This is the
largest sphere that is contained inside the simplex. In the three-dimensional
design space of Example 3.8.1, for instance, where there is one constraint surface, the
'sphere' is a circle in the plane of the constraint equation. In practice, the step length
along the projected direction used by Karmarkar is a fraction, $\alpha$, of the radius. Thus,
the new point at the end of the move is given by
$$ x^{(k+1)} = x^{(k)} - \alpha\, r^{(k)} P c^{(k)} , \tag{3.8.16} $$
where $0 < \alpha < 1$. A typical value of $\alpha$ used by Karmarkar is 1/4.


During the course of the algorithm the optimality of the solution is checked
periodically by converting the interior solution to an extreme point solution at the
closest vertex. If the extreme point solution is better than the current interior
solution, it is tested for optimality.

3.9 Integer Linear Programming

Solution techniques for the LP problems considered so far have been developed
under the assumption that the design variables are positive and continuously-valued;
they can thus assume any value between their lower and upper bounds. In certain
design situations, some or all of the variables of an LP problem are restricted to take
discrete values. That is, the standard form of the LP problem of Eqs. (3.5.1) through (3.5.3)
takes the form
$$ \begin{aligned} \text{minimize} \quad & f(x) = c^T x \\ \text{such that} \quad & A x = b , \\ & x_i \in X_i = \{d_{i1}, d_{i2}, \ldots, d_{il}\}, \quad i \in I_d , \end{aligned} \tag{3.9.1} $$
where $I_d$ is the set of design variables that can take only discrete values, and $X_i$ is
the set of allowable discrete values. Design variables such as cross-sectional areas of
trusses and ply thicknesses of laminated composite plates often fall in this category.
Problems with discrete-valued design variables are called discrete programming
problems.

In general, a discrete programming problem can be converted to a form where
design variables can assume only integer values. This conversion can be achieved by
letting the design variable $x_i$ represent the index $j$ of the value $d_{ij}$, $j = 1, \ldots, l$, in Eq.
(3.9.1). If the values in the discrete set are uniformly spaced, it is possible to scale
the set to form a set of integer values only. The problem is then called an integer
linear programming (ILP) problem,
$$ \begin{aligned} \text{minimize} \quad & f(x) = c_1^T x + c_2^T y \\ \text{such that} \quad & A_1 x + A_2 y = b , \\ & x_i \ge 0 \ \text{integer} , \\ & y_j \ge 0 . \end{aligned} \tag{3.9.2} $$

This form, where certain design variables are allowed to be continuous, is referred to
as a mixed integer linear programming (MILP) problem. Problems where all variables
are integer are called pure ILP problems or, in short, ILP problems. It is also common
to have problems where design variables are used to indicate a 0/1 type decision
making situation. Such problems are referred to as zero/one or binary ILP problems.
For example, a truss design problem where the presence of a particular member or
the lack of it is represented by a binary variable falls into this category. Any ILP
problem with an upper bound on the design variable $x_i$ of $2^K - 1$ can be posed as a
binary ILP problem by replacing the variable with $K$ binary variables $x_{i1}, \ldots, x_{iK}$
such that
$$ x_i = x_{i1} + 2 x_{i2} + \cdots + 2^{K-1} x_{iK} . \tag{3.9.3} $$
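The binary replacement of Eq. (3.9.3) can be illustrated with a short Python sketch. The helper names are hypothetical; the only assumption is that the integer variable lies between 0 and $2^K - 1$.

```python
def to_binary_variables(x, K):
    """Return [x_i1, ..., x_iK] such that x = x_i1 + 2 x_i2 + ... + 2^(K-1) x_iK."""
    if not 0 <= x <= 2**K - 1:
        raise ValueError("x must satisfy 0 <= x <= 2^K - 1")
    # Bit j of x is the binary variable that multiplies 2^j.
    return [(x >> j) & 1 for j in range(K)]


def from_binary_variables(bits):
    """Recover the integer variable from its binary variables (Eq. 3.9.3)."""
    return sum(b * 2**j for j, b in enumerate(bits))


# With K = 4 the bound is 2^4 - 1 = 15; the value 11 equals 1 + 2 + 0 + 8.
bits = to_binary_variables(11, 4)
```

In this illustration `bits` is `[1, 1, 0, 1]`, and `from_binary_variables(bits)` returns 11 again.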

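It is also possible to convert the linear discrete programming problem to a binary
ILP by using binary variables ($x_{ij} \in \{0, 1\}$, $j = 1, \ldots, l$) such that
$$ x_i = d_{i1} x_{i1} + d_{i2} x_{i2} + \cdots + d_{il} x_{il} , \tag{3.9.4} $$
and
$$ x_{i1} + x_{i2} + \cdots + x_{il} = 1 . \tag{3.9.5} $$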

Most of the following discussion assumes problems to be pure ILP.

A practical approach to solving ILP problems is to round off the optimum values
of the variables, obtained by assuming them to be continuous, to the nearest
acceptable integer value. For problems with $n$ design variables there are $2^n$ possible
rounded-off designs, and the problem of choosing the best one is formidable for large
$n$. Furthermore, for some problems the optimum design may not even be one of these
rounded-off designs, and for others none of the rounded-off designs may be feasible.
A more systematic way of trying possible combinations of variables that will satisfy
the requirements of a given problem can be explained by using the enumeration tree
example of Garfinkel and Nemhauser [19].

Example 3.9.1

Consider the binary ILP problem of choosing a combination of five variables such
that the following summation is satisfied
$$ f = \sum_{i=1}^{5} i x_i = 5 . $$
A decision tree representing the progression of the solution of this problem is composed
of nodes and branches that represent the solutions and the combinations of variables
that lead to those solutions, respectively (Figure 3.9.1). The top node of the tree
corresponds to a solution in which all the variables are turned off ($x_i = 0$, $i = 1, \ldots, 5$)
with a function value of $f = 0$. Branching off from this solution are two paths
corresponding to the two alternatives for the first variable. The branch which has
$x_1 = 1$ has a function value of $f = 1$ and tolerates turning additional variables on
without running the risk of exceeding the required function value of 5. Of course
the other branch is the same as the initial solution, and can be branched further. Next,
these two nodes are branched by considering the on and off alternatives for the second
variable. The node arrived at by taking $x_1 = x_2 = 1$ has $f = 3$ and is terminated as
indicated by a vertical line. Such a vertex is said to be fathomed, because further
branching would mean adding a number that would cause $f$ to exceed its required
value of 5. The other three vertices are said to be live, and can be branched further by
considering the alternatives for the remaining variables in a sequential manner until
either the created nodes are fathomed or the branches arrive at feasible solutions to
the problem.

Figure 3.9.1 Enumeration tree for binary ILP problem of $f = \sum_{i=1}^{5} i x_i = 5$.

For the present problem, after considering 19 possible combinations of variables,
we identified 3 feasible solutions which are marked by an asterisk. This is a 40%
reduction in the total number of possible trials, namely $2^5 = 32$, needed to identify
all feasible solutions. For a structural design problem in which trials with different
combinations of variables would possibly require expensive analysis, an enumeration
tree can yield substantial savings. •••
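The enumeration of Example 3.9.1 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure of Ref. [19]: the function name is hypothetical, the variables are branched in the order $x_1, \ldots, x_5$ (on-branch first), and the fathoming test relies on the coefficients increasing with the index, so that if the next coefficient overshoots the target every later one does too. The node-counting convention (the root and every created node count once) is also an assumption.

```python
def enumerate_tree(weights, target):
    """Depth-first enumeration of binary choices with sum(w_i * x_i) == target.

    Assumes weights are positive and sorted in increasing order.
    Returns the feasible index sets and the number of nodes visited.
    """
    solutions = []
    visited = 0

    def branch(level, partial, chosen):
        nonlocal visited
        visited += 1
        if partial == target:          # feasible: all remaining variables stay off
            solutions.append(tuple(chosen))
            return
        if level == len(weights):      # no variables left to branch on
            return
        if partial + weights[level] > target:
            return                     # fathomed: any further addition exceeds target
        branch(level + 1, partial + weights[level], chosen + [level + 1])  # x = 1
        branch(level + 1, partial, chosen)                                 # x = 0

    branch(0, 0, [])
    return solutions, visited


solutions, visited = enumerate_tree([1, 2, 3, 4, 5], 5)
```

With these conventions the sketch finds the three feasible combinations, (x1, x4), (x2, x3), and (x5), after visiting 19 nodes, compared with the 32 trials of a full enumeration.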

3.9.1 Branch-and-Bound Algorithm

The basic concept behind the enumeration technique forms the basis for this powerful
algorithm, which is suitable for MILP problems as well as nonlinear mixed integer problems
[20,21]. The original algorithm developed by Land and Doig [22] relies on calculating
upper and lower bounds on the objective function so that nodes that result in designs
with objective functions outside the bounds can be fathomed and, therefore, the
number of analyses required can be cut back. Consider the mixed ILP problem of
Eq. (3.9.2). The first step of the algorithm is to solve the LP problem obtained from
the MILP problem by assuming the variables to be continuous valued. If all the $x$
variables of the resulting solution have integer values, there is no need to continue;
the problem is solved. Suppose several of the variables assume noninteger values and
the objective function value is $f_1$. The value $f_1$ forms a lower bound, $f_L = f_1$, for the
MILP since imposing conditions that require any of the noninteger valued variables
to take integer values can only cause the objective function to increase. This initial
problem is labeled LP-1 and is placed in the top node of the enumeration tree as
shown in Figure 3.9.2. For the purpose of illustration, it is assumed that only two
variables, $x_k$ and $x_{k+1}$, violate the integer requirement, with $x_k = 4.3$ and $x_{k+1} = 2.8$.

Figure 3.9.2 Branch-and-bound decision tree for ILP problems.

The second step of the algorithm is to branch from the node into two new LP
problems by adding a new constraint to LP-1 that involves only one of the
noninteger variables, say $x_k$. One of the problems, LP-2, will require the value of the
branched variable, $x_k$, to be less than or equal to the largest integer smaller than $x_k$
(here $x_k \le 4$), and the other, LP-3, will have a constraint that $x_k$ is greater than or
equal to the smallest integer larger than $x_k$ (here $x_k \ge 5$). As will be demonstrated
later in Example 3.9.2, these two problems
actually do branch the feasible design space of LP-1 into two segments. There
are several possibilities for the solution of these two new problems. One of these
possibilities is to have no feasible solution for the new problem. In that case the new
node will be fathomed. Another possibility is to reach an all integer feasible solution
(see LP-3 of Figure 3.9.2), in which case the node will again be fathomed but the value
of the objective function will become an upper bound $f_U$ for the MILP problem. That
is, beyond this solution point, any node that has an LP solution with a larger value
of the objective function will be fathomed, and only those solutions that have the
potential of producing an objective function between $f_L$ and $f_U$ will be pursued. If
there are no solutions with an objective function smaller than $f_U$, then the node is
an optimum solution. If there are other solutions with an objective function smaller
than $f_U$, they may still include noninteger valued variables (LP-2 of Figure 3.9.2),
and are labeled as live nodes. Live nodes are then branched again by considering one
of the remaining noninteger variables, and the resulting solutions are analyzed until all the
nodes are fathomed.
Example 3.9.2

Consider the portal frame problem of Example 3.1.5 (see Eqs. (3.1.25)
through (3.1.31)) with the requirement that $x_i \in \{0.0, 0.2, 0.4, 0.6, 0.8, 1.0\}$, $i = 1, 2$.
We rescale the design variables by a factor of 5 to pose the problem as an integer
linear programming problem,
$$ \begin{aligned} \text{minimize} \quad & f = \tfrac{1}{5}(2x_1 + x_2) \\ \text{such that} \quad & x_2 \ge 1.25 , \\ & x_1 + x_2 \ge 2.5 , \\ & x_1 + x_2 \ge 5 , \\ & x_1 \ge 2.5 , \\ & x_1 + 2x_2 \ge 7.5 , \\ & 2x_1 + x_2 \ge 7.5 , \\ & x_i \ge 0 \ \text{integer}, \quad i = 1, 2 . \end{aligned} $$
Graphical solution of this scaled problem (presented in Example 3.4.1 without the
integer design variable requirement before scaling) is
$$ x_1 = x_2 = 2.5, \qquad f = 7.5 , $$
where, as in the remainder of this example, the quoted value of $f$ is that of
$2x_1 + x_2$, without the factor 1/5. This solution
forms a lower bound for the objective function, $f_L = 7.5$. That is, the optimal
integer solution cannot have an objective function smaller than $f_L = 7.5$. Next, we
choose $x_1$ and investigate solutions for which $x_1 \le 2$ and $x_1 \ge 3$ by forming two new
LP's by adding each one of these constraints to the original set of constraints. Since
the original set has a constraint that requires $x_1 \ge 2.5$, the first LP problem with
$x_1 \le 2$ has no solution. The solution of the second LP is shown graphically in Figure
3.9.3. The active constraints at the optimum are $x_1 \ge 3$ and $x_1 + 2x_2 \ge 7.5$, and
the solution is
$$ x_1 = 3, \qquad x_2 = 2.25, \qquad f = 8.25 . $$
Figure 3.9.3 Branch-and-bound solution for $x_1 \le 2$ and $x_1 \ge 3$ of Example 3.9.2.

Since $x_2$ is still noninteger, we create two more LP's, this time by imposing
$x_2 \le 2$ and $x_2 \ge 3$, respectively. Graphical solutions of the new LP's are shown in
Figure 3.9.4. The solution for the case $x_2 \ge 3$ is at the vertex $x_1 = 3$ and $x_2 = 3$,
and is a feasible solution for the integer problem with an objective function value of
$f = 9$. This value of the objective function, therefore, establishes an upper bound,
$f_U = 9$, for the problem. The solution for the case $x_2 \le 2$, on the other hand, is at the
intersection of $x_2 = 2$ and $x_1 + 2x_2 = 7.5$, leading to
$$ x_1 = 3.5, \qquad x_2 = 2, \qquad \text{and} \qquad f = 9 . $$

Figure 3.9.4 Branch-and-bound solution for $x_2 \le 2$ and $x_2 \ge 3$ of Example 3.9.2.

This solution is not discrete and can be interrogated further by branching on $x_1$ (that
is, creating new LP's by adding $x_1 \le 3$ and $x_1 \ge 4$). However, since its objective
function is equal to the upper bound, we cannot improve the objective function any
further. To do so would necessitate introducing a further constraint which could
only increase the objective function. Therefore, the optimal solution is the one with
$x_1 = x_2 = 3$, and $f = 9$. •••
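The branching and bounding steps of Example 3.9.2 can be reproduced with a small Python sketch. Several things here are assumptions made for illustration only: the constraints are read as "greater than or equal to" inequalities, the objective is taken as $2x_1 + x_2$ (the values quoted in the example), and the LP at each node is solved by enumerating intersections of pairs of constraint lines, which stands in for the simplex method and is adequate only for two variables with a bounded optimum. All function names are hypothetical.

```python
import math
from itertools import combinations


def solve_lp_2d(c, cons, tol=1e-9):
    """Minimize c.x subject to a.x >= b for every (a, b) in cons (two variables).

    Checks every intersection of two constraint lines and keeps the best
    feasible one. Returns (f, x), or None if no feasible vertex exists.
    """
    best = None
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel lines, no vertex
        x = ((b1 * a2[1] - b2 * a1[1]) / det, (a1[0] * b2 - a2[0] * b1) / det)
        if all(a[0] * x[0] + a[1] * x[1] >= b - tol for a, b in cons):
            f = c[0] * x[0] + c[1] * x[1]
            if best is None or f < best[0] - tol:
                best = (f, x)
    return best


def branch_and_bound(c, cons, tol=1e-6):
    """Return (f, x) of the best all-integer solution found."""
    f_upper, x_best = math.inf, None
    live = [list(cons)]                      # nodes waiting to be solved
    while live:
        node = live.pop()
        sol = solve_lp_2d(c, node)
        if sol is None:
            continue                         # fathomed: infeasible
        f, x = sol
        if f >= f_upper - tol:
            continue                         # fathomed: cannot beat upper bound
        frac = [k for k in range(2) if abs(x[k] - round(x[k])) > tol]
        if not frac:
            f_upper, x_best = f, tuple(round(v) for v in x)  # new upper bound
            continue
        k = frac[0]                          # branch on first noninteger variable
        e = (1.0, 0.0) if k == 0 else (0.0, 1.0)
        live.append(node + [((-e[0], -e[1]), -math.floor(x[k]))])  # x_k <= floor
        live.append(node + [(e, math.ceil(x[k]))])                 # x_k >= ceil
    return f_upper, x_best


constraints = [((0, 1), 1.25), ((1, 1), 2.5), ((1, 1), 5.0),
               ((1, 0), 2.5), ((1, 2), 7.5), ((2, 1), 7.5)]
result = branch_and_bound((2, 1), constraints)
```

Under these assumptions `result` is `(9.0, (3, 3))`, in agreement with the example. The sketch visits nodes in last-in, first-out order rather than by the smallest-objective rule, which is a simplification.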

As can be observed from the example, performance of the Branch-and-Bound
algorithm relies heavily on the choice of the noninteger variable to be used for branching,
and on the selection of the node to be branched. If a selected node and branching variable
lead to an upper bound close to the objective function of LP-1 early in the
enumeration scheme, then substantial computational savings can be obtained because
of the elimination of branches that would not be capable of generating solutions lower
than the upper bound. A rule of thumb for choosing the noninteger variable to be
branched is to take the variable with the largest fractional part. For the selection of the
node to be branched, we choose, among all the live nodes, the LP problem which has
the smallest value of the objective function; that node is most likely to generate a
feasible design with a tighter upper bound.

Branch-and-Bound is only one of the algorithms for the solution of ILP or MILP
problems. However, because of its simplicity it is incorporated into many commercially
available computer programs [23, 24]. There are a number of other techniques
which are capable of handling general discrete-valued problems (see, for example,
Ref. [25]). Some of these algorithms are good not only for ILP problems but also
for NLP problems with integer variables. In particular, methods based on probabilistic
search algorithms are emerging for many applications, including structural
design applications, that involve linear and nonlinear programming problems. Two
such techniques, namely simulated annealing and genetic algorithms, are discussed
in Chapter 4. Another approach, which is based on an extension of the penalty
function approach for constrained NLP problems, is presented in Chapter 5. Finally,
the use of dual variables (which are shown to be useful as prices of constraints in
Section 7.3) in ILP problems is discussed in Chapter 9.

One of the interesting design applications of ILP was introduced by Haftka
and Walsh [26] for the stacking sequence design of laminated composite plates for
improved buckling response. Since the formulation of this problem involves mate-
rial introduced in Chapter 11, discussion and demonstration of this application is
presented in that chapter.

3.10 Exercises

1. Estimate the limit load for the three bar truss example 3.1.2 using a graphical
approach. Verify your solution using the simplex method.

Figure 3.10.1 Platform support system

2. Consider the platform support system shown in Figure 3.10.1 in which cables 1
and 2 can support loads up to 400 lb each; cables 3 and 4 up to 150 lb each; and
cables 5 and 6 up to 75 lb each. Neglect the weight of the platforms and cables,
and assume the weights $w_1$, $w_2$, and $w_3$ at the positions indicated in the figure. Also
neglect the bending failure of the platforms. Using linear programming determine
the maximum total load that the system can support.
3. Solve the limit design problem for the truss of Figure 3.1.4 using the simplex
algorithm. Assume $A_{13} = A_{24} = A_{34}$, $A_{14} = A_{23}$, and use appropriate
nondimensionalization.
4. Using the method of virtual displacements verify that the collapse mechanisms for
the portal frame of Figure 3.1.6 lead to Eqs. (3.1.26) through (3.1.31) in terms of the
nondimensional variables $x_1$ and $x_2$.
5. The single bay, two story portal frame shown in Figure 3.10.2 is subjected
to a single loading condition consisting of 4 concentrated loads as shown. Following
Example 3.1.5 formulate the LP problem for the minimum weight design of the frame
against plastic collapse.
6. Consider the continuous prestressed concrete beam shown in Figure 3.10.3.
a) Verify that the equivalent uniformly distributed upward force exerted on the
concrete beam by a prestressing cable with a force $f$ and a parabolic profile defined
by eccentricities $y_1$, $y_2$, and $y_3$ at the three points $x = 0$, $x = l/2$, and $x = l$,
respectively, is given by

b) The beam in the figure is subjected to two loading conditions: the first consisting
of a dead load of 1 kip/ft together with an equivalent load due to a parabolic
prestressing cable with a force $f$, and the second due to an additional live load of 2.5
kips/ft in service. It is assumed, however, that in service a 15% loss of prestressing
force is to be expected. Formulate the LP problem for the minimum cost design
of the beam assuming $f$, $y_1$, and $y_2$ as design variables. Assume the allowable stresses
to lie between -3000 psi and 200 psi for the first loading condition and between
-2000 psi and 0 psi for the second, and the upper and lower bound limits on the
eccentricities $y_1$ and $y_2$ to be 0.4 ft $\le y_i \le$ 2.6 ft, $i = 1, 2$.
c) Solve the LP problem by the simplex algorithm and obtain the solution for the
minimum prestressing force and the tendon profile.

Figure 3.10.2 Two story portal frame

Figure 3.10.3 A continuous prestressed concrete beam
7. Consider the statically determinate truss of Figure 3.3.1 and its minimum weight
design formulation as described by Eqs. (3.3.9) through (3.3.13). Use the linearization
scheme implied by Eqs. (3.3.2) through (3.3.5) to formulate the LP problem for $m = 3$.
Solve the LP by the simplex algorithm and compare the approximate solution with
the graphical or an exact solution to the problem.
8. Use the Branch-and-Bound algorithm to solve the limit design problem of Exercise 3
by assuming the cross-sections of the members to take values from the following sets
a) {0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0}.
b) {0.0, 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1}.

3.11 References

[1] Charnes, A. and Greenberg, H. J., "Plastic Collapse and Linear Programming,"
Bull. Am. Math. Soc., 57, 480, 1951.
[2] Calladine, C.R., Engineering Plasticity. Pergamon Press, 1969.
[3] Cohn, M.Z., Ghosh, S.K. and Parimi, S.R., "Unified Approach to Theory of Plas-
tic Structures," Journal of the EM Division, 98 (EM5), pp. 1133-1158, 1972.
[4] Neal, B. G., The Plastic Methods of Structural Analysis, 3rd edition, Chapman
and Hall Ltd., London, 1977.
[5] Zeman, P. and Irvine, H. M., Plastic Design, An Imposed Hinge-Rotation Ap-
proach, Allen and Unwin, Boston, 1986.
[6] Massonet, C.E. and Save, M.A., Plastic Analysis and Design, Beams and Frames,
Vol. 1. Blaisdell Publishing Co., 1965.
[7] Lin, T.Y. and Burns, N.H., Design of Prestressed Concrete Structures, 3rd ed.
John Wiley and Sons, New York, 1981.
[8] Parme, A.L. and Paris, G.H., "Designing for Continuity in Prestressed Concrete
Structures," J. Am. Concr. Inst., 23 (1), pp. 45-64, 1951.
[9] Morris, D., "Prestressed Concrete Design by Linear Programming," J. Struct.
Div., 104 (ST3), pp. 439-452, 1978.
[10] Kirsch, U., "Optimum Design of Prestressed Beams," Computers and Structures
2, pp. 573-583, 1972.
[11] Luenberger, D. G., Introduction to Linear and Nonlinear Programming, Addison-
Wesley, Reading, Mass., 1973.
[12] Majid, K.I., Nonlinear Structures, London, Butterworths, 1972.
[13] Dantzig, G., Linear Programming and Extensions, Princeton University Press,
Princeton, NJ, 1963.
[14] Karmarkar, N., "A New Polynomial-Time Algorithm for Linear Programming,"
Combinatorica, 4 (4), pp. 373-395, 1984.
[15] Todd, M. J. and Burrell, B. P., "An Extension of Karmarkar's Algorithm for
Linear Programming Using Dual Variables," Algorithmica, 1, pp. 409-424, 1986.
[16] Rinaldi, G., "A Projective Method for Linear Programming with Box-type Con-
straints," Algorithmica, 1, pp. 517-527, 1986.
[17] Strang, G., "Karmarkar's Algorithm and its Place in Applied Mathematics," The
Mathematical Intelligencer, 9, 2, pp. 4-10, 1987.
[18] Vanderbei, R. F., Meketon, M. S., and Freedman, B. A., "A Modification of Kar-
markar's Linear Programming Algorithm," Algorithmica, 1, pp. 395-407, 1986.
[19] Garfinkel, R. S., and Nemhauser, G. L., Integer Programming, John Wiley &
Sons, Inc., New York, 1972.
[20] Lawler, E. L., and Wood, D. E., "Branch-and-Bound Methods: A Survey,"
Operations Research, 14, pp. 699-719, 1966.
[21] Tomlin, J. A., "Branch-and-Bound Methods for Integer and Non-convex Programming,"
in Integer and Nonlinear Programming, J. Abadie (ed.), pp. 437-450,
Elsevier Publishing Co., New York, 1970.
[22] Land, A. H., and Doig, A. G., "An Automatic Method for Solving Discrete Pro-
gramming Problems," Econometrica, 28, pp. 497-520, 1960.
[23] Johnson, E. L., and Powell, S., "Integer Programming Codes," in Design and
Implementation of Optimization Software, Greenberg, H. J. (ed.), pp. 225-240,
1978.
[24] Schrage, L., Linear, Integer, and Quadratic Programming with LINDO, 4th Edi-
tion, The Scientific Press, Redwood City CA., 1989.
[25] Kovacs, L. B., Combinatorial Methods of Discrete Programming, Mathematical
Methods of Operations Research Series, Vol. 2, Akadémiai Kiadó, Budapest, 1980.
[26] Haftka, R. T., and Walsh, J. L., "Stacking-sequence Optimization for Buckling
of Laminated Plates by Integer Programming," AIAA J. (in press).

Unconstrained Optimization 4

In this chapter we study mathematical programming techniques that are commonly
used to extremize nonlinear functions of single and multiple ($n$) design variables
subject to no constraints. Although most structural optimization problems involve
constraints that bound the design space, study of the methods of unconstrained
optimization is important for several reasons. First of all, if the design is at a stage
where no constraints are active then the process of determining a search direction and
travel distance for minimizing the objective function involves an unconstrained function
minimization algorithm. Of course in such a case one has to watch constantly
for constraint violations during the move in design space. Secondly, a constrained
optimization problem can be cast as an unconstrained minimization problem even if
the constraints are active. The penalty function and multiplier methods discussed in
Chapter 5 are examples of such indirect methods that transform the constrained
minimization problem into an equivalent unconstrained problem. Finally, unconstrained
minimization strategies are becoming increasingly popular as techniques suitable for
linear and nonlinear structural analysis problems (see Kamat and Hayduk [1]) which
involve solution of a system of linear or nonlinear equations. The solution of such
systems may be posed as finding the minimum of the potential energy of the system
or the minimum of the residuals of the equations in a least-squares sense.

4.1 Minimization of Functions of One Variable

In most structural design problems the objective is to minimize a function with
many design variables, but the study of minimization of functions of a single design
variable is important for several reasons. First, some of the theoretical and
numerical aspects of minimization of functions of $n$ variables can be best illustrated,
especially graphically, in a one dimensional space. Secondly, most methods for
unconstrained minimization of functions $f(x)$ of $n$ variables rely on sequential
one-dimensional minimization of the function along a set of prescribed directions, $s_k$, in
the multi-dimensional design space $R^n$. That is, for a given design point $x_0$ and a
specified search direction at that point $s_0$, all points located along that direction can
be expressed in terms of a single variable $\alpha$ by
$$ x = x_0 + \alpha s_0 , \tag{4.1.1} $$
where $\alpha$ is usually referred to as the step length. The function $f(x)$ to be minimized
can, therefore, be expressed as
$$ f(x) = f(x_0 + \alpha s_0) = f(\alpha) . \tag{4.1.2} $$
Thus, the minimization problem reduces to finding the value $\alpha^*$ that minimizes the
function $f(\alpha)$. In fact, one of the simplest methods used in minimizing functions
of $n$ variables is to seek the minimum of the objective function by changing only
one variable at a time, while keeping all other variables fixed, and performing a
one-dimensional minimization along each of the coordinate directions of an $n$-dimensional
design space. This procedure is called the univariate search technique.
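The reduction of Eqs. (4.1.1) and (4.1.2) can be sketched in Python as follows. The helper name and the sample function are hypothetical; the point is only that a function of $n$ variables, a point $x_0$, and a direction $s_0$ together define a function of the single variable $\alpha$.

```python
def restrict_to_line(f, x0, s0):
    """Return phi(alpha) = f(x0 + alpha * s0), a function of one variable."""
    def phi(alpha):
        return f([xi + alpha * si for xi, si in zip(x0, s0)])
    return phi


# Hypothetical two-variable function used only for illustration.
def f(x):
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2


# Line through the origin along the first coordinate direction,
# as used in one step of a univariate search.
phi = restrict_to_line(f, [0.0, 0.0], [1.0, 0.0])
```

Here `phi(0.0)` is 1.5 and `phi(1.0)` is 0.5; any one-dimensional minimizer can now be applied to `phi` without knowing that it originated from a multivariable problem.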
In classifying the minimization algorithms for both the one-dimensional and
multi-dimensional problems we generally use three distinct categories. These cat-
egories are the zeroth, first, and second order methods. Zeroth order methods use
only the value of the function during the minimization process. First order methods
employ values of the function and its first derivatives with respect to the variables.
Finally, second order methods use the values of the function and its first and second
derivatives. In the following discussion of one-variable function minimizations,
the function is assumed to be in the form $f = f(\alpha)$. However, the methods to be
discussed are equally applicable for minimization of multivariable problems along a
preselected direction, $s$, using Eq. (4.1.1).

4.1.1 Zeroth Order Methods

Bracketing Method. As the name suggests, this method brackets the minimum of the
function to be minimized between two points, through a series of function evaluations.
The method begins with an initial point $\alpha_0$, a function value $f(\alpha_0)$, a step size $\beta_0$,
and a step expansion parameter $\gamma > 1$. The steps of the algorithm [2] are outlined as
1. Evaluate $f(\alpha_0)$ and $f(\alpha_0 + \beta_0)$.
2. If $f(\alpha_0 + \beta_0) < f(\alpha_0)$, let $\alpha_1 = \alpha_0 + \beta_0$ and $\beta_1 = \gamma \beta_0$, and evaluate
$f(\alpha_1 + \beta_1)$. Otherwise go to step 4.
3. If $f(\alpha_1 + \beta_1) < f(\alpha_1)$, let $\alpha_2 = \alpha_1 + \beta_1$ and $\beta_2 = \gamma \beta_1$, and continue
incrementing the subscripts this way until $f(\alpha_k + \beta_k) > f(\alpha_k)$. Then, go to step 8.
4. Let $\alpha_1 = \alpha_0$ and $\beta_1 = -\xi \beta_0$, where $\xi$ is a constant that satisfies $0 < \xi < 1/\gamma$,
and evaluate $f(\alpha_1 + \beta_1)$.
5. If $f(\alpha_1 + \beta_1) > f(\alpha_1)$ go to step 7.
6. Let $\alpha_2 = \alpha_1 + \beta_1$ and $\beta_2 = \gamma \beta_1$, and continue incrementing the subscripts
this way until $f(\alpha_k + \beta_k) > f(\alpha_k)$. Then, go to step 8.
7. The minimum has been bracketed between points $(\alpha_0 - \xi \beta_0)$ and $(\alpha_0 + \beta_0)$.
Go to step 9.
8. The last three points satisfy the relations $f(\alpha_{k-2}) > f(\alpha_{k-1})$ and $f(\alpha_{k-1}) < f(\alpha_k)$,
and hence, the minimum is bracketed.
9. Use either one of the two end points of the bracket as the initial point. Begin
with a reduced step size and repeat steps 1 through 8 to locate the minimum to a
desired degree of accuracy.
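Steps 1 through 8 above can be sketched in Python as shown below. This is a simplified illustration, not the implementation of Ref. [2]: the function name, the default values $\gamma = 2$ and $\xi = 0.25$, and the iteration cap are assumptions, and the refinement of step 9 is left out.

```python
def bracket_minimum(f, alpha0, beta0, gamma=2.0, xi=0.25, max_steps=50):
    """Return (lower, upper) end points of an interval bracketing a minimum of f.

    gamma > 1 expands the step; xi must satisfy 0 < xi < 1/gamma.
    """
    f0 = f(alpha0)
    if f(alpha0 + beta0) < f0:
        # Steps 2-3: the function decreases in the forward direction.
        prev, cur, step = alpha0, alpha0 + beta0, gamma * beta0
    else:
        # Step 4: try a smaller step in the opposite direction.
        back = -xi * beta0
        if f(alpha0 + back) > f0:
            # Step 7: minimum lies between alpha0 - xi*beta0 and alpha0 + beta0.
            return tuple(sorted((alpha0 + back, alpha0 + beta0)))
        # Step 6: keep marching in the backward direction.
        prev, cur, step = alpha0, alpha0 + back, gamma * back
    for _ in range(max_steps):
        nxt = cur + step
        if f(nxt) > f(cur):
            # Step 8: f(prev) > f(cur) < f(nxt), so the minimum is bracketed.
            return tuple(sorted((prev, nxt)))
        prev, cur, step = cur, nxt, gamma * step
    raise RuntimeError("no bracket found; the function may be unbounded below")


# Illustration with f(alpha) = alpha (alpha - 3), whose minimum is at 1.5.
bracket = bracket_minimum(lambda a: a * (a - 3.0), 0.0, 0.5)
```

For this illustration the marching stops after two expansions and `bracket` is `(0.5, 3.5)`, which contains the minimizer 1.5.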
Quadratic Interpolation. The method known as quadratic interpolation was first
proposed by Powell [3] and uses the values of the function $f$ to be minimized at three
points to fit a parabola
$$ p(\alpha) = a + b\alpha + c\alpha^2 , \tag{4.1.3} $$
through those points. The method starts with an initial point, say, $\alpha = 0$ with
a function value $p_0 = f(x_0)$, and a step size $\beta$. Two more function evaluations
are performed as described in the following steps to determine the points for the
polynomial fit. In general, however, we start with a situation where we have already
bracketed the minimum between $\alpha_1 = \alpha_l$ and $\alpha_2 = \alpha_u$ by using the bracketing
method described earlier. In that case we will only need an intermediate point $\alpha_0$ in
the interval $(\alpha_l, \alpha_u)$.
1. Evaluate $p_1 = p(\beta) = f(x_0 + \beta s)$.
2. If $p_1 < p_0$, then evaluate $p_2 = p(2\beta) = f(x_0 + 2\beta s)$. Otherwise evaluate
$p_2 = p(-\beta) = f(x_0 - \beta s)$. The constants $a$, $b$, and $c$ in Eq. (4.1.3) can now
be uniquely expressed in terms of the function values $p_0$, $p_1$, and $p_2$ as
$$ a = p_0 , $$
$$ b = \frac{4p_1 - 3p_0 - p_2}{2\beta} \quad \text{and} \quad c = \frac{p_2 + p_0 - 2p_1}{2\beta^2} , \quad \text{if } p_2 = f(x_0 + 2\beta s) , \tag{4.1.4} $$
or
$$ b = \frac{p_1 - p_2}{2\beta} \quad \text{and} \quad c = \frac{p_1 - 2p_0 + p_2}{2\beta^2} , \quad \text{if } p_2 = f(x_0 - \beta s) . \tag{4.1.5} $$
3. The value of $\alpha = \alpha^*$ at which $p(\alpha)$ is extremized for the current cycle is then
given by
$$ \alpha^* = -\frac{b}{2c} . \tag{4.1.6} $$
4. $\alpha^*$ corresponds to a minimum of $p$ if $c > 0$, and the prediction based on
Eq. (4.1.3) is repeated using $(x_0 + \alpha^* s)$ as the initial point for the next cycle with
$p_0 = f(x_0 + \alpha^* s)$ until the desired accuracy is obtained.
5. If the point $\alpha = \alpha^*$ corresponds to a maximum of $p$ rather than a minimum, or
if it corresponds to a minimum of $p$ which is at a distance greater than a prescribed
maximum $\beta_{max}$ (possibly meaning $\alpha^*$ is outside the bracket points), then the maximum
allowed step is taken in the direction of decreasing $f$ and the point furthest
away from this new point is discarded in order to repeat the process.
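One cycle of the quadratic fit, following Eqs. (4.1.4) through (4.1.6), might be sketched in Python as below. The function name is hypothetical, the function is assumed to be already expressed in terms of $\alpha$, and the safeguards of step 5 are reduced to a simple error when the fitted parabola has no minimum.

```python
def quadratic_step(phi, beta):
    """Return alpha* = -b / (2c) for the parabola through three points of phi.

    phi is the one-variable function f(x0 + alpha * s); beta is the step size.
    """
    p0 = phi(0.0)
    p1 = phi(beta)
    if p1 < p0:
        p2 = phi(2.0 * beta)                     # third point at alpha = 2 beta
        b = (4.0 * p1 - 3.0 * p0 - p2) / (2.0 * beta)
        c = (p2 + p0 - 2.0 * p1) / (2.0 * beta**2)
    else:
        p2 = phi(-beta)                          # third point at alpha = -beta
        b = (p1 - p2) / (2.0 * beta)
        c = (p1 - 2.0 * p0 + p2) / (2.0 * beta**2)
    if c <= 0.0:
        raise ValueError("fitted parabola has no minimum (c <= 0)")
    return -b / (2.0 * c)


# For a function that is itself quadratic the fit is exact in one cycle.
alpha_star = quadratic_step(lambda a: a * (a - 3.0), 0.5)
```

In this illustration the fitted coefficients are $b = -3$ and $c = 1$, so `alpha_star` equals 1.5, the exact minimizer of $\alpha(\alpha - 3)$.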
In step 4, instead of starting with $(x_0 + \alpha^* s)$ as the initial point and repeating
the previous steps, there is a cheaper alternative in terms of the number of function
evaluations. The point $(x_0 + \alpha^* s)$ and the two points closest to it from the left and
right can be used in another quadratic interpolation to give a better value of $\alpha^*$.
Other strategies for improving the accuracy of the prediction will be discussed later
in Section 4.1.4.
Fibonacci and the Golden Section Search. Like bracketing, the Fibonacci and
the golden section search techniques are very reliable, if not the most efficient, line
search techniques for locating the unconstrained minimum of a function $f(\alpha)$ within
the interval $a_0 \le \alpha \le b_0$. It is assumed that the function $f$ is unimodal, or that it
has only one minimum within the interval. Unimodal functions are not necessarily
continuous or differentiable, nor convex (see Figure 4.1.1). A function is said to be
unimodal [3] in the interval $I_0$ if there exists an $\alpha^* \in I_0$ such that $\alpha^*$ minimizes $f$ on
$I_0$, and for any two points $\alpha_1, \alpha_2 \in I_0$ such that $\alpha_1 < \alpha_2$ we have
$$ \alpha_2 \le \alpha^* \quad \text{implies that} \quad f(\alpha_1) > f(\alpha_2) , \tag{4.1.7} $$
$$ \alpha_1 \ge \alpha^* \quad \text{implies that} \quad f(\alpha_2) > f(\alpha_1) . \tag{4.1.8} $$

Figure 4.1.1 A typical unimodal function.

The assumption of unimodality is central to the Fibonacci search technique which
seeks to reduce the interval of uncertainty within which the minimum of the function
$f$ lies.
The underlying idea behind the Fibonacci and the golden section search techniques
can be explained as follows. Consider the minimization of $f$ in the interval
$(a_0, b_0)$. Let us choose two points in the interval $(a_0, b_0)$ at $\alpha = \alpha_1$ and at $\alpha = \alpha_2$
such that $\alpha_1 < \alpha_2$, and evaluate the function $f$ at these two points. If $f(\alpha_1) > f(\alpha_2)$,
then since the function is unimodal the minimum cannot lie in the interval $(a_0, \alpha_1)$.
The new interval is $(\alpha_1, b_0)$, which is smaller than the original interval. Similarly, if
$f(\alpha_2) > f(\alpha_1)$, then the new interval will be $(a_0, \alpha_2)$. The process can be repeated to
reduce the interval to any desired level of accuracy. Only one function evaluation is
required in each iteration after the first one, but we have not specified how to choose
the locations where $f$ is evaluated. The best placement of these points will minimize
the number of function evaluations for a prescribed accuracy requirement (i.e.,
reduction of the interval of uncertainty to a prescribed size). If the number of function
evaluations is $n$, the most efficient process is provided by a symmetric placement of
the points provided by the relations [4]
$$ \alpha_1 = a_0 + \frac{f_{n-1}}{f_{n+1}}\, l_0 , \tag{4.1.9} $$
$$ \alpha_2 = b_0 - \frac{f_{n-1}}{f_{n+1}}\, l_0 , \tag{4.1.10} $$
and
$$ \alpha_{k+1} = a_k + \frac{f_{n-(k+1)}}{f_{n-(k-1)}}\, l_k \quad \text{or} \quad \alpha_{k+1} = b_k - \frac{f_{n-(k+1)}}{f_{n-(k-1)}}\, l_k , \tag{4.1.11} $$
where $f_n$ are Fibonacci numbers defined by the sequence $f_0 = 1$, $f_1 = 1$, $f_n =
f_{n-2} + f_{n-1}$, and $l_k$ is the length of the $k$th interval $(a_k, b_k)$. The total number of
required function evaluations $n$ may be determined from the desired level of accuracy.
It can be shown that the interval of uncertainty after $n$ function evaluations is $2\epsilon l_0$
where
$$ \epsilon = \frac{1}{f_{n+1}} . \tag{4.1.12} $$

A disadvantage of the technique is that the number of function evaluations has
to be specified in advance in order to start the Fibonacci search. To eliminate this
undesirable feature a quasi-optimal technique known as the golden section search
technique has been developed. The golden section search technique is based on the
finding that for sufficiently large $n$, the ratio
$$ \frac{f_{n-1}}{f_{n+1}} \rightarrow 0.382 . \tag{4.1.13} $$
Thus, it is possible to approximate the optimal location of the points given by Eqs.
(4.1.9) through (4.1.11) by the following relations
$$ \alpha_1 = a_0 + 0.382\, l_0 , \tag{4.1.14} $$
$$ \alpha_2 = b_0 - 0.382\, l_0 , \tag{4.1.15} $$
and
$$ \alpha_{k+1} = a_k + 0.382\, l_k \quad \text{or} \quad \alpha_{k+1} = b_k - 0.382\, l_k . \tag{4.1.16} $$

Example 4.1.1

Determine the value of $\alpha$, to within $\epsilon = \pm 0.1$, that minimizes the function $f(\alpha) =
\alpha(\alpha - 3)$ on the interval $0 \le \alpha \le 2$ using the golden section search technique.
From Eqs. (4.1.14) and (4.1.15) we can calculate
$\alpha_1 = 0 + 0.382(2) = 0.764$, $f(\alpha_1) = -1.708$,
$\alpha_2 = 2 - 0.382(2) = 1.236$, $f(\alpha_2) = -2.180$.
Since $f(\alpha_2) < f(\alpha_1)$ we retain $(\alpha_1, 2)$. Thus, the next point is located at
$\alpha_3 = 2 - 0.382(2 - 0.764) = 1.5278$, $f(\alpha_3) = -2.249$.
Since $f(\alpha_3) < f(\alpha_2)$ we reject the interval $(\alpha_1, \alpha_2)$. The new interval is $(\alpha_2, 2)$. The
next point is located at
$\alpha_4 = 2 - 0.382(2 - 1.236) = 1.7082$, $f(\alpha_4) = -2.207$.
Figure 4.1.2 Iteration history for the function minimization $f(\alpha) = \alpha(\alpha - 3)$.

Since $f(\alpha_4) < f(\alpha_2) < f(2)$ we reject the interval $(\alpha_4, 2)$ and retain $(\alpha_2, \alpha_4)$ as
the next interval and locate the point $\alpha_5$ at
$\alpha_5 = 1.236 + 0.382(1.7082 - 1.236) = 1.4164$, $f(\alpha_5) = -2.243$.
Since $f(\alpha_5) < f(\alpha_4) < f(\alpha_2)$ we retain the interval $(\alpha_5, \alpha_4)$. The next point is
located at
$\alpha_6 = 1.7082 - 0.382(1.7082 - 1.4164) = 1.5967$, $f(\alpha_6) = -2.241$.
Since $f(\alpha_6) < f(\alpha_4)$ we reject the interval $(\alpha_6, \alpha_4)$ and retain the interval $(\alpha_5, \alpha_6)$
of length 0.18, which is less than the interval of specified accuracy, $2\epsilon = 0.2$. The
iteration history for the problem is shown in Figure 4.1.2. Hence, the minimum has
been bracketed to within a resolution of $\pm 0.1$. That is, the minimum lies between
$\alpha_5 = 1.4164$ and $\alpha_6 = 1.5967$. We can take the middle of the interval, $\alpha = 1.5066 \pm
0.0902$, as the solution. The exact location of the minimum is at $\alpha = 1.5$ where the
function has the value $-2.25$. •••
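
The interval reduction of this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `golden_section` and its arguments are hypothetical, the constant 0.382 is the rounded ratio of Eq. (4.1.13), and the loop stops once the bracket is no longer than $2\epsilon$.

```python
def golden_section(f, a, b, eps):
    """Shrink the bracket (a, b) until its length is at most 2*eps."""
    r = 0.382                      # rounded limit of the Fibonacci ratio
    x1 = a + r * (b - a)           # interior points, Eqs. (4.1.14)-(4.1.15)
    x2 = b - r * (b - a)
    f1, f2 = f(x1), f(x2)
    while (b - a) > 2.0 * eps:
        if f1 > f2:
            # minimum cannot lie in (a, x1): keep (x1, b), reuse x2 as new x1
            a, x1, f1 = x1, x2, f2
            x2 = b - r * (b - a)
            f2 = f(x2)
        else:
            # minimum cannot lie in (x2, b): keep (a, x2), reuse x1 as new x2
            b, x2, f2 = x2, x1, f1
            x1 = a + r * (b - a)
            f1 = f(x1)
    return a, b


lo, hi = golden_section(lambda t: t * (t - 3.0), 0.0, 2.0, 0.1)
print(round(lo, 4), round(hi, 4))
```

For the function of Example 4.1.1 the sketch ends with the same bracket as the hand calculation, approximately (1.4164, 1.5967). Only one new function evaluation is made per pass through the loop.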


4.1.2 First Order Methods

Bisection Method. Like the bracketing and the golden section search techniques,
which progressively reduce the interval where the minimum is known to lie, the
bisection technique locates the zero of the derivative $f'$ by reducing the interval of
uncertainty. Beginning with the known interval $(a, b)$ for which $f'(a)f'(b) < 0$, an
approximation to the root of $f'$ is obtained from
$$\alpha^* = \frac{a+b}{2} , \tag{4.1.17}$$

which is the point midway between $a$ and $b$. The value of $f'$ is then evaluated at
$\alpha^*$. If $f'(\alpha^*)$ agrees in sign with $f'(a)$ then the point $a$ is replaced by $\alpha^*$ and the new
interval of uncertainty is given by $(\alpha^*, b)$. If on the other hand $f'(\alpha^*)$ agrees in sign
with $f'(b)$ then the point $b$ is replaced by $\alpha^*$ and the new interval of uncertainty is
$(a, \alpha^*)$. The process is then repeated using Eq. (4.1.17).
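
A minimal sketch of this bisection on the derivative follows. The name `bisect_derivative` and the tolerance argument are illustrative choices, not part of the text, and the derivative is assumed to be available as a callable.

```python
def bisect_derivative(df, a, b, tol=1e-8):
    """Locate a zero of df in (a, b), assuming df(a) * df(b) < 0."""
    if df(a) * df(b) >= 0.0:
        raise ValueError("df must change sign over (a, b)")
    while (b - a) > tol:
        mid = 0.5 * (a + b)            # Eq. (4.1.17)
        d_mid = df(mid)
        if d_mid == 0.0:
            return mid
        if d_mid * df(a) > 0.0:        # same sign as df(a): replace a
            a = mid
        else:                          # same sign as df(b): replace b
            b = mid
    return 0.5 * (a + b)


# f(alpha) = alpha * (alpha - 3) has derivative 2 * alpha - 3
alpha_star = bisect_derivative(lambda t: 2.0 * t - 3.0, 0.0, 2.0)
```

Each pass halves the interval of uncertainty, so about $\log_2[(b-a)/\mathrm{tol}]$ derivative evaluations are needed.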
Davidon's Cubic Interpolation Method. This is a polynomial approximation
method which uses both the function values and its derivatives for locating its
minimum. It is especially useful in those multivariable minimization techniques which
require the evaluation of the function and its gradients.
We begin by assuming the function to be minimized, $f(\mathbf{x}_0 + \alpha\mathbf{s})$, to be
approximated by a polynomial in the form
$$p(\alpha) = a + b\alpha + c\alpha^2 + d\alpha^3 , \tag{4.1.18}$$
with constants $a$, $b$, $c$, and $d$ to be determined from the values of the function,
$p_0$ and $p_1$, and its derivatives, $g_0$ and $g_1$, at two points, one located at $\alpha = 0$ and
the other at $\alpha = \beta$:
$$p_0 = p(0) = f(\mathbf{x}_0), \qquad p_1 = p(\beta) = f(\mathbf{x}_0 + \beta\mathbf{s}) , \tag{4.1.19}$$
and
$$g_0 = \frac{dp}{d\alpha}(0) = \mathbf{s}^T \nabla f(\mathbf{x}_0), \qquad g_1 = \frac{dp}{d\alpha}(\beta) = \mathbf{s}^T \nabla f(\mathbf{x}_0 + \beta\mathbf{s}) . \tag{4.1.20}$$
After substitutions, Eq. (4.1.18) takes the following form

$$p(\alpha) = p_0 + g_0\alpha - \frac{g_0 + e}{\beta}\,\alpha^2 + \frac{g_0 + g_1 + 2e}{3\beta^2}\,\alpha^3 , \tag{4.1.21}$$

where
$$e = \frac{3}{\beta}(p_0 - p_1) + g_0 + g_1 . \tag{4.1.22}$$

We can now locate the minimum, $\alpha = \alpha_m$, of Eq. (4.1.21) by setting its derivative
with respect to $\alpha$ to be zero. This results in

$$\alpha_m = \beta \left( \frac{g_0 + e \pm h}{g_0 + g_1 + 2e} \right) , \tag{4.1.23}$$
where
$$h = (e^2 - g_0 g_1)^{1/2} . \tag{4.1.24}$$
It can be easily verified, by checking $d^2p/d\alpha^2$, that the positive sign must be retained
in Eq. (4.1.23) for $\alpha_m$ to be a minimum rather than a maximum. Thus, the algorithm
for Davidon's cubic interpolation [5] may be summarized as follows.
1. Evaluate $p_0 = f(\mathbf{x}_0)$ and $g_0 = \mathbf{s}^T \nabla f(\mathbf{x}_0)$ and make sure that $g_0 < 0$.
2. In the absence of an estimate of the initial step length $\beta$, we may calculate it
on the basis of a quadratic interpolation derived using $p_0$, $g_0$ and an estimate of $p_{min}$.
Thus,
$$\beta = \frac{2(p_{min} - p_0)}{g_0} . \tag{4.1.25}$$

3. Evaluate $p_1 = f(\mathbf{x}_0 + \beta\mathbf{s})$ and $g_1 = df(\mathbf{x}_0 + \beta\mathbf{s})/d\alpha$.
4. If $g_1 > 0$ or if $p_1 > p_0$ go to step 6, or else go to step 5.
5. Replace $\beta$ by $2\beta$ and go to step 3.
6. Calculate $\alpha_m$ using Eq. (4.1.23) with a positive sign.
7. Use the interval $(0, \alpha_m)$ if

(4.1.26)

or else use the interval $(\alpha_m, \beta)$ and return to step 4.
8. If $\alpha_m$ corresponds to a maximum, restart the algorithm by using new points.
Selection of the new points may be performed by using a strategy similar to that
described for the quadratic interpolation technique.
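
The interpolation step (step 6) can be sketched as follows. The helper name `cubic_minimizer` is hypothetical. The quantities `e` and `h` are obtained here by fitting the cubic of Eq. (4.1.18) to $p_0$, $g_0$, $p_1$, $g_1$, which gives $e = 3(p_0 - p_1)/\beta + g_0 + g_1$ and $h = (e^2 - g_0 g_1)^{1/2}$. The fallback for a vanishing cubic coefficient is an added safeguard of this sketch, not part of the algorithm as stated.

```python
import math


def cubic_minimizer(p0, g0, p1, g1, beta):
    """Minimizer of the cubic matching p, dp/dalpha at alpha = 0 and alpha = beta."""
    e = 3.0 * (p0 - p1) / beta + g0 + g1
    disc = e * e - g0 * g1
    if disc < 0.0:
        raise ValueError("interpolating cubic has no stationary point")
    h = math.sqrt(disc)
    denom = g0 + g1 + 2.0 * e
    if abs(denom) < 1e-14:
        # Cubic term vanishes (data come from a quadratic):
        # minimize p0 + g0*alpha - (g0 + e)/beta * alpha**2 instead.
        return beta * g0 / (2.0 * (g0 + e))
    return beta * (g0 + e + h) / denom      # Eq. (4.1.23), positive sign


# f(alpha) = alpha**3 - 3*alpha: data at alpha = 0 and alpha = beta = 2
am_cubic = cubic_minimizer(0.0, -3.0, 2.0, 9.0, 2.0)
# f(alpha) = alpha*(alpha - 3): the cubic coefficient is zero
am_quad = cubic_minimizer(0.0, -3.0, -2.0, 1.0, 2.0)
```

Because the first test function is itself a cubic, the interpolation returns its exact minimizer $\alpha = 1$. The second call exercises the fallback and returns $\alpha = 1.5$.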

4.1.3 Second Order Method

The problem of minimizing the function $f(\alpha)$ is equivalent to obtaining the root of
the nonlinear equation
$$f'(\alpha) = 0 , \tag{4.1.27}$$
because this is the necessary condition for the extremum of $f$. A convenient method
for solving (4.1.27) is Newton's method. This method consists of linearizing $f'(\alpha)$
about a point $\alpha = \alpha_i$ and then determining the point $\alpha_{i+1}$ at which the linear
approximation
$$f'(\alpha_{i+1}) \approx f'(\alpha_i) + f''(\alpha_i)(\alpha_{i+1} - \alpha_i) \tag{4.1.28}$$
vanishes. This point
$$\alpha_{i+1} = \alpha_i - \frac{f'(\alpha_i)}{f''(\alpha_i)} , \tag{4.1.29}$$


serves as a new approximation for a repeated application of Eq. (4.1.29) with $i$
replaced by $i + 1$. For a successful convergence to the minimum it is necessary that the
second derivative of the function $f$ be greater than zero. Even so the method may
diverge depending on the starting point. Several strategies exist [6] which modify
Newton's method to make it globally convergent (that is, it will converge to a
minimum regardless of the starting point) for multivariable functions; some of these will
be covered in the next section.
The reason this method is known as a second order method is not only because
it uses second derivative information about the function $f$, but also because it has
a rate of convergence to the minimum that is quadratic. In other words, Newton's
algorithm converges to the minimum $\alpha^*$ such that

$$\lim_{i \to \infty} \frac{|\alpha_{i+1} - \alpha^*|}{(\alpha_i - \alpha^*)^2} = \beta , \tag{4.1.30}$$

where $\alpha_i$ and $\alpha_{i+1}$ are the $i$th and the $(i + 1)$st estimates of the location of the
minimum $\alpha^*$, and $\beta$ is a non-zero constant.
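
The iteration of Eq. (4.1.29) and its quadratic convergence can be illustrated with a short sketch. The function name `newton_minimize` and the test function $f(\alpha) = \alpha - \ln\alpha$ (minimum at $\alpha^* = 1$) are choices made for this illustration only.

```python
def newton_minimize(d1, d2, alpha, tol=1e-12, max_iter=50):
    """Newton iteration on f'(alpha) = 0; d1 and d2 are f' and f''."""
    history = [alpha]
    for _ in range(max_iter):
        curvature = d2(alpha)
        if curvature <= 0.0:
            raise ValueError("second derivative must be positive")
        step = d1(alpha) / curvature
        alpha -= step                      # Eq. (4.1.29)
        history.append(alpha)
        if abs(step) < tol:
            break
    return alpha, history


# f(alpha) = alpha - ln(alpha): f' = 1 - 1/alpha, f'' = 1/alpha**2
alpha_star, hist = newton_minimize(lambda a: 1.0 - 1.0 / a,
                                   lambda a: 1.0 / (a * a), 0.5)
# ratio |alpha_{i+1} - alpha*| / (alpha_i - alpha*)**2 from Eq. (4.1.30)
ratio = abs(hist[2] - 1.0) / (hist[1] - 1.0) ** 2
```

For this test function the iteration reduces to $\alpha_{i+1} = 2\alpha_i - \alpha_i^2$, so the error is squared at every step and the constant $\beta$ of Eq. (4.1.30) equals one.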

4.1.4 Safeguarded Polynomial Interpolation [7], p. 92

Polynomial interpolations such as the quadratic interpolation and Davidon's
cubic interpolation are sometimes found to be quite inefficient and unreliable for
locating the minimum of a function along a line. If the interpolation function is not
representative of the behavior of the function to be minimized within the interval
of uncertainty, the minimum may fall outside the interval, or become unbounded
below, or the successive iterations may be too close to one another without achieving
a significant improvement in the function value. In such cases, we use what are
known as safeguarded procedures. These procedures consist of combining polynomial
interpolations with a simple bisection technique or the golden section search technique
described earlier. At the end of the polynomial interpolation, the bisection technique
would be used to find the zero of the derivative of the function f. The golden
section search, on the other hand, would work with the function f itself using the
known interval of uncertainty (a, b) and locate the point a* which corresponds to the
minimum of f within the interval.

4.2 Minimization of Functions of Several Variables

4.2.1 Zeroth Order Methods

Several methods exist for minimizing a function of several variables using only func-
tion values. However, only two of these methods may be regarded as being useful.
These are the sequential simplex method of Spendley, Hext and Himsworth [8] and
Powell's conjugate direction method [3]. Both of these methods require that the
function $f(\mathbf{x})$, $\mathbf{x} \in R^n$, be unimodal; that is, the function $f$ has only one minimum.
The sequential simplex does not require that the function $f$ be differentiable, while
the differentiability requirement on $f$ is implicit in the exact line searches of Powell's
method. It appears from tests by Nelder and Mead [9] that for most problems the
performance of the sequential simplex method is comparable to if not better than
Powell's method. Both of these methods are considered inefficient for $n \ge 10$;
Powell's method may fail to converge for $n \ge 30$. A more recent modification of the
simplex method by Chen, et al. [10] extends the applicability of this algorithm for
high dimensional cases. If the function is differentiable, it is usually more efficient
to use the more powerful first and second order methods with derivatives obtained
explicitly or from finite difference formulae.
Sequential Simplex Method. The sequential simplex method was originally
proposed by Spendley, Hext and Himsworth [8] and was subsequently improved by Nelder
and Mead [9]. The method begins with a regular geometric figure called the simplex
consisting of $n + 1$ vertices in an $n$-dimensional space. These vertices may be defined
by the origin and by points along each of the $n$ coordinate directions. Such a simplex
may not be geometrically regular. The following equations are suggested in Ref. 8
for the calculation of the positions of the vertices of a regular simplex of size $a$ in the
$n$-dimensional design space

$$\mathbf{x}_j = \mathbf{x}_0 + p\,\mathbf{e}_j + \sum_{\substack{k=1 \\ k \ne j}}^{n} q\,\mathbf{e}_k , \qquad j = 1, \ldots, n , \tag{4.2.1}$$

with
$$p = \frac{a}{n\sqrt{2}} \left( \sqrt{n+1} + n - 1 \right) , \quad \text{and} \quad q = \frac{a}{n\sqrt{2}} \left( \sqrt{n+1} - 1 \right) , \tag{4.2.2}$$
where $\mathbf{e}_k$ is the unit base vector along the $k$th coordinate direction, and $\mathbf{x}_0$ is the
initial base point. For example, for a problem in two-dimensional design space Eqs.
(4.2.1) and (4.2.2) lead to an equilateral triangle of side $a$.
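
As a quick check of Eqs. (4.2.1) and (4.2.2), the following sketch builds the vertices and measures every edge. It takes $p = a(\sqrt{n+1} + n - 1)/(n\sqrt{2})$ and $q = a(\sqrt{n+1} - 1)/(n\sqrt{2})$; the function name `regular_simplex` is illustrative.

```python
import math


def regular_simplex(x0, a):
    """Vertices x_0, ..., x_n of a regular simplex of edge length a."""
    n = len(x0)
    root = math.sqrt(n + 1.0)
    p = a * (root + n - 1.0) / (n * math.sqrt(2.0))
    q = a * (root - 1.0) / (n * math.sqrt(2.0))
    vertices = [list(x0)]
    for j in range(n):
        # p along e_j, q along every other coordinate direction
        vertices.append([x0[k] + (p if k == j else q) for k in range(n)])
    return vertices


verts = regular_simplex([0.0, 0.0, 0.0], 2.0)
edge_lengths = [
    math.sqrt(sum((u[k] - v[k]) ** 2 for k in range(3)))
    for i, u in enumerate(verts) for v in verts[i + 1:]
]
```

All six edges of the resulting tetrahedron have length 2, which confirms that the construction is regular for $n = 3$.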
Once the simplex is defined, the function $f$ is evaluated at each of the $n + 1$ vertices
$\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_n$. Let $\mathbf{x}_h$ and $\mathbf{x}_l$ denote the vertices where the function $f$ assumes its
maximum and minimum values, respectively, and $\mathbf{x}_s$ the vertex where it assumes the
second highest value. The simplex method discards the vertex $\mathbf{x}_h$ and replaces it
by a point where $f$ has a lower value. This is achieved by three operations, namely
reflection, contraction, and expansion.
The reflection operation creates a new point $\mathbf{x}_r$ along the line joining $\mathbf{x}_h$ to the
centroid $\bar{\mathbf{x}}$ of the remaining points defined as
$$\bar{\mathbf{x}} = \frac{1}{n} \sum_{i=0}^{n} \mathbf{x}_i , \qquad i \ne h . \tag{4.2.3}$$

The vertex at the end of the reflection is calculated by

$$\mathbf{x}_r = \bar{\mathbf{x}} + \alpha(\bar{\mathbf{x}} - \mathbf{x}_h) , \tag{4.2.4}$$

with $\alpha$ being a positive constant called the reflection coefficient, which is usually
assumed to be unity. Any positive value of the reflection coefficient in Eq. (4.2.4)
guarantees that $\mathbf{x}_r$ is on the other side of $\bar{\mathbf{x}}$ from $\mathbf{x}_h$. If the value of the function
at this new point, $f_r = f(\mathbf{x}_r)$, satisfies the condition $f_l < f_r \le f_s$, then $\mathbf{x}_h$ is replaced
by $\mathbf{x}_r$ and the process is repeated with this new simplex. If, on the other hand, the
value of the function $f_r$ at the end of the reflection is less than the lowest value of the
function $f_l = f(\mathbf{x}_l)$, then there is a possibility that we can still decrease the function
by going further along the same direction. We seek an improved point $\mathbf{x}_e$ by the
expansion technique using the relation

$$\mathbf{x}_e = \bar{\mathbf{x}} + \beta(\mathbf{x}_r - \bar{\mathbf{x}}) , \tag{4.2.5}$$

with the expansion coefficient $\beta$ often being chosen to be 2. If the value of the function
$f_e$ is smaller than the value at the end of the reflection step, then we replace $\mathbf{x}_h$ by
$\mathbf{x}_e$ and repeat the process with the new simplex. However, if the expansion leads to a
function value equal to or larger than $f_r$, then we form the new simplex by replacing
$\mathbf{x}_h$ by $\mathbf{x}_r$ and continue.
Finally, if the process of reflection leads to a point $\mathbf{x}_r$ such that $f_r < f_h$, then
we replace $\mathbf{x}_h$ by $\mathbf{x}_r$ and perform contraction. Otherwise ($f_r \ge f_h$), we perform
contraction without any replacement using

$$\mathbf{x}_c = \bar{\mathbf{x}} + \gamma(\mathbf{x}_h - \bar{\mathbf{x}}) , \tag{4.2.6}$$

with the contraction coefficient $\gamma$, $0 < \gamma < 1$, usually chosen to be 1/2. If $f_c = f(\mathbf{x}_c)$
is greater than $f_h$, then we replace all the points by a new set of points

$$\mathbf{x}_i = \mathbf{x}_i + \frac{1}{2}(\mathbf{x}_l - \mathbf{x}_i) , \qquad i = 0, 1, \ldots, n , \tag{4.2.7}$$

and restart the process with this new simplex. Otherwise, we simply replace $\mathbf{x}_h$ by
$\mathbf{x}_c$ and restart the process with this simplex. The operation in Eq. (4.2.7) causes the
distance between the points of the old simplex and the point with the lowest function
value to be halved and is therefore referred to as the shrinkage operation. The flow
chart of the complete method is given in Figure 4.2.1. For the convergence criterion
to terminate the algorithm Nelder and Mead [9] proposed the following

$$\left\{ \frac{1}{1+n} \sum_{i=0}^{n} \left[ f_i - f(\bar{\mathbf{x}}) \right]^2 \right\}^{1/2} < \epsilon , \tag{4.2.8}$$

where $\epsilon$ is some specified accuracy requirement.
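
The reflection, expansion, contraction and shrinkage logic described above can be sketched compactly. This is an illustration rather than a reference implementation. The names `sequential_simplex` and `beam` are hypothetical, and the contraction point is assumed to be the standard Nelder-Mead form $\bar{\mathbf{x}} + \gamma(\mathbf{x}_h - \bar{\mathbf{x}})$. The stopping test compares the vertex values with $f(\bar{\mathbf{x}})$ in the manner of Eq. (4.2.8), and the function is re-evaluated freely for brevity. The test function is the quadratic $12x_1^2 + 4x_2^2 - 12x_1x_2 + 2x_1$ used in the examples of this chapter.

```python
import math


def sequential_simplex(f, simplex, alpha=1.0, beta=2.0, gamma=0.5,
                       eps=1e-10, max_iter=2000):
    pts = [list(v) for v in simplex]
    n = len(pts) - 1
    for _ in range(max_iter):
        pts.sort(key=f)              # pts[0] = x_l, pts[-2] = x_s, pts[-1] = x_h
        fl, fs, fh = f(pts[0]), f(pts[-2]), f(pts[-1])
        # centroid of all vertices except x_h
        xbar = [sum(v[k] for v in pts[:-1]) / n for k in range(n)]
        fbar = f(xbar)
        spread = math.sqrt(sum((f(v) - fbar) ** 2 for v in pts) / (n + 1))
        if spread < eps:             # convergence test
            break
        xr = [xbar[k] + alpha * (xbar[k] - pts[-1][k]) for k in range(n)]
        fr = f(xr)
        if fr < fl:                  # reflection beat the best point: try expansion
            xe = [xbar[k] + beta * (xr[k] - xbar[k]) for k in range(n)]
            pts[-1] = xe if f(xe) < fr else xr
        elif fr <= fs:               # plain reflection accepted
            pts[-1] = xr
        else:                        # contraction
            if fr < fh:
                pts[-1], fh = xr, fr
            xc = [xbar[k] + gamma * (pts[-1][k] - xbar[k]) for k in range(n)]
            if f(xc) > fh:           # contraction failed: shrink toward x_l
                xl = pts[0]
                pts = [[v[k] + 0.5 * (xl[k] - v[k]) for k in range(n)]
                       for v in pts]
            else:
                pts[-1] = xc
    pts.sort(key=f)
    return pts[0]


def beam(x):
    return 12 * x[0] ** 2 + 4 * x[1] ** 2 - 12 * x[0] * x[1] + 2 * x[0]


best = sequential_simplex(beam, [[-1.0, -2.0], [0.0, -2.0], [-1.0, -1.0]])
```

Starting from the small triangle given, the sketch should approach the minimizer $(-1/3, -1/2)$ of the test function. No derivative information is used at any point.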


An improvement in the performance of the simplex algorithm for those cases with
a large number of design variables, $n$, was achieved by Chen, Saleem, and Grace [10]. The
modified simplex search procedure proposed in Ref. [10] executes the reflection,
expansion, contraction, and shrinkage operations on more than one vertex of the
simplex at a given step. This is achieved by first separating the vertices of the simplex
Figure 4.2.1 Flow chart of the Sequential Simplex Algorithm.


into two groups by defining a cutting value (CV) of the function, $f_{cv}$. The cutting
value is defined by the relation

$$f_{cv} = \frac{f_h + f_l}{2} + \eta s , \tag{4.2.9}$$
where $s$ is the standard deviation of the values of the function corresponding to the


vertices of the simplex,

$$s = \left[ \sum_{i=0}^{n} (f_i - \bar{f})^2 / (n + 1) \right]^{1/2} , \tag{4.2.10}$$

and $\eta$ is a parameter (discussed below) that controls the number of vertices to be
operated on. The $\bar{f}$ value in Eq. (4.2.10) is the average of the function values over
the entire current simplex.
The vertices with function values higher than the cutting value form the group
to be reflected (and to be dropped). The other vertices serve as reference points.
If the parameter $\eta$ is sufficiently large, all the vertices of the simplex except $\mathbf{x}_h$
stay in the group to be used as the reference points and, therefore, the algorithm is
equivalent to the original form. For sufficiently small values of the parameter $\eta$, all
points except $\mathbf{x}_n$ are dropped. The selection of the parameter $\eta$ depends on the
difficulty of the problem as well as the number of variables. Recommended values
for $\eta$ are given in Table II of Ref. [10]. Among the $n + 1$ vertices of the current
simplex, we rearrange and number the vertices from largest to smallest function
values as $\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_{n_{cv}}, \ldots, \mathbf{x}_n$, where $\mathbf{x}_i$, $i = 0, \ldots, n_{cv}$, are the elements of the group to
be reflected next. The centroid of the vertices in the reference group is defined as

$$\bar{\mathbf{x}} = \frac{1}{n - n_{cv}} \sum_{i = n_{cv}+1}^{n} \mathbf{x}_i . \tag{4.2.11}$$

The performance of this modified simplex method has been compared [10] with the
simplex method proposed by Nelder and Mead, and also with more powerful
methods such as the second order Davidon-Fletcher-Powell (DFP) method which will be
discussed later in this chapter. For high dimensional problems the modified simplex
algorithm was found to be more efficient and robust than the DFP algorithm. Nelder
and Mead [9] have also provided several illustrations of the use of their algorithm
in minimizing classical test functions and compared its performance with Powell's
conjugate directions method which will be discussed next.
Powell's Conjugate Directions Method and its Subsequent Modification.
Although most problems have functions which are not quadratic, many unconstrained
minimization algorithms are developed to minimize a quadratic function. This is
because a function can be approximated well by a quadratic function near a minimum.
Powell's conjugate directions algorithm is a typical example. A quadratic function in
$R^n$ may be written as
$$f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T Q \mathbf{x} + \mathbf{b}^T \mathbf{x} + c . \tag{4.2.12}$$

A set of directions $\mathbf{s}_i$, $i = 1, 2, \ldots$, are said to be Q-conjugate if

$$\mathbf{s}_i^T Q \mathbf{s}_j = 0 \qquad \text{for } i \ne j . \tag{4.2.13}$$

Furthermore, it can be shown that if the function $f$ is minimized once along each
direction of a set of linearly independent Q-conjugate directions then the minimum


of f will be located at or before the nth step regardless of the starting point provided
that no round-off errors are accumulated. This property is commonly referred to as
the quadratic termination property. Powell provided a convenient method for gen-
erating such conjugate directions by a suitable combination of the simple univariate
search and a pattern search technique [3]. However, in certain cases Powell's algo-
rithm generates directions which are linearly dependent and thereby fails to converge
to the minimum. Hence, Powell modified his algorithm to make it robust but at the
expense of its quadratic termination property.

Powell's strategy for generating conjugate directions is based on the following
property (see Ref. 3 for proof). If $\mathbf{x}_1$ and $\mathbf{x}_2$ are any two points and $\mathbf{s}$ a specified
direction, and $\mathbf{x}_{1s}$ corresponds to the minimum point of a quadratic function $f$ on
a line starting at $\mathbf{x}_1$ along $\mathbf{s}$ and $\mathbf{x}_{2s}$ is the minimum point on a line starting at $\mathbf{x}_2$
along $\mathbf{s}$, then the directions $\mathbf{s}$ and $(\mathbf{x}_{2s} - \mathbf{x}_{1s})$ are Q-conjugate. The basic steps of
Powell's modified method are based on a cycle of univariate minimizations. For each
cycle we use the following steps.

1. Minimize $f$ along each of the coordinate directions (univariate search) starting
at $\mathbf{x}_0^k$ and generating the points $\mathbf{x}_1^k, \ldots, \mathbf{x}_n^k$, where $k$ is the cycle number.

2. After completing the univariate cycle find the index $m$ corresponding to the
direction of the univariate search which yields the largest function decrease in going
from $\mathbf{x}_{m-1}^k$ to $\mathbf{x}_m^k$.

3. Calculate the "pattern" direction $\mathbf{s}_p^k = \mathbf{x}_n^k - \mathbf{x}_0^k$ (which is the sum of all the
univariate moves) and determine the value of $\alpha$ from $\mathbf{x}_0^k$ along $\mathbf{s}_p^k$ that minimizes $f$.
Denote this new point by $\mathbf{x}_0^{k+1}$.

4. If

$$|\alpha| < \left[ \frac{f(\mathbf{x}_0^k) - f(\mathbf{x}_0^{k+1})}{f(\mathbf{x}_{m-1}^k) - f(\mathbf{x}_m^k)} \right]^{1/2} , \tag{4.2.14}$$

then use the same old directions again for the next univariate cycle (that is, do not
discard any of the directions of the previous cycle in preference to the pattern direction
$\mathbf{s}_p^k$). If Eq. (4.2.14) is not satisfied then replace the $m$th direction by the pattern
direction $\mathbf{s}_p^k$.

5. Begin the next univariate cycle with the directions decided in step 4, and
repeat the steps 2 through 4 until convergence to a specified accuracy. Convergence
is assumed to be achieved when the Euclidean norm $\|\mathbf{x}^{k-1} - \mathbf{x}^k\|$ is less than a
prespecified quantity $\epsilon$.

Although Powell's original method does possess a quadratic termination property,
his modified algorithm does not [3]. The modified method will now be illustrated on
the following simple example from structural analysis.


Example 4.2.1

The problem of determination of the maximum deflection and tip rotation of a
cantilever beam of length $l$, shown in Figure 4.2.2 and loaded at its tip, is considered. Solution
of this problem is formulated as a minimization of the total potential energy of the
beam, which is modelled using a single cubic beam finite element. For a two-noded
beam element with two degrees of freedom at each node, the displacement field is
assumed to be

Figure 4.2.2 Tip loaded cantilever beam and its finite element model.

$$v(\xi) = \left[ (1 - 3\xi^2 + 2\xi^3) \quad l(\xi - 2\xi^2 + \xi^3) \quad (3\xi^2 - 2\xi^3) \quad l(-\xi^2 + \xi^3) \right] \begin{Bmatrix} v_1 \\ \theta_1 \\ v_2 \\ \theta_2 \end{Bmatrix} , \tag{4.2.15}$$
where $\xi = x/l$. The corresponding potential energy of the beam model is given by

(4.2.16)

Because of the cantilever end condition at $\xi = 0$, the first two degrees of freedom
in Eq. (4.2.15) are zero. Therefore, substituting Eq. (4.2.15) into Eq. (4.2.16) we
obtain
(4.2.17)

Defining $f = 2\Pi l^3/EI$, $x_1 = v_2$, $x_2 = \theta_2 l$, and choosing $pl^3/EI = 1$, the problem
of determining the tip deflection and rotation of the beam reduces to an unconstrained
minimization of
$$f = 12x_1^2 + 4x_2^2 - 12x_1x_2 + 2x_1 . \tag{4.2.18}$$

Starting with an initial point of $\mathbf{x}_0^1 = (-1, -2)^T$ and $f(\mathbf{x}_0^1) = 2$ we will minimize
$f$ using Powell's conjugate directions method. The exact solution of this problem is
at $\mathbf{x}^* = (-1/3, -1/2)^T$.

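Since we have an explicit relation for the objective function $f$, the one dimensional
minimizations along a given direction will be performed exactly without resorting to
any of the numerical techniques discussed in the previous section. However, if these
minimizations were done numerically, one of the zeroth order techniques would be
sufficient. We use superscripts to denote the univariate cycle number and subscripts
to denote the iteration number within a cycle.
First, we perform the univariate search along the $x_1$ and $x_2$ directions. Choosing
$\mathbf{s}_1^1 = (1, 0)^T$ we have
$$\mathbf{x}_1^1 = \begin{Bmatrix} -1 \\ -2 \end{Bmatrix} + \alpha \begin{Bmatrix} 1 \\ 0 \end{Bmatrix} = \begin{Bmatrix} -1 + \alpha \\ -2 \end{Bmatrix} , \tag{4.2.19}$$

and
$$f(\alpha) = 12(-1 + \alpha)^2 + 4(-2)^2 - 12(-1 + \alpha)(-2) + 2(-1 + \alpha) . \tag{4.2.20}$$
Taking the derivative of Eq. (4.2.20) with respect to $\alpha$, we obtain the value of $\alpha$
which minimizes $f$ to be $\alpha = -1/12$. Hence,

$$\mathbf{x}_1^1 = \begin{Bmatrix} -\frac{13}{12} \\ -2 \end{Bmatrix} \quad \text{and} \quad f(\mathbf{x}_1^1) = 1.916666667 .$$
Choosing $\mathbf{s}_2^1 = (0, 1)^T$, we obtain
$$\mathbf{x}_2^1 = \begin{Bmatrix} -\frac{13}{12} \\ -2 \end{Bmatrix} + \alpha \begin{Bmatrix} 0 \\ 1 \end{Bmatrix} = \begin{Bmatrix} -\frac{13}{12} \\ -2 + \alpha \end{Bmatrix} , \tag{4.2.21}$$

and

$$f(\alpha) = 12\left(-\frac{13}{12}\right)^2 + 4(-2 + \alpha)^2 - 12\left(-\frac{13}{12}\right)(-2 + \alpha) + 2\left(-\frac{13}{12}\right) , \tag{4.2.22}$$
which is minimum at $\alpha = 3/8$. Therefore, at the end of the univariate search we have

$$\mathbf{x}_2^1 = \begin{Bmatrix} -\frac{13}{12} \\ -\frac{13}{8} \end{Bmatrix} \quad \text{and} \quad f(\mathbf{x}_2^1) = 1.354166667 .$$

At this point we construct a pattern direction as

$$\mathbf{s}_p^1 = \mathbf{x}_2^1 - \mathbf{x}_0^1 = \begin{Bmatrix} -\frac{1}{12} \\ \frac{3}{8} \end{Bmatrix} , \tag{4.2.23}$$

and minimize the function along this direction by

$$\mathbf{x}_0^2 = \begin{Bmatrix} -1 \\ -2 \end{Bmatrix} + \alpha \begin{Bmatrix} -\frac{1}{12} \\ \frac{3}{8} \end{Bmatrix} = \begin{Bmatrix} -1 - \frac{\alpha}{12} \\ -2 + \frac{3\alpha}{8} \end{Bmatrix} , \tag{4.2.24}$$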
which attains its minimum value for $\alpha = 40/49$ at
$$\mathbf{x}_0^2 = \begin{Bmatrix} -\frac{157}{147} \\ -\frac{83}{49} \end{Bmatrix} \quad \text{and} \quad f(\mathbf{x}_0^2) = 1.319727891 .$$

The direction that corresponds to the largest decrease in the objective function $f$
during the first cycle of the univariate search is associated with the second variable.
We can now decide whether we want to replace the second ($m = 2$) univariate search
direction by the pattern direction or not by checking the condition stated in step 4
of the algorithm, Eq. (4.2.14). That is, Powell's criterion
$$|\alpha| = \frac{40}{49} < \left[ \frac{2 - 1.319727891}{1.916666667 - 1.354166667} \right]^{1/2} \tag{4.2.25}$$
is satisfied; therefore, we retain the old univariate search directions for the second
cycle and restart the procedure by going back to step 2 of the algorithm. The results
of the second cycle are tabulated in Table 4.2.1.
Table 4.2.1. Solution of the beam problem using Powell's conjugate directions method
Cycle No.    x_1          x_2          f
0           -1.0         -2.0          2.0
1           -1.083334    -2.0          1.916667
1           -1.083334    -1.625        1.354167
2           -0.895834    -1.625        0.9322967
2           -0.895834    -1.34375      0.6158854
2           -0.33334     -0.499999    -0.333333

The effectiveness of Powell's modified method can be seen to be much more
pronounced on the minimization of the following function considered by Avriel [2],

$$f = (x_1 + x_2 - x_3)^2 + (x_1 - x_2 + x_3)^2 + (-x_1 + x_2 + x_3)^2 ,$$

and left as an exercise for the reader (see Exercise 2). •••
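
The first cycle of Example 4.2.1 can be re-computed with a short script, given here only as a numerical check. It assumes the beam function $f = 12x_1^2 + 4x_2^2 - 12x_1x_2 + 2x_1$ with Hessian $Q$, and uses the closed-form line minimizer of a quadratic, $\alpha = -\mathbf{s}^T\nabla f/(\mathbf{s}^T Q \mathbf{s})$. The helper names `exact_step` and `move` are illustrative, and the pattern move is measured from the starting point of the cycle, as in the example.

```python
Q = [[24.0, -12.0], [-12.0, 8.0]]      # Hessian of the beam function


def f(x):
    return 12 * x[0] ** 2 + 4 * x[1] ** 2 - 12 * x[0] * x[1] + 2 * x[0]


def grad(x):
    return [24 * x[0] - 12 * x[1] + 2, 8 * x[1] - 12 * x[0]]


def exact_step(x, s):
    """Step length minimizing the quadratic f along s from x."""
    g = grad(x)
    Qs = [sum(Q[i][j] * s[j] for j in range(2)) for i in range(2)]
    return -sum(a * b for a, b in zip(s, g)) / sum(a * b for a, b in zip(s, Qs))


def move(x, s, alpha):
    return [xi + alpha * si for xi, si in zip(x, s)]


x0 = [-1.0, -2.0]
points = [x0]
for s in ([1.0, 0.0], [0.0, 1.0]):     # univariate searches
    points.append(move(points[-1], s, exact_step(points[-1], s)))

drops = [f(points[i]) - f(points[i + 1]) for i in range(2)]
m = drops.index(max(drops))            # 0-based index of the largest decrease

pattern = [points[-1][k] - x0[k] for k in range(2)]
alpha_p = exact_step(x0, pattern)
x_new = move(x0, pattern, alpha_p)

# Powell's test: True means the old directions are kept for the next cycle
keep_old = abs(alpha_p) < ((f(x0) - f(x_new)) / drops[m]) ** 0.5
```

The script gives a pattern step of 40/49, a function value of about 1.3197 at the pattern point, and a largest decrease along the second coordinate direction, so the old directions are retained as stated in the example.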
Before we proceed to the discussion of first order methods, it is worthwhile to
consider when zeroth order methods should be used. The sequential simplex method
can be used for nondifferentiable functions where first order methods are not
appropriate. For those unconstrained minimization problems with differentiable functions,
it is preferable to calculate the exact derivatives, or generate such derivatives by
using finite differences, and subsequently use a first order method for minimization
when these derivatives can be calculated accurately. Zeroth order methods such as
Powell's conjugate directions algorithm may still have a place for problems with
highly nonlinear objective functions where the accuracy of the function evaluations
may be poor. The poor accuracy in function evaluations may call for high order
finite difference formulae to be used for derivative calculations; therefore, the use of
a zeroth order method for minimization may be a prudent alternative.

4.2.2 First Order Methods

First order methods for unconstrained minimization of a function $f$ in $R^n$ use the
gradient of the function as well as its value in calculating the move direction for
the function minimization. These methods possess a linear or a superlinear rate of
convergence. A sequence $\mathbf{x}_k$, $k = 0, 1, 2, \ldots$, is said to be q-superlinear convergent to
$\mathbf{x}^*$ of order at least $p$ if
$$\|\mathbf{x}_{k+1} - \mathbf{x}^*\| \le c_k \|\mathbf{x}_k - \mathbf{x}^*\|^p , \tag{4.2.26}$$
where $c_k$ converges to zero. If $c_k$ in Eq. (4.2.26) is a constant then the convergence is
said to be a simple q-order convergence of order at least $p$. Thus, if $p = 1$ with $c_k$
equal to a constant then we have a linear convergence rate, whereas if $p = 1$ and $c_k$ is
a sequence that converges to zero then the convergence is said to be superlinear (see
Ref. 6 for additional definitions).
Perhaps the oldest known method for minimizing a function of $n$ variables is the
steepest descent method first proposed by Cauchy [11] for solving a system of linear
equations. It can be used for function minimization as follows. The direction of move
is obtained by minimizing the directional derivative of $f$
$$\nabla f^T \mathbf{s} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, s_i , \tag{4.2.27}$$

subject to the condition that $\mathbf{s}$ be a unit vector in $R^n$ in the Euclidean sense,
$$\mathbf{s}^T \mathbf{s} = 1 . \tag{4.2.28}$$
It can easily be verified (see Exercise 6) that the steepest descent direction is given
by
$$\mathbf{s} = -\frac{\nabla f}{\|\nabla f\|} , \tag{4.2.29}$$
where $\| \; \|$ denotes the Euclidean norm, and it provides the largest decrease in the
function $f$. Starting with a point $\mathbf{x}_k$ at the $k$th iteration of the minimization process,
we obtain the next point $\mathbf{x}_{k+1}$ as
$$\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha \mathbf{s} . \tag{4.2.30}$$
Here $\mathbf{s}$ is given by Eq. (4.2.29) and $\alpha$ is determined such that $f$ is minimized along
the chosen direction by using any one of the one-dimensional minimization techniques
covered in the previous section. If the function to be minimized is quadratic in $R^n$
and expressed as
$$f = \frac{1}{2}\mathbf{x}^T Q \mathbf{x} + \mathbf{b}^T \mathbf{x} + c , \tag{4.2.31}$$

the step length can be determined directly by substituting Eq. (4.2.30) into Eq.
(4.2.31) for the $(k + 1)$st iteration followed by a minimization of $f$ with respect to $\alpha$,
which yields
$$\alpha^* = -\frac{(\mathbf{x}_k^T Q + \mathbf{b}^T)\,\mathbf{s}}{\mathbf{s}^T Q \mathbf{s}} . \tag{4.2.32}$$

In obtaining Eq. (4.2.32) we assume that the Hessian matrix Q of the quadratic form
is available explicitly, and we make use of the symmetry of Q.
The performance of the steepest descent method depends on the condition number
of the Hessian matrix Q. The condition number of a matrix is the ratio of the largest
to the smallest eigenvalue. A large condition number implies that the contours of
the function to be minimized form an elongated design space, and therefore the
progress made by the steepest descent method is very slow and proceeds in a zigzag
pattern known as hemstitching. This is even true for quadratic functions, and can
be improved by re-scaling the variables.
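
The slow zigzag progress on a poorly conditioned quadratic can be observed with a small sketch. It assumes a quadratic $f = \frac{1}{2}\mathbf{x}^TQ\mathbf{x} + \mathbf{b}^T\mathbf{x}$ with symmetric positive definite $Q$ and uses the exact step length that follows from setting $df/d\alpha = 0$ along the unit direction $\mathbf{s}$, namely $\alpha = -\mathbf{s}^T(Q\mathbf{x} + \mathbf{b})/(\mathbf{s}^TQ\mathbf{s})$. The function name `steepest_descent` and the test data (the cantilever beam quadratic of Example 4.2.1) are illustrative.

```python
def steepest_descent(Q, b, x, tol=1e-10, max_iter=10000):
    """Steepest descent with exact line search on 0.5 x'Qx + b'x."""
    n = len(x)
    for k in range(max_iter):
        g = [sum(Q[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        norm = sum(gi * gi for gi in g) ** 0.5
        if norm < tol:
            return x, k
        s = [-gi / norm for gi in g]                 # unit steepest descent direction
        Qs = [sum(Q[i][j] * s[j] for j in range(n)) for i in range(n)]
        alpha = -sum(si * gi for si, gi in zip(s, g)) / \
            sum(si * qi for si, qi in zip(s, Qs))    # exact step for a quadratic
        x = [xi + alpha * si for xi, si in zip(x, s)]
    return x, max_iter


# beam function 12*x1**2 + 4*x2**2 - 12*x1*x2 + 2*x1
Q = [[24.0, -12.0], [-12.0, 8.0]]
b = [2.0, 0.0]
x_min, iterations = steepest_descent(Q, b, [-1.0, -2.0])
```

The eigenvalues of this $Q$ are roughly 30.4 and 1.58, a condition number near 19, so the sketch needs many line searches to reach $(-1/3, -1/2)$ even though the function is quadratic.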
Example 4.2.2

Figure 4.2.3 Contours of the cantilever beam potential energy function.

The cantilever problem discussed in the previous example illustrates this behavior
most vividly. The steepest descent method when applied to this problem may exhibit
the typical hemstitching phenomenon as shown in Figure 4.2.3 for certain initial
starting points. However, a simple transformation of variables to improve the scaling
of the variables causes the steepest descent method to converge to the minimum in a
single step. For example, consider the following transformation

( 4.2.33)

The function f may now be expressed in terms of the new variables Yl and Y2 as
(4.2.34)


As a result of the scaling and elimination of the cross-product term, the condition
number of the Hessian of f is unity. Contours of the function f in the YI - Y2 plane
will appear as circles. Beginning with any arbitrary starting point Yo and applying
the steepest descent method we have

YI = Yo +a { 22Ylo +yi '}


3. (4.2.35)
Y20 +""6
It can be easily verified that the value of a* that minimizes f is 0.5. Therefore,

YI = { _~~ } ,
at which the gradient of $f$ is zero, implying that it is a minimum point. The
corresponding values of the original variables $x_1$ and $x_2$ are $-1/3$ and $-1/2$, respectively.
This simple demonstration clearly shows the effectiveness of scaling in convergence
of the steepest descent algorithm to the minimum of a function in $R^n$. It can be
shown [6] that the steepest descent method has only a linear rate of convergence in
the absence of an appropriate scaling. •••
Unfortunately, in most multivariable function minimizations it is not easy to
determine the appropriate scaling transformation that leads to a one step convergence
to the minimum of a general quadratic form in $R^n$ using the steepest descent
algorithm. This would require calculating the Hessian matrix and then performing an
expensive eigenvalue analysis of the matrix. Hence, we are forced to look at other
alternatives for rapid convergence to the minimum of a quadratic form. One such
alternative is provided by minimizing along a set of conjugate gradient directions
which guarantees a quadratic termination property. Hestenes and Stiefel [12] and
later Fletcher and Reeves [13] offered such an algorithm which will be covered next.
Fletcher-Reeves' Conjugate Gradient Algorithm. This algorithm begins from
an initial point $\mathbf{x}_0$ by first minimizing $f$ along the steepest descent direction,
$\mathbf{s}_0 = -\nabla f(\mathbf{x}_0) = \mathbf{g}_0$, to the new iterate $\mathbf{x}_1$. The direction for the next iteration
$\mathbf{s}_1$ must be constructed so that it is Q-conjugate to $\mathbf{s}_0$ where $Q$ is the Hessian of
the quadratic $f$. The function is then minimized along $\mathbf{s}_1$ to yield the next iterate
$\mathbf{x}_2$. The next direction $\mathbf{s}_2$ from $\mathbf{x}_2$ is constructed to be Q-conjugate to the previous
directions $\mathbf{s}_0$ and $\mathbf{s}_1$, and the process is continued until convergence to the
minimum is achieved. By virtue of Powell's theorem on conjugate directions for quadratic
functions, convergence to the minimum is theoretically guaranteed at the end of the
minimization of the function $f$ along the conjugate direction $\mathbf{s}_{n-1}$. For functions
which are not quadratic, conjugacy of the directions $\mathbf{s}_i$, $i = 1, \ldots, n$, loses its
meaning since the Hessian of the function is not a matrix of constants. However, it is a
common practice to use this algorithm for non-quadratic functions. Since, for such
functions, convergence to the minimum will rarely be achieved in $n$ steps or less, the
algorithm is restarted after every $n$ steps. The basic steps of the algorithm at the
$(k + 1)$th iterate are as follows.
1. Calculate $\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_{k+1}\mathbf{s}_k$ where $\alpha_{k+1}$ is determined such that
$$\frac{df(\alpha_{k+1})}{d\alpha_{k+1}} = 0 . \tag{4.2.36}$$

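2. Let $\mathbf{s}_k = \mathbf{g}_k = -\nabla f(\mathbf{x}_k)$ if $k = 0$; and $\mathbf{s}_k = \mathbf{g}_k + \beta_k \mathbf{s}_{k-1}$ if $k > 0$ with

$$\beta_k = \frac{\mathbf{g}_k^T \mathbf{g}_k}{\mathbf{g}_{k-1}^T \mathbf{g}_{k-1}} . \tag{4.2.37}$$

3. If $\|\mathbf{g}_{k+1}\|$ or $|f(\mathbf{x}_{k+1}) - f(\mathbf{x}_k)|$ is sufficiently small, then stop. Otherwise
4. If $k < n$ go to step number 1, or else restart.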
Example 4.2.3

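We will show the effectiveness of this method on the cantilever beam problem for
which we minimize
$$f = 12x_1^2 + 4x_2^2 - 12x_1x_2 + 2x_1 ,$$
starting with the initial design point $\mathbf{x}_0^T = (-1, -2)$. The initial move direction is
calculated from the gradient

$$\nabla f(\mathbf{x}_0) = \begin{Bmatrix} 24x_1 - 12x_2 + 2 \\ 8x_2 - 12x_1 \end{Bmatrix}_{\mathbf{x} = \mathbf{x}_0} , \qquad \mathbf{s}_0 = -\nabla f(\mathbf{x}_0) = \begin{Bmatrix} -2 \\ 4 \end{Bmatrix} ,$$
and at the end of the first step we have

$$f(\alpha_1) = 12(-1 - 2\alpha_1)^2 + 4(-2 + 4\alpha_1)^2 - 12(-1 - 2\alpha_1)(-2 + 4\alpha_1) + 2(-1 - 2\alpha_1) .$$

The value of $\alpha_1$ for which the function $f$ is a minimum is obtained from the condition
$df/d\alpha_1 = 0$, or $\alpha_1 = 0.048077$. The new design point and the gradient at that point
are
$$\mathbf{x}_1 = \begin{Bmatrix} -1.0961 \\ -1.8077 \end{Bmatrix} , \quad \text{and} \quad \nabla f(\mathbf{x}_1) = \begin{Bmatrix} -2.6154 \\ -1.3077 \end{Bmatrix} .$$

Next, let $\mathbf{s}_1 = -\nabla f(\mathbf{x}_1) + \beta_1 \mathbf{s}_0$ with $\beta_1$ from Eq. (4.2.37), or

$$\beta_1 = \frac{(-2.6154)^2 + (-1.3077)^2}{(-2)^2 + (4)^2} = 0.4275 .$$

The new move direction is

$$\mathbf{s}_1 = -\begin{Bmatrix} -2.6154 \\ -1.3077 \end{Bmatrix} + 0.4275 \begin{Bmatrix} -2 \\ 4 \end{Bmatrix} = \begin{Bmatrix} 1.76036 \\ 3.0178 \end{Bmatrix} ,$$

and
$$\mathbf{x}_2 = \begin{Bmatrix} -1.0961 \\ -1.8077 \end{Bmatrix} + \alpha_2 \begin{Bmatrix} 1.76036 \\ 3.0178 \end{Bmatrix} .$$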
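Again setting $df(\alpha_2)/d\alpha_2 = 0$ we obtain $\alpha_2 = 0.4334$,
$$\mathbf{x}_2 = \begin{Bmatrix} -0.3334 \\ -0.50 \end{Bmatrix} .$$
Finally, since
$$\begin{Bmatrix} -2 \\ 4 \end{Bmatrix}^T
\begin{bmatrix} 24 & -12 \\ -12 & 8 \end{bmatrix}
\begin{Bmatrix} 1.76036 \\ 3.0178 \end{Bmatrix} \approx 0 ,$$
we have verified the Q-conjugacy of the two directions $\mathbf{s}_0$ and $\mathbf{s}_1$. The progress of
minimization using this method is illustrated in Figure 4.2.3. •••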
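The two steps of Example 4.2.3 can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the names `Q`, `b`, `grad` and `fletcher_reeves` are illustrative and do not appear in the text. Because the objective is quadratic, the exact line search is replaced here by its closed form $\alpha = \mathbf{g}^T\mathbf{s}/(\mathbf{s}^T\mathbf{Q}\mathbf{s})$ rather than by a general one-dimensional minimization.

```python
import numpy as np

# Quadratic of Example 4.2.3 written as f(x) = 0.5 x^T Q x + b^T x.
Q = np.array([[24.0, -12.0], [-12.0, 8.0]])
b = np.array([2.0, 0.0])


def grad(x):
    """Gradient of the quadratic objective."""
    return Q @ x + b


def fletcher_reeves(x0, n_steps):
    """Run n_steps Fletcher-Reeves iterations with exact line searches.

    Intended for n_steps <= n on this quadratic; beyond that the gradient
    is already numerically zero and beta is no longer meaningful.
    """
    x = np.asarray(x0, dtype=float)
    g = -grad(x)              # g_k = -grad f(x_k), the sign convention of the text
    s = g.copy()              # s_0 is the steepest descent direction
    history = []
    for _ in range(n_steps):
        alpha = (g @ s) / (s @ Q @ s)       # exact minimizer along s for a quadratic
        x = x + alpha * s
        g_new = -grad(x)
        beta = (g_new @ g_new) / (g @ g)    # Eq. (4.2.37)
        history.append({"alpha": alpha, "beta": beta, "x": x.copy(), "s": s.copy()})
        s = g_new + beta * s
        g = g_new
    return x, history


x_min, history = fletcher_reeves([-1.0, -2.0], n_steps=2)
```

The entries of `history` hold the step length, the value of $\beta$ and the direction used at each step, so the conjugacy product $\mathbf{s}_0^T\mathbf{Q}\mathbf{s}_1$ can be evaluated directly from them.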
Beale's Restarted Conjugate Gradient Technique. In minimizing non-quadratic
functions using the conjugate gradient method, restarting the method after every
$n$ steps is not always a good strategy. Such a strategy seems to be insensitive to
the nonlinear character of the function being minimized. Beale [14] and later Powell
[15] have proposed restart techniques that take the nonlinearity of the function into
account in deciding when to restart the algorithm. Numerical experiments with
minimization of several general functions have led to the following algorithm by Powell
[15].
1. Given $\mathbf{x}_0$, define $\mathbf{s}_0$ to be the steepest descent direction,
$$\mathbf{s}_0 = -\nabla f(\mathbf{x}_0) = -\mathbf{g}_0 ,$$
let $k = t = 0$, and begin iterations by incrementing $k$.
2. For $k \ge 1$ the direction $\mathbf{s}_k$ is defined by Beale's formula [14]
$$\mathbf{s}_k = -\mathbf{g}_k + \beta_k\mathbf{s}_{k-1} + \gamma_k\mathbf{s}_t , \quad \text{and} \quad \mathbf{g}_k = \nabla f(\mathbf{x}_k) , \tag{4.2.38}$$
where
(4.2.39)

and
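$$\gamma_k = \frac{\mathbf{g}_k^T[\mathbf{g}_{t+1} - \mathbf{g}_t]}{\mathbf{s}_t^T[\mathbf{g}_{t+1} - \mathbf{g}_t]} , \quad \text{if } k > t + 1 , \tag{4.2.40}$$
$$\gamma_k = 0 , \quad \text{if } k = t + 1 . \tag{4.2.41}$$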

3. For $k \ge 1$ test the inequality
$$|\mathbf{g}_{k-1}^T\mathbf{g}_k| \ge 0.2\|\mathbf{g}_k\|^2 . \tag{4.2.42}$$
If this inequality holds, then it is taken to be an indication that enough orthogonality
between $\mathbf{g}_{k-1}$ and $\mathbf{g}_k$ has been lost to warrant a restart. Accordingly, $t$ is reset
to $t = k - 1$ to imply restart.
4. For $k > t + 1$ the direction $\mathbf{s}_k$ is also checked to guarantee a sufficiently large
gradient by testing the inequalities
(4.2.43)

If these inequalities are not satisfied, the algorithm is restarted by setting $t = k - 1$.
5. Finally, the algorithm is also restarted by setting $t = k - 1$, if $k - t \ge n$ as in
the case of the Fletcher-Reeves method.
6. The process is terminated if $\|\mathbf{g}_{k-1}\|$ or $|f(\mathbf{x}_{k+1}) - f(\mathbf{x}_k)|$ is sufficiently small.
If not, $k$ is incremented by one and the process is repeated by going to step 2.
Powell [15] has examined in great detail the effectiveness of the new restart procedure
using Beale's basic algorithm on a variety of problems. These experiments clearly
establish the superiority of the new procedure over the algorithms of Fletcher-Reeves
and Polak-Ribiere [16]. The only disadvantage of this new algorithm appears to be its
slightly increased storage requirements arising from the need for storing the vectors
$\mathbf{s}_t$ and $(\mathbf{g}_{t+1} - \mathbf{g}_t)$ after a restart. More recent enhancements of the first order
conjugate gradient type algorithms [17, 18] involve inclusion of certain preconditioning
schemes to improve the rate of convergence.
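The restart logic of steps 3 and 5 reduces to two simple tests. The sketch below is illustrative only: the function name `should_restart` and its argument names are assumptions, and the additional check of step 4, Eq. (4.2.43), is deliberately left out because its inequalities are not reproduced here.

```python
import numpy as np


def should_restart(g_prev, g_curr, k, t, n, ortho_tol=0.2):
    """Return True if the restart index t should be reset to k - 1.

    g_prev, g_curr : gradients at iterations k-1 and k
    k, t           : current iteration and last restart iteration
    n              : number of variables
    ortho_tol      : the factor 0.2 of Eq. (4.2.42)
    """
    # Step 3: successive gradients are no longer nearly orthogonal.
    lost_orthogonality = abs(g_prev @ g_curr) >= ortho_tol * (g_curr @ g_curr)
    # Step 5: a full cycle of n directions has been used since the last restart.
    full_cycle = (k - t) >= n
    return bool(lost_orthogonality or full_cycle)
```

When the function returns `True`, the caller would set `t = k - 1` before forming the next direction from Eq. (4.2.38).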

4.2.3 Second Order Methods

The oldest second order method for minimizing a nonlinear multivariable function
in $R^n$ is Newton's method. The motivation behind Newton's method is identical
to that of the steepest descent method. In arriving at the steepest descent direction,
$\mathbf{s}$, we minimized the directional derivative, Eq. (4.2.27), subject to the condition that
the Euclidean norm of $\mathbf{s}$ was unity, Eq. (4.2.28). The Euclidean norm, however, does not
consider the curvature of the surface. Hence, it motivates the definition of a different
norm or a metric of the surface. Thus, we pose the problem as finding the direction
$\mathbf{s}$ that minimizes
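$$\nabla f^T \mathbf{s} , \tag{4.2.44}$$
subject to the condition that
$$\mathbf{s}^T\mathbf{Q}\mathbf{s} = 1 . \tag{4.2.45}$$
The solution of this problem is provided by the Newton direction (see Exercise 6) to
within a multiplicative constant, namely
$$\mathbf{s} = -\mathbf{Q}^{-1}\nabla f , \tag{4.2.46}$$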

where $\mathbf{Q}$ is the Hessian of the objective function. The general form of the update
equation of Newton's method for minimizing a function in $R^n$ is given by
$$\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha_{k+1}\mathbf{Q}^{-1}\nabla f(\mathbf{x}_k) , \tag{4.2.47}$$
where $\alpha_{k+1}$ is determined by minimizing $f$ along the Newton direction. For $\mathbf{Q} = \mathbf{I}$,
Eq. (4.2.47) yields the steepest descent solution since the norm in Eq. (4.2.45)
reduces to the Euclidean norm. For quadratic functions it can be shown that the
update equation reaches the optimum solution in one step with $\alpha = 1$
$$\mathbf{x}_1 = \mathbf{x}_0 - \mathbf{Q}^{-1}\nabla f(\mathbf{x}_0) , \tag{4.2.48}$$
regardless of the initial point $\mathbf{x}_0$.
Newton's method can also be shown to have a quadratic rate of convergence (see
for example [4] or [8]), but the serious disadvantages of the method are the need to
evaluate the Hessian $\mathbf{Q}$ and then solve the system of equations
$$\mathbf{Q}\mathbf{s} = -\nabla f , \tag{4.2.49}$$
to obtain the direction vector $\mathbf{s}$. For every iteration (if $\mathbf{Q}$ is non-sparse), Newton's
method involves the calculation of $n(n+1)/2$ elements of the symmetric $\mathbf{Q}$ matrix,
and $n^3$ operations for obtaining $\mathbf{s}$ from the solution of Eq. (4.2.49). It is this feature
of Newton's method that has led to the development of methods known as quasi-Newton
or variable-metric methods which seek to use the gradient information to
construct approximations for the Hessian matrix or its inverse.
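The one-step property of Newton's method for quadratic functions is easy to confirm on the cantilever beam quadratic used in the examples of this section. The sketch below assumes NumPy; the names `newton_step`, `Q` and `b` are illustrative. The direction is obtained by solving the linear system of Eq. (4.2.49) rather than by forming the inverse of the Hessian.

```python
import numpy as np

# Hessian and linear term of f = 12 x1^2 + 4 x2^2 - 12 x1 x2 + 2 x1.
Q = np.array([[24.0, -12.0], [-12.0, 8.0]])
b = np.array([2.0, 0.0])


def grad(x):
    """Gradient of the quadratic objective."""
    return Q @ x + b


def newton_step(x, alpha=1.0):
    """Take one Newton step; the direction s solves Q s = -grad f(x)."""
    s = np.linalg.solve(Q, -grad(x))
    return x + alpha * s


# Two different starting points reach the same stationary point in one step.
x1 = newton_step(np.array([-1.0, -2.0]))
x1_other = newton_step(np.array([5.0, 3.0]))
```

For a general function the Hessian would have to be re-evaluated at every iterate, which is the cost that motivates the quasi-Newton methods.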
Quasi-Newton or Variable Metric Algorithms. Consider the Taylor series expansion
of the gradient of $f$ around $\mathbf{x}_{k+1}$
(4.2.50)
where $\mathbf{Q}$ is the actual Hessian of the function $f$. Assuming $\mathbf{A}_k$ ($\mathbf{A}_k \equiv \mathbf{A}(\mathbf{x}_k)$) to be
an approximation to the Hessian at the $k$th iteration, we may write equation (4.2.50)
in a more compact form as
(4.2.51)
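where
$$\mathbf{p}_k = \mathbf{x}_{k+1} - \mathbf{x}_k \quad \text{and} \quad \mathbf{y}_k = \nabla f(\mathbf{x}_{k+1}) - \nabla f(\mathbf{x}_k) . \tag{4.2.52}$$
Similarly, the solution of Eq. (4.2.51) for $\mathbf{p}_k$ can be written as
$$\mathbf{B}_{k+1}\mathbf{y}_k = \mathbf{p}_k , \tag{4.2.53}$$
with $\mathbf{B}_{k+1}$ being an approximate inverse of the Hessian $\mathbf{Q}$. If $\mathbf{B}_{k+1}$ is to behave
eventually as $\mathbf{Q}^{-1}$ then $\mathbf{B}_{k+1}\mathbf{A}_k = \mathbf{I}$. Equation (4.2.53) is known as the quasi-Newton
or the secant relation. The basis for all variable-metric or quasi-Newton methods is
that the formulae which update the matrix $\mathbf{A}_k$ or its inverse $\mathbf{B}_k$ must satisfy Eq.
(4.2.53) and, in addition, maintain the symmetry and positive definiteness properties.
In other words, if $\mathbf{A}_k$ or $\mathbf{B}_k$ are positive definite then $\mathbf{A}_{k+1}$ or $\mathbf{B}_{k+1}$ must remain so.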
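A typical variable-metric algorithm with an inverse Hessian update may be stated
as
$$\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_{k+1}\mathbf{s}_k , \tag{4.2.54}$$
where
$$\mathbf{s}_k = -\mathbf{B}_k\nabla f(\mathbf{x}_k) , \tag{4.2.55}$$
with $\mathbf{B}_k$ being a positive definite symmetric matrix.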
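Rank-One Updates. In the class of rank-one updates we have the well-known
symmetric Broyden's update [19] for $\mathbf{B}_{k+1}$ given as
$$\mathbf{B}_{k+1} = \mathbf{B}_k + \frac{(\mathbf{p}_k - \mathbf{B}_k\mathbf{y}_k)(\mathbf{p}_k - \mathbf{B}_k\mathbf{y}_k)^T}{(\mathbf{p}_k - \mathbf{B}_k\mathbf{y}_k)^T\mathbf{y}_k} . \tag{4.2.56}$$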

To start the algorithm, an initial positive definite symmetric matrix $\mathbf{B}_0$ is assumed
and the next point $\mathbf{x}_1$ is calculated from Eq. (4.2.54). Then, Eq. (4.2.56) is used
to calculate the updated approximate inverse Hessian matrix. It is easy to verify
that the columns of the second matrix on the right-hand side of Eq. (4.2.56) are
multiples of each other. In other words, the update matrix has a single independent
column and, hence, is rank-one. Furthermore, if $\mathbf{B}_k$ is symmetric then $\mathbf{B}_{k+1}$ will also
be symmetric. It is, however, not guaranteed that $\mathbf{B}_{k+1}$ will remain positive definite
even if $\mathbf{B}_k$ is. This fact can lead to a breakdown of the algorithm especially when
applied to general non-quadratic functions. Broyden [19] suggests choosing the step
lengths $\alpha_{k+1}$ in Eq. (4.2.54) by either (i) an exact line search, or by (ii) $\alpha_{k+1} = 1$ for
all steps, or by (iii) choosing $\alpha_{k+1}$ such that $\|\nabla f\|$ is minimized or reduced.
Irrespective of the type of line search used, Broyden's update guarantees a
quadratic termination property. However, because of the lack of robustness in
minimizing general non-quadratic functions, rank-one updates have been superseded by
rank-two updates which guarantee both symmetry and positive definiteness of the
updated matrices.
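A few lines of code are enough to see the rank-one update of Eq. (4.2.56) satisfy the secant relation of Eq. (4.2.53). This is a sketch assuming NumPy, with the illustrative name `rank_one_update`; the vectors `p0` and `y0` are taken from the first steepest descent step on the cantilever beam quadratic, for which $\mathbf{y}_0 = \mathbf{Q}\mathbf{p}_0$ holds exactly.

```python
import numpy as np


def rank_one_update(B, p, y):
    """Symmetric rank-one update of the inverse Hessian approximation, Eq. (4.2.56)."""
    r = p - B @ y                      # residual of the secant relation for the old B
    return B + np.outer(r, r) / (r @ y)


Q = np.array([[24.0, -12.0], [-12.0, 8.0]])
p0 = (20.0 / 416.0) * np.array([-2.0, 4.0])   # alpha_1 * s_0 from the first line search
y0 = Q @ p0                                   # change in gradient for a quadratic
B0 = np.eye(2)
B1 = rank_one_update(B0, p0, y0)
```

The sign of the denominator `r @ y` is not controlled by the formula, which is the reason the update may fail to keep the matrix positive definite; a production code would also guard against a denominator close to zero.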
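Rank-Two Updates. Rank-two updates for the inverse Hessian approximation
may generally be written as
$$\mathbf{B}_{k+1} = \left[\mathbf{B}_k - \frac{\mathbf{B}_k\mathbf{y}_k\mathbf{y}_k^T\mathbf{B}_k}{\mathbf{y}_k^T\mathbf{B}_k\mathbf{y}_k} + \theta_k\mathbf{v}_k\mathbf{v}_k^T\right]\rho_k + \frac{\mathbf{p}_k\mathbf{p}_k^T}{\mathbf{p}_k^T\mathbf{y}_k} , \tag{4.2.57}$$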
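where
$$\mathbf{v}_k = \left(\mathbf{y}_k^T\mathbf{B}_k\mathbf{y}_k\right)^{1/2}\left[\frac{\mathbf{p}_k}{\mathbf{p}_k^T\mathbf{y}_k} - \frac{\mathbf{B}_k\mathbf{y}_k}{\mathbf{y}_k^T\mathbf{B}_k\mathbf{y}_k}\right] , \tag{4.2.58}$$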

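and $\theta_k$ and $\rho_k$ are scalar parameters that are chosen appropriately. Updates given
by Eqs. (4.2.57) and (4.2.58) are subsets of Huang's family of updates [20] which
guarantee that $\mathbf{B}_{k+1}\mathbf{y}_k = \mathbf{p}_k$ for all choices of $\theta_k$ and $\rho_k$. If we set $\theta_k = 0$ and
$\rho_k = 1$ for all $k$ we obtain the Davidon-Fletcher-Powell (DFP) update formula [21,
22] which is given as
$$\mathbf{B}_{k+1} = \mathbf{B}_k - \frac{\mathbf{B}_k\mathbf{y}_k\mathbf{y}_k^T\mathbf{B}_k}{\mathbf{y}_k^T\mathbf{B}_k\mathbf{y}_k} + \frac{\mathbf{p}_k\mathbf{p}_k^T}{\mathbf{p}_k^T\mathbf{y}_k} . \tag{4.2.59}$$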
The DFP update formula preserves the positive definiteness and symmetry of the
matrices $\mathbf{B}_k$, and has some other interesting properties as well. When used for
minimizing quadratic functions, it generates Q-conjugate directions and, therefore, at the
$n$th iteration $\mathbf{B}_n$ becomes the exact inverse of the Hessian $\mathbf{Q}$. Thus, it has the features
of the conjugate gradient as well as the Newton-type algorithms. The DFP algorithm
can be used without an exact line search in determining $\alpha_{k+1}$ in Eq. (4.2.54). However,
the step length must guarantee a reduction in the function value, and must
be such that $\mathbf{p}_k^T\mathbf{y}_k > 0$ in order to maintain positive definiteness of $\mathbf{B}_k$. The
performance of the algorithm, however, was shown to deteriorate as the accuracy of the line
search decreases [20]. In most cases the DFP formula works quite successfully. In a
few cases the algorithm has been known to break down because $\mathbf{B}_k$ became singular.
This has led to the introduction of another update formula developed simultaneously

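by Broyden [19], Fletcher [23], Goldfarb [24], and Shanno [25] and known as the
BFGS formula. This formula can be obtained by putting $\theta_k = 1$ and $\rho_k = 1$ in Eq.
(4.2.57) which reduces to
$$\mathbf{B}_{k+1} = \mathbf{B}_k + \left[1 + \frac{\mathbf{y}_k^T\mathbf{B}_k\mathbf{y}_k}{\mathbf{p}_k^T\mathbf{y}_k}\right]\frac{\mathbf{p}_k\mathbf{p}_k^T}{\mathbf{p}_k^T\mathbf{y}_k} - \frac{\mathbf{p}_k\mathbf{y}_k^T\mathbf{B}_k}{\mathbf{p}_k^T\mathbf{y}_k} - \frac{\mathbf{B}_k\mathbf{y}_k\mathbf{p}_k^T}{\mathbf{p}_k^T\mathbf{y}_k} . \tag{4.2.60}$$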
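Equation (4.2.60) can also be written in a more compact manner as
$$\mathbf{B}_{k+1} = \left[\mathbf{I} - \frac{\mathbf{p}_k\mathbf{y}_k^T}{\mathbf{p}_k^T\mathbf{y}_k}\right]\mathbf{B}_k\left[\mathbf{I} - \frac{\mathbf{y}_k\mathbf{p}_k^T}{\mathbf{p}_k^T\mathbf{y}_k}\right] + \frac{\mathbf{p}_k\mathbf{p}_k^T}{\mathbf{p}_k^T\mathbf{y}_k} . \tag{4.2.61}$$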

Using $\mathbf{A}_{k+1} = \mathbf{B}_{k+1}^{-1}$ and $\mathbf{A}_k = \mathbf{B}_k^{-1}$ we can invert the above formula to arrive at an
update for the Hessian approximations. It is found that this update formula reduces
to
$$\mathbf{A}_{k+1} = \mathbf{A}_k - \frac{\mathbf{A}_k\mathbf{p}_k\mathbf{p}_k^T\mathbf{A}_k}{\mathbf{p}_k^T\mathbf{A}_k\mathbf{p}_k} + \frac{\mathbf{y}_k\mathbf{y}_k^T}{\mathbf{y}_k^T\mathbf{p}_k} , \tag{4.2.62}$$
which is the analog of the DFP formula (4.2.59) with $\mathbf{B}_k$ replaced by $\mathbf{A}_k$, and
$\mathbf{p}_k$ and $\mathbf{y}_k$ interchanged. Conversely, if the inverse Hessian $\mathbf{B}_k$ is updated by the DFP
formula then the Hessian $\mathbf{A}_k$ is updated according to an analog of the BFGS formula.
It is for this reason that the BFGS formula is often called the complementary DFP
formula. Numerical experiments with the BFGS algorithm [26] suggest that it is superior
to all known variable-metric algorithms. We will illustrate its use by minimizing the
potential energy function of the cantilever beam problem.

Example 4.2.4

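Minimize $f(x_1, x_2) = 12x_1^2 + 4x_2^2 - 12x_1x_2 + 2x_1$ by using the BFGS update algorithm
with exact line searches starting with the initial guess $\mathbf{x}_0^T = (-1, -2)$.
We initiate the algorithm with a line search along the steepest descent direction.
This is associated with the assumption that $\mathbf{B}_0 = \mathbf{I}$ which is symmetric and positive
definite. The resulting point was previously calculated in Example 4.2.3 to be
$$\mathbf{x}_1 = \begin{Bmatrix} -1.0961 \\ -1.8077 \end{Bmatrix} , \quad \text{and} \quad
\nabla f(\mathbf{x}_1) = \begin{Bmatrix} -2.6154 \\ -1.3077 \end{Bmatrix} .$$
From Eq. (4.2.52) we calculate
$$\mathbf{p}_0 = \begin{Bmatrix} -1.0961 \\ -1.8077 \end{Bmatrix} - \begin{Bmatrix} -1 \\ -2 \end{Bmatrix}
= \begin{Bmatrix} -0.0961 \\ 0.1923 \end{Bmatrix} ,$$
$$\mathbf{y}_0 = \begin{Bmatrix} -2.6154 \\ -1.3077 \end{Bmatrix} - \begin{Bmatrix} 2 \\ -4 \end{Bmatrix}
= \begin{Bmatrix} -4.6154 \\ 2.6923 \end{Bmatrix} .$$
Substituting the terms
$$\mathbf{p}_0^T\mathbf{y}_0 = (-0.0961)(-4.6154) + (0.1923)(2.6923) = 0.96127 ,$$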

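$$\mathbf{p}_0\mathbf{y}_0^T = \begin{Bmatrix} -0.0961 \\ 0.1923 \end{Bmatrix}
\begin{bmatrix} -4.6154 & 2.6923 \end{bmatrix}
= \begin{bmatrix} 0.44354 & -0.25873 \\ -0.88754 & 0.51773 \end{bmatrix} ,$$
into Eq. (4.2.61), we obtain
$$\begin{aligned}
\mathbf{B}_1 &= \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
- \frac{1}{0.96127}\begin{bmatrix} 0.44354 & -0.25873 \\ -0.88754 & 0.51773 \end{bmatrix}\right)
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\
&\quad \times \left(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
- \frac{1}{0.96127}\begin{bmatrix} 0.44354 & -0.88754 \\ -0.25873 & 0.51773 \end{bmatrix}\right)
+ \frac{1}{0.96127}\begin{bmatrix} 0.00923 & -0.01848 \\ -0.01848 & 0.03698 \end{bmatrix} \\
&= \begin{bmatrix} 0.37213 & 0.60225 \\ 0.60225 & 1.10385 \end{bmatrix} .
\end{aligned}$$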
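Next, we calculate the new move direction from Eq. (4.2.55)
$$\mathbf{s}_1 = -\begin{bmatrix} 0.37213 & 0.60225 \\ 0.60225 & 1.10385 \end{bmatrix}
\begin{Bmatrix} -2.6154 \\ -1.3077 \end{Bmatrix}
= \begin{Bmatrix} 1.7608 \\ 3.0186 \end{Bmatrix} ,$$
and obtain
$$\mathbf{x}_2 = \begin{Bmatrix} -1.0961 \\ -1.8077 \end{Bmatrix} + \alpha_2 \begin{Bmatrix} 1.7608 \\ 3.0186 \end{Bmatrix} .$$
Setting the derivative of $f(\mathbf{x}_2)$ with respect to $\alpha_2$ to 0 yields the value $\alpha_2 = 0.4332055$,
and
$$\mathbf{x}_2 = \begin{Bmatrix} -0.3333 \\ -0.5000 \end{Bmatrix} , \quad \text{with} \quad
\nabla f(\mathbf{x}_2) = \begin{Bmatrix} 0 \\ 0 \end{Bmatrix} .$$
This implies convergence to the exact solution. It is left to the reader to verify that
if $\mathbf{B}_1$ is updated once more we obtain
$$\mathbf{B}_2 = \begin{bmatrix} 0.1667 & 0.25 \\ 0.25 & 0.5 \end{bmatrix} ,$$
which is the exact inverse of the Hessian matrix
$$\mathbf{Q} = \begin{bmatrix} 24 & -12 \\ -12 & 8 \end{bmatrix} .$$
It can also be verified that, as expected, the directions $\mathbf{s}_0$ and $\mathbf{s}_1$ are Q-conjugate.
•••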
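The verification left to the reader in Example 4.2.4 can be carried out numerically. The sketch below assumes NumPy; `bfgs_update` implements the compact form of Eq. (4.2.61), and `bfgs_quadratic` is an illustrative driver that uses the closed-form exact line search available for quadratic functions, not a general-purpose BFGS code.

```python
import numpy as np

Q = np.array([[24.0, -12.0], [-12.0, 8.0]])   # Hessian of the example quadratic
b = np.array([2.0, 0.0])


def grad(x):
    """Gradient of f = 0.5 x^T Q x + b^T x."""
    return Q @ x + b


def bfgs_update(B, p, y):
    """Inverse Hessian BFGS update in the compact form of Eq. (4.2.61)."""
    c = p @ y
    I = np.eye(len(p))
    return (I - np.outer(p, y) / c) @ B @ (I - np.outer(y, p) / c) + np.outer(p, p) / c


def bfgs_quadratic(x0, n_iter):
    """BFGS iterations with exact line searches, starting from B_0 = I."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(len(x))
    B_history = []
    for _ in range(n_iter):
        g = grad(x)
        s = -B @ g                          # move direction, Eq. (4.2.55)
        alpha = -(g @ s) / (s @ Q @ s)      # exact line search for a quadratic
        p = alpha * s                       # p_k = x_{k+1} - x_k
        y = grad(x + p) - g                 # y_k, Eq. (4.2.52)
        x = x + p
        B = bfgs_update(B, p, y)
        B_history.append(B.copy())
    return x, B_history


x2, (B1, B2) = bfgs_quadratic([-1.0, -2.0], n_iter=2)
```

Comparing `B2` with `np.linalg.inv(Q)` checks the statement that the second update recovers the exact inverse Hessian; small differences between `B1` and the matrix printed in the example come from the four-digit rounding used in the hand calculation.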
Q-conjugacy of the directions of travel has meaning only for quadratic functions,
and is guaranteed for such problems in the case of variable-metric algorithms
belonging to Huang's family only if the line searches are exact. In fact, Q-conjugacy
of the directions is not necessary for ensuring a quadratic termination property [26].
This realization has led to the development of methods based on the DFP and BFGS
formulae that abandon the computationally expensive exact line searches. The line
searches must be such that they guarantee positive definiteness of the $\mathbf{A}_k$ or $\mathbf{B}_k$
matrices while reducing the function value appropriately. Positive definiteness is
guaranteed as long as $\mathbf{p}_k^T\mathbf{y}_k > 0$. To ensure a wide radius of convergence for a
quasi-Newton method, it is also necessary to satisfy the following two criteria. First, a
sufficiently large decrease in the function $f$ must be achieved for the step taken and,
second, the rate of decrease of $f$ in the direction $\mathbf{s}_k$ at $\mathbf{x}_{k+1}$ must be smaller than the
rate of decrease of $f$ at $\mathbf{x}_k$ [26]. In view of these observations, most algorithms with
inexact line searches require the satisfaction of the following two conditions.

( 4.2.63)

and

(4.2.64 )

The convergence of the BFGS algorithm under these conditions has been studied by
Powell [27]. Similar convergence studies with Beale's restarted conjugate gradient
method under the same two conditions have been carried out by Shanno [28].

4.2.4 Applications to Analysis

Several of the algorithms for unconstrained minimization of functions in $R^n$ can also
be used for solving a system of linear or nonlinear equations. In some cases, like the
problems of nonlinear structural analysis, the necessary condition for the potential
energy to be stationary is that its gradient vanish. The latter can be construed as
solving a system of equations of the type
$$\nabla f(\mathbf{x}) = \mathbf{g}(\mathbf{x}) = 0 , \tag{4.2.65}$$
where the Hessian of $f$ and the Jacobian of $\mathbf{g}$ are the same. In cases where the
problems are posed directly as
$$\mathbf{g}(\mathbf{x}) = 0 , \tag{4.2.66}$$
Dennis and Schnabel [6] and others solve Eq. (4.2.66) by minimizing the nonlinear
least squares function
(4.2.67)

In this case, however, the Hessian of $f$ and the Jacobian of $\mathbf{g}$ are not identical but a
positive definite approximation to the Hessian of $f$ appropriate for most minimization
schemes can be easily generated from the Jacobian of $\mathbf{g}$ [6]. Minimization of $f$
then permits the determination of not only stable but also unstable equilibrium
configurations provided the minimization does not converge to a local minimum. In the
case of convergence to a local minimum, certain restart [6] or deflation and tunnelling
techniques [29, 30] can be invoked to force convergence to the global minimum of $f$
at which $\|\mathbf{g}\| = 0$.

4.3 Specialized Quasi-Newton Methods

4.3.1 Exploiting Sparsity

The rank-one and rank-two updates that we discussed in the previous section yield
updates which are symmetric but not necessarily sparse. In other words the Hessian
or Hessian inverse updates lead to symmetric matrices which are fully populated. In
most structural analysis problems using the finite element method it is well known
that the Hessian of the potential energy (the tangent stiffness matrix) is sparse. This
may be also true of many structural optimization problems. For such sparse systems
the solution phase for finite element models exploits the triple $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization.
Thus the Hessian or the Hessian inverse updates discussed previously are not ap-
propriate for solving large-scale structural analysis problems which involve sparse
Hessians.
In applying the BFGS method for solving large-scale nonlinear problems of struc-
tural analysis Matthies and Strang [31] have proposed an alternate implementation
of the method suitable for handling large sparse problems by storing the vectors

(4.3.1)

and
(4.3.2)
and reintroducing them to compute the new search directions. After a sequence of
five to ten iterations during which the BFGS updates are used, the stiffness matrix
is recomputed and the update information is deleted.
Sparse updates for solving large-scale problems were perhaps first proposed by
Schubert [32], who proposed a modification of Broyden's method [33] according to
which the ith row of the Hessian Ak+l is updated by using

(4.3.3)

with $\hat{\mathbf{p}}_k$ obtained from $\mathbf{p}_k$ by setting to zero those components corresponding to
known zeros in $\mathbf{A}_k^{(i)}$. The method has the drawback, however, that it cannot retain
symmetry of the resulting matrix even when starting with a symmetric, positive
definite matrix. Not only does this result in slightly increased demands on storage,
but it also requires special sparse linear equation solvers. Toint [34] and Shanno
[35] have recently proposed algorithms which find updating formulae for symmetric
Hessian matrices that preserve known sparsity conditions. The update is obtained
by calculating the smallest correction subject to linear constraints that include the
sparsity conditions. This involves the solution of a system of equations with the same
sparsity pattern as the Hessian.


Curtis, Powell and Reid [36], and Powell and Toint [37] have proposed finite
difference strategies for the direct evaluation of sparse Hessians of functions. In
addition to using the finite difference operations, they used concepts from graph
theory that minimize the number of gradient evaluations required for computing the
few non-zero entries of a sparse Hessian. By using these strategies, we can exploit
the sparsity not only in the computation of the Newton direction but also in the
formation of Hessians [38, 39].
The Curtis-Powell-Reid (CPR) strategy exploits sparsity, but not the symmetry
of the Hessian. It divides the columns of the Hessian into groups, so that in each
group the row numbers of the unknown elements of the column vectors are all dif-
ferent. After the formation of the first group, other groups are formed successively
by applying the same strategy to columns not included in the previous groups. The
number of such groups for sparse or banded matrices is usually very small by comparison
with $n$. To evaluate the Hessian of $f$ at $\mathbf{x}_0$ we evaluate the gradient of $f$ at $\mathbf{x}_0$.
After this initial gradient evaluation, only as many more gradient evaluations as the
number of groups are needed to evaluate all the non-zero elements of the Hessian
using forward difference approximation. Thus
$$a_{ij} = \frac{\partial g_i}{\partial x_j} \approx \frac{g_i(\mathbf{x}_0 + h_j\mathbf{e}_j) - g_i(\mathbf{x}_0)}{h_j} , \tag{4.3.4}$$

where $\mathbf{e}_j$ is the $j$th coordinate vector and $h_j$ is a suitable step size. Each step size may
be adjusted such that the greatest ratio of the round-off to truncation error for any
column of the Hessian falls within a specified range. However, such an adjustment of
step sizes would require a significantly large number of gradient evaluations. Hence,
to economize on the number of gradient evaluations the step sizes are not allowed to
leave the range
( 4.3.5)
where $\epsilon$ is the greatest relative round-off in a single operation, $\eta$ is the relative machine
precision, and $h_{uj}$ is an upper bound on $h_j$ [36].
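The grouping idea behind the CPR strategy can be illustrated with a short sketch. It assumes NumPy and a known boolean sparsity pattern; the names `cpr_groups` and `cpr_hessian` are illustrative, the grouping is a simple greedy pass, and a single fixed step size `h` is used instead of the adaptive step-size control described above.

```python
import numpy as np


def cpr_groups(pattern):
    """Greedily group columns so that no two columns in a group share a non-zero row."""
    groups = []
    remaining = list(range(pattern.shape[1]))
    while remaining:
        covered = np.zeros(pattern.shape[0], dtype=bool)
        group, rest = [], []
        for j in remaining:
            if not np.any(covered & pattern[:, j]):
                group.append(j)
                covered |= pattern[:, j]
            else:
                rest.append(j)
        groups.append(group)
        remaining = rest
    return groups


def cpr_hessian(grad, x0, pattern, h=1e-6):
    """Forward-difference Hessian estimate using one gradient evaluation per group."""
    g0 = grad(x0)
    A = np.zeros(pattern.shape)
    for group in cpr_groups(pattern):
        step = np.zeros_like(x0)
        step[group] = h                      # perturb all columns of the group at once
        dg = (grad(x0 + step) - g0) / h      # Eq. (4.3.4) for every column in the group
        for j in group:
            rows = pattern[:, j]
            A[rows, j] = dg[rows]            # rows of column j are unique within the group
    return A


# Tridiagonal test Hessian: 6 variables, but only 3 groups are needed.
n = 6
Q = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
pattern = Q != 0
groups = cpr_groups(pattern)
A = cpr_hessian(lambda x: Q @ x, np.ones(n), pattern)
```

For the tridiagonal pattern the number of extra gradient evaluations stays at three however large `n` becomes, which is the saving the strategy is designed to deliver.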
Powell and Toint [37] extended the CPR strategy to exploit symmetry of the
Hessian. They proposed two methods, one of which is known as the substitution
method. According to this, the CPR strategy is first applied to the lower triangular
part, $\mathbf{L}$, of the symmetric Hessian, $\mathbf{A}$. Because not all the elements of $\mathbf{A}$ computed this
way will be correct, the incorrect elements are corrected by a back-substitution
scheme. Details of this back-substitution scheme may be found in Ref. 37.
The Powell-Toint (PT) strategy of estimating sparse Hessians directly appears
to be a much better alternative to Toint's sparse update algorithm [38]. One major
drawback of Toint's update algorithm is that the updated Hessian approximation is
not guaranteed to remain positive definite even if the initial Hessian approximation
was positive definite.
4.3.2 Coercion of Hessians for Suitability with Quasi-Newton Methods

In minimizing a multivariable function $f$ using a discrete Newton method or
Toint's update algorithm we must ensure that the Hessian approximation is positive
definite. If this is not so, then Newton's direction is not guaranteed to be a descent
direction. There are several strategies for coercing an indefinite Hessian to a positive
definite form. Prominent among these strategies is the one proposed by Gill and
Murray [40]. The most impressive feature of this strategy is that the coercion of
the Hessian takes place during its $\mathbf{L}\mathbf{D}\mathbf{L}^T$ decomposition for the computation of the
Newton direction. The diagonal elements of the $\mathbf{D}$ matrix are forced to be sufficiently
positive to avoid numerical difficulties while the off-diagonal terms of $\mathbf{L}\mathbf{D}^{1/2}$ are
limited by a quantity designed to guarantee positive definiteness of the resulting matrix.
This is equivalent to modifying the original non-positive definite Hessian matrix by
the addition of an appropriate diagonal matrix. Because this matrix modification
is carried out during its $\mathbf{L}\mathbf{D}\mathbf{L}^T$ decomposition, the strategy for the computation of
Newton's descent direction does not entail a great deal of additional computations.
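The effect of adding a diagonal matrix to an indefinite Hessian can be seen in a small sketch. Note that this is not the Gill and Murray modified factorization: as a simpler stand-in it adds a multiple `tau` of the identity and increases `tau` until a Cholesky factorization succeeds. The function name `coerce_positive_definite` and the parameters `beta` and `max_tries` are assumptions made for illustration.

```python
import numpy as np


def coerce_positive_definite(H, beta=1e-3, max_tries=60):
    """Return (H + tau*I, tau) with the smallest tried tau making the matrix positive definite."""
    I = np.eye(H.shape[0])
    tau = 0.0
    for _ in range(max_tries):
        try:
            np.linalg.cholesky(H + tau * I)   # succeeds only for positive definite input
            return H + tau * I, tau
        except np.linalg.LinAlgError:
            tau = max(2.0 * tau, beta)        # start at beta, then double
    raise RuntimeError("could not make the matrix positive definite")


H = np.array([[1.0, 2.0], [2.0, 1.0]])        # indefinite: eigenvalues 3 and -1
g = np.array([1.0, -1.0])                     # gradient along the negative-curvature direction

s_raw = np.linalg.solve(H, -g)                # Newton direction with the indefinite Hessian
H_mod, tau = coerce_positive_definite(H)
s_mod = np.linalg.solve(H_mod, -g)            # direction with the coerced Hessian
```

With the unmodified matrix the computed direction points uphill for this gradient, while the coerced matrix gives a descent direction. The repeated factorizations make this approach more expensive than a modification carried out inside a single decomposition.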
4.3.3 Making Quasi-Newton Methods Globally Convergent

It is well known that despite a positive definite Hessian approximation, Newton's
method can diverge for some starting points. Standard backtracking along
the Newton direction by choosing shorter step lengths can achieve convergence to
the minimum. However, backtracking along the Newton direction fails to use the
n-dimensional quadratic model of the function f. Dennis and Schnabel [7] have pro-
posed a strategy called the double-dogleg strategy which uses the full n-dimensional
quadratic model to choose a new direction obtained by a linear combination of the
steepest descent and the Newton direction. This new direction is a function of the
radius of the trust region within which the n-dimensional quadratic model of the
function approximates the true function well. The double-dogleg strategy not only
makes Newton's method globally convergent (that is, makes it converge to the minimum
of the function irrespective of the starting point) but also makes it significantly more
efficient for certain poorly scaled problems. For details about the double-dogleg strategy
readers are advised to consult Ref. 7. More recent attempts to widen the domain of
convergence of the quasi-Newton method or make it globally convergent for a wide
class of problems are studied in Refs. [41, 42].

4.4 Probabilistic Search Algorithms

A common disadvantage of most of the algorithms discussed so far is their inability
to distinguish local and global minima. Many structural design problems have more
than one local minimum, and depending on the starting point, these algorithms
may converge to one of these local minima. The simplest way to check for a better
local solution is to restart the optimization from randomly selected initial points to
check if other solutions are possible. However, for problems with a large number of
variables the possibility of missing the global minimum is large unless an impractically
large number of optimization runs is performed. The topic of global optimization
is an area of active research where new algorithms are emerging and old algorithms
are constantly being improved [43-45].


Dealing with the problem of local minima becomes even worse if the design variables
are required to take discrete values. First of all, for such problems the design
space is discontinuous and disjointed, therefore derivative information is either useless
or is not defined. Secondly, the use of discrete values for the design variables
introduces multiple minima corresponding to various combinations of the variables, even if
the objective function for the problem has a single minimum for continuous variables.
A methodical way of dealing with multiple minima for discrete optimization problems
is to use either random search techniques that would sample the design space
for a global minimum or to employ enumerative type algorithms. In either case, the
efficiency of the solution process deteriorates dramatically as the number of variables
is increased.

Two algorithms, Simulated Annealing and Genetic Algorithms (see Laarhoven
[46] and Goldberg [47], respectively), have emerged more recently as tools ideally
suited for optimization problems where a global minimum is sought. In addition to
being able to locate near global solutions, these two algorithms are also powerful tools
for problems with discrete-valued design variables. Both algorithms rely on naturally
observed phenomena and their implementation calls for the use of a random selection
process which is guided by probabilistic decisions. In the following sections brief
descriptions of the two algorithms are presented. Application of the algorithms to
structural design will be demonstrated for laminated composites in Chapter 11.

4.4.1 Simulated Annealing

The development of the simulated annealing algorithm was motivated by studies in
statistical mechanics which deal with the equilibrium of a large number of atoms in
solids and liquids at a given temperature. During solidification of metals or formation
of crystals, for example, a number of solid states with different internal atomic
or crystalline structure that correspond to different energy levels can be achieved
depending on the rate of cooling. If the system is cooled too rapidly, it is likely that
the resulting solid state would have a small margin of stability because the atoms will
assume relative positions in the lattice structure to reach an energy state which is
only locally minimal. In order to reach a more stable, globally minimum energy state,
the process of annealing is used in which the metal is reheated to a high temperature
and cooled slowly, allowing the atoms enough time to find positions that minimize
a steady state potential energy. It is observed in the natural annealing process that
during the time spent at a given temperature it is possible to have the system jump
to a higher energy state temporarily before the steady state is reached. As will be
explained in the following paragraphs, it is this characteristic of the annealing process
which makes it possible to achieve near global minimum energy states.
A computational algorithm that simulates the annealing process was proposed
by Metropolis et al. [48], and is referred to as the Metropolis algorithm. At a given
temperature, $T$, the algorithm perturbs the position of an atom randomly and
computes the resulting change in the energy of the system, $\Delta E$. If the new energy state
is lower than the initial state, then the new configuration of the atoms is accepted.
If, on the other hand, $\Delta E \ge 0$ and the perturbed state causes an increase in the energy,


the new state might be accepted or rejected based on a random probabilistic decision.
The probability of acceptance, $P(\Delta E)$, of a higher energy state is computed as
$$P(\Delta E) = e^{-\Delta E/(k_B T)} , \tag{4.4.1}$$
where kB is the Boltzmann's constant. If the temperature of the system is high, then
the probability of acceptance of a higher energy state is close to one. If, on the other
hand, the temperature is close to zero, then the probability of acceptance becomes
very small.
The decision to accept or reject is made by randomly selecting a number in an
interval (0,1) and comparing it with $P(\Delta E)$. If the number is less than $P(\Delta E)$, then
the perturbed state is accepted; if it is greater than $P(\Delta E)$, the state is rejected.
At each temperature, a pool of atomic structures would be generated by randomly
perturbing positions until a steady state energy level is reached (commonly referred
to as thermal equilibrium). Then the temperature is reduced to start the iterations
again. These steps are repeated iteratively while reducing the temperature slowly to
achieve the minimal energy state.
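The acceptance rule described above is short enough to state as code. The sketch below assumes the usual Metropolis form $P(\Delta E) = \exp(-\Delta E/(k_B T))$ for the acceptance probability; the function names are illustrative, and the default `k_b=1.0` is an assumption that simply absorbs Boltzmann's constant into the temperature scale.

```python
import math
import random


def acceptance_probability(delta_e, temperature, k_b=1.0):
    """Probability of accepting a perturbed state with energy change delta_e."""
    if delta_e < 0.0:
        return 1.0                       # lower energy states are always accepted
    return math.exp(-delta_e / (k_b * temperature))


def metropolis_accept(delta_e, temperature, rng=random, k_b=1.0):
    """Accept or reject by comparing a uniform random number with P(delta_e)."""
    return rng.random() < acceptance_probability(delta_e, temperature, k_b)
```

At high temperature the probability of accepting an uphill move approaches one, and as the temperature approaches zero it becomes negligible, which matches the limiting behaviour described in the text.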
The analogy between the simulated annealing and the optimization of functions
with many variables was established recently by Kirkpatrick et al. [49], and Cerny
[50]. By replacing the energy state with an objective function $f$, and using variables
$\mathbf{x}$ for the configurations of the particles, we can apply the Metropolis algorithm
to optimization problems. The method requires only function values. A move in
the design space from one point, $\mathbf{x}_i$, to another, $\mathbf{x}_j$, causes a change in the objective
function, $\Delta f_{ij}$. The temperature $T$ now becomes a control parameter that regulates
the convergence of the process. Important elements that affect the performance of
the algorithm are the selection of the initial value of the "temperature", $T_0$, and
how to update it. In addition, the number of iterations (or combinations of design
variables) needed to achieve "thermal equilibrium" must be decided before $T$ can
be reduced. These parameters are collectively referred to as the "cooling schedule".
A flow chart of a typical simulated annealing algorithm is shown in Figure 4.4.1.
The definition of the cooling schedule begins with the selection of the initial
temperature. If a low value of $T_0$ is used, the algorithm would have a low probability of
reaching a global minimum. The initial value of $T_0$ must be high enough to permit
virtually all moves in the design space to be acceptable so that almost a random search
is performed. Typically, $T_0$ is selected such that the acceptance ratio $\chi$ (defined as
the ratio of the number of accepted moves to total number of proposed moves) is
approximately $\chi_0 = 0.95$ [51]. Johnson et al. [52] determined $T_0$ by calculating the
average increase in the objective function, $\overline{\Delta f}^{(+)}$, over a predetermined number of
moves and solved

( 4.4.2)
leading to

(4.4.3)

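[Flow chart: initialize $\mathbf{x}^0$ and $T_0$, set $f_0 = f(\mathbf{x}^0)$, $k = 0$, $m = 0$; generate a move
$\mathbf{x}_j = \mathbf{x}_i + \Delta\mathbf{x}$, evaluate $f_j = f(\mathbf{x}_j)$ and $\Delta f_{ij} = f_j - f_i$; set $m = m + 1$; reduce the
temperature and set $k = k + 1$; decision branches are labelled Y and N.]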

Figure 4.4.1 Flow chart of the simulated annealing algorithm.

Once the temperature is set, a number of moves in the variable space is performed
by perturbing the design. The number of moves at a given temperature must be large
enough to allow the solution to escape from a local minimum. One possibility is to
move until the value of the objective function does not change for a specified number,
$M$, of successive iterations. Another possibility suggested by Aarts [53] for discrete
valued design variables is to make sure that every possible combination of design
variables in the neighborhood of a steady state design is visited at least once with a
probability of $P$. That is, if there are $S$ neighboring designs, then
$$M = S \ln\left(\frac{1}{1 - P}\right) , \tag{4.4.4}$$

where $P = 0.99$ for $S > 100$, and $P = 0.995$ for $S < 100$. For discrete valued
variables there are often many options for defining the neighborhood of the design.
One possibility is to define it as all the designs that can be obtained by changing one
design variable to its next higher or lower value. A broader immediate neighborhood
can be defined by changing more than one design variable to their next higher or
lower values. For an $n$ variable problem, the immediate neighborhood has
$$S = 3^n - 1 . \tag{4.4.5}$$

Once convergence is achieved at a given temperature, generally referred to as thermal
equilibrium, the temperature is reduced and the process is repeated.
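The number of moves implied by Eqs. (4.4.4) and (4.4.5) is easily tabulated. The following sketch uses illustrative function names; rounding $M$ up to the next integer and applying $P = 0.995$ at exactly $S = 100$ are assumptions, since the text does not specify either.

```python
import math


def neighborhood_size(n):
    """Size of the broader immediate neighborhood of an n-variable design, Eq. (4.4.5)."""
    return 3 ** n - 1


def moves_per_temperature(S, P=None):
    """Number of moves M needed to visit each of S neighbors with probability P, Eq. (4.4.4)."""
    if P is None:
        P = 0.99 if S > 100 else 0.995   # values quoted in the text; S == 100 is an assumption
    return math.ceil(S * math.log(1.0 / (1.0 - P)))


S2 = neighborhood_size(2)        # two design variables
S5 = neighborhood_size(5)        # five design variables
M2 = moves_per_temperature(S2)
M5 = moves_per_temperature(S5)
```

Because $S$ grows as $3^n$, the number of moves required at each temperature rises very quickly with the number of design variables.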
Many different schemes have been proposed for updating the temperature. A
frequently used rule is a constant cooling update

k = 0,1,2, ... ,J(, (4.4.6)

where $0.5 \le \alpha \le 0.95$. Nahar [54] fixes the number of decrement steps $K$, and
suggests determining the values of $T_k$ experimentally. It is also possible to
divide the interval $[0, T_0]$ into a fixed number $K$ of steps and use

$$T_k = \frac{K - k}{K} T_0, \qquad k = 1, 2, \ldots, K. \tag{4.4.7}$$

The number of intervals typically ranges from 5 to 20.
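
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the acceptance rule is the standard Metropolis-type criterion (assumed here), and the test objective, the neighbor move, and all parameter values are hypothetical.

```python
import math
import random

def geometric_schedule(t0, alpha, num_steps):
    """Temperatures T_0, alpha*T_0, ..., alpha**num_steps * T_0 (constant cooling)."""
    return [t0 * alpha ** k for k in range(num_steps + 1)]

def linear_schedule(t0, num_steps):
    """Temperatures T_k = (K - k) / K * T_0 for k = 1, ..., K."""
    return [(num_steps - k) / num_steps * t0 for k in range(1, num_steps + 1)]

def simulated_annealing(f, x0, neighbor, temperatures, moves_per_temp, rng):
    """Minimize f from x0, making moves_per_temp moves at each temperature."""
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for t in temperatures:
        for _ in range(moves_per_temp):
            y = neighbor(x, rng)
            fy = f(y)
            delta = fy - fx
            # Metropolis-type rule (assumed): always accept improvements, and
            # accept a worse design with probability exp(-delta / T) when T > 0.
            if delta <= 0 or (t > 0 and rng.random() < math.exp(-delta / t)):
                x, fx = y, fy
                if fx < best_f:
                    best_x, best_f = x, fx
    return best_x, best_f

# Hypothetical test problem: integer variables in 0..10 with minimum at (3, 3, 3).
def objective(x):
    return sum((xi - 3) ** 2 for xi in x)

def step_one_variable(x, rng):
    """Change one randomly chosen variable to its next higher or lower value."""
    i = rng.randrange(len(x))
    xi = min(10, max(0, x[i] + rng.choice((-1, 1))))
    return x[:i] + (xi,) + x[i + 1:]

rng = random.Random(0)
temps = geometric_schedule(t0=10.0, alpha=0.8, num_steps=20)
best_x, best_f = simulated_annealing(objective, (10, 0, 10), step_one_variable,
                                     temps, moves_per_temp=50, rng=rng)
print(best_x, best_f)
```

A fixed number of moves per temperature is used here for simplicity; either of the stopping rules for thermal equilibrium discussed above could replace it.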


The use of simulated annealing for structural optimization has been quite recent.
Elperin [55] applied the method to the design of a ten-bar truss problem where
member cross-sectional dimensions were to be selected from a set of discrete values.
Kincaid and Padula [56] used it for minimizing the distortion and internal forces in a
truss structure. A 6-story, 156-member frame structure with discrete valued variables
was considered by Balling and May [57]. Optimal placement of active and passive
members in a truss structure was investigated by Chen et al. [58] to maximize the
finite-time energy dissipation to achieve increased damping properties.

4.4.2 Genetic Algorithms

Genetic algorithms use techniques derived from biology, and rely on the principle of
Darwin's theory of survival of the fittest. When a population of biological creatures
is allowed to evolve over generations, individual characteristics that are useful for
survival tend to be passed on to the future generations, because individuals carry-
ing them get more chances to breed. Those individual characteristics in biological
populations are stored in chromosomal strings. The mechanics of natural genetics
is based on operations that result in structured yet randomized exchange of genetic

information (i.e., useful traits) between the chromosomal strings of the reproducing
parents, and consists of reproduction, crossover, occasional mutation, and inversion
of the chromosomal strings.
Genetic algorithms, developed by Holland [59], simulate the mechanics of natural
genetics for artificial systems based on operations which are the counterparts of the
natural ones (even called by the same names), and are extensively used as multi-
variable search algorithms. As will be described in the following paragraphs, these
operations involve simple, easy to program, random exchanges of location of num-
bers in a string, and, therefore, at the outset look like a completely random search
of extremum in the parameter space based on function values only. However, ge-
netic algorithms are experimentally proven to be robust, and the reader is referred to
Goldberg [47] for further discussion of the theoretical properties of genetic algorithms.
Here we discuss the genetic representation of a minimization problem, and focus on
the mechanics of three commonly used genetic operations, namely reproduction,
crossover, and mutation.
Application of the operators of the genetic algorithm to a search problem first
requires the representation of the possible combinations of the variables in terms
of bit strings that are counterparts of the chromosomes. Naturally, the measure of
goodness of a specific combination of genes is represented in an artificial system by
the objective function of the search problem. For example, if we have a minimization
problem
minimize f(x), (4.4.8)
a binary string representation of the variable space could be of the form

$$\underbrace{0\ 1\ 1\ 0}_{x_1}\ \underbrace{1\ 0\ 1}_{x_2}\ \underbrace{1\ 1}_{x_3}\ \underbrace{1\ 0\ 1\ 1}_{x_4} \tag{4.4.9}$$

where string equivalents of the individual variables are connected head-to-tail, and,
in this example, base 10 values of the variables are $x_1 = 6$, $x_2 = 5$, $x_3 = 3$, $x_4 = 11$,
and their ranges correspond to $\{15 \ge x_1, x_4 \ge 0\}$, $\{7 \ge x_2 \ge 0\}$, and $\{3 \ge x_3 \ge 0\}$.
Because of the bit string representation of the variables, genetic algorithms are
ideally suited for problems where the variables are required to take discrete or integer
values. For problems where the design variables are continuous values within a
range $x_i^L \le x_i \le x_i^U$, one may need to use a large number of bits to represent the
variables to high accuracy. The number of bits that are needed depends on the
accuracy required for the final solution. For example, if a variable is defined in a
range $\{0.01 \le x_i \le 1.81\}$ and the accuracy needed for the final value is $x_{incr} = 0.001$,
then the number of binary digits needed for an appropriate representation can be
calculated from
$$2^m \ge \frac{x_i^U - x_i^L}{x_{incr}} + 1, \tag{4.4.10}$$
where $m$ is the number of digits and $x_i^L$, $x_i^U$ are the lower and upper limits of the
range. In this example, the smallest number of digits that
satisfies the requirement would be $m = 11$, which actually produces increments of
0.00087 in the value of the variable, instead of the required value of 0.001.
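
The bit-count calculation can be checked with a short sketch. The condition used is 2^m >= (range / increment) + 1, an assumed form that is consistent with the m = 11 result quoted above; the function names are illustrative.

```python
def bits_needed(x_low, x_high, x_incr):
    """Smallest m with 2**m >= (x_high - x_low) / x_incr + 1."""
    levels = (x_high - x_low) / x_incr + 1
    m = 1
    while 2 ** m < levels:
        m += 1
    return m

def resolution(x_low, x_high, m):
    """Increment actually produced when m bits span [x_low, x_high]."""
    return (x_high - x_low) / (2 ** m - 1)

def decode(bits, x_low, x_high):
    """Map a bit string such as '0110' to a value in [x_low, x_high]."""
    return x_low + int(bits, 2) * resolution(x_low, x_high, len(bits))

m = bits_needed(0.01, 1.81, 0.001)
print(m, resolution(0.01, 1.81, m))

# Fields of 4, 3, 2 and 4 bits decode to the integer values 6, 5, 3 and 11
# quoted in the text for x1, x2, x3 and x4.
print([int(b, 2) for b in ("0110", "101", "11", "1011")])
```

With 11 bits the 2048 available levels exceed the 1801 levels required, which is why the resulting increment is slightly finer than the requested 0.001.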
Unlike the search algorithms discussed earlier that move from one point to another
in the design variable space, genetic algorithms work with a population of strings

(chromosomes). This aspect of the genetic algorithms is responsible for capturing
near global solutions, by keeping many solution points that may have the potential
of being close to minima (local or global) in the pool during the search process rather
than singling out a point early in the process and running the risk of getting stuck at
a local minimum. Working on a population of designs also suggests the possibility of
implementation on parallel computers. However, the concept of parallelism is even
more basic to genetic algorithms in that evolutionary selection can improve in parallel
many different characteristics of the design. Also, the outcome of a genetic search is
a population of good designs rather than a single design. This aspect can be very
useful to the designer.
Initially the size of the population is chosen and the values of the variables in
each string are decided by randomly assigning O's and 1's to the bits. The next
important step in the process is reproduction, in which individual strings with good
objective function values are copied to form a new population, an artificial version
of the survival of the fittest. The bias towards strings with better performance can
be achieved by increasing the probability of their selection in relation to the rest of
the population. One way to achieve this is to create a biased roulette wheel where
individual strings occupy areas proportional to their function values in relation to
the cumulative function value of the entire population. Therefore, the population
resulting from the reproduction operation would have multiple copies of the highly
fit individuals.
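
A biased roulette wheel of this kind can be sketched as follows. The sketch assumes that each string has already been assigned a nonnegative fitness for which larger is better; for a minimization problem the objective values would first have to be transformed into such a measure, and that transformation is not shown here. The function names are illustrative.

```python
import random

def roulette_wheel_select(fitness, rng):
    """Return an index chosen with probability fitness[i] / sum(fitness)."""
    total = sum(fitness)
    pick = rng.random() * total
    running = 0.0
    for i, fit in enumerate(fitness):
        running += fit
        if pick < running:
            return i
    return len(fitness) - 1  # guard against round-off at the upper end

def reproduce(population, fitness, rng):
    """Form a new population of the same size by repeated biased selection."""
    return [population[roulette_wheel_select(fitness, rng)] for _ in population]

rng = random.Random(1)
population = ["0110", "1011", "0001", "1110"]
fitness = [1.0, 2.0, 3.0, 14.0]  # hypothetical fitness values
print(reproduce(population, fitness, rng))
```

Strings with a large share of the total fitness tend to appear several times in the new population, while weak strings tend to disappear.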
Once the new population is generated, the members are paired off randomly for
crossover. The mating of the pair also involves a random process. A random integer
k between 1 and L - 1, where L is the string length, is selected and two new strings
are generated by exchanging the 0's and 1's that come after the kth location in the
first parent with the corresponding locations of the second parent. For example, if the
two strings of length L = 9
$$\begin{array}{ll} \text{parent 1:} & 0\ 1\ 1\ 0\ 1\ |\ 0\ 1\ 1\ 1 \\ \text{parent 2:} & 0\ 1\ 0\ 0\ 1\ |\ 0\ 0\ 0\ 1 \end{array} \tag{4.4.11}$$
are mated with a crossover point of k = 5, the offspring will have the following
composition,
$$\begin{array}{ll} \text{offspring 1:} & 0\ 1\ 1\ 0\ 1\ 0\ 0\ 0\ 1 \\ \text{offspring 2:} & 0\ 1\ 0\ 0\ 1\ 0\ 1\ 1\ 1 \end{array} \tag{4.4.12}$$
Multiple point crossovers in which information between the two parents is swapped
among more string segments are also possible, but because of the mixing of the strings
the crossover becomes a more random process and the performance of the algorithm
might degrade, De Jong [60]. An exception to this is the two-point crossover. In fact,
the one point crossover can be viewed as a special case of the two point crossover in
which the end of the string is the second crossover point. Booker [61] showed that
by choosing the end-point of the segment to be crossed randomly, the performance
of the algorithm can actually be improved.
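
The one-point and two-point crossover operations can be sketched as below, with strings stored as Python `str` objects. The function names are illustrative; the example reproduces the parents of length L = 9 and the crossover point k = 5 used above.

```python
def one_point_crossover(parent1, parent2, k):
    """Swap everything after the k-th location (1 <= k <= L - 1)."""
    if len(parent1) != len(parent2) or not 1 <= k <= len(parent1) - 1:
        raise ValueError("parents must have equal length L and 1 <= k <= L - 1")
    return parent1[:k] + parent2[k:], parent2[:k] + parent1[k:]

def two_point_crossover(parent1, parent2, k1, k2):
    """Swap the segment between locations k1 and k2 (1 <= k1 < k2 <= L)."""
    if len(parent1) != len(parent2) or not 1 <= k1 < k2 <= len(parent1):
        raise ValueError("need equal lengths and 1 <= k1 < k2 <= L")
    child1 = parent1[:k1] + parent2[k1:k2] + parent1[k2:]
    child2 = parent2[:k1] + parent1[k1:k2] + parent2[k2:]
    return child1, child2

p1, p2 = "011010111", "010010001"
print(one_point_crossover(p1, p2, 5))
# Placing the second crossover point at the end of the string recovers
# the one-point result.
print(two_point_crossover(p1, p2, 5, 9))
```

In a complete algorithm k would be drawn at random, for example with `random.randint(1, L - 1)`.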
Mutation serves an important task of preventing premature loss of important
genetic information by occasional introduction of random alteration of a string. As


mentioned earlier, at the end of reproduction it is possible to have populations with
multiple copies of the same string. In the worst scenario, it is possible for the
entire pool to be made of the same string. In such a case, the algorithm would
be unable to explore the possibility of a better solution. Mutation prevents this
uniformity, and is implemented by randomly selecting a string location and changing
its value from 0 to 1 or vice versa. Based on the small rate of occurrence in biological
systems and on numerical experiments, the role of the mutation operation on the
performance of a genetic algorithm is considered to be a secondary effect. Goldberg
[47] suggests a rate of mutation of one in one thousand bit operations.
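
One common way to realize a mutation rate per bit operation is to flip every bit independently with a small probability. The sketch below takes that approach, which is an assumption rather than the only possible implementation; the function name is illustrative.

```python
import random

def mutate(bits, rate, rng):
    """Flip each bit of the string independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in bits)

rng = random.Random(0)
# A rate of one in one thousand, as suggested in the text, changes a short
# string only occasionally.
print(mutate("011010001", 0.001, rng))
```

Even at this low rate, mutation can reintroduce a bit value that has disappeared from every string in the pool, which crossover alone cannot do.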

Application of genetic algorithms in optimal structural design has started only
recently. The first application of the algorithm to a structural design was presented by
Goldberg and Samtani [62] who used the 10-bar truss weight minimization problem.
More recently, Hajela [63] used genetic search for several structural design problems
for which the design space is known to be either nonconvex or disjoint. Rao et al.
[64] address the optimal selection of discrete actuator locations in actively controlled
structures via genetic algorithms.

In closing, the basic idea of simulating natural phenomena is finding
a more mathematically sound foundation in the area of probabilistic search
algorithms, especially for discrete variables. Improvements in the performance of the
algorithms are constantly being made. For example, modifications in the cooling
schedule proposed by Szu [65] led to the development of a new algorithm known as
fast simulated annealing. Applications and analysis of other operations that
mimic natural biological genetics (such as inversion, dominance, niches, etc.) are
currently being evaluated for genetic algorithms.

4.5 Exercises

1. Solve the cantilever beam problem of Example 4.2.1 by

(a) Nelder-Mead's simplex algorithm, and

(b) Davidon-Fletcher-Powell's algorithm.

Begin with $x_0^T = (-1, -2)$. For the simplex algorithm assume an initial simplex
of size $a = 2.0$. Assume an initial base point $x_0$ with the coordinates of the other
vertices to be given by Eqs. (4.2.1) and (4.2.1).

2. Find the minimum of the function

using Powell's conjugate directions method, starting with $x_0 = (0, 0, 0)^T$.

Section 4.5: Exercises

Figure 4.5.1 Two bar unsymmetric shallow truss.


3. Determine the minimum of
$$f(x) = 100(x_2 - x_1^2)^2 + (1 - x_1)^2,$$
using the steepest descent method, starting with $x_0 = (1.2, 1.0)^T$.
4. The stable equilibrium configuration of the two bar unsymmetric shallow truss of
Figure 4.5.1 can be obtained by minimizing the potential energy function $f$ of the
non-dimensional displacement variables $x_1$, $x_2$ as

$$f(x_1, x_2) = \frac{1}{2} m \gamma \left(-\alpha_1 x_1 + \frac{1}{2} x_1^2 + x_2\right)^2 + \frac{1}{2} \left(-\alpha_1 x_1 + \frac{1}{2} x_1^2 - \frac{x_2}{\gamma}\right)^2 \gamma^4 - \bar{p}\,\gamma\, x_1,$$

where $m$, $\gamma$, $\alpha_1$, $\bar{p}$ are nondimensional quantities defined as

$$\bar{p} = \frac{P}{E A_2},$$

and $E$ is the elastic modulus, $A_1$ and $A_2$ are the cross-sectional areas of the bars.
Using the BFGS algorithm determine the equilibrium configuration in terms of $x_1$ and $x_2$
for $m = 5$, $\gamma = 4$, $\alpha_1 = 0.02$, $\bar{p} = 2 \times 10^{-5}$. Use $x_0^T = (0, 0)$.
5. Continuing the analysis of problem 4, it can be shown that the critical load $P_{cr}$
at which the shallow truss is unstable (snap-through instability) is given by

$$P_{cr} = \frac{E A_1 A_2 \gamma (\gamma + 1)^2 \alpha_1^3}{(A_1 + A_2 \gamma)\, 3\sqrt{3}}.$$
Suppose now that $P_{cr}$ as given above is to be maximized subject to the condition that
$$A_1 l_1 + A_2 l_2 = V_0 = \text{constant}.$$
The exterior penalty formulation of Chapter 5 reduces the above problem to the
unconstrained minimization of

where $r$ is a penalty parameter. Carry out the minimization of an appropriately
nondimensionalized form of $P_{cr}^*$ for $l_1 = 200$ in, $l_2 = 50$ in, $h = 2.50$ in, $V_0 = 200$ in$^3$,
$E = 10^6$ psi, $r = 10^4$ to determine an approximate solution for the optimum
truss configuration and the corresponding value of $P_{cr}$. Use the BFGS algorithm
for unconstrained minimization beginning with an initial feasible guess of
$A_1 = 0.952381$ in$^2$ and $A_2 = 0.190476$ in$^2$.
6. a) Minimize the directional derivative of $f$ in the direction $s$

$$\nabla f^T s = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} s_i$$

subject to the condition

$$\sum_{i=1}^{n} s_i^2 = 1,$$

to show that the steepest descent direction is given by

$$s = -\frac{\nabla f}{\|\nabla f\|}. \tag{4.5.1}$$
b) Repeat the above with the constraint condition on $s$ replaced by

$$s^T Q s = 1,$$
to show that the Newton direction is given by

$Q$ being the Hessian of the quadratic function $f$.

4.6 References

[1] Kamat, M.P. and Hayduk, R.J., "Recent Developments in Quasi-Newton Meth-
ods for Structural Analysis and Synthesis," AIAA J., 20 (5), 672-679, 1982.
[2] Avriel, M., Nonlinear Programming: Analysis and Methods. Prentice-Hall, Inc.,
1976.
[3] Powell, M.J.D., "An Efficient Method for Finding the Minimum of a Function of
Several Variables without Calculating Derivatives," Computer J., 7, pp. 155-162,
1964.
[4] Kiefer, J., "Sequential Minmax Search for a Maximum," Proceedings of the Amer-
ican Mathematical Society, 4, pp. 502-506, 1953.

Section 4.6: References
[5] Walsh, G.R., Methods of Optimization, John Wiley, New York, 1975.
[6] Dennis, J.E. and Schnabel, R.B., Numerical Methods for Unconstrained Opti-
mization and Nonlinear Equations, Prentice-Hall, 1983.
[7] Gill, P.E., Murray, W. and Wright, M.H., Practical Optimization, Academic
Press, New York, p. 92, 1981.
[8] Spendley, W., Hext, G. R., and Himsworth, F. R., "Sequential Application of
Simplex Designs in Optimisation and Evolutionary Operation," Technometrics,
4 (4), pp. 441-461, 1962.
[9] Nelder, J. A. and Mead, R., "A Simplex Method for Function Minimization,"
Computer J., 7, pp. 308-313, 1965.
[10] Chen, D. H., Saleem, Z., and Grace, D. W., "A New Simplex Procedure for
Function Minimization," Int. J. of Modelling & Simulation, 6, 3, pp. 81-85, 1986.
[11] Cauchy, A., "Methode Generale pour la Resolution des Systemes D'equations
Simultanees," Comp. Rend. l'Academie des Sciences Paris, 5, pp. 536-538, 1847.
[12] Hestenes, M.R. and Stiefel, E., "Methods of Conjugate Gradients for Solving
Linear Systems," J. Res. Nat. Bureau Stand., 49, pp. 409-436, 1952.
[13] Fletcher, R. and Reeves, C.M., "Function Minimization by Conjugate Gradients,"
Computer J., 7, pp. 149-154, 1964.
[14] Gill, P.E. and Murray, W., "Conjugate-Gradient Methods for Large Scale Nonlin-
ear Optimization," Technical Report 79-15; Systems Optimization Lab., Dept. of
Operations Res., Stanford Univ., pp. 10-12, 1979.
[15] Powell, M.J.D., "Restart Procedures for the Conjugate Gradient Method," Math.
Prog., 12, pp. 241-254, 1975.
[16] Polak, E., Computational Methods in Optimization: A Unified Approach, Aca-
demic Press, 1971.
[17] Axelsson, O. and Munksgaard, N., "A Class of Preconditioned Conjugate Gra-
dient Methods for the Solution of a Mixed Finite Element Discretization of the
Biharmonic Operator," Int. J. Num. Meth. Engng., 14, pp. 1001-1019, 1979.
[18] Johnson, O.G., Micchelli, C.A. and Paul, G., "Polynomial Preconditioners for
Conjugate Gradient Calculations," SIAM J. Num. Anal., 20 (2), pp. 362-376,
1983.
[19] Broyden, C.G., "The Convergence of a Class of Double-Rank Minimization Al-
gorithms 2. The New Algorithm," J. Inst. Math. Appl., 6, pp. 222-231, 1970.
[20] Oren, S.S. and Luenberger, D., "Self-scaling Variable Metric Algorithms, Part
I," Manage. Sci., 20 (5), pp. 845-862, 1974.
[21] Davidon, W.C., Variable Metric Method for Minimization. Atomic Energy Com-
mission Research and Development Report, ANL-5990 (Rev.), November 1959.
[22] Fletcher, R. and Powell, M.J.D., "A Rapidly Convergent Descent Method for
Minimization," Computer J., 6, pp. 163-168, 1963.
[23] Fletcher, R., "A New Approach to Variable Metric Algorithms," Computer J., 13
(3), pp. 317-322, 1970.
[24] Goldfarb, D., "A Family of Variable-metric Methods Derived by Variational
Means," Math. Comput., 24, pp. 23-26, 1970.
[25] Shanno, D.F., "Conditioning of Quasi-Newton Methods for Function Minimiza-
tion," Math. Comput., 24, pp. 647-656, 1970.
[26] Dennis, J.E., Jr. and More, J.J., "Quasi-Newton Methods, Motivation and The-
ory," SIAM Rev., 19 (1), pp. 46-89, 1977.
[27] Powell, M.J.D., "Some Global Convergence Properties of a Variable Metric Algo-
rithm for Minimization Without Exact Line Searches," In: Nonlinear Program-
ming (R.W.Cottle and C.E. Lemke, eds.), American Mathematical Society, Prov-
idence, RI, pp. 53-72, 1976.
[28] Shanno, D.F., "Conjugate Gradient Methods with Inexact Searches," Math.
Oper. Res., 3 (2), pp. 244-256, 1978.
[29] Kamat, M.P., Watson, L.T. and Junkins, J.L., "A Robust Efficient Hybrid
Method for Finding Multiple Equilibrium Solutions," Proceedings of the Third
Intl. Conf. on Numerical Methods in Engineering, Paris, France, pp. 799-807,
March 1983.
[30] Kwok, H.H., Kamat, M.P. and Watson, L.T., "Location of Stable and Unstable
Equilibrium Configurations using a Model Trust Region, Quasi-Newton Method
and Tunnelling," Computers and Structures, 21 (6), pp. 909-916, 1985.
[31] Matthies, H. and Strang, G., "The Solution of Nonlinear Finite Element Equa-
tions," Int. J. Num. Meth. Engng., 14, pp. 1613-1626, 1979.
[32] Schubert, L.K., "Modification of a Quasi-Newton Method for Nonlinear Equations
with a Sparse Jacobian," Math. Comput., 24, pp. 27-30, 1970.
[33] Broyden, C.G., "A Class of Methods for Solving Nonlinear Simultaneous Equa-
tions," Math. Comput., 19, pp. 577-593, 1965.
[34] Toint, Ph.L., "On Sparse and Symmetric Matrix Updating Subject to a Linear
Equation," Math. Comput., 31, pp. 954-961, 1977.
[35] Shanno, D.F., "On Variable-Metric Methods for Sparse Hessians," Math. Com-
put., 34, pp. 499-514, 1980.
[36] Curtis, A.R., Powell, M.J.D. and Reid, J.K., "On the Estimation of Sparse Jaco-
bian Matrices," J. Inst. Math. Appl., 13, pp. 117-119, 1974.
[37] Powell, M.J.D. and Toint, Ph.L., "On the Estimation of Sparse Hessian Matrices,"
SIAM J. Num. Anal., 16 (6), pp. 1060-1074, 1979.
[38] Kamat, M.P., Watson, L.T. and VandenBrink, D.J., "An Assessment of Quasi-
Newton Sparse Update Techniques for Nonlinear Structural Analysis," Comput.
Meth. Appl. Mech. Engng., 26, pp. 363-375, 1981.
[39] Kamat, M.P. and VandenBrink, D.J., "A New Strategy for Stress Analysis Using
the Finite Element Method," Computers and Structures, 16 (5), pp. 651-656,
1983.
[40] Gill, P.E. and Murray, W., "Newton-type Methods for Linearly Constrained Opti-
mization," In: Numerical Methods for Constrained Optimization (Gill & Murray,
eds.), pp. 29-66. Academic Press, New York, 1974.
[41] Griewank, A.O., Analysis and Modifications of Newton's Method at Singularities.
Ph.D. Thesis, Australian National University, 1980.
[42] Decker, D.W. and Kelley, C.T., "Newton's Method at Singular Points, I and II,"
SIAM J. Num. Anal., 17, pp. 66-70; 465-471, 1980.
[43] Hansen, E., "Global Optimization Using Interval Analysis: The Multi-Dimen-
sional Case," Numer. Math., 34, pp. 247-270, 1980.
[44] Kao, J.-J., Brill, E. D., Jr., and Pfeffer, J. T., "Generation of Alternative Optima
for Nonlinear Programming Problems," Eng. Opt., 15, pp. 233-251, 1990.
[45] Ge, R., "Finding More and More Solutions of a System of Nonlinear Equations,"
Appl. Math. Computation, 36, pp. 15-30, 1990.
[46] Laarhoven, P. J. M. van., and Aarts, E., Simulated Annealing: Theory and Ap-
plications, D. Reidel Publishing, Dordrecht, The Netherlands, 1987.
[47] Goldberg, D. E., Genetic Algorithms in Search, Optimization, and Machine
Learning, Addison-Wesley Publishing Co. Inc., Reading, Massachusetts, 1989.
[48] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., and Teller,
E., "Equation of State Calculations by Fast Computing Machines," J. Chem.
Physics, 21 (6), pp. 1087-1092, 1953.
[49] Kirkpatrick, S., Gelatt, C. D., Jr., and Vecchi, M. P., "Optimization by Simulated
Annealing," Science, 220 (4598), pp. 671-680, 1983.
[50] Cerny, V., "Thermodynamical Approach to the Traveling Salesman Problem: An
Efficient Simulation Algorithm," J. Opt. Theory Appl., 45, pp. 41-52, 1985.
[51] Rutenbar, R. A., "Simulated Annealing Algorithms: An Overview," IEEE Cir-
cuits and Devices, January, pp. 19-26, 1989.
[52] Johnson, D. S., Aragon, C. R., McGeoch, L. A., and Schevon, C., "Optimization
by Simulated Annealing: An Experimental Evaluation. Part I. Graph Partition-
ing," Operations Research, 37, 1990, pp. 865-893.
[53] Aarts, E., and Korst, J., Simulated Annealing and Boltzmann Machines, A
Stochastic Approach to Combinatorial Optimization and Neural Computing,
John Wiley & Sons, 1989.
[54] Nahar, S., Sahni, S., and Shragowitz, E. V., in the Proceedings of 22nd Design
Automation Conf., Las Vegas, June 1985, pp. 748-752.
[55] Elperin, T., "Monte Carlo Structural Optimization in Discrete Variables with
Annealing Algorithm," Int. J. Num. Meth. Eng., 26, 1988, pp. 815-821.
[56] Kincaid, R. K., and Padula, S. L., "Minimizing Distortion and Internal Forces
in Truss Structures by Simulated Annealing," Proceedings of the AIAA/ASME/
ASCE/AHS/ASC 31st Structures, Structural Dynamics, and Materials Confer-
ence, Long Beach, CA., 1990, Part 1, pp. 327-333.
[57] Balling, R. J., and May, S. A., "Large-Scale Discrete Structural Optimization:
Simulated Annealing, Branch-and-Bound, and Other Techniques," presented at
the AIAA/ASME/ASCE/AHS/ASC 32nd Structures, Structural Dynamics, and
Materials Conference, Long Beach, CA., 1990.
[58] Chen, G.-S., Bruno, R. J., and Salama, M., "Optimal Placement of Active/Passive
Members in Structures Using Simulated Annealing," AIAA J., 29 (8), August
1991, pp. 1327-1334.
[59] Holland, J. H., Adaptation of Natural and Artificial Systems, The University of
Michigan Press, Ann Arbor, MI, 1975.
[60] De Jong, K. A., Analysis of the Behavior of a Class of Genetic Adaptive Systems
(Doctoral Dissertation, The University of Michigan; University Microfilms No.
76-9381), Dissertation Abstracts International, 36 (10), 5140B, 1975.
[61] Booker, L., "Improving Search in Genetic Algorithms," in Genetic Algorithms
and Simulated Annealing, Ed. L. Davis, Morgan Kaufmann Publishers, Inc., Los
Altos, CA. 1987, pp. 61-73.
[62] Goldberg, D. E., and Samtani, M. P., "Engineering Optimization via Genetic
Algorithm," Proceedings of the Ninth Conference on Electronic Computation,
ASCE, February 1986, pp. 471-482.
[63] Hajela, P., "Genetic Search-An Approach to the Nonconvex Optimization Prob-
lem," AIAA J., 28 (7), July 1990, pp. 1205-1210.
[64] Rao, S. S., Pan, T.-S., and Venkayya, V. B., "Optimal Placement of Actuators in
Actively Controlled Structures Using Genetic Algorithms," AIAA J., 29 (6), pp.
942-943, June 1991.
[65] Szu, H., and Hartley, R.L., "Nonconvex Optimization by Fast Simulated Anneal-
ing," Proceedings of the IEEE, 75 (11), pp. 1538-1540, 1987.

5 Constrained Optimization

Most problems in structural optimization must be formulated as constrained
minimization problems. In a typical structural design problem the objective function
is a fairly simple function of the design variables (e.g., weight), but the design has
to satisfy a host of stress, displacement, buckling, and frequency constraints. These
constraints are usually complex functions of the design variables available only from
an analysis of a finite element model of the structure. This chapter offers a review of
methods that are commonly used to solve such constrained problems.
The methods described in this chapter are for use when the computational cost of
evaluating the objective function and constraints is small or moderate. In these methods
the objective function and constraints are calculated exactly (e.g., by a finite
element program) whenever they are required by the optimization algorithm. This
approach can require hundreds of evaluations of objective function and constraints,
and is not practical for problems where a single evaluation is computationally ex-
pensive. For these more expensive problems we go through an intermediate stage of
constructing approximations for the objective function and constraints, or at least
for the more expensive functions. The optimization is then performed on the approx-
imate problem. This approximation process is described in the next chapter.
The basic problem that we consider in this chapter is the minimization of a
function subject to equality and inequality constraints
$$\begin{aligned} \text{minimize} \quad & f(x) \\ \text{such that} \quad & h_i(x) = 0, \quad i = 1, \ldots, n_e, \\ & g_j(x) \ge 0, \quad j = 1, \ldots, n_g. \end{aligned} \tag{5.1}$$
The constraints divide the design space into two domains, the feasible domain
where the constraints are satisfied, and the infeasible domain where at least one of
the constraints is violated. In most practical problems the minimum is found on
the boundary between the feasible and infeasible domains, that is at a point where
gj(x) = 0 for at least one j. Otherwise, the inequality constraints may be removed
without altering the solution. In most structural optimization problems the inequality
constraints prescribe limits on sizes, stresses, displacements, etc. These limits have

Chapter 5: Constrained Optimization

great impact on the design, so that typically several of the inequality constraints are
active at the minimum.
While the methods described in this section are powerful, they can often per-
form poorly when design variables and constraints are scaled improperly. To prevent
ill-conditioning, all the design variables should have similar magnitudes, and all con-
straints should have similar values when they are at similar levels of criticality. A
common practice is to normalize constraints such that $g(x) = 0.1$ corresponds to a
ten percent margin in a response quantity. For example, if the constraint is an upper
limit $\sigma_a$ on a stress measure $\sigma$, then the constraint may be written as
$$g = 1 - \frac{\sigma}{\sigma_a} \ge 0. \tag{5.2}$$

Some of the numerical techniques offered in this chapter for the solution of
constrained nonlinear optimization problems are not able to handle equality constraints,
but are limited to inequality constraints. In such instances it is possible to
replace the equality constraint of the form $h_i(x) = 0$ with two inequality constraints
$h_i(x) \le 0$ and $h_i(x) \ge 0$. However, it is usually undesirable to increase the number of
constraints. For problems with large numbers of inequality constraints, it is possible
to construct an equivalent constraint to replace them. One of the ways to replace a
family of inequality constraints ($g_i(x) \ge 0$, $i = 1, \ldots, m$) by an equivalent constraint is
to use the Kreisselmeier-Steinhauser function [1] (KS-function) defined as

$$KS[g_i(x)] = -\frac{1}{\rho} \ln\left[\sum_{i=1}^{m} e^{-\rho g_i(x)}\right], \tag{5.3}$$

where $\rho$ is a parameter which determines the closeness of the KS-function to the
smallest inequality $\min[g_i(x)]$. For any positive value of $\rho$, the KS-function
is always more negative than the most negative constraint, forming a lower bound
envelope to the inequalities. As the value of $\rho$ is increased the KS-function conforms
with the minimum value of the functions more closely. The value of the KS-function
is always bounded by

$$g_{min} \ge KS[g_i(x)] \ge g_{min} - \frac{\ln(m)}{\rho}. \tag{5.4}$$

For an equality constraint represented by a pair of inequalities, $h_i(x) \ge 0$ and
$-h_i(x) \ge 0$, the solution is at a point where both inequalities are active, $h_i(x) =
-h_i(x) = 0$, Figure 5.1. Sobieski [2] shows that for a KS-function defined by such
a positive and negative pair of $h_i$, the gradient of the KS-function at the solution
point $h_i(x) = 0$ vanishes regardless of the $\rho$ value, and its value approaches zero
as the value of $\rho$ tends to infinity, Figure 5.1. Indeed, from Eq. (5.4) at $x$ where
$h_i = 0$, the KS-function has the property

$$0 \ge KS(h, -h) \ge -\frac{\ln(2)}{\rho}. \tag{5.5}$$
Section 5.1: The Kuhn-Tucker Conditions

Figure 5.1 Kreisselmeier-Steinhauser function for replacing h(x) = 0.


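Consequently, an optimization problem

$$\begin{aligned} \text{minimize} \quad & f(x) \\ \text{such that} \quad & h_k(x) = 0, \quad k = 1, \ldots, n_e, \end{aligned} \tag{5.6}$$

may be reformulated as

$$\begin{aligned} \text{minimize} \quad & f(x) \\ \text{such that} \quad & KS(h_1, -h_1, h_2, -h_2, \ldots, h_{n_e}, -h_{n_e}) \ge -\epsilon, \end{aligned} \tag{5.7}$$

where $\epsilon$ is a small tolerance.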
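
The envelope property can be checked numerically with a small sketch. It assumes the usual form of the KS-function, KS = -(1/rho) ln(sum of exp(-rho g_i)), which is consistent with the bounds of Eqs. (5.4) and (5.5); the function name and the sample constraint values are illustrative.

```python
import math

def ks_function(g, rho):
    """KS envelope -(1/rho) * ln(sum(exp(-rho * g_i))) for constraints g_i >= 0."""
    g_min = min(g)
    # Factor out exp(-rho * g_min) so that the exponentials cannot overflow.
    total = sum(math.exp(-rho * (gi - g_min)) for gi in g)
    return g_min - math.log(total) / rho

g = [0.3, 0.1, 0.5]  # hypothetical constraint values
for rho in (5.0, 50.0, 500.0):
    ks = ks_function(g, rho)
    lower = min(g) - math.log(len(g)) / rho
    print(rho, ks, lower <= ks <= min(g))

# For an equality constraint written as the pair (h, -h), the value at h = 0
# equals the lower limit -ln(2)/rho of Eq. (5.5).
print(ks_function([0.0, -0.0], 10.0), -math.log(2.0) / 10.0)
```

Increasing rho tightens the envelope around the smallest constraint, at the cost of sharper curvature near points where two constraints have equal values.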

5.1 The Kuhn-Tucker Conditions

5.1.1 General Case

In general, problem (5.1) may have several local minima. Only under special
circumstances are we sure of the existence of a single global minimum. The necessary conditions
for a minimum of the constrained problem are obtained by using the Lagrange
multiplier method. We start by considering the special case of equality constraints only.
Using the Lagrange multiplier technique, we define the Lagrangian function
$$\mathcal{L}(x, \lambda) = f(x) - \sum_{j=1}^{n_e} \lambda_j h_j(x), \tag{5.1.1}$$

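where $\lambda_j$ are unknown Lagrange multipliers. The necessary conditions for a stationary
point are

$$\frac{\partial \mathcal{L}}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_e} \lambda_j \frac{\partial h_j}{\partial x_i} = 0, \quad i = 1, \ldots, n, \tag{5.1.2}$$
$$\frac{\partial \mathcal{L}}{\partial \lambda_j} = -h_j(x) = 0, \quad j = 1, \ldots, n_e. \tag{5.1.3}$$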

These conditions, however, apply only at a regular point, that is at a point where the
gradients of the constraints are linearly independent. If we have constraint gradients
that are linearly dependent, it means that we can remove some constraints without
affecting the solution. At a regular point, Eqs. (5.1.2) and (5.1.3) represent n + ne
equations for the ne Lagrange multipliers and the n coordinates of the stationary
point.
The situation is somewhat more complicated when inequality constraints are
present. To be able to apply the Lagrange multiplier method we first transform the
inequality constraints to equality constraints by adding slack variables. That is, the
inequality constraints are written as

$$g_j(x) - t_j^2 = 0, \quad j = 1, \ldots, n_g, \tag{5.1.4}$$

where $t_j$ is a slack variable which measures how far the jth constraint is from being
critical. We can now form a Lagrangian function

$$\mathcal{L}(x, t, \lambda) = f - \sum_{j=1}^{n_g} \lambda_j (g_j - t_j^2). \tag{5.1.5}$$

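Differentiating the Lagrangian function with respect to $x$, $\lambda$ and $t$ we obtain

$$\frac{\partial \mathcal{L}}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g} \lambda_j \frac{\partial g_j}{\partial x_i} = 0, \quad i = 1, \ldots, n, \tag{5.1.6}$$
$$\frac{\partial \mathcal{L}}{\partial \lambda_j} = -g_j + t_j^2 = 0, \quad j = 1, \ldots, n_g, \tag{5.1.7}$$
$$\frac{\partial \mathcal{L}}{\partial t_j} = 2 \lambda_j t_j = 0, \quad j = 1, \ldots, n_g. \tag{5.1.8}$$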

Equations (5.1.7) and (5.1.8) imply that when an inequality constraint is not critical
(so that the corresponding slack variable is non-zero) then the Lagrange multiplier
associated with the constraint is zero. Equations (5.1.6) to (5.1.8) are the necessary
conditions for a stationary regular point. Note that for inequality constraints a regular
point is one where the gradients of the active constraints are linearly independent.
These conditions are modified slightly to yield the necessary conditions for a minimum
and are known as the Kuhn-Tucker conditions. The Kuhn-Tucker conditions may be
summarized as follows:


A point $x$ is a local minimum of an inequality constrained problem only if a set
of nonnegative $\lambda_j$'s may be found such that:
1. Equation (5.1.6) is satisfied.
2. The corresponding $\lambda_j$ is zero if a constraint is not active.
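
These conditions are easy to test numerically at a candidate point once the gradients and trial multipliers are available. The sketch below is a minimal check written for constraints in the form g_j(x) >= 0; the function name, the tolerance, and the two-variable example are hypothetical.

```python
def kuhn_tucker_check(grad_f, grad_g, g_values, lambdas, tol=1e-8):
    """Check stationarity, nonnegativity and complementarity at a point."""
    n = len(grad_f)
    # Stationarity: grad f - sum_j lambda_j * grad g_j = 0
    residual = [grad_f[i] - sum(lam * gg[i] for lam, gg in zip(lambdas, grad_g))
                for i in range(n)]
    stationary = all(abs(r) <= tol for r in residual)
    nonnegative = all(lam >= -tol for lam in lambdas)
    # A multiplier must vanish when its constraint is inactive (g_j > 0).
    complementary = all(abs(lam * gj) <= tol for lam, gj in zip(lambdas, g_values))
    return stationary and nonnegative and complementary

# Hypothetical example: minimize f = x1**2 + x2**2
# subject to g1 = x1 + x2 - 2 >= 0 and g2 = x1 >= 0.
# At x = (1, 1): grad f = (2, 2), g1 is active, g2 = 1 is inactive.
print(kuhn_tucker_check([2.0, 2.0], [[1.0, 1.0], [1.0, 0.0]],
                        [0.0, 1.0], [2.0, 0.0]))
# At x = (2, 0): grad f = (4, 0) cannot be balanced by grad g1 alone.
print(kuhn_tucker_check([4.0, 0.0], [[1.0, 1.0], [1.0, 0.0]],
                        [0.0, 2.0], [4.0, 0.0]))
```

In practice the multipliers are not known in advance and would be estimated, for example by a least-squares fit of the active constraint gradients to the gradient of f, before a check of this kind is applied.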

Figure 5.1.1 A geometrical interpretation of the Kuhn-Tucker conditions for the case of
two constraints.

A geometrical interpretation of the Kuhn-Tucker conditions is illustrated in Fig.
(5.1.1) for the case of two constraints. $\nabla g_1$ and $\nabla g_2$ denote the gradients of the two
constraints which are orthogonal to the respective constraint surfaces. The vector $s$
shows a typical feasible direction which does not lead immediately to any constraint
violation. For the two-constraint case Eq. (5.1.6) may be written as

$$-\nabla f = -(\lambda_1 \nabla g_1 + \lambda_2 \nabla g_2). \tag{5.1.9}$$

Assume that we want to determine whether point A is a minimum or not. To improve
the design we need to proceed from point A in a direction $s$ that is usable and feasible.
For the direction to be usable, a small move along this direction should decrease the
objective function. To be feasible, $s$ should form an obtuse angle with $-\nabla g_1$ and
$-\nabla g_2$. To be a direction of decreasing $f$ it must form an acute angle with $-\nabla f$.
Clearly from Figure (5.1.1), any vector which forms an acute angle with $-\nabla f$ will also
form an acute angle with either $-\nabla g_1$ or $-\nabla g_2$. Thus the Kuhn-Tucker conditions
mean that no feasible design with reduced objective function is to be found in the
neighborhood of A. Mathematically, the condition that a direction $s$ be feasible is
written as
$$s^T \nabla g_j \ge 0, \quad j \in I_A, \tag{5.1.10}$$

where $I_A$ is the set of active constraints. Equality in Eq. (5.1.10) is permitted only
for linear or concave constraints (see Section 5.1.2 for definition of concavity). The
condition for a usable direction (one that decreases the objective function) is

$$ s^T \nabla f < 0 . \tag{5.1.11} $$

Multiplying Eq. (5.1.6) by $s_i$ and summing over $i$ we obtain

$$ s^T \nabla f = \sum_{j=1}^{n_g} \lambda_j s^T \nabla g_j . \tag{5.1.12} $$

In view of Eqs. (5.1.10) and (5.1.11), Eq. (5.1.12) is impossible if the $\lambda_j$'s are positive.
If the Kuhn-Tucker conditions are satisfied at a point it is impossible to find a
direction with a negative slope for the objective function that does not violate the
constraints. In some cases, though, it is possible to move in a direction which is
tangent to the active constraints and perpendicular to the gradient (that is, has zero
slope), that is

$$ s^T \nabla f = s^T \nabla g_j = 0 , \quad j \in I_A . \tag{5.1.13} $$
The effect of such a move on the objective function and constraints can be determined
only from higher derivatives. In some cases a move in this direction could reduce the
objective function without violating the constraints even though the Kuhn-Tucker
conditions are met. Therefore, the Kuhn-Tucker conditions are necessary but not
sufficient for optimality.
The Kuhn-Tucker conditions are sufficient when the number of active constraints
is equal to the number of design variables. In this case Eq. (5.1.13) cannot be satisfied
with $s \ne 0$ because $\nabla g_j$ includes $n$ linearly independent directions (in $n$ dimensional
space a vector cannot be orthogonal to $n$ linearly independent vectors).
When the number of active constraints is not equal to the number of design
variables, sufficient conditions for optimality require the second derivatives of the
objective function and constraints. A sufficient condition for optimality is that the
Hessian matrix of the Lagrangian function is positive definite in the subspace tangent
to the active constraints. If we take, for example, the case of equality constraints,
the Hessian matrix of the Lagrangian is

$$ \nabla^2 \mathcal{L} = \nabla^2 f - \sum_{j=1}^{n_e} \lambda_j \nabla^2 h_j . \tag{5.1.14} $$

The sufficient condition for optimality is that

$$ s^T (\nabla^2 \mathcal{L}) s > 0 , \quad \text{for all } s \ne 0 \text{ for which } s^T \nabla h_j = 0 , \quad j = 1, \ldots, n_e . \tag{5.1.15} $$

When inequality constraints are present, the vector $s$ also needs to be orthogonal to
the gradients of the active constraints with positive Lagrange multipliers. For active
constraints with zero Lagrange multipliers, $s$ must satisfy

$$ s^T \nabla g_j \ge 0 , \quad \text{when } g_j = 0 \text{ and } \lambda_j = 0 . \tag{5.1.16} $$


Example 5.1.1

Find the minimum of

$$ f = -x_1^3 - 2x_2^2 + 10x_1 - 6 - 2x_2^3 , $$

subject to

$$ g_1 = 10 - x_1 x_2 \ge 0 , $$
$$ g_2 = x_1 \ge 0 , $$
$$ g_3 = 10 - x_2 \ge 0 . $$

The Kuhn-Tucker conditions are

$$ -3x_1^2 + 10 + \lambda_1 x_2 - \lambda_2 = 0 , $$
$$ -4x_2 - 6x_2^2 + \lambda_1 x_1 + \lambda_3 = 0 . $$

We have to check for all possibilities of active constraints.
The simplest case is when no constraints are active, $\lambda_1 = \lambda_2 = \lambda_3 = 0$. We get
$x_1 = 1.826$, $x_2 = 0$, $f = 6.17$.
The Hessian matrix of the Lagrangian,

$$ \nabla^2 \mathcal{L} = \begin{bmatrix} -6x_1 & \lambda_1 \\ \lambda_1 & -4 - 12x_2 \end{bmatrix} = \begin{bmatrix} -10.95 & 0 \\ 0 & -4 \end{bmatrix} , $$

is clearly negative definite, so that this point is a maximum. We next assume that the
first constraint is active, $x_1 x_2 = 10$, so that $x_1 \ne 0$ and $g_2$ is inactive and therefore
$\lambda_2 = 0$. We have two possibilities for the third constraint. If it is active we get $x_1 = 1$,
$x_2 = 10$, $\lambda_1 = -0.7$, and $\lambda_3 = 640.7$, so that this point is neither a minimum nor a
maximum. If the third constraint is not active $\lambda_3 = 0$ and we obtain the following
three equations

$$ -3x_1^2 + 10 + \lambda_1 x_2 = 0 , $$
$$ -4x_2 - 6x_2^2 + \lambda_1 x_1 = 0 , $$
$$ x_1 x_2 = 10 . $$

The only solution for these equations that satisfies the constraints on $x_1$ and $x_2$ is

$$ x_1 = 3.847 , \quad x_2 = 2.599 , \quad \lambda_1 = 13.24 , \quad f = -73.08 . $$

This point satisfies the Kuhn-Tucker conditions for a minimum. However, the Hessian
of the Lagrangian at that point

$$ \nabla^2 \mathcal{L} = \begin{bmatrix} -23.08 & 13.24 \\ 13.24 & -35.19 \end{bmatrix} , $$

is negative definite, so that it cannot satisfy the sufficiency condition. In fact, an
examination of the function $f$ at neighboring points along $x_1 x_2 = 10$ reveals that the
point is not a minimum.

Next we consider the possibility that $g_1$ is not active, so that $\lambda_1 = 0$, and

$$ -3x_1^2 + 10 - \lambda_2 = 0 , $$
$$ -4x_2 - 6x_2^2 + \lambda_3 = 0 . $$

We have already considered the possibility of both $\lambda$'s being zero, so we need to
consider only three possibilities of one of these Lagrange multipliers being nonzero,
or both being nonzero. The first case is $\lambda_2 \ne 0$, $\lambda_3 = 0$, then $g_2 = 0$ and we get $x_1 = 0$,
$x_2 = 0$, $\lambda_2 = 10$, and $f = -6$, or $x_1 = 0$, $x_2 = -2/3$, $\lambda_2 = 10$, and $f = -6.30$. Both
points satisfy the Kuhn-Tucker conditions for a minimum. The vectors tangent to the
active constraints ($x_1 = 0$ is the only one) have the form $s^T = (0, a)$, so that
$s^T (\nabla^2 \mathcal{L}) s = -(4 + 12x_2) a^2$. At the first point this is $-4a^2 < 0$, so the sufficiency
condition is not satisfied, and it is easy to check that the point is indeed not a
minimum by reducing $x_2$ slightly. At the second point it is $4a^2 > 0$, so that point
satisfies the sufficiency condition and is a local minimum.
The next case is $\lambda_2 = 0$, $\lambda_3 \ne 0$, so that $g_3 = 0$. We get $x_1 = 1.826$, $x_2 = 10$,
$\lambda_3 = 640$ and $f = -2194$. This point satisfies the Kuhn-Tucker conditions, but it is
not a minimum either. It is easy to check that $\nabla^2 \mathcal{L}$ is negative definite in this case
so that the sufficiency condition could not be satisfied. Finally, we consider the case
$x_1 = 0$, $x_2 = 10$, $\lambda_2 = 10$, $\lambda_3 = 640$, $f = -2206$. Now the Kuhn-Tucker conditions
are satisfied, and the number of active constraints is equal to the number of design
variables, so that this point is a minimum. • • •
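
The case-by-case check in Example 5.1.1 can be repeated numerically. The following is a minimal sketch, assuming the objective and constraints of the example; the function names and the tolerance are illustrative choices and do not come from the text.

```python
# Minimal sketch: test the Kuhn-Tucker conditions of Example 5.1.1 at a
# candidate point. Function names and the tolerance are illustrative choices.

def grad_f(x1, x2):
    # f = -x1^3 - 2*x2^2 + 10*x1 - 6 - 2*x2^3
    return (-3.0 * x1**2 + 10.0, -4.0 * x2 - 6.0 * x2**2)

def constraints(x1, x2):
    # g1 = 10 - x1*x2, g2 = x1, g3 = 10 - x2, each required to be >= 0
    return (10.0 - x1 * x2, x1, 10.0 - x2)

def kt_residual(x, lam):
    # grad f - sum_j lam_j * grad g_j, with
    # grad g1 = (-x2, -x1), grad g2 = (1, 0), grad g3 = (0, -1)
    x1, x2 = x
    l1, l2, l3 = lam
    df1, df2 = grad_f(x1, x2)
    return (df1 + l1 * x2 - l2, df2 + l1 * x1 + l3)

def satisfies_kt(x, lam, tol=0.05):
    g = constraints(*x)
    feasible = all(gj >= -tol for gj in g)
    stationary = all(abs(r) <= tol for r in kt_residual(x, lam))
    nonnegative = all(lj >= 0.0 for lj in lam)
    # a multiplier may be nonzero only on an active constraint
    complementary = all(lj == 0.0 or abs(gj) <= tol for lj, gj in zip(lam, g))
    return feasible and stationary and nonnegative and complementary
```

Calling `satisfies_kt((0.0, 10.0), (0.0, 10.0, 640.0))` accepts the final point of the example, while the point with $\lambda_1 = -0.7$ is rejected because of the negative multiplier. A point that passes this test satisfies only the necessary conditions; the second-derivative check of Eq. (5.1.15) is still needed when there are fewer active constraints than design variables.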
5.1.2 Convex Problems

There is a class of problems, namely convex problems, for which the Kuhn-Tucker
conditions are not only necessary but also sufficient for a global minimum. To define
convex problems we need the notions of convexity for a set of points and for a function.
A set of points $S$ is convex whenever the entire line segment connecting two points
that are in $S$ is also in $S$. That is

$$ \text{if } x_1, x_2 \in S , \text{ then } \alpha x_1 + (1 - \alpha) x_2 \in S , \quad 0 < \alpha < 1 . \tag{5.1.17} $$

A function is convex if

$$ f[\alpha x_2 + (1 - \alpha) x_1] \le \alpha f(x_2) + (1 - \alpha) f(x_1) , \quad 0 < \alpha < 1 . \tag{5.1.18} $$

This is shown pictorially for a function of a single variable in Figure (5.1.2). The
straight segment connecting any two points on the curve must lie above the curve.
Alternatively we note that the second derivative of $f$ is non-negative, $f''(x) \ge 0$. It
can be shown that a function of $n$ variables is convex if its matrix of second derivatives
is positive semi-definite.
A convex optimization problem has a convex objective function and a convex
feasible domain. It can be shown that the feasible domain is convex if all the inequality
constraints gj are concave (that is, -gj are convex) and the equality constraints are
linear. A convex optimization problem has only one minimum, and the Kuhn-Tucker
conditions are sufficient to establish it. Most optimization problems encountered in
practice cannot be shown to be convex. However, the theory of convex programming is
still very important in structural optimization, as we often approximate optimization
problems by a series of convex approximations (see Chapter 9). The simplest such
approximation is a linear approximation for the objective function and constraints-
this produces a linear programming problem.


Figure 5.1.2 Convex function.

Example 5.1.2

Figure 5.1.3 Four bar statically determinate truss.

Consider the minimum weight design of the four bar truss shown in Figure (5.1.3).
For the sake of simplicity we assume that members 1 through 3 have the same area
$A_1$ and member 4 has an area $A_2$. The constraints are limits on the stresses in the
members and on the vertical displacement at the right end of the truss. Under the
specified loading the member forces and the vertical displacement $\delta$ at the end are
found to be

$$ f_1 = 5p , \quad f_2 = -p , \quad f_3 = 4p , \quad f_4 = -2\sqrt{3}\,p , $$

$$ \delta = \frac{6pl}{E} \left( \frac{3}{A_1} + \frac{\sqrt{3}}{A_2} \right) . $$

We assume the allowable stresses in tension and compression to be $8.74 \times 10^{-4} E$ and
$4.83 \times 10^{-4} E$, respectively, and limit the vertical displacement to be no greater than
$3 \times 10^{-3} l$. The minimum weight design subject to stress and displacement constraints
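can be formulated in terms of nondimensional design variables

$$ x_1 = 10^{-3} \frac{A_1 E}{p} , \quad x_2 = 10^{-3} \frac{A_2 E}{p} , $$

as

$$ \begin{aligned} \text{minimize} \quad & f = 3x_1 + \sqrt{3}\,x_2 \\ \text{subject to} \quad & g_1 = 3 - \frac{18}{x_1} - \frac{6\sqrt{3}}{x_2} \ge 0 , \\ & g_2 = x_1 - 5.73 \ge 0 , \\ & g_3 = x_2 - 7.17 \ge 0 . \end{aligned} $$

The Kuhn-Tucker conditions are

$$ \frac{\partial f}{\partial x_i} - \sum_{j=1}^{3} \lambda_j \frac{\partial g_j}{\partial x_i} = 0 , \quad i = 1, 2 , $$

or

$$ 3 - \frac{18}{x_1^2} \lambda_1 - \lambda_2 = 0 , $$
$$ \sqrt{3} - \frac{6\sqrt{3}}{x_2^2} \lambda_1 - \lambda_3 = 0 . $$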

Consider first the possibility that $\lambda_1 = 0$. Then clearly $\lambda_2 = 3$, $\lambda_3 = \sqrt{3}$ so that
$g_2 = 0$ and $g_3 = 0$, and then $x_1 = 5.73$, $x_2 = 7.17$, $g_1 = -1.59$, so that this solution
is not feasible. We conclude that $\lambda_1 \ne 0$, and the first constraint must be active
at the minimum. Consider now the possibility that $\lambda_2 = \lambda_3 = 0$. We have the two
Kuhn-Tucker equations and the equation $g_1 = 0$ for the unknowns $\lambda_1$, $x_1$, $x_2$. The
solution is

$$ x_1 = x_2 = 9.464 , \quad \lambda_1 = 14.93 , \quad f = 44.78 . $$

The Kuhn-Tucker conditions for a minimum are satisfied. If the problem is convex
the Kuhn-Tucker conditions are sufficient to guarantee that this point is the global
minimum. The objective function and the constraint functions $g_2$ and $g_3$ are linear,
so that we need to check only $g_1$. For convexity $g_1$ has to be concave or $-g_1$ convex;
this holds if the second derivative matrix $-A_1$ of $-g_1$ is positive semi-definite

$$ -A_1 = \begin{bmatrix} 36/x_1^3 & 0 \\ 0 & 12\sqrt{3}/x_2^3 \end{bmatrix} . $$

Clearly, for $x_1 > 0$ and $x_2 > 0$, $-A_1$ is positive definite so that the minimum that we
found is a global minimum. • • •
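
The numbers in Example 5.1.2 follow from a short closed-form calculation, sketched below. The sketch assumes only the nondimensional formulation given in the example; the function names are illustrative.

```python
import math

SQRT3 = math.sqrt(3.0)

def g1(x1, x2):
    # displacement constraint of Example 5.1.2
    return 3.0 - 18.0 / x1 - 6.0 * SQRT3 / x2

def solve_active_displacement():
    # With lambda2 = lambda3 = 0 the two stationarity equations give
    # lambda1 = x1**2 / 6 = x2**2 / 6, hence x1 = x2; g1 = 0 then fixes x.
    x = (18.0 + 6.0 * SQRT3) / 3.0
    lam1 = x**2 / 6.0
    f = 3.0 * x + SQRT3 * x
    return x, lam1, f

def neg_g1_hessian_diag(x1, x2):
    # Second derivatives of -g1 = -3 + 18/x1 + 6*sqrt(3)/x2;
    # the off-diagonal terms are zero.
    return (36.0 / x1**3, 12.0 * SQRT3 / x2**3)
```

Both diagonal entries returned by `neg_g1_hessian_diag` are positive for positive areas, which is the convexity check used in the example.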
5.2 Quadratic Programming Problems

One of the simplest forms of nonlinear constrained optimization problems is the
Quadratic Programming (QP) problem. A general QP problem has a
quadratic objective function with linear equality and inequality constraints. For the
sake of simplicity we consider only an inequality problem with $n_g$ constraints stated
as

$$ \begin{aligned} \text{minimize} \quad & f(x) = c^T x + \tfrac{1}{2} x^T Q x \\ \text{such that} \quad & A x \ge b , \\ & x_i \ge 0 , \quad i = 1, \ldots, n . \end{aligned} \tag{5.2.1} $$
The linear constraints form a convex feasible domain. If the objective function is
also convex, then we have a convex optimization problem in which, as discussed in
the previous section, the Kuhn-Tucker conditions become sufficient for the optimality
of the problem. Hence, having a positive semi-definite or positive definite Q matrix
assures a global minimum for the solution of the problem, if one exists. For many
optimization problems the quadratic form x T Qx is either positive definite or positive
semi-definite. Therefore, one of the methods for solving QP problems relies on solving
the Kuhn-Tucker conditions.
We start by writing the Lagrange function for the Problem (5.2.1)

$$ \mathcal{L}(x, \lambda, \mu, t, s) = c^T x + \tfrac{1}{2} x^T Q x - \lambda^T (A x - \{t_j^2\} - b) - \mu^T (x - \{s_i^2\}) , \tag{5.2.2} $$

where $\lambda$ and $\mu$ are the vectors of Lagrange multipliers for the inequality constraints
and the nonnegativity constraints, respectively, and $\{t_j^2\}$ and $\{s_i^2\}$ are the vectors of
positive slack variables for the same. The necessary conditions for a stationary point
are obtained by differentiating the Lagrangian with respect to $x$, $\lambda$, $\mu$, $t$, and $s$,

$$ \frac{\partial \mathcal{L}}{\partial x} = c + Q x - A^T \lambda - \mu = 0 , \tag{5.2.3} $$

$$ \frac{\partial \mathcal{L}}{\partial \lambda} = -(A x - \{t_j^2\} - b) = 0 , \tag{5.2.4} $$

$$ \frac{\partial \mathcal{L}}{\partial \mu} = -(x - \{s_i^2\}) = 0 , \tag{5.2.5} $$

$$ \frac{\partial \mathcal{L}}{\partial t_j} = 2 \lambda_j t_j = 0 , \quad j = 1, \ldots, n_g , \tag{5.2.6} $$

$$ \frac{\partial \mathcal{L}}{\partial s_i} = 2 \mu_i s_i = 0 , \quad i = 1, \ldots, n , \tag{5.2.7} $$

where $n_g$ is the number of inequality constraints, and $n$ is the number of design
variables. We define a new vector $\{q_j\} = \{t_j^2\}$, $j = 1, \ldots, n_g$ ($q \ge 0$). After
multiplying Eqs. (5.2.6) and (5.2.7) by $\{t_j\}$ and $\{s_i\}$, respectively, and eliminating
$\{s_i\}$ from the last equation by using Eq. (5.2.5), we can rewrite the Kuhn-Tucker
conditions

$$ Q x - A^T \lambda - \mu = -c , \tag{5.2.8} $$
$$ A x - q = b , \tag{5.2.9} $$
$$ \lambda_j q_j = 0 , \quad j = 1, \ldots, n_g , \tag{5.2.10} $$
$$ \mu_i x_i = 0 , \quad i = 1, \ldots, n , \tag{5.2.11} $$
$$ x \ge 0 , \quad \lambda \ge 0 , \quad \text{and} \quad \mu \ge 0 . \tag{5.2.12} $$

Equations (5.2.8) and (5.2.9) form a set of $n + n_g$ linear equations for the solution
of unknowns $x_i$, $\lambda_j$, $\mu_i$, and $q_j$ which also need to satisfy Eqs. (5.2.10) and (5.2.11).
Despite the nonlinearity of the Eqs. (5.2.10) and (5.2.11), this problem can be solved
as proposed by Wolfe [3] by using the procedure described in 3.6.3 for generating
a basic feasible solution through the use of artificial variables. Introducing a set
of artificial variables, $y_i$, $i = 1, \ldots, n$, we define an artificial cost function to be
minimized,

$$ \text{minimize} \quad \sum_{i=1}^{n} y_i \tag{5.2.13} $$
$$ \text{subject to} \quad Q x - A^T \lambda - \mu + y = -c , \tag{5.2.14} $$
$$ A x - q = b , \tag{5.2.15} $$
$$ x \ge 0 , \quad \lambda \ge 0 , \quad \mu \ge 0 , \quad \text{and} \quad y \ge 0 . \tag{5.2.16} $$

Equations (5.2.13) through (5.2.16) can be solved by using the standard simplex
method with the additional requirement that (5.2.10) and (5.2.11) be satisfied. These
requirements can be implemented during the simplex algorithm by simply enforcing
that the variables $\lambda_j$ and $q_j$ (and $\mu_i$ and $x_i$) not be included in the basic solution
simultaneously. That is, we restrict a non-basic variable $\mu_i$ from entering the basis if
the corresponding $x_i$ is already among the basic variables.
Other methods for solving the quadratic programming problem are also available,
and the reader is referred to Gill et al. ([4], pp. 177-180) for additional details.
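
A candidate solution of a small QP can be tested against the Kuhn-Tucker conditions directly. The sketch below is not Wolfe's simplex-based procedure; it only checks a given triple $(x, \lambda, \mu)$. It assumes the stationarity condition in the form $c + Qx - A^T\lambda - \mu = 0$, obtained by differentiating the objective and constraints of Problem (5.2.1) directly, and the helper name and sample data are hypothetical.

```python
# Minimal sketch: verify the Kuhn-Tucker conditions of a small QP of the form
# (5.2.1). The helper name and the sample data are illustrative assumptions.

def qp_kt_check(Q, c, A, b, x, lam, mu, tol=1e-9):
    n, ng = len(x), len(b)
    # stationarity of f = c^T x + 0.5 x^T Q x with multipliers lam and mu:
    # c + Q x - A^T lam - mu = 0
    stationarity = [
        c[i] + sum(Q[i][k] * x[k] for k in range(n))
        - sum(A[j][i] * lam[j] for j in range(ng)) - mu[i]
        for i in range(n)
    ]
    # slack of each inequality constraint, q = A x - b
    q = [sum(A[j][i] * x[i] for i in range(n)) - b[j] for j in range(ng)]
    return (
        all(abs(v) <= tol for v in stationarity)
        and all(v >= -tol for v in list(x) + q + list(lam) + list(mu))
        and all(abs(lam[j] * q[j]) <= tol for j in range(ng))   # lam_j q_j = 0
        and all(abs(mu[i] * x[i]) <= tol for i in range(n))     # mu_i x_i = 0
    )

# Sample problem: minimize -2*x1 - 5*x2 + x1^2 + x2^2 with x1 + x2 <= 2, x >= 0
Q = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, -5.0]
A = [[-1.0, -1.0]]      # -x1 - x2 >= -2
b = [-2.0]
```

For this sample problem the point $x = (0.25, 1.75)$ with $\lambda = 1.5$ and $\mu = 0$ passes the check, and because $Q$ is positive definite the conditions are also sufficient there.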

5.3 Computing the Lagrange Multipliers

As may be seen from Example 5.1.1, trying to find the minimum directly from
the Kuhn-Tucker conditions may be difficult because we need to consider many
combinations of active and inactive constraints, and this would in general involve the
solution of highly nonlinear equations. The Kuhn-Tucker conditions are, however,
often used to check whether a candidate minimum point satisfies the necessary
conditions. In such a case we need to calculate the Lagrange multipliers (also called the
Kuhn-Tucker multipliers) at a given point $x$. As we will see in the next section, we
may also want to calculate the Lagrange multipliers for the purpose of estimating the
sensitivity of the optimum solution to small changes in the problem definition. To
calculate the Lagrange multipliers we start by writing Eq. (5.1.6) in matrix notation
as

$$ \nabla f - N \lambda = 0 , \tag{5.3.1} $$

where the matrix $N$ is defined by

$$ n_{ij} = \frac{\partial g_j}{\partial x_i} , \quad j = 1, \ldots, r , \quad \text{and} \quad i = 1, \ldots, n . \tag{5.3.2} $$

We consider only the active constraints and associated Lagrange multipliers, and
assume that there are $r$ of them.
Typically, the number, $r$, of active constraints is less than $n$, so that with $n$
equations in terms of $r$ unknowns, Eq. (5.3.1) is an overdetermined system. We
assume that the gradients of the constraints are linearly independent so that $N$ has
rank $r$. If the Kuhn-Tucker conditions are satisfied the equations are consistent and
we have an exact solution. We could therefore use a subset of $r$ equations to solve for
the Lagrange multipliers. However, this approach may be susceptible to amplification
of errors. Instead we can use a least-squares approach to solve the equations. We
define a residual vector $u$

$$ u = N \lambda - \nabla f . \tag{5.3.3} $$

A least squares solution of Eq. (5.3.1) will minimize the square of the Euclidean norm
of the residual with respect to $\lambda$

$$ \|u\|^2 = (N \lambda - \nabla f)^T (N \lambda - \nabla f) = \lambda^T N^T N \lambda - 2 \lambda^T N^T \nabla f + \nabla f^T \nabla f . \tag{5.3.4} $$

To minimize $\|u\|^2$ we differentiate it with respect to each one of the Lagrange
multipliers and get

$$ 2 N^T N \lambda - 2 N^T \nabla f = 0 , \tag{5.3.5} $$

or

$$ \lambda = (N^T N)^{-1} N^T \nabla f . \tag{5.3.6} $$

This is the best solution in the least square sense. However, if the Kuhn-Tucker
conditions are satisfied it should be the exact solution of Eq. (5.3.1). Substituting
from Eq. (5.3.6) into Eq. (5.3.1) we obtain

$$ P \nabla f = 0 , \tag{5.3.7} $$

where

$$ P = I - N (N^T N)^{-1} N^T . \tag{5.3.8} $$

$P$ is called the projection matrix. It will be shown in Section 5.5 that it projects a
vector into the subspace tangent to the active constraints. Equation (5.3.7) implies
that for the Kuhn-Tucker conditions to be satisfied the gradient of the objective
function has to be orthogonal to that subspace.
In practice Eq. (5.3.6) is no longer popular for the calculation of the Lagrange
multipliers. One reason is that the method is ill-conditioned and another is that it is
not efficient. An efficient and better conditioned method for least squares calculations
is based on the QR factorization of the matrix $N$. The QR factorization of the matrix
$N$ consists of an $r \times r$ upper triangular matrix $R$ and an $n \times n$ orthogonal matrix $Q$
such that

$$ Q N = \begin{bmatrix} Q_1 N \\ Q_2 N \end{bmatrix} = \begin{bmatrix} R \\ 0 \end{bmatrix} . \tag{5.3.9} $$

Here $Q_1$ is a matrix consisting of the first $r$ rows of $Q$, $Q_2$ includes the last $n - r$
rows of $Q$, and the zero represents an $(n - r) \times r$ zero matrix (for details of the QR
factorization see most texts on numerical analysis, e.g., Dahlquist and Bjorck [5]).
Because $Q$ is an orthogonal matrix, the Euclidean norm of $Qu$ is the same as that of
$u$, or

$$ \|u\|^2 = \|Q u\|^2 = \|Q N \lambda - Q \nabla f\|^2 = \left\| \begin{bmatrix} R \\ 0 \end{bmatrix} \lambda - Q \nabla f \right\|^2 = \left\| \begin{bmatrix} R \lambda - Q_1 \nabla f \\ -Q_2 \nabla f \end{bmatrix} \right\|^2 . \tag{5.3.10} $$

From this form it can be seen that $\|u\|^2$ is minimized by choosing $\lambda$ so that

$$ R \lambda = Q_1 \nabla f . \tag{5.3.11} $$

The last $n - r$ rows of the matrix $Q$ denoted $Q_2$ are also important in the following.
They are orthogonal vectors which span the null space of $N^T$. That is $N^T$ times each
one of these vectors is zero.

Example 5.3.1

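Check whether the point $(-2, -2, 4)$ is a local minimum of the problem

$$ f = x_1 + x_2 + x_3 , $$
$$ g_1 = 8 - x_1^2 - x_2^2 \ge 0 , $$
$$ g_2 = x_3 - 4 \ge 0 , $$
$$ g_3 = x_2 + 8 \ge 0 . $$

Only the first two constraints are critical at $(-2, -2, 4)$

$$ \frac{\partial g_1}{\partial x_1} = -2x_1 = 4 , \quad \frac{\partial g_1}{\partial x_2} = -2x_2 = 4 , \quad \frac{\partial g_1}{\partial x_3} = 0 , $$

$$ \frac{\partial g_2}{\partial x_1} = 0 , \quad \frac{\partial g_2}{\partial x_2} = 0 , \quad \frac{\partial g_2}{\partial x_3} = 1 , $$

$$ \frac{\partial f}{\partial x_1} = \frac{\partial f}{\partial x_2} = \frac{\partial f}{\partial x_3} = 1 . $$

So

$$ N = \begin{bmatrix} 4 & 0 \\ 4 & 0 \\ 0 & 1 \end{bmatrix} , \quad N^T N = \begin{bmatrix} 32 & 0 \\ 0 & 1 \end{bmatrix} , \quad N^T \nabla f = \begin{Bmatrix} 8 \\ 1 \end{Bmatrix} , $$

$$ \lambda = (N^T N)^{-1} N^T \nabla f = \begin{Bmatrix} 1/4 \\ 1 \end{Bmatrix} , $$

also

$$ [I - N (N^T N)^{-1} N^T] \nabla f = 0 . $$

Equation (5.3.7) is satisfied, and all the Lagrange multipliers are positive, so the
Kuhn-Tucker conditions for a minimum are satisfied. • • •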
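
The calculation of Example 5.3.1 can be sketched in a few lines. For brevity the sketch solves the normal equations of Eq. (5.3.6) with a small Gaussian elimination routine; as noted above, a QR factorization is the better conditioned choice for larger problems. The function names are illustrative.

```python
def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def multipliers(N, grad_f):
    """lambda = (N^T N)^-1 N^T grad f, Eq. (5.3.6); N is given as n rows of r entries."""
    n, r = len(N), len(N[0])
    NtN = [[sum(N[i][p] * N[i][q] for i in range(n)) for q in range(r)]
           for p in range(r)]
    Ntf = [sum(N[i][p] * grad_f[i] for i in range(n)) for p in range(r)]
    return solve(NtN, Ntf)

def projected_gradient(N, grad_f):
    """P grad f = grad f - N lambda, Eqs. (5.3.7) and (5.3.8)."""
    lam = multipliers(N, grad_f)
    return [grad_f[i] - sum(N[i][j] * lam[j] for j in range(len(lam)))
            for i in range(len(N))]

# Data of Example 5.3.1 at the point (-2, -2, 4)
N = [[4.0, 0.0], [4.0, 0.0], [0.0, 1.0]]
grad_f = [1.0, 1.0, 1.0]
```

With these data `multipliers(N, grad_f)` returns the values 0.25 and 1.0 of the example, and `projected_gradient(N, grad_f)` is the zero vector.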

5.4 Sensitivity of Optimum Solution to Problem Parameters

The Lagrange multipliers are not only useful for checking optimality, but they
also provide information about the sensitivity of the optimal solution to problem
parameters. In this role they are extremely valuable in practical applications. In
most engineering design optimization problems we have a host of parameters such as
material properties, dimensions and load levels that are fixed during the optimization.
We often need the sensitivity of the optimum solution to these problem parameters,
either because we do not know them accurately, or because we have some freedom to
change them if we find that they have a large effect on the optimum design.
We assume now that the objective function and constraints depend on a parameter
$p$ so that the optimization problem is defined as

$$ \begin{aligned} \text{minimize} \quad & f(x, p) \\ \text{such that} \quad & g_j(x, p) \ge 0 , \quad j = 1, \ldots, n_g . \end{aligned} \tag{5.4.1} $$

The solution of the problem is denoted $x^*(p)$ and the corresponding objective function
$f^*(p) = f(x^*(p), p)$. We want to find the derivatives of $x^*$ and $f^*$ with respect to
$p$. The equations that govern the optimum solution are the Kuhn-Tucker conditions,
Eq. (5.3.1), and the set of active constraints

$$ g_a = 0 , \tag{5.4.2} $$

where $g_a$ denotes the vector of $r$ active constraint functions. Equations (5.3.1) and
(5.4.2) are satisfied by $x^*(p)$ for all values of $p$ that do not change the set of active
constraints. Therefore, the derivatives of these equations with respect to $p$ are zero,
provided we consider the implicit dependence of $x$ and $\lambda$ on $p$. Differentiating Eqs.
(5.3.1) and (5.4.2) with respect to $p$ we obtain

$$ (A - Z) \frac{dx^*}{dp} - N \frac{d\lambda}{dp} + \frac{\partial}{\partial p} (\nabla f) - \left( \frac{\partial N}{\partial p} \right) \lambda = 0 , \tag{5.4.3} $$

$$ N^T \frac{dx^*}{dp} + \frac{\partial g_a}{\partial p} = 0 , \tag{5.4.4} $$

where $A$ is the Hessian matrix of the objective function $f$, $a_{ij} = \partial^2 f / \partial x_i \partial x_j$, and $Z$
is a matrix whose elements are

$$ z_{kl} = \sum_j \frac{\partial^2 g_j}{\partial x_k \partial x_l} \lambda_j . \tag{5.4.5} $$

Equations (5.4.3) and (5.4.4) are a system of simultaneous equations for the
derivatives of the design variables and of the Lagrange multipliers. Different special cases
of this system are discussed by Sobieski et al. [6].
Often we do not need the derivatives of the design variables or of the Lagrange
multipliers, but only the derivatives of the objective function. In this case the
sensitivity analysis can be greatly simplified. We can write

$$ \frac{df}{dp} = \frac{\partial f}{\partial p} + (\nabla f)^T \frac{dx^*}{dp} . \tag{5.4.6} $$

Using Eq. (5.3.1) and (5.4.4) we get

$$ \frac{df}{dp} = \frac{\partial f}{\partial p} - \lambda^T \frac{\partial g_a}{\partial p} . \tag{5.4.7} $$

Equation (5.4.7) shows that the Lagrange multipliers are a measure of the effect
of a change in the constraints on the objective function. Consider, for example,
a constraint of the form $g_j(x) = G_j(x) - p \ge 0$. By increasing $p$ we make the
constraint more difficult to satisfy. Assume that many constraints are critical, but
that $p$ affects only this single constraint. We see that $\partial g_j / \partial p = -1$, and from Eq.
(5.4.7) $df/dp = \lambda_j$, that is $\lambda_j$ is the 'marginal price' that we pay in terms of an
increase in the objective function for making $g_j$ more difficult to satisfy.
The interpretation of Lagrange multipliers as the marginal prices of the
constraints also explains why at the optimum all the Lagrange multipliers have to be
non-negative. A negative Lagrange multiplier would indicate that we can reduce the
objective function by making a constraint more difficult to satisfy, which is an absurdity.

Example 5.4.1

Consider the optimization problem

$$ f = x_1 + x_2 + x_3 , $$
$$ g_1 = p - x_1^2 - x_2^2 \ge 0 , $$
$$ g_2 = x_3 - 4 \ge 0 , $$
$$ g_3 = x_2 + p \ge 0 . $$

This problem was analyzed for $p = 8$ in Example 5.3.1, and the optimal solution was
found to be $(-2, -2, 4)$. We want to find the derivative of this optimal solution with
respect to $p$. At the optimal point we have $f = 0$ and $\lambda^T = (0.25, 1.0)$, with the
first two constraints being critical. We can calculate the derivative of the objective
function from Eq. (5.4.7)

$$ \frac{\partial f}{\partial p} = 0 , \quad \frac{\partial g_a}{\partial p} = \begin{Bmatrix} 1 \\ 0 \end{Bmatrix} , $$

so

$$ \frac{df}{dp} = -0.25 . $$

To calculate the derivatives of the design variables and Lagrange multipliers we need
to set up Eqs. (5.4.3) and (5.4.4). We get

$$ A = 0 , \quad \frac{\partial}{\partial p} (\nabla f) = 0 , \quad \frac{\partial N}{\partial p} = 0 . $$

Only $g_1$ has nonzero second derivatives $\partial^2 g_1 / \partial x_1^2 = \partial^2 g_1 / \partial x_2^2 = -2$ so from Eq.
(5.4.5)

$$ z_{11} = -2\lambda_1 = -0.5 , \quad z_{22} = -2\lambda_1 = -0.5 , \quad Z = \begin{bmatrix} -0.5 & 0 & 0 \\ 0 & -0.5 & 0 \\ 0 & 0 & 0 \end{bmatrix} . $$

With $N$ from Example 5.3.1, Eq. (5.4.3) gives us

$$ 0.5\dot{x}_1 - 4\dot{\lambda}_1 = 0 , $$
$$ 0.5\dot{x}_2 - 4\dot{\lambda}_1 = 0 , $$
$$ \dot{\lambda}_2 = 0 , $$

where a dot denotes derivative with respect to $p$. From Eq. (5.4.4) we get

$$ 4\dot{x}_1 + 4\dot{x}_2 + 1 = 0 , $$
$$ \dot{x}_3 = 0 . $$

The solution of these five coupled equations is

$$ \dot{x}_1 = \dot{x}_2 = -0.125 , \quad \dot{x}_3 = 0 , \quad \dot{\lambda}_1 = -0.0156 , \quad \dot{\lambda}_2 = 0 . $$

We can check the derivatives of the objective function and design variables by
changing $p$ from 8 to 9 and re-optimizing. It is easy to check that we get $x_1 = x_2 = -2.121$,
$x_3 = 4$, $f = -0.242$. These values compare well with linear extrapolation based on
the derivatives which gives $x_1 = x_2 = -2.125$, $x_3 = 4$, $f = -0.25$. • • •
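
The sensitivity result of Example 5.4.1 can be cross-checked against a finite-difference derivative. The sketch below assumes the closed-form optimum obtained when $g_1$ and $g_2$ stay active, $x_1 = x_2 = -\sqrt{p/2}$ and $x_3 = 4$, which is derived here for illustration and is not stated in this form in the text; the function names are likewise illustrative.

```python
import math

def optimum(p):
    # Closed-form optimum with g1 and g2 active: x1 = x2 = -sqrt(p/2), x3 = 4.
    x = -math.sqrt(p / 2.0)
    lam1 = -1.0 / (2.0 * x)      # from the stationarity condition 1 + 2*lam1*x1 = 0
    return (x, x, 4.0), lam1

def f_star(p):
    x, _ = optimum(p)
    return sum(x)                # f = x1 + x2 + x3

def df_dp_from_multiplier(p):
    # Eq. (5.4.7) with df/dp (partial) = 0 and dg_a/dp = (1, 0)
    _, lam1 = optimum(p)
    return -lam1

def df_dp_finite_difference(p, h=1e-6):
    # central difference of the re-optimized objective
    return (f_star(p + h) - f_star(p - h)) / (2.0 * h)
```

At $p = 8$ both routes give a slope of about $-0.25$, and `f_star(9.0)` reproduces the re-optimized value of about $-0.242$ quoted in the example.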

5.5 Gradient Projection and Reduced Gradient Methods

Rosen's gradient projection method is based on projecting the search direction into
the subspace tangent to the active constraints. Let us first examine the method for
the case of linear constraints [7]. We define the constrained problem as

$$ \begin{aligned} \text{minimize} \quad & f(x) \\ \text{such that} \quad & g_j(x) = \sum_{i=1}^{n} a_{ji} x_i - b_j \ge 0 , \quad j = 1, \ldots, n_g . \end{aligned} \tag{5.5.1} $$

In vector form

$$ g_j = a_j^T x - b_j \ge 0 . \tag{5.5.2} $$

If we select only the $r$ active constraints ($j \in I_A$), we may write the constraint
equations as

$$ g_a = N^T x - b = 0 , \tag{5.5.3} $$

where $g_a$ is the vector of active constraints and the columns of the matrix $N$ are
the gradients of these constraints. The basic assumption of the gradient projection
method is that $x$ lies in the subspace tangent to the active constraints. If

$$ x_{i+1} = x_i + \alpha s , \tag{5.5.4} $$

and both $x_i$ and $x_{i+1}$ satisfy Eq. (5.5.3), then

$$ N^T s = 0 . \tag{5.5.5} $$

If we want the steepest descent direction satisfying Eq. (5.5.5), we can pose the
problem as

$$ \begin{aligned} \text{minimize} \quad & s^T \nabla f \\ \text{such that} \quad & N^T s = 0 , \\ \text{and} \quad & s^T s = 1 . \end{aligned} \tag{5.5.6} $$

That is, we want to find the direction with the most negative directional
derivative which satisfies Eq. (5.5.5). We use Lagrange multipliers $\lambda$ and $\mu$ to form the
Lagrangian

$$ \mathcal{L}(s, \lambda, \mu) = s^T \nabla f - s^T N \lambda - \mu (s^T s - 1) . \tag{5.5.7} $$

The condition for $\mathcal{L}$ to be stationary is

$$ \frac{\partial \mathcal{L}}{\partial s} = \nabla f - N \lambda - 2 \mu s = 0 . \tag{5.5.8} $$

Premultiplying Eq. (5.5.8) by $N^T$ and using Eq. (5.5.5) we obtain

$$ N^T \nabla f - N^T N \lambda = 0 , \tag{5.5.9} $$

or

$$ \lambda = (N^T N)^{-1} N^T \nabla f . \tag{5.5.10} $$


So that from Eq. (5.5.8)

$$ s = \frac{1}{2\mu} [I - N (N^T N)^{-1} N^T] \nabla f = \frac{1}{2\mu} P \nabla f . \tag{5.5.11} $$

$P$ is the projection matrix defined in Eq. (5.3.8). The factor of $1/2\mu$ is not significant
because $s$ defines only the direction of search, so in general we use $s = -P \nabla f$. To
show that $P$ indeed has the projection property, we need to prove that if $w$ is an
arbitrary vector, then $Pw$ is in the subspace tangent to the active constraints, that
is $Pw$ satisfies

$$ N^T P w = 0 . \tag{5.5.12} $$

We can easily verify this by using the definition of $P$.
Equation (5.3.8) which defines the projection matrix $P$ does not provide the most
efficient way for calculating it. Instead it can be shown that

$$ P = Q_2^T Q_2 , \tag{5.5.13} $$

where the matrix $Q_2$ consists of the last $n - r$ rows of the $Q$ factor in the QR
factorization of $N$ (see Eq. (5.3.9)).
A version of the gradient projection method known as the generalized reduced
gradient method was developed by Abadie and Carpentier [8]. As a first step we
select $r$ linearly independent rows of $N$, denote their transpose as $N_1$ and partition
$N^T$ as

$$ N^T = [\, N_1 \;\; N_2 \,] . \tag{5.5.14} $$

Next we consider Eq. (5.5.5) for the components $s_i$ of the direction vector. The $r$
equations corresponding to $N_1$ are then used to eliminate $r$ components of $s$ and
obtain a reduced order problem for the direction vector.
Once we have identified $N_1$ we can easily obtain $Q_2$ which is given as

$$ Q_2^T = \begin{bmatrix} -N_1^{-1} N_2 \\ I \end{bmatrix} . \tag{5.5.15} $$

Equation (5.5.15) can be verified by checking that $N^T Q_2^T = 0$, so that $Q_2 N = 0$,
which is the requirement that $Q_2$ has to satisfy (see discussion following Eq. (5.3.11)).
After obtaining $s$ from Eq. (5.5.11) we can continue the search with a one
dimensional minimization, Eq. (5.5.4), unless $s = 0$. When $s = 0$ Eq. (5.3.7) indicates
that the Kuhn-Tucker conditions may be satisfied. We then calculate the Lagrange
multipliers from Eq. (5.3.6) or Eq. (5.3.11). If all the components of $\lambda$ are
non-negative, the Kuhn-Tucker conditions are indeed satisfied and the optimization can
be terminated. If some of the Lagrange multipliers are negative, it is an indication
that while no progress is possible with the current set of active constraints, it may
be possible to proceed by removing some of the constraints associated with negative
Lagrange multipliers. A common strategy is to remove the constraint associated with
the most negative Lagrange multiplier and repeat the calculation of $P$ and $s$. If $s$
is now non-zero, a one-dimensional search may be started. If $s$ remains zero and
there are still negative Lagrange multipliers, we remove another constraint until all
Lagrange multipliers become positive and we satisfy the Kuhn-Tucker conditions.

After a search direction has been determined, a one dimensional search must be
carried out to determine the value of $\alpha$ in Eq. (5.5.4). Unlike the unconstrained case,
there is an upper limit on $\alpha$ set by the inactive constraints. As $\alpha$ increases, some
of them may become active and then violated. Substituting $x = x_i + \alpha s$ into Eq.
(5.5.2) we obtain

$$ g_j = a_j^T (x_i + \alpha s) - b_j \ge 0 , \tag{5.5.16} $$

or

$$ \alpha \le -\frac{a_j^T x_i - b_j}{a_j^T s} = -\frac{g_j(x_i)}{a_j^T s} . \tag{5.5.17} $$

Equation (5.5.17) is valid if $a_j^T s < 0$. Otherwise, there is no upper limit on $\alpha$ due to
the $j$th constraint. From Eq. (5.5.17) we get a different $\alpha$, say $\alpha_j$, for each constraint.
The upper limit on $\alpha$ is the minimum

$$ \bar{\alpha} = \min_{\alpha_j > 0 , \; j \notin I_A} \alpha_j . \tag{5.5.18} $$

At the end of the move, new constraints may become active, so that the set of active
constraints may need to be updated before the next move is undertaken.
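
The step-length limit of Eqs. (5.5.17) and (5.5.18) can be sketched as a short routine for linear constraints $a_j^T x - b_j \ge 0$. The function name, the argument layout and the small tolerance are illustrative assumptions.

```python
def max_step(a_rows, b, x, s, active=(), tol=1e-12):
    """Largest alpha keeping a_j^T (x + alpha*s) - b_j >= 0 for the inactive constraints."""
    alpha_max = float("inf")
    for j, (a, bj) in enumerate(zip(a_rows, b)):
        if j in active:
            continue                      # active constraints are handled by the projection
        slope = sum(ai * si for ai, si in zip(a, s))
        if slope < -tol:                  # constraint value decreases along s
            gj = sum(ai * xi for ai, xi in zip(a, x)) - bj
            alpha_max = min(alpha_max, -gj / slope)   # Eq. (5.5.17)
    return alpha_max

# Illustrative constraints: x1 >= 0, x2 >= 0, and x1 + x2 <= 4
a_rows = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, -4.0]
```

A return value of infinity means that no inactive constraint limits the move along $s$, so the one-dimensional search is bounded only by the objective function.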

The version of the gradient projection method presented so far is an extension
of the steepest descent method. Like the steepest descent method, it may have slow
convergence. The method may be extended to correspond to Newton or quasi-Newton
methods. In the unconstrained case, these methods use a search direction defined as

$$ s = -B \nabla f , \tag{5.5.19} $$

where $B$ is the inverse of the Hessian matrix of $f$ or an approximation thereof. The
direction that corresponds to such a method in the subspace tangent to the active
constraints can be shown [4] to be

$$ s = -Q_2^T (Q_2 A_L Q_2^T)^{-1} Q_2 \nabla f , \tag{5.5.20} $$

where $A_L$ is the Hessian of the Lagrangian function or an approximation thereof.

The gradient projection method has been generalized by Rosen to nonlinear con-
straints [9]. The method is based on linearizing the constraints about Xi so that

(5.5.21)

178
Section 5.5: Gradient Projection and Reduced Gradient Methods

restoration
move

projection
move -----..,.

Figure 5.5.1 Projection and restoration moves.


The main difficulty caused by the nonlinearity of the constraints is that the
one-dimensional search typically moves away from the constraint boundary. This
is because we move in the tangent subspace which no longer follows exactly the
constraint boundaries. After the one-dimensional search is over, Rosen prescribes a
restoration move to bring $x$ back to the constraint boundaries, see Figure 5.5.1.
To obtain the equation for the restoration move, we note that instead of Eq.
(5.5.2) we now use the linear approximation

$$ g_j \approx g_j(x_i) + \nabla g_j^T (\bar{x}_i - x_i) . \tag{5.5.22} $$

We want to find a correction $\bar{x}_i - x_i$ orthogonal to the tangent subspace (i.e.
$P(\bar{x}_i - x_i) = 0$) that would reduce $g_j$ to zero. It is easy to check that

$$ \bar{x}_i - x_i = -N (N^T N)^{-1} g_a(x_i) , \tag{5.5.23} $$

is the desired correction, where $g_a$ is the vector of active constraints. Equation
(5.5.23) is based on a linear approximation, and may therefore have to be applied
repeatedly until $g_a$ is small enough.
In addition to the need for a restoration move, the nonlinearity of the constraints
requires the re-evaluation of $N$ at each point. It also complicates the choice of an
upper limit for $\alpha$ which guarantees that we will not violate the presently inactive
constraints. Haug and Arora [10] suggest a procedure which is better suited for the
nonlinear case. The first advantage of their procedure is that it does not require
a one-dimensional search. Instead, $\alpha$ in Eq. (5.5.4) is determined by specifying a
desired reduction $\gamma$ in the objective function. That is, we specify

$$ f(x_i) - f(x_{i+1}) \approx \gamma f(x_i) . \tag{5.5.24} $$

Using a linear approximation with Eq. (5.5.4) we get

$$ \alpha^* = -\frac{\gamma f(x_i)}{s^T \nabla f} . \tag{5.5.25} $$

The second feature of Haug and Arora's procedure is the combination of the projection
and the restoration moves as

$$ x_{i+1} = x_i + \alpha^* s - N (N^T N)^{-1} g_a , \tag{5.5.26} $$

where Eqs. (5.5.4), (5.5.23) and (5.5.25) are used.

Example 5.5.1

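Use the gradient projection method to solve the following problem

$$ \begin{aligned} \text{minimize} \quad & f = x_1^2 + x_2^2 + x_3^2 + x_4^2 - 2x_1 - 3x_4 \\ \text{subject to} \quad & g_1 = 2x_1 + x_2 + x_3 + 4x_4 - 7 \ge 0 , \\ & g_2 = x_1 + x_2 + x_3^2 + x_4 - 5.1 \ge 0 , \\ & x_i \ge 0 , \quad i = 1, \ldots, 4 . \end{aligned} $$

Assume that as a result of previous moves we start at the point $x_0^T = (2, 2, 1, 0)$,
$f(x_0) = 5.0$, where the nonlinear constraint $g_2$ is slightly violated. The first constraint
is active as well as the constraint on $x_4$. We start with a combined projection and
restoration move, with a target improvement of 10% in the objective function. At $x_0$

$$ N = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 2 & 0 \\ 4 & 1 & 1 \end{bmatrix} , \quad N^T N = \begin{bmatrix} 22 & 9 & 4 \\ 9 & 7 & 1 \\ 4 & 1 & 1 \end{bmatrix} , \quad (N^T N)^{-1} = \frac{1}{11} \begin{bmatrix} 6 & -5 & -19 \\ -5 & 6 & 14 \\ -19 & 14 & 73 \end{bmatrix} , $$

$$ P = I - N (N^T N)^{-1} N^T = \frac{1}{11} \begin{bmatrix} 1 & -3 & 1 & 0 \\ -3 & 9 & -3 & 0 \\ 1 & -3 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} , \quad \nabla f = \begin{Bmatrix} 2 \\ 4 \\ 2 \\ -3 \end{Bmatrix} . $$

The projection move direction is $s = -P \nabla f = [8/11, -24/11, 8/11, 0]^T$. Since the
magnitude of a direction vector is unimportant we scale $s$ to $s^T = [1, -3, 1, 0]$. For a
10% improvement in the objective function $\gamma = 0.1$ and from Eq. (5.5.25)

$$ \alpha^* = -\frac{0.1 f}{s^T \nabla f} = -\frac{0.1 \times 5}{-8} = 0.0625 . $$

For the correction move we need the vector $g_a$ of constraint values, $g_a^T = (0, -0.1, 0)$,
so the correction is

$$ -N (N^T N)^{-1} g_a = -\frac{1}{110} \begin{Bmatrix} 4 \\ -1 \\ -7 \\ 0 \end{Bmatrix} . $$

Combining the projection and restoration moves, Eq. (5.5.26)

$$ x_1 = \begin{Bmatrix} 2 \\ 2 \\ 1 \\ 0 \end{Bmatrix} + 0.0625 \begin{Bmatrix} 1 \\ -3 \\ 1 \\ 0 \end{Bmatrix} - \frac{1}{110} \begin{Bmatrix} 4 \\ -1 \\ -7 \\ 0 \end{Bmatrix} = \begin{Bmatrix} 2.026 \\ 1.822 \\ 1.126 \\ 0 \end{Bmatrix} , $$

we get $f(x_1) = 4.64$, $g_1(x_1) = 0$, $g_2(x_1) = 0.016$. Note that instead of 10% reduction
we got only 7% due to the nonlinearity of the objective function. However, we did
satisfy the nonlinear constraint. • • •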
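
The combined projection and restoration move of Example 5.5.1 can be reproduced with the short sketch below. It assumes the problem data of the example and the active set stated there ($g_1$, $g_2$ and the bound on $x_4$); the helper names are illustrative, and the linear systems are solved with a small Gaussian elimination routine instead of forming the inverse explicitly.

```python
def solve(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * y[k] for k in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def f(x):
    return x[0]**2 + x[1]**2 + x[2]**2 + x[3]**2 - 2.0 * x[0] - 3.0 * x[3]

def grad_f(x):
    return [2.0 * x[0] - 2.0, 2.0 * x[1], 2.0 * x[2], 2.0 * x[3] - 3.0]

def active_constraints(x):
    g1 = 2.0 * x[0] + x[1] + x[2] + 4.0 * x[3] - 7.0
    g2 = x[0] + x[1] + x[2]**2 + x[3] - 5.1
    return [g1, g2, x[3]]

def active_gradients(x):
    # columns of N: grad g1, grad g2, gradient of the bound x4 >= 0
    return [[2.0, 1.0, 0.0], [1.0, 1.0, 0.0], [1.0, 2.0 * x[2], 0.0], [4.0, 1.0, 1.0]]

def combined_move(x, gamma):
    N = active_gradients(x)
    n, r = len(N), len(N[0])
    NtN = [[sum(N[i][p] * N[i][q] for i in range(n)) for q in range(r)] for p in range(r)]
    df = grad_f(x)
    lam = solve(NtN, [sum(N[i][p] * df[i] for i in range(n)) for p in range(r)])
    s = [-(df[i] - sum(N[i][j] * lam[j] for j in range(r))) for i in range(n)]  # s = -P grad f
    alpha = -gamma * f(x) / sum(si * di for si, di in zip(s, df))               # Eq. (5.5.25)
    w = solve(NtN, active_constraints(x))                                       # (N^T N)^-1 g_a
    restore = [-sum(N[i][j] * w[j] for j in range(r)) for i in range(n)]        # Eq. (5.5.23)
    return [x[i] + alpha * s[i] + restore[i] for i in range(n)]                 # Eq. (5.5.26)
```

The product $\alpha^* s$ does not depend on how $s$ is scaled, so the unscaled direction used here gives the same new point as the scaled direction $(1, -3, 1, 0)$ of the example.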

Example 5.5.2

Consider the four bar truss of Example 5.1.2. The problem of finding the minimum
weight design subject to stress and displacement constraints was formulated as
minimize f = 3XI + ..j3x2
18 6v'3
subject to gl = 3 - -Xl - - - ~ 0,
°,
X2
g2 = Xl - 5. 73 ~
g3 = X2 - 7.17 ~ 0,

where the Xi are non-dimensional areas


AiE
x·--- i = 1,2 .
• - lOOOP'
The first constraint represents a limit on the vertical displacement, and the other two
represent stress constraints.

Xl
Assume that we start the search at the intersection of gl = and g3 = 0, where
= 11.61, X2 = 7.17, and f = 47.25.
°
The gradients of the objective function and
two active constraints are
0.1335} N _ [0.1335 0]
VgI = { 0.2021 ' - 0.2021 1 .

Because N is nonsingular, Eq. (5.3.8) shows that P = 0. Also since the number of
linearly independent active constraints is equal to the number of design variables the
tangent subspace is a single point, so that there is no more room for progress. Using
Eqs. (5.3.6) or (5.3.11) we obtain

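$$\boldsymbol{\lambda} = \begin{Bmatrix} 22.47 \\ -2.798 \end{Bmatrix} .$$

The negative multiplier associated with $g_3$ indicates that this constraint can be
dropped from the active set. Now

$$N = \begin{bmatrix} 0.1335 \\ 0.2021 \end{bmatrix} .$$

The projection matrix is calculated from Eq. (5.3.8)

$$P = \begin{bmatrix} 0.6962 & -0.4600 \\ -0.4600 & 0.3036 \end{bmatrix}, \quad
\mathbf{s} = -P\nabla f = \begin{Bmatrix} -1.29 \\ 0.854 \end{Bmatrix} .$$

We attempt a 5% reduction in the objective function, and from Eq. (5.5.25)

$$\alpha^* = -\frac{0.05 \times 47.25}{[-1.29 \;\; 0.854]\begin{Bmatrix} 3 \\ \sqrt{3} \end{Bmatrix}} = 0.988 .$$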
Since there was no constraint violation at Xo we do not need a combined projection
and correction step, and

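$$\mathbf{x}_1 = \mathbf{x}_0 + \alpha^*\mathbf{s} = \begin{Bmatrix} 11.61 \\ 7.17 \end{Bmatrix}
+ 0.988 \begin{Bmatrix} -1.29 \\ 0.854 \end{Bmatrix} = \begin{Bmatrix} 10.34 \\ 8.01 \end{Bmatrix} .$$

At $\mathbf{x}_1$ we have $f(\mathbf{x}_1) = 44.89$, $g_1(\mathbf{x}_1) = -0.0382$. Obviously $g_2$ is not violated. If there
were a danger of that we would have to limit $\alpha^*$ using Eq. (5.5.17). The violation of
the nonlinear constraint is not surprising, and its size indicates that we should reduce
the attempted reduction in $f$ in the next move. At $\mathbf{x}_1$, only $g_1$ is active so

$$N = \nabla g_1 = \begin{Bmatrix} 0.1684 \\ 0.1620 \end{Bmatrix} .$$

The projection matrix is calculated to be

$$P = \begin{bmatrix} 0.4806 & -0.4996 \\ -0.4996 & 0.5194 \end{bmatrix}, \quad
\mathbf{s} = -P\nabla f = \begin{Bmatrix} -0.5764 \\ 0.5991 \end{Bmatrix} .$$

Because of the violation we reduce the attempted reduction in $f$ to 2.5%, so

$$\alpha^* = -\frac{0.025 \times 44.89}{[-0.576 \;\; 0.599]\begin{Bmatrix} 3 \\ \sqrt{3} \end{Bmatrix}} = 1.62 .$$

We need also a correction due to the constraint violation ($g_a = -0.0382$)

$$-N(N^T N)^{-1} g_a = \begin{Bmatrix} 0.118 \\ 0.113 \end{Bmatrix} .$$

Altogether

$$\mathbf{x}_2 = \mathbf{x}_1 + \alpha^*\mathbf{s} - N(N^T N)^{-1} g_a = \begin{Bmatrix} 10.34 \\ 8.01 \end{Bmatrix}
- 1.62 \begin{Bmatrix} 0.576 \\ -0.599 \end{Bmatrix} + \begin{Bmatrix} 0.118 \\ 0.113 \end{Bmatrix}
= \begin{Bmatrix} 9.52 \\ 9.10 \end{Bmatrix} .$$

We obtain $f(\mathbf{x}_2) = 44.32$, $g_1(\mathbf{x}_2) = -0.0328$.

The optimum design is actually $\mathbf{x}^T = (9.464, 9.464)$, $f(\mathbf{x}) = 44.78$, so after two
iterations we are quite close to the optimum design. •••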
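The two moves of this example can be retraced numerically. The following is a minimal sketch in plain Python, not a general implementation: it assumes that $g_1$ is the only active constraint (so $N^T N$ is a scalar), and the function names `projection_move`, `f`, `g1` and `grad_g1` are illustrative choices rather than anything defined in the text.

```python
import math

SQRT3 = math.sqrt(3.0)
GRAD_F = (3.0, SQRT3)          # gradient of f = 3*x1 + sqrt(3)*x2 (constant)


def f(x):
    return 3.0 * x[0] + SQRT3 * x[1]


def g1(x):
    return 3.0 - 18.0 / x[0] - 6.0 * SQRT3 / x[1]


def grad_g1(x):
    return (18.0 / x[0] ** 2, 6.0 * SQRT3 / x[1] ** 2)


def projection_move(x, gamma):
    """Projection plus restoration move with g1 as the only active constraint."""
    n = grad_g1(x)                                   # N is the single column grad g1
    nn = n[0] ** 2 + n[1] ** 2                       # N^T N (a scalar here)
    nf = (n[0] * GRAD_F[0] + n[1] * GRAD_F[1]) / nn
    # s = -P grad f, with P = I - N (N^T N)^-1 N^T
    s = (-(GRAD_F[0] - nf * n[0]), -(GRAD_F[1] - nf * n[1]))
    # step length for a fractional reduction gamma in f, Eq. (5.5.25)
    alpha = -gamma * f(x) / (s[0] * GRAD_F[0] + s[1] * GRAD_F[1])
    # restoration -N (N^T N)^-1 g_a, applied only if g1 is violated
    ga = min(g1(x), 0.0)
    corr = (-n[0] * ga / nn, -n[1] * ga / nn)
    return (x[0] + alpha * s[0] + corr[0], x[1] + alpha * s[1] + corr[1])


x0 = (11.61, 7.17)
x1 = projection_move(x0, 0.05)     # g3 has already been dropped from the active set
x2 = projection_move(x1, 0.025)
for name, x in (("x0", x0), ("x1", x1), ("x2", x2)):
    print(name, round(x[0], 2), round(x[1], 2),
          "f =", round(f(x), 2), "g1 =", round(g1(x), 4))
```

Because the sketch carries full precision between steps, the printed values may differ from the rounded hand calculation in the last digit.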

5.6 The Feasible Directions Method

The feasible directions method [11] has the opposite philosophy to that of the
gradient projection method. Instead of following the constraint boundaries, we try to
stay as far away as possible from them. The typical iteration of the feasible directions
method starts at the boundary of the feasible domain (unconstrained minimization
techniques are used to generate a direction if no constraint is active).


Figure 5.6.1 Selection of search direction using the feasible directions method.

Consider Figure 5.6.1. As a result of a previous move the design is at point $\mathbf{x}$
and we look for a direction $\mathbf{s}$ which keeps $\mathbf{x}$ in the feasible domain and improves the
objective function. A vector $\mathbf{s}$ is defined as a feasible direction if at least a small step
can be taken along it that does not immediately leave the feasible domain. If the
constraints are smooth, this is satisfied if

$$\mathbf{s}^T \nabla g_j > 0, \quad j \in I_A, \tag{5.6.1}$$

where $I_A$ is the set of critical constraints at $\mathbf{x}$. The direction $\mathbf{s}$ is called a usable
direction at the point $\mathbf{x}$ if in addition

$$\mathbf{s}^T \nabla f < 0 . \tag{5.6.2}$$

That is, $\mathbf{s}$ is a direction which reduces the objective function.


Among all possible choices of usable feasible directions we seek the direction
which is best in some sense. We have two criteria for selecting a direction. On the
one hand we want to reduce the objective function as much as possible. On the other
hand we want to keep away from the constraint boundary as much as possible. A
compromise is defined by the following maximization problem

$$\begin{aligned}
\text{maximize} \quad & \beta \\
\text{such that} \quad & -\mathbf{s}^T \nabla g_j + \theta_j \beta \le 0, \quad j \in I_A, \\
& \mathbf{s}^T \nabla f + \beta \le 0, \\
& |s_i| \le 1 .
\end{aligned} \tag{5.6.3}$$

The $\theta_j$ are positive numbers called "push-off" factors because their magnitude
determines how far $\mathbf{x}$ will move from the constraint boundaries. A value of $\theta_j = 0$ will
result in a move tangent to the boundary of the $j$th constraint, and so may be
appropriate for a linear constraint. A large value of $\theta_j$ will result in a large angle
between the constraint boundary and the move direction, and so is appropriate for a
highly nonlinear constraint.
The optimization problem defined by Eq. (5.6.3) is linear and can be solved using
the simplex algorithm. If $\beta_{max} > 0$, we have found a usable feasible direction. If we
get $\beta_{max} = 0$ it can be shown that the Kuhn-Tucker conditions are satisfied.
Once a direction of search has been found, the choice of step length is typically
based on a prescribed reduction in the objective function (using Eq. (5.5.25)). If
at the end of the step no constraints are active, we continue in the same direction
as long as $\mathbf{s}^T \nabla f$ is negative. We start the next iteration when $\mathbf{x}$ hits the constraint
boundaries, or use a direction based on an unconstrained technique if $\mathbf{x}$ is inside the
feasible domain. Finally, if some constraints are violated after the initial step we
make $\mathbf{x}$ retreat based on the value of the violated constraints. The method of feasible
directions is implemented in the popular CONMIN program [12].

Example 5.6.1

Consider the four bar truss of Example 5.1.2. The problem of finding the minimum
weight design subject to stress and displacement constraints was formulated as

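$$\begin{aligned}
\text{minimize} \quad & f = 3x_1 + \sqrt{3}\,x_2 \\
\text{subject to} \quad & g_1 = 3 - \frac{18}{x_1} - \frac{6\sqrt{3}}{x_2} \ge 0, \\
& g_2 = x_1 - 5.73 \ge 0, \\
& g_3 = x_2 - 7.17 \ge 0,
\end{aligned}$$

where the $x_i$ are non-dimensional areas

$$x_i = \frac{A_i E}{1000P}, \quad i = 1, 2 .$$

The first constraint represents a limit on the vertical displacement, and the other two
constraints represent stress constraints.
Assume that we start the search at the intersection of $g_1 = 0$ and $g_3 = 0$ where
$\mathbf{x}_0^T = (11.61, 7.17)$ and $f = 47.25$. The gradients of the objective function and two
active constraints are

$$\nabla f = \begin{Bmatrix} 3 \\ \sqrt{3} \end{Bmatrix}, \quad
\nabla g_1 = \begin{Bmatrix} 0.1335 \\ 0.2021 \end{Bmatrix}, \quad
\nabla g_3 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix} .$$

Selecting $\theta_1 = \theta_2 = 1$, we find that Eq. (5.6.3) becomes

$$\begin{aligned}
\text{maximize} \quad & \beta \\
\text{subject to} \quad & -0.1335 s_1 - 0.2021 s_2 + \beta \le 0, \\
& -s_2 + \beta \le 0, \\
& 3 s_1 + \sqrt{3}\, s_2 + \beta \le 0, \\
& -1 \le s_1 \le 1, \\
& -1 \le s_2 \le 1 .
\end{aligned}$$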

The solution of this linear program is $s_1 = -0.6172$, $s_2 = 1$, and we now need to
execute the one dimensional search

$$\mathbf{x}_1 = \begin{Bmatrix} 11.61 \\ 7.17 \end{Bmatrix} + \alpha \begin{Bmatrix} -0.6172 \\ 1 \end{Bmatrix} .$$

Because the objective function is linear, this direction will remain a descent direction
indefinitely, and $\alpha$ will be limited only by the constraints. The requirement that $g_2$
is not violated will lead to $\alpha = 9.527$, $x_1 = 5.73$, $x_2 = 16.7$ which violates $g_1$. We
see that because $g_1$ is nonlinear, even though we start the search by moving away
from it we still bump into it again (see Figure 5.6.2). It can be easily checked that
for $\alpha > 5.385$ we violate $g_1$. So we take $\alpha = 5.385$ and obtain $x_1 = 8.29$, $x_2 = 12.56$,
$f = 46.62$.

Figure 5.6.2 Feasible direction solution of 4-bar truss example.

For the next iteration we have only one active constraint

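$$\nabla g_1 = \begin{Bmatrix} 0.2619 \\ 0.0659 \end{Bmatrix}, \quad
\nabla f = \begin{Bmatrix} 3 \\ \sqrt{3} \end{Bmatrix} .$$

The linear program for obtaining $\mathbf{s}$ is

$$\begin{aligned}
\text{maximize} \quad & \beta \\
\text{subject to} \quad & -0.2619 s_1 - 0.0659 s_2 + \beta \le 0, \\
& 3 s_1 + \sqrt{3}\, s_2 + \beta \le 0, \\
& -1 \le s_1 \le 1, \\
& -1 \le s_2 \le 1 .
\end{aligned}$$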


The solution of the linear program is $s_1 = 0.5512$, $s_2 = -1$, so that the
one-dimensional search is

$$\mathbf{x} = \begin{Bmatrix} 8.29 \\ 12.56 \end{Bmatrix} + \alpha \begin{Bmatrix} 0.5512 \\ -1 \end{Bmatrix} .$$

Again $\alpha$ is limited only by the constraints. The lower limit on $x_2$ dictates $\alpha \le 5.39$.
However, the constraint $g_1$ is again more critical. It can be verified that for $\alpha > 4.957$
it is violated, so we take $\alpha = 4.957$, $x_1 = 11.02$, $x_2 = 7.60$, $f = 46.22$. The optimum
design found in Example 5.1.2 is $x_1 = x_2 = 9.464$, $f = 44.78$. The design space and
the two iterations are shown in Figure 5.6.2. •••
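The direction-finding linear programs of this example are small enough to solve by brute force. The sketch below is an illustration only: instead of the simplex algorithm it enumerates the vertices of the feasible region of Eq. (5.6.3) for two design variables, and the helper names (`det3`, `solve3`, `feasible_direction`, `step_to_g1`) are hypothetical. The step length is found by bisection on $g_1$, assuming a single sign change along the ray.

```python
import itertools
import math


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a 3x3 system by Cramer's rule; return None if (nearly) singular."""
    d = det3(a)
    if abs(d) < 1e-12:
        return None
    sol = []
    for k in range(3):
        ak = [row[:k] + [b[i]] + row[k + 1:] for i, row in enumerate(a)]
        sol.append(det3(ak) / d)
    return sol


def feasible_direction(grad_f, active_grads, thetas):
    """Solve Eq. (5.6.3) for two design variables; unknowns are (s1, s2, beta)."""
    rows, rhs = [], []
    for gj, theta in zip(active_grads, thetas):
        rows.append([-gj[0], -gj[1], theta])     # -s.grad(g_j) + theta_j*beta <= 0
        rhs.append(0.0)
    rows.append([grad_f[0], grad_f[1], 1.0])     # s.grad(f) + beta <= 0
    rhs.append(0.0)
    for i in range(2):                           # -1 <= s_i <= 1
        for sign in (1.0, -1.0):
            row = [0.0, 0.0, 0.0]
            row[i] = sign
            rows.append(row)
            rhs.append(1.0)
    best = None
    # An LP optimum lies at a vertex: try every triple of constraints as equalities.
    for idx in itertools.combinations(range(len(rows)), 3):
        z = solve3([rows[i] for i in idx], [rhs[i] for i in idx])
        if z is None:
            continue
        feasible = all(sum(c * v for c, v in zip(row, z)) <= b + 1e-9
                       for row, b in zip(rows, rhs))
        if feasible and (best is None or z[2] > best[2]):
            best = z
    return best


def g1(x1, x2):
    return 3.0 - 18.0 / x1 - 6.0 * math.sqrt(3.0) / x2


def step_to_g1(x, s, alpha_hi, tol=1e-7):
    """Bisection for the step at which g1 changes sign, assuming one crossing."""
    lo, hi = 0.0, alpha_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g1(x[0] + mid * s[0], x[1] + mid * s[1]) >= 0.0:
            lo = mid
        else:
            hi = mid
    return lo


grad_f = (3.0, math.sqrt(3.0))
dir1 = feasible_direction(grad_f, [(0.1335, 0.2021), (0.0, 1.0)], [1.0, 1.0])
alpha1 = step_to_g1((11.61, 7.17), dir1, 9.527)   # 9.527 is the limit set by g2
dir2 = feasible_direction(grad_f, [(0.2619, 0.0659)], [1.0])
print([round(v, 4) for v in dir1], round(alpha1, 3))
print([round(v, 4) for v in dir2])
```

Vertex enumeration is only practical for a handful of variables; for realistic problems a proper LP solver would take its place.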

5.7 Penalty Function Methods

When the energy crisis erupted in the middle seventies, the United States Congress
passed legislation intended to reduce the fuel consumption of American cars. The
target was an average fuel consumption of 27.5 miles per gallon for new cars in 1985.
Rather than simply legislate this limit Congress took a gradual approach, with a
different limit set each year to bring up the average from about 14 miles per gallon
to the target value. Thus the limit was set at 26 for 1984, 25 for 1983, 24 for 1982,
and so on. Furthermore, the limit was not absolute, but there was a fine of $50 per
0.1 miles per gallon violation per car.
This approach to constraining the automobile companies to produce fuel efficient
cars has two important aspects. First, by legislating a penalty proportional to the
violation rather than an absolute limit, the government allowed the auto companies
more flexibility. That meant they could follow a time schedule that approximated
the government schedule without having to adhere to it rigidly. Second, the gradual
approach made enforcement easier politically. Had the government simply set the ul-
timate limit for 1985 only, nobody would have paid attention to the law in the 1970's.
Then as 1985 moved closer there would have been a rush to develop fuel efficient cars.
The hurried effort could mean both non-optimal car designs and political pressure to
delay the enforcement of the law.
The fuel efficiency law is an example in which constraints on behavior or eco-
nomic activities are imposed via penalties whose magnitude depends on the degree of
violation of the constraints. It is no wonder that this simple and appealing approach
has found application in constrained optimization. Instead of applying constraints
we replace them by penalties which depend on the degree of constraint violations.
This approach is attractive because it replaces a constrained optimization problem
by an unconstrained one.
The penalties associated with constraint violation have to be high enough so
that the constraints are only slightly violated. However, just as there are political
problems associated with imposing abrupt high penalties in real life, so there are
numerical difficulties associated with such a practice in numerical optimization. For
this reason we opt for a gradual approach where we start with small penalties and
increase them gradually.

5.7.1 Exterior Penalty Function

The exterior penalty function associates a penalty with a violation of a constraint.


The term 'exterior' refers to the fact that penalties are applied only in the exterior
of the feasible domain. The most common exterior penalty function is one which
associates a penalty which is proportional to the square of a violation. That is, the
constrained minimization problem, Eq. (5.1)

$$\begin{aligned}
\text{minimize} \quad & f(\mathbf{x}) \\
\text{such that} \quad & h_i(\mathbf{x}) = 0, \quad i = 1, \ldots, n_e, \\
& g_j(\mathbf{x}) \ge 0, \quad j = 1, \ldots, n_g,
\end{aligned} \tag{5.7.1}$$

is replaced by

$$\begin{aligned}
\text{minimize} \quad & \phi(\mathbf{x}, r) = f(\mathbf{x}) + r \sum_{i=1}^{n_e} h_i^2(\mathbf{x}) + r \sum_{j=1}^{n_g} \langle -g_j \rangle^2, \\
& r_i \to \infty,
\end{aligned} \tag{5.7.2}$$

where $\langle a \rangle$ denotes the positive part of $a$, or $\max(a, 0)$. The inequality terms are
treated differently from the equality terms because the penalty applies only for
constraint violation. The positive multiplier $r$ controls the magnitude of the penalty
terms. It may seem logical to choose a very high value of $r$ to ensure that no
constraints are violated. However, as noted before, this approach leads to numerical
difficulties illustrated later in an example. Instead the minimization is started with
a relatively small value of $r$, and then $r$ is gradually increased. A typical value for
$r_{i+1}/r_i$ is 5. A typical plot of $\phi(x, r)$ as a function of $r$ is shown in Figure 5.7.1 for a
simple example.

Figure 5.7.1 Exterior penalty function for $f = 0.5x$ subject to $x - 4 \ge 0$.

We see that as $r$ is increased, the minimum of $\phi$ moves closer to the constraint
boundary. However, the curvature of $\phi$ near the minimum also increases. It is

the high values of the curvature associated with large values of r which often lead
to numerical difficulties. By using a sequence of values of r, we use the minima
obtained for small values of r as starting points for the search with higher r values.
Thus the ill-conditioning associated with the large curvature is counterbalanced by
the availability of a good starting point.
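The behaviour described above can be checked on the one-variable problem of Figure 5.7.1. The following is a rough sketch in Python: the golden-section routine `golden_min`, the search interval and the starting value $r = 0.5$ are illustrative assumptions. For $x < 4$ the penalized function is $0.5x + r(4 - x)^2$, whose minimum is at $x = 4 - 1/(4r)$ with curvature $2r$.

```python
def phi(x, r):
    """Exterior penalty function for f = 0.5*x subject to g = x - 4 >= 0."""
    violation = max(4.0 - x, 0.0)          # <-g>, nonzero only when g < 0
    return 0.5 * x + r * violation ** 2


def golden_min(func, lo, hi, tol=1e-7):
    """Golden-section search for the minimum of a unimodal function."""
    ratio = (5.0 ** 0.5 - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


r = 0.5
minima = []
for _ in range(5):
    x_star = golden_min(lambda x, r=r: phi(x, r), 0.0, 8.0)
    minima.append((r, x_star))
    print("r =", r, " x* =", round(x_star, 5), " curvature =", 2.0 * r)
    r *= 5.0                               # typical growth factor for r
```

The printed minima approach $x = 4$ from the infeasible side while the curvature grows in proportion to $r$.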
Based on the type of constraint normalization given by Eq. (5.2) we can select
a reasonable starting value for the penalty multiplier r. A rule of thumb is that
one should start with the total penalty being about equal to the objective function
for typical constraint violation of 50% of the response limits. In most optimization
problems the total number of active constraints is about the same as or just slightly
lower than the number of design variables. Assuming we start with one quarter of
the eventual active constraints being violated by about 50% (or $g = -0.5$) then we
have

$$f(\mathbf{x}_0) = r_0 \frac{n}{4} (0.5)^2, \quad \text{or} \quad r_0 = 16 \frac{f(\mathbf{x}_0)}{n} . \tag{5.7.3}$$

It is also important to obtain a good starting point for restarting the optimization
as $r$ is increased. The minimum of the optimization for the previous value of $r$ is a
reasonable starting point, but one can do better. Fiacco and McCormick [13] show
that the position of the minimum of $\phi(\mathbf{x}, r)$ has the asymptotic form

$$\mathbf{x}^*(r) = \mathbf{a} + \mathbf{b}/r, \quad \text{as } r \to \infty . \tag{5.7.4}$$

Once the optimum has been found for two values of $r$, say $r_{i-1}$ and $r_i$, the vectors $\mathbf{a}$
and $\mathbf{b}$ may be estimated, and the value of $\mathbf{x}^*(r)$ predicted for subsequent values of $r$.
It is easy to check that in order to satisfy Eq. (5.7.4), $\mathbf{a}$ and $\mathbf{b}$ are given as

$$\mathbf{a} = \frac{c\,\mathbf{x}^*(r_{i-1}) - \mathbf{x}^*(r_i)}{c - 1}, \qquad
\mathbf{b} = [\mathbf{x}^*(r_{i-1}) - \mathbf{a}]\, r_{i-1} , \tag{5.7.5}$$

where

$$c = r_{i-1}/r_i . \tag{5.7.6}$$

In addition to predicting a good value of the design variables for restarting the
optimization for the next value of $r$, Eq. (5.7.4) provides us with a useful convergence
criterion, namely

$$\|\mathbf{x}^* - \mathbf{a}\| \le \epsilon_1 , \tag{5.7.7}$$

where $\mathbf{a}$ is estimated from the last two values of $r$, and $\epsilon_1$ is a specified tolerance
chosen to be small compared to a typical value of $\|\mathbf{x}\|$.
A second convergence criterion is based on the magnitude of the penalty terms,
which, as shown in Example 5.7.1, go to zero as $r$ goes to infinity. Therefore, a
reasonable convergence criterion is

$$\left| \frac{\phi - f}{f} \right| \le \epsilon_2 . \tag{5.7.8}$$

Finally, a criterion based on the change in the value of the objective function at the
minimum $f^*$ is also used

$$\left| \frac{f^*(r_i) - f^*(r_{i-1})}{f^*(r_i)} \right| \le \epsilon_3 . \tag{5.7.9}$$

A typical value for $\epsilon_2$ or $\epsilon_3$ is 0.001.
Example 5.7.1

Minimize $f = x_1^2 + 10x_2^2$ such that $x_1 + x_2 = 4$. We have

$$\phi = x_1^2 + 10x_2^2 + r(4 - x_1 - x_2)^2 .$$

The gradient $\nabla\phi$ is given as

$$\mathbf{g} = \begin{Bmatrix} 2x_1(1 + r) + 2rx_2 - 8r \\ 2x_2(10 + r) + 2rx_1 - 8r \end{Bmatrix} .$$

Setting the gradient to zero we obtain

$$x_1 = \frac{40r}{10 + 11r}, \qquad x_2 = \frac{4r}{10 + 11r} .$$

The solution as a function of $r$ is shown in Table 5.7.1.


Table 5.7.1 Minimization of $\phi$ for different penalty multipliers.

   r       x_1      x_2        f        phi
   1      1.905   0.1905     3.992     7.619
  10      3.333   0.3333    12.220    13.333
 100      3.604   0.3604    14.288    14.414
1000      3.633   0.3633    14.518    14.532

It can be seen that as $r$ is increased the solution converges to the exact solution
of $\mathbf{x}^T = (3.636, 0.3636)$, $f = 14.54$. The convergence is indicated by the shrinking
difference between the objective function and the augmented function $\phi$. The Hessian
of $\phi$ is given as

$$H = \begin{bmatrix} 2 + 2r & 2r \\ 2r & 20 + 2r \end{bmatrix} .$$

As $r$ increases this matrix becomes more and more ill-conditioned, as all four
components become approximately $2r$.
This ill-conditioning of the Hessian matrix for large
values of $r$ often occurs when the exterior penalty function is used, and can cause
numerical difficulties for large problems.
We can use Table 5.7.1 to test the extrapolation procedure, Eq. (5.7.4). For
example, with the values of $r = 1$ and $r = 10$, Eq. (5.7.5) gives

$$\mathbf{a} = \frac{0.1\,\mathbf{x}^*(1) - \mathbf{x}^*(10)}{-0.9} = \begin{Bmatrix} 3.492 \\ 0.3492 \end{Bmatrix} ,$$


$$\mathbf{b} = \mathbf{x}^*(1) - \mathbf{a} = \begin{Bmatrix} -1.587 \\ -0.1587 \end{Bmatrix} .$$

We can now use Eq. (5.7.4) to find a starting point for the optimization for $r = 100$
to get

$$\mathbf{a} + \mathbf{b}/100 = (3.476, 0.3476)^T ,$$

which is substantially closer to $\mathbf{x}^*(100) = (3.604, 0.3604)^T$ than is
$\mathbf{x}^*(10) = (3.333, 0.3333)^T$. •••
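The numbers of this example are easy to regenerate. The short Python sketch below uses the closed-form minimizer derived above to rebuild the rows of Table 5.7.1 and then applies the two-point extrapolation of Eqs. (5.7.4) to (5.7.6); the function names `x_star`, `phi` and `extrapolate` are illustrative choices.

```python
def x_star(r):
    """Minimizer of phi for Example 5.7.1: x1 = 40r/(10+11r), x2 = 4r/(10+11r)."""
    d = 10.0 + 11.0 * r
    return (40.0 * r / d, 4.0 * r / d)


def f(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def phi(x, r):
    return f(x) + r * (4.0 - x[0] - x[1]) ** 2


def extrapolate(x_prev, x_curr, r_prev, r_curr):
    """Estimate a and b in x*(r) = a + b/r from two minima, Eqs. (5.7.5)-(5.7.6)."""
    c = r_prev / r_curr
    a = tuple((c * xp - xc) / (c - 1.0) for xp, xc in zip(x_prev, x_curr))
    b = tuple((xp - ai) * r_prev for xp, ai in zip(x_prev, a))
    return a, b


for r in (1.0, 10.0, 100.0, 1000.0):
    x = x_star(r)
    print(r, round(x[0], 3), round(x[1], 4), round(f(x), 3), round(phi(x, r), 3))

a, b = extrapolate(x_star(1.0), x_star(10.0), 1.0, 10.0)
x_guess = tuple(ai + bi / 100.0 for ai, bi in zip(a, b))   # predicted x*(100)
print("a =", a, " b =", b, " guess for r = 100:", x_guess)
```

The objective values printed here are computed from unrounded design variables, so they can differ from tabulated values in the last digit.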

5.7.2 Interior and Extended Interior Penalty Functions

With the exterior penalty function, constraints contribute penalty terms only when
they are violated. As a result, the design typically moves in the infeasible domain.
If the minimization is terminated before $r$ becomes very large (for example, because
of shortage of computer resources) the resulting designs may be useless. When only
inequality constraints are present, it is possible to define an interior penalty function
that keeps the design in the feasible domain. The common form of the interior penalty
method replaces the inequality constrained problem

$$\begin{aligned}
\text{minimize} \quad & f(\mathbf{x}) \\
\text{such that} \quad & g_j(\mathbf{x}) \ge 0, \quad j = 1, \ldots, n_g,
\end{aligned} \tag{5.7.10}$$

by

$$\begin{aligned}
\text{minimize} \quad & \phi(\mathbf{x}, r) = f(\mathbf{x}) + r \sum_{j=1}^{n_g} 1/g_j(\mathbf{x}) , \\
& r_i \to 0, \quad r_i > 0 .
\end{aligned} \tag{5.7.11}$$

Figure 5.7.2 Interior penalty function for $f(x) = 0.5x$ subject to $x - 4 \ge 0$.
The penalty term is proportional to $1/g_j$ and becomes infinitely large at the
boundary of the feasible domain creating a barrier there (interior penalty function
methods are sometimes called barrier methods). It is assumed that the search is
confined to the feasible domain. Otherwise, the penalty becomes negative which
does not make any sense. Figure 5.7.2 shows the application of the interior penalty
function to the simple example used for the exterior penalty function in Figure 5.7.1.
Besides the inverse penalty function defined in Eq. (5.7.11), there has been some use
of a logarithmic interior penalty function

$$\phi(\mathbf{x}, r) = f(\mathbf{x}) - r \sum_{j=1}^{n_g} \log(g_j(\mathbf{x})) . \tag{5.7.12}$$

While the interior penalty function has the advantage over the exterior one in
that it produces a series of feasible designs, it also requires a feasible starting point.
Unfortunately, it is often difficult to find such a feasible starting design. Also, because
of the use of approximation (see Chapter 6), it is quite common for the optimization
process to stray occasionally into the infeasible domain. For these reasons it may be
advantageous to use a combination of interior and exterior penalty functions called
an extended interior penalty function. An example is the quadratic extended interior
penalty function of Haftka and Starnes [14]
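$$\begin{aligned}
& \phi(\mathbf{x}, r) = f(\mathbf{x}) + r \sum_{j=1}^{n_g} p(g_j) , \\
& r_i \to 0,
\end{aligned} \tag{5.7.13}$$

where

$$p(g_j) = \begin{cases}
1/g_j & \text{for } g_j \ge g_0 , \\
\dfrac{1}{g_0}\left[ 3 - 3\left(\dfrac{g_j}{g_0}\right) + \left(\dfrac{g_j}{g_0}\right)^2 \right] & \text{for } g_j < g_0 .
\end{cases} \tag{5.7.14}$$

It is easy to check that $p(g_j)$ has continuity up to second derivatives. The
transition parameter $g_0$ which defines the boundary between the interior and exterior parts
of the penalty terms must be chosen so that the penalty associated with the
constraint, $r p(g_j)$, becomes infinite for negative $g_j$ as $r$ tends to zero. This results in the
requirement that

$$r/g_0^3 \to \infty, \quad \text{as } r \to 0 . \tag{5.7.15}$$

This can be achieved by selecting $g_0$ as

$$g_0 = c r^{1/2} , \tag{5.7.16}$$

where $c$ is a constant.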
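The continuity claim and the role of the transition parameter can be checked numerically. The sketch below assumes the quadratic extension takes the form $p(g) = (1/g_0)[3 - 3(g/g_0) + (g/g_0)^2]$ for $g < g_0$, which is the quadratic that matches $1/g$ and its first two derivatives at $g = g_0$; the function name `p`, the step `h`, the sample violation $g = -0.05$ and the constant $c = 1$ are illustrative assumptions.

```python
def p(g, g0):
    """Quadratic extended interior penalty term (assumed form, see text above)."""
    if g >= g0:
        return 1.0 / g                      # interior part
    t = g / g0
    return (3.0 - 3.0 * t + t * t) / g0     # quadratic extension


g0 = 0.1
h = 1e-4
# Finite differences straddling the transition point g = g0.
slope = (p(g0 + h, g0) - p(g0 - h, g0)) / (2.0 * h)
curvature = (p(g0 + h, g0) - 2.0 * p(g0, g0) + p(g0 - h, g0)) / h ** 2
print("slope     ", slope, " expected", -1.0 / g0 ** 2)
print("curvature ", curvature, " expected", 2.0 / g0 ** 3)

# With g0 = c*sqrt(r) the penalty r*p(g) at a fixed violation grows as r -> 0.
c = 1.0
g_violated = -0.05
penalties = []
for r in (1e-4, 1e-6, 1e-8, 1e-10):
    penalties.append(r * p(g_violated, c * r ** 0.5))
print(penalties)
```

The growth of the last list is slow (proportional to $r^{-1/2}$), which is why very small values of $r$ are used in this demonstration.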
It is also possible to include equality constraints with interior and extended
interior penalty functions. For example, the interior penalty function Eq. (5.7.11) is
augmented as

$$\begin{aligned}
& \phi(\mathbf{x}, r) = f(\mathbf{x}) + r \sum_{j=1}^{n_g} 1/g_j(\mathbf{x}) + r^{-1/2} \sum_{i=1}^{n_e} h_i^2(\mathbf{x}) , \\
& r = r_1, r_2, \ldots, \quad r_i \to 0 .
\end{aligned} \tag{5.7.17}$$
Figure 5.7.3 Extended interior penalty function for $f(x) = 0.5x$ subject to $g(x) =
x - 4 \ge 0$ (curves shown: interior penalty function and its quadratic extension).
The considerations for the choice of an initial value of $r$ are similar to those for
the exterior penalty function. A reasonable choice for the interior penalty function
would require that $n/4$ active constraints at $g = 0.5$ (that is 50% margin for properly
normalized constraints) would result in a total penalty equal to the objective function.
Using Eq. (5.7.3) we obtain

$$f(\mathbf{x}) = \frac{n}{4} \frac{r}{0.5}, \quad \text{or} \quad r = 2f(\mathbf{x})/n .$$

For the extended interior penalty function it is more reasonable to assume that the
$n/4$ constraints are critical ($g = 0$), so that from Eq. (5.7.13)

$$f(\mathbf{x}) = r \frac{n}{4} \frac{3}{g_0}, \quad \text{or} \quad r = \frac{4}{3} g_0 f(\mathbf{x})/n .$$

A reasonable starting value for $g_0$ is 0.1. As for the exterior penalty function, it is
possible to obtain an expression for the asymptotic (as $r \to 0$) coordinates of the
minimum of $\phi$ as [10]

$$\mathbf{x}^*(r) = \mathbf{a} + \mathbf{b} r^{1/2}, \quad \text{as } r \to 0, \tag{5.7.18}$$

and

$$f^*(r) = a + b r^{1/2}, \quad \text{as } r \to 0 .$$

The vectors $\mathbf{a}$, $\mathbf{b}$ and the scalars $a$ and $b$ may be estimated once the minimization
has been carried out for two values of $r$. For example, the estimates for $\mathbf{a}$ and $\mathbf{b}$ are

$$\mathbf{a} = \frac{c^{1/2}\,\mathbf{x}^*(r_{i-1}) - \mathbf{x}^*(r_i)}{c^{1/2} - 1}, \qquad
\mathbf{b} = \frac{\mathbf{x}^*(r_{i-1}) - \mathbf{a}}{r_{i-1}^{1/2}} , \tag{5.7.19}$$

where $c = r_i/r_{i-1}$. As in the case of exterior penalty function, these expressions may
be used for convergence tests and extrapolation.
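For the one-variable problem of Figure 5.7.2 the square-root asymptote can be checked directly. In the sketch below the minimizer of $\phi = 0.5x + r/(x - 4)$ is obtained from $d\phi/dx = 0$ as $x^* = 4 + \sqrt{2r}$, so the extrapolation should return $a = 4$ and $b = \sqrt{2}$; the names `x_star` and `extrapolate` and the chosen values of $r$ are illustrative.

```python
import math


def x_star(r):
    """Minimizer of phi = 0.5*x + r/(x - 4) on x > 4, from d(phi)/dx = 0."""
    return 4.0 + math.sqrt(2.0 * r)


def extrapolate(x_prev, x_curr, r_prev, r_curr):
    """Estimate a and b in x*(r) = a + b*sqrt(r) from two minima, Eq. (5.7.19)."""
    root_c = math.sqrt(r_curr / r_prev)
    a = (root_c * x_prev - x_curr) / (root_c - 1.0)
    b = (x_prev - a) / math.sqrt(r_prev)
    return a, b


r_prev, r_curr = 1.0, 0.1
a, b = extrapolate(x_star(r_prev), x_star(r_curr), r_prev, r_curr)
x_pred = a + b * math.sqrt(0.01)          # predicted minimizer for r = 0.01
print("a =", a, " b =", b)
print("predicted", x_pred, " actual", x_star(0.01))
```

Here the asymptotic form is exact, so the prediction matches the true minimizer; for a general problem the estimate of $\mathbf{a}$ would only approximate the constrained optimum.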

5.7.3 Unconstrained Minimization with Penalty Functions

Penalty functions convert a constrained minimization problem into an unconstrained


one. It may seem that we should now use the best available methods for uncon-
strained minimization, such as quasi-Newton methods. This may not necessarily be
the case. The penalty terms cause the function $\phi$ to have large curvatures near the
constraint boundary even if the curvatures of the objective function and constraints
are small. This effect permits an inexpensive approximate calculation of the Hessian
matrix, so that we can use Newton's method without incurring the high cost of
calculating second derivatives of constraints. This may be more attractive than using
quasi-Newton methods (where the Hessian is also approximated on the basis of first
derivatives) because a good approximation is obtained with a single analysis rather
than with the $n$ moves typically required for a quasi-Newton method. Consider, for
example, an exterior penalty function applied to equality constraints

$$\phi(\mathbf{x}, r) = f(\mathbf{x}) + r \sum_{i=1}^{n_e} h_i^2(\mathbf{x}) . \tag{5.7.20}$$

The second derivatives of $\phi$ are given as

$$\frac{\partial^2 \phi}{\partial x_k \partial x_l} = \frac{\partial^2 f}{\partial x_k \partial x_l}
+ r \sum_{i=1}^{n_e} \left( 2 \frac{\partial h_i}{\partial x_k} \frac{\partial h_i}{\partial x_l}
+ 2 h_i \frac{\partial^2 h_i}{\partial x_k \partial x_l} \right) . \tag{5.7.21}$$

Because of the equality constraint, $h_i$ is close to zero, especially for the later stages
of the optimization (large $r$), and we can neglect the last term in Eq. (5.7.21). For
large values of $r$ we can also neglect the first term, so that we can calculate second
derivatives of $\phi$ based on first derivatives of the constraints. The availability of
inexpensive second derivatives permits the use of Newton's method where the number
of iterations is typically independent of the number of design variables. Quasi-Newton
and conjugate gradient methods, on the other hand, require a number of iterations
proportional to the number of design variables. Thus the use of Newton's method
becomes attractive when the number of design variables is large. The application of
Newton's method with the above approximation of second derivatives is known as
the Gauss-Newton method.
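For the interior penalty function we have a similar situation. The augmented
objective function $\phi$ is given as

$$\phi(\mathbf{x}, r) = f(\mathbf{x}) + r \sum_{j=1}^{n_g} 1/g_j(\mathbf{x}) , \tag{5.7.22}$$

and the second derivatives are

$$\frac{\partial^2 \phi}{\partial x_k \partial x_l} = \frac{\partial^2 f}{\partial x_k \partial x_l}
+ r \sum_{j=1}^{n_g} \frac{1}{g_j^3} \left( 2 \frac{\partial g_j}{\partial x_k} \frac{\partial g_j}{\partial x_l}
- g_j \frac{\partial^2 g_j}{\partial x_k \partial x_l} \right) . \tag{5.7.23}$$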


Now the argument for neglecting the first and last terms in Eq. (5.7.23) is somewhat
lengthier. First we observe that because of the $1/g_j^3$ term, the second derivatives
are dominated by the critical constraints ($g_j$ small). For these constraints the last
term in Eq. (5.7.23) is negligible compared to the first-derivative term because $g_j$ is
small. Finally, from Eq. (5.7.18) it can be shown that $r/g_j^3$ goes to infinity for active
constraints as $r$ goes to zero, so that the first term in Eq. (5.7.23) can be neglected
compared to the second. The same argument can also be used for extended interior
penalty functions [14].
The power of the Gauss-Newton method is shown in [14] for a high-aspect-ratio
wing made of composite materials (see Figure 5.7.4) designed subject to stress and
displacement constraints.

Figure 5.7.4 Aerodynamic planform and structural box for high-aspect-ratio wing,
from [14] (all dimensions in centimeters).

Table 5.7.2 Results of high-aspect-ratio wing study

Number of    CDC 6600     Total number of    Total number    Final
design       CPU time,    unconstrained      of analyses     mass, kg
variables    sec          minimizations
    13          142             4                21           887.3
    25          217             4                19           869.1
    32          293             5                22           661.7
    50          460             5                25           658.2
    74          777             5                28           648.6
   146         1708             5                26           513.0

The structural box of the wing was modeled with a finite element model with
67 nodes and 290 finite elements. The number of design variables controlling the
thickness of the various elements was varied from 13 to 146. The effect of the number
of design variables on the number of iterations (analyses) is shown in Table 5.7.2.

It is seen that the number of iterations per unconstrained minimization is almost
constant (about five). With a quasi-Newton method that number may be expected
to be similar to the number of design variables.
Because of the sharp curvature of $\phi$ near the constraint boundary, it may also be
appropriate to use specialized line searches with penalty functions [15].

5.7.4 Integer Programming with Penalty Functions

An extension of the penalty function approach has been implemented by Shin et
al. [16] for problems with discrete-valued design variables. The extension is based
on introduction of additional penalty terms into the augmented-objective function
$\phi(\mathbf{x}, r)$ to reflect the requirement that the design variables take discrete values,

$$x_i \in X_i, \quad i \in I_d, \tag{5.7.24}$$

where $I_d$ is the set of design variables that can take only discrete values, and $X_i$ is
the set of allowable discrete values. Note that several variables may have the same
allowable set of discrete values. In this case the augmented objective function which
includes the penalty terms due to constraints and the non-discrete values of the design
variables is defined as

$$\phi(\mathbf{x}, r, s) = f(\mathbf{x}) + r \sum_{j=1}^{n_g} p(g_j) + s \sum_{i \in I_d} \psi_d(x_i) , \tag{5.7.25}$$

where $s$ is a penalty multiplier for non-discrete values of the design variables, and
$\psi_d(x_i)$ the penalty term for non-discrete values of the $i$th design variable. Different
forms for the discrete penalty function are possible. The penalty terms $\psi_d(x_i)$ are
assumed to take the following sine-function form in Ref. [16],

$$\psi_d(x_i) = \frac{1}{2} \left( \sin \frac{2\pi \left[ x_i - \frac{1}{4}\left( d_{i(j+1)} + 3d_{ij} \right) \right]}{d_{i(j+1)} - d_{ij}} + 1 \right) ,
\quad d_{ij} \le x_i \le d_{i(j+1)} . \tag{5.7.26}$$

While penalizing the non-discrete valued design variables, the functions $\psi_d(x_i)$
assure the continuity of the first derivatives of the augmented function at the discrete
values of the design variables. The response surfaces generated by Eq. (5.7.25) are
determined according to the values of the penalty multipliers $r$ and $s$. In contrast
to the multiplier $r$, which initially has a large value and decreases as we move from
one iteration to another, the value of the multiplier $s$ is initially zero and increases
gradually.
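A small sketch may help in visualizing the sine-shaped discrete penalty. The Python function below evaluates the penalty term for a sorted list of allowable values, reading Eq. (5.7.26) as $\frac{1}{2}\{\sin(2\pi[x - \frac{1}{4}(d_{j+1} + 3d_j)]/(d_{j+1} - d_j)) + 1\}$ on each interval $d_j \le x \le d_{j+1}$. The name `psi_d`, the use of `bisect` to locate the interval, and the sample values are illustrative assumptions.

```python
import bisect
import math


def psi_d(x, allowed):
    """Sine-shaped penalty for a non-discrete value x; `allowed` must be sorted."""
    if not allowed[0] <= x <= allowed[-1]:
        raise ValueError("x lies outside the range of allowed values")
    # Index j of the interval [d_j, d_(j+1)] containing x.
    j = min(bisect.bisect_right(allowed, x) - 1, len(allowed) - 2)
    lo, hi = allowed[j], allowed[j + 1]
    arg = 2.0 * math.pi * (x - 0.25 * (hi + 3.0 * lo)) / (hi - lo)
    return 0.5 * (math.sin(arg) + 1.0)


allowed = [1.0, 1.5, 2.0]
for x in (1.0, 1.125, 1.25, 1.5, 1.75, 2.0):
    print(x, round(psi_d(x, allowed), 6))
```

The penalty vanishes at each allowed value, where its slope is also zero, and reaches its maximum of one midway between neighbouring values.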
One of the important factors in the application of the proposed method is to
determine when to activate s, and how fast to increase it to obtain discrete optimum
design. Clearly, if the initial value of s is too big and introduced too early in the
design process, the design variables will be trapped away from the global minimum,
resulting in a sub-optimal solution. To avoid this problem, the multiplier s has to be
activated after optimization of several response surfaces which include only constraint
penalty terms. In fact, since sometimes the optimum design with discrete values is

in the neighborhood of the continuous optimum, it may be desirable not to activate
the penalty for the non-discrete design variables until reasonable convergence to the
continuous solution is achieved. This is especially true for problems in which the
intervals between discrete values are very small.
A criterion for the activation of the non-discrete penalty multiplier $s$ is the same
as the convergence criterion of Eq. (5.7.8), that is

$$\left| \frac{\phi - f}{f} \right| \le \epsilon_c . \tag{5.7.27}$$

A typical value for $\epsilon_c$ is 0.01. The magnitude of the non-discrete penalty multiplier,
$s$, at the first discrete iteration is calculated such that the penalty associated with
the discrete-valued design variables that are not at their allowed values is of the order
of 10 percent of the constraint penalty,

$$s \approx 0.1\, r\, p(g) . \tag{5.7.28}$$

As the iteration for discrete optimization proceeds, the non-discrete penalty multiplier
for the new iteration is increased by a factor of the order of 10. It is also important to
decide how to control the penalty multiplier for the constraints, r, during the discrete
optimization process. If r is decreased for each discrete optimization iteration as in
the continuous optimization process, the design can be stalled due to high penalties
for constraint violation. Thus, it is suggested that the penalty multiplier r be frozen at
the end of the continuous optimization process. However, the nearest discrete solution
at this response surface may not be a feasible design, in which case the design must
move away from the continuous optimum by moving back to the previous response
surface. This can be achieved by increasing the penalty multiplier, r, by a factor of
10.
The solution process for the discrete optimization is terminated if the design
variables are sufficiently close to the prescribed discrete values. The convergence
criterion for discrete optimization is

(5.7.29)

where a typical value of the convergence tolerance $\epsilon_d$ is 0.001.

Example 5.7.2

Cross-sectional areas of members of a two-bar truss shown in Figure 5.7.5 are
to be selected from a discrete set of values, $A_i \in \{1.0, 1.5, 2.0\}$, $i = 1, 2$. Determine
the minimum weight structure using the modified penalty function approach such
that the horizontal displacement $u$ at the point of application of the force does not
exceed $\frac{2}{3}(Fl/E)$. Use a tolerance $\epsilon_c = 0.1$ for the activation of the penalty terms
for non-discrete valued design variables, and a convergence tolerance for the design
variables $\epsilon_d = 0.001$.


Figure 5.7.5 Two-bar truss.


Upon normalization, the design problem is posed as

$$\begin{aligned}
\text{minimize} \quad & f = \frac{W}{\rho l} = x_1 + x_2 \\
\text{subject to} \quad & g = 1.5 - 1/x_1 - 1/x_2 \ge 0, \\
& x_i = A_i \in \{1.0, 1.5, 2.0\}, \quad i = 1, 2 .
\end{aligned}$$

Using an initial design of $x_1 = x_2 = 5$ and transition parameter $g_0 = 0.1$, we have
$g = 1.1 > g_0$, therefore, from Eq. (5.7.14) the penalty terms for the constraints are
in the form of $p(g) = 1/g$. The augmented function for the extended interior penalty
function approach is

$$\phi = x_1 + x_2 + \frac{r}{1.5 - 1/x_1 - 1/x_2} .$$

Setting the gradient to zero, we can show that the minimum of the augmented
function as a function of the penalty multiplier $r$ is

$$x_1 = x_2 = \frac{24 + \sqrt{576 - 36(16 - 4r)}}{18} .$$

The initial value of the penalty multiplier $r$ is chosen so that the penalty introduced
for the constraint is equal to the objective function value,

$$r \frac{1}{g(\mathbf{x}_0)} = f(\mathbf{x}_0), \qquad r = 11 .$$

The minima of the augmented function as functions of the penalty multiplier $r$ are
shown in Table 5.7.3. After four iterations the constraint penalty ($\phi - f$) is within
the desired range of the objective function to activate the penalty terms for the
non-discrete values of the design variables.
From Eq. (5.7.25) the augmented function for the modified penalty function
approach has the form

$$\phi = x_1 + x_2 + \frac{r}{1.5 - 1/x_1 - 1/x_2} + \frac{s}{2}\{1 + \sin[4\pi(x_1 - 1.125)]\}
+ \frac{s}{2}\{1 + \sin[4\pi(x_2 - 1.125)]\} .$$
Table 5.7.3 Minimization of $\phi$ without the discrete penalty

    r       x_1      x_2       f        g        phi
    -      5.000    5.000    10.00    1.100       -
   11      3.544    3.544    7.089    0.9357   18.844
   1.1     2.033    2.033    4.065    0.5160    6.197
   0.11    1.554    1.554    3.109    0.2134    3.624
   0.011   1.403    1.403    2.807    0.0747    2.954

The minimum of the augmented function can again be obtained by setting the
gradient to zero

$$1 - \frac{r}{(1.5 - 2/x_1)^2 x_1^2} + 2\pi s \cos[4\pi(x_1 - 1.125)] = 0 ,$$

which can be solved numerically. The initial value of the penalty multiplier $s$ is
calculated from Eq. (5.7.28)

$$s = 0.1\,(0.011)\,\frac{1}{0.0747} = 0.0147 .$$

The minima of the augmented function (which includes the penalty for the
non-discrete valued variables) are shown in Table 5.7.4 as a function of $s$.

Table 5.7.4 Minimization of $\phi$ with the discrete penalty

    r         s        x_1      x_2       f       phi
   0.011    0.0147    1.406    1.406    2.813    2.963
            0.1472    1.432    1.432    2.864    3.021
            1.472     1.493    1.493    2.986    3.060
           14.72      1.499    1.499    2.999    3.065
          147.2       1.500    1.500    3.000    3.066

After four discrete iterations we obtain a minimum at $x_1 = x_2 = 3/2$. There are
two more minima, $\mathbf{x} = (2, 1)$ and $\mathbf{x} = (1, 2)$, with the same value of the objective
function of $f = 3.0$. •••
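The continuous phase of this example can be retraced with a few lines of Python. The sketch below evaluates the closed-form minimizer quoted above for the sequence $r = 11, 1.1, 0.11, 0.011$, uses the inverse penalty $1/g$ throughout (as the tabulated values appear to do), and then computes the starting value of $s$ from Eq. (5.7.28). The names `g`, `x_min` and `rows` are illustrative; the discrete phase is not reproduced here.

```python
import math


def g(x1, x2):
    return 1.5 - 1.0 / x1 - 1.0 / x2


def x_min(r):
    """Closed-form minimizer x1 = x2 of phi = x1 + x2 + r/g quoted in the example."""
    return (24.0 + math.sqrt(576.0 - 36.0 * (16.0 - 4.0 * r))) / 18.0


rows = []
r = 11.0                                   # r * (1/g(x0)) = f(x0) with x0 = (5, 5)
for _ in range(4):
    x = x_min(r)
    f_val = 2.0 * x
    g_val = g(x, x)
    phi_val = f_val + r / g_val
    rows.append((r, x, f_val, g_val, phi_val))
    print(round(r, 3), round(x, 3), round(f_val, 3), round(g_val, 4), round(phi_val, 3))
    r /= 10.0

r_last, _, f_last, g_last, phi_last = rows[-1]
print("relative penalty:", (phi_last - f_last) / f_last)   # compare with eps_c = 0.1
s0 = 0.1 * r_last / g_last                                  # Eq. (5.7.28), p(g) = 1/g
print("initial s:", round(s0, 4))
```

The relative penalty first drops below the activation tolerance of 0.1 at the fourth value of $r$, which is where the discrete penalty is switched on.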

5.8 Multiplier Methods

Multiplier methods combine the use of Lagrange multipliers with penalty functions.
When only Lagrange multipliers are employed the optimum is a stationary point
rather than a minimum of the Lagrangian function. When only penalty functions
are employed we have a minimum but also ill-conditioning. By using both we may
hope to get an unconstrained problem where the function to be minimized does not
suffer from ill-conditioning. A good survey of multiplier methods was conducted by


Bertsekas [17]. We study first the use of multiplier methods for equality constrained
problems.
$$ \begin{aligned} &\text{minimize} && f(\mathbf{x}) \\ &\text{such that} && h_j(\mathbf{x}) = 0, \quad j = 1, \ldots, n_e . \end{aligned} \tag{5.8.1} $$

We define the augmented Lagrangian function

$$ \mathcal{L}(\mathbf{x}, \lambda, r) = f(\mathbf{x}) - \sum_{j=1}^{n_e} \lambda_j h_j(\mathbf{x}) + r \sum_{j=1}^{n_e} h_j^2(\mathbf{x}) . \tag{5.8.2} $$

If all the Lagrange multipliers are set to zero, we get the usual exterior penalty
function. On the other hand, if we use the correct values of the Lagrange multipliers,
$\lambda_j^*$, it can be shown that we get the correct minimum of problem (5.8.1) for any
positive value of r. Then there is no need to use the large value of r required for the
exterior penalty function. Of course, we do not know the correct values of
the Lagrange multipliers.

Multiplier methods are based on estimating the Lagrange multipliers. When the
estimates are good, it is possible to approach the optimum without using large r
values. The value of r needs to be only large enough so that $\mathcal{L}$ has a minimum rather
than a stationary point at the optimum. To obtain an estimate for the Lagrange
multipliers we compare the stationarity conditions for $\mathcal{L}$,

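$$ \frac{\partial \mathcal{L}}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_e} (\lambda_j - 2 r h_j) \frac{\partial h_j}{\partial x_i} = 0 , \tag{5.8.3} $$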

with the exact conditions for the Lagrange multipliers

$$ \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_e} \lambda_j^* \frac{\partial h_j}{\partial x_i} = 0 . \tag{5.8.4} $$

Comparing Eqs. (5.8.3) and (5.8.4) we expect that

$$ \lambda_j - 2 r h_j \rightarrow \lambda_j^* \tag{5.8.5} $$

as the minimum is approached. Based on this relation, Hestenes [18] suggested using
Eq. (5.8.5) as an estimate for $\lambda_j^*$.
That is

$$ \lambda_j^{(k+1)} = \lambda_j^{(k)} - 2 r^{(k)} h_j^{(k)} , \tag{5.8.6} $$

where k is an iteration number.

Example 5.8.1

We repeat Example 5.7.1 using Hestenes' multiplier method.

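$$ f(\mathbf{x}) = x_1^2 + 10 x_2^2 , \qquad h(\mathbf{x}) = x_1 + x_2 - 4 = 0 . $$
The augmented Lagrangian is
$$ \mathcal{L} = x_1^2 + 10 x_2^2 - \lambda (x_1 + x_2 - 4) + r (x_1 + x_2 - 4)^2 . $$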

To find the stationary points of the augmented Lagrangian we differentiate with
respect to $x_1$ and $x_2$ to get
$$ 2 x_1 - \lambda + 2 r (x_1 + x_2 - 4) = 0 , $$
$$ 20 x_2 - \lambda + 2 r (x_1 + x_2 - 4) = 0 , $$
which yield
$$ x_1 = 10 x_2 = \frac{5\lambda + 40 r}{10 + 11 r} . $$
We want to compare the results with those of Example 5.7.1, so we start with the
same initial r value $r^{(0)} = 1$, the initial estimate of $\lambda = 0$ and get
$$ \mathbf{x}_1 = (1.905, 0.1905)^T, \qquad h = -1.905 . $$
So, using Eq. (5.8.6) we estimate $\lambda^{(1)}$ as
$$ \lambda^{(1)} = -2 \times 1 \times (-1.905) = 3.81 . $$
We next repeat the optimization with $r^{(1)} = 10$, $\lambda^{(1)} = 3.81$ and get
$$ \mathbf{x}_2 = (3.492, 0.3492)^T, \qquad h = -0.1587 . $$
For the same value of r, we obtained in Example 5.7.1 $\mathbf{x}_2 = (3.333, 0.3333)^T$, so that
we are now closer to the exact solution of $\mathbf{x} = (3.636, 0.3636)^T$. Now we estimate a
new $\lambda$ from Eq. (5.8.6)
$$ \lambda^{(2)} = 3.81 - 2 \times 10 \times (-0.1587) = 6.984 . $$
For the next iteration we may, for example, fix the value of r at 10 and change only
$\lambda$. For $\lambda = 6.984$ we obtain
$$ \mathbf{x}_3 = (3.624, 0.3624)^T, \qquad h = -0.0136 , $$
which shows that good convergence can be obtained without increasing r. •••
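
The iterations of this example are easy to reproduce. The following sketch uses the closed-form minimizer $x_1 = 10x_2 = (5\lambda + 40r)/(10 + 11r)$ of the augmented Lagrangian and the multiplier update of Eq. (5.8.6). The function names and the schedule of r values are illustrative assumptions. For a general problem the inner minimization would be done by an unconstrained optimizer.

```python
def minimize_augmented(lam, r):
    """Closed-form minimizer of the augmented Lagrangian of Example 5.8.1."""
    x1 = (5.0 * lam + 40.0 * r) / (10.0 + 11.0 * r)
    return x1, x1 / 10.0

def hestenes(r_schedule, lam0=0.0):
    """Multiplier iterations: minimize for fixed (lam, r), then update lam by Eq. (5.8.6)."""
    lam = lam0
    history = []
    for r in r_schedule:
        x1, x2 = minimize_augmented(lam, r)
        h = x1 + x2 - 4.0                 # equality constraint value
        history.append((r, lam, x1, x2, h))
        lam = lam - 2.0 * r * h           # Hestenes update of the multiplier estimate
    return history

history = hestenes([1.0, 10.0, 10.0])
for r, lam, x1, x2, h in history:
    print(f"r={r:<5} lambda={lam:.3f}  x=({x1:.3f}, {x2:.4f})  h={h:.4f}")
```

Keeping r fixed at 10 in the last two steps mirrors the choice made in the example. The constraint violation shrinks only because the multiplier estimate improves.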

There are several ways to extend the multiplier method to deal with inequality
constraints. The formulation below is based on Fletcher's work [19]. The constrained
problem that we examine is

$$ \begin{aligned} &\text{minimize} && f(\mathbf{x}) \\ &\text{such that} && g_j(\mathbf{x}) \ge 0, \quad j = 1, \ldots, n_g . \end{aligned} \tag{5.8.7} $$

The augmented Lagrangian function is

(5.8.8)

where $\langle a \rangle = \max(a, 0)$. The condition of stationarity of $\mathcal{L}$ is

(5.8.9)

The exact stationarity condition is

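$$ \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g} \lambda_j^* \frac{\partial g_j}{\partial x_i} = 0 , \tag{5.8.10} $$
where it is also required that $\lambda_j^* g_j = 0$. Comparing Eqs. (5.8.9) and (5.8.10) we expect
an estimate for $\lambda_j^*$ of the form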

(5.8.11)

5.9 Projected Lagrangian Methods (Sequential Quadratic Programming)

The addition of penalty terms to the Lagrangian function by multiplier methods
converts the optimum from a stationary point of the Lagrangian function to a min-
imum point of the augmented Lagrangian. Projected Lagrangian methods achieve
the same result by a different method. They are based on a theorem that states that
the optimum is a minimum of the Lagrangian function in the subspace of vectors
orthogonal to the gradients of the active constraints (the tangent subspace). Pro-
jected Lagrangian methods employ a quadratic approximation to the Lagrangian in
this subspace. The direction seeking algorithm is more complex than for the methods
considered so far. It requires the solution of a quadratic programming problem, that
is an optimization problem with a quadratic objective function and linear constraints.
Projected Lagrangian methods are part of a class of methods known as sequential
quadratic programming (SQP) methods. The extra work associated with the solution
of the quadratic programming direction seeking problem is often rewarded by faster
convergence.


The present discussion is a simplified version of Powell's projected Lagrangian
method [20]. In particular we consider only the case of inequality constraints

$$ \begin{aligned} &\text{minimize} && f(\mathbf{x}) \\ &\text{such that} && g_j(\mathbf{x}) \ge 0, \quad j = 1, \ldots, n_g . \end{aligned} \tag{5.9.1} $$

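Assume that at the ith iteration the design is at $\mathbf{x}_i$, and we seek a move direction $\mathbf{s}$.
The direction $\mathbf{s}$ is the solution of the following quadratic programming problem
$$ \begin{aligned} &\text{minimize} && \phi(\mathbf{s}) = f(\mathbf{x}_i) + \mathbf{s}^T \mathbf{g}(\mathbf{x}_i) + \tfrac{1}{2} \mathbf{s}^T \mathbf{A}(\mathbf{x}_i, \lambda_i) \mathbf{s} \\ &\text{such that} && g_j(\mathbf{x}_i) + \mathbf{s}^T \nabla g_j(\mathbf{x}_i) \ge 0, \quad j = 1, \ldots, n_g , \end{aligned} \tag{5.9.2} $$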

where $\mathbf{g}$ is the gradient of f, and $\mathbf{A}$ is a positive definite approximation to the Hessian
of the Lagrangian function discussed below. This quadratic programming problem
can be solved by a variety of methods which take advantage of its special nature. The
solution of the quadratic programming problem yields $\mathbf{s}$ and $\lambda_{i+1}$. We then have

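$$ \mathbf{x}_{i+1} = \mathbf{x}_i + \alpha \mathbf{s} , \tag{5.9.3} $$

where $\alpha$ is found by minimizing the function

$$ \psi(\alpha) = f(\mathbf{x}) + \sum_{j=1}^{n_g} \mu_j \left| \min(0, g_j(\mathbf{x})) \right| , \tag{5.9.4} $$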

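and the $\mu_j$ are equal to the absolute values of the Lagrange multipliers for the first
iteration and are updated in later iterations as
$$ \mu_j^{(i)} = \max\left[ |\lambda_j^{(i)}| ,\; \tfrac{1}{2}\left( \mu_j^{(i-1)} + |\lambda_j^{(i-1)}| \right) \right] , \tag{5.9.5} $$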
with the superscript i denoting iteration number. The matrix $\mathbf{A}$ is initialized to some
positive definite matrix (e.g., the identity matrix) and then updated using a BFGS type
equation (see Chapter 4).

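$$ \mathbf{A}_{new} = \mathbf{A} - \frac{\mathbf{A} \Delta\mathbf{x} \Delta\mathbf{x}^T \mathbf{A}}{\Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}} + \frac{\Delta\mathbf{l} \Delta\mathbf{l}^T}{\Delta\mathbf{x}^T \Delta\mathbf{x}} , \tag{5.9.6} $$

where
$$ \Delta\mathbf{x} = \mathbf{x}_{i+1} - \mathbf{x}_i , \qquad \Delta\mathbf{l} = \nabla_x L(\mathbf{x}_{i+1}, \lambda_i) - \nabla_x L(\mathbf{x}_i, \lambda_i) , \tag{5.9.7} $$
where L is the Lagrangian function and $\nabla_x$ denotes the gradient of the Lagrangian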
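function with respect to $\mathbf{x}$. To guarantee the positive definiteness of $\mathbf{A}$, $\Delta\mathbf{l}$ is modified
if $\Delta\mathbf{x}^T \Delta\mathbf{l} \le 0.2\, \Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}$ and replaced by

$$ \Delta\mathbf{l}' = \theta \Delta\mathbf{l} + (1 - \theta) \mathbf{A} \Delta\mathbf{x} , \tag{5.9.8} $$

where
$$ \theta = \frac{0.8\, \Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}}{\Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x} - \Delta\mathbf{x}^T \Delta\mathbf{l}} . \tag{5.9.9} $$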
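
The safeguard of Eqs. (5.8.8)-style penalties aside, the modification of Eqs. (5.9.8) and (5.9.9) can be written in a few lines. The sketch below is a minimal illustration with hypothetical helper names (`dot`, `matvec`, `damp_gradient_change`) and made-up test vectors. It returns the vector unchanged when $\Delta\mathbf{x}^T \Delta\mathbf{l} > 0.2\, \Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}$. Otherwise it blends $\Delta\mathbf{l}$ with $\mathbf{A}\Delta\mathbf{x}$ so that the modified product equals exactly $0.2\, \Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def damp_gradient_change(A, dx, dl):
    """Return (dl_modified, theta) following Eqs. (5.9.8) and (5.9.9)."""
    A_dx = matvec(A, dx)
    curvature = dot(dx, A_dx)          # dx^T A dx, positive for positive definite A
    dx_dl = dot(dx, dl)                # dx^T dl
    if dx_dl > 0.2 * curvature:
        return list(dl), 1.0           # no modification needed
    theta = 0.8 * curvature / (curvature - dx_dl)
    dl_mod = [theta * a + (1.0 - theta) * b for a, b in zip(dl, A_dx)]
    return dl_mod, theta

# Hypothetical data for which the safeguard is triggered (dx^T dl < 0).
identity = [[1.0, 0.0], [0.0, 1.0]]
dl_mod, theta = damp_gradient_change(identity, [1.0, 0.0], [-1.0, 0.5])
print(theta, dl_mod, dot([1.0, 0.0], dl_mod))
```

The value 0.2 is the threshold quoted in the text. The blending factor follows directly from requiring $\Delta\mathbf{x}^T \Delta\mathbf{l}' = 0.2\, \Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}$.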

Example 5.9.1

Consider the four bar truss of Example 5.1.2. The problem of finding the minimum
weight design subject to stress and displacement constraints was formulated as

$$ \begin{aligned} &\text{minimize} && f = 3 x_1 + \sqrt{3}\, x_2 \\ &\text{subject to} && g_1 = 3 - \frac{18}{x_1} - \frac{6\sqrt{3}}{x_2} \ge 0 , \\ & && g_2 = x_1 - 5.73 \ge 0 , \\ & && g_3 = x_2 - 7.17 \ge 0 . \end{aligned} $$

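Assume that we start the search at the intersection of $g_1 = 0$ and $g_3 = 0$ where
$x_1 = 11.61$, $x_2 = 7.17$ and $f = 47.25$.
The gradient of the objective function and two
active constraints are

$$ \nabla f = \begin{Bmatrix} 3 \\ \sqrt{3} \end{Bmatrix} , \quad \nabla g_1 = \begin{Bmatrix} 0.1335 \\ 0.2021 \end{Bmatrix} , \quad \nabla g_3 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix} , \quad \mathbf{N} = \begin{bmatrix} 0.1335 & 0 \\ 0.2021 & 1 \end{bmatrix} . $$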

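We start with $\mathbf{A}$ set to the unit matrix so that

$$ \phi(\mathbf{s}) = 47.25 + 3 s_1 + \sqrt{3}\, s_2 + 0.5 s_1^2 + 0.5 s_2^2 , $$

and the linearized constraints are
$$ \begin{aligned} g_1(\mathbf{s}) &= 0.1335 s_1 + 0.2021 s_2 \ge 0 , \\ g_2(\mathbf{s}) &= 5.88 + s_1 \ge 0 , \\ g_3(\mathbf{s}) &= s_2 \ge 0 . \end{aligned} $$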

We solve this quadratic programming problem directly with the use of the Kuhn-
Tucker conditions
$$ \begin{aligned} 3 + s_1 - 0.1335 \lambda_1 - \lambda_2 &= 0 , \\ \sqrt{3} + s_2 - 0.2021 \lambda_1 - \lambda_3 &= 0 . \end{aligned} $$
A consideration of all possibilities for active constraints shows that the optimum is
obtained when only $g_1$ is active, so that $\lambda_2 = \lambda_3 = 0$ and $\lambda_1 = 12.8$, $s_1 = -1.29$,
$s_2 = 0.855$. The next design is

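$$ \mathbf{x}_1 = \begin{Bmatrix} 11.61 \\ 7.17 \end{Bmatrix} + \alpha \begin{Bmatrix} -1.29 \\ 0.855 \end{Bmatrix} , $$

where $\alpha$ is found by minimizing $\psi(\alpha)$ of Eq. (5.9.4). For the first iteration $\mu_j = |\lambda_j|$
so

$$ \psi = 3(11.61 - 1.29\alpha) + \sqrt{3}(7.17 + 0.855\alpha) + 12.8 \left| 3 - \frac{18}{11.61 - 1.29\alpha} - \frac{6\sqrt{3}}{7.17 + 0.855\alpha} \right| . $$
By changing $\alpha$ systematically we find that $\psi$ is a minimum near $\alpha = 2.2$, so that
$$ \mathbf{x}_1 = (8.77, 9.05)^T, \qquad f(\mathbf{x}_1) = 41.98, \qquad g_1(\mathbf{x}_1) = -0.201 . $$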
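
The first cycle of this example can be retraced numerically. The sketch below is an illustration under stated assumptions: it takes for granted, as found in the text, that only $g_1$ is active in the quadratic programming subproblem, so with $\mathbf{A} = \mathbf{I}$ the Kuhn-Tucker conditions reduce to $\mathbf{s} = -\nabla f + \lambda_1 \nabla g_1$ with $\nabla g_1^T \mathbf{s} = 0$. The line search is a plain grid scan with an arbitrarily chosen step of 0.01. All variable names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)

def f(x1, x2):
    return 3.0 * x1 + SQRT3 * x2

def g1(x1, x2):
    return 3.0 - 18.0 / x1 - 6.0 * SQRT3 / x2

x0 = (11.61, 7.17)               # starting design
grad_f = (3.0, SQRT3)
n1 = (0.1335, 0.2021)            # gradient of g1 at x0, as quoted in the text

# Kuhn-Tucker conditions with A = I and only g1 active (g1(x0) = 0):
#   s = -grad_f + lam1 * n1,   n1 . s = 0
lam1 = (n1[0] * grad_f[0] + n1[1] * grad_f[1]) / (n1[0] ** 2 + n1[1] ** 2)
s = (lam1 * n1[0] - grad_f[0], lam1 * n1[1] - grad_f[1])

def psi(alpha, mu=lam1):
    """Line-search function of Eq. (5.9.4) with a single penalized constraint."""
    x1, x2 = x0[0] + alpha * s[0], x0[1] + alpha * s[1]
    return f(x1, x2) + mu * abs(min(0.0, g1(x1, x2)))

alphas = [0.01 * k for k in range(401)]      # scan alpha in [0, 4]
alpha_best = min(alphas, key=psi)
print(f"lambda1={lam1:.2f}  s=({s[0]:.3f}, {s[1]:.3f})  alpha={alpha_best:.2f}")
```

The function $\psi$ is quite flat near its minimum, so a coarse scan of $\alpha$ is adequate here. The linearized constraints $g_2$ and $g_3$ should still be checked for the computed $\mathbf{s}$, as the test of all active-set possibilities in the text requires.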
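To update $\mathbf{A}$ we need $\Delta\mathbf{x}$ and $\Delta\mathbf{l}$. We have
$$ L = 3 x_1 + \sqrt{3}\, x_2 - 12.8 \left( 3 - \frac{18}{x_1} - \frac{6\sqrt{3}}{x_2} \right) , $$
so that
$$ \nabla_x L = \left( 3 - 230.4/x_1^2 ,\; \sqrt{3} - 133.0/x_2^2 \right)^T , $$
and
$$ \Delta\mathbf{x} = \mathbf{x}_1 - \mathbf{x}_0 = \begin{Bmatrix} -2.84 \\ 1.88 \end{Bmatrix} , \qquad \Delta\mathbf{l} = \nabla_x L(\mathbf{x}_1) - \nabla_x L(\mathbf{x}_0) . $$

With $\mathbf{A}$ being the identity matrix we have $\Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x} = 11.6$, $\Delta\mathbf{x}^T \Delta\mathbf{l} = 5.53$. Because
$\Delta\mathbf{x}^T \Delta\mathbf{l} > 0.2\, \Delta\mathbf{x}^T \mathbf{A} \Delta\mathbf{x}$ we can use Eq. (5.9.6) to update $\mathbf{A}$

$$ \mathbf{A}_{new} = \mathbf{I} - \frac{\Delta\mathbf{x} \Delta\mathbf{x}^T}{\Delta\mathbf{x}^T \Delta\mathbf{x}} + \frac{\Delta\mathbf{l} \Delta\mathbf{l}^T}{\Delta\mathbf{x}^T \Delta\mathbf{x}} = \begin{bmatrix} 0.453 & 0.352 \\ 0.352 & 0.775 \end{bmatrix} . $$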

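For the second iteration

$$ \begin{aligned} \phi(\mathbf{s}) &= 41.98 + 3 s_1 + \sqrt{3}\, s_2 + 0.5 (0.453 s_1^2 + 0.775 s_2^2 + 0.704 s_1 s_2) , \\ g_1(\mathbf{s}) &= -0.201 + 0.234 s_1 + 0.127 s_2 \ge 0 , \\ g_2(\mathbf{s}) &= 3.04 + s_1 \ge 0 , \\ g_3(\mathbf{s}) &= 1.88 + s_2 \ge 0 . \end{aligned} $$
We can again solve the quadratic programming problem directly with the use of the Kuhn-
Tucker conditions
$$ \begin{aligned} 3 + 0.453 s_1 + 0.352 s_2 - 0.234 \lambda_1 - \lambda_2 &= 0 , \\ \sqrt{3} + 0.352 s_1 + 0.775 s_2 - 0.127 \lambda_1 - \lambda_3 &= 0 . \end{aligned} $$
The solution is

$$ \lambda_1 = 14.31, \quad \lambda_2 = \lambda_3 = 0, \quad s_1 = 1.059, \quad s_2 = -0.376 . $$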


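The one-dimensional search seeks to minimize
$$ \psi(\alpha) = f(\alpha) + \mu_1 \left| \min(0, g_1(\alpha)) \right| , $$
where
$$ \mu_1 = \max\left( \lambda_1 ,\; \tfrac{1}{2}( |\lambda_1| + \mu_1^{old} ) \right) = 14.31 . $$
The one-dimensional search yields approximately $\alpha = 0.5$, so that
$$ \mathbf{x}_2 = (9.30, 8.86)^T, \qquad f(\mathbf{x}_2) = 43.25, \qquad g_1(\mathbf{x}_2) = -0.108 , $$
so that we have made good progress towards the optimum $\mathbf{x}^* = (9.46, 9.46)^T$. •••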


5.10 Exercises

1. Check the nature of the stationary points of the constrained problem

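$$ \begin{aligned} &\text{minimize} && f(\mathbf{x}) = x_1^2 + 4 x_2^2 + 9 x_3^2 \\ &\text{such that} && x_1 + 2 x_2 + 3 x_3 \ge 30 , \\ & && x_2 x_3 \ge 2 , \\ & && x_3 \ge 4 , \\ & && x_1 x_2 \ge 0 . \end{aligned} $$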

2. For the problem

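$$ \begin{aligned} &\text{minimize} && f(\mathbf{x}) = 3 x_1^2 - 2 x_1 - 5 x_2^2 + 30 x_2 \\ &\text{such that} && 2 x_1 + 3 x_2 \ge 8 , \\ & && 3 x_1 + 2 x_2 \le 15 , \\ & && x_2 \le 5 . \end{aligned} $$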

Check for a minimum at the following points: (a) (5/3, 5.00) (b) (1/3, 5.00) (c)
(3.97,1.55).

3. Calculate the derivative of the solution of Example 5.1.2 with respect to a change in
the allowable displacement. First use the Lagrange multiplier to obtain the derivative
of the objective function, and then calculate the derivatives of the design variables
and Lagrange multipliers and verify the derivative of the objective function. Finally,
estimate from the derivatives of the solution how much we can change the allowable
displacement without changing the set of active constraints.

4. Solve for the minimum of problem 1 using the gradient projection method from
the point (17, 1/2, 4).

5. Complete two additional moves in Example 5.5.2.

6. Find a feasible usable direction for problem 1 at the point (17, 1/2,4).

7. Use an exterior penalty function to solve Example 5.1.2.

8. Use an interior penalty function to solve Example 5.1.2.

9. Consider the design of a box of maximum volume such that the surface area is
equal to S and there is one face with an area of S/4. Use the method of multipliers
to solve this problem, employing three design variables.

10. Complete two more iterations in Example 5.9.1.

5.11 References

[1] Kreisselmeier, G., and Steinhauser, R., "Systematic Control Design by Optimiz-
ing a Vector Performance Index," Proceedings of IFAC Symposium on Computer
Aided Design of Control Systems, Zurich, Switzerland, pp. 113-117, 1979.
[2] Sobieszczanski-Sobieski, J., "A Technique for Locating Function Roots and for
Satisfying Equality Constraints in Optimization," NASA TM-104037, NASA
LaRC, 1991.
[3] Wolfe, P., "The Simplex Method for Quadratic Programming," Econometrica, 27
(3), pp. 382-398, 1959.
[4] Gill, P.E., Murray, W., and Wright, M.H., Practical Optimization, Academic
Press, 1981.
[5] Dahlquist, G., and Bjorck, A., Numerical Methods, Prentice Hall, 1974.
[6] Sobieszczanski-Sobieski, J., Barthelemy, J.F., and Riley, K.M., "Sensitivity of
Optimum Solutions of Problem Parameters", AIAA Journal, 20 (9), pp. 1291-
1299, 1982.
[7] Rosen, J.B., "The Gradient Projection Method for Nonlinear Programming-
Part I: Linear Constraints", The Society for Industrial and Appl. Mech. Journal,
8 (1), pp. 181-217, 1960.
[8] Abadie, J., and Carpentier, J., "Generalization of the Wolfe Reduced Gradient
Method for Nonlinear Constraints", in: Optimization (R. Fletcher, ed.), pp. 37-
49, Academic Press, 1969.
[9] Rosen, J.B., "The Gradient Projection Method for Nonlinear Programming-Part
II: Nonlinear Constraints", The Society for Industrial and Appl. Mech. Journal,
9 (4), pp. 514-532, 1961.
[10] Haug, E.J., and Arora, J.S., Applied Optimal Design: Mechanical and Structural
Systems, John Wiley, New York, 1979.
[11] Zoutendijk, G., Methods of Feasible Directions, Elsevier, Amsterdam, 1960.
[12] Vanderplaats, G.N., "CONMIN-A Fortran Program for Constrained Function
Minimization", NASA TM X-62282, 1973.
[13] Fiacco, V., and McCormick, G.P., Nonlinear Programming: Sequential Uncon-
strained Minimization Techniques, John Wiley, New York, 1968.
[14] Haftka, R.T., and Starnes, J.H., Jr., "Applications of a Quadratic Extended
Interior Penalty Function for Structural Optimization", AIAA Journal, 14 (6),
pp. 718-724, 1976.
[15] Moe, J., "Penalty Function Methods in Optimum Structural Design-Theory and
Applications", in: Optimum Structural Design (Gallagher and Zienkiewicz, eds.),
pp. 143-177, John Wiley, 1973.


[16] Shin, D.K., Gürdal, Z., and Griffin, O.H., Jr., "A Penalty Approach for Nonlinear
Optimization with Discrete Design Variables," Engineering Optimization, 16, pp.
29-42, 1990.
[17] Bertsekas, D.P., "Multiplier Methods: A Survey," Automatica, 12, pp. 133-145,
1976.
[18] Hestenes, M.R., "Multiplier and Gradient Methods," Journal of Optimization
Theory and Applications, 4 (5), pp. 303-320, 1969.
[19] Fletcher, R., "An Ideal Penalty Function for Constrained Optimization," Journal
of the Institute of Mathematics and its Applications, 15, pp. 319-342, 1975.
[20] Powell, M.J.D., "A Fast Algorithm for Nonlinearly Constrained Optimization
Calculations", Proceedings of the 1977 Dundee Conference on Numerical Analy-
sis, Lecture Notes in Mathematics, Vol. 630, pp. 144-157, Springer-Verlag, Berlin,
1978.

6 Aspects of The Optimization Process in Practice

Occasionally, a structural analyst will write a design program that includes the
calculation of structural response as well as an implementation of a constrained opti-
mization algorithm, such as those discussed in Chapter 5. More often, however, the
analyst will have a structural analysis package, such as a finite-element program, as
well as an optimization software package available to him. The task of the analyst
is to combine the two so as to bring them to bear on the structural design problem
that he wishes to solve.
Two major difficulties are associated with the process of interfacing a structural
analysis package with an optimization program. The first is a programming difficulty.
Optimization packages typically expect subroutines that evaluate the objective func-
tion and constraints. When the structural analysis program is large, or if the analyst
does not have access to the source code of the program (a common situation), it
is very difficult to transform the analysis package into a subroutine called by the
optimization program.
The second serious problem is the high computational cost required for many
applications. For many structural optimization problems the evaluation of objective
function and constraints requires the execution of costly finite element analyses for
displacements, stresses or other structural response quantities. The optimization pro-
cess may require evaluating objective function and constraints hundreds or thousands
of times. The cost of repeating the finite element analysis so many times is usually
prohibitive.
Fortunately, there is an approach to interfacing an optimization program with
an analysis program that solves both problems. This increasingly popular approach,
called sequential approximate optimization, was suggested by Schmit and Farshi [1].
The computational cost problem is addressed by the use of approximate analyses
during portions of the optimization process. The structural analysis package is first
used to analyze an initial design, and then to generate information that allows the
construction of constraint approximations. For example, when the number of design
variables is small it is practical to analyze the structure at a number of points in
the design space, and use the response at those points to construct a polynomial
approximation to the response at other points. The optimization package is then
applied to the approximate problem represented by the polynomial approximation.
Since the polynomial approximation is typically easy to program, it is straightforward
to interface it to the optimization package.
The simple approximations generated by repeated use of the analysis package
are often referred to as low-cost explicit approximations, in contrast to the implicit
dependence of the response on the structural design variables via a finite element
solution. The polynomial approximation obtained by analyzing the structure at a
number of design points is a global approximation. Obtaining such a global approxi-
mation can be quite expensive for a large number of design variables. For example, if
we want to fit the structural response by a quadratic polynomial, we need to analyze
the structure for at least n(n + 1)/2 design points (typically many more to ensure a
robust approximation), where n is the number of design variables. This will result
in thousands of analyses when the number of design variables is larger than, say 40.
Therefore, it is more common to use local approximations based on derivatives of the
objective function and constraints with respect to the design variables. The simplest
approach is to replace the objective function and constraints with linear approxima-
tions based on these derivatives. However, these approximations are useful only in a
neighborhood of the design space. Therefore, it is necessary to impose limits, called
move limits, on the magnitudes of changes in the design that are permitted while the
approximate analysis is used.
Following an optimization based on approximate analysis and move limits, an ex-
act analysis is performed at the design point obtained by the approximate optimiza-
tion, and new derivatives are calculated so that a new approximation for objective
function and constraints can be constructed. The process is repeated until conver-
gence is achieved, typically measured by the magnitude of changes in the objective
function or the degree of satisfaction of the optimality conditions (e.g., the Kuhn-
Tucker conditions). Because each approximate optimization is only one cycle in the
overall optimization process, it is usually possible to employ lax convergence criteria
for these approximate problems, except for the last one. To distinguish them from
the iterations inside approximate optimizations, each such optimization is referred to
as a cycle rather than as an iteration.
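
The cycle structure described above can be illustrated on a deliberately trivial problem. The sketch below is a toy, not a structural application: the "exact analysis" is the hypothetical one-variable constraint $g(x) = 1 - 1/x \ge 0$, the objective is $f(x) = x$, and the approximate problem in each cycle is the linearized problem with a 20% move limit, which can be solved in closed form. All names and numerical settings are assumptions made for the illustration.

```python
def exact_analysis(x):
    """Stand-in for an expensive analysis: returns g(x) = 1 - 1/x and dg/dx."""
    return 1.0 - 1.0 / x, 1.0 / x ** 2

def sequential_linear_programming(x, move_limit=0.2, tol=1e-9, max_cycles=50):
    """Minimize f(x) = x subject to g(x) >= 0 by repeated linearization (x > 0)."""
    for cycle in range(1, max_cycles + 1):
        g, dg = exact_analysis(x)                 # one exact analysis per cycle
        lo, hi = x * (1.0 - move_limit), x * (1.0 + move_limit)
        # Approximate problem: minimize x_new subject to
        #   g + dg * (x_new - x) >= 0  and  lo <= x_new <= hi.
        x_lin = x - g / dg                        # smallest x_new allowed by the linearized constraint
        x_new = min(max(x_lin, lo), hi)           # enforce the move limits
        if abs(x_new - x) < tol:                  # convergence measured by the design change
            return x_new, cycle
        x = x_new
    return x, max_cycles

x_opt, cycles = sequential_linear_programming(3.0)
print(x_opt, cycles)
```

In the first few cycles the move limits, not the linearized constraint, determine the step. This is their purpose: they keep the design inside the region where the linear approximation can be trusted.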
When linear approximations are used, and the move limits are posed as linear in-
equalities, this process is called sequential linear programming (SLP), and was known
for many years before Schmit and Farshi proposed the use of approximations for
structural optimization. However, there is no need to limit the process to linear ap-
proximations, as long as the approximations are substantially cheaper to calculate
than the exact analyses. For example, Schmit and Farshi demonstrated the use of in-
expensive nonlinear approximations by using the reciprocal approximation, discussed
in Section 6.1.
The use of sequential approximate optimization in the design process is the key
step in interfacing a structural analysis program with an optimization program, and
so it is the major topic discussed in this chapter. However, there are other aspects of
the practical use of the optimization process in design that deserve consideration. For
shape optimization problems, it is important to be able to modify the discretization of
the structure (e.g., the finite-element model) as the design is changed. This requires

sophisticated mesh generators, and is discussed in Section 6.5. Other topics discussed
in this chapter include optimization packages, and test problems that are often used
to check on the performance of these packages. One important topic which is not
discussed in this chapter is the calculation of the derivatives of the response of the
structure needed for constructing the approximation. This topic requires a more
detailed study and is the subject of Chapters 7 and 8.
The use of sequential approximate optimization is by no means universally ac-
cepted as the only way to deal with the optimization of complex structures. Many
analysts prefer to use their judgement so as to produce a design model of the problem
which employs a much coarser discretization than they would accept for the final
analysis of the structure. They hope that the design trends revealed by optimizing
the coarse model will hold for the more refined model. While this approach is quite
legitimate, it will not be discussed here, because it requires a great deal of experi-
ence on the part of the analyst, and is highly problem dependent. As such it is very
difficult to codify in a textbook.

6.1 Generic Approximations

The most commonly used approximations to objective functions and constraints
are based on the value of the function and its derivatives at one or several points.
Most of these approximations are applicable to any function, regardless of whether it
describes structural response or not. For this reason we refer to such approximations
as generic. Approximations that are specific to the form of analysis that is used to
generate the function are dealt with in the next section. Generic approximations can
be divided into local approximations, that are sufficiently accurate only in a limited
region of the design space, and global approximations that attempt to approximate
the function in the entire design space. Midrange approximations offer a compromise
between the two.

6.1.1 Local Approximations

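The simplest local approximation is the linear approximation based on the Taylor
series. Given a function $g(\mathbf{x})$, the linear approximation $g_L(\mathbf{x})$ is

$$ g_L(\mathbf{x}) = g(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i}) \left( \frac{\partial g}{\partial x_i} \right)_{\mathbf{x}_0} . \tag{6.1.1} $$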

For many applications the linear approximation is inaccurate even for design
points x that are close to Xo. Accuracy can be increased by retaining additional
terms in the Taylor series expansion. This, however, requires the costly calculation of
higher-order derivatives. A more attractive alternative is to find intervening variables
that would make the approximated function behave more linearly. That is, define

$$ y_i = y_i(\mathbf{x}) , \qquad i = 1, \ldots, m , \tag{6.1.2} $$



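where $y_i$ are m functions of the design variables called intervening variables. The
linear approximation, $g_I$, in terms of the intervening variables is

$$ g_I(\mathbf{y}) = g(\mathbf{x}_0) + \sum_{i=1}^{m} (y_i - y_{0i}) \left( \frac{\partial g}{\partial y_i} \right)_{\mathbf{y}_0} , \tag{6.1.3} $$

where $y_{0i} = y_i(\mathbf{x}_0)$, and the derivatives of g with respect to the $y_i$'s can be calculated
from the derivatives with respect to the $x_i$'s.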

Example 6.1.1

Figure 6.1.1 Beam example.

The beam shown in Fig. (6.1.1) has a rectangular cross section of width $b_i$ and
height $h_i$, $i = 1, 2$. The tip displacement is constrained not to exceed $w_{all}$; with
elementary beam theory this constraint can be written as

$$ g = w_{all} - \left( \frac{23}{6} \right) \frac{p l^3}{E I_1} - \left( \frac{5}{6} \right) \frac{p l^3}{E I_2} . $$

If the design variables are the width and height of each section, we can express g in
terms of these design variables as
$$ g = w_{all} - \frac{46 p l^3}{E b_1 h_1^3} - \frac{10 p l^3}{E b_2 h_2^3} . $$

This expression is a highly non-linear function of the design variables, but it can be
linearized by using the intervening variables
$$ y_1 = \frac{1}{I_1} = \frac{12}{b_1 h_1^3} \qquad \text{and} \qquad y_2 = \frac{1}{I_2} = \frac{12}{b_2 h_2^3} . $$

The constraint function can then be written as a linear function

$$ g = w_{all} - \left( \frac{23}{6} \right) \frac{p l^3}{E} y_1 - \left( \frac{5}{6} \right) \frac{p l^3}{E} y_2 . $$

•••
The cases where intervening variables can exactly linearize the constraint are
rather rare. Example (6.1.1) is typical of statically determinate structures where
such linearization is often possible. However, as shown by Mills-Curran et al. [2],
even in the case of statically indeterminate beam and frame structures, the reciprocals
of moments of inertia are good intervening variables for displacement constraints.
In many applications the intervening variables are functions of a single design
variable, that is
$$ y_i = y_i(x_i) , \qquad i = 1, \ldots, n . \tag{6.1.4} $$
In this case it is often convenient to write $g_I$, Eq. (6.1.3), in terms of the original
variables
$$ g_I(\mathbf{x}) = g(\mathbf{x}_0) + \sum_{i=1}^{n} \left[ y_i(x_i) - y_i(x_{0i}) \right] \left( \frac{\partial g / \partial x_i}{d y_i / d x_i} \right)_{\mathbf{x}_0} . \tag{6.1.5} $$
Note that while $g_I$ is a linear function of $\mathbf{y}$ it is, in general, a nonlinear function of $\mathbf{x}$.
One of the more popular intervening variables is the reciprocal of $x_i$
$$ y_i = \frac{1}{x_i} . \tag{6.1.6} $$
This popularity reflects the fact that many of the early structural optimization studies
were performed on structures consisting of truss or plane-stress elements. The design
variables in these studies were usually the cross-sectional areas of the truss elements
and the thicknesses of the plane-stress elements. For statically determinate structures
stress and displacements constraints are linear functions of the reciprocals of these
design variables. For statically indeterminate structures, using the reciprocals of the
design variables still proved to be a useful device in making the constraints more
linear (see, for example, Storaasli and Sobieszczanski [3], and Noor and Lowder [4]).
For the reciprocal approximation Eq. (6.1.5) becomes

$$ g_R(\mathbf{x}) = g(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i}) \frac{x_{0i}}{x_i} \left( \frac{\partial g}{\partial x_i} \right)_{\mathbf{x}_0} . \tag{6.1.7} $$

One of the attractive features of the reciprocal approximation, even for statically in-
determinate structures, is that it preserves the property of scaling. That is, when the
stiffness matrix is a homogeneous function of order h in the components of x, the dis-
placements are homogeneous functions of order -h in the components of x. For truss
and membrane elements, h = 1 so that the displacements are homogeneous functions
of the reciprocals of the design variables. If all the design variables are scaled by a
factor, the displacement vector is scaled by the reciprocal of that factor. Therefore
the reciprocal approximation is exact for scaling the design. Fuchs [5] has investi-
gated the importance of the homogeneity property, and Fuchs and Haj Ali [6] have
proposed a family of approximations that generalizes the reciprocal approximation
to any order of homogeneity.
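
The scaling property is easy to verify numerically. The sketch below uses a hypothetical response function $u(\mathbf{x}) = 1/x_1 + 2/(x_1 + 3x_2)$, chosen only because it is homogeneous of order $-1$ in the design variables. It compares the linear approximation, Eq. (6.1.1), and the reciprocal approximation, Eq. (6.1.7), along a ray of scaled designs. The function names and the expansion point are assumptions made for the illustration.

```python
def u(x):
    """Hypothetical response, homogeneous of order -1 in (x1, x2)."""
    x1, x2 = x
    return 1.0 / x1 + 2.0 / (x1 + 3.0 * x2)

def grad_u(x):
    x1, x2 = x
    d = (x1 + 3.0 * x2) ** 2
    return (-1.0 / x1 ** 2 - 2.0 / d, -6.0 / d)

def linear_approx(x, x0):
    """Eq. (6.1.1): first-order Taylor series about x0."""
    return u(x0) + sum((xi - x0i) * gi for xi, x0i, gi in zip(x, x0, grad_u(x0)))

def reciprocal_approx(x, x0):
    """Eq. (6.1.7): linear in the reciprocals of the design variables."""
    return u(x0) + sum((xi - x0i) * (x0i / xi) * gi
                       for xi, x0i, gi in zip(x, x0, grad_u(x0)))

x0 = (2.0, 1.0)
for scale in (0.5, 1.5, 3.0):
    x = tuple(scale * xi for xi in x0)
    print(scale, u(x), reciprocal_approx(x, x0), linear_approx(x, x0))
```

For scaled designs the reciprocal approximation reproduces the exact response, while the linear approximation does not. Away from the scaling ray both are only approximate.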
Another approximation, called the conservative approximation [7], is a hybrid
form of the linear and reciprocal approximations which is more conservative than

either. It is particularly suitable for interior and extended interior penalty function
methods (see Section 5.7), which do not tolerate constraint violations well. To obtain
the conservative approximation we start by subtracting the reciprocal approximation
from the linear approximation

$$ g_L(\mathbf{x}) - g_R(\mathbf{x}) = \sum_{i=1}^{n} \frac{(x_i - x_{0i})^2}{x_i} \left( \frac{\partial g}{\partial x_i} \right)_{\mathbf{x}_0} . \tag{6.1.8} $$

The sign of each term in the sum is determined by the sign of the ratio $(\partial g/\partial x_i)/x_i$
which is also the sign of the product $x_i (\partial g/\partial x_i)$. Contributions from design vari-
ables for which this product is negative make the reciprocal approximation larger
(more positive) than the linear approximation, and vice versa. Since the constraint
is expressed as $g(\mathbf{x}) \ge 0$, a more positive approximation is less conservative. The
conservative approximation, $g_C$, is, therefore, created by selecting for each design
variable the smaller (less positive) contribution

$$ g_C(\mathbf{x}) = g(\mathbf{x}_0) + \sum_{i=1}^{n} G_i (x_i - x_{0i}) \left( \frac{\partial g}{\partial x_i} \right)_{\mathbf{x}_0} , \tag{6.1.9} $$

where
$$ G_i = \begin{cases} 1 & \text{if } x_{0i} (\partial g/\partial x_i) \le 0 , \\ x_{0i}/x_i & \text{otherwise.} \end{cases} \tag{6.1.10} $$
Note that $G_i = 1$ corresponds to a linear approximation, and $G_i = x_{0i}/x_i$ corresponds
to a reciprocal approximation in $x_i$.
The conservative approximation is not the only hybrid linear-reciprocal approx-
imation possible. Sometimes physical considerations may dictate the use of linear
approximation for some variables and the reciprocal for others, (see Haftka and Shore
[8], and Prasad [9]). The conservative approximation, however, has the advantage of
being concave (Exercise 1). If all the constraints are approximated by the conserva-
tive approximation, the feasible domain of the approximate optimization problem is
convex (see Section 5.1.2). If we also approximate the objective function by a convex
function, the approximate optimization problem is convex. Convex problems are
guaranteed to have only a single optimum, and they are amenable to treatment by
dual methods (see Section 9.2.2). In fact, a convex approximation fc(x) to the objec-
tive function, f(x), is obtained by reversing the process for obtaining the conservative
concave approximation. That is (Exercise 1),

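$$ f_C(\mathbf{x}) = f(\mathbf{x}_0) + \sum_{i=1}^{n} F_i (x_i - x_{0i}) \left( \frac{\partial f}{\partial x_i} \right)_{\mathbf{x}_0} , \tag{6.1.11} $$

where
$$ F_i = \begin{cases} x_{0i}/x_i & \text{if } x_{0i} (\partial f/\partial x_i) \le 0 , \\ 1 & \text{otherwise.} \end{cases} \tag{6.1.12} $$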
This process of using the conservative approximation for the constraints and the
convex approximation for the objective function has been introduced by Braibant and

Fleury [10], and is known as convex linearization. In many papers and textbooks, the
constraints are posed as $g(\mathbf{x}) \le 0$ rather than $g(\mathbf{x}) \ge 0$. In this case, the conservative
approximation is convex rather than concave (that is we use the form of Eqs. (6.1.11)
and (6.1.12) also for the constraints). There are other conservative approximations
(for example, see Prasad [11] or Woo [12]), but it is important to note that the one
presented here, as well as the others, are not guaranteed to be conservative in an
absolute sense (that is, we do not know that the approximation is more conservative
than the exact constraint, $g_C(\mathbf{x}) \le g(\mathbf{x})$). The approximation presented here is only
more conservative than either the linear or the reciprocal approximation.
Higher order approximations are also used occasionally. For example, the
quadratic approximation, gQ is obtained by including the quadratic terms in the
Taylor series expansion

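$$ g_Q(\mathbf{x}) = g(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i}) \left( \frac{\partial g}{\partial x_i} \right)_{\mathbf{x}_0} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (x_i - x_{0i})(x_j - x_{0j}) \left( \frac{\partial^2 g}{\partial x_i \partial x_j} \right)_{\mathbf{x}_0} . \tag{6.1.13} $$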
The reciprocal quadratic approximation gQR is obtained by using the quadratic ap-
proximation in terms of the reciprocal design variables (Exercise 2),

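$$ \begin{aligned} g_{QR}(\mathbf{x}) = g(\mathbf{x}_0) &+ \sum_{i=1}^{n} \left( \frac{x_{0i}}{x_i} \right) \left( 2 - \frac{x_{0i}}{x_i} \right) (x_i - x_{0i}) \left( \frac{\partial g}{\partial x_i} \right)_{\mathbf{x}_0} \\ &+ \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \frac{x_{0i}}{x_i} \right) \left( \frac{x_{0j}}{x_j} \right) (x_i - x_{0i})(x_j - x_{0j}) \left( \frac{\partial^2 g}{\partial x_i \partial x_j} \right)_{\mathbf{x}_0} . \end{aligned} \tag{6.1.14} $$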

Example 6.1.2

Comparison of various approximations is demonstrated through the use of a simple
three bar truss shown in Figure 6.1.2. The horizontal force p can act either to the right
(as shown) or to the left. The truss is designed subject to stress and displacement
constraints with the design variables being the cross-sectional areas $A_A$, $A_B$, and $A_C$.
Because of the symmetry of the truss and the arbitrary direction of the horizontal
load we must have $A_A = A_C$. We examine the approximations to the constraint on
the stress in member C, which requires that stress to be less than $\sigma_0$ both in tension
and compression.

Figure 6.1.2 Three bar truss.
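The stresses in the three members can be expressed in terms of the displacement
components at the tip of the truss as

$$ \sigma_A = E(v + \sqrt{3}\,u)/4l , \qquad \sigma_B = E v / l , \qquad \text{and} \qquad \sigma_C = E(v - \sqrt{3}\,u)/4l . $$

From the horizontal equation of equilibrium

$$ \frac{\sqrt{3}}{2} (\sigma_A A_A - \sigma_C A_C) = p \qquad \text{or} \qquad \frac{3 E A_A}{4 l}\, u = p . $$
Similarly, from the vertical equation of equilibrium

$$ \frac{1}{2} (\sigma_A A_A + \sigma_C A_C) + \sigma_B A_B = 8p \qquad \text{or} \qquad \frac{E v}{l} \left( A_B + \frac{A_A}{4} \right) = 8p , $$

so that
$$ v = 8 p l / E(A_B + 0.25 A_A) , $$

and
$$ \sigma_C = p \left( -\frac{\sqrt{3}}{3 A_A} + \frac{2}{A_B + 0.25 A_A} \right) . $$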
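Assuming that member C is in tension, we may write the constraint function as

$$ g = 1 - \frac{\sigma_C}{\sigma_0} = 1 - \frac{p}{\sigma_0} \left( -\frac{\sqrt{3}}{3 A_A} + \frac{2}{A_B + 0.25 A_A} \right) . $$
We now define normalized design variables
$$ x_1 = \frac{A_A \sigma_0}{p} , \qquad x_2 = \frac{A_B \sigma_0}{p} , $$
so that
$$ g = 1 + \frac{\sqrt{3}}{3 x_1} - \frac{2}{x_2 + 0.25 x_1} . $$
We approximate g about the point $\mathbf{x}_0^T = (1, 1)$. The first derivatives are

$$ \frac{\partial g}{\partial x_1} = \left( -\frac{\sqrt{3}}{3 x_1^2} + \frac{0.5}{(x_2 + 0.25 x_1)^2} \right)_{\mathbf{x}_0} = -0.2574 , $$

$$ \frac{\partial g}{\partial x_2} = \left. \frac{2}{(x_2 + 0.25 x_1)^2} \right|_{\mathbf{x}_0} = 1.28 , $$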

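and the second derivatives are

$$ \frac{\partial^2 g}{\partial x_1^2} = \left( \frac{2\sqrt{3}}{3 x_1^3} - \frac{0.25}{(x_2 + 0.25 x_1)^3} \right)_{\mathbf{x}_0} = 1.0267 , $$

$$ \frac{\partial^2 g}{\partial x_1 \partial x_2} = \left. -\frac{1}{(x_2 + 0.25 x_1)^3} \right|_{\mathbf{x}_0} = -0.512 , $$

$$ \frac{\partial^2 g}{\partial x_2^2} = \left. -\frac{4}{(x_2 + 0.25 x_1)^3} \right|_{\mathbf{x}_0} = -2.048 . $$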
Using these derivatives and 9(Xo) = -0.0227 we can construct the following approx-
imations
9L = -0.0227 - 0.2574(XI - 1) + 1.28(X2 - 1) ,

9R = -0.0227 - 0.2574 (1 - :J + 1.28 (1 - :J = 1 + .2574/XI - 1.28/X2 ,

9c = -0.0227 - 0.2574(XI - 1) + 1.28 (1 - :J '


9Q = 9L + 0.5134(XI - 1)2 - 0.512(XI - 1)(x2 - 1) - 1.024(X2 - 1)2 ,

9QR = -0.0227 - :J :J
0.2574 (2 - :J :J (1- + 1.28 (2 - (1 -

+ 0.5134 (1 - :J :J :J - :J
2 - 0.512 (1 - (1 - 1.024 (1 - 2

All of these approximations have the correct value and correct derivatives at $\mathbf{x}_0 =
(1,1)^T$. The two quadratic approximations also have the correct second derivatives
at that point. The reciprocal approximations tend to one as the design variables
tend to infinity. This corresponds to the stress in member C tending to zero as the
cross-sectional areas tend to infinity. This correct physical behavior is not shared by
the other approximations. Table 6.1.1 compares the predictions of the five approximations
to the exact values when $x_1$ and $x_2$ vary between 0.75 and 1.25.
Table 6.1.1

  x1     x2       g        gL       gR       gC       gQ       gQR
 0.75   0.75   -0.3635  -0.2783  -0.3635  -0.3850  -0.3422  -0.3635
 1.00   0.75   -0.4227  -0.3426  -0.4493  -0.4493  -0.4066  -0.4209
 1.25   0.75   -0.4205  -0.4070  -0.5008  -0.5137  -0.4070  -0.4280
 0.75   1.00    0.0856   0.0417   0.0631   0.0417   0.0738   0.0915
 1.25   1.00   -0.0619  -0.0870  -0.0741  -0.0871  -0.0549  -0.0639
 0.75   1.25    0.3786   0.3617   0.3191   0.2977   0.3617   0.3919
 1.00   1.25    0.2440   0.2974   0.2334   0.2334   0.2334   0.2435
 1.25   1.25    0.1819   0.2330   0.1819   0.1690   0.1691   0.1819
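
The entries of Table 6.1.1 can be checked numerically. The following Python sketch is an illustration added here, not part of the original example; the function names (g_lin, g_rec, g_con, g_quad, g_quad_rec) are our own, and the derivatives are evaluated from their analytical expressions at (1, 1) rather than from the rounded values quoted above.

```python
import math

SQRT3 = math.sqrt(3.0)

def g(x1, x2):
    """Exact constraint g = 1 + sqrt(3)/(3 x1) - 2/(x2 + 0.25 x1)."""
    return 1.0 + SQRT3 / (3.0 * x1) - 2.0 / (x2 + 0.25 * x1)

# Value, first and second derivatives at the nominal point x0 = (1, 1).
G0 = g(1.0, 1.0)
D1 = -SQRT3 / 3.0 + 0.5 / 1.25**2          # dg/dx1
D2 = 2.0 / 1.25**2                          # dg/dx2
H11 = 2.0 * SQRT3 / 3.0 - 0.25 / 1.25**3    # d2g/dx1^2
H12 = -1.0 / 1.25**3                        # d2g/dx1dx2
H22 = -4.0 / 1.25**3                        # d2g/dx2^2

def g_lin(x1, x2):
    return G0 + D1 * (x1 - 1.0) + D2 * (x2 - 1.0)

def g_rec(x1, x2):
    return G0 + D1 * (1.0 - 1.0 / x1) + D2 * (1.0 - 1.0 / x2)

def g_con(x1, x2):
    # As in the example: linear in x1 (negative derivative), reciprocal in x2.
    return G0 + D1 * (x1 - 1.0) + D2 * (1.0 - 1.0 / x2)

def g_quad(x1, x2):
    a, b = x1 - 1.0, x2 - 1.0
    return g_lin(x1, x2) + 0.5 * (H11 * a * a + 2.0 * H12 * a * b + H22 * b * b)

def g_quad_rec(x1, x2):
    # Eq. (6.1.14) with x0 = (1, 1), so that x0i/xi = 1/xi.
    r1, r2 = 1.0 / x1, 1.0 / x2
    a, b = x1 - 1.0, x2 - 1.0
    first = D1 * r1 * (2.0 - r1) * a + D2 * r2 * (2.0 - r2) * b
    second = 0.5 * (H11 * r1 * r1 * a * a + 2.0 * H12 * r1 * r2 * a * b
                    + H22 * r2 * r2 * b * b)
    return G0 + first + second

if __name__ == "__main__":
    models = (g, g_lin, g_rec, g_con, g_quad, g_quad_rec)
    for x1, x2 in [(0.75, 0.75), (1.25, 0.75), (0.75, 1.25), (1.25, 1.25)]:
        row = [model(x1, x2) for model in models]
        print(x1, x2, " ".join(f"{v:8.4f}" for v in row))
```

Evaluating the two reciprocal-based functions at points of the form (α, α) is a quick way to confirm that they reproduce g when both variables are scaled by the same factor.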

The table shows that the approximations based on reciprocal variables are more
accurate than the approximations based on the actual variables, and in particular,
they are exact when the two variables are scaled by the same factor (that is, $\mathbf{x}$ is
replaced by $\alpha\mathbf{x}$ where $\alpha$ is a scalar). The quadratic approximations are substantially
more accurate than the three first-order approximations. The conservative
approximation is not guaranteed to be more conservative than the second-order
approximations, but usually, as in this example, it is. We see, however, that the price
of this extra conservativeness is that it is the least accurate approximation.

[Plot: approximation error versus t, 0 ≤ t ≤ 1, with curves for gL, gR, gC, gQ, and gQR.]

Figure 6.1.3 Comparison of constraint approximation errors.

The constraint approximations can also be used to check for errors in the deriva-
tives used to construct them. This is done by calculating the exact constraint along
a line in design space and plotting the error in the approximation along that line. A
first order approximation must have a zero slope for the error curve at the nominal
design, while a second-order approximation must also have zero curvature there. For
example, let us compare the various approximations along the line

$$x_1 = 1.25 - 0.5t, \qquad x_2 = 0.5 + 1.0t, \qquad 0 \le t \le 1 ,$$

where t = 0.5 represents the nominal design. Figure 6.1.3 shows the error as a
function of t. It is seen that the first-order approximations indeed have zero slope
at t = 0.5, while the second-order approximations also have zero curvature there.
For this example, the reciprocal approximation is quite conservative, so that the
conservative approximation is almost identical to it. • • •

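
The derivative check described above can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the procedure used to produce Figure 6.1.3: the helper names are hypothetical, the slope of the error curve is estimated by a central difference, and a deliberate sign error in ∂g/∂x1 is introduced to show how a faulty derivative reveals itself as a nonzero slope at t = 0.5.

```python
import math

SQRT3 = math.sqrt(3.0)

def g(x1, x2):
    """Exact constraint of Example 6.1.2."""
    return 1.0 + SQRT3 / (3.0 * x1) - 2.0 / (x2 + 0.25 * x1)

def linear_model(grad):
    """Linear approximation about (1, 1) built from a supplied gradient."""
    g0 = g(1.0, 1.0)
    return lambda x1, x2: g0 + grad[0] * (x1 - 1.0) + grad[1] * (x2 - 1.0)

def error_slope(model, t0=0.5, h=1e-4):
    """Central-difference slope of (model - exact) along the line at t0."""
    def err(t):
        x1, x2 = 1.25 - 0.5 * t, 0.5 + 1.0 * t
        return model(x1, x2) - g(x1, x2)
    return (err(t0 + h) - err(t0 - h)) / (2.0 * h)

good_grad = (-SQRT3 / 3.0 + 0.32, 1.28)   # analytical derivatives at (1, 1)
bad_grad = (+SQRT3 / 3.0 + 0.32, 1.28)    # deliberate sign error in dg/dx1

slope_good = error_slope(linear_model(good_grad))
slope_bad = error_slope(linear_model(bad_grad))

if __name__ == "__main__":
    print("error slope with correct derivatives:", slope_good)
    print("error slope with faulty derivative:  ", slope_bad)
```

The same idea extends to second-order approximations by differencing the error twice and checking that the curvature vanishes at the nominal design.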
The approximations covered so far are obtained by algebraically manipulating the
constraint functions. In an effort to improve the quality of the approximations recent
research efforts have concentrated on the extension of the concept of intermediate
design variables to the concept of intermediate response quantities. The concept was
introduced by Schmit and Miura [13] in 1976, but it was not applied until about ten
years later (e.g., [14]). The approach seeks intermediate response quantities that are
well approximated linearly. If the response quantities appearing in the constraint
can be calculated inexpensively from the intermediate response, then we obtain a
nonlinear approximation that is both inexpensive and accurate.
One of the most successful intermediate response approximations was proposed for
stress constraints in structural design by Vanderplaats and coworkers (e.g., [15-17]).
Vanderplaats argued that an approximation for member forces will be more accurate
than the corresponding approximation for member stresses. This is expected because
member forces change more slowly than member stresses when cross-sectional areas
are changed. In particular, for a statically determinate truss, force in each of the
members is constant, while member stresses are inversely proportional to member ar-
eas. This motivates the use of the member forces as intermediate response quantities.
Consider, for example, a typical stress constraint for a truss member of the form

$$g_i = 1 - \frac{\sigma_i}{\sigma_{all}} \ge 0 . \tag{6.1.15}$$

A common approximation for member stresses uses the reciprocal design variables,
$x_i = 1/A_i$, where $A_i$ is the cross-sectional area of the ith member. Using a linear
approximation for the member forces, and then dividing by the cross-sectional area
to obtain an approximation to the stress, as suggested by Vanderplaats, we obtain a
constraint of the form

$$g_{LFi} = A_i - \frac{F_i(\mathbf{A}_0) + \nabla^T F_i(\mathbf{A}_0)(\mathbf{A} - \mathbf{A}_0)}{\sigma_{all}} \ge 0 . \tag{6.1.16}$$

This is linear in the cross-sectional area design variables. Note that for a statically
determinate truss, where the gradient of the member forces with respect to the
cross-sectional areas is zero, the force term in Eq. (6.1.16) is a constant, so the
approximation is exact. Equation (6.1.16) has the dimension of area, and it should be
nondimensionalized by dividing it by a reference area. A comparison of the performance
of this linear force approximation with other approximations is given in Section 6.4.

6.1.2 Global and Midrange Approximations

The most common global approximation is the response surface approach. With
this approach the function is sampled at a number of points, and then an analytical
expression called the response surface (typically a polynomial) is fitted to the data.
Construction of response surface often relies heavily on the theory of experiments [18]
and is an iterative process that begins with the assumption of the analytical form
of the response surface, for example, a quadratic polynomial. The approximation
contains a number of unknown parameters (such as polynomial coefficients) that
must be adjusted to match the function to be approximated. To do so, analyses are
performed at a number of carefully selected design points, and a least square solution
is typically used to extract the parameter values from the analysis results. Then the
approximate model (the response surface) is used to predict the function at a number
of selected test points, and statistical measures are used to assess the goodness-of-fit,

or the accuracy of the response surface. If the fit is not satisfactory, the process is
restarted, and further experiments are made, or the postulated model is improved by
removing and/or adding terms.
Response surface techniques have not been used extensively in structural optimization
(see Barthelemy and Haftka [19] for applications). This may be due to the
fact that the technique is practical only for problems with a small number of design
variables (fewer than 20). The number of analyses required to construct the response
surface increases dramatically with the number of design variables.

Example 6.1.3

To demonstrate the use of response surfaces we fit a linear response surface to the
stress constraint of Example 6.1.2,

$$g = 1 + \frac{\sqrt{3}}{3x_1} - \frac{2}{x_2 + 0.25x_1} .$$

The response surface is assumed to be a linear polynomial

$$g_{rs} = a + bx_1 + cx_2 . \tag{a}$$

We assume that the design space is

$$0.5 \le x_1 \le 1.5, \qquad 0.5 \le x_2 \le 1.5 .$$

To find a, b, and c we need to evaluate g at 3 or more points. For robustness we use
more points, so we select the following 4 points:

$$\mathbf{x}_1^T = (0.5,0.5), \quad \mathbf{x}_2^T = (1.5,0.5), \quad \mathbf{x}_3^T = (0.5,1.5), \quad \mathbf{x}_4^T = (1.5,1.5) .$$

Substituting each of these points into Eq. (a) we get 4 equations

$$\begin{bmatrix} 1 & 0.5 & 0.5 \\ 1 & 1.5 & 0.5 \\ 1 & 0.5 & 1.5 \\ 1 & 1.5 & 1.5 \end{bmatrix} \begin{Bmatrix} a \\ b \\ c \end{Bmatrix} = \begin{Bmatrix} -1.0453 \\ -0.9008 \\ 0.9239 \\ 0.3182 \end{Bmatrix} .$$

To get a least-square solution of these 4 equations in 3 unknowns, we multiply both
sides by the transpose of the coefficient matrix and solve the resulting 3 × 3 system.
We obtain a = -1.5395, b = -0.2306, c = 1.5941, or

$$g_{rs} = -1.5395 - 0.2306x_1 + 1.5941x_2 .$$

We compare this with the linear approximation about (1,1) that we found in Example
6.1.2

$$g_L = -0.0227 - 0.2574(x_1 - 1) + 1.28(x_2 - 1) .$$

As expected, $g_L$ is more accurate near (1,1), and $g_{rs}$ further away. For example, at
(0.75,0.75) we get g = -0.3635, $g_L$ = -0.2783, $g_{rs}$ = -0.5169, while at (0.5,0.5) we
get g = -1.0453, $g_L$ = -0.5340, $g_{rs}$ = -0.8578. • • •
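
The least-squares fit of this example can be reproduced with a short Python sketch. It is an illustration only: the small Gaussian-elimination routine `solve` is our own stand-in for whatever linear solver is available, and the normal equations are formed explicitly, as described above, even though this is not the numerically preferred route for larger problems.

```python
import math

def g(x1, x2):
    """Stress constraint of Example 6.1.2."""
    return 1.0 + math.sqrt(3.0) / (3.0 * x1) - 2.0 / (x2 + 0.25 * x1)

def solve(mat, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5)]
rows = [[1.0, x1, x2] for x1, x2 in points]      # coefficient matrix of Eq. (a)
vals = [g(x1, x2) for x1, x2 in points]          # sampled constraint values

# Normal equations: (A^T A) {a, b, c} = A^T g
ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
atg = [sum(r[i] * v for r, v in zip(rows, vals)) for i in range(3)]
a, b, c = solve(ata, atg)

def g_rs(x1, x2):
    """Fitted linear response surface."""
    return a + b * x1 + c * x2

if __name__ == "__main__":
    print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.4f}")
    print(f"g_rs(0.75, 0.75) = {g_rs(0.75, 0.75):.4f}")
```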

In response surface techniques the design space is sampled ahead of the opti-
mization process. However, because the optimization process requires the calculation
of constraints and their derivatives at more than one point, it makes sense to use
the information from previous calculations to construct wide ranging approximations
rather than approximations based on information at a single point. This leads to the
concept of multipoint approximations that qualify for the label midrange approxima-
tions. Haftka et al. [20] examined approximations based on two and three points.
Their experience was that the approximation worked well when it represented inter-
polation (for example, at points inside the triangle formed by three data points in
a three-point approximation), but gave only marginal improvement in accuracy for
extrapolation.
A two-point approximation that shows more promise was proposed by Fadel et
al. [21]. The approximation is a linear approximation in the variables $y_i = x_i^{p_i}$, where
the exponents are selected to match the data. We start by constructing a linear
approximation in $y_i$ at the first point $\mathbf{x}_0$. The approximation may be written in terms
of the original variables as

$$g_{tp} = g(\mathbf{x}_0) + \sum_{i=1}^{n} \left[\left(\frac{x_i}{x_{0i}}\right)^{p_i} - 1\right] \frac{x_{0i}}{p_i} \left(\frac{\partial g}{\partial x_i}\right)_{\mathbf{x}_0} . \tag{6.1.17}$$

Then the exponents $p_i$ are found from the condition that the derivatives of g match
those of $g_{tp}$ at a second point, $\mathbf{x}_1$. It is easy to show that this leads to

$$p_i = 1 + \frac{\log\left\{\left(\dfrac{\partial g}{\partial x_i}\right)_{\mathbf{x}_1} \Big/ \left(\dfrac{\partial g}{\partial x_i}\right)_{\mathbf{x}_0}\right\}}{\log(x_{1i}/x_{0i})} . \tag{6.1.18}$$

When $p_i$ is larger in magnitude than 1 it is set to sign($p_i$) so as to avoid large
exponents. Special provisions need to be made when the ratios in the numerator or
denominator in Eq. (6.1.18) are negative or if $p_i$ is zero. In the first case $p_i$ is taken
to be 1, while in the second case it can be shown by the use of Taylor series expansion
that

$$\lim_{p_i \to 0} \frac{1}{p_i}\left[\left(\frac{x_i}{x_{0i}}\right)^{p_i} - 1\right] = \log\left(\frac{x_i}{x_{0i}}\right) . \tag{6.1.19}$$
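
A minimal Python sketch of the two-point approximation follows. It is our own illustration of Eqs. (6.1.17) to (6.1.19), not code from [21]: the function names are hypothetical, and the guards for a zero derivative or coincident coordinates (where the logarithms are undefined) are assumptions added so that the sketch falls back to the linear approximation. The test function, g = 3/x1 + √x2, is chosen because it is a sum of powers with exponents -1 and 0.5, so the approximation should reproduce it.

```python
import math

def exponents(x0, x1, grad0, grad1):
    """Exponents p_i of Eq. (6.1.18), with the safeguards described in the text."""
    p = []
    for a, b, d0, d1 in zip(x0, x1, grad0, grad1):
        # Assumed fallback: use p_i = 1 whenever a logarithm is undefined.
        if d0 == 0.0 or a == b or d1 / d0 <= 0.0 or b / a <= 0.0:
            p.append(1.0)
            continue
        pi = 1.0 + math.log(d1 / d0) / math.log(b / a)
        if abs(pi) > 1.0:
            pi = math.copysign(1.0, pi)      # limit the exponent to +1 or -1
        p.append(pi)
    return p

def g_tp(x, x0, g0, grad0, p):
    """Two-point approximation of Eq. (6.1.17)."""
    total = g0
    for xi, x0i, d0, pi in zip(x, x0, grad0, p):
        if pi == 0.0:
            term = math.log(xi / x0i)        # limit of Eq. (6.1.19)
        else:
            term = ((xi / x0i) ** pi - 1.0) / pi
        total += term * x0i * d0
    return total

def g_demo(x):
    return 3.0 / x[0] + math.sqrt(x[1])

def grad_demo(x):
    return (-3.0 / x[0] ** 2, 0.5 / math.sqrt(x[1]))

x_first, x_second = (1.0, 1.0), (2.0, 4.0)
p = exponents(x_first, x_second, grad_demo(x_first), grad_demo(x_second))
approx = g_tp((1.5, 2.25), x_first, g_demo(x_first), grad_demo(x_first), p)

if __name__ == "__main__":
    print("exponents:", p)
    print("approximation:", approx, "exact:", g_demo((1.5, 2.25)))
```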
Another midrange approximation is the scaling or local-global approximation [22].
It is intended to improve a global approximation, available from a response surface
approach or from a simpler model of the problem, by injecting some local information
into it. The simplest approach for doing that is to use a scale factor based on the
value of the function at a point $\mathbf{x}_0$. That is, the scale factor $s_c$ is given as

$$s_c(\mathbf{x}) = g(\mathbf{x})/g_c(\mathbf{x}) , \tag{6.1.20}$$

where $g_c$ is the global approximation. Then the scaled global approximation, $g_{sc}$, is
given as

$$g_{sc}(\mathbf{x}) = s_c(\mathbf{x}_0)\, g_c(\mathbf{x}) . \tag{6.1.21}$$

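An improvement on this scale factor can be obtained by using the derivative of
g to construct a linear scale factor $s_{cl}$ given as

$$s_{cl}(\mathbf{x}) = s_c(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i}) \left(\frac{\partial s_c}{\partial x_i}\right)_{\mathbf{x}_0} , \tag{6.1.22}$$

where the derivative of the scale factor is

$$\frac{\partial s_c}{\partial x_i} = \frac{1}{g_c}\left(\frac{\partial g}{\partial x_i} - s_c \frac{\partial g_c}{\partial x_i}\right) . \tag{6.1.23}$$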

The local-global approximation was applied by Chang et al. [23] for approximating
displacements, stresses and frequencies of a supersonic wing structure obtained by a
finite element model. The global approximation used was a plate model of the wing.

6.2 Fast Reanalysis Techniques

Fast reanalysis techniques take advantage of the computations performed at one
design point to reduce the computational cost of the analysis at another design point.
They are often approximate in nature, working well when the latter design point is
close to the former. In this section we assume that the exact structural response is
available at a design point $\mathbf{x}_0$, and that we want to calculate the effect of a small to
moderate perturbation $\Delta\mathbf{x}$ on the response. We will denote the structural properties
and response at $\mathbf{x}_0$ by a subscript zero, and the perturbations in properties and
response by $\Delta$. For example, $\mathbf{u}_0 = \mathbf{u}(\mathbf{x}_0)$ denotes the displacement field for the
nominal design, and $\mathbf{u}_0 + \Delta\mathbf{u} = \mathbf{u}(\mathbf{x}_0 + \Delta\mathbf{x})$ denotes the displacement field for the
perturbed design.

6.2.1 Linear Static Response

The discrete equations of equilibrium for linear static response (obtained, for example,
from a finite element analysis) at a design point Xo are

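$$\mathbf{K}_0\mathbf{u}_0 = \mathbf{f}_0 , \tag{6.2.1}$$

where $\mathbf{K}_0$, $\mathbf{u}_0$ and $\mathbf{f}_0$ are the stiffness matrix, the displacement vector and the load
vector at $\mathbf{x}_0$, respectively. Consider now a change $\Delta\mathbf{x}$ in the design which results in
a change $\Delta\mathbf{K}$ in the stiffness matrix, and $\Delta\mathbf{f}$ in the load vector. The equations of
equilibrium at $\mathbf{x}_0 + \Delta\mathbf{x}$ are

$$(\mathbf{K}_0 + \Delta\mathbf{K})(\mathbf{u}_0 + \Delta\mathbf{u}) = \mathbf{f}_0 + \Delta\mathbf{f} . \tag{6.2.2}$$

Subtracting Eq. (6.2.1) from Eq. (6.2.2) we obtain

$$(\mathbf{K}_0 + \Delta\mathbf{K})\Delta\mathbf{u} = \Delta\mathbf{f} - \Delta\mathbf{K}\mathbf{u}_0 , \tag{6.2.3}$$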

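and we can obtain a first approximation $\Delta\mathbf{u}_1$ to $\Delta\mathbf{u}$ by neglecting the $\Delta\mathbf{K}\Delta\mathbf{u}$ term

$$\mathbf{K}_0\Delta\mathbf{u}_1 = \Delta\mathbf{f} - \Delta\mathbf{K}\mathbf{u}_0 . \tag{6.2.4}$$

This approximation will be quite good when $\Delta\mathbf{x}$ is small in magnitude. Furthermore,
usually we have $\mathbf{K}_0$ factored or inverted in the solution for $\mathbf{u}_0$. Therefore, it is
relatively inexpensive to solve Eq. (6.2.4). When $\Delta\mathbf{K}$ is a linear function of $\mathbf{x}$ the
approximation is in fact identical with the linear approximation of $\mathbf{u}$ based on the
Taylor series. We can further improve the approximation by repeating the same
process to obtain higher-order approximations to $\Delta\mathbf{u}$. Subtracting Eq. (6.2.4) from
Eq. (6.2.3) we get

$$(\mathbf{K}_0 + \Delta\mathbf{K})(\Delta\mathbf{u} - \Delta\mathbf{u}_1) = -\Delta\mathbf{K}\Delta\mathbf{u}_1 , \tag{6.2.5}$$

and again we can neglect the $\Delta\mathbf{K}(\Delta\mathbf{u} - \Delta\mathbf{u}_1)$ term on the left hand side of the equation
to get an approximation $\Delta\mathbf{u}_2$ to $\Delta\mathbf{u} - \Delta\mathbf{u}_1$ by solving

$$\mathbf{K}_0\Delta\mathbf{u}_2 = -\Delta\mathbf{K}\Delta\mathbf{u}_1 . \tag{6.2.6}$$

The process can be continued indefinitely to obtain

$$\Delta\mathbf{u} = \sum_{j=1}^{\infty} \Delta\mathbf{u}_j , \tag{6.2.7}$$

where the terms $\Delta\mathbf{u}_j$ in the series are obtained through the iterative process of solving

$$\mathbf{K}_0\Delta\mathbf{u}_j = -\Delta\mathbf{K}\Delta\mathbf{u}_{j-1} . \tag{6.2.8}$$

Of course, the series is not guaranteed to converge, especially when $\Delta\mathbf{x}$ is not small.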
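
The iterative process can be sketched as follows. This Python fragment is an illustration under our own assumptions: a 2 × 2 system is solved by Cramer's rule as a stand-in for reusing a stored factorization of K0, and the numbers are the nondimensional stiffness and displacement of the three bar truss of Example 6.1.2 with the area of member B increased by 25 percent.

```python
def solve2(k, f):
    """Solve a 2x2 system by Cramer's rule (stand-in for a stored factorization)."""
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    return [(f[0] * k[1][1] - k[0][1] * f[1]) / det,
            (k[0][0] * f[1] - k[1][0] * f[0]) / det]

def matvec(k, u):
    return [k[0][0] * u[0] + k[0][1] * u[1], k[1][0] * u[0] + k[1][1] * u[1]]

def reanalysis_series(k0, dk, u0, df, n_terms):
    """Partial sum of the series: K0 du_1 = df - dK u0, K0 du_j = -dK du_(j-1)."""
    rhs = [a - b for a, b in zip(df, matvec(dk, u0))]
    du_j = solve2(k0, rhs)
    total = du_j[:]
    for _ in range(n_terms - 1):
        du_j = solve2(k0, [-v for v in matvec(dk, du_j)])
        total = [t + d for t, d in zip(total, du_j)]
    return total

K0 = [[0.75, 0.0], [0.0, 1.25]]
DK = [[0.0, 0.0], [0.0, 0.25]]
U0 = [4.0 / 3.0, 6.4]

first = reanalysis_series(K0, DK, U0, [0.0, 0.0], 1)
ten = reanalysis_series(K0, DK, U0, [0.0, 0.0], 10)

if __name__ == "__main__":
    print("one term: ", first)
    print("ten terms:", ten)   # exact change in v is 8/1.5 - 6.4
```

For this perturbation each term is -0.2 times the previous one, so the series converges; a larger change in stiffness would slow the convergence or destroy it.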
Another approach for improving on $\Delta\mathbf{u}_1$ was suggested by Kirsch and Taye [24].
Their idea is that changes in the structure can be divided into overall scaling and
redistribution of material. That is, we write the perturbed stiffness matrix as

$$\mathbf{K}_0 + \Delta\mathbf{K} = s\mathbf{K}_0 + \Delta\mathbf{K}_s , \tag{6.2.9}$$

where s is a scaling factor. Overall scaling can be dealt with in a simple manner, so
that we need to analyse only the redistribution part. We choose s so as to minimize
$\Delta\mathbf{K}_s$. That is, s is chosen so that $s\mathbf{K}_0$ is as close as possible to $\mathbf{K}_0 + \Delta\mathbf{K}$. Kirsch and
Taye suggested minimizing the sum of the squares of the elements of $\Delta\mathbf{K}_s$. Then it
can be shown (Exercise 7) that s is given as

$$s = 1 + \frac{\sum_{i,j} k_{0ij}\,\Delta k_{ij}}{\sum_{i,j} k_{0ij}^2} . \tag{6.2.10}$$

Now we consider our nominal design to be the one with the matrix $s\mathbf{K}_0$ instead of
$\mathbf{K}_0$. For this design the displacement field is

$$\mathbf{u}_s = (1/s)\mathbf{u}_0 . \tag{6.2.11}$$
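We consider only the case where there is no change in the force, $\Delta\mathbf{f} = 0$. Then Eq.
(6.2.4) for this scaled design is

$$s\mathbf{K}_0\Delta\mathbf{u}_{s1} = -\Delta\mathbf{K}_s\mathbf{u}_s = -[\Delta\mathbf{K} - (s - 1)\mathbf{K}_0]\mathbf{u}_0/s , \tag{6.2.12}$$

where we used Eq. (6.2.9). Comparing this equation to Eq. (6.2.4) we get

$$\Delta\mathbf{u}_{s1} = \frac{1}{s^2}\Delta\mathbf{u}_1 + \frac{s - 1}{s^2}\mathbf{u}_0 .$$

The total change in $\mathbf{u}$ predicted by this approach, $\Delta\mathbf{u}_s$, is

$$\Delta\mathbf{u}_s = \mathbf{u}_s - \mathbf{u}_0 + \Delta\mathbf{u}_{s1} = \left(\frac{1}{s} - 1\right)\mathbf{u}_0 + \frac{1}{s^2}\Delta\mathbf{u}_1 + \frac{s - 1}{s^2}\mathbf{u}_0 = \frac{1}{s^2}\Delta\mathbf{u}_1 - \frac{(1 - s)^2}{s^2}\mathbf{u}_0 . \tag{6.2.13}$$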

Example 6.2.1

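Apply a first term correction, without and with scaling, to approximate the stress
constraint in member C of Example 6.1.2 when the area of member B is increased by
25 percent ($x_1 = 1$, and $x_2$ is increased from 1 to 1.25, in terms of the nondimensional
areas defined in Example 6.1.2).
The stiffness matrix for the three-bar truss is easily verified to be

$$\mathbf{K} = \frac{E}{l}\begin{bmatrix} 0.75A_A & 0 \\ 0 & A_B + 0.25A_A \end{bmatrix} = \frac{pE}{l\sigma_0}\begin{bmatrix} 0.75x_1 & 0 \\ 0 & x_2 + 0.25x_1 \end{bmatrix} ,$$

so that

$$\mathbf{K}_0 = \frac{pE}{l\sigma_0}\begin{bmatrix} 0.75 & 0 \\ 0 & 1.25 \end{bmatrix} , \qquad \Delta\mathbf{K} = \frac{pE}{l\sigma_0}\begin{bmatrix} 0 & 0 \\ 0 & 0.25 \end{bmatrix} .$$

Also, from Example 6.1.2 we have

$$\mathbf{u}_0 = \frac{pl}{E}\begin{Bmatrix} 4/(3A_A) \\ 8/(A_B + 0.25A_A) \end{Bmatrix} = \frac{l\sigma_0}{E}\begin{Bmatrix} 1.333 \\ 6.400 \end{Bmatrix} .$$

With $\Delta\mathbf{f} = 0$, Eq. (6.2.4) yields

$$\Delta\mathbf{u}_1 = -\mathbf{K}_0^{-1}\Delta\mathbf{K}\mathbf{u}_0 = \frac{l\sigma_0}{E}\begin{Bmatrix} 0 \\ -1.28 \end{Bmatrix} .$$

From Example 6.1.2 we also had

$$\sigma_C = E(v - \sqrt{3}\,u)/4l, \qquad \text{and} \qquad g = 1 - \sigma_C/\sigma_0 ,$$

so that

$$\Delta g = -\frac{E}{4l\sigma_0}(\Delta v - \sqrt{3}\,\Delta u) . \tag{a}$$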

Substituting the components of $\Delta\mathbf{u}_1$ we get

$$\Delta g = -\frac{E}{4l\sigma_0}\,\frac{l\sigma_0}{E}(-1.28) = 0.32 .$$

Since for the nominal design g = -0.0227, for the perturbed design g is predicted to
be

$$g \approx g_0 + \Delta g = -0.0227 + 0.32 = 0.2973 ,$$

which, as expected, is the same as the linear approximation (see Table 6.1.1). For
the scaled approximation we use Eq. (6.2.10), to get

$$s = 1 + \frac{0.25 \times 1.25}{0.75^2 + 1.25^2} = 1.147 .$$

Equation (6.2.13) becomes

$$\Delta\mathbf{u}_s = \frac{1}{1.147^2}\,\frac{l\sigma_0}{E}\begin{Bmatrix} 0 \\ -1.28 \end{Bmatrix} - \frac{0.147^2}{1.147^2}\,\frac{l\sigma_0}{E}\begin{Bmatrix} 1.333 \\ 6.400 \end{Bmatrix} = \frac{l\sigma_0}{E}\begin{Bmatrix} -0.0219 \\ -1.0780 \end{Bmatrix} .$$

Substituting into Eq. (a) we get

$$\Delta g_s = -0.25(-1.078 + \sqrt{3} \times 0.0219) = 0.2600 ,$$

and

$$g_s = g_0 + \Delta g_s = -0.0227 + 0.2600 = 0.2373 .$$

This approximation is substantially closer to the exact result (see Table 6.1.1) of
g = 0.2440. • • •

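
The steps of this example can be retraced with a short Python sketch. It is an illustration only, using our own function names, and it exploits the fact that the stiffness matrices here are diagonal so that no linear solver is needed; stiffness is measured in units of pE/(lσ0) and displacement in units of lσ0/E.

```python
import math

K0 = [[0.75, 0.0], [0.0, 1.25]]      # nominal stiffness, diagonal in this example
DK = [[0.0, 0.0], [0.0, 0.25]]       # change due to a 25 percent increase in x2
U0 = [4.0 / 3.0, 6.4]                # nominal displacements (u, v)

def scale_factor(k0, dk):
    """Eq. (6.2.10): s = 1 + sum(k0_ij * dk_ij) / sum(k0_ij ** 2)."""
    num = sum(a * b for ra, rb in zip(k0, dk) for a, b in zip(ra, rb))
    den = sum(a * a for ra in k0 for a in ra)
    return 1.0 + num / den

def constraint(u):
    """g = 1 - sigma_C/sigma_0 with sigma_C = E (v - sqrt(3) u) / (4 l)."""
    return 1.0 - 0.25 * (u[1] - math.sqrt(3.0) * u[0])

# First-term correction of Eq. (6.2.4) with no load change (diagonal shortcut).
du1 = [-(DK[i][i] * U0[i]) / K0[i][i] for i in range(2)]

s = scale_factor(K0, DK)
# Eq. (6.2.13): du_s = du1 / s^2 - ((1 - s)^2 / s^2) u0
du_s = [d / s**2 - (1.0 - s) ** 2 / s**2 * u for d, u in zip(du1, U0)]

g_first = constraint([u + d for u, d in zip(U0, du1)])
g_scaled = constraint([u + d for u, d in zip(U0, du_s)])
g_exact = constraint([4.0 / 3.0, 8.0 / 1.5])   # exact displacements for x2 = 1.25

if __name__ == "__main__":
    print(f"s = {s:.4f}")
    print(f"first-term g = {g_first:.4f}, scaled g = {g_scaled:.4f}, exact g = {g_exact:.4f}")
```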
It is well known (e.g., Haley [25]) that when the matrix of a system of equations
is modified by adding a matrix of low rank it is relatively inexpensive to find the
effect on the solution of the system. The computational effort is roughly equal to
finding r solutions to the original system, where r is the rank of the modification
matrix. When r is small, and the order of the system of equations is large, finding
r solutions of the original system is much cheaper than a new factorization of the
modified system.
This situation often occurs when we modify a small part of a structure. For
example, when the stiffness of a single truss member is modified, the modification
matrix is of rank one, and the solution can be found by a single solution of the original
problem. Furthermore, it can be shown [25] that once this single solution has been found,
the exact solution is available for an arbitrary magnitude of the modification. Fuchs
and Steinberg [26] showed that this single solution is the same one needed for obtaining
the derivative of the displacement with respect to the change in stiffness. Thus
they were able to derive an approximation to the displacement field which is exact
if a single truss member is modified. Similarly, Holnicki-Szulc [27] has developed
a method, based on virtual distortions, which permits arbitrary modifications of r
members of a truss at the cost of r displacement solutions for the original truss. These
approaches are particularly useful for optimization, because once the displacement
solutions have been obtained, the truss elements can be modified again and again
with very little additional computational cost. For finite elements with higher-rank
stiffness matrices, the same approach is still applicable, but the advantages tend to
be diminished.
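
The rank-one case can be illustrated with the Sherman-Morrison identity, a standard result that we use here as a stand-in; it is not necessarily the formulation given in [25] or [26]. In the Python sketch below the 3 × 3 matrix, the load vector, and the vector w (the pattern of a bar joining two degrees of freedom) are made-up data, and `solve` is our own small elimination routine. One extra solution z of the original system gives the exact modified displacements for any magnitude α of the modification.

```python
def solve(mat, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rank_one_solution(u0, z, w, alpha):
    """Solve (K0 + alpha w w^T) u = f given u0 = K0^-1 f and z = K0^-1 w."""
    factor = alpha * dot(w, u0) / (1.0 + alpha * dot(w, z))
    return [ui - factor * zi for ui, zi in zip(u0, z)]

K0 = [[4.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 2.0]]
f = [1.0, 2.0, 3.0]
w = [1.0, -1.0, 0.0]

u0 = solve(K0, f)      # nominal solution
z = solve(K0, w)       # the single additional solution of the original system

def direct(alpha):
    """Reference: assemble the modified matrix and solve from scratch."""
    k = [[K0[i][j] + alpha * w[i] * w[j] for j in range(3)] for i in range(3)]
    return solve(k, f)

if __name__ == "__main__":
    for alpha in (0.5, 5.0, 50.0):
        print(alpha, rank_one_solution(u0, z, w, alpha), direct(alpha))
```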
6.2.2 Eigenvalue Problems

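Vibration or buckling response is typically modeled as a symmetric eigenvalue problem.
At the nominal design the vibration eigenproblem may be written as

$$\mathbf{K}_0\mathbf{u}_0 - \mu_0\mathbf{M}_0\mathbf{u}_0 = 0 , \tag{6.2.14}$$

where $\mathbf{K}_0$ and $\mathbf{M}_0$ are the stiffness and mass matrices, respectively, and $\mu_0$ and $\mathbf{u}_0$ are
the eigenvalue (square of frequency) and eigenvector (vibration mode), respectively,
all evaluated at a nominal design point $\mathbf{x}_0$. When $\mu_0$ is a nonrepeated eigenvalue,
the effect of perturbing the design can be easily estimated. Rewriting the eigenvalue
problem at $\mathbf{x}_0 + \Delta\mathbf{x}$ we have

$$(\mathbf{K}_0 + \Delta\mathbf{K})(\mathbf{u}_0 + \Delta\mathbf{u}) - (\mu_0 + \Delta\mu)(\mathbf{M}_0 + \Delta\mathbf{M})(\mathbf{u}_0 + \Delta\mathbf{u}) = 0 . \tag{6.2.15}$$

We subtract Eq. (6.2.14) from Eq. (6.2.15) and neglect quadratic and cubic terms
in the perturbation such as $\Delta\mathbf{K}\Delta\mathbf{u}$ to get

$$(\mathbf{K}_0 - \mu_0\mathbf{M}_0)\Delta\mathbf{u} + (\Delta\mathbf{K} - \mu_0\Delta\mathbf{M})\mathbf{u}_0 - \Delta\mu\,\mathbf{M}_0\mathbf{u}_0 \approx 0 . \tag{6.2.16}$$

Premultiplying by $\mathbf{u}_0^T$ and using Eq. (6.2.14) and the symmetry of $\mathbf{K}_0$ and of $\mathbf{M}_0$ we
get

$$\Delta\mu \approx \frac{\mathbf{u}_0^T(\Delta\mathbf{K} - \mu_0\Delta\mathbf{M})\mathbf{u}_0}{\mathbf{u}_0^T\mathbf{M}_0\mathbf{u}_0} . \tag{6.2.17}$$

Alternatively, we can premultiply Eq. (6.2.15) by $(\mathbf{u}_0 + \Delta\mathbf{u})^T$ and neglect some higher
order terms in the perturbation to get

$$\mu_0 + \Delta\mu \approx \frac{\mathbf{u}_0^T(\mathbf{K}_0 + \Delta\mathbf{K})\mathbf{u}_0}{\mathbf{u}_0^T(\mathbf{M}_0 + \Delta\mathbf{M})\mathbf{u}_0} . \tag{6.2.18}$$

Equations (6.2.17) and (6.2.18) have been obtained by neglecting quadratic and
cubic terms, and it can be shown that their errors (which are not the same) are
proportional to the square of the perturbation in the design $\Delta\mathbf{x}$, or that they are first
order approximations.
Another first-order approximation was suggested by Pritchard and Adelman [28].
It is based on integrating the derivative of the eigenvalue $\mu$ with respect to a design
variable x. Equation (7.3.5) for the eigenvalue derivative may be written as

$$\frac{d\mu}{dx} = a - \mu b , \tag{6.2.19}$$

where

$$a = \frac{\mathbf{u}^T \dfrac{d\mathbf{K}}{dx} \mathbf{u}}{\mathbf{u}^T\mathbf{M}\mathbf{u}} , \qquad \text{and} \qquad b = \frac{\mathbf{u}^T \dfrac{d\mathbf{M}}{dx} \mathbf{u}}{\mathbf{u}^T\mathbf{M}\mathbf{u}} . \tag{6.2.20}$$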
uTMu' uTMu

Assuming that a and b do not change and $b \ne 0$, we obtain the solution of the
differential equation as a function of the design variable x as

$$\mu = \frac{a}{b} + \left(\mu_0 - \frac{a}{b}\right)e^{-b(x - x_0)} . \tag{6.2.21}$$

As $b \to 0$ Eq. (6.2.21) tends to the standard linear approximation. When several
variables are changed simultaneously, x can be taken to be the distance along the
path from $\mathbf{x}_0$ to $\mathbf{x}_0 + \Delta\mathbf{x}$ (see [28]). This approximation is called the DEB (Differential
Equation Based) approximation in [28].
The first order approximation of Equation (6.2.18) is the Rayleigh quotient
approximation to the perturbed eigenvalue based on the nominal eigenvector $\mathbf{u}_0$. If we
can calculate a linear approximation $\mathbf{u}_L$ to the eigenvector (e.g., using first derivative
information), then we can use Rayleigh's quotient to get a superior approximation to
the perturbed eigenvalue, namely

$$\mu_0 + \Delta\mu \approx \frac{\mathbf{u}_L^T(\mathbf{K}_0 + \Delta\mathbf{K})\mathbf{u}_L}{\mathbf{u}_L^T(\mathbf{M}_0 + \Delta\mathbf{M})\mathbf{u}_L} . \tag{6.2.22}$$

This time the error in Eq. (6.2.22) is proportional to $\|\Delta\mathbf{x}\|^4$, see Murthy and Haftka
[29], so that Eq. (6.2.22) is a third-order approximation.

Example 6.2.2

[Sketch of a system with two springs k and two masses m.]

Figure 6.2.1 Mass-spring system.

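Consider the two-degrees-of-freedom system shown in Fig. 6.2.1. Estimate the
effect on the lowest frequency caused by doubling the left mass. The stiffness and
mass matrices for this system are

$$\mathbf{K}_0 = k\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} , \qquad \mathbf{M}_0 = m\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} .$$

The lowest eigenvalue and corresponding eigenvector are

$$\mu_0 = 0.382k/m , \qquad \mathbf{u}_0^T = (1, 1.618) .$$
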
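For the perturbed system there is no change in the stiffness matrix, and

$$\mathbf{M}_0 + \Delta\mathbf{M} = m\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \qquad \text{or} \qquad \Delta\mathbf{M} = m\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} .$$

From Eq. (6.2.17) we get

$$\Delta\mu \approx \frac{[1 \;\; 1.618]\,(-0.382k/m)\,m\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{Bmatrix} 1 \\ 1.618 \end{Bmatrix}}{[1 \;\; 1.618]\,m\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ 1.618 \end{Bmatrix}} = -0.106k/m ,$$

or

$$\mu_0 + \Delta\mu \approx 0.276k/m .$$

Similarly, from Eq. (6.2.18) we get

$$\mu_0 + \Delta\mu \approx \frac{[1 \;\; 1.618]\,k\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ 1.618 \end{Bmatrix}}{[1 \;\; 1.618]\,m\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ 1.618 \end{Bmatrix}} = 0.299k/m .$$

We now consider the DEB approximation of Eq. (6.2.21) with x being the change
in left mass

$$b = \frac{[1 \;\; 1.618]\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{Bmatrix} 1 \\ 1.618 \end{Bmatrix}}{[1 \;\; 1.618]\,m\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} 1 \\ 1.618 \end{Bmatrix}} = \frac{0.276}{m} , \qquad \text{and} \qquad a = 0 .$$

For the nominal design x = 0, and for the perturbed design x = m so that

$$\mu = 0.382(k/m)e^{-0.276} = 0.290(k/m) .$$

The exact result is

$$\mu_0 + \Delta\mu = 0.293k/m .$$

We see that the errors associated with the three first-order approximations, 5.8%,
2.0%, and 1.0%, are small compared to the 30.4% difference between the nominal
(0.382k/m) and perturbed (0.293k/m) eigenvalues. • • •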
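
The numbers of this example can be verified with the following Python sketch. It is an illustration only: k and m are set to one so that eigenvalues are reported in units of k/m, the helper names are our own, and the exact eigenvalues of the 2 × 2 problem are obtained from the characteristic quadratic rather than from a general eigensolver.

```python
import math

k = m = 1.0                          # eigenvalues are reported in units of k/m
K0 = [[2.0 * k, -k], [-k, k]]
M0 = [[m, 0.0], [0.0, m]]
DM = [[m, 0.0], [0.0, 0.0]]          # doubling the left mass
M1 = [[M0[i][j] + DM[i][j] for j in range(2)] for i in range(2)]

def quad(u, a):
    """Quadratic form u^T A u for a 2-vector u."""
    return sum(u[i] * a[i][j] * u[j] for i in range(2) for j in range(2))

def lowest_eig(kmat, mmat):
    """Lowest root of det(K - mu M) = 0 for symmetric 2x2 K and M."""
    a = mmat[0][0] * mmat[1][1] - mmat[0][1] ** 2
    b = -(kmat[0][0] * mmat[1][1] + kmat[1][1] * mmat[0][0]
          - 2.0 * kmat[0][1] * mmat[0][1])
    c = kmat[0][0] * kmat[1][1] - kmat[0][1] ** 2
    return (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

mu0 = lowest_eig(K0, M0)
u0 = [1.0, (K0[0][0] - mu0 * M0[0][0]) / (-K0[0][1])]   # from the first equation

mu_lin = mu0 - mu0 * quad(u0, DM) / quad(u0, M0)         # linear estimate, no stiffness change
mu_ray = quad(u0, K0) / quad(u0, M1)                     # Rayleigh quotient with nominal mode
b_deb = quad(u0, [[1.0, 0.0], [0.0, 0.0]]) / quad(u0, M0)  # b for added left mass x
mu_deb = mu0 * math.exp(-b_deb * m)                      # DEB estimate with a = 0, x = m
mu_exact = lowest_eig(K0, M1)

if __name__ == "__main__":
    print(f"mu0 = {mu0:.3f}, u0 = ({u0[0]:.3f}, {u0[1]:.3f})")
    print(f"linear {mu_lin:.3f}, Rayleigh {mu_ray:.3f}, DEB {mu_deb:.3f}, exact {mu_exact:.3f}")
```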

6.3 Sequential Linear Programming

The constraint approximations and approximate analysis procedures described in
the previous sections are particularly useful when the computational cost of a single
evaluation of the objective function, the constraints, and their derivatives is very large
compared to the computational cost associated with the optimization operations, such
as the calculation of search directions. This is a typical situation when we employ
a finite element model with thousands of degrees of freedom to analyze a structural
design which is defined in terms of a handful of design variables. It then pays to reduce
the number of exact structural analyses required for the design process by applying
optimization algorithms to a model of the structure based on approximations.
The simplest and most popular approximation approach is that of sequential
linear programming (SLP). Consider an optimization problem of the form

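$$\begin{aligned} &\text{minimize} && f(\mathbf{x}) , \\ &\text{subject to} && g_j(\mathbf{x}) \ge 0 , \quad j = 1, \ldots, n_g . \end{aligned} \tag{6.3.1}$$

The SLP approach starts with a trial design $\mathbf{x}_0$, and replaces the objective function
and constraints by linear approximations obtained from a Taylor series expansion
about $\mathbf{x}_0$

$$\begin{aligned} &\text{minimize} && f(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i})\left(\frac{\partial f}{\partial x_i}\right)_{\mathbf{x}_0} , \\ &\text{subject to} && g_j(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i})\left(\frac{\partial g_j}{\partial x_i}\right)_{\mathbf{x}_0} \ge 0 , \quad j = 1, \ldots, n_g , \\ &\text{and} && a_{li} \le x_i - x_{0i} \le a_{ui} , \quad i = 1, \ldots, n . \end{aligned} \tag{6.3.2}$$

The last set of constraints are called move limits, with $a_{li}$ and $a_{ui}$ being the lower
and upper bounds, respectively, on the allowed change in $x_i$.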
Because of the approximation involved, and the move limits, it is rare that the
final design of the linearized problem, XL, is acceptably close to the optimum design.
However, if the move limits are small enough to guarantee a good approximation
within these move limits, XL will be closer to the optimum than Xo. We can, therefore,
replace Xo by XL, and repeat the linear optimization with Eq. (6.3.1) linearized about
the new starting point. This process is repeated, so that we replace the original
optimization problem by a sequence of linear programming (LP) problems (hence
the name SLP). Each linear optimization is called an optimization cycle. The nature
of the linearization of a nonlinear problem and the application of move limits are
demonstrated in the following example.

Example 6.3.1

Consider the problem

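$$\begin{aligned} &\text{minimize} && f(\mathbf{x}) = -2x_1 - x_2 , \\ &\text{subject to} && g_1 = 25 - x_1^2 - x_2^2 \ge 0 , \\ & && g_2 = 7 - x_1^2 + x_2^2 \ge 0 , \\ & && x_1, x_2 \ge 0 . \end{aligned}$$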
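Linearize the constraint functions about the starting point of $\mathbf{x}_0^T = (1.0, 1.0)$, and use
move limits of $1.0x_{0i}$.
Evaluating the constraint functions and derivatives at the initial point we have

$$g_1(\mathbf{x}_0) = 25 - 1 - 1 = 23 , \qquad g_2(\mathbf{x}_0) = 7 - 1 + 1 = 7 ,$$

$$(\nabla g_1)_{\mathbf{x}_0} = \begin{Bmatrix} -2 \\ -2 \end{Bmatrix} , \qquad (\nabla g_2)_{\mathbf{x}_0} = \begin{Bmatrix} -2 \\ 2 \end{Bmatrix} .$$

Therefore, the linear approximations take the form

$$g_{1L}(\mathbf{x}) = 23 + [-2 \;\; -2]\begin{Bmatrix} x_1 - 1 \\ x_2 - 1 \end{Bmatrix} = 27 - 2x_1 - 2x_2 \ge 0 ,$$

$$g_{2L}(\mathbf{x}) = 7 + [-2 \;\; 2]\begin{Bmatrix} x_1 - 1 \\ x_2 - 1 \end{Bmatrix} = 7 - 2x_1 + 2x_2 \ge 0 .$$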

[Plot in the (x1, x2) plane, both axes from 0 to 10: the original constraints (dashed), their linearizations, and the move limits.]

Figure 6.3.1 Constraint linearization and move limits.

These linear approximations are shown in Figure 6.3.1 together with the original
constraints represented by the dashed lines. Also shown in the figure are the move
limits which form a rectangular boundary around the initial design point.
The solution of this new linear programming problem is $\mathbf{x}_L^T = (2.0, 2.0)$ with
an objective function of f = -6 which corresponds to a 100% improvement in the
objective function. If there were no move limits, the solution of the problem would
have been at $\mathbf{x}_L^T = (8.5, 5.0)$ and the resulting value of the objective function would
be f = -22 (see Figure 6.3.1).
Although without move limits we achieve a much larger gain in the objective
function, the exact constraints are violated substantially, as shown in Figure 6.3.1.
A procedure for evaluating the acceptability of constraint violations is discussed later
in this section. • • •
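
One SLP cycle for this example can be sketched in Python as follows. This is an illustration under our own assumptions: the linear program is solved by enumerating the vertices formed by pairs of constraint lines, which is adequate only for a tiny bounded two-variable problem, and in practice a library LP solver would take its place. The helper names `linearize` and `solve_lp_2d` are hypothetical.

```python
from itertools import combinations

def linearize(g, grad, x0):
    """Return (a, b) so that a.x <= b is equivalent to g(x0) + grad(x0).(x - x0) >= 0."""
    g0, d = g(x0), grad(x0)
    return ((-d[0], -d[1]), g0 - d[0] * x0[0] - d[1] * x0[1])

def solve_lp_2d(c, halfplanes, tol=1e-9):
    """Minimize c.x subject to a.x <= b for each (a, b), by vertex enumeration."""
    best = None
    for (a1, b1), (a2, b2) in combinations(halfplanes, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue                                   # parallel lines, no vertex
        x = ((b1 * a2[1] - a1[1] * b2) / det, (a1[0] * b2 - b1 * a2[0]) / det)
        if all(a[0] * x[0] + a[1] * x[1] <= b + tol for a, b in halfplanes):
            val = c[0] * x[0] + c[1] * x[1]
            if best is None or val < best[0]:
                best = (val, x)
    return best

def g1(x): return 25.0 - x[0] ** 2 - x[1] ** 2
def g2(x): return 7.0 - x[0] ** 2 + x[1] ** 2
def grad_g1(x): return (-2.0 * x[0], -2.0 * x[1])
def grad_g2(x): return (-2.0 * x[0], 2.0 * x[1])

x0 = (1.0, 1.0)
c = (-2.0, -1.0)                                       # f = -2 x1 - x2 is already linear
base = [linearize(g1, grad_g1, x0), linearize(g2, grad_g2, x0),
        ((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0)]        # x1 >= 0, x2 >= 0
move = [((1.0, 0.0), 2.0 * x0[0]), ((-1.0, 0.0), 0.0), # |x1 - x01| <= 1.0 * x01
        ((0.0, 1.0), 2.0 * x0[1]), ((0.0, -1.0), 0.0)] # |x2 - x02| <= 1.0 * x02

f_with, x_with = solve_lp_2d(c, base + move)
f_without, x_without = solve_lp_2d(c, base)

if __name__ == "__main__":
    print("with move limits:   ", x_with, f_with)
    print("without move limits:", x_without, f_without)
    print("exact g1, g2 at the unrestricted solution:", g1(x_without), g2(x_without))
```

Repeating the cycle with x0 replaced by the new design, and with the move limits reduced when the exact analysis shows no gain, gives the sequence of linear programs described above.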

SLP is attractive because reliable LP packages are readily available to most com-
puter users through system library packages, while reliable nonlinear programming
packages are not so readily available. However, the SLP strategy has several problems
associated with it. First, it greatly increases the computational cost associated with
optimization operations, because the optimization process is repeated several times
(typically five to forty times). Thus, this strategy is reasonable only when the cost
of these optimization computations is small compared to the cost of analysis plus
the cost of sensitivity derivatives. The efficiency of the LP package used for the SLP
approach can, therefore, become an important consideration.
Second, without a proper choice of move limits, the process may never converge.
In general, move limits should be gradually shrunk as the design approaches the
optimum. Part of the reason for the need to shrink the move limits is that the
accuracy of the approximation is required to be higher when we get close to the
optimum. When we are far from the optimum design, the gains that are made during
each cycle are large, and we can tolerate significant errors and still make progress
towards the optimum. When we get close to the optimum, the gains are small and
can be swamped by approximation errors. However, reduction of the move limits
early in the process may unnecessarily slow down the convergence too, especially if
the initial design is far from the actual optimum. The need to reduce move limits is
indicated when the final design of a cycle proves, upon exact analysis, to be inferior
to the initial design of that cycle (which is the final design of the previous cycle), or
provides no gain in the function f. The move limits are typically shrunk by ten to
fifty percent of their previous values until the improvement in the objective function
for a given set of move limits becomes smaller than a given tolerance. Popular choices
for starting values of the move limits are in the range of ten to thirty percent of the
design variables. However, this choice is reasonable only if a design variable is not
exceedingly small because it may be on its way to changing its sign. In such a case,
it may be reasonable for the move limits to be ten to thirty percent of a typical value
(as opposed to the instantaneous value) of that design variable.
A third difficulty associated with SLP arises occasionally when the starting design
is infeasible. The combined effects of approximation and move limits can then result
in a situation where the linearized optimization problem does not have a feasible
solution. That is, if the initial point of a problem is infeasible with respect to the
normalized constraints and the move limits are small, the region formed by the move
limits may remain entirely inside the infeasible linearized design space leading to an
infeasible problem. In this case it is advisable to relax the constraints during the first

few cycles. This can be done, for example, by replacing the optimization problem
Eq. (6.3.2) by

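$$\begin{aligned} &\text{minimize} && f(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i})\left(\frac{\partial f}{\partial x_i}\right)_{\mathbf{x}_0} + k\beta , \\ &\text{subject to} && g_j(\mathbf{x}_0) + \sum_{i=1}^{n} (x_i - x_{0i})\left(\frac{\partial g_j}{\partial x_i}\right)_{\mathbf{x}_0} + \beta \ge 0 , \quad j = 1, \ldots, n_g , \\ & && a_{li} \le x_i - x_{0i} \le a_{ui} , \\ &\text{and} && \beta \ge 0 , \end{aligned} \tag{6.3.3}$$

where $\beta$ is an additional design variable, representing the allowed margin of original
constraint violation, and k is a number chosen to make the contribution of $\beta$ to
the objective function large enough, so that the optimization cycle will emphasize
reducing $\beta$ over reducing f.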
Finally, if the solution of the original problem is not at a vertex of the constraint
set it is possible that the iterations can cycle between two points. For example, if the
actual optimum is at the boundary of a nonlinear constraint, solution of the linearized
problem may take the design back to the initial point of the previous linear problem.
An appropriate move limits reduction strategy can resolve this difficulty easily.
The following example demonstrates some of the considerations in the choice of
move limits.

Example 6.3.2

[Truss sketch; a load labeled 2p is shown.]

Figure 6.3.2 Four bar statically determinate truss.

We consider the minimum weight design of the four-bar statically determinate truss
shown in Figure 6.3.2. In the interest of simplicity we assume members 1 through
3 to have the same area $A_1$ and member 4 an area $A_2$. Under the specified loading
the member forces and the vertical displacement at joint 2 can be easily verified to
be

$$
f_1 = 5p, \quad f_2 = -p, \quad f_3 = 4p, \quad f_4 = -2\sqrt{3}\,p ,
$$

$$
\delta_2 = \frac{6pl}{E}\left(\frac{3}{A_1} + \frac{\sqrt{3}}{A_2}\right) ,
$$

where a negative sign denotes compression. We assume the allowable stresses in
tension and compression to be $7.73 \times 10^{-4}E$ and $4.833 \times 10^{-4}E$, respectively, and
limit the vertical displacement to be no larger than $3 \times 10^{-3}l$. The problem of
the minimum weight design subject to stress and displacement constraints can be
formulated in terms of nondimensional variables

$$
x_1 = 10^{3}\left(\frac{p}{A_1 E}\right), \quad x_2 = 10^{3}\left(\frac{p}{A_2 E}\right) ,
$$

as

$$
\begin{aligned}
\text{minimize}\quad & f(x_1, x_2) = \frac{3}{x_1} + \frac{\sqrt{3}}{x_2} , \\
\text{subject to}\quad & 18x_1 + 6\sqrt{3}\,x_2 \le 3 , \\
& 0.05 \le x_1 \le 0.1546 , \\
& 0.05 \le x_2 \le 0.1395 ,
\end{aligned}
$$

where lower bound limits on $x_1$ and $x_2$ have been assumed to be 0.05.
We start the first cycle with an initial guess of $\mathbf{x}_0^T = (0.1, 0.1)$ which satisfies the
constraints and gives $f = 47.32$. The LP problem is started with ten percent move
limits, $a_{ui} = a_{li} = 0.01$, $i = 1, 2$. Only the objective function is nonlinear, and its
derivatives at $\mathbf{x}_0$ are

$$
\frac{\partial f}{\partial x_1} = -300, \quad \frac{\partial f}{\partial x_2} = -173.2 ,
$$

so that the first LP is

$$
\begin{aligned}
\text{minimize}\quad & f_L = 47.32 - 300(x_1 - 0.1) - 173.2(x_2 - 0.1) , \\
\text{subject to}\quad & 18x_1 + 6\sqrt{3}\,x_2 \le 3 , \\
& 0.09 \le x_1 \le 0.11 , \\
& 0.09 \le x_2 \le 0.11 .
\end{aligned}
$$

This problem is solved to yield $x_1 = 0.10316$, $x_2 = 0.11$, and $f_L = 44.6410$. However,
$f(0.10316, 0.11) = 44.8274$, so that the linear approximation exaggerated the
improvement in $f$. We next linearize $f$ about $(0.10316, 0.11)^T$, and keeping same-size
move limits, we get for the second cycle the following LP:

$$
\begin{aligned}
\text{minimize}\quad & f_L = 44.8274 - 281.9(x_1 - 0.10316) - 143.1(x_2 - 0.11) , \\
\text{subject to}\quad & 18x_1 + 6\sqrt{3}\,x_2 \le 3 , \\
& 0.09316 \le x_1 \le 0.11316 , \\
& 0.1 \le x_2 \le 0.12 .
\end{aligned}
$$

The solution for this problem is $x_1 = 0.10893$, $x_2 = 0.1$, $f_L = 44.63126$, $f = 44.86069$.
That is, this move resulted in apparent gain (in terms of $f_L$), but actual loss (in terms
of $f$). This is an indication that we need to reduce the move limits.
We reduce the move limits to 0.005 and perform two additional cycles starting
from the best design so far, $\mathbf{x}_0^T = (0.10316, 0.11)$. The first cycle yields $x_1 = 0.10604$,
$x_2 = 0.105$, $f_L = 44.72937$, $f = 44.78560$. With the second cycle we get back
$x_1 = 0.10316$, $x_2 = 0.11$, and this oscillation again indicates the need for reducing
move limits for further improvements. However, with the last set of move limits we
reduced $f$ from 44.8274 to 44.78560, a change of less than 0.1 percent. Thus, it may
be reasonable to quit. Indeed, for each one of the LP's the displacement
constraint was active, so that we can find the exact solution by setting

$$
18x_1 + 6\sqrt{3}\,x_2 = 3 \quad \text{or} \quad x_2 = \frac{3 - 18x_1}{6\sqrt{3}} ,
$$

and substituting into $f$ to get

$$
f = \frac{3}{x_1} + \frac{18}{3 - 18x_1} .
$$

It is easy to check that the minimum of $f$ is at $x_1 = x_2 = 0.105662$, $f = 44.7846$.
The design space for this problem is shown in Figure 6.3.3. • • •
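The move-limit logic of Example 6.3.2 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used to produce the numbers above: the LP is solved by brute-force vertex enumeration (adequate only for two variables), a cycle is accepted only when the true objective decreases, the move limit is halved otherwise, and the stopping threshold `min_move` is arbitrary. Because the first linearized objective is parallel to the displacement constraint, the first LP has tied vertices, so the sketch may pass through different intermediate designs than those quoted in the example.

```python
import math

SQRT3 = math.sqrt(3.0)
LOWER = (0.05, 0.05)            # side constraints of Example 6.3.2
UPPER = (0.1546, 0.1395)


def weight(x):
    """Nondimensional objective f = 3/x1 + sqrt(3)/x2."""
    return 3.0 / x[0] + SQRT3 / x[1]


def weight_gradient(x):
    return (-3.0 / x[0] ** 2, -SQRT3 / x[1] ** 2)


def solve_lp_2d(c, rows, tol=1e-9):
    """Minimize c.x subject to a1*x1 + a2*x2 <= b for every (a1, a2, b) in rows.

    Intersects every pair of constraint lines and keeps the best feasible
    vertex. A production code would call an LP solver instead.
    """
    best_value, best_x = None, None
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            a1, a2, b1 = rows[i]
            a3, a4, b2 = rows[j]
            det = a1 * a4 - a2 * a3
            if abs(det) < 1e-14:
                continue                      # parallel lines, no vertex
            x = ((b1 * a4 - a2 * b2) / det, (a1 * b2 - a3 * b1) / det)
            if all(p * x[0] + q * x[1] <= r + tol for p, q, r in rows):
                value = c[0] * x[0] + c[1] * x[1]
                if best_value is None or value < best_value:
                    best_value, best_x = value, x
    return best_x


def slp(x0, move=0.01, min_move=1e-6, max_cycles=200):
    """Sequential linear programming with move-limit halving."""
    x, fx = tuple(x0), weight(x0)
    for _ in range(max_cycles):
        if move < min_move:
            break
        rows = [(18.0, 6.0 * SQRT3, 3.0)]     # displacement constraint, already linear
        for k in range(2):
            unit = (1.0, 0.0) if k == 0 else (0.0, 1.0)
            upper = min(UPPER[k], x[k] + move)
            lower = max(LOWER[k], x[k] - move)
            rows.append((unit[0], unit[1], upper))
            rows.append((-unit[0], -unit[1], -lower))
        x_new = solve_lp_2d(weight_gradient(x), rows)
        f_new = weight(x_new)
        if f_new < fx:
            x, fx = x_new, f_new              # actual improvement: accept
        else:
            move *= 0.5                       # apparent gain only: shrink move limits
    return x, fx


x_opt, f_opt = slp((0.1, 0.1))
print(x_opt, f_opt)
```

The halving rule is only one possible move-limit reduction strategy; it is used here because it makes the oscillation between two designs described above die out.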
Figure 6.3.3 Design space for four-bar truss problem.

It is possible that the optimum design obtained for a linearized problem at any
cycle of the iterative SLP process may violate the constraints of the original problem.
We have seen in Example 6.3.1 that if the move limits in that example were not
imposed or they were large enough, the solution of the linear problem would have
caused a significant violation of the original constraint set. Such constraint violations
are generally associated with objective function improvements. It is also possible
that, from the solution of one linear problem to the next, the objective function may
deteriorate and the constraint violations be reduced. These events can be prevented
by altering the imposed move limits. However, neither of the two events is necessarily
objectionable for the overall convergence of the SLP. Following is a discussion of how
to judge whether a new design obtained by the LP is an improvement when a better
objective function is accompanied by constraint violation, or a better satisfaction of
the constraint set is accompanied by an increase in the objective function.
Suppose the optimum of the LP during the $i$th cycle, $\mathbf{x}_{iL}$, leads to a set of active
or violated constraints $g_j(\mathbf{x}_{iL})$, $j \in J$, where $J$ is the set of active constraints. We
can view the solution of the linearized problem as the exact solution of the following
modified nonlinear problem

$$
\begin{aligned}
\text{minimize:}\quad & f(\mathbf{x}) , \\
\text{subject to:}\quad & g_j(\mathbf{x}) \ge p\, g_j(\mathbf{x}_{iL}) ,
\end{aligned}
\qquad (6.3.4)
$$

for $p = 1$. The actual problem we want to solve is for $p = 0$. Using Eq. (5.4.7) we
can estimate that the optimum value of the objective function for the unmodified
problem is

$$
\mathcal{L} = f(\mathbf{x}_{iL}) - \sum_{j \in J} \lambda_j g_j(\mathbf{x}_{iL}) , \qquad (6.3.5)
$$

where $\mathcal{L}$ is the Lagrangian function. This suggests the following procedure: If the
objective function and the most critical constraints both improve, always accept the
new design. If the objective function improves and the constraints deteriorate or vice
versa, compare the values of the Lagrangians. If the Lagrangian at the end of a cycle
is smaller than its value at the beginning of the cycle, then accept the new design. If,
on the other hand, the Lagrangian increases, modify the move limits. We recommend
using only critical and violated constraints in the Lagrangians.

Example 6.3.3

Consider Example 6.3.2 with variables $y_i = 1/x_i$ (proportional to the cross-sectional
areas). The problem takes the following form

$$
\begin{aligned}
\text{minimize:}\quad & f(\mathbf{y}) = 3y_1 + \sqrt{3}\,y_2 , \\
\text{subject to:}\quad & g = 3 - \frac{18}{y_1} - \frac{6\sqrt{3}}{y_2} \ge 0 , \\
& 8.0 \le y_1 \le 20 , \\
& 8.0 \le y_2 \le 20 ,
\end{aligned}
$$

where the lower bounds for the variables are increased to 8.0 for convenience. An initial
guess of $y_1 = 12$, and $y_2 = 8$ results in $f = 49.856$.
Linearizing the problem with 30% move limits leads us to the problem

$$
\begin{aligned}
\text{minimize:}\quad & 3y_1 + \sqrt{3}\,y_2 , \\
\text{subject to:}\quad & 0.125y_1 + 0.1624y_2 \ge 2.598 , \\
& 8.4 \le y_1 \le 15.6 , \\
& 8.0 \le y_2 \le 10.4 .
\end{aligned}
$$

Solution of this LP yields $y_1 = 8.4$, $y_2 = 9.534$, and $f = 41.713$. Reanalysis reveals
$g = -0.2329$. Also from the solution of the LP problem we obtain $\lambda_1 = 10.667$
(corresponding to $g$) and $\lambda_2 = 1.667$ (corresponding to move limit). Therefore the
Lagrangian is

$$
\mathcal{L} = 41.713 - 10.667(-0.2329) = 44.197 ,
$$

which, compared to the initial objective function of $f = 49.856$, is a smaller improvement
than the solution of the LP, 41.713, but still acceptable.
Linearizing the constraint function about the last design point and formulating
the LP problem with 30% move limits we find the problem

$$
\begin{aligned}
\text{minimize:}\quad & 3y_1 + \sqrt{3}\,y_2 , \\
\text{subject to:}\quad & 0.2551y_1 + 0.1143y_2 \ge 3.4658 , \\
& 8.0 \le y_1 \le 10.92 , \\
& 8.0 \le y_2 \le 12.3938 ,
\end{aligned}
$$

which has a solution of $y_1 = 10.000$, $y_2 = 8.0$, and $f = 43.858$, with constraint
function multiplier of $\lambda_1 = 11.76$ and a lower bound multiplier of $\lambda_2 = 0.38$. Although
the objective function increased roughly by 5%, evaluation of the actual constraint
shows a smaller constraint violation, $g = -0.09896$, compared to the initial design of
this LP problem. Therefore, we must calculate the Lagrangian in order to accept or
reject this design. At the end of this step the Lagrangian is

$$
\mathcal{L} = 43.858 - 11.76(-0.09896) = 45.022 ,
$$

which is larger than the value of the Lagrangian calculated at the end of the previous
LP problem. We, therefore, reject the design and reconstruct the LP problem with
smaller move limits. • • •
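The bookkeeping of Example 6.3.3 is easy to reproduce. The sketch below linearizes the displacement constraint at a given design and evaluates the Lagrangian estimate of Eq. (6.3.5) from the multipliers and reanalysis values quoted in the example. The function names are illustrative, and the multipliers are simply copied from the text rather than computed by an LP solver.

```python
import math

SQRT3 = math.sqrt(3.0)


def g(y1, y2):
    """Displacement constraint of Example 6.3.3 (satisfied when g >= 0)."""
    return 3.0 - 18.0 / y1 - 6.0 * SQRT3 / y2


def linearize_g(y1, y2):
    """Return (a1, a2, rhs) so that the linearized constraint reads a1*y1 + a2*y2 >= rhs."""
    a1 = 18.0 / y1 ** 2                  # dg/dy1
    a2 = 6.0 * SQRT3 / y2 ** 2           # dg/dy2
    rhs = a1 * y1 + a2 * y2 - g(y1, y2)
    return a1, a2, rhs


def lagrangian(f_value, multipliers, constraint_values):
    """Eq. (6.3.5), summed over the critical and violated constraints only."""
    return f_value - sum(lam * gj for lam, gj in zip(multipliers, constraint_values))


def accept(lagrangian_new, lagrangian_old):
    """Accept the new design only if the Lagrangian decreased over the cycle."""
    return lagrangian_new < lagrangian_old


print(linearize_g(12.0, 8.0))                            # coefficients of the first LP
first = lagrangian(41.713, [10.667], [-0.2329])          # end of first cycle
second = lagrangian(43.858, [11.76], [-0.09896])         # end of second cycle
print(first, accept(first, 49.856))
print(second, accept(second, first))
```

At the initial design the constraint is satisfied and inactive, so the starting Lagrangian is taken equal to the objective function, 49.856, as in the example.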

6.4 Sequential Nonlinear Approximate Optimization

We can generalize SLP by using nonlinear approximations for some of the constraints
and objective function. For the application of SLP we need to linearize even
simple nonlinear functions. With the more general procedure we approximate only
expensive-to-calculate functions using either linear or nonlinear (such as quadratic)
approximations. Inexpensive constraints need not be approximated at all. We start
by identifying those constraints (and possibly the objective function) which require
large computational resources for evaluation. These constraints are singled out for
approximation, while the cheaper constraints are evaluated exactly. Given a trial
solution $\mathbf{x}_0$ to the structural design problem, we construct approximations to the
expensive constraints about $\mathbf{x}_0$. As in the case of SLP, we need to augment the
approximate problem with move limits to guard against large changes in design variables
that can result in poor approximations.
The solution of the approximate problem with the move limits, obtained by any
optimization procedure, is denoted as $\mathbf{x}_1$. We perform a new exact structural analysis
at $\mathbf{x}_1$, use it to construct new approximations to the expensive constraints, and
perform a new optimization of the approximate problem. That is, the original optimization
problem Eq. (6.3.1) is replaced by

minimize
subject to
(6.4.1)
and
for i = 0,1,2, ... ,

where fa and gaj denote the approximate objective function and constraints, respec-
tively, X~i) is the solution of the ith minimization, and aj is a suitably chosen move
limit.
Because most of the cost of the optimization is associated with the exact analysis
and sensitivity calculations, it is often not important what optimization procedure
is used for obtaining the optimum of approximate problems. In general, it is more
important to emphasize reliability and robustness in the choice of the optimization
procedure rather than computational efficiency.
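The cycle of approximate optimization, exact reanalysis, and re-approximation can be illustrated on a one-variable sizing problem. The sketch below minimizes a bar area A subject to a normalized stress constraint g = 1 - c/A >= 0, where c stands for the load divided by the allowable stress. It assumes the linear and reciprocal approximations have the usual first-order forms, g_L = g_0 + g'_0 (A - A_0) and g_R = g_0 + g'_0 A_0 (1 - A_0/A), which should be checked against the definitions in Section 6.1. The problem, the 50 percent move limit, and the convergence tolerance are illustrative choices.

```python
def constraint(area, c):
    """Normalized stress constraint, satisfied when g >= 0."""
    return 1.0 - c / area


def constraint_derivative(area, c):
    return c / area ** 2


def smallest_area(a0, c, kind):
    """Smallest area satisfying the approximate constraint built at a0."""
    g0 = constraint(a0, c)
    dg0 = constraint_derivative(a0, c)
    if kind == "linear":
        # g0 + dg0 * (A - a0) >= 0
        return a0 - g0 / dg0
    # reciprocal: g0 + dg0 * a0 * (1 - a0 / A) >= 0
    return dg0 * a0 ** 2 / (g0 + dg0 * a0)


def sequential_optimization(a0, c, kind, move=0.5, tol=1e-6, max_cycles=50):
    """Minimize the area subject to the approximate constraint and move limits.

    Each pass of the loop corresponds to one exact analysis (constraint and
    derivative) followed by the solution of one approximate problem.
    """
    area = a0
    for cycle in range(1, max_cycles + 1):
        candidate = smallest_area(area, c, kind)
        new_area = min(max(candidate, area * (1.0 - move)), area * (1.0 + move))
        if abs(new_area - area) <= tol * area:
            return new_area, cycle
        area = new_area
    return area, max_cycles


for kind in ("linear", "reciprocal"):
    print(kind, sequential_optimization(5.0, 2.0, kind))
```

For this particular constraint the reciprocal approximation is exact, so the only thing that delays convergence is the move limit, whereas the linear approximation needs several more cycles. This is consistent with the remark that the reciprocal approximation often does well for truss stress constraints, but it says nothing about problems where the constraint is not so nearly linear in 1/A.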
The following example demonstrates the use of sequential nonlinear approximate
optimization with the standard approximations discussed in section 6.1 as well as one
which was tailored more to the problem at hand.

Example 6.4.1

The ten-bar truss shown in Figure 6.4.1 is a standard example used by many
authors. The minimum weight design obtained by changing the cross-sectional areas
of the truss members is sought subject to stress constraints and minimum gage
constraints of 0.1 in². The maximum allowable stress in each member is the same
in tension and compression. This allowable is set to 25 ksi for all members except
member 9. For member 9 the stress allowable is 75 ksi. The density of the truss
material is 0.1 lb/in³.

Figure 6.4.1 10-bar truss ($l$ = 360 in, $P$ = 100 kips).


Table 6.4.1 Ten-bar truss designs

Member   Initial area   Optimum area   Stress
         (in²)          (in²)          (ksi)
 1       5.0            7.90            25.0
 2       5.0            0.10            25.0
 3       5.0            8.10           -25.0
 4       5.0            3.90           -25.0
 5       5.0            0.10            -0.07
 6       5.0            0.10            25.0
 7       5.0            5.80            25.0
 8       5.0            5.51           -25.0
 9       5.0            3.68            37.5
10       5.0            0.14           -25.0

The five generic local approximations described in section 6.1 were used here,
together with the linear force approximation proposed by Vanderplaats and coworkers
[e.g., 15]. Table 6.4.1 shows the initial and optimum designs and the stresses in the
optimum truss members.
Table 6.4.2 compares the convergence history of twelve cycles of approximate op-
timization using the six approximations. To compare the performance of the various
approximations in Table 6.4.2 a useful measure of performance is the number of cycles
required to get to within one percent of the optimum weight (that is to 1514 lb). The
linear, reciprocal-quadratic, and linear force approximations required six cycles, the
quadratic approximation seven, the reciprocal approximation ten, and the conserva-
tive approximation never made it. The difference between the linear and reciprocal
approximations turns out to be an idiosyncrasy of this problem. For many truss
problems the reciprocal approximation does better than the linear one. As a group,
the second order approximations are slightly better than the first order ones, but the
difference does not appear to be significant enough to justify the cost of computing
second derivatives (see Section 7.2.2 for discussion of the cost of calculating second
derivatives).

Table 6.4.2 Convergence of optimum weight (lb) using different approximations

Cycle   Linear   Reciprocal   Conservative   Quadratic   Recip-quadratic   Linear force
 1      1845     1774         2361           2002        1931              1891
 2      1637     1673         1960           1741        1684              1688
 3      1601     1593         1722           1650        1595              1589
 4      1558     1566         1641           1586        1548              1549
 5      1531     1548         1587           1547        1522              1526
 6      1514     1537         1566           1525        1509              1511
 7      1507     1528         1555           1514        1506              1504
 8      1502     1522         1546           1507        1502              1501
 9      1500     1518         1540           1503        1500              1500
10      1500     1511         1538           1501        1500              1499
11      1500     1511         1535           1500        1499              1499
12      1499     1508         1532           1499        1499              1499
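The performance measure used in the discussion of Table 6.4.2, the number of cycles needed to come within one percent of the optimum weight, can be read off the table mechanically. The short sketch below does this for the 1514 lb target quoted in the text; the dictionary simply transcribes the table, and the function name is illustrative.

```python
# Weights (lb) per cycle, transcribed from Table 6.4.2.
TABLE_6_4_2 = {
    "linear":          [1845, 1637, 1601, 1558, 1531, 1514, 1507, 1502, 1500, 1500, 1500, 1499],
    "reciprocal":      [1774, 1673, 1593, 1566, 1548, 1537, 1528, 1522, 1518, 1511, 1511, 1508],
    "conservative":    [2361, 1960, 1722, 1641, 1587, 1566, 1555, 1546, 1540, 1538, 1535, 1532],
    "quadratic":       [2002, 1741, 1650, 1586, 1547, 1525, 1514, 1507, 1503, 1501, 1500, 1499],
    "recip-quadratic": [1931, 1684, 1595, 1548, 1522, 1509, 1506, 1502, 1500, 1500, 1499, 1499],
    "linear force":    [1891, 1688, 1589, 1549, 1526, 1511, 1504, 1501, 1500, 1499, 1499, 1499],
}

TARGET = 1514  # lb, one percent above the optimum weight, as quoted in the text


def cycles_to_target(weights, target=TARGET):
    """First cycle (1-based) whose weight is at or below the target, or None."""
    for cycle, weight in enumerate(weights, start=1):
        if weight <= target:
            return cycle
    return None


for name, weights in TABLE_6_4_2.items():
    print(name, cycles_to_target(weights))
```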
The dismal performance of the conservative approximation is explained by the
fact that it is typically less accurate than either the linear or reciprocal approxima-
tion. It is useful in situations where we need the conservativeness (such as when it is
employed with interior penalty function algorithms), or the convexity (such as with
dual algorithms, see Chapter 9). However, for sequential approximate optimization
it is of little use. Finally, the linear force approximation due to Vanderplaats is com-
parable in performance to the second-order approximations even though it employs
only first derivatives. This is due to the fact that it approximates a "more linear"
quantity than the stress. In using this approximation we approximate an intermediate
quantity, the member force, and compute the stress exactly from the force.
Similar physical insight leading to identification of quantities that are approximately
linear can afford comparable gains in other problems. • • •

6.5 Special Problems Associated with Shape Optimization

The term shape optimization is employed here in a very broad sense. In terms
of a finite element model we consider as shape optimization any problem where we
need to change the position of the nodes of the finite-element model or the element
connectivity (e.g., remove elements). Shape optimization problems are contrasted
with sizing optimization problems where we change only element stiffness properties,
such as bar cross-sectional areas or plate thicknesses. The term shape optimization
is often used in a narrow sense referring only to the optimal design of the shape
of the boundary of two- and three-dimensional structural components. The broad
usage includes also geometrical optimization of skeletal structures, and topological
optimization which decides the connectivity of the structure (for example, which
nodes are connected by elements).
Shape optimization problems are typically more difficult to tackle than sizing opti-
mization problems. Consider first the optimization of the boundary shape of two- and
three-dimensional bodies. The calculation of sensitivity derivatives for these shape
optimization problems is associated with accuracy problems discussed in Chapters 7
and 8. Another serious problem is mesh deformation. As the shape of the structure
changes we need to change the finite-element mesh. Simple remeshing rules that
translate node positions as the boundary changes, usually lead to highly deformed
finite elements and concomitant loss of accuracy. This problem can be addressed by
manually remeshing during the optimization process (which is time consuming), or
employing sophisticated mesh generators. Work in shape optimization has indeed
spurred the development and usage of such mesh generators (e.g., [30,31]).
Another problem associated with boundary shape optimization is that of the
existence or creation of internal boundaries or holes. In many problems the optimal
design will have internal cavities. It is impossible to generate these cavities with a
standard optimization approach without prior knowledge of their existence. That
is, an optimization procedure can easily find for us the optimum shape of a cavity
once we assume there is one, but it cannot tell us that there should be one, two,
or three cavities. One approach for dealing with this problem is to assume that the
material is not homogeneous, but instead has an underlying microstructure. This
underlying microstructure can be of fibers and matrix composite material. However,
typically the assumed microstructure is more general than that of the fiber and matrix
components of a laminated plate, and includes micro cavities in the material. This
type of microstructure was devised so as to probe the theoretical limits of strength
and stiffness that can be attained by a structure (see, e.g., Kohn and Strang [32],
or Rozvany et al. [33]). Bendsøe and Kikuchi [34] showed that it can be used to
determine the need for introducing cavities into the structure. Figure 6.5.1 shows
the type of structure obtained by Bendsøe and Kikuchi by permitting microcavities.
The structure under consideration is a bar in tension where the cross sections at the
two ends are given (solid areas in figure), and the cross section on the left is larger
than that on the right. The objective is to maximize the stiffness of the bar for a
given volume. The result shown in the figure, while not practical in itself, permits us
to identify regions where cavities exist. Standard optimization techniques can then
be used to find the optimal shape of these cavities.
An example of the application of this technique was reported by Rasmussen
[35] for the design of a floor beam in a civil transport. Figure 6.5.2 shows
the topology that was assumed by the designers and the topology identified by the
homogenization approach which led to a substantially lighter design.
Figure 6.5.1 Optimal shapes for a fillet problem using microstructure.

Figure 6.5.2 Shape design of floor beam for a civil transport aircraft: initial and final
geometries.

The problem of finding the cavities in two- and three-dimensional bodies belongs
to the realm of topological optimization. Topological optimization is a difficult problem
which has received more attention in applications to skeletal structures such as
trusses and frames. There the optimum topology is typically defined by decisions as
to which joints are connected to each other by members. The basic approach followed
by most researchers is to create a ground structure where every joint is connected to
every other joint. If the design problem is minimum weight with constraints on the
plastic collapse load, then as shown in Chapter 3, the optimization problem is linear,
and the simplex method may be used to find the optimum design. The algorithm
also automatically removes all unnecessary members. This approach was first taken
by Dorn and co-workers [36].
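Generating the ground structure itself is a simple combinatorial step. The sketch below lists one candidate member for every pair of joints, together with its length, which is the coefficient a minimum-weight LP would need for each area variable. The joint coordinates are hypothetical (a two-bay planar grid with 360 in spacing), and no attempt is made to prune candidate members that overlap collinear shorter ones.

```python
import math
from itertools import combinations


def ground_structure(joints):
    """Connect every joint to every other joint.

    joints : list of (x, y) coordinates
    returns: list of (i, j, length) tuples, one per candidate member
    """
    members = []
    for (i, p), (j, q) in combinations(enumerate(joints), 2):
        members.append((i, j, math.dist(p, q)))
    return members


# Hypothetical six-joint layout: two square bays of side 360 in.
joints = [(0.0, 0.0), (360.0, 0.0), (720.0, 0.0),
          (0.0, 360.0), (360.0, 360.0), (720.0, 360.0)]
members = ground_structure(joints)
print(len(members))          # n (n - 1) / 2 candidate members for n joints
```

The number of candidates grows quadratically with the number of joints, which is one reason the LP formulation, with its ability to drive unneeded areas to zero, is attractive for this class of problems.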

When the structure is designed subject to stress and displacement constraints
rather than plastic collapse, it may be impossible to start with a ground structure
and rely on a standard optimization algorithm to remove unnecessary members. One
problem is that the members that need to be removed may experience large strains
as their areas are reduced, so that the optimization algorithm will tend to reinforce
them rather than eliminate them. Because this problem is related to the compatibility
conditions, it is possible to relax these during part of the optimization process for
the purpose of eliminating members (e.g., Sheu and Schmit [37], or Reinschmidt and
Russel [38]). Another problem is that the stiffness matrix may become singular due
to the removal of members. This problem may be overcome by using simultaneous-
analysis-and-design techniques which do not require the inversion or factorization of
the stiffness matrix (see Section 10.6). The reader is referred to two survey papers by
Topping [39], and Kirsch [40] for additional information on topological optimization.
Geometrical optimization of skeletal structures refers to the search for the opti-
mum locations of the joints of the structures. The problem can be solved by standard
techniques, but there are often numerical advantages to treating the geometry vari-
ables differently from the sizing variables and employing a two-level optimization
approach. This topic is discussed in Chapter 10 in Section 10.5.

6.6 Optimization Packages

During the first few years of the development of structural optimization most an-
alysts developed special purpose finite-element programs with built-in optimization
procedures for their own use. When these programs were used by other analysts they
found them to be insufficiently documented and difficult to modify. In recent years it
has become more common to employ general purpose constrained optimization pack-
ages and interface them with general purpose structural analysis codes. Additionally,
the growing popularity of structural optimization as a tool for industrial applications
is generating demand for the introduction of optimization capabilities into general-
purpose analysis packages. The purpose of this section is to provide the reader with
a brief description of some of the more popular packages.
First consider integrated packages which combine structural analysis and opti-
mization procedures. One of the more popular programs of this class was the TSO
program (originally called WASP [41,42]) developed for the preliminary design of
aircraft wing and tail structures subject to aeroelastic constraints. The program
models the wing or tail structure as an orthotropic plate and employs simplified
plate analysis rather than a finite element model. Design variables are coefficients
of polynomials that describe the thickness distribution and ply orientations over the
surface. The optimization procedure is based on an interior penalty function formu-
lation (see Chapter 5). The program has been used extensively for design studies and
for some actual aircraft design problems (see [43]).
Many integrated structural optimization packages are based on special purpose
finite element programs. One of the better known is the ACCESS program developed
by Schmit and co-workers [44,45]. Other programs of this type include FASTOP
[46], OPSTAT [47], OPTCOMP [48], OPTIMUM [49], ASOP [50], STARS [51] and
DESAP [52].
Because of the lack of generality associated with special purpose finite-element
programs, there has been interest in structural optimization packages built around
a general purpose finite element program. Two early examples of this type are
PARS [53] and PROSSS [54]. These programs are based on the SPAR finite-element
package and its commercial derivative EAL. However, because the optimization software
was not supported by the developer of the finite-element package, the use of
PARS and PROSSS has been limited. The EAL program, however, lends itself to
interfacing with other programs, and has been used with optimization software; Walsh
[55] reports on the use of EAL together with the CONMIN [56] program.

Other finite-element programs have also been recently used to form structural op-
timization packages. The OPTSYS package [57] is based on the ASKA and ABAQUS
finite-element programs, the ASTROS system [58] evolved from the public domain
version of NASTRAN, and the NISAOPT package (including the programs SHAPE
[59] and STROPT [60]) is based on NISA II.

The demand for structural optimization is pushing finite-element software developers
to include optimization capabilities in their programs. The NASTRAN program
[61] and the I-DEAS program [62] now have sensitivity and optimization capabilities,
and ANSYS has a built-in rudimentary optimizer. A recently developed program,
GENESIS [63], has gone one step further in that it is a general finite element program
developed together with sensitivity, approximation and optimization capabilities. In
the not too distant future we can expect that most commercial structural analysis
packages will offer built-in optimization capabilities.

Until that day, and probably even after, there will be a continuing demand for
general purpose optimization software that can be coupled to structural analysis pro-
grams. Most finite-element packages lend themselves to the calculation of sensitivities
via finite-differences (see Section 7.1), so that the analyst can construct constraint
approximations based on these derivatives (see Section 6.1) and use the optimization
package on this approximation in a sequential-approximate-optimization mode. The
most commonly available general-purpose optimization packages are linear program-
ming (LP) solvers. These are usually available at most computer centers as part of
IMSL or similar subroutine libraries. While in some cases there are advantages to
using more general optimization algorithms, LP packages seem to work well in the
majority of applications.
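The coupling described above, finite-difference sensitivities from an analysis code feeding a constraint approximation, can be sketched as follows. The `analysis` function is only a stand-in for an expensive finite-element run (it reuses the displacement constraint of Example 6.3.3), and the forward-difference step size is an illustrative choice; Section 7.1 discusses how the step should really be selected.

```python
import math


def analysis(y):
    """Stand-in for an expensive structural analysis returning one constraint value."""
    return 3.0 - 18.0 / y[0] - 6.0 * math.sqrt(3.0) / y[1]


def finite_difference_gradient(func, x, rel_step=1e-4):
    """Forward differences: one baseline analysis plus one per design variable."""
    f0 = func(x)
    grad = []
    for i, xi in enumerate(x):
        h = rel_step * (abs(xi) or 1.0)       # relative step, absolute if xi == 0
        perturbed = list(x)
        perturbed[i] = xi + h
        grad.append((func(perturbed) - f0) / h)
    return f0, grad


def linear_approximation(f0, grad, x0):
    """Return a cheap function that replaces the analysis near x0."""
    def approx(x):
        return f0 + sum(gi * (xi - x0i) for gi, xi, x0i in zip(grad, x, x0))
    return approx


x0 = [12.0, 8.0]
g0, dg = finite_difference_gradient(analysis, x0)
g_approx = linear_approximation(g0, dg, x0)
print(dg)
print(g_approx([12.5, 8.5]), analysis([12.5, 8.5]))
```

The approximate function `g_approx` is what would be passed to the LP or other optimization package within each cycle, with the true analysis called again only at the new design.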

At the other extreme of generality we find the ADS [64], DOT [65] and DOC [66]
packages from VMA Engineering which allow the user a wide menu of optimization
algorithms and strategies. These programs evolved from the very popular CONMIN
[56] package which was used extensively for structural optimization. DOT (Design
Optimization Tool) is a collection of Fortran subroutines for optimization, and DOC
(Design Optimization Control) is a control program that simplifies the use of opti-
mization (calling DOT subroutines). Another general-purpose optimization package,
commonly used in structural optimization, is NEWSUMT [67] developed by Miura
and Schmit which is based on a penalty function procedure (see Chapter 5), and
an updated version of the program NEWSUMT-A which incorporates constraint ap-
proximations [68]. Other packages of this type include OPT based on the reduced
gradient algorithm (see Chapter 5), and IDESIGN [69] based on sequential quadratic
programming (see Chapter 5). There are also several packages available from math-
ematical programming specialists. However, these programs do not enjoy as much
popularity in structural optimization applications as the aforementioned programs
which were developed by engineers.

6.7 Test Problems

Standard test problems are useful for the purpose of checking optimization algo-
rithms and software. The three test problems given in this section have been widely
used for this purpose.
6.7.1 Ten-bar Truss

The ten bar truss shown in Figure 6.4.1 is a classical example used to show the dif-
ference between a fully stressed design (FSD) and an optimum design. The material
properties and the minimum area are given in Table 6.7.1. When the truss is de-
signed subject to stress constraints only, the optimum and FSD designs are identical.
However, when the stress allowable for member 9 is increased above 37,500 psi the
optimum design and the FSD design are different. The three designs are given in
Table 6.7.2. The truss has also been optimized with displacement constraints (Table
6.7.3) and the final design is given in Table 6.7.4. For additional information, see
Ref. [70].
Table 6.7.1 Data for ten bar truss

Material: aluminum                  Specific mass: 0.1 lbm/in³
Young's modulus: $10^7$ psi           Allowable stress: ±25 000 psi
Minimum area: 0.1 in²

Table 6.7.2 Final designs for ten bar truss with stress constraints only

                                   Increased allowable, member 9
Member        FSD and optimum      FSD              Optimum design
              areas (in²)          areas (in²)      areas (in²)
 1            7.94                 4.11             7.90
 2            0.10                 3.89             0.10
 3            8.06                 11.89            8.10
 4            3.94                 0.11             3.90
 5            0.10                 0.10             0.10
 6            0.10                 3.89             0.10
 7            5.74                 11.16            5.80
 8            5.57                 0.15             5.51
 9            5.57                 0.10             3.68
10            0.10                 5.51             0.14
Mass (lbm)    1593.2               1725.2           1497.6
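When implementing this test problem it is useful to check the member numbering against the tabulated masses. The sketch below recomputes the mass of each design in Table 6.7.2 from the areas, assuming that members 1 through 6 have length l = 360 in and that members 7 through 10 are diagonals of length l√2. This numbering is an assumption, since the figure defining it is not reproduced here, but it is supported by the fact that it recovers the tabulated masses to within the rounding of the areas.

```python
import math

DENSITY = 0.1                 # lbm/in^3, Table 6.7.1
BAY = 360.0                   # in, dimension l of Figure 6.4.1

# Assumed lengths: members 1-6 span one bay, members 7-10 are diagonals.
LENGTHS = [BAY] * 6 + [BAY * math.sqrt(2.0)] * 4

# Areas (in^2) transcribed from Table 6.7.2, with the tabulated mass (lbm).
DESIGNS = {
    "FSD and optimum": (
        [7.94, 0.10, 8.06, 3.94, 0.10, 0.10, 5.74, 5.57, 5.57, 0.10], 1593.2),
    "FSD, increased allowable": (
        [4.11, 3.89, 11.89, 0.11, 0.10, 3.89, 11.16, 0.15, 0.10, 5.51], 1725.2),
    "optimum, increased allowable": (
        [7.90, 0.10, 8.10, 3.90, 0.10, 0.10, 5.80, 5.51, 3.68, 0.14], 1497.6),
}


def truss_mass(areas):
    """Mass = density * sum(area * length) over all members."""
    return DENSITY * sum(a * length for a, length in zip(areas, LENGTHS))


for name, (areas, tabulated) in DESIGNS.items():
    print(name, round(truss_mass(areas), 1), tabulated)
```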

Table 6.7.3 Displacement allowables for ten bar truss

                             Displacement limits
Case    Node    Direction    lower     upper
A       1       y            -2.0      -2.0
        3       y            -1.0      -2.0
B       1-4     y            -2.0      +2.0

Table 6.7.4 Optimum designs for ten bar truss with displacement constraints

                      Cross-sectional areas (in²)
Member        Case A      Case B       Member    Case A    Case B
 1            22.66       30.52         6         0.10      0.55
 2             1.40        0.10         7        12.69      7.46
 3            21.58       23.20         8        14.54     21.04
 4             8.43       15.22         9        11.93     21.53
 5             0.10        0.10        10         1.98      0.10
Mass (lbm)    4048.96     5060.85

6.7.2 Twenty-five-bar Truss

The twenty-five-bar truss is shown in Figure 6.7.1. The loading, material properties
and allowables are shown in Tables 6.7.5, 6.7.6, 6.7.7, and 6.7.8, and the final design
is shown in Table 6.7.9. For additional details see Ref. [70].

Figure 6.7.1 Twenty-five-bar truss (a = 63.5 cm (25 in)).

Table 6.7.5 Data for twenty-five-bar truss

Material: aluminum
Young's modulus: $10^7$ psi
Specific mass: 0.1 lbm/in³
Minimum area: 0.01 in²

Table 6.7.6 Allowable stresses for twenty-five-bar truss (psi)

Members    Tension    Compression    Members    Tension    Compression
1          40000      -35092         12,13      40000      -35092
2-5        40000      -11590         14-17      40000      -6759
6-9        40000      -17305         18-21      40000      -6959
10,11      40000      -35092         22-25      40000      -11082

Table 6.7.7 Nodal load components (lbf) for twenty-five-bar truss

Load case    Node    x       y         z
1            1       1000    10000     -5000
             2       0       10000     -5000
             3       500     0         0
             6       500     0         0
2            5       0       20000     -5000
             6       0       -20000    -5000

Table 6.7.8 Displacement allowables for twenty-five-bar truss

        Displacement limits (in)
Node    x        y        z
1       ±0.35    ±0.35    ±0.35
2       ±0.35    ±0.35    ±0.35

Table 6.7.9 Optimum design for twenty-five-bar truss

Design variable    Members    Areas (in²)
1                  1          0.010
2                  2-5        1.987
3                  6-9        2.991
4                  10,11      0.010
5                  12,13      0.012
6                  14-17      0.683
7                  18-21      1.679
8                  22-25      2.664
Mass (lbm)                    545.22

6.7.3 Seventy-two-bar Truss

The seventy-two-bar truss is shown in Figure 6.7.2. The loadings, material properties
and allowables are shown in Tables 6.7.10, 6.7.11, and 6.7.12, and the optimum design
is shown in Table 6.7.13. For additional details see Ref. [70].

Table 6.7.10 Data for seventy-two-bar truss

Material: aluminum
Young's modulus: $10^7$ psi
Specific mass: 0.1 lbm/in³
Allowable stress: ±25 000 psi
Minimum area: 0.01 in²

Figure 6.7.2 Seventy-two-bar truss. Note: For the sake of clarity, not all elements are
drawn in this figure.

Table 6.7.11 Nodal load components (lbf) for seventy-two-bar truss

Load case    Node    x       y         z
1            1       5000    5000      -5000
2            1       0       0         -5000
             2       0       0         -5000
             3       0       0         -5000
             4       0       0         -5000
2            5       0       20000     -5000
             6       0       -20000    -5000
Table 6.7.12 Displacement allowables for seventy-two-bar truss

        Displacement limits (in)
Node    x        y        z
1       ±0.25    ±0.25
2       ±0.25    ±0.25
3       ±0.25    ±0.25
4       ±0.25    ±0.25

Table 6.7.13 Optimum design for seventy-two-bar truss

Design variable    Members    Areas (in²)
 1                 1-4        0.1571
 2                 5-12       0.5356
 3                 13-16      0.4099
 4                 17,18      0.5690
 5                 19-22      0.5067
 6                 23-30      0.5200
 7                 31-34      0.1
 8                 35,36      0.1
 9                 37-40      1.280
10                 41-48      0.5148
11                 49-52      0.1
12                 53,54      0.1
13                 55-58      1.897
14                 59-66      0.5158
15                 67-70      0.1
16                 71,72      0.1
Mass (lbm)                    379.66

6.8 Exercises

1. Show that the conservative approximation, Eq. (6.1.9) is concave, and the ap-
proximation of Eq. (6.1.11) is convex as long as the design variables do not change
their sign.
2. Derive Eq. (6.1.14).
3. Add to Table 6.1.1 a column representing an approximation to the constraint based
on a linear approximation of the force in member C (This linear-force approximation
is due to Vanderplaats and coworkers [15-17]).

Figure 6.8.1 Asymmetric three-bar truss.

4. The three-bar truss in Figure 6.8.1 has members with equal cross-sectional areas.
Calculate the five approximations discussed in Section 6.1 as well as the linear-force
approximation discussed in the previous problem for the stress in member A. Compare
the accuracy and conservativeness of the approximations for changes of ±25% in the
area of member C.
5. Obtain a good approximation to the stress in member A in the previous problem
in terms of the two angles of the truss.
6. The beam in Figure 6.1.1 has a mass density $\rho$, and cross-sectional area proportional
to the square root of the moment of inertia, $A = \alpha\sqrt{I}$. Use the global-local
approximation to obtain the lowest vibration frequency as $I_2/I_1$ is varied from 1 to
2. Use a two-element model as the exact solution, and a one-element model as a global
approximation. Note that this requires you to derive the stiffness matrix of a beam
with a variable cross section.
7. Prove Eq. (6.2.10).
8. Repeat Example 6.2.1 doubling the left spring constant instead of the mass.
9. Use sequential linear programming to design the three-bar truss of Figure 6.1.2
subject to a yield stress constraint of $\sigma_0$ and a minimum gage constraint on all
members of $0.1p/\sigma_0$.
10. Repeat the previous problem with the reciprocal approximation.

6.9 References

[1] Schmit, L.A. Jr., and Farshi, B., "Some Approximation Concepts for Structural
Synthesis," AIAA Journal, 12, 5, 692-699, 1974.
[2] Mills-Curran, W.C., Lust, R.V., and Schmit, L.A. Jr., "Approximation Methods
for Space Frame Synthesis," AIAA Journal, 21 (11),1571-1580,1983.
[3] Storaasli, O.O., and Sobieszczanski, J., "On the Accuracy of the Taylor Approximation
for Structure Resizing," AIAA Journal, 12 (2), 231-233, 1974.

[4] Noor, A.K., and Lowder, H.E., "Structural Reanalysis via a Mixed Method,"
Computers and Structures, 5, 9-12, 1975.
[5] Fuchs, M.B., "Linearized Homogeneous Constraints in Structural Design," Int.
J. Mech. Sci., 22, pp. 33-40, 1980.
[6] Fuchs, M.B., and Haj Ali, R.M., "A Family of Homogeneous Analysis Models
for the Design of Scalable Structures," Structural Optimization, 2, pp. 143-152,
1990.
[7] Starnes, J.H. Jr., and Haftka, R.T., "Preliminary Design of Composite Wings for
Buckling, Stress and Displacement Constraints," Journal of Aircraft, 16, 564-570,
1979.
[8] Haftka, R.T., and Shore, C.P., "Approximate Methods for Combined Thermal-
Structural Analysis," NASA TP-1428, 1979.
[9] Prasad, B., "Explicit Constraint Approximation Forms in Structural Optimization,
Part 1: Analyses and Projections," Computer Methods in Applied Mechanics
and Engineering, 40 (1), 1-26, 1983.
[10] Braibant, V., and Fleury, C., "An Approximation Concept Approach to Shape
Optimal Design," Computer Methods in Applied Mechanics and Engineering, 53,
pp. 119-148, 1985.
[11] Prasad, B., "Novel Concepts for Constraint Treatments and Approximations in
Efficient Structural Synthesis," AIAA J., 22, 7, pp. 957-966, 1984.
[12] Woo, T.H., "Space Frame Optimization Subject to Frequency Constraints,"
AIAA J., 25, 10, pp. 1396-1404, 1987.
[13] Schmit, L.A., Jr., and Miura, H., "Approximation Concepts for Efficient Structural
Synthesis," NASA CR-2552, 1976.
[14] Lust, R.V., and Schmit, L.A., Jr., "Alternative Approximation Concepts for Space
Frame Synthesis," AIAA J., 24, 10, pp. 1676-1684, 1986.
[15] Salajegheh, E., and Vanderplaats, G.N., "An Efficient Approximation Method for
Structural Synthesis with Reference to Space Structures," Space Struct. J., 2, pp.
165-175, 1986/7.
[16] Kodiyalam, S., and Vanderplaats, G.N., "Shape Optimization of 3D Continuum
Structures Via Force Approximation Technique," AIAA J., 27 (9), pp. 1256-1263,
1989.
[17] Hansen, S.R., and Vanderplaats, G.N., "Approximation Method for Configuration
Optimization of Trusses," AIAA J., 28 (1), pp. 161-168, 1990.
[18] Box, G.E.P., and Draper, N.R., Empirical Model-Building and Response Surfaces,
Wiley, New York, 1987.
[19] Barthelemy, J.-F., and Haftka, R.T., "Recent Advances in Approximation Concepts
for Optimum Structural Design," NASA TM 104032, 1991.


[20] Haftka, R.T., Nachlas, J.A., Watson, L.T., Rizzo, T., and Desai, R., "Two-Point
Constraint Approximation in Structural Optimization," Computer Methods in
Applied Mechanics and Engineering, 60, pp. 289-301, 1989.
[21] Fadel, G.M., Riley, M.F., and Barthelemy, J.-F.M., "Two Point Exponential Approximation
Method for Structural Optimization," Structural Optimization, 2,
pp. 117-124, 1990.
[22] Haftka, R.T., "Combining Local and Global Approximations," AIAA Journal,
Vol. 29 (9), pp. 1523-1525, 1991.
[23] Chang, K.-J., Haftka, R.T., Giles, G.L., and Kao, P.-J., "Sensitivity Based Scaling
for Correlating Structural Response from Different Analytical Models," AIAA
Paper 91-0925, Proceedings of AIAA/ASME/ASCE/AHS/ASC 32nd Structures,
Structural Dynamics and Materials Conference, Baltimore, MD, April 8-10, 1991.
[24] Kirsch, U., and Taye, S., "High Quality Approximations of Forces for Optimum
Structural Design," Computers and Structures, 30 (3), pp. 519-527, 1988.
[25] Haley, S.B., "Solution of Modified Matrix Equations," SIAM J. Numer. Anal., 24
(4), pp. 946-951, 1987.
[26] Fuchs, M.B., and Steinberg, Y., "An Efficient Approximate Analysis Method
Based on an Exact Univariate Model for the Element Loads," Structural Optimization,
3 (1), 1991.
[27] Holnicki-Szulc, J., Virtual Distortion Method, Springer Verlag, Berlin, pp. 30-40,
1991.
[28] Pritchard, J.I., and Adelman, H.M., "Differential Equation Based Method for
Accurate Approximation in Optimization," AIAA/ASME/ASCE/AHS/ASC 31st
Structures, Structural Dynamics and Materials Conference, Long Beach, CA,
April 2-4, Part I, pp. 414-424, 1990.
[29] Murthy, D.V., and Haftka, R.T., "Approximations to Eigenvalues of Modified
General Matrices," Computers and Structures, 29, pp. 903-917, 1988.
[30] Shephard, M.S., and Yerry, M.A., "Automatic Finite Element Modeling for Use
with Three-Dimensional Shape Optimization," in The Optimum Shape (Bennett,
J.A., and Botkin, M.E., eds.), Plenum Press, N.Y., 1986, pp. 113-135.
[31] Yang, R.J., and Botkin, M.E., "A Modular Approach for Three-Dimensional
Shape Optimization of Structures," AIAA J., 25 (3), pp. 492-497, 1987.
[32] Kohn, R.V., and Strang, G., "Optimal Design and Relaxation of Variational
Problems," Comm. Pure Appl. Math., 39, pp. 113-137 (Part I), pp. 139-182
(Part II), and pp. 353-377 (Part III), 1986.
[33] Rozvany, G.I.N., Ong, T.G., Szeto, W.T., Olhoff, N., and Bendsøe, M.P., "Least-
Weight Design of Perforated Plates," Int. J. Solids Struct., 23, pp. 521-536 (Part
I), and pp. 537-550 (Part II), 1987.

[34] Bendsøe, M.P., and Kikuchi, N., "Generating Optimal Topologies in Structural
Design using a Homogenization Method," Comp. Meth. Appl. Mech. Engng.,
71, pp. 197-224, 1988.
[35] Rasmussen, J., "Shape Optimization and CAD," SARA, 1, 33-45, 1991.
[36] Dorn, W.S., Gomory, R.E., and Greenberg, H.J., "Automatic Design of Optimal
Structures," J. Mecanique, 3, pp. 25-52, 1964.
[37] Sheu, C.Y., and Schmit, L.A., "Minimum Weight Design of Elastic Redundant
Trusses under Multiple Static Loading Conditions," AIAA J., 10 (2), pp. 155-
162, 1972.
[38] Reinschmidt, K.F., and Russell, A.D., "Applications of Linear Programming in
Structural Layout and Optimization," Comput. Struct., 4, pp. 855-869, 1974.
[39] Topping, B.H.V., "Shape Optimization of Skeletal Structures-a Review," ASCE
J. Struct. Engng., 109 (8), pp. 1933-1951, 1983.
[40] Kirsch, U., "Optimal Topologies of Structures," Appl. Mech. Rev., 42 (8), pp.
223-239, 1989.
[41] McCullers, L.A., and Lynch, R.W., "Composite Wing Design for Aeroelastic Tai-
loring Requirements," Air Force Conference on Fibrous Composites in Flight
Vehicle Design, Dayton, Ohio, September, 1972.
[42] McCullers, L.A., and Lynch, R.W., "Dynamic Characteristics of Advanced Fila-
mentary Composites Structures," AFFDL-TR-73-111, Vol. II, 1974.
[43] Haftka, R.T., "Structural Optimization with Aeroelastic Constraints-A Survey
of US Applications," Int. J. Vehicle Design, 7, pp. 381-392, 1986.
[44] Schmit, L.A., and Miura, H., "A New Structural Analysis/Synthesis Capability
- ACCESS 1," AIAA J., 14 (5), pp. 661-671, 1976.
[45] Fleury, C., and Schmit, L.A., "ACCESS 3-Approximation Concepts Code for Ef-
ficient Structural Synthesis--User's Guide," NASA CR-159260, September 1980.
[46] Wilkinson, K., et al., "An Automated Procedure for Flutter and Strength Anal-
ysis and Optimization of Aerospace Vehicles, Vol. I-Theory, Vol. II-Program
User's Manual," AFFDL-TR-75-137, 1975.
[47] Venkayya, V.B., and Tischler, V.A., "OPSTAT-A Computer Program for Opti-
mal Design of Structures Subjected to Static Loads," AFFDL-TR-79-67,1979.
[48] Khot, N.S., "Computer Program (OPTCOMP) for Optimization of Composite
Structures for Minimum Weight Design," AFFDL-TR-76-149, 1977.
[49] Gellatly, R.A., Dupree, D.M., and Berke, L., "OPTIMUM II: A MAGIC Compatible
Large Scale Automated Minimum Weight Design Program," AFFDL-TR-74-97,
Vols. I and II, 1974.

[50] Isakson, G., and Pardo, H., "ASOP-3: A Program for the Minimum Weight Design
of Structures Subjected to Strength and Deflection Constraints," AFFDL-TR-76-
157, 1976.
[51] Bartholomew, P., and Wellen, H.K., "Computer Aided Optimization of Aircraft
Structures," J. Aircraft, 27 (12), pp. 1079-1086,1990.
[52] Kiusalaas, J., and Reddy, G.B., "DESAP 2-A Structural Design Program with
Stress and Buckling Constraints," NASA CR-2797 to 2799, 1977.
[53] Haftka, R.T., and Prasad, B., "Programs for Analysis and Resizing of Complex
Structures," Comput. Struct., 10, pp. 323-330, 1979.
[54] Sobieszczanski-Sobieski, J., and Rogers, J.L., Jr., "A Programming System for
Research and Applications in Structural Optimization," Int. Symposium on Op-
timum Structural Design, Tucson, Arizona, pp. 11-9-11-21, 1981.
[55] Walsh, J.L., "Application of Mathematical Optimization Procedures to a Struc-
tural Model of a Large Finite-Element Wing," NASA TM-87597, 1986.
[56] Vanderplaats, G.N., "CONMIN- A Fortran Program for Constrained Function
Minimization: User's manual," NASA TM X-62282, 1973.
[57] Brama, T., "Applications of Structural Optimization Software in the Design Pro-
cess," in Computer Aided Optimum Design of Structures: Applications, (Eds, C.
A. Brebbia and S. Hernandez), Computational Mechanics Publications, Springer-
Verlag, 1989, pp. 13-21.
[58] Neill, D.J., Johnson, E.H., and Canfield, R., "ASTROS-A Multidisciplinary Au-
tomated Structural Design Tool," J. Aircraft, 27,12, pp. 1021-1027,1990.
[59] Atrek, E., "SHAPE: A Program for Shape Optimization of Continuum Structures,"
in Computer Aided Optimum Design of Structures: Applications, (Eds, C.
A. Brebbia and S. Hernandez), Computational Mechanics Publications, Springer-
Verlag, 1989, pp. 135-144.
[60] Hariran, M., Paeng, J.K., and Belsare, S., "STROPT-the Structural Optimiza-
tion System," Proceedings of the 7th International Conference on Vehicle Struc-
tural Mechanics, Detroit, MI, April 11-13, 1988, SAE, pp. 27-38.
[61] Vanderplaats, G.N., Miura, H., Nagendra, G., and Wallerstein, D., "Optimization
of Large Scale Structures using MSC/NASTRAN," in Computer Aided Optimum
Design of Structures: Applications, (Eds, C. A. Brebbia and S. Hernandez),
Computational Mechanics Publications, Springer-Verlag, 1989, pp. 51-68.
[62] Ward, P., and Cobb, W.G.C., "Application of I-DEAS Optimization for the Static
and Dynamic Optimization of Engineering Structures," in Computer Aided Optimum
Design of Structures: Applications, (Eds, C. A. Brebbia and S. Hernandez),
Computational Mechanics Publications, Springer-Verlag, 1989, pp. 33-50.
[63] GENESIS User's Manual (version 1.00), VMA Engineering, Goleta, California,
September, 1991.
[64] Vanderplaats, G.N., "ADS: A FORTRAN Program for Automated Design Synthesis,"
VMA Engineering, Inc., Goleta, California, May 1985.
[65] DOT User's Manual (version 2.0B), VMA Engineering, Inc. Goleta, California,
Sept. 1990.
[66] DOC User's manual (version 1.00), VMA Engineering, Inc. Goleta, California,
March 1991.
[67] Miura, H., and Schmit, L.A., Jr., "NEWSUMT-A Fortran Program for Inequal-
ity Constrained Function Minimization-User's Guide," NASA CR-159070, June,
1979.
[68] Grandhi, R.V., Thareja, R., and Haftka, R.T., "NEWSUMT-A: A General Pur-
pose Program for Constrained Optimization Using Constraint Approximations,"
ASME Journal of Mechanisms, Transmissions and Automation in Design, 107,
pp. 94-99, 1985.
[69] Arora, J.S., and Tseng, C.H., "User Manual for IDESIGN: Version 3.5," Optimal
Design Laboratory, College of Engineering, The University of Iowa, Iowa City,
1987.
[70] Fleury, C., and Schmit, L.A. Jr., "Dual Methods and Approximation Concepts
in Structural Synthesis," NASA CR-3226, December, 1980.

7 Sensitivity of Discrete Systems

The first step in the analysis of a complex structure is spatial discretization of the
continuum equations into a finite element, finite difference or a similar model. The
analysis problem then requires the solution of algebraic equations (static response),
algebraic eigenvalue problems (buckling or vibration) or ordinary differential equa-
tions (transient response). The sensitivity calculation is then equivalent to the math-
ematical problem of obtaining the derivatives of the solutions of those equations with
respect to their coefficients. This is the main subject of the present chapter.
In some cases it is advantageous to differentiate the continuum equations govern-
ing the structure with respect to design variables before the process of discretization.
One advantage is that the resulting sensitivity equations are equally applicable to
various analysis techniques, whether finite element, Ritz solution, collocation, etc.
This approach is discussed in the next chapter.
As noted in Chapter 6, the calculation of the sensitivity of structural response to
changes in design variables is often the major computational cost of the optimization
process. Therefore, it is important to have efficient algorithms for evaluating these
sensitivity derivatives.
The sensitivity of structural response to problem parameters also has other ap-
plications. For example, it is usually impossible to know all the parameters of a
structural model, such as material properties, loads and dimensions exactly. The
sensitivity of the response to small variations in these parameters is essential for
calculating the statistical variation in the response of the structure.
The simplest technique for calculating derivatives of response with respect to a
design variable is the finite-difference approximation. This technique is often com-
putationally expensive, but is easy to implement and very popular. The efficiency of
the analytical methods discussed in the present chapter is measured by comparison
to the finite-difference alternative. Unfortunately, finite-difference approximations
often have accuracy problems. We begin this chapter with a discussion of these
approximations to sensitivity derivatives.

7.1 Finite Difference Approximations

The simplest finite-difference approximation is the first-order forward-difference
approximation. Given a function $u(x)$ of a design variable $x$, the forward-difference
approximation $\Delta u/\Delta x$ to the derivative $du/dx$ is given as

$$\frac{\Delta u}{\Delta x} = \frac{u(x + \Delta x) - u(x)}{\Delta x}. \tag{7.1.1}$$

Another commonly used finite-difference approximation is the second-order central-
difference approximation

$$\frac{\Delta u}{\Delta x} = \frac{u(x + \Delta x) - u(x - \Delta x)}{2\Delta x}. \tag{7.1.2}$$

It is also possible to employ higher-order finite-difference approximations, but they
are rarely used in structural optimization applications because of the associated high
computational cost. If we need to find the derivatives of the structural response
with respect to $n$ design variables, the forward-difference approximation requires $n$
additional analyses, the central-difference approximation $2n$ additional analyses, and
higher-order approximations are even more expensive.
The key to the selection of the approximation and the step size $\Delta x$ is an estimate
of the required accuracy. This topic is discussed in [1] and [2], and is summarized in
the following section.
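The two formulas and the analysis counts quoted above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the call-counting wrapper `CountedResponse` and the algebraic test function `response` (standing in for a structural analysis) are assumptions introduced here, not part of the original text.

```python
def forward_difference_gradient(u, x, dx):
    """Forward-difference gradient, Eq. (7.1.1): n additional analyses."""
    u0 = u(x)  # nominal analysis
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += dx
        grad.append((u(xp) - u0) / dx)
    return grad


def central_difference_gradient(u, x, dx):
    """Central-difference gradient, Eq. (7.1.2): 2n additional analyses."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += dx
        xm[i] -= dx
        grad.append((u(xp) - u(xm)) / (2.0 * dx))
    return grad


class CountedResponse:
    """Wraps a response function and counts how many analyses are run."""

    def __init__(self, func):
        self.func = func
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.func(x)


def response(x):
    # Hypothetical stand-in for a structural analysis, chosen only for the demo.
    return x[0] ** 2 + 3.0 * x[0] * x[1]


u_fwd = CountedResponse(response)
g_fwd = forward_difference_gradient(u_fwd, [1.0, 2.0], 1e-4)

u_cen = CountedResponse(response)
g_cen = central_difference_gradient(u_cen, [1.0, 2.0], 1e-4)

print(g_fwd, u_fwd.calls)  # estimate from 1 nominal + n perturbed analyses
print(g_cen, u_cen.calls)  # estimate from 2n perturbed analyses
```

For this quadratic test function the central-difference estimate has no truncation error, while the forward-difference estimate of the first component is off by roughly the step size.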

7.1.1 Accuracy and Step Size Selection

Whenever finite-difference formulae are used to approximate derivatives, there are
two sources of error: truncation and condition errors. The truncation error $e_T(\Delta x)$
is a result of the neglected terms in the Taylor series expansion of the perturbed
function. For example, the Taylor series expansion of $u(x + \Delta x)$ can be written as

$$u(x + \Delta x) = u(x) + \Delta x\,\frac{du}{dx}(x) + \frac{(\Delta x)^2}{2}\,\frac{d^2u}{dx^2}(x + \zeta\Delta x), \qquad 0 \le \zeta \le 1. \tag{7.1.3}$$

From Eq. (7.1.3) it follows that the truncation error for the forward-difference
approximation is

$$e_T(\Delta x) = \frac{\Delta x}{2}\,\frac{d^2u}{dx^2}(x + \zeta\Delta x), \qquad 0 \le \zeta \le 1. \tag{7.1.4}$$

Similarly, by including one more term in the Taylor series expansion we find that the
truncation error for the central-difference approximation is

$$e_T(\Delta x) = \frac{(\Delta x)^2}{6}\,\frac{d^3u}{dx^3}(x + \zeta\Delta x), \qquad -1 \le \zeta \le 1. \tag{7.1.5}$$

The condition error is the difference between the numerical evaluation of the function
and its exact value. One contribution to the condition error is round-off error in
calculating $du/dx$ from the original and perturbed values of $u$. This contribution
is comparatively small for most computers unless $\Delta x$ is extremely small. However,
if $u(x)$ is computed by a lengthy or ill-conditioned numerical process, the round-off
contribution to the condition error can be substantial. Additional condition errors
may occur if $u(x)$ is calculated by an iterative process which is terminated early.
If we have a bound $\epsilon_u$ on the absolute error in the computed function $u$, we can
estimate the condition error. For example, for the forward-difference approximation
the condition error $e_c(\Delta x)$ is (very!) conservatively estimated from Eq. (7.1.1) as

$$e_c(\Delta x) = \frac{2}{\Delta x}\,\epsilon_u. \tag{7.1.6}$$

Equations (7.1.4) and (7.1.6) present us with the so-called "step-size dilemma." If we
select the step size to be small, so as to reduce the truncation error, we may have an
excessive condition error. In some cases there may not be any step size which yields
an acceptable error!
Example 7.1.1

Suppose the function $u(x)$ is defined as the solution of the following two equations

$$101u + xv = 10,$$
$$xu + 100v = 10,$$

and let us consider the derivative $du/dx$ evaluated at $x = 100$.

Figure 7.1.1 Effect of step size on derivative. (Plot of the forward-difference and
central-difference values of $du/dx$ against step size, for step sizes from 0.00001 to 1.)

The solution for $u$ is

$$u = \frac{-10x + 1000}{10100 - x^2},$$

and the exact value of $du/dx$ at $x = 100$ is $-0.10$. The forward-difference and central-
difference derivatives are plotted in Figure 7.1.1 for a range of step sizes. Note that
for the very small step sizes the error oscillates because the condition error is not a
continuous function. For the higher step sizes the total error is dominated by the
truncation error, which is a smooth function of the step size. We can change the
problem slightly to make it more ill-conditioned, and increase the condition error, as
follows

$$10001u + xv = 1000,$$
$$xu + 10000v = 1000.$$

The values of the forward- and central-difference approximations at $x = 10000$ are
shown in Figure 7.1.2. Now the range of acceptable step sizes is narrowed and we have
to use the central-difference approximation if we want to have a reasonable range. •••

Figure 7.1.2 Effect of step size on derivative. (Plot of the forward-difference and
central-difference values of $du/dx$ against step size, for step sizes from 0.001 to 1.)
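The behavior described in Example 7.1.1 can be explored with a short Python sketch. It solves the first pair of equations by Gaussian elimination and differences the result. To make the condition error visible at moderate step sizes, the sketch optionally rounds every intermediate result to single precision; this rounding device (`to_single`), the elimination order and all function names are assumptions made here for illustration, and the single-precision numbers it prints are not claimed to match the figures.

```python
import struct


def to_single(value):
    """Round a float to the nearest IEEE single-precision number."""
    return struct.unpack("f", struct.pack("f", value))[0]


def exact(value):
    """No extra rounding: ordinary double-precision arithmetic."""
    return value


def solve_u(x, rnd=exact):
    """Solve 101u + xv = 10, xu + 100v = 10 for u by Gaussian elimination.

    Every intermediate result is passed through rnd, so rnd=to_single
    mimics a lower-precision analysis with a larger condition error.
    """
    x = rnd(x)
    m = rnd(x / 101.0)                 # elimination multiplier
    a22 = rnd(100.0 - rnd(m * x))      # reduced diagonal term
    b2 = rnd(10.0 - rnd(m * 10.0))     # reduced right-hand side
    v = rnd(b2 / a22)
    return rnd(rnd(10.0 - rnd(x * v)) / 101.0)


def forward(x, dx, rnd=exact):
    return (solve_u(x + dx, rnd) - solve_u(x, rnd)) / dx


def central(x, dx, rnd=exact):
    return (solve_u(x + dx, rnd) - solve_u(x - dx, rnd)) / (2.0 * dx)


print("step      forward   central   forward(single) central(single)")
for dx in (1.0, 0.1, 0.01, 0.001, 0.0001, 0.00001):
    print(f"{dx:8.5f} {forward(100.0, dx):9.5f} {central(100.0, dx):9.5f}"
          f" {forward(100.0, dx, to_single):12.5f}"
          f" {central(100.0, dx, to_single):15.5f}")
```

In double precision the large-step rows show the smooth truncation error, with the central difference much closer to the exact value of -0.1 than the forward difference at the same step.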

A bound $e$ on the total error, the sum of the truncation and condition errors,
for the forward-difference approximation is obtained from Eqs. (7.1.4) and (7.1.6)
as

$$e = \frac{\Delta x}{2}\,|s_b| + \frac{2}{\Delta x}\,\epsilon_u, \tag{7.1.7}$$

where $s_b$ is a bound on the second derivative in the interval $[x, x + \Delta x]$. When $\epsilon_u$ and
$s_b$ are available it is possible to calculate an optimum step size that minimizes $e$ as

$$\Delta x_{opt} = 2\sqrt{\frac{\epsilon_u}{|s_b|}}. \tag{7.1.8}$$

Procedures for estimating $s_b$ and $\epsilon_u$ are given in [1] and [2].
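Equation (7.1.8) follows from setting the derivative of the bound in Eq. (7.1.7) with respect to the step size to zero, which gives $|s_b|/2 - 2\epsilon_u/\Delta x^2 = 0$. A minimal Python sketch of the trade-off is given below; the numerical values of `s_b` and `eps_u` are hypothetical and chosen only for illustration.

```python
import math


def total_error_bound(dx, s_b, eps_u):
    """Eq. (7.1.7): truncation bound plus condition bound (forward difference)."""
    return 0.5 * dx * abs(s_b) + 2.0 * eps_u / dx


def optimum_step(s_b, eps_u):
    """Eq. (7.1.8): step size that minimizes the bound of Eq. (7.1.7)."""
    return 2.0 * math.sqrt(eps_u / abs(s_b))


# Hypothetical values chosen only for illustration.
s_b = 4.0       # assumed bound on the second derivative
eps_u = 1.0e-8  # assumed bound on the absolute error in u

dx_opt = optimum_step(s_b, eps_u)
e_opt = total_error_bound(dx_opt, s_b, eps_u)

for dx in (0.01 * dx_opt, 0.1 * dx_opt, dx_opt, 10.0 * dx_opt, 100.0 * dx_opt):
    print(f"dx = {dx:.3e}   error bound = {total_error_bound(dx, s_b, eps_u):.3e}")
```

At the optimum the two contributions are equal, so the smallest attainable bound is $2\sqrt{\epsilon_u |s_b|}$; steps smaller or larger than the optimum both increase the bound.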
7.1.2 Iterative Methods

Condition errors can become important when iterative methods are used for
performing some of the calculations. Consider a simple example of a single displacement
component $u$ which is obtained by solving a nonlinear algebraic equation which
depends on one design variable $x$

$$f(x, u) = 0. \tag{7.1.9}$$

The solution of Eq. (7.1.9) is obtained by an iterative process which starts with
some initial guess of $u$ and terminates when the iterate $\tilde{u}$ is estimated to be within
some tolerance $\epsilon$ of the exact $u$ (note that $\epsilon$ is a bound on the condition error in
$\tilde{u}$). To calculate the derivative $du/dx$, assume that we use the forward-difference
approximation. That is, we perturb $x$ by $\Delta x$ and solve Eq. (7.1.9) for $u_\Delta$

$$f(x + \Delta x, u_\Delta) = 0. \tag{7.1.10}$$

The iterative solution of Eq. (7.1.10) yields an approximation $\tilde{u}_\Delta$, and then $du/dx$ is
approximated as

$$\frac{du}{dx} \approx \frac{\tilde{u}_\Delta - \tilde{u}}{\Delta x}. \tag{7.1.11}$$

To start the iterative process for obtaining $\tilde{u}_\Delta$, we can use either of two initial guesses.
The first is the same initial guess that was used to solve for $u$. If the convergence
of the iterative process is monotonic there is a good chance that when we use Eq.
(7.1.11) the errors in $\tilde{u}$ and $\tilde{u}_\Delta$ will almost cancel out, and we will get a very small
condition error. The other logical initial guess for $\tilde{u}_\Delta$ is $\tilde{u}$. This initial guess is good if
$\Delta x$ is small, and so we may get fast convergence. Unfortunately, this time we cannot
expect the condition errors to cancel. As we iterate on $\tilde{u}_\Delta$, the original error (the
difference between $\tilde{u}$ and $u$) will be reduced at the same time that the change due to
$\Delta x$ is taking effect. (Consider, for example, what happens if $\Delta x$ is set to zero, or to an
extremely small number.)
Reference [3] suggests a strategy which allows us to start the iteration for $\tilde{u}_\Delta$ from
$\tilde{u}$ without worrying about excessive condition errors. The approach is to pretend that
$\tilde{u}$ is the exact rather than approximate solution by changing the problem that we want
to solve. Indeed, $\tilde{u}$ is the exact solution of

$$f(x, u) - f(x, \tilde{u}) = 0, \tag{7.1.12}$$

which is only slightly different from our original problem (because $f(x, \tilde{u})$ is almost
zero). We now find the derivative $du/dx$ from Eq. (7.1.12), by obtaining $u_\Delta$ as the
solution of

$$f(x + \Delta x, u_\Delta) - f(x, \tilde{u}) = 0. \tag{7.1.13}$$

Because $\tilde{u}$ is the exact solution of this equation for $\Delta x = 0$, the iterative process will
only reflect the effect of $\Delta x$.
Example 7.1.2

Consider the nonlinear equation

$$f(u, x) = u^2 - x = 0,$$

and the iterative solution process

$$u_m = 0.5\,(u_{m-1} + x/u_{m-1}),$$

which is an application of Newton's method to the square-root problem and therefore
has quadratic convergence properties.

Table 7.1.1 Iteration history starting with u = x

         x = 1000            x + Δx = 1000.1               x + Δx = 1100
Iter.    u         f         u_Δ       f         Δu/Δx     u_Δ       f           Δu/Δx
0        1000.00   999,000   1000.10   999,000   0.99850   1100.00   1,208,000   1.00000
1        500.500   250,000   500.550   250,000   0.49800   550.500   302,000     0.50000
2        251.249   62,100    251.274   62,100    0.24900   276.249   75,200      0.25000
3        127.615   15,300    127.627   15,300    0.12450   140.115   18,500      0.12500
4        67.7253   3,590     67.7315   3,590     0.06225   73.9380   4,370       0.06258
5        41.2454   701.2     41.2486   701.3     0.03174   44.4256   873.6       0.03180
6        32.7453   72.25     32.7471   72.27     0.01862   34.5930   96.68       0.01848
7        31.6420   1.216     31.6436   1.217     0.01587   33.1957   1.954       0.01553
8        31.6228   -0.005    31.6244   0.000     0.01587   33.1663   0.0007      0.01543

Exact values: u(x = 1000) = 31.6228; du/dx = 0.01581


Table 7.1.1 shows the convergence of $u$ for $x = 1000$, $x = 1000.1$ and $x = 1100$,
and the estimate of the derivative $du/dx$ at $x = 1000$. The first guess for $u$ is taken to
be $x$ in all three cases. Note that far from the solution the convergence is slow, with
the error being halved at each iteration. As the error gets smaller the convergence
rate increases. It is seen that the convergence of the derivative is slightly slower than
that of $u$. Also, we do not see that the small $\Delta x$ leads to any large condition errors
as compared to the large $\Delta x$. This is due to the monotonic convergence and the
resulting cancellation of condition errors.
Now we switch the first guess of the perturbed solution to an iterate of the nominal
one. Starting the perturbed solution from a good approximation to the nominal
solution we obtain fast convergence; usually we need only one or two iterations.
Therefore, the value of the finite-difference derivative remains virtually constant after
the first two iterations. Table 7.1.2 shows the second iterate $u_2$ obtained when the
perturbed solution is started from each of the last four iterates of the nominal solution
given in Table 7.1.1.
Inspection of Table 7.1.2 shows that, because the perturbed solution is more
accurate than the nominal one, the derivative obtained by finite differences is erroneous,
except at very high accuracies (low $\epsilon$). The effect of the finite-difference increment
$\Delta x$ is also evident. The errors for the small $\Delta x$ are larger than for the larger $\Delta x$,
except when $u_0$ has fully converged (so that there is no condition error).

Table 7.1.2 Effect of starting u_Δ from u_0

            x + Δx = 1000.1         x + Δx = 1100
u_0†        u_2       Δu/Δx         u_2       Δu/Δx
41.2454     31.6436   -96.0181      33.1755   -0.08070
32.7453     31.6244   -11.2093      33.1662   0.00421
31.6420     31.6243   -0.1772       33.1663   0.01524
31.6228     31.6243   0.01572       33.1663   0.01543

† u_0 are iterates from Table 7.1.1.

We now use the approach of Eq. (7.1.13), replacing the original equation by

$$u^2 - x - \tilde{f} = 0,$$

where $\tilde{f}$ is the residual of the last iterate of the nominal solution. That is, for the
perturbed solution we try to calculate the root of $x + \tilde{f}$ instead of $x$. The results
of the modified calculation are shown in Table 7.1.3. We can now get a reasonable
approximation to the derivative in two iterations. •••
Table 7.1.3 Modified derivative calculation

            x + Δx = 1100           x + Δx = 1000.1
u_0         u_2       Δu/Δx         u_2       Δu/Δx
41.2454     42.4404   0.01195       41.2466   0.01205
32.7453     34.2382   0.01493       32.7468   0.01511
31.6420     33.1846   0.01543       31.6436   0.01572
31.6228     33.1663   0.01543       31.6243   0.01572
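The contrast between restarting the original problem and restarting the modified problem of Eq. (7.1.13) can be checked with a short Python sketch of Example 7.1.2. It is run in double precision, so its output should be close to, but not digit-for-digit the same as, the first rows of Tables 7.1.2 and 7.1.3 for the small step. The function name `newton_sqrt` and the variable names are assumptions introduced here.

```python
def newton_sqrt(target, start, iterations):
    """Apply u_m = 0.5 (u_{m-1} + target / u_{m-1}) a fixed number of times."""
    u = start
    for _ in range(iterations):
        u = 0.5 * (u + target / u)
    return u


x, dx = 1000.0, 0.1

# Partially converged nominal solution: five iterations starting from u = x.
u0 = newton_sqrt(x, x, 5)

# Naive restart: solve u^2 - (x + dx) = 0 starting from u0.
u2_naive = newton_sqrt(x + dx, u0, 2)
naive = (u2_naive - u0) / dx

# Modified problem, Eq. (7.1.13): u^2 - (x + dx) - residual = 0,
# for which u0 is the exact solution when dx = 0.
residual = u0 * u0 - x
u2_modified = newton_sqrt(x + dx + residual, u0, 2)
modified = (u2_modified - u0) / dx

exact = 0.5 / x ** 0.5  # d(sqrt(x))/dx at x = 1000

print(f"u0               = {u0:.4f}")
print(f"naive restart    = {naive:.4f}")
print(f"modified problem = {modified:.5f}")
print(f"exact derivative = {exact:.5f}")
```

The naive restart mostly measures how far the nominal iterate was from convergence, whereas the modified problem isolates the effect of the perturbation even though the nominal solution is still poorly converged.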

Cost and accuracy considerations often dictate that we avoid the use of finite-
difference derivatives. For static displacement and stress constraints analytical deriva-
tives are fairly easy to get, as discussed in the next section.

7.1.3 Effect of Derivative Magnitude on Accuracy

It is well known that small displacements and stresses are not calculated as accurately
as large stresses and displacements. The same applies to derivatives. When both the
function $u$ and the variable $x$ are positive, the relative magnitude of the derivative
can be estimated from the logarithmic derivative

$$\frac{d_l u}{dx} = \frac{d(\log u)}{d(\log x)} = \frac{du/u}{dx/x}. \tag{7.1.14}$$

The logarithmic derivative gives the percentage change in $u$ due to a one percent change in
$x$. Therefore, when the logarithmic derivative is larger than unity the relative change
in $u$ is larger than the relative change in $x$ and the derivative can be considered to
be large. When the logarithmic derivative is much smaller than unity, the relative
change in $u$ is much smaller than the relative change in $x$. In this case the derivative
is considered to be small and, in general, it would be difficult to evaluate it accurately
using finite-difference differentiation (or any other procedure subject to condition or
truncation errors). Fortunately, when the logarithmic derivative is small it is usually
not important to evaluate it accurately, because its influence on the optimization
process is small.
The logarithmic derivative can be misleading when a variable is about to change
sign, so that it is very small in magnitude. In that case we recommend using typical
values of $u$ and $x$ instead of local values. That is, we define a modified logarithmic
derivative $d_{lm}u/dx$ as

$$\frac{d_{lm}u}{dx} = \frac{du/u_t}{dx/x_t}, \tag{7.1.15}$$

where $x_t$ and $u_t$ are representative values of the variable and the function, respectively.

Example 7.1.3

The increased error associated with small derivatives is demonstrated in the following
simple design problem. We consider the design of a submerged beam of rectangular
cross section so as to minimize the perimeter of the cross section (so as to reduce
corrosion damage). The beam is subject to a bending moment $M$ and we require the
maximum bending stress to be less than the allowable stress $\sigma_0$. The design variables
are the width $b$ and height $h$ of the rectangular cross section. The problem can be
formulated as

$$\text{minimize } 2(b + h) \quad \text{such that } \frac{6M}{bh^2} \le \sigma_0.$$

We nondimensionalize the problem by defining a characteristic length $l$ (taking
$l^3 = 6M/\sigma_0$ normalizes the constraint) and using it to define new design variables
$x_1$ and $x_2$ as

$$x_1 = b/l, \qquad x_2 = h/l.$$

In terms of the new variables the problem can be reformulated as

$$\text{minimize } u = x_1 + x_2 \quad \text{such that } \frac{1}{x_1 x_2^2} = 1,$$

where the inequality has been replaced by an equality because it is clear that the
stress constraint will be active (otherwise the solution is $b = h = 0$). The equality
can be used to eliminate $x_1$, so that the objective function can be written as

$$u = 1/x_2^2 + x_2.$$

We now consider the calculation of the derivative by finite differences at two points:
at an initial design where $x_2 = 1$, and near the optimum, at $x_2 = 1.29$. In both cases
we use forward differences with $\Delta x_2 = 0.01$. At $x_2 = 1$ we get

$$\frac{\Delta u}{\Delta x_2} = \frac{1/1.01^2 + 1.01 - 2}{0.01} = -0.970,$$

which is 3 percent off the exact value of the derivative $du/dx_2 = -1.0$. However, at
$x_2 = 1.29$ we get

$$\frac{\Delta u}{\Delta x_2} = \frac{1/1.30^2 + 1.30 - (1/1.29^2 + 1.29)}{0.01} = 0.0791,$$

which is 16 percent off the exact value of 0.0683. The logarithmic derivative can
warn us that we should expect the large relative error in the second case. Indeed, for
$x_2 = 1$, we have $u = 2.0$, and the logarithmic derivative is estimated from the finite-
difference derivative to be

$$\frac{d_l u}{dx_2} \approx \frac{\Delta_l u}{\Delta x_2} = \frac{\Delta u}{\Delta x_2}\,\frac{x_2}{u} = -0.97 \times 1/2 = -0.485.$$

At $x_2 = 1.29$ we have $u = 1.891$ and

$$\frac{d_l u}{dx_2} \approx \frac{\Delta_l u}{\Delta x_2} = \frac{\Delta u}{\Delta x_2}\,\frac{x_2}{u} = 0.0791 \times 1.29/1.891 = 0.054,$$

so that the logarithmic derivative is indeed quite small. •••
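The numbers in Example 7.1.3 can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic only; the function names are assumptions introduced here.

```python
def u(x2):
    """Objective of Example 7.1.3 after eliminating x1."""
    return 1.0 / x2 ** 2 + x2


def du_dx2(x2):
    """Exact derivative of the objective."""
    return 1.0 - 2.0 / x2 ** 3


def forward_difference(x2, step):
    """Forward-difference estimate of du/dx2, Eq. (7.1.1)."""
    return (u(x2 + step) - u(x2)) / step


def logarithmic_derivative(derivative, x2):
    """Eq. (7.1.14): (du/u) / (dx/x)."""
    return derivative * x2 / u(x2)


for x2 in (1.0, 1.29):
    fd = forward_difference(x2, 0.01)
    exact = du_dx2(x2)
    rel_error = abs(fd - exact) / abs(exact)
    log_fd = logarithmic_derivative(fd, x2)
    print(f"x2 = {x2}: fd = {fd:.4f}, exact = {exact:.4f}, "
          f"relative error = {rel_error:.1%}, log derivative = {log_fd:.3f}")
```

The same step size gives a relative error of about 3 percent at the initial design and about 16 percent near the optimum, where the logarithmic derivative is roughly an order of magnitude smaller.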

7.2 Sensitivity Derivatives of Static Displacement and Stress Constraints

7.2.1 Analytical First Derivatives

The equations of equilibrium in terms of the nodal displacement vector u are gener-
ated from a finite element model in the form

Ku=f, (7.2.1 )
where K is the stiffness matrix and f is a load vector. A typical constraint, involving
a limit on a displacement or a stress component, may be written as

g(u,x)~O, (7.2.2)
where, for the sake of simplified notation, it is assumed that g depends on only a
single design variable x. Using the chain rule of differentiation, we obtain

(7.2.3)

263
Chapter 7: Sensitivity of Discrete Systems

where z is a vector with components

ag
Z;=-. (7.2.4)
au;
Note that we use the notation dg/dx to denote the total derivative of 9 with respect
to x. This total derivative includes the explicit part ag/ax plus the implicit part
through the dependence on u. The explicit part of the derivative is usually zero or
easy to obtain, so we discuss only the computation of the implicit part. Differentiating
Eq. (7.2.1) with respect to x we obtain

K du = df _ dK u . (7.2.5)
dx dx dx
Premultiplying Eq. (7.2.5) by zTK-l obtain

zT du = zTK-1(df _ dK u }. (7.2.6)
dx dx dx

Numerically, the calculation of ZT du/dx may be performed in two ways. The


first, called the direct method, consists of solving Eq. (7.2.5) for du/dx and then
taking the scalar product with z. The second approach, called the adjoint method,
defines an adjoint vector A which is the solution of the system

KA=Z, (7.2.7)

and then we write Eq. (7.2.3)as

(7.2.8)

where we have used the symmetry of K.


The solution of Eq. (7.2.7) for A is similar to a solution for displacement under a
load vector z. The adjoint method is also known as the dummy-load method because
z is often described as a dummy load. When 9 in Eq. (7.2.2) is an upper limit on a
single displacement component, the dummy load also has a single nonzero component
corresponding to the constrained displacement component. Similarly, when 9 is an
upper limit on the stress in a truss member, the dummy load is composed of a pair
of equal and opposite forces acting on the two ends of the member.
For this case of static response the derivation of the adjoint technique is very
simple. However the technique will be used in many other cases where we will want to
calculate the derivative of a constraint without having to calculate first the derivative
of the response u. We repeat the derivation of the adjoint method in a procedure that
is applicable to the general case. This procedure consists of adding the derivative of
the equations of equilibrium multiplied by a Lagrange multiplier to the derivative of
the constraint. The Lagrange multiplier, which is equal to the adjoint vector, is then
264
Section 7.2: Sensitivity Derivatives of Static Displacement and Stress Constmints

selected to satisfy equations that lead to elimination of the derivative of the response.
For the present case we rewrite Eq. (7.2.3) as

$$\frac{dg}{dx} = \frac{\partial g}{\partial x} + z^T\frac{du}{dx} + \lambda^T\left(\frac{df}{dx} - \frac{dK}{dx}u - K\frac{du}{dx}\right), \tag{7.2.9}$$

where the additional term is the adjoint vector times the derivative of the equations
of equilibrium. Rearranging the terms in Eq. (7.2.9) we have

$$\frac{dg}{dx} = \frac{\partial g}{\partial x} + \left(z^T - \lambda^T K\right)\frac{du}{dx} + \lambda^T\left(\frac{df}{dx} - \frac{dK}{dx}u\right). \tag{7.2.10}$$

If we want to eliminate $du/dx$ from this expression we need to select $\lambda$ so as to
eliminate its coefficient, which gives us Eq. (7.2.7) for $\lambda$. The remaining terms are
the same as Eq. (7.2.8) for the derivative of the constraint.
Example 7.2.1

In this example, we calculate the sensitivity derivative of a constraint on the tip
displacement of a stepped cantilever beam with respect to the moment of inertia $I_1$
and the length $l_1$.

Figure 7.2.1 Beam example for derivatives of static response.

The constraint on the tip displacement is posed as

$$g = c - w_{tip} \ge 0 .$$

The problem is simple and has an analytical solution based on elementary beam
theory, namely

$$w_{tip} = \frac{p}{3EI_1}\left(l_1^3 + 3l_1^2 l_2 + 3l_1 l_2^2\right) + \frac{p\,l_2^3}{3EI_2} ,$$

so that

$$\frac{\partial g}{\partial I_1} = \frac{p}{3EI_1^2}\left(l_1^3 + 3l_1^2 l_2 + 3l_1 l_2^2\right),$$

$$\frac{\partial g}{\partial l_1} = -\frac{p}{3EI_1}\left(3l_1^2 + 6l_1 l_2 + 3l_2^2\right) = -\frac{p}{EI_1}\left(l_1 + l_2\right)^2 .$$
Chapter 7: Sensitivity of Discrete Systems

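The finite element solution is based on a standard cubic beam element, with one
element used for each section. We denote the displacement and rotation at the ith
node by $w_i$ and $\theta_i$, respectively. The element stiffness matrix is

$$K^e = \frac{EI}{l^3}\begin{bmatrix} 12 & 6l & -12 & 6l \\ 6l & 4l^2 & -6l & 2l^2 \\ -12 & -6l & 12 & -6l \\ 6l & 2l^2 & -6l & 4l^2 \end{bmatrix},$$

so that the global stiffness matrix, corresponding to degrees of freedom $w_2$, $\theta_2$, $w_3$,
$\theta_3$, is

$$K = E\begin{bmatrix} 12(I_1/l_1^3 + I_2/l_2^3) & -6(I_1/l_1^2 - I_2/l_2^2) & -12I_2/l_2^3 & 6I_2/l_2^2 \\ & 4(I_1/l_1 + I_2/l_2) & -6I_2/l_2^2 & 2I_2/l_2 \\ & & 12I_2/l_2^3 & -6I_2/l_2^2 \\ \text{sym} & & & 4I_2/l_2 \end{bmatrix}.$$

The load vector is $f = [0, 0, p, 0]^T$, and the solution for the displacement vector
$u = \{w_2, \theta_2, w_3, \theta_3\}^T$ is

$$w_2 = \frac{p}{EI_1}\left(\frac{l_1^3}{3} + \frac{l_1^2 l_2}{2}\right), \quad \theta_2 = \frac{p}{EI_1}\left(\frac{l_1^2}{2} + l_1 l_2\right), \quad w_3 = w_{tip}, \quad \theta_3 = \theta_2 + \frac{p\,l_2^2}{2EI_2} .$$

Then

$$\frac{\partial K}{\partial I_1}u = \frac{E}{l_1^3}\begin{bmatrix} 12 & -6l_1 & 0 & 0 \\ -6l_1 & 4l_1^2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}u = \frac{p}{I_1}\begin{Bmatrix} 1 \\ l_2 \\ 0 \\ 0 \end{Bmatrix},$$

where the solution for $w_2$ and $\theta_2$ was used. Similarly,

$$\frac{\partial K}{\partial l_1}u = \frac{EI_1}{l_1^4}\begin{bmatrix} -36 & 12l_1 & 0 & 0 \\ 12l_1 & -4l_1^2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}u = \frac{p}{l_1}\begin{Bmatrix} -6(1 + l_2/l_1) \\ 2(l_1 + l_2) \\ 0 \\ 0 \end{Bmatrix}.$$

In the direct method we solve Eq. (7.2.5), which here reads $K\,\partial u/\partial I_1 = -(\partial K/\partial I_1)u$
because $f$ does not depend on $I_1$, or

$$\frac{\partial}{\partial I_1}\begin{Bmatrix} w_2 \\ \theta_2 \\ w_3 \\ \theta_3 \end{Bmatrix} = -K^{-1}\begin{Bmatrix} p/I_1 \\ p\,l_2/I_1 \\ 0 \\ 0 \end{Bmatrix} = -\frac{p}{EI_1^2}\begin{Bmatrix} l_1^3/3 + l_1^2 l_2/2 \\ l_1^2/2 + l_1 l_2 \\ l_1 l_2^2 + l_1^2 l_2 + l_1^3/3 \\ l_1 l_2 + l_1^2/2 \end{Bmatrix},$$

so that $\partial g/\partial I_1 = -\partial w_3/\partial I_1$, which agrees with the beam-theory result.
Similarly

$$\frac{\partial u}{\partial l_1} = -K^{-1}\frac{\partial K}{\partial l_1}u = \frac{p\,(l_1 + l_2)}{EI_1}\begin{Bmatrix} l_1 \\ 1 \\ l_1 + l_2 \\ 1 \end{Bmatrix},$$

so that $\partial g/\partial l_1 = -\partial w_3/\partial l_1 = -p\,(l_1 + l_2)^2/(EI_1)$.

In the adjoint method, $z^T = -\partial w_{tip}/\partial u = [0, 0, -1, 0]$, and we can solve for the
adjoint vector. Because $Ku = f = p\,[0, 0, 1, 0]^T$, the solution of Eq. (7.2.7) is

$$\lambda = K^{-1}z = -\frac{1}{p}u ,$$

so that from Eq. (7.2.8)

$$\frac{\partial g}{\partial I_1} = -\lambda^T\frac{\partial K}{\partial I_1}u = \frac{p}{EI_1}\left(\frac{l_1^3}{3I_1} + \frac{l_1^2 l_2}{2I_1} + \frac{l_1^2 l_2}{2I_1} + \frac{l_1 l_2^2}{I_1}\right) = \frac{p}{EI_1^2}\left(l_1 l_2^2 + l_1^2 l_2 + l_1^3/3\right),$$

and

$$\frac{\partial g}{\partial l_1} = -\lambda^T\frac{\partial K}{\partial l_1}u = -\frac{p}{EI_1}\left(l_1 + l_2\right)^2 .$$

•••
The difference between the computational effort associated with the direct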
The difference between the computational effort associated with the direct
method and with the adjoint method depends on the relative number of constraints
and design variables. The direct method requires the solution of Eq. (7.2.5) once for
each design variable, while the adjoint method requires the solution of Eq. (7.2.7)
once for each constraint. Thus the direct method is the more efficient when the
number of design variables is smaller than the number of displacement and stress
constraints that need to be differentiated. The adjoint method is more efficient when
the number of design variables is larger than the number of these constraints.
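
The two procedures can be compared on the stepped beam of Example 7.2.1. The following is a minimal NumPy sketch, not a production implementation; the numerical values of $E$, $I_1$, $I_2$, $l_1$, $l_2$ and $p$ and all function and variable names are assumptions chosen for illustration. It performs one solve per design variable for the direct method and one solve per constraint for the adjoint method, and compares both with the beam-theory derivatives.

```python
import numpy as np

# Illustrative data (assumed values, not taken from the text)
E, I1, I2, l1, l2, p = 1.0, 2.0, 1.0, 1.0, 1.5, 1.0

def stiffness(I1, l1):
    """Global stiffness of Example 7.2.1 for the DOFs (w2, theta2, w3, theta3)."""
    return E * np.array([
        [12*(I1/l1**3 + I2/l2**3), -6*(I1/l1**2 - I2/l2**2), -12*I2/l2**3,  6*I2/l2**2],
        [-6*(I1/l1**2 - I2/l2**2),  4*(I1/l1 + I2/l2),        -6*I2/l2**2,  2*I2/l2],
        [-12*I2/l2**3,             -6*I2/l2**2,               12*I2/l2**3, -6*I2/l2**2],
        [6*I2/l2**2,                2*I2/l2,                  -6*I2/l2**2,  4*I2/l2],
    ])

K = stiffness(I1, l1)
f = np.array([0.0, 0.0, p, 0.0])
u = np.linalg.solve(K, f)
z = np.array([0.0, 0.0, -1.0, 0.0])          # z = dg/du for g = c - w_tip

# Analytical derivatives of K; only the element-1 block depends on I1 and l1
dK_dI1 = np.zeros((4, 4))
dK_dI1[:2, :2] = E / l1**3 * np.array([[12.0, -6*l1], [-6*l1, 4*l1**2]])
dK_dl1 = np.zeros((4, 4))
dK_dl1[:2, :2] = E * I1 / l1**4 * np.array([[-36.0, 12*l1], [12*l1, -4*l1**2]])
dKs = [dK_dI1, dK_dl1]

# Direct method: one solve of Eq. (7.2.5) per design variable (df/dx = 0 here)
direct = np.array([z @ np.linalg.solve(K, -dK @ u) for dK in dKs])

# Adjoint method: one solve of Eq. (7.2.7) per constraint, then Eq. (7.2.8)
lam = np.linalg.solve(K, z)
adjoint = np.array([-(lam @ dK @ u) for dK in dKs])

# Beam-theory derivatives of g with respect to I1 and l1
exact = np.array([
    p / (3*E*I1**2) * (l1**3 + 3*l1**2*l2 + 3*l1*l2**2),
    -p * (l1 + l2)**2 / (E*I1),
])
print(direct, adjoint, exact)
```

With two design variables and one constraint the adjoint route needs a single extra solve, while the direct route needs two. In a realistic code both would reuse the factorization of $K$ rather than call a dense solver repeatedly.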
In practical design situations we usually have to consider several load cases. The
effort associated with the direct method is approximately proportional to the number
of load cases. The number of critical constraints at the optimum design, on the other
hand, is usually less than the number of design variables. Therefore, in a multiple-
load-case situation the adjoint method becomes more attractive.
Both the direct and adjoint methods require the solution of a system of equations
as the major part of the computational effort. However, the factored form of the
matrix K of the equations is usually available from the solution of Eq. (7.2.1) for
the displacements. The solution for $du/dx$ or $\lambda$ is therefore much cheaper than the
original solution of Eq. (7.2.1). This provides the major computational advantage of
these two analytical methods over the finite-difference calculation of the derivatives.
For example, the forward difference approximation to $du/dx$

$$\frac{du}{dx} \approx \frac{u(x + \Delta x) - u(x)}{\Delta x} \tag{7.2.11}$$

requires the evaluation of $u(x + \Delta x)$ by re-assembling the stiffness matrix and load
vector at the perturbed design and solving

$$K(x + \Delta x)\,u(x + \Delta x) = f(x + \Delta x). \tag{7.2.12}$$

The required factorization of $K(x + \Delta x)$ is typically much more expensive than a
solution for another right hand side with the already factored $K(x)$ in Eqs. (7.2.5)
and (7.2.7). The advantage of the analytical methods over the finite-difference
approximation becomes very pronounced for a large number of design variables.

7.2.2 Second Derivatives

In some applications (e.g., calculation of sensitivity of optimum solutions, see Section
5.4) we also need second derivatives of constraint functions with respect to the design
variables. In the following we obtain expressions for evaluating $d^2g/dx\,dy$ where $x$
and $y$ are design variables. For the sake of simplicity we assume that the constraint
function $g$ is not an explicit function of the design variables, so that $\partial g/\partial x$ and
$\partial g/\partial y$ are zero. More general expressions are to be found in [4].
As in the case of first derivatives we have a direct method and an adjoint method
for obtaining second derivatives. The direct method starts by differentiating Eq.
(7.2.3) with respect to $y$

$$\frac{d^2g}{dx\,dy} = z^T\frac{d^2u}{dx\,dy} + \left(\frac{du}{dx}\right)^T R\,\frac{du}{dy} , \tag{7.2.13}$$

where $R$ is the matrix of second derivatives of $g$ with respect to $u$, that is

$$r_{ij} = \frac{\partial^2 g}{\partial u_i\,\partial u_j} . \tag{7.2.14}$$

We obtain the second derivative of the displacement field by differentiating Eq. (7.2.5)

$$K\frac{d^2u}{dx\,dy} = \frac{d^2f}{dx\,dy} - \frac{d^2K}{dx\,dy}u - \frac{dK}{dx}\frac{du}{dy} - \frac{dK}{dy}\frac{du}{dx} . \tag{7.2.15}$$

Solving Eq. (7.2.5) for $du/dx$, a similar equation for $du/dy$, and Eq. (7.2.15) for
$d^2u/dx\,dy$ we finally substitute into Eq. (7.2.13).
The adjoint method starts by differentiating Eq. (7.2.8) with respect to $y$

$$\frac{d^2g}{dx\,dy} = \left(\frac{d\lambda}{dy}\right)^T\left(\frac{df}{dx} - \frac{dK}{dx}u\right) + \lambda^T\left(\frac{d^2f}{dx\,dy} - \frac{d^2K}{dx\,dy}u - \frac{dK}{dx}\frac{du}{dy}\right). \tag{7.2.16}$$

To evaluate the first term we differentiate Eq. (7.2.7) with respect to $y$

$$K\frac{d\lambda}{dy} = R\,\frac{du}{dy} - \frac{dK}{dy}\lambda . \tag{7.2.17}$$

Using Eqs. (7.2.5) and (7.2.17), Eq. (7.2.16) becomes

$$\frac{d^2g}{dx\,dy} = \left(\frac{du}{dy}\right)^T R\,\frac{du}{dx} - \lambda^T\left(\frac{dK}{dy}\frac{du}{dx} + \frac{dK}{dx}\frac{du}{dy} - \frac{d^2f}{dx\,dy} + \frac{d^2K}{dx\,dy}u\right). \tag{7.2.18}$$

In this case the adjoint method is always more efficient than the direct method.
Assume that we have $n$ design variables and $m$ constraint functions. The direct
method requires as its major computational effort the solution of Eq. (7.2.5) $n$ times,
and the solution of Eq. (7.2.15) $n(n + 1)/2$ times. The adjoint method, on the other
hand, requires the solution of Eq. (7.2.5) $n$ times for the first derivatives, and the
solution of Eq. (7.2.7) $m$ times for the adjoint vectors.
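
Equation (7.2.18) can be checked numerically on a small problem. The sketch below is illustrative only: the two-degree-of-freedom spring system $K(x, y)$, the load vector, and the constraint function $g = \frac{1}{2}u^Tu$ (for which $z = u$ and $R$ is the unit matrix) are assumptions made for the example, not part of the text. Both $d^2f/dx\,dy$ and $d^2K/dx\,dy$ vanish for this choice.

```python
import numpy as np

def K_of(x, y):
    # Hypothetical chain of three springs with stiffnesses x, y and 1
    return np.array([[x + y, -y], [-y, y + 1.0]])

f = np.array([0.0, 1.0])                       # load independent of x and y
dK_dx = np.array([[1.0, 0.0], [0.0, 0.0]])
dK_dy = np.array([[1.0, -1.0], [-1.0, 1.0]])

def g_of(x, y):
    u = np.linalg.solve(K_of(x, y), f)
    return 0.5 * u @ u                         # assumed constraint function

def d2g_adjoint(x, y):
    K = K_of(x, y)
    u = np.linalg.solve(K, f)
    du_dx = np.linalg.solve(K, -dK_dx @ u)     # Eq. (7.2.5) with df/dx = 0
    du_dy = np.linalg.solve(K, -dK_dy @ u)
    z = u                                      # dg/du for g = u.u/2
    R = np.eye(2)                              # second derivatives of g w.r.t. u
    lam = np.linalg.solve(K, z)                # Eq. (7.2.7)
    # Eq. (7.2.18); the d2f/dxdy and d2K/dxdy terms are zero here
    return du_dy @ R @ du_dx - lam @ (dK_dy @ du_dx + dK_dx @ du_dy)

x0, y0, h = 2.0, 4.0, 1e-3
d2g = d2g_adjoint(x0, y0)
# Central-difference estimate of the mixed derivative, for comparison only
d2g_fd = (g_of(x0 + h, y0 + h) - g_of(x0 + h, y0 - h)
          - g_of(x0 - h, y0 + h) + g_of(x0 - h, y0 - h)) / (4 * h**2)
print(d2g, d2g_fd)
```

The adjoint evaluation uses only the first derivatives $du/dx$ and $du/dy$ and one adjoint vector; no solution of Eq. (7.2.15) is needed.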

7.2.3 The Semi-Analytical Method

Both the direct and adjoint methods require the derivatives of the stiffness matrix
and load vectors with respect to design variables. These derivatives are often difficult
to calculate analytically, especially for shape design variables which change element
geometry. For this reason a semi-analytical approach, where the derivatives of the
stiffness matrix and load vector are approximated by finite differences, is popular.
Typically, these derivatives are calculated by the first-order forward difference
approximation, so that $dK/dx$ is approximated as

$$\frac{dK}{dx} \approx \frac{K(x + \Delta x) - K(x)}{\Delta x} . \tag{7.2.19}$$

However, while the semi-analytical method is as efficient as the analytical direct
or adjoint methods, it is based on finite-difference approximations, and may have
accuracy problems. Such accuracy problems can be particularly serious for derivatives
of the response of beam and plate structures with respect to geometrical parameters.
The accuracy problem was observed first in Ref. [5] for the car model shown in
Fig. (7.2.2) made of beam elements. The semi-analytical method was used successfully
for all section size and most geometrical design variables. However, for some of
the derivatives with respect to the overall length dimensions of the car, there were
serious accuracy problems.


Figure 7.2.2 Stick model of a Car.

[Plot: error (logarithmic scale) versus relative step size.]

Figure 7.2.3 Errors in the derivative of the strain energy with respect to a length
variable of the stick model for overall-finite-differences (OFD) and semi-analytical
(SA) methods.

Figure (7.2.3) shows the dependence of the relative error of the derivative of the
strain energy of the model with respect to one length variable in the semi-analytical
(SA) method and the overall finite difference (OFD) approach. For large step sizes,
the OFD method has smaller error (mostly truncation error) than the SA method.
The step-size range for which the approximate derivative has an error less than 1%
is much larger for the OFD than for the SA approximation. For small step sizes the
OFD method has a larger error (mostly condition error) than the SA method. Figure
(7.2.3) shows that, for a relative step size of $10^{-7}$, the SA method approximates
the derivative well. For some variables, however, there was no step size giving accurate
derivatives! To solve the accuracy problem the central difference approximation to
the derivative of the stiffness matrix had to be used, which substantially increased
the computational cost.
[Plot: error (logarithmic scale) versus relative step size.]

Figure 7.2.4 Forward- and central-difference SA approximation of the derivative of
the strain energy with respect to a second length variable of car stick model.

Figure (7.2.4) compares the forward- and central-difference approximations of
the derivative with respect to a second length variable. We can clarify the cause of
the high truncation errors associated with the semi-analytical method by considering
Eq. (7.2.5) carefully. The right hand side of the equation, sometimes referred to
as the pseudo load, is the 'load' that has to be applied to the structure to produce
a displacement field $du/dx$. For beam and plate structures the derivative of the
displacement field with respect to geometrical variables is usually not a legitimate
displacement field (for example, it may grossly violate the Kirchhoff assumption).
The finite element approximation to this illegitimate field is a valid, though highly
unusual, displacement field, which requires large self-cancelling components in the
pseudo load. As the finite-element mesh is refined, the pseudo load required to
generate $du/dx$ acquires ever larger self-cancelling components. Thus the errors in
the pseudo load due to the finite difference derivative of the stiffness matrix can be
greatly magnified.


[Plot: error versus number of elements.]

Figure 7.2.5 Errors in the semi-analytical (SA) and overall-finite-difference (OFD)
approximations to the derivative of tip displacement with respect to cantilever beam
length (one percent step size).

This phenomenon is demonstrated in Fig. (7.2.5) which shows that the error in
the derivative of the tip displacement of a cantilever beam with respect to the length
of the beam greatly increases as the finite-element mesh is refined.
When a beam or a plate structure is modeled by more general elements, such as
three dimensional elements, mesh refinement is no problem. However, as the beam
becomes more slender or the plate thinner, the displacement-derivative field becomes
more and more incompatible with the geometry, and the same accuracy problems
ensue. Reference [6] reports very large errors for beams modeled by truss, plane-
stress and solid elements for slenderness ratios larger than ten.
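
The growth of the semi-analytical error with mesh refinement can be reproduced with a short experiment. The sketch below is an illustration under stated assumptions, not a reconstruction of the computation behind Fig. (7.2.5): it models a uniform cantilever with $EI = 1$, unit length and a unit tip load using cubic beam elements, takes the total length $L$ as the design variable with a one percent step, and compares the forward-difference SA derivative of the tip displacement with the overall finite difference (OFD) result. The function names are hypothetical.

```python
import numpy as np

def cantilever_K(L, n, EI=1.0):
    """Stiffness of a uniform cantilever of length L with n cubic beam elements."""
    l = L / n
    ke = EI / l**3 * np.array([
        [ 12.0,   6*l, -12.0,   6*l],
        [  6*l, 4*l*l,  -6*l, 2*l*l],
        [-12.0,  -6*l,  12.0,  -6*l],
        [  6*l, 2*l*l,  -6*l, 4*l*l]])
    K = np.zeros((2*(n + 1), 2*(n + 1)))
    for e in range(n):
        K[2*e:2*e + 4, 2*e:2*e + 4] += ke
    return K[2:, 2:]                       # clamp the first node

def relative_errors(n, L=1.0, rel_step=0.01, p=1.0):
    """Relative errors of the SA and OFD derivatives of the tip displacement."""
    dL = rel_step * L
    f = np.zeros(2*n)
    f[-2] = p                              # transverse load at the tip
    K0, K1 = cantilever_K(L, n), cantilever_K(L + dL, n)
    u0 = np.linalg.solve(K0, f)
    u1 = np.linalg.solve(K1, f)
    ofd = (u1[-2] - u0[-2]) / dL                             # Eq. (7.2.11)
    sa = np.linalg.solve(K0, -((K1 - K0) / dL) @ u0)[-2]     # Eqs. (7.2.5), (7.2.19)
    exact = p * L**2                       # d(p L^3 / 3EI)/dL with EI = 1
    return abs(sa - exact) / exact, abs(ofd - exact) / exact

ns = [1, 2, 5, 10, 20]
sa_err, ofd_err = zip(*(relative_errors(n) for n in ns))
for n, es, eo in zip(ns, sa_err, ofd_err):
    print(f"n = {n:2d}   SA error = {es:8.4f}   OFD error = {eo:8.4f}")
```

In this model the OFD error stays near one percent for every mesh, because the nodal displacements of the cubic element are exact for a tip load, while the SA error increases steadily with the number of elements.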

Example 7.2.2

We repeat the calculation of derivatives in Example 7.2.1 to compare the errors
associated with the finite-difference and semi-analytical methods. Using forward
differences we find

$$\frac{\partial g}{\partial I_1} \approx -\frac{w_{tip}(I_1 + \Delta I_1) - w_{tip}(I_1)}{\Delta I_1} ,$$

the truncation error, $e_T$, given by Eq. (7.1.4) is approximately

$$e_T \approx \frac{\Delta I_1}{2}\frac{\partial^2 g}{\partial I_1^2} = -\frac{\Delta I_1}{I_1}\frac{\partial g}{\partial I_1} ,$$

and the relative truncation error is

$$\frac{e_T}{\partial g/\partial I_1} \approx -\frac{\Delta I_1}{I_1} .$$

Therefore, it is enough to take $\Delta I_1/I_1 = 10^{-3}$ to get a negligible truncation error.
Similarly, the truncation error for the derivative with respect to $l_1$ is approximately

$$e_T \approx \frac{\Delta l_1}{l_1 + l_2}\frac{\partial g}{\partial l_1} ,$$

and it is enough to take a perturbation in $l_1$ to be $0.001\,l_1$. The error analysis for
the semi-analytical method is more complicated. The derivative with respect to the
moment of inertia is approximated as

$$\frac{\partial g}{\partial I_1} \approx -\lambda^T\frac{K(I_1 + \Delta I_1) - K(I_1)}{\Delta I_1}u ,$$

and the truncation error vanishes because $K$ is a linear function of $I_1$. The situation
is not as good for the truncation error of $\partial g/\partial l_1$, which is approximately

$$e_T = -\frac{\Delta l_1}{2}\lambda^T\frac{\partial^2 K}{\partial l_1^2}u = \frac{p\,\Delta l_1}{EI_1 l_1}\left(3l_1^2 + 7l_1 l_2 + 4l_2^2\right),$$

so that the magnitude of the relative error is

$$\frac{3l_1^2 + 7l_1 l_2 + 4l_2^2}{(l_1 + l_2)^2}\,\frac{\Delta l_1}{l_1} .$$

Comparing the semi-analytical error to the one obtained by the finite difference
approach, we note that it is seven times larger when $l_1 = l_2$. As shown in Ref. [7], this
larger error for the semi-analytical method increases as the mesh is refined. •••
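
The factor of seven can be confirmed numerically. The following sketch assumes unit values for $E$, $I_1$, $I_2$, $p$ and $l_1 = l_2 = 1$ (illustrative choices, not from the text) and a relative step of $10^{-3}$ in $l_1$; it evaluates the overall forward difference and the semi-analytical approximation of $\partial g/\partial l_1$ with the two-element model of Example 7.2.1.

```python
import numpy as np

E, I1, I2, l1, l2, p = 1.0, 1.0, 1.0, 1.0, 1.0, 1.0    # assumed data, l1 = l2
f = np.array([0.0, 0.0, p, 0.0])
z = np.array([0.0, 0.0, -1.0, 0.0])                    # z = dg/du for g = c - w_tip

def stiffness(l1):
    """Global stiffness of Example 7.2.1 as a function of the length l1."""
    return E * np.array([
        [12*(I1/l1**3 + I2/l2**3), -6*(I1/l1**2 - I2/l2**2), -12*I2/l2**3,  6*I2/l2**2],
        [-6*(I1/l1**2 - I2/l2**2),  4*(I1/l1 + I2/l2),        -6*I2/l2**2,  2*I2/l2],
        [-12*I2/l2**3,             -6*I2/l2**2,               12*I2/l2**3, -6*I2/l2**2],
        [6*I2/l2**2,                2*I2/l2,                  -6*I2/l2**2,  4*I2/l2],
    ])

def w_tip(l1):
    return np.linalg.solve(stiffness(l1), f)[2]

dl = 1e-3 * l1
exact = -p * (l1 + l2)**2 / (E * I1)               # beam-theory dg/dl1

# Overall forward difference of g = c - w_tip
fd = -(w_tip(l1 + dl) - w_tip(l1)) / dl

# Semi-analytical: adjoint formula with a forward-difference dK/dl1
K = stiffness(l1)
u = np.linalg.solve(K, f)
lam = np.linalg.solve(K, z)
sa = -lam @ ((stiffness(l1 + dl) - K) / dl) @ u

err_fd = abs((fd - exact) / exact)
err_sa = abs((sa - exact) / exact)
ratio = err_sa / err_fd
print(err_fd, err_sa, ratio)
```

The two relative errors come out close to $0.5\,\Delta l_1/l_1$ and $3.5\,\Delta l_1/l_1$, so their ratio is close to seven, with a small deviation due to higher-order terms.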

7.2.4 Nonlinear Analysis

For nonlinear analysis, the equations of equilibrium may be written as

$$f(u, x) = \mu\,p(x) , \tag{7.2.20}$$

where $f$ is the internal force generated by the deformation of the structure, and $\mu p$
is the external applied load. The load scaling factor $\mu$ is used in nonlinear analysis
procedures for tracking the evolution of the solution as the load is increased. This is
useful because the equations of equilibrium may have several solutions for the same
applied loads. By increasing $\mu$ gradually we make sure that we obtain the solution
that corresponds to the structure being loaded from zero.
Differentiating Eq. (7.2.20) with respect to the design variable $x$ we obtain

$$J\frac{du}{dx} = \mu\frac{dp}{dx} - \frac{\partial f}{\partial x} , \tag{7.2.21}$$

where $J$ is the Jacobian of $f$ at $u$,

$$J_{kl} = \frac{\partial f_k}{\partial u_l} , \tag{7.2.22}$$

often called the tangential stiffness matrix.
The direct method for obtaining $dg/dx$ is to solve Eq. (7.2.21) for $du/dx$ and
substitute into Eq. (7.2.3). The matrix $J$ is often available from the solution of the
equations of equilibrium when these are solved by using Newton's method. Newton's
method is based on a linear approximation of the equations of equilibrium about a
trial solution $\bar{u}$

$$f(\bar{u}, x) + J(\bar{u}, x)(u - \bar{u}) \approx \mu\,p(x) . \tag{7.2.23}$$

Equation (7.2.23), solved for $u$, typically provides a better approximation to the
solution than $\bar{u}$. This new approximation replaces $\bar{u}$ in Eq. (7.2.23) for the next
iteration, either with an updated value of $J$ (Newton's method) or with the old value
(modified Newton's method). The iteration continues until convergence to a desired
accuracy is achieved. If the last iterate $\bar{u}$, for which $J$ was calculated, is close enough
to $u$, then that $J$ can be used for calculating the derivative of $u$.
The adjoint approach is very similar to that used in the linear case. The adjoint
vector $\lambda$ is the solution of the equation

$$J^T\lambda = z , \tag{7.2.24}$$

where again $z$ is the vector of derivatives of the constraint with respect to the
displacement components, $z_i = \partial g/\partial u_i$. It is easy to check that we obtain

$$\frac{dg}{dx} = \frac{\partial g}{\partial x} + \lambda^T\left(\mu\frac{dp}{dx} - \frac{\partial f}{\partial x}\right). \tag{7.2.25}$$
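
The nonlinear direct and adjoint calculations can be sketched as follows. The model is hypothetical and chosen only for illustration: two hardening springs in series with force-elongation law $e + e^3$, the first scaled by the design variable $x$, a load applied to the second degree of freedom, and the response function $g = u_2$. Equilibrium is found by Newton's method, and the tangent matrix from the converged state is reused for the sensitivities.

```python
import numpy as np

mu = 1.0
p = np.array([0.0, 1.0])            # load pattern, independent of x

def internal_force(u, x):
    e = u[1] - u[0]                 # elongation of the second spring
    s1 = x * (u[0] + u[0]**3)       # first spring, stiffness scaled by x
    s2 = e + e**3
    return np.array([s1 - s2, s2])

def tangent(u, x):
    e = u[1] - u[0]
    k1 = x * (1.0 + 3.0*u[0]**2)
    k2 = 1.0 + 3.0*e**2
    return np.array([[k1 + k2, -k2], [-k2, k2]])

def df_dx(u, x):
    return np.array([u[0] + u[0]**3, 0.0])

def solve_equilibrium(x, tol=1e-12, max_iter=50):
    """Newton iteration on f(u, x) - mu*p = 0, Eq. (7.2.23), starting from u = 0."""
    u = np.zeros(2)
    for _ in range(max_iter):
        r = internal_force(u, x) - mu * p
        if np.linalg.norm(r) < tol:
            break
        u = u - np.linalg.solve(tangent(u, x), r)
    return u

x = 2.0
u = solve_equilibrium(x)
J = tangent(u, x)
z = np.array([0.0, 1.0])            # z = dg/du for the assumed g = u2
rhs = -df_dx(u, x)                  # mu*dp/dx - df/dx with dp/dx = 0

dg_direct = z @ np.linalg.solve(J, rhs)        # Eq. (7.2.21), then Eq. (7.2.3)
lam = np.linalg.solve(J.T, z)                  # Eq. (7.2.24)
dg_adjoint = lam @ rhs                         # Eq. (7.2.25) with dg/dx explicit = 0

h = 1e-4                                       # central difference, for comparison
dg_fd = (solve_equilibrium(x + h)[1] - solve_equilibrium(x - h)[1]) / (2*h)
print(dg_direct, dg_adjoint, dg_fd)
```

Each sensitivity costs one linear solve with the converged tangent matrix, whereas the finite-difference check requires two additional nonlinear analyses.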

7.2.5 Sensitivity of Limit Loads

At a critical point with the load value denoted as $\mu^*$, the tangential stiffness matrix $J$
becomes singular, and we can have either a bifurcation point or a limit load. We can
distinguish between the two by differentiating Eq. (7.2.20) with respect to a loading
parameter that increases monotonically throughout the loading history. The load
parameter $\mu$ is not a good choice, because at a limit point it reaches a maximum and
is not monotonic. Instead we often use a displacement component, known to increase
monotonically, or the arc length in the $(u, \mu)$ space. We denote such a monotonic load
parameter by $a$, and denote a derivative with respect to $a$ by a prime. Differentiating
Eq. (7.2.20) with respect to $a$ we get

$$Ju' = \mu' p . \tag{7.2.26}$$

At a critical point, $J$ is singular, and we denote the left eigenvector associated with
the zero eigenvalue of $J$ by $v$, that is

$$v^T J^* = 0 , \tag{7.2.27}$$

where the asterisk denotes quantities evaluated at the critical point. Premultiplying
Eq. (7.2.26) by $v^T$, we get

$$\mu' v^T p = 0 . \tag{7.2.28}$$

At a limit point this equation is satisfied because the load reaches a maximum, and
then $\mu' = 0$. In that case, Eq. (7.2.26) indicates that the buckling mode, which is
the right eigenvector of the tangential stiffness matrix $J$, is equal to the derivative of
$u$ with respect to the loading parameter. At a bifurcation point $\mu' \ne 0$, and instead

$$v^T p = 0 . \tag{7.2.29}$$

For a symmetric tangential stiffness matrix $v$ is also the buckling mode, and Eq.
(7.2.29) indicates that the buckling mode is orthogonal to the load vector.
To calculate sensitivity of limit loads we need to consider a more general response
path parameter $\nu$ which can be a load parameter, a design variable, or a combination
of both, that is, a parameter that controls both structural design and loading
simultaneously. We denote differentiation with respect to $\nu$ by a dot and differentiate
Eq. (7.2.20) with respect to $\nu$ to get

$$J\dot{u} + \frac{\partial f}{\partial x}\dot{x} = \dot{\mu}\,p + \mu\frac{dp}{dx}\dot{x} . \tag{7.2.30}$$

We now want a parameter $\nu$ that controls the design variable $x$ and the load parameter
$\mu$ so that we remain at a limit load, $\mu = \mu^*$. We select $\nu = x$, and then Eq. (7.2.30)
becomes

$$J^*\dot{u} + \left(\frac{\partial f}{\partial x}\right)^* = \frac{d\mu^*}{dx}p + \mu^*\frac{dp}{dx} , \tag{7.2.31}$$

where we used the fact that for our choice of parameter $\dot{x} = 1$. Premultiplying Eq.
(7.2.31) by the left eigenvector, $v^T$, and rearranging we get

$$\frac{d\mu^*}{dx} = \frac{v^T\left[\left(\dfrac{\partial f}{\partial x}\right)^* - \mu^*\dfrac{dp}{dx}\right]}{v^T p} . \tag{7.2.32}$$

The quantity in brackets in the numerator of Eq. (7.2.32) is the derivative of the
residual of the equations of equilibrium at the limit point. Thus we can use the
semi-analytical method to evaluate the limit load sensitivity as follows: We perturb
the design variable, calculate the change in the residual (for fixed displacements) and
take the dot product with the buckling mode to get the numerator. The denominator
is the dot product of the buckling mode with the load vector.
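
The semi-analytical recipe above can be illustrated on a small model. The sketch is hypothetical: a softening spring with force $x(u_1 - u_1^2)$ in series with a linear spring of stiffness $k$, loaded at the second degree of freedom. For this model the limit point is known in closed form ($u_1 = 1/2$, $\mu^* = x/4$), which the code uses directly instead of tracing the load path; in a real analysis the critical state would come from the nonlinear solution procedure.

```python
import numpy as np

k = 3.0
p = np.array([0.0, 1.0])            # load pattern, independent of x

def internal_force(u, x):
    return np.array([x*(u[0] - u[0]**2) + k*(u[0] - u[1]),
                     k*(u[1] - u[0])])

def tangent(u, x):
    return np.array([[x*(1.0 - 2.0*u[0]) + k, -k],
                     [-k, k]])

def limit_point(x):
    """Closed-form limit point of this model: u1 = 1/2, mu* = x/4."""
    mu_star = x / 4.0
    u_star = np.array([0.5, 0.5 + mu_star / k])
    return u_star, mu_star

x = 2.0
u_star, mu_star = limit_point(x)

# Buckling mode: eigenvector of the (symmetric) tangent matrix for its zero eigenvalue
evals, evecs = np.linalg.eigh(tangent(u_star, x))
v = evecs[:, np.argmin(np.abs(evals))]

# Perturb the design variable at fixed displacements and fixed load factor;
# the residual f - mu*p changes only through f because p does not depend on x
dx = 1e-6
dres = (internal_force(u_star, x + dx) - internal_force(u_star, x)) / dx

dmu_dx = (v @ dres) / (v @ p)       # Eq. (7.2.32)
print(dmu_dx)                       # compare with d(x/4)/dx
```

The sign and scaling of the eigenvector returned by the eigensolver do not matter, because $v$ appears in both the numerator and the denominator of Eq. (7.2.32).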

7.3 Sensitivity Calculations for Eigenvalue Problems

Eigenvalue problems are commonly encountered in structural stability and vibration
analysis. When forces are conservative, and no damping is considered, these problems
lead to real eigenvalues which represent buckling loads or vibration frequencies. In
the more general case the eigenvalues are complex. Our discussion starts with the
simpler case of real eigenvalues.

7.3.1 Sensitivity Derivatives of Vibration and Buckling Constraints

Undamped vibration and linear buckling analysis lead to eigenvalue problems of the
type

$$Ku - \mu Mu = 0 , \tag{7.3.1}$$

where $K$ is the stiffness matrix, $M$ is the mass matrix (vibration) or the geometric
stiffness matrix (buckling) and $u$ is the mode shape. For vibration problems $\mu$ is the
square of the frequency of free vibration, and for buckling problems it is the buckling
load factor. Both $K$ and $M$ are symmetric, and $K$ is positive semidefinite. The mode
shape is often normalized with a symmetric positive definite matrix $W$ such that

$$u^T W u = 1 , \tag{7.3.2}$$

where, for vibration problems, $W$ is usually the mass matrix $M$. Equations (7.3.1)
and (7.3.2) hold for all eigenpairs $(\mu_k, u^k)$. Differentiating these equations with
respect to a design variable $x$ we obtain

$$(K - \mu M)\frac{du}{dx} - \frac{d\mu}{dx}Mu = -\left(\frac{dK}{dx} - \mu\frac{dM}{dx}\right)u , \tag{7.3.3}$$

and

$$u^T W\frac{du}{dx} = -\frac{1}{2}u^T\frac{dW}{dx}u , \tag{7.3.4}$$

where we have used the symmetry of $W$. Equations (7.3.3) and (7.3.4) are valid
only for the case of distinct eigenvalues (repeated eigenvalues are, in general, not
differentiable, and only directional derivatives may be obtained, see Haug et al. [8]).
In most applications we are interested only in the derivatives of the eigenvalues.
These derivatives may be obtained by premultiplying Eq. (7.3.3) by $u^T$ to obtain

$$\frac{d\mu}{dx} = \frac{u^T\left(\dfrac{dK}{dx} - \mu\dfrac{dM}{dx}\right)u}{u^T M u} . \tag{7.3.5}$$

In some applications the derivatives of the eigenvectors are also required. For
example, in automobile design we often require that critical vibration modes have low
amplitudes at the front seats. For this design problem we need derivatives of the
mode shape. To obtain eigenvector derivatives we can use the direct approach and
combine Eqs. (7.3.3) and (7.3.4) as

$$\begin{bmatrix} K - \mu M & -Mu \\ -u^T W & 0 \end{bmatrix}\begin{Bmatrix} \dfrac{du}{dx} \\[2mm] \dfrac{d\mu}{dx} \end{Bmatrix} = \begin{Bmatrix} -\left(\dfrac{dK}{dx} - \mu\dfrac{dM}{dx}\right)u \\[2mm] \dfrac{1}{2}u^T\dfrac{dW}{dx}u \end{Bmatrix}. \tag{7.3.6}$$

The system (7.3.6) may be solved for the derivatives of the eigenvalue and the
eigenvector. However, care must be taken in the solution process because the principal
minor $K - \mu M$ is singular. Cardani and Mantegazza [9] and Murthy and Haftka [10]
discuss several solution strategies which address this problem.
One of the more popular solution techniques is due to Nelson [11]. Nelson's
method temporarily replaces the normalization condition, Eq. (7.3.2), by the
requirement that the largest component of the eigenvector be equal to one. Denoting
this re-normalized vector $\bar{u}$, and assuming that its largest component is the $m$th one,
we replace Eq. (7.3.2) by

$$\bar{u}_m = 1 , \tag{7.3.7}$$

and Eq. (7.3.4) by

$$\frac{d\bar{u}_m}{dx} = 0 . \tag{7.3.8}$$

Equation (7.3.3) is valid with $u$ replaced by $\bar{u}$, but Eq. (7.3.8) is used to reduce its
order by deleting the $m$th row and the $m$th column. When the eigenvalue $\mu$ is distinct,
the reduced system is not singular, and may be solved by standard techniques.
To retrieve the derivative of the eigenvector with the original normalization of
Eq. (7.3.2) we note that $u = u_m\bar{u}$, so that

$$\frac{du}{dx} = \frac{du_m}{dx}\bar{u} + u_m\frac{d\bar{u}}{dx} , \tag{7.3.9}$$

and $du_m/dx$ may be obtained by substituting Eq. (7.3.9) into Eq. (7.3.4) to obtain

$$\frac{du_m}{dx} = -u_m^2\,u^T W\frac{d\bar{u}}{dx} - \frac{u_m}{2}u^T\frac{dW}{dx}u . \tag{7.3.10}$$

We can also use an adjoint or modal technique for calculating the derivatives of
the eigenvector by expanding that derivative as a linear combination of eigenvectors.
That is, denoting the $i$th eigenpair of Eq. (7.3.1) by $(\mu_i, u^i)$ we assume

$$\frac{du^k}{dx} = \sum_{j=1}^{l} c_{kj}u^j , \tag{7.3.11}$$

and the coefficients $c_{kj}$ can be shown to be (see, for example, Rogers [12])

$$c_{kj} = \frac{(u^j)^T\left(\dfrac{dK}{dx} - \mu_k\dfrac{dM}{dx}\right)u^k}{(\mu_k - \mu_j)\,(u^j)^T M u^j} , \qquad k \ne j . \tag{7.3.12}$$

Using the normalization condition of Eq. (7.3.7) we find

$$c_{kk} = -\sum_{j \ne k} c_{kj}u^j_m . \tag{7.3.13}$$

On the other hand, if we use the normalization condition of Eq. (7.3.2) with $W = M$,
we get

$$c_{kk} = -\frac{1}{2}(u^k)^T\frac{dM}{dx}u^k . \tag{7.3.14}$$

If all the eigenvectors are included in the sum, Eq. (7.3.11) is exact. For most
problems it is not practical to calculate all the eigenvectors, so that only a few of the
eigenvectors associated with the lowest eigenvalues are included. Wang [13] developed
a modified modal method that accelerates the convergence. Instead of Eq. (7.3.11)
we use

$$\frac{du^k}{dx} = u^k_s + \sum_{j=1}^{l} d_{kj}u^j , \tag{7.3.15}$$

where
(7.3.16)

is a static correction term, and

$k \ne j$. (7.3.17)

The coefficient $d_{kk}$ is still given by Eq. (7.3.14) for the normalization condition of
$u^T M u = 1$. For the normalization condition of (7.3.7)

$$d_{kk} = -u^k_{sm} - \sum_{j \ne k} d_{kj}u^j_m . \tag{7.3.18}$$

Sutter et al. [14] present a study of the convergence of the derivative with increasing
number of modes using both the modal method and the modified modal method and
demonstrate the improved convergence of the modified modal method.

Example 7.3.1

The spring-mass-dashpot system shown in Fig. (7.3.1) is analysed here for the case
that the dashpot is inactivated, that is $c = 0$. Initially the two masses and the
three springs have values of 1, and we want to calculate the derivatives of the lowest
vibration frequency and the lowest vibration mode with respect to $k$ for two possible
normalization conditions: one of the form Eq. (7.3.2) with $W = M$, and one of the
form Eq. (7.3.7) with the second component of the mode set to 1.

Figure 7.3.1 Spring-mass-dashpot example for eigenvalue derivatives.

Denoting the motions of the two masses as $u_1$ and $u_2$, we find the elastic energy,
$E$, and the kinetic energy, $T$, to be

$$E = 0.5\left[k u_1^2 + (u_2 - u_1)^2 + u_2^2\right], \qquad T = 0.5\left(\dot{u}_1^2 + \dot{u}_2^2\right).$$

This gives us the stiffness and mass matrices as

$$K = \begin{bmatrix} 1 + k & -1 \\ -1 & 2 \end{bmatrix}, \qquad M = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

For $k = 1$, the eigenvalue problem, Eq. (7.3.1) becomes

$$\begin{bmatrix} 2 - \omega^2 & -1 \\ -1 & 2 - \omega^2 \end{bmatrix}\begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix} = 0 . \tag{a}$$

Setting the determinant of the system to zero we get the two frequencies, $\omega_1 = 1$,
and $\omega_2 = \sqrt{3}$. Substituting back the lowest frequency into Eq. (a) we get for the first
vibration mode

$$u_1 - u_2 = 0 , \qquad -u_1 + u_2 = 0 .$$

As expected, the system is singular at a natural frequency, so that we need the
normalization condition to determine the eigenvector. For the normalization condition
(7.3.2) the additional equation is

$$u^T M u = u_1^2 + u_2^2 = 1 .$$

For the normalization condition Eq. (7.3.7), the condition is

$$\bar{u}_2 = 1 ,$$

where we use the bar to denote the vibration mode with the second normalization
condition. The solutions with the normalization conditions are

$$u = \frac{\sqrt{2}}{2}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix}, \qquad \bar{u} = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}.$$

Next we calculate the derivative of the lowest frequency from Eq. (7.3.5) using primes
to denote derivatives with respect to $k$. For our example

$$K' = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad M' = 0 .$$

We use the mode normalized by the mass matrix in Eq. (7.3.5), so that the
denominator is equal to 1, and then

$$\mu' = u^T K' u = 0.5 .$$

We can also get the derivative of the frequency and the mode together by using Eq.
(7.3.6). We note that

$$K - \mu M = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}, \qquad Mu = Wu = u = \frac{\sqrt{2}}{2}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix},$$

$$-(K' - \mu M')u = -K'u = \frac{\sqrt{2}}{2}\begin{Bmatrix} -1 \\ 0 \end{Bmatrix}, \qquad \frac{1}{2}u^T W' u = 0 .$$

Equation (7.3.6) is then

$$\begin{bmatrix} 1 & -1 & -\sqrt{2}/2 \\ -1 & 1 & -\sqrt{2}/2 \\ -\sqrt{2}/2 & -\sqrt{2}/2 & 0 \end{bmatrix}\begin{Bmatrix} u_1' \\ u_2' \\ \mu' \end{Bmatrix} = \begin{Bmatrix} -\sqrt{2}/2 \\ 0 \\ 0 \end{Bmatrix}.$$

We solve this equation to get

$$u_1' = -\sqrt{2}/8 , \qquad u_2' = \sqrt{2}/8 , \qquad \mu' = 1/2 .$$

In order to solve for $\bar{u}'$ from Eq. (7.3.3), with the additional condition $\bar{u}_2' = 0$, we
need to evaluate the expressions:

$$\mu' M\bar{u} = 0.5\,\bar{u} = \begin{Bmatrix} 0.5 \\ 0.5 \end{Bmatrix}, \qquad -(K' - \mu M')\bar{u} = -K'\bar{u} = \begin{Bmatrix} -1 \\ 0 \end{Bmatrix}.$$

Then Eq. (7.3.3), with $\bar{u}$ replacing $u$, and the additional condition yield

$$\bar{u}_1' - \bar{u}_2' = -0.5 , \qquad -\bar{u}_1' + \bar{u}_2' = 0.5 , \qquad \bar{u}_2' = 0 .$$

The solution is

$$\bar{u}_1' = -0.5 , \qquad \bar{u}_2' = 0 .$$

We can show that $u'$ can indeed be retrieved from $\bar{u}'$ by using Eqs. (7.3.9) and
(7.3.10). Equation (7.3.10) becomes

$$u_2' = -u_2^2\,u^T\bar{u}' = -0.5\left(\sqrt{2}/2\right)[1 \;\; 1]\begin{Bmatrix} -0.5 \\ 0 \end{Bmatrix} = \sqrt{2}/8 ,$$

which agrees with our previous result. Equation (7.3.9) becomes

$$u' = u_2'\,\bar{u} + u_2\,\bar{u}' = \frac{\sqrt{2}}{8}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix} + \frac{\sqrt{2}}{2}\begin{Bmatrix} -0.5 \\ 0 \end{Bmatrix} = \frac{\sqrt{2}}{8}\begin{Bmatrix} -1 \\ 1 \end{Bmatrix},$$

which also agrees with our previous result. •••
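
The calculations of Example 7.3.1 can be repeated numerically. The following NumPy sketch is illustrative (the variable names are assumptions); it evaluates Eq. (7.3.5), solves the bordered system of Eq. (7.3.6), and applies Nelson's method with the second component normalized to one, followed by the retrieval step of Eqs. (7.3.9) and (7.3.10). Because $M$ is the unit matrix, a standard symmetric eigensolver is sufficient here.

```python
import numpy as np

K = np.array([[2.0, -1.0], [-1.0, 2.0]])      # stiffness for k = 1
M = np.eye(2)
W = M                                         # normalization matrix of Eq. (7.3.2)
dK = np.array([[1.0, 0.0], [0.0, 0.0]])       # dK/dk
dM = np.zeros((2, 2))
dW = dM

evals, evecs = np.linalg.eigh(K)              # valid because M is the unit matrix
mu, u = evals[0], evecs[:, 0]
u = u if u[1] > 0 else -u                     # fix the sign of the mode

# Eigenvalue derivative, Eq. (7.3.5)
dmu = u @ (dK - mu*dM) @ u / (u @ M @ u)

# Direct approach: bordered system of Eq. (7.3.6)
F = K - mu*M
A = np.block([[F, -(M @ u)[:, None]],
              [-(W @ u)[None, :], np.zeros((1, 1))]])
rhs = np.append(-(dK - mu*dM) @ u, 0.5 * (u @ dW @ u))
sol = np.linalg.solve(A, rhs)
du_direct, dmu_direct = sol[:2], sol[2]

# Nelson's method: normalize component m to one and delete row and column m
m = 1                                         # second component, as in the example
ubar = u / u[m]
rhs_bar = dmu * (M @ ubar) - (dK - mu*dM) @ ubar
keep = [i for i in range(len(u)) if i != m]
dubar = np.zeros_like(u)
dubar[keep] = np.linalg.solve(F[np.ix_(keep, keep)], rhs_bar[keep])

# Retrieve the derivative for the original normalization, Eqs. (7.3.9)-(7.3.10)
dum = -u[m]**2 * (u @ W @ dubar) - 0.5 * u[m] * (u @ dW @ u)
du_nelson = dum * ubar + u[m] * dubar
print(dmu, du_direct, du_nelson)
```

Both routes give the same eigenvector derivative; Nelson's method works with a reduced matrix of the same sparsity as $K - \mu M$, which is the reason for its popularity in large problems.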

When the eigenvalue $\mu$ is repeated with a multiplicity of $m$, there are $m$ linearly
independent eigenvectors associated with it. Furthermore, any linear combination
of these eigenvectors is also an eigenvector, so that the choice of eigenvectors is not
unique. In this case the eigenvectors that are obtained from a structural analysis
program will be determined by the idiosyncrasies of the computational procedure
used for the solution of the eigenproblem. Assuming that $u^1, \ldots, u^m$ is a set of
linearly independent eigenvectors associated with $\mu$, we may write any eigenvector
associated with $\mu$ as

$$u = \sum_{i=1}^{m} q_i u^i = Uq , \tag{7.3.19}$$

where $q$ is a vector of coefficients and $U$ a matrix with columns equal to $u^i$, $i =
1, \ldots, m$. As the design variable $x$ is changed, the eigenvalues usually separate, and
the eigenvectors become unique again. We obtain these eigenvectors by substituting
Eq. (7.3.19) into Eq. (7.3.3) and premultiplying by $U^T$ to obtain

$$\left[A - \frac{d\mu}{dx}B\right]q = 0 , \tag{7.3.20}$$

where

$$A = U^T\left(\frac{dK}{dx} - \mu\frac{dM}{dx}\right)U , \tag{7.3.21}$$

and

$$B = U^T M U . \tag{7.3.22}$$

Equation (7.3.20) is an $m \times m$ eigenvalue problem for $d\mu/dx$. The $m$ solutions
correspond to the derivatives of the $m$ eigenvalues derived from $\mu$ as $x$ is changed, and
the eigenvectors $q$ give us, through Eq. (7.3.19), the eigenvectors associated with the
perturbed eigenvalues. A generalization of Nelson's method to obtain derivatives of
the eigenvectors was suggested by Ojalvo [15] and amended by Mills-Curran [16] and
Dailey [17]. Their procedure seems to contradict the earlier assertion that repeated
eigenvalues are not differentiable. However, while we can find derivatives with respect
to any individual variable, these are only good as directional derivatives, in that
derivatives with respect to $x$ and $y$ cannot be combined in a linear fashion. That is

$$d\mu = \frac{\partial\mu}{\partial x}dx + \frac{\partial\mu}{\partial y}dy \tag{7.3.23}$$

will not hold in general. This is demonstrated in the following example.

Example 7.3.2

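Let us consider a simple, two variable system

$$K = \begin{bmatrix} 2 + y & x \\ x & 2 \end{bmatrix}, \qquad W = M = I .$$

The two eigenvalues are

$$\mu_{1,2} = 2 + \frac{y}{2} \pm \sqrt{x^2 + \frac{y^2}{4}} . \tag{a}$$

The two eigenvalues are identical for $x = y = 0$, and we will first demonstrate that
the eigenvectors are discontinuous at the origin. In fact for $x = 0$ the two eigenvectors
are

$$u^1 = \begin{Bmatrix} 1 \\ 0 \end{Bmatrix}, \qquad u^2 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix},$$

and for $y = 0$

$$u^1 = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}, \qquad u^2 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix}.$$

Obviously, we can get either set of eigenvectors as close to the origin as we wish by
approaching it either along the $x$ axis or along the $y$ axis.
Next we calculate the derivatives of the two eigenvalues with respect to $x$ and $y$
at the origin. At $(0, 0)$ any vector is an eigenvector, and we select the two coordinate
unit vectors as a basis, that is

$$U = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

We first calculate derivatives with respect to $x$, and using Eqs. (7.3.21) and (7.3.22)
we get

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.$$

The solution of the eigenvalue problem, Eq. (7.3.20) is

$$\left(\frac{d\mu}{dx}\right)_{1,2} = \pm 1 ,$$

and the corresponding eigenvectors are

$$q^1 = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}, \qquad q^2 = \begin{Bmatrix} -1 \\ 1 \end{Bmatrix},$$

and because $U$ is the unit matrix, from Eq. (7.3.19) $u^i = q^i$. It is easy to check that
these are indeed the eigenvectors along the $x$ axis $(x, 0)$. Similarly, for derivatives
with respect to $y$ we have

$$A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},$$

and the two eigenvalues of Eq. (7.3.20) are

$$\left(\frac{d\mu}{dy}\right)_{1,2} = 1,\; 0 .$$

The corresponding eigenvectors are

$$q^1 = \begin{Bmatrix} 1 \\ 0 \end{Bmatrix}, \qquad q^2 = \begin{Bmatrix} 0 \\ 1 \end{Bmatrix}.$$

To see that the above derivatives cannot be used to calculate the change in $\mu$ due to
a simultaneous change in $x$ and $y$, consider an infinitesimal change $dy = 2dx = 2dt$.
From the solution for the two eigenvalues, Eq. (a), we have

$$d\mu = dt \pm \sqrt{2}\,dt .$$

On the other hand, Eq. (7.3.23) yields four values depending on which of two values
we use for the $x$ and $y$ derivatives. These are $3dt$, $dt$, $dt$, and $-dt$. •••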
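
The example can be reproduced with a few lines of NumPy. The sketch below is illustrative (the function names are assumptions): it solves the reduced eigenproblem of Eq. (7.3.20) for the $x$ direction, the $y$ direction, and the combined direction $dy = 2dx$, and compares the last result with the four linear combinations suggested by Eq. (7.3.23).

```python
import numpy as np

def K_of(x, y):
    return np.array([[2.0 + y, x], [x, 2.0]])

M = np.eye(2)
U = np.eye(2)                                  # basis of the repeated eigenspace at (0, 0)
dK_dx = np.array([[0.0, 1.0], [1.0, 0.0]])
dK_dy = np.array([[1.0, 0.0], [0.0, 0.0]])

def eigenvalue_derivatives(dK, dM=np.zeros((2, 2)), mu=2.0):
    """Solve [A - (dmu/ds) B] q = 0, Eqs. (7.3.20)-(7.3.22), for one direction s."""
    A = U.T @ (dK - mu*dM) @ U
    B = U.T @ M @ U
    return np.sort(np.linalg.eigvals(np.linalg.solve(B, A)).real)

dmu_dx = eigenvalue_derivatives(dK_dx)                 # derivatives along x
dmu_dy = eigenvalue_derivatives(dK_dy)                 # derivatives along y
dmu_dir = eigenvalue_derivatives(dK_dx + 2.0*dK_dy)    # direction dy = 2 dx = 2 dt

# Linear combination of Eq. (7.3.23) for every pairing of the coordinate derivatives
naive = sorted(a + 2.0*b for a in dmu_dx for b in dmu_dy)

# Check against the actual eigenvalues a small step t along the same direction
t = 1e-3
exact = (np.linalg.eigvalsh(K_of(t, 2.0*t)) - 2.0) / t
print(dmu_dx, dmu_dy, dmu_dir, naive, exact)
```

The directional derivatives along $dy = 2dx$ are $1 \pm \sqrt{2}$, and none of the four combined values matches them, which is the failure of Eq. (7.3.23) described in the example.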
The implications of the failure of calculating a derivative in an arbitrary direction
from derivatives in the coordinate directions are quite serious. Most optimization al-
gorithms rely on these calculations to choose move directions or to estimate objective
function and constraints. Therefore, these algorithms could experience serious dif-
ficulties for problems with repeated eigenvalues. On the bright side, computational
experience shows that even minute differences between eigenvalues are often sufficient
to prevent such difficulties. Furthermore, the coalescence of eigenvalues often has an
adverse effect on structural performance. In buckling problems it is associated with
imperfection sensitivity, and for structural control problems coalescence of vibration
frequencies can lead to control difficulties. Therefore, constraints are often used to
separate the eigenvalues in design problems.

7.3.2 Sensitivity Derivatives for Non-Hermitian Eigenvalue Problems

When structural damping is important or when damping is supplied by aerodynamic
forces or active control systems, the damped motion $\hat{u}$ is governed by

$$M\ddot{\hat{u}} + C\dot{\hat{u}} + K\hat{u} = 0 , \tag{7.3.24}$$

where $C$ is the damping matrix, assumed to be symmetric, and a dot denotes
differentiation with respect to time. Setting

$$\hat{u} = u\,e^{\mu t} \tag{7.3.25}$$

we get

$$\left[\mu^2 M + \mu C + K\right]u = 0 . \tag{7.3.26}$$

Note that we have not defined the eigenvalue $\mu$ in the way we did for the undamped
vibration problem. There $\mu$ was the square of the frequency, while here, when $C = 0$,
we get $\mu = i\omega$ where $\omega$ is the vibration frequency. The derivative of the eigenvalue
$\mu$ with respect to a design variable $x$ is obtained by differentiating Eq. (7.3.26) with
respect to $x$ and premultiplying by $u^T$

$$\frac{d\mu}{dx} = -\frac{u^T\left(\mu^2\dfrac{dM}{dx} + \mu\dfrac{dC}{dx} + \dfrac{dK}{dx}\right)u}{u^T\left(2\mu M + C\right)u} . \tag{7.3.27}$$

This equation can be used for estimating the effect of adding a small amount of
damping to an undamped system. For the undamped system $C = 0$, the eigenvalue
is $\mu = i\omega$, and the eigenvector is the vibration mode that we will denote here as $\phi$ to
distinguish it from the damped mode $u$. Then Eq. (7.3.27) becomes

$$\frac{d\mu}{dx} = \frac{\dfrac{i}{2\omega}\phi^T\left(\dfrac{dK}{dx} - \omega^2\dfrac{dM}{dx}\right)\phi - \dfrac{1}{2}\phi^T\dfrac{dC}{dx}\phi}{\phi^T M\phi} . \tag{7.3.28}$$
Example 7.3.3

Use linear extrapolation to estimate the effect of the dashpot in Figure (7.3.1) on the
first vibration mode, and then compare with the exact effect for $c = 0.2$ and $c = 1.0$.
For this example we take $x = c$ and then (using $K$ and $M$ from Example 7.3.1)

$$\frac{dM}{dx} = \frac{dK}{dx} = 0, \qquad \frac{dC}{dx} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.$$

Using the first vibration mode from Example (7.3.1), which is normalized so that the
denominator of Eq. (7.3.28) is 1, $(\phi^1)^T = (\sqrt{2}/2)[1,\ 1]$, we get

$$\frac{d\mu}{dc} = \frac{d\mu}{dx} = -0.5\,\phi^T\frac{dC}{dx}\phi = -0.25.$$

From Example (7.3.1), the frequency of the first natural mode is $\omega_1 = 1$ (which
corresponds to $\mu = i$ in the notation of this section). Then using linear extrapolation
to calculate an approximate eigenvalue $\mu_a$ we get

$$\mu_a = \mu\big|_{c=0} + \frac{d\mu}{dc}\,c = -0.25c + i.$$

Section 7.3: Sensitivity Calculations for Eigenvalue Problems

For the two given values of $c = 0.2$ and $c = 1.0$, the approximate eigenvalues are
$-0.05 + i$ and $-0.25 + i$, respectively. We compare this approximation to the exact
result obtained by solving Eq. (7.3.26); this yields

$$\begin{bmatrix} \mu^2 + c\mu + 2 & -1 \\ -1 & \mu^2 + 2 \end{bmatrix}\begin{Bmatrix} u_1 \\ u_2 \end{Bmatrix} = 0. \tag{a}$$

The eigenvalue $\mu$ is obtained by setting the determinant of this equation to zero. For
the two values of $c$ we get

$$c = 0.2:\quad \mu = -0.05025 + 1.0013i,$$
$$c = 1.0:\quad \mu = -0.29178 + 1.0326i.$$

We see that the prediction that $c$ changes only the damping and not the frequency
is quite good, and that linear extrapolation worked quite well for predicting the
damping. •••
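The comparison in this example can be reproduced numerically. The following is a minimal sketch, assuming unit masses ($M = I$), $K = [[2, -1], [-1, 2]]$ and a dashpot acting on the first degree of freedom, which are the matrices implied by Eq. (a). The function names and the companion-matrix linearization are choices made for this illustration and are not taken from the text.

```python
import numpy as np

def first_mode_eigenvalue(c):
    """Eigenvalue of (mu^2 M + mu C + K) u = 0 for the lowest-frequency mode."""
    M = np.eye(2)                                  # assumed unit masses
    C = np.array([[c, 0.0], [0.0, 0.0]])           # dashpot on the first mass
    K = np.array([[2.0, -1.0], [-1.0, 2.0]])
    n = M.shape[0]
    Minv = np.linalg.inv(M)
    # Companion form: d/dt [u, du/dt] = S [u, du/dt]; eigenvalues of S are mu.
    S = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-Minv @ K, -Minv @ C]])
    mu = np.linalg.eigvals(S)
    upper = mu[mu.imag > 0]                        # one of each conjugate pair
    return upper[np.argmin(upper.imag)]            # lowest frequency

def linear_extrapolation(c):
    """mu_a = mu(c=0) + (d mu/dc) c with d mu/dc = -0.25 and mu(0) = i."""
    return -0.25 * c + 1j

for c in (0.2, 1.0):
    print(c, first_mode_eigenvalue(c), linear_extrapolation(c))
```

The printed pairs allow the damping (real part) and frequency (imaginary part) of the extrapolated and exact eigenvalues to be compared side by side.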
The order of the damped eigenproblem is commonly reduced by approximating
the damped mode as a linear combination of a small number of natural vibration
modes $u^i$, $i = 1, \ldots, m$. This may be written as

$$u = Uq, \tag{7.3.29}$$

where $U$ is a matrix with $u^i$ as columns, and $q$ is a vector of modal amplitudes.
Substituting Eq. (7.3.29) into Eq. (7.3.26) and premultiplying by $U^T$ we get

$$[\mu^2 M_R + \mu C_R + K_R]q = 0, \tag{7.3.30}$$

where

$$M_R = U^TMU, \qquad C_R = U^TCU, \qquad K_R = U^TKU. \tag{7.3.31}$$

After we solve for the reduced eigenvector $q$ from Eq. (7.3.30), we can calculate
the derivative of the eigenvalue using two approaches. The first approach, called the
fixed-mode approach, employs Eq. (7.3.27) with $\mu$ calculated from Eq. (7.3.30) and
$u$ given by Eq. (7.3.29). The second approach, called the updated-mode approach,
uses Eq. (7.3.27) for the reduced problem, that is

$$\frac{d\mu}{dx} = -\frac{\mu^2 q^T\dfrac{dM_R}{dx}q + \mu\, q^T\dfrac{dC_R}{dx}q + q^T\dfrac{dK_R}{dx}q}{q^T[2\mu M_R + C_R]q}. \tag{7.3.32}$$

The derivative of $K_R$ is given as

$$\frac{dK_R}{dx} = U^T\frac{dK}{dx}U + \frac{dU^T}{dx}KU + U^TK\frac{dU}{dx}, \tag{7.3.33}$$

with similar expressions for the derivatives of $M_R$ and $C_R$. The names of the two
approaches are associated with the fact that the corresponding derivatives will agree
with finite-difference derivative calculations with the modes being fixed or updated,
respectively. Also, it can be shown that if we omit the terms with $dU/dx$ from the
updated-mode expression we will recover the fixed-mode result. The calculation
of derivatives of vibration modes is expensive, and for this reason the fixed-mode
approach is more appealing. However, as the following example demonstrates, the
updated-mode approach can, occasionally, be substantially more accurate.


Example 7.3.4

For the spring-mass-dashpot example shown in Fig. (7.3.1) construct a reduced model
based only on the first vibration mode. Calculate the fixed-mode and updated-mode
derivatives of the eigenvalue associated with the lowest frequency with respect to the
constant k of the leftmost spring. Compare with the exact derivatives for c = 0.2
and c = 1.0.
Full-model analysis:
The eigenvalue problem for this example is given by Eq. (a) of Example (7.3.3),
and the exact eigenvalue is solved in that example for the two required values of $c$.
For the eigenvector we use a normalization condition that the second component, $u_2$,
is equal to 1, and employ the second equation of the eigenproblem to obtain

$$u = \begin{Bmatrix} \mu^2 + 2 \\ 1 \end{Bmatrix}.$$

To calculate the derivative of $\mu$ with respect to the stiffness $k$ of the leftmost spring
we use Eq. (7.3.27) with matrices calculated in Examples 7.3.1 and 7.3.3

$$M = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad C = \begin{bmatrix} c & 0 \\ 0 & 0 \end{bmatrix}, \qquad K = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix},$$

$$M' = 0, \qquad C' = 0, \qquad K' = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},$$

where a prime is used to denote a derivative with respect to $k$. Then from Eq. (7.3.27)
we get

$$\mu' = -\frac{u^TK'u}{u^T(C + 2\mu M)u} = \frac{-(\mu^2 + 2)^2}{c(\mu^2 + 2)^2 + 2\mu[(\mu^2 + 2)^2 + 1]}.$$

For the two values of $c$ we get (see Example 7.3.3 for values of $\mu$)

$$c = 0.2:\quad \mu = -0.05025 + 1.0013i, \qquad \mu' = 0.02525 + 0.2522i,$$
$$c = 1.0:\quad \mu = -0.29178 + 1.0326i, \qquad \mu' = 0.1544 + 0.3460i.$$
Reduced-basis analysis:
The vibration frequencies and first vibration mode were calculated in Example
(7.3.1). Since the normalization condition for the full-model eigenvector was that
the second component be equal to 1, we take the vibration mode with the same
normalization. This mode was denoted with an overbar in Example (7.3.1), but we
drop this overbar since it is the only mode used here

$$u = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}.$$

Since we use only one mode for the reduced basis, $U = u$, and using Eq. (7.3.31)
with $k = 1$ we get

$$M_R = 2, \qquad C_R = c, \qquad K_R = 2.$$

Equation (7.3.30) for the reduced system becomes

$$(2\mu^2 + c\mu + 2)q = 0,$$

so that

$$\mu_R = -0.25c + i\sqrt{1 - 0.0625c^2},$$

where the subscript $R$ is used to denote the fact that this is the eigenvalue obtained
from the reduced system. The eigenvector, which has only one component, we select
as $q = 1$. For the two values of $c$ we get

$$c = 0.2:\quad \mu_R = -0.05 + 0.9987i,$$
$$c = 1.0:\quad \mu_R = -0.25 + 0.9682i.$$


It appears that the reduced model gives excellent results for the low-damping case,
and moderate errors for the high damping case.
Fixed-mode derivative:
For the fixed-mode derivative we still use Eq. (7.3.27), but with $\mu$ replaced by
$\mu_R$ and $u$ replaced by its approximation in terms of the vibration modes. Since the
eigenvector $q = 1$, this approximation is equal to the first vibration mode, so

$$\mu'_{Rf} = -\frac{u^TK'u}{u^T(C + 2\mu_R M)u} = -\frac{1}{c + 4\mu_R}.$$

For the two values of $c$ we get

$$c = 0.2:\quad \mu'_{Rf} = 0.2503i,$$
$$c = 1.0:\quad \mu'_{Rf} = 0.2582i,$$

where the subscript $f$ was used to denote derivatives calculated with the fixed-mode
approach. We note that the derivative of the imaginary part (frequency) is good only
in the low-damping case, and that the fixed-mode derivative misses altogether
the effect on the real part (damping). Large errors of this type can happen when
the derivative is small. Recall that the size of a derivative is best estimated by the
logarithmic derivative. However, here the logarithmic derivative of the real part $\mu_r$, say
for the low-damping case, is

$$\frac{d\mu_r/\mu_r}{dk/k} = \frac{0.02525}{-0.05025} = -0.5025,$$

so that it is quite substantial.


Updated-mode derivative:
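In this case we need the derivative of the vibration mode with respect to $k$. This
was calculated in Example (7.3.1) as (remember that we use $\bar{u}$ from that example)

$$u' = \begin{Bmatrix} -0.5 \\ 0 \end{Bmatrix}.$$

Then from Eq. (7.3.33)

$$K'_R = u^T[K'u + 2Ku'] = [1\ \ 1]\left[\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}\begin{Bmatrix} 1 \\ 1 \end{Bmatrix} + 2\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}\begin{Bmatrix} -0.5 \\ 0 \end{Bmatrix}\right] = 0.$$

Similarly

$$M'_R = 2u^TMu' = 2[1\ \ 1]\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{Bmatrix} -0.5 \\ 0 \end{Bmatrix} = -1,$$

$$C'_R = 2u^TCu' = 2[1\ \ 1]\begin{bmatrix} c & 0 \\ 0 & 0 \end{bmatrix}\begin{Bmatrix} -0.5 \\ 0 \end{Bmatrix} = -c.$$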

Finally, from Eq. (7.3.32)

For the two values of $c$ we get

$$c = 0.2:\quad \mu'_{Ru} = 0.025 + 0.2513i,$$
$$c = 1.0:\quad \mu'_{Ru} = 0.125 + 0.2843i,$$

which is a much better approximation to the exact derivative than $\mu'_{Rf}$. •••
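The full-model and fixed-mode numbers of this example can be checked with a few lines of code. The sketch below is illustrative only: it assumes unit masses, a dashpot $c$ on the first degree of freedom, and a stiffness matrix $K = [[1 + k, -1], [-1, 2]]$, which reduces to the matrices of the example at $k = 1$. All function names are hypothetical. The full-model derivative from Eq. (7.3.27) is compared with a central finite difference, and the fixed-mode value is evaluated from the one-mode reduced model.

```python
import numpy as np

def system(k, c):
    """Assumed matrices: unit masses, dashpot c on mass 1, leftmost spring k."""
    M = np.eye(2)
    C = np.array([[c, 0.0], [0.0, 0.0]])
    K = np.array([[1.0 + k, -1.0], [-1.0, 2.0]])
    return M, C, K

def first_mode(k, c):
    """Eigenvalue and eigenvector (scaled so u2 = 1) of the lowest damped mode."""
    M, C, K = system(k, c)
    Minv = np.linalg.inv(M)
    S = np.block([[np.zeros((2, 2)), np.eye(2)],
                  [-Minv @ K, -Minv @ C]])          # companion matrix
    mu, Z = np.linalg.eig(S)
    candidates = [i for i in range(4) if mu[i].imag > 0]
    i = min(candidates, key=lambda j: mu[j].imag)    # lowest frequency
    u = Z[:2, i]                                     # displacement part
    return mu[i], u / u[1]

def exact_derivative(k, c):
    """Eq. (7.3.27) with dM/dk = dC/dk = 0 (plain transpose, no conjugation)."""
    M, C, K = system(k, c)
    mu, u = first_mode(k, c)
    dK = np.array([[1.0, 0.0], [0.0, 0.0]])
    return -(u @ dK @ u) / (u @ (C + 2.0 * mu * M) @ u)

def fixed_mode_derivative(c):
    """One-mode reduced model with U = [1, 1]^T evaluated at k = 1."""
    mu_R = -0.25 * c + 1j * np.sqrt(1.0 - 0.0625 * c**2)
    return -1.0 / (c + 4.0 * mu_R)

def finite_difference(k, c, h=1e-6):
    """Central difference of the full-model eigenvalue with respect to k."""
    return (first_mode(k + h, c)[0] - first_mode(k - h, c)[0]) / (2.0 * h)

for c in (0.2, 1.0):
    print(c, exact_derivative(1.0, c), fixed_mode_derivative(c))
```

The fixed-mode value is purely imaginary for both values of $c$, which is the loss of the real (damping) part noted in the example.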

In many applications the damping matrix is not symmetric, and then it is convenient
to transform the equations of motion, Eq. (7.3.24), to a first-order system

$$B\dot{\bar{w}} + A\bar{w} = 0, \tag{7.3.34}$$

where
(7.3.35)

Setting

$$\bar{w} = w e^{\mu t}, \tag{7.3.36}$$

we get a first-order eigenvalue problem

$$Aw + \mu Bw = 0. \tag{7.3.37}$$

For calculating the derivatives of the eigenvalues it is convenient to use the left
eigenvector $v$, which is the solution of the associated eigenproblem

$$A^Tv + \mu B^Tv = 0. \tag{7.3.38}$$

The two eigenproblems defined in Eqs. (7.3.38) and (7.3.37) are easily shown to have
the same eigenvalues (e.g., [18]). Differentiating Eq. (7.3.37) with respect to a design
variable $x$

$$(A + \mu B)\frac{dw}{dx} + \left(\frac{dA}{dx} + \mu\frac{dB}{dx}\right)w + \frac{d\mu}{dx}Bw = 0, \tag{7.3.39}$$

and premultiplying by $v^T$ we get

$$\frac{d\mu}{dx} = -\frac{v^T\left(\dfrac{dA}{dx} + \mu\dfrac{dB}{dx}\right)w}{v^TBw}. \tag{7.3.40}$$
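Equation (7.3.40) is easy to verify numerically on a small nonsymmetric problem. The sketch below is an illustration under stated assumptions: the matrices `A0`, `A1`, `B0`, `B1` are arbitrary test data invented here, with $A(x) = A_0 + xA_1$ and $B(x) = B_0 + xB_1$, and the helper names are hypothetical. The left eigenvector is obtained from the transposed problem and matched to the right eigenvector by eigenvalue.

```python
import numpy as np

# Arbitrary nonsymmetric test data (assumed for illustration only).
A0 = np.array([[1.0, 0.2, 0.0], [0.1, 2.0, 0.3], [0.0, -0.2, 3.5]])
A1 = np.array([[0.5, 0.1, 0.0], [0.0, -0.3, 0.2], [0.1, 0.0, 0.4]])
B0 = np.array([[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.05, 0.0, 1.0]])
B1 = np.diag([0.1, 0.2, -0.1])

def matrices(x):
    return A0 + x * A1, B0 + x * B1

def eigen_triplet(x, target=None):
    """Return (mu, w, v) with A w + mu B w = 0 and A^T v + mu B^T v = 0."""
    A, B = matrices(x)
    mu_r, W = np.linalg.eig(-np.linalg.solve(B, A))        # right problem
    mu_l, V = np.linalg.eig(-np.linalg.solve(B.T, A.T))    # left problem
    if target is None:
        i = np.argmin(np.abs(mu_r))          # track the smallest |mu|
    else:
        i = np.argmin(np.abs(mu_r - target)) # track a given eigenvalue
    j = np.argmin(np.abs(mu_l - mu_r[i]))    # matching left eigenvector
    return mu_r[i], W[:, i], V[:, j]

def eigenvalue_derivative(x):
    """d mu/dx = -v^T (dA/dx + mu dB/dx) w / (v^T B w)."""
    A, B = matrices(x)
    mu, w, v = eigen_triplet(x)
    return -(v @ (A1 + mu * B1) @ w) / (v @ B @ w)

def finite_difference(x, h=1e-6):
    mu0, _, _ = eigen_triplet(x)
    mu_p, _, _ = eigen_triplet(x + h, target=mu0)
    mu_m, _, _ = eigen_triplet(x - h, target=mu0)
    return (mu_p - mu_m) / (2.0 * h)

print(eigenvalue_derivative(0.3), finite_difference(0.3))
```

No eigenvector normalization is needed here because the scale factors of $w$ and $v$ cancel between numerator and denominator.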
To obtain derivatives of the eigenvector we need a normalization condition. A
quadratic condition such as Eq. (7.3.2) is inappropriate because the eigenvector is
complex and $w^TWw$ can be zero. Even if we eliminate this possibility by replacing
the transpose with the Hermitian transpose, the condition

$$w^HWw = 1 \tag{7.3.41}$$

does not define the eigenvector uniquely because we can still multiply the eigenvector
by any complex number of modulus one without changing the product in Eq. (7.3.41).
Therefore, it is more reasonable to normalize the eigenvectors by requiring that

$$w_m = v_m = 1, \tag{7.3.42}$$

where $m$ is chosen so that both $w_m$ and $v_m$ are not small compared to other
components of $w$ and $v$. The derivative of the normalization condition gives us

$$\frac{dw_m}{dx} = 0, \qquad \frac{dv_m}{dx} = 0, \tag{7.3.43}$$

and together with Eq. (7.3.39) we can solve for the derivative of the eigenvector. This
is the direct method for calculating the eigenvector derivatives. As in the symmetric
case, the adjoint method for calculating the same derivatives is based on expressing
the derivative of the eigenvector in terms of all the eigenvectors of the problem.
Denoting the $i$th eigenvalue as $\mu_i$ and the corresponding eigenvectors as $w^i$ and $v^i$,
we assume

$$\frac{dw^k}{dx} = \sum_{j=1}^{l} c_{kj}w^j, \tag{7.3.44}$$

and the coefficients $c_{kj}$, obtained by substituting Eq. (7.3.44) into Eq. (7.3.39) and
premultiplying by $(v^j)^T$, are

$$c_{kj} = -\frac{(v^j)^T\left(\dfrac{dA}{dx} + \mu_k\dfrac{dB}{dx}\right)w^k}{(\mu_k - \mu_j)(v^j)^TBw^j}, \qquad k \ne j, \tag{7.3.45}$$

and

$$c_{kk} = -\sum_{j \ne k} c_{kj}w^j_m. \tag{7.3.46}$$

The upper limit in the sum, $l$, is the order of the matrices $A$ and $B$. As in the
symmetric case, it is possible to truncate the series without taking all the eigenvectors
for the purpose of reducing the cost of the derivative calculation. This introduces an
error which, in general, is problem dependent. Additional information on the various
options for derivative calculation can be found in [10].

7.3.3 Sensitivity Derivatives for Nonlinear Eigenvalue Problems

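In flutter and nonlinear vibration problems, we encounter eigenvalue problems
where the dependence on the eigenvalue is not linear. For example, Bindolino and
Mantegazza [19] consider an aeroelastic response problem which produces a
transcendental eigenvalue problem of the form

$$A(\mu, x)u = 0. \tag{7.3.47}$$

Differentiating Eq. (7.3.47) we get

$$A\frac{du}{dx} + \frac{d\mu}{dx}\frac{\partial A}{\partial\mu}u = -\frac{\partial A}{\partial x}u. \tag{7.3.48}$$

Using the normalizing condition $u_m = 1$ we can solve Eq. (7.3.48) for $du/dx$ and
$d\mu/dx$. Alternatively, it is possible to use the adjoint method, employing the left
eigenvector $v$ satisfying

$$v^TA = 0, \qquad v_m = 1, \tag{7.3.49}$$

to obtain

$$\frac{d\mu}{dx} = -\frac{v^T\dfrac{\partial A}{\partial x}u}{v^T\dfrac{\partial A}{\partial\mu}u}. \tag{7.3.50}$$

A common treatment of flutter problems is to have two real parameters representing
the frequency and speed as an eigenpair instead of one complex eigenvalue. For
example, Murthy [20] replaces Eq. (7.3.47) by

$$A(M, \omega)u = 0, \tag{7.3.51}$$

where the Mach number, $M$, and the frequency, $\omega$, are real parameters. Using this
approach, differentiate Eq. (7.3.51), premultiply by $v^T$, and use Eq. (7.3.49) to get

$$f_M\frac{dM}{dx} + f_\omega\frac{d\omega}{dx} = -f_x, \tag{7.3.52}$$

where

$$f_M = v^T\frac{\partial A}{\partial M}u, \qquad f_\omega = v^T\frac{\partial A}{\partial\omega}u, \qquad f_x = v^T\frac{\partial A}{\partial x}u. \tag{7.3.53}$$

Multiplying Eq. (7.3.52) by $\bar{f}_\omega$ (the complex conjugate of $f_\omega$) we get

$$f_M\bar{f}_\omega\frac{dM}{dx} + |f_\omega|^2\frac{d\omega}{dx} = -\bar{f}_\omega f_x. \tag{7.3.54}$$

The second term in Eq. (7.3.54) as well as $dM/dx$ are real, so by taking the imaginary
part of Eq. (7.3.54) we get

$$\frac{dM}{dx} = -\frac{\mathrm{Im}(\bar{f}_\omega f_x)}{\mathrm{Im}(f_M\bar{f}_\omega)} = -\frac{\mathrm{Im}\left[\left(v^T\dfrac{\partial A}{\partial x}u\right)\overline{\left(v^T\dfrac{\partial A}{\partial\omega}u\right)}\right]}{\mathrm{Im}\left[\left(v^T\dfrac{\partial A}{\partial M}u\right)\overline{\left(v^T\dfrac{\partial A}{\partial\omega}u\right)}\right]}. \tag{7.3.55}$$

Next, multiplying Eq. (7.3.52) by $\bar{f}_M$ and following a similar procedure we find

$$\frac{d\omega}{dx} = -\frac{\mathrm{Im}(\bar{f}_M f_x)}{\mathrm{Im}(\bar{f}_M f_\omega)} = -\frac{\mathrm{Im}\left[\left(v^T\dfrac{\partial A}{\partial x}u\right)\overline{\left(v^T\dfrac{\partial A}{\partial M}u\right)}\right]}{\mathrm{Im}\left[\left(v^T\dfrac{\partial A}{\partial\omega}u\right)\overline{\left(v^T\dfrac{\partial A}{\partial M}u\right)}\right]}. \tag{7.3.56}$$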
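The step from the single complex equation (7.3.52) to the two real derivatives in Eqs. (7.3.55) and (7.3.56) can be illustrated with scalar arithmetic. In this minimal sketch the three complex coefficients are passed in directly; the function name and argument names are hypothetical, and the signs follow from taking imaginary parts of Eq. (7.3.52) after multiplying by the appropriate conjugate.

```python
def flutter_derivatives(f_M, f_w, f_x):
    """Solve f_M*dM + f_w*dw = -f_x for the two REAL unknowns dM and dw.

    f_M, f_w, f_x are the complex scalars v^T (dA/dM) u, v^T (dA/d omega) u
    and v^T (dA/dx) u.
    """
    # Multiply by conj(f_w): the d omega term becomes real and drops out
    # of the imaginary part.
    dM = -(f_w.conjugate() * f_x).imag / (f_M * f_w.conjugate()).imag
    # Multiply by conj(f_M): the dM term becomes real and drops out.
    dw = -(f_M.conjugate() * f_x).imag / (f_M.conjugate() * f_w).imag
    return dM, dw

# Manufactured check: choose dM = 2, d omega = 3 and build f_x from them.
f_M, f_w = 1 + 2j, 3 - 1j
f_x = -(2 * f_M + 3 * f_w)
print(flutter_derivatives(f_M, f_w, f_x))
```

The calculation breaks down when $f_M$ and $f_\omega$ have the same phase, because both denominators are then zero.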

7.4 Sensitivity of Constraints on Transient Response

Compared to constraints on steady-state response, constraints on transient response
depend on one additional parameter: time. That is, a typical constraint may be
written as

$$g(u, x, t) \ge 0, \tag{7.4.1}$$

where for simplicity we assume that the constraint must be satisfied from $t = 0$ to
some final time $t_f$. For actual computation the constraint must be discretized at a
series of $n_t$ time points as

$$g_i = g(u, x, t_i) \ge 0, \qquad i = 1, \ldots, n_t. \tag{7.4.2}$$

The distribution of time points has to be dense enough to preclude the possibility
of significant constraint violation between time points. This type of constraint
discretization can greatly increase the number of constraints, and thereby the cost of the
optimization. Therefore it is desirable to find ways to remove the time dependence
without substantially increasing the number of constraints.

7.4.1 Equivalent Constraints

One way of removing the time dependence of the constraint is to replace it with an
equivalent integrated constraint which averages the severity of the constraint over the
time interval. An example is the equivalent exterior constraint

$$\bar{g}(u, x) = \left[\frac{1}{t_f}\int_0^{t_f}\langle -g(u, x, t)\rangle^2\,dt\right]^{1/2}, \tag{7.4.3}$$

where $\langle a\rangle$ denotes $\max(a, 0)$. The equivalent constraint $\bar{g}$ is violated if the original
constraint is violated for any finite period of time. If, however, $g(u, x, t)$ is not violated
anywhere, $\bar{g}(u, x)$ is zero. The equivalent exterior constraint is identically zero in
the feasible domain, and so no indication is provided when the constraint is almost
critical. An equivalent constraint which is nonzero when the constraint is satisfied is
based on the Kreisselmeier-Steinhauser function [21, 22] and Eq. (7.4.2)

$$\bar{g}(u, x) = -\frac{1}{\rho}\ln\left[\sum_{i=1}^{n_t} e^{-\rho g_i}\right], \tag{7.4.4}$$

where $\rho$ is a parameter which determines the relation between $\bar{g}$ and the most critical
value of $g$, $g_{min}$. Indeed, we can write Eq. (7.4.4) as

$$\bar{g} = g_{min} - \frac{1}{\rho}\ln\left[\sum_{i=1}^{n_t} e^{-\rho(g_i - g_{min})}\right]. \tag{7.4.5}$$

And from Eq. (7.4.5) we get

$$g_{min} \ge \bar{g} \ge g_{min} - \frac{\ln(n_t)}{\rho}, \tag{7.4.6}$$

so that $\bar{g}$ is an envelope constraint in that it is always more critical than $g$. The
parameter $\rho$ determines how much more critical $\bar{g}$ is. However, if $\rho$ is made too
large for the purpose of reducing the difference between $\bar{g}$ and $g_{min}$, the problem can
become ill conditioned.
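The envelope property of Eq. (7.4.6) can be checked with a short function. This is a minimal sketch that evaluates the Kreisselmeier-Steinhauser constraint in the shifted form of Eq. (7.4.5), which keeps every exponent non-positive and so avoids overflow for large values of the parameter. The function name `ks_constraint` and the sample constraint values are assumptions made for illustration.

```python
import math

def ks_constraint(g, rho):
    """Kreisselmeier-Steinhauser envelope of the discretized constraints g_i.

    Uses the shifted form g_min - ln(sum exp(-rho (g_i - g_min))) / rho,
    which is algebraically identical to -ln(sum exp(-rho g_i)) / rho.
    """
    g_min = min(g)
    total = sum(math.exp(-rho * (gi - g_min)) for gi in g)
    return g_min - math.log(total) / rho

g_values = [0.3, 0.1, 0.5]        # sample constraint values at three time points
for rho in (5.0, 50.0, 500.0):
    print(rho, ks_constraint(g_values, rho))
```

As the parameter grows, the printed value approaches the most critical constraint value from below, at the price of a function that varies more and more sharply near points where two constraint values are close.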
The savings obtained by replacing the discretized constraint, Eq. (7.4.2), by an
equivalent one may seem illusory because the integral in Eq. (7.4.3) or the sum in
Eq. (7.4.4) usually require the evaluation of g(u, x, t) at many time points. The
savings are realized in the optimization effort and in the computation of constraint
derivatives discussed later.

[Figure 7.4.1 Critical points. Constraint function plotted against time for the nominal
design (solid line) and the perturbed design (dashed line).]

The disadvantage of equivalent constraints is that they may tend to blur design
trends. Consider, for example, a change in design which moves the constraint $g$ from
the solid to the dashed line in Fig. (7.4.1). An equivalent constraint $\bar{g}$ may become
more positive, indicating a beneficial effect, while the situation has become more
critical because we have moved closer to the constraint boundary ($g = 0$), at least at
some time point $t_{m1}$. To avoid this blurring effect we use the critical point constraint,
replacing the original constraint by

$$g(u, x, t_{mi}) \ge 0, \qquad i = 1, 2, \ldots, \tag{7.4.7}$$

where $t_{mi}$ are time points where the constraint has a local minimum. Figure (7.4.1)
shows a typical situation where the constraint function has two local minima: an
interior one at $t_{m1}$, and a boundary minimum at $t_{m2}$. The local minima are critical
points in the sense that they represent time points likely to be involved first in
constraint violations.
One attractive feature of the critical point constraint is that, for the purpose of
obtaining first derivatives, the location of the critical point may be assumed to be
fixed in time. This is shown by differentiating Eq. (7.4.7) with respect to the design
variable $x$

$$\frac{dg(t_{mi})}{dx} = \frac{\partial g}{\partial x} + \frac{\partial g}{\partial u}\frac{du}{dx} + \frac{\partial g}{\partial t}\frac{dt_{mi}}{dx}. \tag{7.4.8}$$

The last term in Eq. (7.4.8) is always zero. At an interior minimum such as $t_{m1}$ in
Fig. (7.4.1) $\partial g/\partial t$ is zero. We get a boundary minimum when $\partial g/\partial t$ is positive at
the left boundary or negative at the right boundary. This boundary minimum cannot
move away from the boundary unless the slope $\partial g/\partial t$ becomes zero. This means that
as long as $\partial g/\partial t$ is nonzero at a boundary minimum, the minimum cannot move, so
that $dt_{mi}/dx$ is zero.
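In practice the critical points are located on a sampled time history. The sketch below is one simple way to do this, assuming the constraint has already been evaluated on a uniform time grid. The function name and the sample constraint history are invented for illustration: the history has an interior minimum and a right-boundary minimum, like the curve described for Fig. (7.4.1).

```python
import numpy as np

def critical_point_indices(g):
    """Indices of local minima of the samples g, including boundary minima."""
    g = np.asarray(g, dtype=float)
    idx = []
    if g[0] < g[1]:                       # left boundary: g increasing away from it
        idx.append(0)
    for i in range(1, len(g) - 1):        # interior local minima
        if g[i] < g[i - 1] and g[i] <= g[i + 1]:
            idx.append(i)
    if g[-1] < g[-2]:                     # right boundary: g still decreasing
        idx.append(len(g) - 1)
    return idx

# Hypothetical constraint history on 0 <= t <= 3.5 (not from the text).
t = np.linspace(0.0, 3.5, 351)
g = 1.0 + 0.5 * np.cos(2.0 * t) - 0.2 * t

for i in critical_point_indices(g):
    print("critical point at t =", t[i], "g =", g[i])
```

Only the constraint values at these few time points need to be carried into the optimization, and, by Eq. (7.4.8), their first derivatives can be evaluated with the times held fixed.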

7.4.2 Derivatives of Constraints

For the purpose of calculating derivatives of constraints we assume that the constraint
is of the form

$$g(u, x) = \int_0^{t_f} p(u, x, t)\,dt \ge 0. \tag{7.4.9}$$

This form represents most equivalent constraints, as well as the critical-point
constraint, which can be obtained by defining

$$p(u, x, t) = g(u, x, t)\,\delta(t - t_{mi}). \tag{7.4.10}$$

The derivative of the constraint with respect to a design variable $x$ is

$$\frac{dg}{dx} = \int_0^{t_f}\left(\frac{\partial p}{\partial x} + \frac{\partial p}{\partial u}\frac{du}{dx}\right)dt. \tag{7.4.11}$$

To evaluate the integral we need to differentiate the equations of motion with respect
to $x$. These equations are written in a general first-order form

$$A\dot{u} = f(u, x, t), \qquad u(0) = u_0, \tag{7.4.12}$$

where $u$ is a vector of generalized degrees of freedom, and $f$ is a vector which includes
contributions of external and internal loads.
We now discuss several methods for calculating the constraint derivative, starting
with the simplest, the direct method. As in the steady-state case, the direct method
proceeds by differentiating Eq. (7.4.12) to obtain an equation for $du/dx$

$$A\frac{d\dot{u}}{dx} = J\frac{du}{dx} - \frac{dA}{dx}\dot{u} + \frac{\partial f}{\partial x}, \qquad \frac{du}{dx}(0) = 0, \tag{7.4.13}$$

where $J$ is the Jacobian of $f$

$$J_{ij} = \frac{\partial f_i}{\partial u_j}. \tag{7.4.14}$$

The direct method consists of solving for $du/dx$ from Eq. (7.4.13), and then
substituting into Eq. (7.4.11). The disadvantage of this method is that each design variable
requires the solution of a system of differential equations, Eq. (7.4.13). When we have
many design variables and few constraint functions we can, as in the static case,
use a vector of adjoint variables which depends only on the constraint functions and
not on the design variables. To obtain the adjoint method, we pursue the standard
procedure of multiplying the derivatives of the response equations, Eq. (7.4.13), by
an adjoint vector and adding them to the derivatives of the constraint

$$\frac{dg}{dx} = \int_0^{t_f}\left(\frac{\partial p}{\partial x} + \frac{\partial p}{\partial u}\frac{du}{dx}\right)dt + \int_0^{t_f}\lambda^T\left(A\frac{d\dot{u}}{dx} - J\frac{du}{dx} - \frac{\partial f}{\partial x} + \frac{dA}{dx}\dot{u}\right)dt. \tag{7.4.15}$$

We want to group together all the terms involving $du/dx$ and define the adjoint
variable so that the coefficient of $du/dx$ will vanish. To do that, we need to integrate
the term involving $d\dot{u}/dx$. Integrating by parts and rearranging we obtain

$$\frac{dg}{dx} = \int_0^{t_f}\left\{\frac{\partial p}{\partial x} - \lambda^T\left(\frac{\partial f}{\partial x} - \frac{dA}{dx}\dot{u}\right) + \left[\frac{\partial p}{\partial u} - \lambda^T(\dot{A} + J) - \dot{\lambda}^TA\right]\frac{du}{dx}\right\}dt + \lambda^TA\frac{du}{dx}\bigg|_0^{t_f}. \tag{7.4.16}$$

Equation (7.4.16) indicates that the adjoint variable should satisfy

$$A^T\dot{\lambda} + (J^T + \dot{A}^T)\lambda = \left(\frac{\partial p}{\partial u}\right)^T, \qquad \lambda(t_f) = 0. \tag{7.4.17}$$

Then from Eq. (7.4.16) we get

$$\frac{dg}{dx} = \int_0^{t_f}\left[\frac{\partial p}{\partial x} - \lambda^T\left(\frac{\partial f}{\partial x} - \frac{dA}{dx}\dot{u}\right)\right]dt, \tag{7.4.18}$$

where we used the fact that $du/dx$ is zero at $t = 0$. Equation (7.4.17) is a system of
ordinary differential equations for $\lambda$ which are integrated backwards (from $t_f$ to 0).
This system has to be solved once for each constraint rather than once for each design
variable. As in the static case, the direct method is preferable when the number of
design variables is smaller than the number of constraints, and the adjoint method
is preferable otherwise. Equation (7.4.17) takes a simpler form for the critical-point
constraint

(7.4.19)

By integrating Eq. (7.4.19) from $t_{mi} - \epsilon$ to $t_{mi} + \epsilon$ for an infinitesimal $\epsilon$, we can easily
show that Eq. (7.4.19) is equivalent to

(7.4.20)
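As a worked step, the two critical-point forms referred to as Eqs. (7.4.19) and (7.4.20) can be derived from Eqs. (7.4.10) and (7.4.17). The block below is a reconstruction under stated assumptions, not a quotation: it assumes the terminal condition $\lambda(t_f) = 0$ for the adjoint variable, which makes $\lambda$ vanish for $t > t_{mi}$, and it treats $A$ as nonsingular at $t_{mi}$.

```latex
% Substituting p = g \delta(t - t_{mi}) into the adjoint equation (7.4.17):
A^T\dot{\lambda} + (J^T + \dot{A}^T)\lambda
  = \left(\frac{\partial g}{\partial u}\right)^T \delta(t - t_{mi}),
  \qquad \lambda(t_f) = 0 .

% Integrating from t_{mi} - \epsilon to t_{mi} + \epsilon, with \lambda = 0 for t > t_{mi},
% only the jump in \lambda survives as \epsilon \to 0:
-A^T(t_{mi})\,\lambda(t_{mi}^-) = \left(\frac{\partial g}{\partial u}\right)^T(t_{mi}) .

% Hence the delta function can be replaced by a terminal condition at t_{mi}:
A^T\dot{\lambda} + (J^T + \dot{A}^T)\lambda = 0 \quad (0 \le t < t_{mi}),
\qquad
A^T(t_{mi})\,\lambda(t_{mi}) = -\left(\frac{\partial g}{\partial u}\right)^T(t_{mi}) .
```

For a scalar problem with $A = a$, $g = c - u$ and $t_{mi} = t_f$, this terminal condition gives $\lambda(t_f) = 1/a$.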

A third method available for derivative calculation is the Green's function
approach [23]. This method is useful when the number of degrees of freedom in Eq.
(7.4.12) is smaller than either the number of design variables or the number of
constraints. This can happen when the order of Eq. (7.4.12) has been reduced by
employing modal analysis. The Green's function method will be discussed for the
case of $A = I$ in Eq. (7.4.12) so that Eq. (7.4.13) becomes

$$\frac{d\dot{u}}{dx} = J\frac{du}{dx} + \frac{\partial f}{\partial x}, \qquad \frac{du}{dx}(0) = 0. \tag{7.4.21}$$

The solution of Eq. (7.4.21) may be written [23] in terms of Green's function $K(t, \tau)$
as

$$\frac{du}{dx} = \int_0^{t_f} K(t, \tau)\frac{\partial f}{\partial x}(\tau)\,d\tau, \tag{7.4.22}$$

where $K(t, \tau)$ satisfies

$$\dot{K}(t, \tau) - J(t)K(t, \tau) = \delta(t - \tau)I, \qquad K(0, \tau) = 0, \tag{7.4.23}$$

and where $\delta(t - \tau)$ is the Dirac delta function. It is easy to check, by direct
substitution, that $du/dx$ defined by Eq. (7.4.22) indeed satisfies Eq. (7.4.21).
If the elements of $J$ are bounded then it can be shown that Eq. (7.4.23) is
equivalent to

$$K(t, \tau) = 0, \quad t < \tau,$$
$$K(\tau, \tau) = I, \tag{7.4.24}$$
$$\dot{K}(t, \tau) - J(t)K(t, \tau) = 0, \quad t > \tau.$$

Therefore, the integration of Eq. (7.4.22) needs to be carried out only up to $\tau = t$. To
see how $du/dx$ is evaluated with the aid of Eq. (7.4.24), assume that we divide the
interval $0 \le t \le t_f$ into $n$ subintervals with end points at $\tau_0 = 0 < \tau_1 < \ldots < \tau_n = t_f$.
The end points $\tau_i$ are dense enough to evaluate Eq. (7.4.22) by numerical integration
and to interpolate $du/dx$ to other time points of interest with sufficient accuracy. We
now define the initial value problem

$$\dot{K}(t, \tau_k) - J(t)K(t, \tau_k) = 0, \qquad K(\tau_k, \tau_k) = I, \qquad k = 0, 1, \ldots, n - 1. \tag{7.4.25}$$

Each of the equations in (7.4.25) is integrated from $\tau_k$ to $\tau_{k+1}$ to yield $K(\tau_{k+1}, \tau_k)$.
The value of $K$ for any other pair of points is given by (see [23] for proof)

$$K(\tau_j, \tau_k) = K(\tau_j, \tau_{j-1})K(\tau_{j-1}, \tau_{j-2})\cdots K(\tau_{k+1}, \tau_k), \qquad j > k. \tag{7.4.26}$$

The solution for $K$ is equivalent to solving $n_m$ systems of the type of Eq. (7.4.13)
or (7.4.20), where $n_m$ is the order of the vector $u$. Therefore, the Green's function
method should be considered for cases where the number of design variables and
constraints both exceed $n_m$. This is likely to happen when the order of the system
has been reduced by using some type of modal or reduced-basis approximation.

Example 7.4.1

We consider a single degree-of-freedom system governed by the differential equation

$$a\dot{u} = (u - b)^2, \qquad u(0) = 0,$$

and a constraint on the response $u$ in the form

$$g(u) = c - u(t) \ge 0, \qquad 0 \le t \le t_f.$$

The response has been calculated and found to be monotonically increasing, so that
the critical-point constraint takes the form

$$\bar{g}(u) = c - u(t_f) \ge 0.$$

We want to use the direct, adjoint, and Green's function methods to calculate the
derivative of $\bar{g}$ with respect to $a$ and $b$.
The problem may be integrated directly to yield

$$u = \frac{b^2t}{bt + a}.$$

In our notation

$$A = a, \qquad J = \frac{\partial f}{\partial u} = 2(u - b).$$

Direct Method. The direct method requires us to write Eq. (7.4.13) for $x = a$ and
$x = b$. For $x = a$ we obtain

$$a\frac{d\dot{u}}{da} = 2(u - b)\frac{du}{da} - \dot{u}, \qquad \frac{du}{da}(0) = 0.$$

In general the values for $u$ and $\dot{u}$ would be available only numerically, so that the
equation for $du/da$ will also be integrated numerically. Here, however, we have the
closed-form solution for $u$, so that we can substitute it into the derivative equation

$$a\frac{d\dot{u}}{da} = -\frac{2ab}{bt + a}\frac{du}{da} - \frac{ab^2}{(bt + a)^2}, \qquad \frac{du}{da}(0) = 0,$$

and solve analytically to obtain

$$\frac{du}{da} = -\frac{b^2t}{(bt + a)^2}.$$

Then

$$\frac{d\bar{g}}{da} = -\frac{du}{da}(t_f) = \frac{b^2t_f}{(bt_f + a)^2}.$$

We now repeat the process for $x = b$. Equation (7.4.13) becomes

$$a\frac{d\dot{u}}{db} = 2(u - b)\frac{du}{db} - 2(u - b), \qquad \frac{du}{db}(0) = 0.$$

Solving for $du/db$ we obtain

$$\frac{du}{db} = \frac{b^2t^2 + 2abt}{(bt + a)^2},$$

and then

$$\frac{d\bar{g}}{db} = -\frac{du}{db}(t_f) = -\frac{b^2t_f^2 + 2abt_f}{(bt_f + a)^2}.$$

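Adjoint Method. The adjoint method requires the solution of Eq. (7.4.20) which
becomes

$$a\dot{\lambda} + 2(u - b)\lambda = 0, \qquad \lambda(t_f) = \frac{1}{a},$$

or

$$a\dot{\lambda} - \frac{2ab}{bt + a}\lambda = 0,$$

which can be integrated to yield

$$\lambda = \frac{1}{a}\left(\frac{bt + a}{bt_f + a}\right)^2.$$

Then $d\bar{g}/da$ is obtained from Eq. (7.4.18) which becomes

$$\frac{d\bar{g}}{da} = \int_0^{t_f}\lambda\dot{u}\,dt = \int_0^{t_f}\frac{1}{a}\left(\frac{bt + a}{bt_f + a}\right)^2\frac{ab^2}{(bt + a)^2}\,dt = \frac{b^2t_f}{(bt_f + a)^2}.$$

Similarly, $d\bar{g}/db$ is

$$\frac{d\bar{g}}{db} = \int_0^{t_f}2\lambda(u - b)\,dt = -\frac{2}{a}\int_0^{t_f}\left(\frac{bt + a}{bt_f + a}\right)^2\frac{ab}{bt + a}\,dt = -\frac{b^2t_f^2 + 2abt_f}{(bt_f + a)^2}.$$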

Green's Function Method. We recast the problem as

$$\dot{u} = (u - b)^2/a,$$

so that the Jacobian $J$ is

$$J = 2(u - b)/a.$$

Equation (7.4.24) becomes

$$\dot{k}(t, \tau) - [2(u - b)/a]\,k(t, \tau) = 0, \qquad k(\tau, \tau) = 1,$$

or

$$\dot{k}(t, \tau) + \frac{2b}{bt + a}k(t, \tau) = 0.$$

The solution for $k$ is

$$k = \left(\frac{b\tau + a}{bt + a}\right)^2, \qquad t \ge \tau,$$

so that from Eq. (7.4.22)

$$\frac{du}{da} = \int_0^t\frac{\partial f}{\partial a}k\,d\tau = -\int_0^t\left(\frac{b\tau + a}{bt + a}\right)^2\frac{(u - b)^2}{a^2}\,d\tau = -\frac{b^2t}{(bt + a)^2}.$$

Similarly

$$\frac{du}{db} = \int_0^t\frac{\partial f}{\partial b}k\,d\tau = -\int_0^t 2\left(\frac{b\tau + a}{bt + a}\right)^2\frac{(u - b)}{a}\,d\tau = \frac{b^2t^2 + 2abt}{(bt + a)^2}.$$

•••
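The direct and adjoint calculations of this example can also be carried out numerically, as they would be when no closed-form response is available. The sketch below is illustrative: it assumes the governing equation $a\dot{u} = (u - b)^2$ with $u(0) = 0$ and the constraint $\bar{g} = c - u(t_f)$, uses a hand-written classical Runge-Kutta integrator, and picks the sample values $a = 2$, $b = 1$, $t_f = 1$. All function names are hypothetical. For brevity the adjoint pass evaluates $u(t)$ from the closed-form solution instead of storing the forward trajectory.

```python
import numpy as np

def rk4(f, y0, t0, t1, n):
    """Classical fourth-order Runge-Kutta from t0 to t1 (t1 < t0 runs backwards)."""
    h = (t1 - t0) / n
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

def direct_method(a, b, tf, n=2000):
    """Integrate u together with du/da and du/db (one extra equation per variable)."""
    def rhs(t, y):
        u, sa, sb = y
        udot = (u - b) ** 2 / a
        J = 2.0 * (u - b)
        return np.array([udot, (J * sa - udot) / a, (J * sb - J) / a])
    _, sa, sb = rk4(rhs, [0.0, 0.0, 0.0], 0.0, tf, n)
    return -sa, -sb                     # derivatives of g = c - u(tf)

def adjoint_method(a, b, tf, n=2000):
    """Integrate lambda backwards from lambda(tf) = 1/a and accumulate the integrals."""
    def rhs(t, y):
        lam, Ia, Ib = y
        u = b * b * t / (b * t + a)     # closed-form response (shortcut)
        udot = (u - b) ** 2 / a
        # Ia(t) and Ib(t) hold the integrals from t to tf of the two integrands.
        return np.array([-2.0 * (u - b) * lam / a,
                         -lam * udot,
                         -2.0 * lam * (u - b)])
    _, Ia, Ib = rk4(rhs, [1.0 / a, 0.0, 0.0], tf, 0.0, n)
    return Ia, Ib

def closed_form(a, b, tf):
    d = (b * tf + a) ** 2
    return b * b * tf / d, -(b * b * tf * tf + 2.0 * a * b * tf) / d

print(direct_method(2.0, 1.0, 1.0))
print(adjoint_method(2.0, 1.0, 1.0))
print(closed_form(2.0, 1.0, 1.0))
```

The direct pass adds one differential equation per design variable, while the adjoint pass needs a single backward integration for this one constraint regardless of the number of design variables.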
7.4.3 Linear Structural Dynamics

For the case of linear structural dynamics it may be advantageous to retain the
second-order equations of motion rather than reduce them to a set of first-order equations.
It is also common to use modal reduction for this case. In this section we discuss the
application of the direct and adjoint methods to this special case. The equations of
motion are written as

$$M\ddot{u} + C\dot{u} + Ku = f(t). \tag{7.4.27}$$

Most often the problem is reduced in size by expressing $u$ in terms of $m$ basis functions
$u^i$, $i = 1, \ldots, m$, where $m$ is usually much less than the number of degrees of freedom
of the original system, Eq. (7.4.27)

$$u = Uq, \tag{7.4.28}$$

where $U$ is a matrix with $u^i$ as columns. Then a reduced set of equations can be
written as

$$M_R\ddot{q} + C_R\dot{q} + K_Rq = f_R, \tag{7.4.29}$$

where

$$M_R = U^TMU, \qquad C_R = U^TCU, \qquad K_R = U^TKU, \qquad f_R = U^Tf. \tag{7.4.30}$$

When the basis functions are the first $m$ natural vibration modes of the structure
scaled to unit modal masses, $U$ satisfies the equation

$$KU - MU\Omega^2 = 0, \tag{7.4.31}$$

where $\Omega$ is a diagonal matrix with the $i$th natural frequency $\omega_i$ in the $i$th row. In that
case $K_R = \Omega^2$ and $M_R = I$ are diagonal matrices. For special forms of damping, the
damping matrix $C_R$ is also diagonal so that the system Eq. (7.4.29) is uncoupled.
After $q$ is calculated from Eq. (7.4.29) we can use Eq. (7.4.28) to calculate $u$. This
modal reduction method is known as the mode-displacement method.

When the load $f$ has spatial discontinuities the convergence of the modal
approximation, Eq. (7.4.29), can be very slow [24, 25]. The convergence can be dramatically
accelerated by using the mode-acceleration method, originally proposed by Williams
[26]. The mode-acceleration method can be derived by rewriting Eq. (7.4.27) as

$$u = K^{-1}f - K^{-1}C\dot{u} - K^{-1}M\ddot{u}. \tag{7.4.32}$$

The first term in Eq. (7.4.32) is called the quasi-static solution because it represents
the response of the structure if the loads are applied very slowly. The second and
third terms are approximated in terms of the modal solution. It can be shown
(e.g., Greene [27]) that $K^{-1}$ can be approximated as

$$K^{-1} = U\Omega^{-2}U^T. \tag{7.4.33}$$

Using this approximation for the second and third terms of Eq. (7.4.32) we get

$$u = K^{-1}f - U\Omega^{-2}C_R\dot{q} - U\Omega^{-2}\ddot{q}. \tag{7.4.34}$$

This approximation is exact when $U$ contains the full set of vibration modes. Note
that $\dot{q}$ and $\ddot{q}$ in Eq. (7.4.34) are obtained from the mode-displacement solution, Eq.
(7.4.29). Therefore, there is no difference in velocities and accelerations between the
mode-displacement and the mode-acceleration methods.
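The modal representation of the flexibility matrix in Eq. (7.4.33) can be checked on a small model. The sketch below is an illustration with invented data: a three-degree-of-freedom stiffness matrix `K`, a diagonal mass matrix `M`, and a load on the last degree of freedom. The mass-normalized modes are computed through a Cholesky transformation so that only NumPy is needed; the function name is hypothetical.

```python
import numpy as np

def mass_normalized_modes(K, M):
    """Solve K U = M U Omega^2 with U^T M U = I via M = L L^T."""
    L = np.linalg.cholesky(M)
    Linv = np.linalg.inv(L)
    w2, Y = np.linalg.eigh(Linv @ K @ Linv.T)   # standard symmetric problem
    U = Linv.T @ Y                              # back-transform the modes
    return w2, U                                # w2 holds omega_i^2, ascending

# Invented test model (not from the text).
K = np.array([[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 1.0]])
M = np.diag([1.0, 2.0, 1.0])
f = np.array([0.0, 0.0, 1.0])

w2, U = mass_normalized_modes(K, M)

# Full modal set: U Omega^-2 U^T reproduces K^-1.
flex_full = U @ np.diag(1.0 / w2) @ U.T
# One retained mode: only an approximation of K^-1.
flex_one = U[:, :1] @ np.diag(1.0 / w2[:1]) @ U[:, :1].T

print("quasi-static solution:", np.linalg.solve(K, f))
print("one-mode estimate    :", flex_one @ f)
```

The gap between the two printed vectors is the part of the quasi-static response that a truncated mode-displacement solution misses and that the first term of Eq. (7.4.34) supplies.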
In considering the calculation of sensitivities we treat first the mode-displacement
method. The direct method of calculating the response sensitivity is obtained by
differentiating Eq. (7.4.29) to obtain

$$M_R\frac{d\ddot{q}}{dx} + C_R\frac{d\dot{q}}{dx} + K_R\frac{dq}{dx} = r, \tag{7.4.35}$$

where

$$r = \frac{df_R}{dx} - \frac{dM_R}{dx}\ddot{q} - \frac{dC_R}{dx}\dot{q} - \frac{dK_R}{dx}q. \tag{7.4.36}$$

The derivative of $K_R$ with respect to $x$ is given by Eq. (7.3.33), and similar
expressions are used for the derivatives of $M_R$, $C_R$, and $f_R$. The calculation is simplified
considerably by using a fixed set of basis functions $U$ or neglecting the effect of the
change in the modes. In some cases (e.g., [28]) the error associated with neglecting
the effect of changing modes is small. When this error is unacceptable we have to
face the costly calculation of the derivatives of the modes needed for calculating the
derivatives of the reduced matrices, such as Eq. (7.3.33). Fortunately it was found
by Greene [27] that the cost of calculating the derivatives of the modes can be
substantially reduced by using the modified modal method, Eq. (7.3.15), keeping only the
first term in this equation. This approximation to the derivatives of the modes may
not always be accurate, but it appears to be sufficient for calculating the sensitivity
of the dynamic response.
For the adjoint method we consider a constraint in the form of Eq. (7.4.9)

$$g(q, x) = \int_0^{t_f} p(q, x, t)\,dt \ge 0, \tag{7.4.37}$$

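so that

$$\frac{dg}{dx} = \int_0^{t_f}\left(\frac{\partial p}{\partial x} + \frac{\partial p}{\partial q}\frac{dq}{dx}\right)dt. \tag{7.4.38}$$

To avoid the calculation of $dq/dx$ we multiply the response derivative equation, Eq.
(7.4.35), by an adjoint vector, $\lambda$, and add to the derivative of the constraint

$$\frac{dg}{dx} = \int_0^{t_f}\left(\frac{\partial p}{\partial x} + \frac{\partial p}{\partial q}\frac{dq}{dx}\right)dt + \int_0^{t_f}\lambda^T\left(-M_R\frac{d\ddot{q}}{dx} - C_R\frac{d\dot{q}}{dx} - K_R\frac{dq}{dx} + r\right)dt. \tag{7.4.39}$$

We want to eliminate the response derivative terms by selecting $\lambda$ appropriately.
We use integration by parts to remove the time derivatives from the response derivative
terms. We obtain

$$\frac{dg}{dx} = \int_0^{t_f}\left\{\frac{\partial p}{\partial x} + \lambda^Tr + \left[\frac{\partial p}{\partial q} - \ddot{\lambda}^TM_R + \dot{\lambda}^TC_R - \lambda^TK_R\right]\frac{dq}{dx}\right\}dt
- \lambda^TM_R\frac{d\dot{q}}{dx}\bigg|_0^{t_f} + \dot{\lambda}^TM_R\frac{dq}{dx}\bigg|_0^{t_f} - \lambda^TC_R\frac{dq}{dx}\bigg|_0^{t_f}. \tag{7.4.40}$$

If the initial conditions do not depend on the design variable $x$, Eq. (7.4.40) suggests
the following definition for $\lambda$

$$M_R\ddot{\lambda} - C_R\dot{\lambda} + K_R\lambda = \left(\frac{\partial p}{\partial q}\right)^T, \qquad \lambda(t_f) = \dot{\lambda}(t_f) = 0, \tag{7.4.41}$$

and then Eq. (7.4.40) becomes

$$\frac{dg}{dx} = \int_0^{t_f}\left(\frac{\partial p}{\partial x} + \lambda^Tr\right)dt. \tag{7.4.42}$$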
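A single-degree-of-freedom check helps confirm the signs in the adjoint equations. The sketch below is a hypothetical test case, not taken from the text: it assumes $m\ddot{q} + c\dot{q} + kq = f_0$ starting from rest, the integrand $p = q$ (so that $g$ is the time integral of the displacement), and the design variable $x = k$, for which $r = -q$. The adjoint equation is integrated backwards together with $q$, and the result $dg/dk = \int_0^{t_f}\lambda r\,dt$ is compared with a central finite difference. The integrator and all names are choices made for this illustration.

```python
import numpy as np

def rk4(f, y0, t0, t1, n):
    """Classical fourth-order Runge-Kutta from t0 to t1 (t1 < t0 runs backwards)."""
    h = (t1 - t0) / n
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

m, c, f0, tf = 1.0, 0.05, 2.0, 5.0      # assumed data for the test case

def forward(k, n=4000):
    """Return q(tf), qdot(tf) and g = integral of q from 0 to tf."""
    def rhs(t, y):
        q, qd, g = y
        return np.array([qd, (f0 - c * qd - k * q) / m, q])
    return rk4(rhs, [0.0, 0.0, 0.0], 0.0, tf, n)

def adjoint_derivative(k, n=4000):
    """dg/dk from the adjoint equation with zero terminal conditions at tf."""
    qf, qdf, _ = forward(k, n)
    def rhs(t, y):
        q, qd, lam, lamd, acc = y
        r = -q                                    # r = -(dK/dk) q for x = k
        lamdd = (1.0 + c * lamd - k * lam) / m    # m lam'' - c lam' + k lam = dp/dq = 1
        # acc(t) holds the integral of lam*r from t to tf.
        return np.array([qd, (f0 - c * qd - k * q) / m, lamd, lamdd, -lam * r])
    y = rk4(rhs, [qf, qdf, 0.0, 0.0, 0.0], tf, 0.0, n)
    return y[4]

def finite_difference(k, h=1e-5):
    return (forward(k + h)[2] - forward(k - h)[2]) / (2.0 * h)

print(adjoint_derivative(4.0), finite_difference(4.0))
```

Note the sign change on the damping term in the adjoint equation: it makes the backward integration of the adjoint variable as well damped as the forward integration of the response.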
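For the mode-acceleration method we consider only the direct method. We start
by differentiating Eq. (7.4.27) and rearranging it as

$$\frac{du}{dx} = K^{-1}\left[\frac{df}{dx} - \frac{dK}{dx}u - C\frac{d\dot{u}}{dx} - \frac{dC}{dx}\dot{u} - M\frac{d\ddot{u}}{dx} - \frac{dM}{dx}\ddot{u}\right]. \tag{7.4.43}$$

Next we use Eq. (7.4.34) to approximate the second term, and the modal expansion
Eq. (7.4.28) to approximate the other terms to get

$$\frac{du}{dx} \approx K^{-1}\left[\frac{df}{dx} - \frac{dK}{dx}\left[K^{-1}f - U\Omega^{-2}C_R\dot{q} - U\Omega^{-2}\ddot{q}\right] - CU\frac{d\dot{q}}{dx} - \frac{dC}{dx}U\dot{q} - MU\frac{d\ddot{q}}{dx} - \frac{dM}{dx}U\ddot{q}\right]. \tag{7.4.44}$$

Finally we use the modal approximation to $K^{-1}$, Eq. (7.4.33), to obtain

$$\frac{du}{dx} \approx K^{-1}\left[\frac{df}{dx} - \frac{dK}{dx}K^{-1}f\right] + U\Omega^{-2}U^T\left[\frac{dK}{dx}U\Omega^{-2}C_R\dot{q} - \frac{dC}{dx}U\dot{q} - CU\frac{d\dot{q}}{dx}\right] + K^{-1}\left[\frac{dK}{dx}U\Omega^{-2} - \frac{dM}{dx}U\right]\ddot{q} - U\Omega^{-2}\frac{d\ddot{q}}{dx}. \tag{7.4.45}$$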
Note that the calculation involves the solution of Eqs. (7.4.29) and (7.4.35) for q and
dq/dx, followed by Eq. (7.4.45) for retrieving the du/dx. Additional details can be
found in [27].

7.5 Exercises

[Figure 7.5.1 Three-bar truss, with displacement axes x, u and y, v and applied load p.]

1. Write a program using the finite-element method to calculate the displacements
and stresses in the three-bar truss shown in Fig. (7.5.1). Also calculate the derivative
of the stress in member A with respect to $A_A$ by the forward- and central-difference
techniques. Consider the case $A_A = A_B = kA_C$. (a) Take $k = 10^{-m}$ where $m$ is the
number of decimal digits you use in the computation minus two. Find the optimum
step size. (b) Find the smallest value of $k$ that allows an error of less than 10 percent.
2. Calculate the derivatives of the stress in member A of the three bar truss of Fig.
(7.5.1) at a design point where all three cross-sectional areas have the same value
A. First calculate the derivative with respect to the cross-sectional area of A using
the direct and adjoint method. Next calculate the derivative with respect to the
cross-sectional areas of members Band C using one method only.
3. Calculate all the second derivatives of the stress in member A of problem 2 with
respect to the three cross-sectional areas.
4. Obtain a method for calculating third derivatives of constraints on displacement
and stresses (static case).
5. Obtain a finite-element approximation to the first vibration frequency of the truss
of problem 1 in terms of $A$, $l$, Young's modulus $E$ and the mass density $\rho$. Assume
that there is no bending. Then calculate the derivative of the frequency with respect
to the cross-sectional area of the three members.
6. Calculate the derivative of the lowest (in absolute magnitude) eigenvalue of prob-
lem 5 with respect to the strength c of a horizontal dashpot at joint D: (i) when
c = 0; (ii) when c is selected (by linear extrapolation on the basis of part (i)) to make
the damping ratio (negative of real part over the absolute value of the eigenvalue) be
0.05.

Figure 7.5.2 Two-span beam.
7. The beam shown in Fig. (7.5.2) needs to be stiffened to increase its buckling load.
Calculate the derivative of the buckling load with respect to the moment of inertia of
the left and right segments, and decide what is the most economical way of stiffening
the beam. Assume that the cost is proportional to the mass, and the cross-sectional
area is proportional to the square root of the moment of inertia.
8. Obtain an expression for the second derivatives of the buckling load with respect
to structural parameters.
9. Repeat Example 7.3.4 for the derivative with respect to c instead of k.
10. Consider the equation of motion for a mass-spring-damper system

$$ m\ddot{w} + c\dot{w} + kw = f(t) $$
where $f(t) = f_0 H(t)$ is a step function, and $w(0) = \dot{w}(0) = 0$. Calculate the derivative
of the maximum displacement with respect to c for the case $k/m = 4$, $c/m = 0.05$,
$f_0/m = 2$ using the direct method.
11. Obtain the derivatives of the maximum displacement in Problem 10 with respect
to c, m, $f_0$ and k using the adjoint method.
12. Solve problem 10 using Green's function method.
13. Solve problem 10 using the mode-displacement and mode-acceleration
methods with a single mode.

7.6 References

[1] Gill, P.E., Murray, W., Saunders, M.A., and Wright, M.H., "Computing Forward-
Difference Intervals for Numerical Optimization", SIAM J. Sci. and Stat. Comp.,
Vol. 4, No.2, pp. 310-321, June 1983.
[2] Iott, J., Haftka, R.T., and Adelman, H.M., "Selecting Step Sizes in Sensitivity
Analysis by Finite Differences," NASA TM-86382, 1985.
[3] Haftka, R.T., "Sensitivity Calculations for Iteratively Solved Problems," Inter-
national Journal for Numerical Methods in Engineering, Vol. 21, pp.1535-1546,
1985.

[4] Haftka, R.T., "Second-Order Sensitivity Derivatives in Structural Analysis",
AIAA Journal, Vol. 20, pp. 1765-1766, 1982.
[5] Barthelemy, B., Chon, C.T., and Haftka, R.T., "Sensitivity Approximation of
Static Structural Response", paper presented at the First World Congress on
Computational Mechanics, Austin, Texas, Sept. 1986.
[6] Barthelemy, B., and Haftka, R.T., "Accuracy Analysis of the Semi-analytical
Method for Shape Sensitivity Calculations," Mechanics of Structures and Machines,
18, 3, pp. 407-432, 1990.
[7] Barthelemy, B., Chon, C.T., and Haftka, R.T., "Accuracy Problems Associated
with Semi-Analytical Derivatives of Static Response," Finite Elements in Analysis
and Design, 4, pp. 249-265, 1988.
[8] Haug, E.J., Choi, K.K., and Komkov, V., Design Sensitivity Analysis of Structural
Systems, Academic Press, 1986.
[9] Cardani, C. and Mantegazza, P., "Calculation of Eigenvalue and Eigenvector
Derivatives for Algebraic Flutter and Divergence Eigenproblems," AIAA Journal,
Vol. 17, pp. 408-412, 1979.
[10] Murthy, D.V., and Haftka, R.T., "Derivatives of Eigenvalues and Eigenvectors
of General Complex Matrix", International Journal for Numerical Methods in
Engineering, 26, pp. 293-311, 1988.
[11] Nelson, R.B., "Simplified Calculation of Eigenvector Derivatives," AIAA Journal,
Vol. 14, pp. 1201-1205, 1976.
[12] Rogers, L.C., "Derivatives of Eigenvalues and Eigenvectors", AIAA Journal, Vol.
8, No. 5, pp. 943-944, 1970.
[13] Wang, B.P., "Improved Approximate Methods for Computing Eigenvector Derivatives
in Structural Dynamics," AIAA Journal, 29 (6), pp. 1018-1020, 1991.
[14] Sutter, T.R., Camarda, C.J., Walsh, J.L., and Adelman, H.M., "Comparison of
Several Methods for the Calculation of Vibration Mode Shape Derivatives", AIAA
Journal, 26 (12), pp. 1506-1511, 1988.
[15] Ojalvo, I.U., "Efficient Computation of Mode-Shape Derivatives for Large Dynamic
Systems", AIAA Journal, 25, 10, pp. 1386-1390, 1987.
[16] Mills-Curran, W.C., "Calculation of Eigenvector Derivatives for Structures with
Repeated Eigenvalues", AIAA Journal, 26 (7), pp. 867-871, 1988.
[17] Dailey, R.L., "Eigenvector Derivatives with Repeated Eigenvalues", AIAA Journal,
27 (4), pp. 486-491, 1989.
[18] Wilkinson, J.H., The Algebraic Eigenvalue Problem, Clarendon Press, Oxford,
1965.
[19] Bindolino, G., and Mantegazza, P., "Aeroelastic Derivatives as a Sensitivity Analysis
of Nonlinear Equations," AIAA Journal, 25 (8), pp. 1145-1146, 1987.


[20] Murthy, D.V., "Solution and Sensitivity of a Complex Transcendental Eigenproblem
with Pairs of Real Eigenvalues," Proceedings of the 12th Biennial ASME
Conference on Mechanical Vibration and Noise (DE-Vol. 18-4), Montreal, Canada,
September 17-20, 1989, pp. 229-234 (in press Int. J. Num. Meth. Eng. 1991).
[21] Kreisselmeier, G., and Steinhauser, R., "Systematic Control Design by Optimizing
a Vector Performance Index", Proceedings of IFAC Symposium on Computer
Aided Design of Control Systems, Zurich, Switzerland, 1979, pp. 113-117.
[22] Barthelemy, J-F. M., and Riley, M. F., "Improved Multilevel Optimization Approach
for the Design of Complex Engineering Systems", AIAA Journal, 26 (3),
pp. 353-360, 1988.
[23] Kramer, M.A., Calo, J.M., and Rabitz, H., "An Improved Computational Method
for Sensitivity Analysis: Green's Function Method with AIM," Appl. Math. Modeling,
Vol. 5, pp. 432-441, 1981.
[24] Sandridge, C.A. and Haftka, R.T., "Accuracy of Derivatives of Control Performance
Using a Reduced Structural Model," Paper presented at the AIAA Dynamics
Specialists Meeting, Monterey, California, April 1987.
[25] Tadikonda, S.S.K. and Baruh, H., "Gibbs Phenomenon in Structural Mechanics,"
AIAA Journal, 29 (9), pp. 1488-1497, 1991.
[26] Williams, D., "Dynamic Loads in Aeroplanes Under Given Impulsive Loads with
Particular Reference to Landing and Gust Loads on a Large Flying Boat," Great
Britain Royal Aircraft Establishment Reports SME 3309 and 3316, 1945.
[27] Greene, W.H., Computational Aspects of Sensitivity Calculations in Linear Transient
Structural Analysis, Ph.D. dissertation, Virginia Polytechnic Institute and
State University, August 1989.
[28] Greene, W.H., and Haftka, R.T., "Computational Aspects of Sensitivity Calculations
in Transient Structural Analysis", Computers and Structures, 32, pp.
433-443, 1989.

Chapter 8: Introduction to Variational Sensitivity Analysis

The methods for discrete sensitivity analysis discussed in the previous chapter
are very general in that they may be applied to a variety of nonstructural sensitivity
analyses involving systems of linear equations, eigenvalue problems, etc. However, for
structural applications they have two disadvantages. First, not all methods of struc-
tural analysis lead to the type of discretized equations that are discussed in Chapter
7. For example, shell-of-revolution codes such as FASOR [1] directly integrate the
equations of equilibrium without first converting them to systems of algebraic equa-
tions. Second, operating on the discretized equations often requires access to the
source code of the structural analysis program which implements these equations.
Unfortunately, many of the popular structural analysis programs do not provide such
access to most users. It is desirable, therefore, to have sensitivity analysis methods
that are more generally applicable and can be implemented without extensive access
to and knowledge of the insides of structural analysis programs. Variational methods
of sensitivity analysis achieve this goal by differentiating the equations governing the
structure before they are discretized. The resulting sensitivity equations can then be
solved with the aid of a structural analysis program. It is not even essential that the
same program be used for the analysis and the sensitivity calculations.
As an example of this approach consider the Euler-Bernoulli plane beam governed
by the differential equation

$$ (EI\,w_{,xx})_{,xx} = q , \tag{8.1} $$

where w denotes the transverse displacement, EI is the flexural rigidity and q is the
load. Equation (8.1) is supplemented by appropriate boundary conditions. Imagine
that we have to design a class of structures that are modeled well by this beam equa-
tion with complex loading and boundary conditions corresponding to intermediate
supports. We have an old computer program, written to solve this problem, for which
we do not have any programming documentation. We now want to use this program
to calculate the sensitivity of the response to changes in the stiffness properties of the
beam. Finite difference sensitivity calculations are, of course, the first choice in this
type of situation. However, difficulties in finding good step-sizes for accurate deriva-
tives (see Section 7.1) force us to consider the calculation of analytical derivatives.
We start by differentiating Eq. (8.1) with respect to a parameter p (since x is used in
this chapter to denote a coordinate variable, we use p for the generic design variable)
which affects the moment of inertia of the beam over part of the span

$$ (EI\,w_{,pxx})_{,xx} = -(EI_{,p}\,w_{,xx})_{,xx} , \tag{8.2} $$

where a comma subscript followed by p denotes differentiation with respect to p.
Comparing Eqs. (8.1) and (8.2) we note that both have the same left-hand side in
the unknown functions, $w$ and $w_{,p}$, respectively. If we treat the right-hand side of Eq.
(8.2) as a load, the similarity is complete. As in Chapter 7, we call this right-hand
side a pseudo load. If that pseudo load is applied to the beam, the response to the
pseudo load will be the derivative of the original response with respect to p. We now
have a prescription for using our beam analysis program for calculating sensitivity.
We need to write a postprocessor that will take the solution $w$ and the derivative of
the moment of inertia $I_{,p}$, construct the pseudo load, and output it in a form required
by our program for the load definition.
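The prescription can be tried on a case simple enough to check by hand. The following is only a sketch under stated assumptions: a closed-form simply supported beam under uniform load stands in for the "old program", the parameter p is taken to be the moment of inertia I of the whole span (so that $I_{,p} = 1$), and the function name `deflection` and all numerical values are hypothetical.

```python
def deflection(x, q, EI, L):
    """Stand-in for the black-box beam program: simply supported beam,
    uniform load q, constant flexural rigidity EI (closed-form solution)."""
    return q * x * (L**3 - 2.0 * L * x**2 + x**3) / (24.0 * EI)


# Illustrative data (assumed values, SI units).
E, I, L, q = 200.0e9, 8.0e-6, 4.0, 1.5e3
x = 0.3 * L

# Pseudo load -(E I,p w,xx),xx with p = I, so I,p = 1. For constant EI the
# response satisfies w,xxxx = q/(EI), hence the pseudo load is uniform: -q/I.
q_pseudo = -E * (q / (E * I))

# Running the same "program" with the pseudo load returns dw/dI.
dw_dI_pseudo = deflection(x, q_pseudo, E * I, L)

# Central-difference check on the original response.
h = 1.0e-4 * I
dw_dI_fd = (deflection(x, q, E * (I + h), L)
            - deflection(x, q, E * (I - h), L)) / (2.0 * h)

print(dw_dI_pseudo, dw_dI_fd)
```

For a parameter that changes I over only part of the span the pseudo load is no longer uniform, but the workflow (analysis, postprocessing into a load, second analysis) stays the same.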
There are many approaches to sensitivity calculations using variational methods.
Reference [2] provides an excellent exposition of the methods, as well as a sound
mathematical basis. The present chapter has the more modest aim of introducing
some of the basic methods with a few examples. The discussion is based on the
principle of virtual work which provides a good foundation for both discrete and
continuum based sensitivity analysis. Most of the material in this chapter is limited
to the calculation of sensitivity with respect to stiffness (sizing) parameters, with the
last section introducing sensitivity with respect to shape.
The results obtained in this chapter depend often on the differentiability of the
structural response with respect to the sizing or shape parameter. Throughout the
chapter it is assumed that the structural response is differentiable with respect to
the parameter in question, and that the sensitivity field has the same differentiability
properties with respect to space coordinates as the original response.
Finally, the material in this chapter is rather abstract, and many readers may
want to skip the derivations and focus only on the implementation of the final results
of the derivations. It is suggested that the introductory part of Section 8.1 be read
to understand the notation, and then the implementation notes at the end of each
section be read to obtain information on how to implement sensitivity calculations
using structural analysis programs without the need for access to the source code of
the program.

8.1 Linear Static Analysis

The equations governing static structural response include the strain-displacement
relation, the constitutive equations, and the equations of equilibrium. These equations
take different forms depending on whether we consider the full three-dimensional
problem or special cases such as plane stress, plate bending or beam analysis.
For the sake of obtaining results that are generally applicable, we adopt Budiansky's
operator notation for these equations. The notation is compact and allows for ease in
algebraic manipulations. However, it is abstract, and it is not always easy to grasp.
The reader who has trouble with the notation may want to translate the abstract
equations for a specific case such as plane-stress or beam analysis. For linear analysis,
the strain-displacement relation is written as

$$ \mathbf{e} = L_1(\mathbf{u}) , \tag{8.1.1} $$

where $\mathbf{e}$ is the generalized strain tensor, $\mathbf{u}$ is the displacement vector, and $L_1$ is a
linear differential operator. For example, for Euler-Bernoulli beam analysis the
generalized strain tensor has one component, the curvature $\kappa$, and Eq. (8.1.1) translates
into
$$ \kappa = w_{,xx} . \tag{8.1.2} $$
The strain is obtained from the generalized strain $\kappa$ as $\epsilon = -y\kappa$, where y is the distance
from the neutral axis of the beam. However, in using the principle of virtual work
it is convenient to use the generalized strain and stress tensors rather than actual
strains and stresses.
For plane-stress analysis $\mathbf{e}$ has actual strain components $\epsilon_x$, $\epsilon_y$, and $\gamma_{xy}$, while $L_1$
is given as

$$ L_1 = \begin{bmatrix} \partial/\partial x & 0 \\ 0 & \partial/\partial y \\ \partial/\partial y & \partial/\partial x \end{bmatrix} . \tag{8.1.3} $$

However, the constitutive equations are written in terms of generalized stresses which
are the stress resultants.
The linear constitutive equations are the appropriate version of Hooke's law, and
may be written as
$$ \boldsymbol{\sigma} = \mathbf{D}(\mathbf{e} - \mathbf{e}^i) , \tag{8.1.4} $$
where $\boldsymbol{\sigma}$ is the generalized stress tensor, $\mathbf{D}$ is the material stiffness matrix, and $\mathbf{e}^i$
is the initial strain (e.g., due to an applied temperature field). For example, for the
plane-stress problem, $\boldsymbol{\sigma}$ includes the stress resultant components $N_x$, $N_y$, and $N_{xy}$,
while for a beam in bending the stress resultant is the section bending moment M, and
the constitutive equation is
$$ M = EI(\kappa - \kappa^i) , \tag{8.1.5} $$
where E is Young's modulus and I is the section moment of inertia.
The equations of equilibrium are written via the principle of virtual work as

$$ \boldsymbol{\sigma} \bullet \delta\mathbf{e} = \mathbf{f} \bullet \delta\mathbf{u} , \tag{8.1.6} $$

where $\mathbf{f}$ is the applied load field, and a bullet denotes a scalar product followed by
integration over the structural domain. For example, for the plane-stress case

$$ \boldsymbol{\sigma} \bullet \delta\mathbf{e} = \int (N_x\,\delta\epsilon_x + N_y\,\delta\epsilon_y + N_{xy}\,\delta\gamma_{xy})\,dA $$

and

$$ \mathbf{f} \bullet \delta\mathbf{u} = \int (f_x\,\delta u + f_y\,\delta v)\,dA + \int (T_x\,\delta u + T_y\,\delta v)\,d\Gamma_T , \tag{8.1.7} $$

where $f_x$ and $f_y$ are body forces per unit area, and $T_x$, $T_y$ are tractions on the loaded
boundary $\Gamma_T$.
The virtual displacement field $\delta\mathbf{u}$ must be differentiable and satisfy the kinematic
boundary conditions, but is otherwise arbitrary. The virtual strain field $\delta\mathbf{e}$ is obtained
from the virtual displacement field via Eq. (8.1.1) as

$$ \delta\mathbf{e} = L_1(\delta\mathbf{u}) . \tag{8.1.8} $$
This operator notation for the equations is quite general, in that it is equally ap-
plicable to continuum problems as well as to discrete formulations. It is also very
convenient for sensitivity calculations. In this section we consider only sensitivities
with respect to a stiffness parameter appearing in the material stiffness matrix D.
For one or two dimensional problems the parameter can include sizing variables, such
as rod cross-sectional areas or plate thicknesses, since these variables are incorporated
in D (as is the beam section moment of inertia in Eq. (8.1.5)).

8.1.1 The Direct Method

The direct method for sensitivity calculation is obtained by differentiating the equations
defining the response of the structure with respect to p. We then obtain a set
of equations for the response sensitivities $\mathbf{u}_{,p}$, $\mathbf{e}_{,p}$, and $\boldsymbol{\sigma}_{,p}$. The governing equations
for the sensitivity fields are shown to be the same as the equations for the response
itself, albeit with different loading terms, which are called pseudo loads. The implication
is that if we replace the loading in the original problem by the pseudo loads,
our structural analysis package will compute the response sensitivities instead of the
response. We start by differentiating the strain-displacement relation

$$ \mathbf{e}_{,p} = L_1(\mathbf{u}_{,p}) . \tag{8.1.9} $$

Similarly, differentiating the constitutive equations we obtain

$$ \boldsymbol{\sigma}_{,p} = \mathbf{D}\mathbf{e}_{,p} + \mathbf{D}_{,p}(\mathbf{e} - \mathbf{e}^i) , \tag{8.1.10} $$

and the differentiated equations of equilibrium are

$$ \boldsymbol{\sigma}_{,p} \bullet \delta\mathbf{e} = 0 , \tag{8.1.11} $$

where $\delta\mathbf{e}$, given by Eq. (8.1.8), is not a function of p because $\delta\mathbf{u}$ is an arbitrary field,
and the applied load $\mathbf{f}$ is taken to be independent of p.
Note that all the sensitivity fields have units of the original fields divided by units of
p. For example, if p represents the cross-sectional area of a truss member then $\boldsymbol{\sigma}_{,p}$
has units of stress divided by area.
We now compare the differentiated equations, Eqs. (8.1.9), (8.1.10), and (8.1.11),
to the original governing equations, Eqs. (8.1.1), (8.1.4), and (8.1.6). We see that the
sensitivity fields $\mathbf{u}_{,p}$, $\mathbf{e}_{,p}$, $\boldsymbol{\sigma}_{,p}$ can be viewed as the solution of the original structure
under a different set of loads called the pseudo loads. These loads do not include any
mechanical components, but just an initial strain field $\mathbf{e}^p$. This initial strain field is
obtained by rearranging Eq. (8.1.10) as
$$ \boldsymbol{\sigma}_{,p} = \mathbf{D}(\mathbf{e}_{,p} - \mathbf{e}^p) , \qquad \mathbf{e}^p = -\mathbf{D}^{-1}\mathbf{D}_{,p}(\mathbf{e} - \mathbf{e}^i) . \tag{8.1.12} $$
For example, for truss members the relation between the generalized stress (member
force N) and the strain is
$$ N = EA(\epsilon - \epsilon^i) . \tag{8.1.13} $$
Differentiating this equation with respect to A we get
$$ N_{,A} = EA[\epsilon_{,A} + (\epsilon - \epsilon^i)/A] , \tag{8.1.14} $$
so that to implement the direct method we need to apply an initial strain of magnitude
$-(\epsilon - \epsilon^i)/A$ instead of the actual loads.
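As another example consider the isotropic plane-stress case, where the constitutive
equations are

$$ \begin{Bmatrix} N_x \\ N_y \\ N_{xy} \end{Bmatrix} = \frac{Eh}{1-\nu^2} \begin{bmatrix} 1 & \nu & 0 \\ \nu & 1 & 0 \\ 0 & 0 & \frac{1-\nu}{2} \end{bmatrix} \begin{Bmatrix} \epsilon_x \\ \epsilon_y \\ \gamma_{xy} \end{Bmatrix} . \tag{8.1.15} $$

By differentiating Eq. (8.1.15) with respect to the thickness h we can show that to
find the sensitivity with respect to a change in thickness we need to apply a pseudo
initial strain of $\mathbf{e}^p = -\mathbf{e}/h$. To obtain the sensitivity with respect to Poisson's ratio $\nu$ we
note that

$$ \mathbf{D}^{-1} = \frac{1}{Eh} \begin{bmatrix} 1 & -\nu & 0 \\ -\nu & 1 & 0 \\ 0 & 0 & 2(1+\nu) \end{bmatrix} , \qquad \mathbf{D}_{,\nu} = \frac{Eh}{(1-\nu^2)^2} \begin{bmatrix} 2\nu & 1+\nu^2 & 0 \\ 1+\nu^2 & 2\nu & 0 \\ 0 & 0 & -\frac{(1-\nu)^2}{2} \end{bmatrix} , \tag{8.1.16} $$

so that we need to apply a pseudo initial strain of

$$ \mathbf{e}^p = \frac{1}{1-\nu^2} \begin{Bmatrix} -\nu\epsilon_x - \epsilon_y \\ -\epsilon_x - \nu\epsilon_y \\ (1-\nu)\gamma_{xy} \end{Bmatrix} . \tag{8.1.17} $$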
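The two plane-stress results above can be checked numerically with a small postprocessor-style routine. This is a sketch, not part of any analysis package: the function names and the sample values of E, h, $\nu$ and the strains are assumptions chosen only for illustration, and the routine evaluates $\mathbf{e}^p = -\mathbf{D}^{-1}\mathbf{D}_{,p}\mathbf{e}$ for the case of no initial strain.

```python
def plane_stress_D(E, h, nu):
    """Material stiffness matrix of Eq. (8.1.15)."""
    c = E * h / (1.0 - nu**2)
    return [[c, c * nu, 0.0],
            [c * nu, c, 0.0],
            [0.0, 0.0, c * (1.0 - nu) / 2.0]]


def plane_stress_D_inv(E, h, nu):
    """Inverse of the plane-stress stiffness matrix."""
    c = 1.0 / (E * h)
    return [[c, -c * nu, 0.0],
            [-c * nu, c, 0.0],
            [0.0, 0.0, 2.0 * c * (1.0 + nu)]]


def plane_stress_D_nu(E, h, nu):
    """Derivative of the stiffness matrix with respect to Poisson's ratio."""
    c = E * h / (1.0 - nu**2) ** 2
    return [[2.0 * nu * c, (1.0 + nu**2) * c, 0.0],
            [(1.0 + nu**2) * c, 2.0 * nu * c, 0.0],
            [0.0, 0.0, -0.5 * (1.0 - nu) ** 2 * c]]


def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def pseudo_initial_strain(D_inv, D_p, e):
    """e^p = -D^{-1} D,p e, valid when there is no initial strain."""
    return [-s for s in matvec(D_inv, matvec(D_p, e))]


# Assumed sample data: aluminium-like sheet and an arbitrary strain state.
E, h, nu = 70.0e9, 2.0e-3, 0.3
e = [1.0e-3, -4.0e-4, 6.0e-4]          # eps_x, eps_y, gamma_xy

D = plane_stress_D(E, h, nu)
D_inv = plane_stress_D_inv(E, h, nu)

# Thickness: D is linear in h, so D,h = D/h and e^p should equal -e/h.
D_h = [[D[i][j] / h for j in range(3)] for i in range(3)]
ep_h = pseudo_initial_strain(D_inv, D_h, e)

# Poisson's ratio: compare with the closed form of Eq. (8.1.17).
ep_nu = pseudo_initial_strain(D_inv, plane_stress_D_nu(E, h, nu), e)
ep_nu_closed = [(-nu * e[0] - e[1]) / (1.0 - nu**2),
                (-e[0] - nu * e[1]) / (1.0 - nu**2),
                (1.0 - nu) * e[2] / (1.0 - nu**2)]

print(ep_h, ep_nu, ep_nu_closed)
```

In a real workflow the strain vector would come from the analysis output at each element or integration point, and the resulting pseudo initial strains would be written back as initial-strain input.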

When we analyze the structure using a finite-element model, the pseudo initial
strain is converted to a pseudo nodal force $\mathbf{f}^p$ such that
$$ \mathbf{D}\mathbf{e}^p \bullet \delta\mathbf{e} = \mathbf{f}^p \bullet \delta\mathbf{u} . \tag{8.1.18} $$
With other solution techniques the pseudo load is obtained from the initial strain
in a different manner. For example, in a three-dimensional continuum formulation
the pseudo initial strain $\mathbf{e}^p$ can be replaced by pseudo body forces with components
$f_i = -(\mathbf{D}\mathbf{e}^p)_{ij,j}$ and surface tractions with components $T_i = (\mathbf{D}\mathbf{e}^p)_{ij}n_j$, where $n_j$ are
the components of the vector normal to the boundary S, and a comma followed by
an index j denotes the derivative with respect to the coordinate $x_j$.

Figure 8.1.1 Three bar truss.

Example 8.1.1

Calculate the derivative of the stress in members A, B and C of the truss in Fig.
(8.1.1) with respect to the area of member B. At the nominal configuration all three
members have the same area A.
We assume that the areas of members A and C remain the same, $A_A$, and denote
the area of member B as $A_B$. Due to symmetry, the vertical force contributes only
to the vertical displacement, and the horizontal force only to the horizontal displacement.
Furthermore, member B does not influence the horizontal displacement. It
is easy then to check that the two displacements at the point of load application are
given as
$$ u = \frac{4P_H l}{3A_A E} , \qquad v = \frac{P_V l}{(A_B + 0.25A_A)E} . \tag{a} $$
The forces in members A, B and C are then calculated to be

$$ \begin{aligned} N_A &= 0.57735P_H + \frac{0.25P_V A_A}{A_B + 0.25A_A} = 0.97735P , \\ N_B &= \frac{P_V A_B}{A_B + 0.25A_A} = 1.6P , \\ N_C &= -0.57735P_H + \frac{0.25P_V A_A}{A_B + 0.25A_A} = -0.17735P . \end{aligned} \tag{b} $$

We can calculate the derivatives of these forces with respect to $A_B$ analytically for the
purpose of comparing them later with the derivatives we obtain using the direct method.

$$ \frac{dN_C}{dA_B} = \frac{dN_A}{dA_B} = \frac{-0.25P_V A_A}{(A_B + 0.25A_A)^2} = -0.32P/A , $$

$$ \frac{dN_B}{dA_B} = \frac{0.25P_V A_A}{(A_B + 0.25A_A)^2} = 0.32P/A . $$
For our problem, we need to apply to member B a pseudo initial strain
$$ \epsilon^p = -\epsilon_B/A_B = -N_B/(EA_B^2) = -1.6P/(EA^2) , $$
while for the other members the pseudo initial strain is zero. Note that, as with all
sensitivity fields, the units of the pseudo initial strain are units of strain divided
by units of p (area here). The displacement field generated by this initial strain is
obtained by applying to member B a pair of opposite forces, with the force at the
bottom joint (having units of force over area) being (see Eq. (8.1.18))

$$ f^p\,\delta v = \int_0^l EA_B\,\epsilon^p\,\delta\epsilon\,dy = EA_B\,\epsilon^p(\delta v/l)\,l = -(1.6P/A)\,\delta v . $$


We can get the corresponding displacements by setting the horizontal force $P_H$ to
zero and replacing the vertical force $P_V$ in Eq. (a) with $f^p$. These displacements are
the derivatives of the original displacements. Thus
$$ \frac{du}{dA_B} = 0 , \qquad \frac{dv}{dA_B} = \frac{(-1.6P/A)\,l}{(A_B + 0.25A_A)E} = \frac{-1.28Pl}{EA^2} . $$
The derivatives of $N_A$ and $N_C$ can be similarly obtained from Eq. (b):
$$ \frac{dN_C}{dA_B} = \frac{dN_A}{dA_B} = \frac{0.25(-1.6P/A)A_A}{A_B + 0.25A_A} = -0.32\frac{P}{A} . $$
However, the internal load in member B due to the pseudo initial strain, which
corresponds to the derivative of $N_B$, cannot be obtained in a similar way from Eq.
(b) because member B now has an initial strain. Instead we use Eq. (8.1.14), which
requires the derivative of $\epsilon_B$,
$$ \frac{d\epsilon_B}{dA_B} = \frac{1}{l}\frac{dv}{dA_B} = \frac{-1.28P}{EA^2} , $$
and then
$$ \frac{dN_B}{dA_B} = EA_B\left(\frac{d\epsilon_B}{dA_B} - \epsilon^p\right) = \frac{-1.28P}{A} + \frac{1.6P}{A} = 0.32\frac{P}{A} . $$
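We note that these derivatives agree with the expressions we obtained by explicit
differentiation.
To calculate the derivatives of the stresses from the derivatives of the loads we
note that
$$ \sigma_A = \frac{N_A}{A_A} , \qquad \sigma_B = \frac{N_B}{A_B} , \qquad \sigma_C = \frac{N_C}{A_A} , $$
and therefore
$$ \frac{d\sigma_A}{dA_B} = \frac{1}{A_A}\frac{dN_A}{dA_B} = \frac{-0.32P}{A^2} , \qquad \frac{d\sigma_C}{dA_B} = \frac{1}{A_A}\frac{dN_C}{dA_B} = \frac{-0.32P}{A^2} , $$
and
$$ \frac{d\sigma_B}{dA_B} = \frac{1}{A_B}\frac{dN_B}{dA_B} - \frac{N_B}{A_B^2} = \frac{0.32P}{A^2} - \frac{1.6P}{A^2} = \frac{-1.28P}{A^2} . $$
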
•••
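The steps of Example 8.1.1 can be reproduced in a few lines. The sketch below treats the closed-form expressions (a) and (b) as the "analysis program" and reuses them with the pseudo load. It assumes normalized data P = A = E = l = 1 and the loads $P_H = P$, $P_V = 2P$, which are inferred from the numerical values quoted in Eq. (b); the function name `truss_response` is hypothetical.

```python
import math


def truss_response(P_H, P_V, A_A, A_B, E, l):
    """Closed-form response of the three-bar truss, Eqs. (a) and (b).
    Returns (u, v, N_A, N_B, N_C)."""
    u = 4.0 * P_H * l / (3.0 * E * A_A)
    v = P_V * l / ((A_B + 0.25 * A_A) * E)
    s = 1.0 / math.sqrt(3.0)                       # 0.57735
    shared = 0.25 * P_V * A_A / (A_B + 0.25 * A_A)
    return u, v, s * P_H + shared, P_V * A_B / (A_B + 0.25 * A_A), -s * P_H + shared


# Normalized data; P_H = P and P_V = 2P are assumed loads.
P, A, E, l = 1.0, 1.0, 1.0, 1.0
P_H, P_V, A_A, A_B = P, 2.0 * P, A, A

u, v, N_A, N_B, N_C = truss_response(P_H, P_V, A_A, A_B, E, l)

# Pseudo initial strain in member B and the equivalent pseudo nodal force.
eps_p = -N_B / (E * A_B**2)
f_p = E * A_B * eps_p

# Same analysis with the pseudo load instead of the real loads:
# the output displacements and the forces in A and C are the sensitivities.
du, dv, dN_A, _, dN_C = truss_response(0.0, f_p, A_A, A_B, E, l)

# Member B carries the initial strain, so its force derivative uses Eq. (8.1.14).
dN_B = E * A_B * (dv / l - eps_p)

# Stress derivatives with respect to A_B.
dsig_A = dN_A / A_A
dsig_B = dN_B / A_B - N_B / A_B**2
dsig_C = dN_C / A_A

print(dv, dN_A, dN_B, dN_C, dsig_A, dsig_B, dsig_C)
```

Note that the member-B force returned by the second analysis is deliberately discarded: with an initial strain present, the force in that member must be recovered from Eq. (8.1.14) rather than from Eq. (b).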
8.1.2 The Adjoint Method

Often we do not need the derivatives of the entire displacement or stress fields, but
only a few quantities such as the derivative of the vertical displacement at a point, or
the von Mises stress at another point. In such cases it may be more economical to use
the adjoint method to calculate these derivatives. We therefore consider the adjoint
method for calculating derivatives of displacement and stress functionals. Consider
first a displacement functional defined by an integral over the structural domain V

$$ H = \int h(\mathbf{u}, p)\,dV . \tag{8.1.19} $$

This could also be used to represent the value of a displacement component at a point
by employing the Dirac delta function as part of h. The derivative of H with respect
to a design parameter p is

$$ H_{,p} = \int h_{,p}\,dV + h_{,\mathbf{u}} \bullet \mathbf{u}_{,p} , \tag{8.1.20} $$

where $h_{,\mathbf{u}}$ is a load-like vector field (recall that a bullet denotes scalar product followed
by integration over the structure). For example, in a plane-stress case if $h = u^2 + v^2$,
then
$$ h_{,\mathbf{u}} = \begin{Bmatrix} 2u \\ 2v \end{Bmatrix} . \tag{8.1.21} $$

The calculation of $h_{,p}$ and $h_{,\mathbf{u}}$ is typically easy, and the main difficulty is to obtain the
derivative of the displacement field, $\mathbf{u}_{,p}$. We can use the direct method to calculate
$\mathbf{u}_{,p}$, but instead, as shown below, we can define an adjoint problem with $h_{,\mathbf{u}}$ as
the load, and use it to eliminate $\mathbf{u}_{,p}$. Since we want the derivative of H with the
requirement that Eqs. (8.1.1), (8.1.4) and (8.1.6) are satisfied, we multiply these
equations by some appropriate Lagrange multipliers (called adjoint variables) and
add them to H. The Lagrange multipliers for Eqs. (8.1.1) and (8.1.4) are an adjoint
stress field and an adjoint strain field, respectively. Equation (8.1.6) represents the
equations of equilibrium written as the work done on a virtual displacement field $\delta\mathbf{u}$,
and the corresponding virtual strain field $\delta\mathbf{e} = L_1(\delta\mathbf{u})$. Multiplying the equations of
equilibrium by a Lagrange multiplier is equivalent to calculating the work done when
this Lagrange multiplier is treated as a virtual displacement field. So we replace
$\delta\mathbf{u}$ by the adjoint displacement field. Denoting the adjoint fields by a superscript a
we get
$$ H^* = H + \boldsymbol{\sigma}^a \bullet (\mathbf{e} - L_1(\mathbf{u})) + \mathbf{e}^a \bullet (\boldsymbol{\sigma} - \mathbf{D}(\mathbf{e} - \mathbf{e}^i)) + \mathbf{f} \bullet \mathbf{u}^a - \boldsymbol{\sigma} \bullet L_1(\mathbf{u}^a) . \tag{8.1.22} $$

Because Eqs. (8.1.1), (8.1.4) and (8.1.6) have to be satisfied for all values of p, we
have $H^* = H$, and $H^*_{,p} = H_{,p}$. We will now differentiate Eq. (8.1.22), and then define
the adjoint fields so as to get rid of the terms involving the (expensive) derivatives of
the response. The derivative of Eq. (8.1.22) with respect to p is

$$ \begin{aligned} H_{,p} = {} & \int h_{,p}\,dV + h_{,\mathbf{u}} \bullet \mathbf{u}_{,p} + (\boldsymbol{\sigma}^a - \mathbf{D}\mathbf{e}^a) \bullet \mathbf{e}_{,p} - \boldsymbol{\sigma}^a \bullet L_1(\mathbf{u}_{,p}) \\ & - \mathbf{e}^a \bullet \mathbf{D}_{,p}(\mathbf{e} - \mathbf{e}^i) + (\mathbf{e}^a - L_1(\mathbf{u}^a)) \bullet \boldsymbol{\sigma}_{,p} . \end{aligned} \tag{8.1.23} $$

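We can get rid of the terms involving $\boldsymbol{\sigma}_{,p}$ and $\mathbf{e}_{,p}$ by requiring the adjoint fields to
satisfy the linear strain-displacement relationship and Hooke's law

$$ \mathbf{e}^a = L_1(\mathbf{u}^a) , \tag{8.1.24} $$

$$ \boldsymbol{\sigma}^a = \mathbf{D}\mathbf{e}^a . \tag{8.1.25} $$
The terms involving $\mathbf{u}_{,p}$ can be removed by requiring the adjoint field to satisfy the
equilibrium equations with a body force equal to $h_{,\mathbf{u}}$, so that from the principle of
virtual work
$$ \boldsymbol{\sigma}^a \bullet \delta\mathbf{e} = h_{,\mathbf{u}} \bullet \delta\mathbf{u} . \tag{8.1.26} $$
Indeed, if we choose $\delta\mathbf{u} = \mathbf{u}_{,p}$ in Eq. (8.1.26) we get the desired elimination of the
$\mathbf{u}_{,p}$ terms. Altogether we get

$$ H_{,p} = \int h_{,p}\,dV - \mathbf{D}_{,p}(\mathbf{e} - \mathbf{e}^i) \bullet \mathbf{e}^a . \tag{8.1.27} $$

Using Eqs. (8.1.12) and (8.1.25) we can write this as

$$ H_{,p} = \int h_{,p}\,dV + \mathbf{e}^p \bullet \boldsymbol{\sigma}^a . \tag{8.1.28} $$

When we use the finite-element method for the analysis we can transform the second
term further. To this end we set $\delta\mathbf{e} = \mathbf{e}^a$, $\delta\mathbf{u} = \mathbf{u}^a$ in Eq. (8.1.18) to obtain

$$ \mathbf{D}\mathbf{e}^p \bullet \mathbf{e}^a = \mathbf{f}^p \bullet \mathbf{u}^a , \tag{8.1.29} $$

so that Eq. (8.1.28) becomes

$$ H_{,p} = \int h_{,p}\,dV + \mathbf{f}^p \bullet \mathbf{u}^a . \tag{8.1.30} $$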
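As a small illustration of Eq. (8.1.30), the derivative of the vertical joint displacement of the truss of Example 8.1.1 with respect to $A_B$ can be obtained from one adjoint analysis. The sketch assumes the closed-form displacements of Eq. (a), normalized data P = A = E = l = 1 with $P_H = P$ and $P_V = 2P$, and a functional H equal to the vertical displacement v at the loaded joint, so that the adjoint load is a unit vertical force there. The function name is hypothetical.

```python
def displacements(f_H, f_V, A_A, A_B, E, l):
    """Eq. (a): joint displacements under horizontal and vertical joint loads."""
    u = 4.0 * f_H * l / (3.0 * E * A_A)
    v = f_V * l / ((A_B + 0.25 * A_A) * E)
    return u, v


# Normalized data; P_H = P and P_V = 2P are assumed loads.
P, A, E, l = 1.0, 1.0, 1.0, 1.0
P_H, P_V, A_A, A_B = P, 2.0 * P, A, A

# Pseudo load for p = A_B: a vertical joint force -N_B/A_B (Example 8.1.1).
N_B = P_V * A_B / (A_B + 0.25 * A_A)
fp_H, fp_V = 0.0, -N_B / A_B

# Adjoint analysis for H = v: unit vertical load at the joint.
ua, va = displacements(0.0, 1.0, A_A, A_B, E, l)

# h does not depend explicitly on A_B, so H,p reduces to f^p . u^a.
dv_dAB = fp_H * ua + fp_V * va

# Central-difference check on the original response.
step = 1.0e-6
v_plus = displacements(P_H, P_V, A_A, A_B + step, E, l)[1]
v_minus = displacements(P_H, P_V, A_A, A_B - step, E, l)[1]
dv_dAB_fd = (v_plus - v_minus) / (2.0 * step)

print(dv_dAB, dv_dAB_fd)
```

The same adjoint displacements could be reused for any other design variable; only the pseudo load changes.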

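The treatment of a generalized stress functional is similar. We limit the treatment
to the case where there is no initial strain in the structure (that is, mechanical loads
are allowed, but no temperature loading, dislocations, etc.) and consider the stress
functional
$$ G = \int g(\boldsymbol{\sigma}, p)\,dV \tag{8.1.31} $$
and its derivative
$$ G_{,p} = \int g_{,p}\,dV + g_{,\boldsymbol{\sigma}} \bullet \boldsymbol{\sigma}_{,p} , \tag{8.1.32} $$

where $g_{,\boldsymbol{\sigma}}$ is a strain-like tensor. Again, to get rid of the expensive derivative of
the response, $\boldsymbol{\sigma}_{,p}$, we add the adjoint terms as Lagrange multipliers on Eqs. (8.1.1),
(8.1.4) and (8.1.6)

$$ G^* = G + \boldsymbol{\sigma}^a \bullet (\mathbf{e} - L_1(\mathbf{u})) + \mathbf{e}^a \bullet (\boldsymbol{\sigma} - \mathbf{D}\mathbf{e}) + \mathbf{f} \bullet \mathbf{u}^a - \boldsymbol{\sigma} \bullet L_1(\mathbf{u}^a) . \tag{8.1.33} $$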

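We differentiate Eq. (8.1.33) with respect to p to obtain

$$ \begin{aligned} G_{,p} = G^*_{,p} = {} & \int g_{,p}\,dV + g_{,\boldsymbol{\sigma}} \bullet \boldsymbol{\sigma}_{,p} + (\boldsymbol{\sigma}^a - \mathbf{D}\mathbf{e}^a) \bullet \mathbf{e}_{,p} - \boldsymbol{\sigma}^a \bullet L_1(\mathbf{u}_{,p}) \\ & + \mathbf{e}^a \bullet \boldsymbol{\sigma}_{,p} - \mathbf{e}^a \bullet \mathbf{D}_{,p}\mathbf{e} - \boldsymbol{\sigma}_{,p} \bullet L_1(\mathbf{u}^a) . \end{aligned} \tag{8.1.34} $$

We use Eq. (8.1.10) and rearrange terms to get

$$ \begin{aligned} G_{,p} = {} & \int g_{,p}\,dV + (\boldsymbol{\sigma}^a + \mathbf{D}g_{,\boldsymbol{\sigma}} - \mathbf{D}\mathbf{e}^a) \bullet \mathbf{e}_{,p} - \boldsymbol{\sigma}^a \bullet L_1(\mathbf{u}_{,p}) \\ & + (g_{,\boldsymbol{\sigma}} - \mathbf{e}^a) \bullet \mathbf{D}_{,p}\mathbf{e} + (\mathbf{e}^a - L_1(\mathbf{u}^a)) \bullet \boldsymbol{\sigma}_{,p} . \end{aligned} \tag{8.1.35} $$

From Eq. (8.1.35) we can see that we can eliminate the terms including derivatives
of the response by using an adjoint strain-displacement relation in the form of Eq.
(8.1.24), and setting Hooke's law for the adjoint field as
$$ \boldsymbol{\sigma}^a = \mathbf{D}(\mathbf{e}^a - g_{,\boldsymbol{\sigma}}) \tag{8.1.36} $$
and equilibrium as
$$ \boldsymbol{\sigma}^a \bullet \delta\mathbf{e} = 0 . \tag{8.1.37} $$
That is, in this case the adjoint loading is an initial strain $g_{,\boldsymbol{\sigma}}$ with no mechanical
load. Then
$$ G_{,p} = \int g_{,p}\,dV + (g_{,\boldsymbol{\sigma}} - \mathbf{e}^a) \bullet \mathbf{D}_{,p}\mathbf{e} . \tag{8.1.38} $$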

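While Eq. (8.1.38) gives us $G_{,p}$ without the need to calculate first the design sensitivity
field, its second term involves calculations of stiffness matrix derivatives at the
element level, and may require some knowledge of the details of the finite-element
analysis. To overcome this problem we note that by using Eq. (8.1.36) we can
transform the second term of Eq. (8.1.38) into
$$ (g_{,\boldsymbol{\sigma}} - \mathbf{e}^a) \bullet \mathbf{D}_{,p}\mathbf{e} = -\mathbf{D}^{-1}\boldsymbol{\sigma}^a \bullet \mathbf{D}_{,p}\mathbf{e} , \tag{8.1.39} $$
so that using Eq. (8.1.12), which with $\mathbf{e}^i = 0$ reduces to $\mathbf{e}^p = -\mathbf{D}^{-1}\mathbf{D}_{,p}\mathbf{e}$, we can also
write $G_{,p}$ as
$$ G_{,p} = \int g_{,p}\,dV + \boldsymbol{\sigma}^a \bullet \mathbf{e}^p . \tag{8.1.40} $$

In obtaining Eq. (8.1.40) we used the fact that if $\boldsymbol{\sigma}_1$ and $\boldsymbol{\sigma}_2$ are two stress tensors,
then $\boldsymbol{\sigma}_1 \bullet \mathbf{D}^{-1}\boldsymbol{\sigma}_2 = \mathbf{D}^{-1}\boldsymbol{\sigma}_1 \bullet \boldsymbol{\sigma}_2$. As with the displacement functional we can also write
$G_{,p}$ in terms of the pseudo load. We use Eq. (8.1.18) with $(\delta\mathbf{u}, \delta\mathbf{e})$ set to $(\mathbf{u}^a, \mathbf{e}^a)$ and
Eq. (8.1.12) to obtain
$$ \mathbf{f}^p \bullet \mathbf{u}^a = \mathbf{D}\mathbf{e}^p \bullet \mathbf{e}^a = -\mathbf{D}_{,p}\mathbf{e} \bullet \mathbf{e}^a , \tag{8.1.41} $$
and then Eq. (8.1.38) becomes

$$ G_{,p} = \int g_{,p}\,dV + \mathbf{f}^p \bullet \mathbf{u}^a + \mathbf{D}_{,p}\mathbf{e} \bullet g_{,\boldsymbol{\sigma}} . \tag{8.1.42} $$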

The last term in Eq. (8.1.42) still involves computations with displacements and
strains which may not be easy to implement in a general structural analysis code.
However, when G is simply the average stress (not generalized stress!) in an element,
the first and last terms often cancel. Consider, for example, the average stress in the
ith element of a truss. In a truss element the generalized stress is the member force
N, so
$$ G = \frac{1}{A_i l_i}\int N\,dl_i \qquad \text{and} \qquad g_{,\boldsymbol{\sigma}} = \frac{1}{A_i l_i} . \tag{8.1.43} $$

When we need the derivative of G with respect to a design variable which does not
affect the ith element, both the first and third terms in Eq. (8.1.42) are zero. For the
derivative of G with respect to the area of the ith element we have from Eq. (8.1.42)
(using $\mathbf{D}_{,p} = E$ and $\epsilon = N/AE$)

$$ G_{,p} = \frac{1}{l_i}\int (-N/A^2)\,dl_i + \mathbf{f}^p \bullet \mathbf{u}^a + \int (N/A)\,\frac{1}{A l_i}\,dl_i = \mathbf{f}^p \bullet \mathbf{u}^a . \tag{8.1.44} $$

Note that, as in the discrete case, both the direct and adjoint methods use the pseudo
load $\mathbf{f}^p$. In the direct method $\mathbf{f}^p$ is applied to the structure to stand for the pseudo
initial strain $\mathbf{e}^p$ of Eq. (8.1.12), and the response to that load is $\mathbf{u}_{,p}$. In the adjoint
method $\mathbf{f}^p$ is used to form a scalar product with the adjoint displacement field $\mathbf{u}^a$, as
in Eq. (8.1.42).

Example 8.1.2

We solve Example (8.1.1) again using the adjoint method to obtain the derivatives
of the stresses in members A and B with respect to the cross-sectional areas of both
members.
Consider first the stress in member B written in terms of the generalized stress
(member force) $N_B$
$$ G = \sigma_B = \frac{1}{l_B A_B}\int N_B\,dl_B . $$
The adjoint load is an initial strain $g_{,\boldsymbol{\sigma}}$, which is denoted here as $g_{,N}$ because the
member force N is the only component of $\boldsymbol{\sigma}$:

$$ g_{,N} = \frac{1}{l_B A_B} = \frac{1}{lA} . $$
Note that the adjoint initial strain is measured in units of 1/(volume) in contrast to the
dimensionless physical strains. As a result, all the units of the adjoint field will be
the original ones divided by volume. As in Example (8.1.1), the effect of this initial
strain is obtained by applying a pair of opposite forces to member B, with the force
at the bottom being $EA_B g_{,N} = E/l$. Using Eq. (a) of Example (8.1.1) we get

$$ v^a = \frac{(E/l)\,l}{(A_B + 0.25A_A)E} = \frac{0.8}{A} . $$


Following Eq. (8.1.44) we multiply this by the pseudo load $f^p$ of $-1.6P/A$ obtained
in Example (8.1.1) to get
$$ \frac{d\sigma_B}{dA_B} = G_{,A_B} = -1.28\frac{P}{A^2} , $$
which agrees with the result obtained in Example (8.1.1).
Next we calculate the derivative of $\sigma_B$ with respect to $A_A$. As in Example (8.1.1)
we need to calculate the pseudo load due to a change in $A_A$. This change affects
both members A and C, leading to pseudo initial strains of $-\epsilon_A/A_A$ and $-\epsilon_C/A_A$,
respectively. This in turn leads to pseudo loads of $-N_A/A_A$ in the direction of member
A and $-N_C/A_A$ in the direction of member C. The components of the pseudo load
are
$$ P_H^p = -\left(\frac{N_A}{A_A} - \frac{N_C}{A_A}\right)\sin 60^\circ = -\frac{P}{A} , \qquad P_V^p = -\left(\frac{N_A}{A_A} + \frac{N_C}{A_A}\right)\cos 60^\circ = -0.4\frac{P}{A} , $$
where the values of $N_A$ and $N_C$ are substituted from Eq. (b) of Example (8.1.1).
Multiplying the adjoint displacement by the pseudo load we obtain
$$ \frac{d\sigma_B}{dA_A} = G_{,A_A} = v^a P_V^p = -0.32\frac{P}{A^2} , $$
which can be easily checked directly.
Next we calculate the derivatives of $\sigma_A$ by considering the functional

$$ G = \sigma_A = \frac{1}{l_A A_A}\int N_A\,dl_A . $$
We need to impose an adjoint initial strain of
$$ g_{,N} = \frac{1}{l_A A_A} = \frac{1}{2lA} . $$
This is implemented by applying a pair of opposite forces at the two nodes of member
A of magnitude $EA_A g_{,N} = E/2l$, collinear with member A. The horizontal and vertical
components of the adjoint force at the bottom node are
$$ P_H^a = \frac{0.433E}{l} , \qquad P_V^a = \frac{0.25E}{l} . $$
Using Eq. (a) of Example (8.1.1) we get
$$ v^a = \frac{(0.25E/l)\,l}{(A_B + 0.25A_A)E} = \frac{0.2}{A} , \qquad u^a = \frac{4(0.433E/l)\,l}{3EA_A} = \frac{0.57735}{A} . $$
To get the derivative of the stress with respect to $A_B$ we multiply the adjoint
displacements by the pseudo load associated with $A_B$ to obtain
$$ \frac{d\sigma_A}{dA_B} = -\frac{1.6P}{A}\,\frac{0.2}{A} = -0.32\frac{P}{A^2} . $$
Similarly, to obtain the derivative with respect to $A_A$ we multiply the adjoint
displacements by the pseudo loads associated with $A_A$:
$$ \frac{d\sigma_A}{dA_A} = -\frac{P}{A}\,\frac{0.57735}{A} - \frac{0.4P}{A}\,\frac{0.2}{A} = -0.65735\frac{P}{A^2} . $$
This last result can be checked directly by using the expression for $N_A$ in Eq. (b) of
Example (8.1.1). •••
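The four stress derivatives of Example 8.1.2 follow the same pattern (one adjoint analysis per stress, one pseudo load per design variable), which the following sketch makes explicit. It again assumes the closed-form truss expressions (a) and (b), normalized data P = A = E = l = 1, and loads $P_H = P$, $P_V = 2P$ inferred from Eq. (b); the function and dictionary names are hypothetical.

```python
import math

SIN60, COS60 = math.sqrt(3.0) / 2.0, 0.5


def displacements(f_H, f_V, A_A, A_B, E, l):
    """Eq. (a): joint displacements under horizontal and vertical joint loads."""
    u = 4.0 * f_H * l / (3.0 * E * A_A)
    v = f_V * l / ((A_B + 0.25 * A_A) * E)
    return u, v


def member_forces(P_H, P_V, A_A, A_B):
    """Eq. (b): forces in members A, B and C."""
    s = 1.0 / math.sqrt(3.0)
    shared = 0.25 * P_V * A_A / (A_B + 0.25 * A_A)
    return s * P_H + shared, P_V * A_B / (A_B + 0.25 * A_A), -s * P_H + shared


def stress_derivatives(P_H, P_V, A_A, A_B, E, l):
    N_A, N_B, N_C = member_forces(P_H, P_V, A_A, A_B)
    # Pseudo loads (horizontal, vertical) at the loaded joint, one per variable.
    pseudo = {
        "A_A": (-(N_A - N_C) / A_A * SIN60, -(N_A + N_C) / A_A * COS60),
        "A_B": (0.0, -N_B / A_B),
    }
    # Adjoint joint forces E*A*g,N along each member: E/(2l) for A, E/l for B.
    adjoint_force = {
        "sigma_A": (E / (2.0 * l) * SIN60, E / (2.0 * l) * COS60),
        "sigma_B": (0.0, E / l),
    }
    result = {}
    for stress, (fa_H, fa_V) in adjoint_force.items():
        ua, va = displacements(fa_H, fa_V, A_A, A_B, E, l)   # adjoint analysis
        for var, (fp_H, fp_V) in pseudo.items():
            result[(stress, var)] = fp_H * ua + fp_V * va    # f^p . u^a
    return result


# Normalized data; P_H = P and P_V = 2P are assumed loads.
derivs = stress_derivatives(1.0, 2.0, 1.0, 1.0, 1.0, 1.0)
for key in sorted(derivs):
    print(key, round(derivs[key], 5))
```

With many stress constraints and few design variables the loop order would normally be reversed in favor of the direct method, since the cost is driven by the number of analyses.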

8.1.3 Implementation Notes

In general, the direct method is easier to implement than the adjoint method, particularly
if the implementation is outside the structural analysis program. The direct
method will require a postprocessor that calculates the value of the pseudo initial
strain from the values of the actual strains based on Eq. (8.1.12). The derivative of
the material stiffness matrix $\mathbf{D}_{,p}$, which needs to be evaluated in this postprocessor,
requires knowledge of the form of Hooke's law used in the analysis program, but not
of any finite-element implementation. Then the values of the pseudo strain can be
used as initial strain input to the same structural analysis package. The output of
the package will be the sensitivity field, instead of the response. If the structural
analysis package does not have the capability of accepting initial strain input, it is
often possible to use a combination of a temperature field and anisotropic coefficients
of thermal expansion to get the required initial strains.

The implementation of the adjoint method with the displacement functional $H$
of Eq. (8.1.19) is very simple. We need to perform the structural analysis with the
actual loading replaced by the adjoint load $h_{,u}$. Note that the adjoint load $h_{,u}$ is
similar to the adjoint load used in the discrete case; that was the derivative of the
constraint with respect to the displacement vector (see Section 7.2.1). Its units are
the units of $h$ divided by the units of $u$, and in general these will not be the units
of force per unit volume. As a result, the units of the adjoint field will not be the
normal units associated with displacement, strain and stress fields. We also need to
add a postprocessor that will perform the calculations indicated in Eq. (8.1.28) or
Eq. (8.1.30). The former involves the pseudo initial strain of Eq. (8.1.12) and the
latter the pseudo load associated with this strain, Eq. (8.1.18). Equation (8.1.30)
is typically easier to implement than Eq. (8.1.28) because it requires only a scalar
product of the pseudo load with the adjoint displacement field. It is completely
analogous to Eq. (7.2.8), except that the pseudo load is not obtained in terms of
derivatives of stiffness matrices, and so does not require intimate knowledge of the
finite-element package.
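
The scalar-product structure described above can be illustrated on a small discrete analogue. The following is a minimal sketch, not taken from the text: the two-spring chain, the stiffness values `k1`, `k2`, the load `P`, and the helper names `solve2` and `dot` are all hypothetical choices made for illustration. It computes the derivative of the tip displacement with respect to `k1` once by the direct method (re-solving with the pseudo load) and once by the adjoint method (one solve with a unit adjoint load, then a scalar product).

```python
# Hypothetical two-spring chain (not from the text):
# wall --k1-- node 1 --k2-- node 2, with a load P applied at node 2.

def solve2(K, f):
    """Solve a 2x2 linear system K x = f by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return [(f[0] * K[1][1] - K[0][1] * f[1]) / det,
            (K[0][0] * f[1] - f[0] * K[1][0]) / det]

def dot(a, b):
    """Scalar product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

k1, k2, P = 2.0, 3.0, 1.0
K = [[k1 + k2, -k2], [-k2, k2]]
u = solve2(K, [0.0, P])              # static response

# Pseudo load for the design variable k1: -(dK/dk1) u, with dK/dk1 = [[1, 0], [0, 0]].
pseudo_load = [-u[0], 0.0]

# Direct method: apply the pseudo load to the same structure; output is du/dk1.
du_dk1 = solve2(K, pseudo_load)
direct = du_dk1[1]

# Adjoint method: the adjoint load is dH/du for H = u2, i.e. a unit load at node 2.
u_adj = solve2(K, [0.0, 1.0])
adjoint = dot(pseudo_load, u_adj)

# Closed form for this chain: u2 = P/k1 + P/k2, so du2/dk1 = -P/k1**2.
exact = -P / k1**2
print(direct, adjoint, exact)
```

Both routes use only quantities a structural analysis package already exposes (a solve with a replaced load vector), which is the point made in the paragraph above; the adjoint route needs one extra solve per functional rather than one per design variable.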

The implementation of the adjoint method for a stress functional, Eq. (8.1.31),
is more complicated. First we need to implement the calculation of an initial strain
field $g_{,\sigma}$, which is usually fairly simple. We then need to implement Eq. (8.1.42),
which requires calculations at the element level for a finite-element program. The
discussion following Eq. (8.1.42) shows that when the stress functional is just the
stress itself this difficulty can in many cases be bypassed.

8.2 Nonlinear Static Analysis and Limit Loads

8.2.1 Static Analysis

In this section we generalize the results of the previous section to the case of
geometric nonlinearity. We consider only the case where the nonlinearity is adequately
represented by replacing Eq. (8.1.1) by

$$\varepsilon = L_1(u) + \tfrac{1}{2}L_2(u), \tag{8.2.1}$$

where $L_2$ is a second order homogeneous operator. For example, for the nonlinear
deformation of a beam under lateral and axial loads, the generalized strain has one
component of axial strain $\varepsilon_x$ and one component of curvature $\kappa$, and Eq. (8.2.1) is
written as
$$\varepsilon_x = u_{,x} + \tfrac{1}{2}w_{,x}^2, \qquad \kappa = w_{,xx}. \tag{8.2.2}$$

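The variation of the strain is specified in terms of displacement variation as

$$\delta\varepsilon = L_1(\delta u) + L_{11}(u, \delta u), \tag{8.2.3}$$

where $L_{11}$ is a symmetric bilinear operator, i.e. $L_{11}(u, v) = L_{11}(v, u)$, defined by

$$L_2(u + v) = L_2(u) + L_2(v) + 2L_{11}(u, v). \tag{8.2.4}$$

In particular, setting $v = u$ and using the second-order homogeneity of $L_2$, Eq. (8.2.4) yields

$$L_{11}(u, u) = L_2(u). \tag{8.2.5}$$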
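
As a worked illustration of these operators, the block below writes them out for the beam. It is a sketch under two assumptions that are not spelled out at this point of the text: that the strain-displacement relation has the form $\varepsilon = L_1(u) + \tfrac12 L_2(u)$ (the form that appears later in Eq. (8.3.18)), and that the beam strains are $\varepsilon_x = u_{,x} + \tfrac12 w_{,x}^2$, $\kappa = w_{,xx}$. The hatted symbols for the second displacement field are notation introduced here.

```latex
% Beam displacement fields: \mathbf{u} = (u, w), \mathbf{v} = (\hat u, \hat w)
L_1(\mathbf{u}) = \begin{pmatrix} u_{,x} \\ w_{,xx} \end{pmatrix}, \qquad
L_2(\mathbf{u}) = \begin{pmatrix} w_{,x}^2 \\ 0 \end{pmatrix}, \qquad
L_{11}(\mathbf{u},\mathbf{v}) = \begin{pmatrix} w_{,x}\hat w_{,x} \\ 0 \end{pmatrix}

% Check of Eq. (8.2.4), first component:
(w_{,x} + \hat w_{,x})^2 = w_{,x}^2 + \hat w_{,x}^2 + 2\,w_{,x}\hat w_{,x}

% Check of Eq. (8.2.5): L_{11}(\mathbf{u},\mathbf{u}) = (w_{,x}^2,\; 0)^T = L_2(\mathbf{u})

% Strain variation, Eq. (8.2.3):
\delta\varepsilon_x = \delta u_{,x} + w_{,x}\,\delta w_{,x}, \qquad
\delta\kappa = \delta w_{,xx}
```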
In solving nonlinear analysis problems it is customary to increase the load gradually
from zero to its final value. To accommodate this practice we assume that the load
$f$ and the initial strain $\varepsilon^i$ depend on a load amplitude parameter $\mu$, that is

$$f = f(\mu), \qquad \varepsilon^i = \varepsilon^i(\mu). \tag{8.2.6}$$

The structural response can then be obtained by solving Eqs. (8.2.1), (8.1.4) and
(8.1.6) as a function of the load parameter $\mu$.
Unfortunately, in the nonlinear regime the response is not always a single-valued
function of the load parameter $\mu$. Figure (8.2.1) shows a typical load displacement
curve for two values of the stiffness parameter $p$. At load levels near the maximum
(limit load), there are two solutions for each value of $\mu$. Structural analysis packages
that solve for nonlinear response often use more general parameters for tracing the
response curve. A typical parameter is the arc length in the $(u, \mu)$ space. We call
any parameter that is used to trace an equilibrium path (that is, a path of solutions
to Eqs. (8.2.1), (8.1.4) and (8.1.6)) a path parameter.

Figure 8.2.1 Load displacement diagram.

In considering changes in the structure we want to have the freedom to change
both the load parameter and the stiffness simultaneously. Such simultaneous changes
will be needed in the calculation of derivatives of limit loads. Figure (8.2.1) shows
one example of the curve traced by such a more general path parameter. The dashed
curve in the figure connects all the limit points for configurations with different
stiffnesses. We denote derivatives with respect to general path parameters by a dot.
Differentiating Eqs. (8.2.1), (8.1.4) and (8.1.6) with respect to such a parameter we
get
$$\dot\varepsilon = L_1(\dot u) + L_{11}(u, \dot u), \tag{8.2.7}$$
$$\dot\sigma = \dot D(\varepsilon - \varepsilon^i) + D(\dot\varepsilon - \dot\mu\,\varepsilon^{i\prime}), \tag{8.2.8}$$
$$\dot\sigma \bullet \delta\varepsilon + \sigma \bullet L_{11}(\dot u, \delta u) = \dot\mu f' \bullet \delta u, \tag{8.2.9}$$
where a prime denotes differentiation with respect to $\mu$. The second term in Eq.
(8.2.9) is due to the dependence of $\delta\varepsilon$ on $u$, Eq. (8.2.3).
Most solution algorithms used for nonlinear analysis are based on gradually
incrementing the load parameter $\mu$. Quite often the solution requires the calculation
of the sensitivity of the response with respect to $\mu$. Specializing Eqs. (8.2.7)-(8.2.9)
to this case, and denoting load sensitivities by primes, we obtain

$$\varepsilon' = L_1(u') + L_{11}(u, u'), \tag{8.2.10}$$

$$\sigma' = D(\varepsilon' - \varepsilon^{i\prime}), \tag{8.2.11}$$

$$\sigma' \bullet \delta\varepsilon + \sigma \bullet L_{11}(u', \delta u) = f' \bullet \delta u, \tag{8.2.12}$$
where $D$ is assumed to be independent of the load. We will refer to the calculations
required for solving Eqs. (8.2.10)-(8.2.12) as the load sensitivity module. These
equations for the derivatives with respect to the load parameter are quite similar to the
equations governing the sensitivity to a stiffness parameter obtained by specializing
Eqs. (8.2.7)-(8.2.9) to the case of a stiffness parameter $p$:

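$$\varepsilon_{,p} = L_1(u_{,p}) + L_{11}(u, u_{,p}), \tag{8.2.13}$$

$$\sigma_{,p} = D_{,p}(\varepsilon - \varepsilon^i) + D\varepsilon_{,p} = D(\varepsilon_{,p} - \varepsilon^p), \tag{8.2.14}$$

$$\sigma_{,p} \bullet \delta\varepsilon + \sigma \bullet L_{11}(u_{,p}, \delta u) = 0. \tag{8.2.15}$$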


Comparing the two sets of equations we note that the load term in Eq. (8.2.12) is
missing from Eq. (8.2.15), and that the constitutive relation Eq. (8.2.14) includes
a different initial strain, which is equal to $\varepsilon^p$ defined by Eq. (8.1.12). Consequently,
in terms of implementing the calculation of design sensitivity in a structural analysis
package, we use the load sensitivity module with the actual load and initial strain
replaced by the pseudo initial strain $\varepsilon^p$ with zero mechanical load.

In terms of a finite element analysis, the load sensitivity equations are governed by
the tangent stiffness matrix. So the only difference between the linear and nonlinear
sensitivity calculation is that the pseudo initial strain is applied to the "tangent"
structure instead of the original structure. Finally, let us note that both the load
sensitivity equations and the design sensitivity equations are linear, even though the
analysis problem is nonlinear. This is a general property of sensitivity analysis of
nonlinear problems.

It can be shown that the effect of nonlinearity on the adjoint method is similar
to its effect on the direct method. That is, in the case of a displacement functional
H of Eq. (8.1.19) the adjoint structure satisfies

(8.2.16)

(8.2.17)

(8.2.18)
The adjoint structure is therefore the tangent structure with $h_{,u}$ as the applied load
(see also [3]). To implement the calculation of the adjoint field in a structural analysis
package, we need only to replace the actual load by $h_{,u}$ in the load sensitivity module.
It can be shown (Exercise 5) that Eq. (8.1.28) is still applicable. Similarly, in the
case of a stress functional $G$ of Eq. (8.1.31) we apply an initial strain $g_{,\sigma}$ to the
tangent structure, and we can still obtain Eq. (8.1.42).

Example 8.2.1

The beam shown in Figure (8.2.2) has a cross-sectional area $A_0$, a moment of inertia
$I = 0.001A_0L^2$, and is subject to a constant applied temperature $T$ (measured
from the stress-free temperature) and a variable transverse load $\mu P$. The applied
temperature $T$ is selected so that the resulting axial load is close to the buckling load
limit, that is, $EA_0\varepsilon^i = EA_0\alpha T = 7.5EI/L^2$, where $\alpha$ is the coefficient of thermal
expansion, and the applied load is $P = 1.2\times10^{-4}EA_0$. We want to calculate the
derivative of the displacement under the load, $w_m$, with respect to the cross-sectional
area $A$ (assuming that $P$ and $I$ remain constant).


Figure 8.2.2 Beam subject to initial strain and normal load.

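For a beam under combined axial and bending actions the generalized strain
tensor has two components, $\varepsilon_x$ and $\kappa$, and the generalized stress tensor includes the
axial load $N$ and the bending moment $M$. The nonlinear strain-displacement relation
for the beam is given by Eq. (8.2.2), and Hooke's law is

$$N = EA(\varepsilon_x - \varepsilon^i), \qquad M = EI\kappa,$$
where $\varepsilon^i = \alpha T$. The virtual work equation is

$$\int_0^{2L}(N\,\delta\varepsilon_x + M\,\delta\kappa)\,dx = \mu P\,\delta w_m,$$
where
$$\delta\varepsilon_x = \delta u_{,x} + w_{,x}\,\delta w_{,x}, \qquad \delta\kappa = \delta w_{,xx}.$$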
First we solve the analysis problem in closed form based on a simple finite-element
model. Because of the symmetry we need analyze only the left half of the beam,
using half of the force, and symmetry conditions of $u = 0$ and $w_{,x} = 0$ in the middle.
We approximate the left half of the beam by a single beam finite element with linear
variation of $u$ and cubic variation of $w$. Using the boundary conditions and the
finite-element shape functions we have

$$u = 0, \qquad w = w_m(3\bar x^2 - 2\bar x^3), \qquad\text{where } \bar x = x/L, \quad \bar w = w_m/L.$$

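The expressions for the strains and generalized stresses are

$$\varepsilon_x = 18\bar w^2(\bar x - \bar x^2)^2, \qquad \kappa = (6\bar w/L)(1 - 2\bar x),$$

$$\delta\varepsilon_x = 36\bar w\,\delta\bar w(\bar x - \bar x^2)^2, \qquad \delta\kappa = (6\,\delta\bar w/L)(1 - 2\bar x),$$

$$N = 18EA\bar w^2(\bar x - \bar x^2)^2 - EA\varepsilon^i, \qquad M = (6EI\bar w/L)(1 - 2\bar x).$$

Integrating the virtual work equation over the element (with a load $0.5\mu P$ at the end)
we obtain
$$1.02857EA\bar w^3 + 12(EI/L^2)\bar w - 1.2EA\varepsilon^i\bar w = 0.5\mu P. \tag{a}$$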
Dividing Eq. (a) by $EA$ and using the relations between the values of $I$, $\varepsilon^i$ and $P$ for
$A = A_0$ we get
$$1.02857\bar w^3 + 0.003\bar w = 6\times10^{-5}\mu.$$
For $\mu = 1$ we get $\bar w = 0.01800$.
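Before applying the direct and adjoint methods to calculate $\bar w_{,A}$, we consider the
tangent state of derivatives with respect to $\mu$. Equations (8.2.10)-(8.2.12) become

$$\varepsilon_x' = u'_{,x} + w_{,x}w'_{,x} = 36\bar w\bar w'(\bar x - \bar x^2)^2, \qquad \kappa' = w'_{,xx} = 6\bar w'(1 - 2\bar x)/L,$$

$$N' = EA\varepsilon_x', \qquad M' = EI\kappa',$$

$$\int_0^L (N'\delta\varepsilon_x + M'\delta\kappa + Nw'_{,x}\delta w_{,x})\,dx = 0.5P\,\delta w_m.$$
This last equation can be integrated to yield

$$3.08571EA\bar w^2\bar w' + 12(EI/L^2)\bar w' - 1.2EA\varepsilon^i\bar w' = 0.5P. \tag{b}$$
This equation can be verified by differentiating Eq. (a) with respect to $\mu$.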
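
The relation between the nonlinear equilibrium equation and its linear tangent equation can be checked numerically. The sketch below is an illustration only: it uses the nondimensional form of Eq. (a) quoted above (coefficient 1.02857 = 648/630, linear coefficient 0.003, load 6e-5 per unit load parameter), solves it by Newton's method, and compares the load sensitivity from the tangent equation with a central finite difference. The function names and the Newton settings are choices made here, not part of the example.

```python
# Nondimensional data of Example 8.2.1 (Eq. (a) divided by E*A, with A = A0).
C3 = 648.0 / 630.0               # cubic coefficient, 1.02857
C1 = 12 * 0.001 - 1.2 * 0.0075   # 12 I/(A L^2) - 1.2 eps_i = 0.003
LOAD = 0.5 * 1.2e-4              # 0.5 P/(E A0) = 6e-5 per unit load parameter

def residual(w, mu):
    """Residual of the equilibrium equation, Eq. (a)."""
    return C3 * w**3 + C1 * w - LOAD * mu

def tangent(w):
    """Tangent stiffness: derivative of the residual with respect to w."""
    return 3 * C3 * w**2 + C1

def solve_w(mu, w=0.0, tol=1e-14, max_iter=50):
    """Newton iteration for the nondimensional midspan displacement."""
    for _ in range(max_iter):
        step = residual(w, mu) / tangent(w)
        w -= step
        if abs(step) < tol:
            break
    return w

w_bar = solve_w(1.0)

# Load sensitivity from the tangent equation, Eq. (b): a single linear solve.
w_prime = LOAD / tangent(w_bar)

# Central finite difference on the nonlinear solution, for comparison.
h = 1e-6
w_prime_fd = (solve_w(1.0 + h) - solve_w(1.0 - h)) / (2 * h)

print(w_bar, w_prime, w_prime_fd)
```

The nonlinear solve needs several iterations, while the sensitivity needs only one division by the tangent stiffness already available at the converged state.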
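Direct method: With $p = A$, Equations (8.2.13)-(8.2.15) become (remember that $I$ is constant)

$$\varepsilon_{x,p} = u_{,xp} + w_{,x}w_{,xp}, \qquad \kappa_{,p} = w_{,xxp},$$

$$N_{,p} = EA[\varepsilon_{x,p} + (\varepsilon_x - \varepsilon^i)/A], \qquad M_{,p} = EI\kappa_{,p},$$

$$\int_0^L (N_{,p}\delta\varepsilon_x + M_{,p}\delta\kappa + Nw_{,xp}\delta w_{,x})\,dx = 0.$$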

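We note that the sensitivity equations are identical to the tangent state equations
except that instead of the load $P$ we have the pseudo initial strain $\varepsilon^p = -(\varepsilon_x - \varepsilon^i)/A$.
Using Eq. (8.1.18), we find that the initial strain gives rise to a pseudo load defined
by

$$P^p\,\delta w_m = -\int_0^L E(\varepsilon_x - \varepsilon^i)\,\delta\varepsilon_x\,dx = -E\int_0^L[18\bar w^2(\bar x - \bar x^2)^2 - \varepsilon^i]\,36\bar w\,\delta\bar w(\bar x - \bar x^2)^2\,dx$$

$$= -1.02857E\bar w^3\,\delta w_m + 1.2E\varepsilon^i\bar w\,\delta w_m. \tag{c}$$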
The design sensitivity equation is obtained from the load sensitivity equation, Eq.
(b), by replacing the actual load ($0.5P$) with the pseudo load $P^p$ and replacing $\bar w'$
with $\bar w_{,A}$, so that the equation for $\bar w_{,A}$ is

$$3.08571EA\bar w^2\bar w_{,A} + 12(EI/L^2)\bar w_{,A} - 1.2EA\varepsilon^i\bar w_{,A} = -1.02857E\bar w^3 + 1.2E\varepsilon^i\bar w. \tag{d}$$

This result can be verified by differentiating Eq. (a) with respect to $A$. Solving for
$\bar w_{,A}$ we obtain
$$\bar w_{,A} = \frac{-1.02857\bar w^3 + 0.096\bar w}{(3.08571\bar w^2 + 0.024)A} = \frac{0.6888}{A}.$$
Adjoint method: To use the adjoint method we define $\bar w$ as

$$\bar w = H = \int_0^{2L}(w/L)\,\delta(x - L)\,dx,$$

where $\delta(x - L)$ is the Dirac delta function. For this case $h_{,u}$ is a vertical
concentrated load of magnitude $1/L$ in the middle of the beam. Since the adjoint field is
obtained from the load sensitivity module by replacing the actual load with $h_{,u}$, the
equation for the adjoint state $\bar w^a$ can be obtained by replacing $\bar w'$ by $\bar w^a$ and $0.5P$ by
$1/L$ in Eq. (b):
$$3.08571EA\bar w^2\bar w^a + 12(EI/L^2)\bar w^a - 1.2EA\varepsilon^i\bar w^a = 1/L.$$

Then we obtain W,A from Eq.(8.1.30) as

W,A = H,A = P Pw(I/2) = LpPw a

which is identical to the result obtained from the direct method .•••

8.2.2 Limit Loads

Next we consider the calculation of a limit load; here the load sensitivity equations,
Eqs. (8.2.10)-(8.2.12), become singular. To circumvent the problem associated with
this singularity it is customary to define the response path in terms of a parameter
other than the load (e.g., a displacement component or an arc length parameter).
We specialize Eqs. (8.2.7)-(8.2.9) to that case, where the parameter controls the
response and $\mu$, but not the stiffness (that is, $\dot D = 0$). At the limit point $\dot\mu = 0$,
and we denote the derivative of the response with respect to the path parameter by
a subscript 1. That is, Eqs. (8.2.7)-(8.2.9) become

$$\varepsilon_1 = L_1(u_1) + L_{11}(u^*, u_1), \tag{8.2.19}$$

$$\sigma_1 = D\varepsilon_1, \tag{8.2.20}$$

$$\sigma_1 \bullet \delta\varepsilon + \sigma^* \bullet L_{11}(u_1, \delta u) = 0, \tag{8.2.21}$$
where an asterisk denotes the response at the limit point. Note that Eqs. (8.2.19)-
(8.2.21) are similar to the homogeneous part of the equations for the load sensitivities,
Eqs. (8.2.10)-(8.2.12). The fact that the homogeneous equations have a nontrivial
solution indicates that the load sensitivity equations are singular (as expected at
a limit point). The singularity can occur not only at a limit point, but also at a
bifurcation point; here the solution of Eqs. (8.2.10)-(8.2.12) is not unique. At a
bifurcation point we have Eqs. (8.2.19)-(8.2.21) even though $\dot\mu \neq 0$. Whether at a limit
load or at bifurcation buckling, we call $u_1$ the buckling mode.
To calculate the derivative of the limit load $\mu^*$ with respect to a stiffness
parameter $p$ we need to specialize Eqs. (8.2.7)-(8.2.9) so that we can change the stiffness
and the load simultaneously, but have the load remain at its limit value as we change
the stiffness. This path is denoted by $\nu$ and shown as a dashed line in Fig. (8.2.1).
Along that path we have
$$p = \nu, \qquad \mu = \mu^*(p), \qquad u = u^*(p). \tag{8.2.22}$$
To denote the simultaneous change of $p$ and the load we use both the $p$ subscript for
derivative and the asterisk for critical load, so that Eqs. (8.2.7)-(8.2.9) become
$$\varepsilon^*_{,p} = L_1(u^*_{,p}) + L_{11}(u^*, u^*_{,p}), \tag{8.2.23}$$

$$\sigma^*_{,p} = D_{,p}(\varepsilon^* - \varepsilon^i) + D(\varepsilon^*_{,p} - \mu^*_{,p}\varepsilon^{i\prime}), \tag{8.2.24}$$

$$\sigma^*_{,p} \bullet \delta\varepsilon + \sigma^* \bullet L_{11}(u^*_{,p}, \delta u) = \mu^*_{,p}f' \bullet \delta u. \tag{8.2.25}$$
We can now get an expression for the derivative of the limit load by substituting
$\delta u = u_1$ into Eq. (8.2.25):

$$\mu^*_{,p} = \frac{\sigma^*_{,p} \bullet \varepsilon_1 + \sigma^* \bullet L_{11}(u^*_{,p}, u_1)}{f' \bullet u_1}. \tag{8.2.26}$$

This equation requires the derivatives of the prebuckling response. We can
eliminate these derivatives without using an adjoint field by noting the similarity of the
numerator to what we get by substituting $\delta u = u^*_{,p}$ into Eq. (8.2.21):
$$\sigma_1 \bullet \varepsilon^*_{,p} + \sigma^* \bullet L_{11}(u_1, u^*_{,p}) = 0. \tag{8.2.27}$$
To make Eq. (8.2.27) more similar to the numerator of Eq. (8.2.26) we use Eqs.
(8.2.21) and (8.2.24) to rewrite Eq. (8.2.27) as
$$\sigma^*_{,p} \bullet \varepsilon_1 - D_{,p}(\varepsilon^* - \varepsilon^i) \bullet \varepsilon_1 + \mu^*_{,p}D\varepsilon^{i\prime} \bullet \varepsilon_1 + \sigma^* \bullet L_{11}(u_1, u^*_{,p}) = 0. \tag{8.2.28}$$
Finally, combining Eqs. (8.2.26) and (8.2.28) we get a form of the derivative of the
limit load with respect to a stiffness parameter
$$\mu^*_{,p} = \frac{D_{,p}(\varepsilon^* - \varepsilon^i) \bullet \varepsilon_1}{f' \bullet u_1 + D\varepsilon^{i\prime} \bullet \varepsilon_1} \tag{8.2.29}$$

that does not require derivatives of the prebuckling response. This expression can be
simplified further for the case of finite-element calculation. Using Eqs. (8.1.12) and
(8.1.18) we get
$$\mu^*_{,p} = \frac{-f^{p*} \bullet u_1}{(f' + f^{i\prime}) \bullet u_1}, \tag{8.2.30}$$

where $f^{p*}$ is the pseudo load of Eq. (8.1.18) evaluated at the limit point, and $f^{i\prime}$ is
the equivalent nodal load due to the initial strain $\varepsilon^{i\prime}$.
The above calculation appears to be applicable also to bifurcation buckling.
However, for bifurcation buckling $\dot\mu$ in Eq. (8.2.9) is not zero. The consistency condition
for this equation is for the right-hand side to be orthogonal to the nonzero solution of
the homogeneous problem, $u_1$. That is, for the bifurcation problem $(f' + f^{i\prime}) \bullet u_1 = 0$,
and we cannot use Eqs. (8.2.29) and (8.2.30). The sensitivity of bifurcation buckling
loads is discussed in the next section.


Example 8.2.2

The two-bar truss shown in Figure 8.2.3 is subject to a constant load $P$ and a variable
negative applied temperature $-\mu T$. As the truss is cooled the displacement $h$ under
the load will increase until a limit point is reached and the truss collapses. We want to
calculate the derivative of the limit load factor $\mu^*$ with respect to the cross-sectional
area $A$ for $A = A_0$. The other parameters of the problem are $P/EA_0 = 0.001$,
$\alpha T = 0.01$, and $\theta = 10°$, where $E$ is Young's modulus and $\alpha$ is the coefficient of
thermal expansion.

Figure 8.2.3 Two-bar truss under combined mechanical and thermal loading

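Because of symmetry we need analyze only one half of the truss, applying to
it one half of the mechanical load. We select a coordinate $x$ that runs along the
truss member. The strain-displacement relation, Hooke's law, and the virtual work
equation are given as

$$\varepsilon = u_{,x} + \tfrac{1}{2}(u_{,x}^2 + v_{,x}^2), \qquad N = EA(\varepsilon + \mu\alpha T), \qquad \int_0^L N\,\delta\varepsilon\,dx = 0.5P\,\delta h,$$

where
$$\delta\varepsilon = \delta u_{,x} + u_{,x}\delta u_{,x} + v_{,x}\delta v_{,x}.$$
Since we are dealing with a truss, we can assume linear variation of $u$ and $v$ as a
function of $x$ and get

$$u_{,x} = -\bar h\sin\theta, \qquad v_{,x} = -\bar h\cos\theta, \qquad\text{where } \bar h = h/L,$$

so that
$$\varepsilon = -\bar h\sin\theta + 0.5\bar h^2, \qquad \delta\varepsilon = -\delta\bar h\sin\theta + \bar h\,\delta\bar h,$$
and
$$N = EA(-\bar h\sin\theta + 0.5\bar h^2 + \mu\alpha T).$$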
Substituting into the virtual work equation we obtain

$$EA(-\bar h\sin\theta + 0.5\bar h^2 + \mu\alpha T)(-\sin\theta + \bar h)\,\delta\bar h = 0.5P\,\delta\bar h.$$

Dividing by $0.5EA$, rearranging the equation, and using the data for the problem
gives us

$$\bar h^3 - 0.5209\bar h^2 + 0.02(3.015 + \mu)\bar h = 0.001A_0/A + 0.003473\mu. \tag{a}$$

Equation (a) can be used to trace the response of the truss as the thermal load is
increased. For a given load parameter $\mu$ this requires the solution of a cubic equation.
However, it is possible instead to gradually increase $\bar h$ and calculate the resultant $\mu$.
Tracing the curve we find that the limit load factor is $\mu^* = 0.56274$, corresponding to
a displacement $\bar h = 0.09424$. Since this problem has only one degree of freedom, the
buckling mode has only one component, $\bar h$, and we can take it to have a unit value.
To calculate the sensitivity of $\mu^*$ using Eq. (8.2.30) we also need $\varepsilon - \varepsilon^i$ at the
limit point. Using the expression for the strain in terms of $\bar h$ we get

$$\varepsilon^* - \varepsilon^{i*} = -\bar h^*\sin\theta + 0.5(\bar h^*)^2 + \mu^*\alpha T = -0.006297.$$


The pseudo initial strain for a truss element is $-(\varepsilon - \varepsilon^i)/A$ (see Eq. (8.1.14)), so that
the magnitude of the pseudo load is

$$f^{p*} = -E(\varepsilon^* - \varepsilon^{i*}) = 0.006297E.$$

The pseudo load consists of two forces collinear with the truss element and acting at
its two ends. We also need to calculate the pseudo load associated with $\varepsilon^{i\prime} = -\alpha T$.
The magnitude of this force is

$$f^{i\prime} = -EA\alpha T = -0.01EA.$$

This force is also collinear with the element. We now use Eq. (8.2.30), noting that
for our case $f' = 0$:

$$\mu^*_{,A} = \frac{-0.006297E\cos(90° + \theta)}{-0.01EA\cos(90° + \theta)} = \frac{0.6297}{A}.$$
We check this result by finite differences. We increase the area by one percent, and
substitute $A = 1.01A_0$ into Eq. (a). Solving again we get $\mu^* = 0.56899$, so that the
finite-difference approximation to the derivative is

$$\mu^*_{,A} \approx \frac{0.56899 - 0.56274}{0.01A_0} = \frac{0.625}{A_0},$$
which is in reasonable agreement with the analytical derivative. •••
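
The steps of this example can be sketched in a few lines of code. The block below is an illustration under stated assumptions, not the authors' procedure: it solves the one-degree-of-freedom equilibrium equation for the load factor at a given displacement, locates the limit point with a golden-section search (a search method chosen here for simplicity), evaluates the one-component form of Eq. (8.2.30), and repeats the one-percent finite-difference check. All function and variable names are hypothetical.

```python
import math

# Data of Example 8.2.2 (all quantities nondimensional).
SIN_THETA = math.sin(math.radians(10.0))
ALPHA_T = 0.01       # alpha * T
P_OVER_EA0 = 0.001   # P / (E * A0)

def mech_strain(h):
    """Strain from displacement alone: -h sin(theta) + h^2 / 2, with h = h/L."""
    return -h * SIN_THETA + 0.5 * h * h

def mu_of_h(h, area_ratio=1.0):
    """Load factor satisfying equilibrium at displacement h, for A = area_ratio * A0.

    Equilibrium: (mech_strain + mu*alpha*T) * (h - sin(theta)) = 0.5 * P / (E*A).
    """
    total = 0.5 * P_OVER_EA0 / (area_ratio * (h - SIN_THETA))
    return (total - mech_strain(h)) / ALPHA_T

def golden_max(func, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) > func(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

def limit_point(area_ratio=1.0):
    """Return (h*, mu*) at the limit point for the given area ratio A/A0."""
    h_star = golden_max(lambda h: mu_of_h(h, area_ratio), 0.0, 0.17)
    return h_star, mu_of_h(h_star, area_ratio)

h_star, mu_star = limit_point()

# One-component form of Eq. (8.2.30) with f' = 0; the direction cosine of the
# member cancels between numerator and denominator.
pseudo_load = -(mech_strain(h_star) + mu_star * ALPHA_T)   # f^p* / E
thermal_load_rate = -ALPHA_T                               # f^i' / (E A)
dmu_dA_analytic = -pseudo_load / thermal_load_rate         # times 1/A0

# Finite-difference check with a one-percent increase in area.
_, mu_perturbed = limit_point(1.01)
dmu_dA_fd = (mu_perturbed - mu_star) / 0.01

print(h_star, mu_star, dmu_dA_analytic, dmu_dA_fd)
```

The analytical value needs only quantities available at the nominal limit point, whereas the finite-difference value requires locating the limit point a second time for the perturbed area.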


8.2.3 Implementation Notes

A structural analysis package for nonlinear analysis will typically have facilities for
generating the derivatives of the applied loads with respect to the loading parameter
$\mu$, and for solving the tangent equations of equilibrium at any value of that load.
For the sensitivity of static response using the direct method only the second is
needed. The procedure is identical to that used in the linear case (see Section 8.1.3).
The actual load is replaced by the initial strains associated with the stiffness change
(Eq. (8.1.12)), and the tangent equations of equilibrium are solved by the structural
analysis package. The output of the package will then be the sensitivity to the stiffness
variable.
The adjoint method is similar to that used in the linear case. The same adjoint
load is used, but it is applied to the tangent system. Equations (8.1.28) and (8.1.42)
are still applicable. However, for nonlinear analysis there is even less of a reason to
use the adjoint method than in the linear case. In nonlinear analysis the cost of the
analysis is much larger than the cost of sensitivity calculations (which are always
linear). Therefore, even when the number of response functionals to be differentiated
is much smaller than the number of design variables, the direct method is still a
reasonable choice.
For sensitivity of limit loads, Eq. (8.2.30) is easy to implement. It requires
calculation of the pseudo load associated with the stiffness change, and the
computation of two scalar products: of the pseudo load and of the actual load (including both
mechanical and initial strain components) with the buckling mode.
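
The two scalar products just mentioned can be written as a short postprocessing routine. This is a minimal sketch assuming that the pseudo load, the load derivatives and the buckling mode are available as nodal vectors of equal length; the function name and argument names are hypothetical, and the numbers in the usage lines are the one-degree-of-freedom values of Example 8.2.2 with $E = A = 1$.

```python
def limit_load_sensitivity(pseudo_load, load_rate, initial_strain_load_rate, mode):
    """Evaluate Eq. (8.2.30): -(f_p . u1) / ((f' + f_i') . u1).

    All arguments are sequences of nodal values of equal length.
    """
    numerator = sum(fp * u for fp, u in zip(pseudo_load, mode))
    denominator = sum((f + fi) * u
                      for f, fi, u in zip(load_rate, initial_strain_load_rate, mode))
    if denominator == 0.0:
        # A vanishing denominator is the bifurcation case, where the formula
        # does not apply.
        raise ZeroDivisionError("load derivative is orthogonal to the buckling mode")
    return -numerator / denominator

# One-component data of Example 8.2.2 (E = A = 1, unit buckling mode).
dmu_dA = limit_load_sensitivity([0.006297], [0.0], [-0.01], [1.0])
print(dmu_dA)
```

The explicit check on the denominator reflects the remark in Section 8.2.2 that the expression cannot be used at a bifurcation point.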

8.3 Vibration and Buckling

We first consider small free harmonic vibrations with frequency $\omega$ superimposed on
the nonlinear equilibrium state $(u(\mu), \varepsilon(\mu), \sigma(\mu))$ associated with load parameter
$\mu$. We denote the vibration amplitude fields by $u_1$, $\varepsilon_1$ and $\sigma_1$. These vibration
amplitude fields can be viewed as small perturbations off the nonlinear equilibrium
state. Therefore, the equations satisfied by these fields are obtained by adding a
small perturbation to the nonlinear field equations, Eqs. (8.2.1), (8.1.4) and (8.1.6),
and replacing the body force $f$ with a D'Alembert inertia force. Assuming that there
is no initial strain we obtain
$$\varepsilon_1 = L_1(u_1) + L_{11}(u, u_1), \tag{8.3.1}$$
$$\sigma_1 = D\varepsilon_1, \tag{8.3.2}$$
$$\sigma_1 \bullet \delta\varepsilon + \sigma \bullet L_{11}(u_1, \delta u) = \omega^2 Mu_1 \bullet \delta u, \tag{8.3.3}$$
where $M$ denotes the mass tensor and $\delta\varepsilon$ is given by Eq. (8.2.3). Note that these
equations are identical to the load sensitivity equations, Eqs. (8.2.10)-(8.2.12), except
that $f'$ is replaced by the inertia load. Setting $\delta u = u_1$ in Eq. (8.3.3) we obtain the
Rayleigh quotient for the vibration frequency

$$\omega^2 = \frac{\sigma_1 \bullet \varepsilon_1 + \sigma \bullet L_2(u_1)}{Mu_1 \bullet u_1}. \tag{8.3.4}$$

Under static loading the structure buckles at a load $\mu^*$ corresponding to a
prebuckling state $u^* = u(\mu^*)$, $\varepsilon^* = \varepsilon(\mu^*)$, $\sigma^* = \sigma(\mu^*)$. The buckling load corresponds
to a zero vibration frequency. Therefore the buckling mode $u_1$, $\varepsilon_1$, $\sigma_1$ satisfies Eqs.
(8.3.1), (8.3.2), and (8.3.3) with $\omega = 0$ and $u = u^*$, $\sigma = \sigma^*$.

8.3.1 The Direct Method

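To calculate the derivative of the frequency with respect to a stiffness parameter $p$
we start by differentiating Eqs. (8.3.1), (8.3.2), and (8.3.3) with respect to $p$, then
set $\delta u$ equal to the mode shape $u_1$, and use Eq. (8.2.5) to obtain

$$\varepsilon_{1,p} = L_1(u_{1,p}) + L_{11}(u_{,p}, u_1) + L_{11}(u, u_{1,p}), \tag{8.3.5}$$

$$\sigma_{1,p} = D_{,p}\varepsilon_1 + D\varepsilon_{1,p}, \tag{8.3.6}$$

$$\sigma_{1,p} \bullet \varepsilon_1 + \sigma_1 \bullet L_{11}(u_{,p}, u_1) + \sigma_{,p} \bullet L_2(u_1) + \sigma \bullet L_{11}(u_{1,p}, u_1)$$
$$= (\omega^2)_{,p}Mu_1 \bullet u_1 + \omega^2M_{,p}u_1 \bullet u_1 + \omega^2Mu_{1,p} \bullet u_1. \tag{8.3.7}$$
The derivatives of the vibration mode, $u_{1,p}$ and $\sigma_{1,p}$, can be eliminated from Eq. (8.3.7)
by first setting $\delta u = u_{1,p}$ in Eq. (8.3.3), and using Eq. (8.2.3); this gives

$$\sigma_1 \bullet [L_1(u_{1,p}) + L_{11}(u, u_{1,p})] + \sigma \bullet L_{11}(u_1, u_{1,p}) = \omega^2Mu_1 \bullet u_{1,p}. \tag{8.3.8}$$

Then subtracting Eq. (8.3.8) from Eq. (8.3.7) and using Eqs. (8.3.5) and (8.3.6) we
can get (Exercise 7)

$$(\omega^2)_{,p} = \frac{D_{,p}\varepsilon_1 \bullet \varepsilon_1 + 2\sigma_1 \bullet L_{11}(u_{,p}, u_1) + \sigma_{,p} \bullet L_2(u_1) - \omega^2M_{,p}u_1 \bullet u_1}{Mu_1 \bullet u_1}. \tag{8.3.9}$$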

The first and last terms in the numerator of Eq. (8.3.9) correspond to the derivatives
of the stiffness matrix and mass matrix, respectively, in Eq. (7.3.5). When we
calculate derivatives of natural frequencies the other terms in the numerator vanish.
However, for the vibration frequencies of a loaded structure we need the other terms,
which contain derivatives of the static field $u$, $\sigma$ with respect to $p$. These derivatives
need to be calculated by solving Eqs. (8.2.13)-(8.2.15).
The derivative of the buckling load is obtained from the condition that $\omega^2 = 0$
at buckling. As $p$ changes, $\mu^*$ must change with it so that $\omega^2$ remains zero, that is
$d(\omega^2) = 0$. Thus
$$(\omega^2)_{,p} + (\omega^2)'\mu^*_{,p} = 0, \tag{8.3.10}$$
where a prime denotes derivative with respect to $\mu$. The first term in Eq. (8.3.10) is
the change in $\omega^2$ at a fixed load level, and the second is the change in $\omega^2$ due to a
change in load level. These two changes add up to zero, so that the frequency remains
zero at the buckling load. Equation (8.3.10) gives

$$\mu^*_{,p} = -\frac{(\omega^2)_{,p}}{(\omega^2)'}. \tag{8.3.11}$$

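To calculate the derivative of the frequency with respect to the load parameter $\mu$ we
start by differentiating Eqs. (8.3.1)-(8.3.3) with respect to $\mu$ and then set $\delta u = u_1$:
$$\varepsilon_1' = L_1(u_1') + L_{11}(u', u_1) + L_{11}(u, u_1'), \tag{8.3.12}$$

$$\sigma_1' = D\varepsilon_1', \tag{8.3.13}$$
$$\sigma_1' \bullet \varepsilon_1 + \sigma_1 \bullet L_{11}(u', u_1) + \sigma' \bullet L_2(u_1) + \sigma \bullet L_{11}(u_1', u_1) = (\omega^2)'Mu_1 \bullet u_1 + \omega^2Mu_1' \bullet u_1. \tag{8.3.14}$$
Next, we eliminate the derivatives of the vibration field with respect to $\mu$ by setting
$\delta u = u_1'$ in Eq. (8.3.3) and using Eq. (8.2.3),

$$\sigma_1 \bullet [L_1(u_1') + L_{11}(u, u_1')] + \sigma \bullet L_{11}(u_1, u_1') = \omega^2Mu_1 \bullet u_1', \tag{8.3.15}$$
and then subtracting Eq. (8.3.15) from Eq. (8.3.14) and using Eqs. (8.3.2), (8.3.12),
and (8.3.13) to get

$$(\omega^2)' = \frac{2\sigma_1 \bullet L_{11}(u', u_1) + \sigma' \bullet L_2(u_1)}{Mu_1 \bullet u_1}. \tag{8.3.16}$$

Finally, substituting Eqs. (8.3.9) and (8.3.16) evaluated at the buckling load into Eq.
(8.3.11) gives

$$\mu^*_{,p} = -\frac{D_{,p}\varepsilon_1 \bullet \varepsilon_1 + 2\sigma_1 \bullet L_{11}(u^*_{,p}, u_1) + \sigma^*_{,p} \bullet L_2(u_1)}{2\sigma_1 \bullet L_{11}(u'^*, u_1) + \sigma'^* \bullet L_2(u_1)}, \tag{8.3.17}$$

where the asterisk denotes prebuckling quantities evaluated at the buckling load.
Note that the field $u_1$, $\sigma_1$ now denotes the zero-frequency or buckling mode.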

Example 8.3.1

The beam in Example (8.2.1) has a mass density $\rho$. Calculate the derivative of the
lowest frequency of lateral vibration with respect to the cross-sectional area $A$ with
the applied load parameter $\mu = 1$ (assuming again that $I$ and $P$ do not change).
We use the same single finite-element approximation for half the beam that we
used in Example (8.2.1). Assuming a symmetric mode shape, we find the vibration
mode
$$u_1 = 0, \qquad w_1 = L(3\bar x^2 - 2\bar x^3).$$
To calculate the vibration frequency we use the Rayleigh quotient, Eq. (8.3.4). The
first term in the numerator is

$$\sigma_1 \bullet \varepsilon_1 = \int_0^L (N_1\varepsilon_{x1} + M_1\kappa_1)\,dx.$$

Using Eqs. (8.3.1) and (8.3.2) and expressions from Example (8.2.1) we have

$$\varepsilon_{x1} = w_{,x}w_{1,x} = 36\bar w(\bar x - \bar x^2)^2, \qquad \kappa_1 = w_{1,xx} = 6(1 - 2\bar x)/L,$$
so
$$\sigma_1 \bullet \varepsilon_1 = \int_0^L [1296EA\bar w^2(\bar x - \bar x^2)^4 + 36EI(1 - 2\bar x)^2/L^2]\,dx = 2.05714EA\bar w^2L + 12EI/L.$$
The other terms in the Rayleigh quotient are

$$\sigma \bullet L_2(u_1) = \int_0^L Nw_{1,x}^2\,dx = 1.02857EA\bar w^2L - 1.2EAL\varepsilon^i,$$

$$Mu_1 \bullet u_1 = \int_0^L A\rho w_1^2\,dx = 0.3714\rho AL^3,$$

so that
$$\omega^2 = \frac{3.08571EA\bar w^2 + 12EI/L^2 - 1.2EA\varepsilon^i}{0.3714\rho AL^2} = 0.01077\frac{E}{\rho L^2}.$$
Note that for the unloaded beam, $\bar w = 0$, $\varepsilon^i = 0$, we get

$$\omega = 5.68\sqrt{\frac{EI}{\rho AL^4}},$$
which is about 1.5% above the exact answer. We can differentiate $\omega^2$ with respect to
$A$, for an analytical derivative that we can use later for comparison:

$$(\omega^2)_{,A} = \frac{3.08571E(\bar w^2 + 2A\bar w\bar w_{,A}) - 1.2E\varepsilon^i}{0.3714\rho AL^2} - \frac{\omega^2}{A}.$$

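For the direct method to calculate the same derivative we use Eq. (8.3.9). The
individual terms in this equation are calculated as follows:

$$D_{,p}\varepsilon_1 \bullet \varepsilon_1 = \int_0^L E\varepsilon_{x1}^2\,dx = 2.05714E\bar w^2L,$$

$$2\sigma_1 \bullet L_{11}(u_{,p}, u_1) = 2\int_0^L N_1w_{,xA}w_{1,x}\,dx = 4.11428EA\bar w\bar w_{,A}L,$$

$$\sigma_{,p} \bullet L_2(u_1) = \int_0^L N_{,A}w_{1,x}^2\,dx,$$
where
$$N_{,A} = 36EA\bar w\bar w_{,A}(\bar x - \bar x^2)^2 + E[18\bar w^2(\bar x - \bar x^2)^2 - \varepsilon^i].$$
So
$$\sigma_{,p} \bullet L_2(u_1) = 2.05714EA\bar w\bar w_{,A}L + 1.02857E\bar w^2L - 1.2E\varepsilon^iL,$$
and
$$M_{,p}u_1 \bullet u_1 = 0.3714\rho L^3.$$

Altogether

$$(\omega^2)_{,A} = \frac{3.08571E\bar w^2 + 6.17142EA\bar w\bar w_{,A} - 1.2E\varepsilon^i}{0.3714\rho AL^2} - \frac{\omega^2}{A},$$
which agrees with the analytical result. Using the values for $\bar w$ and $\bar w_{,A}$ from Example
(8.2.1) we get
$$(\omega^2)_{,A} = 0.1788\frac{E}{\rho AL^2}.$$
•••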
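
The two frequency values quoted near the start of this example can be reproduced from the Rayleigh quotient with a few lines of arithmetic. The sketch below is an illustration only. It assumes the nondimensional data of Example 8.2.1 and the displacement $\bar w = 0.018$ found there, uses the exact integrals 648/630 and 13/35 behind the rounded coefficients 1.02857 and 0.3714, and, for the comparison with the exact answer, assumes a clamped-clamped beam of length $2L$ (the end conditions implied by the cubic shape function), whose first frequency parameter is approximately 22.3733.

```python
import math

C3 = 648.0 / 630.0      # 1.02857, from 18 * 36 * integral of (x - x^2)^4
MASS = 13.0 / 35.0      # 0.3714, integral of (3x^2 - 2x^3)^2 over [0, 1]
I_OVER_AL2 = 0.001      # I / (A0 L^2)
EPS_I = 0.0075          # initial strain, 7.5 I / (A0 L^2)
W_BAR = 0.018           # static displacement from Example 8.2.1 (mu = 1)

# Rayleigh quotient of the loaded beam, in units of E / (rho L^2).
omega2_loaded = (3 * C3 * W_BAR**2 + 12 * I_OVER_AL2 - 1.2 * EPS_I) / MASS

# Unloaded beam: omega = coef * sqrt(E I / (rho A L^4)).
coef_unloaded = math.sqrt(12.0 / MASS)

# Assumed reference: clamped-clamped beam of length 2L, first mode.
coef_exact = 22.3733 / 4.0
relative_error = coef_unloaded / coef_exact - 1.0

print(omega2_loaded, coef_unloaded, relative_error)
```

A single cubic element per half beam thus overestimates the unloaded frequency by between one and two percent, in line with the figure quoted in the example.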
8.3.2 The Adjoint Method

The direct sensitivity approach requires the calculation of sensitivities of the static
field (prebuckling state), Eqs. (8.2.13)-(8.2.15). This calculation can become
expensive when we need sensitivities with respect to a large number of structural
parameters. In that case an adjoint technique that eliminates the need for static sensitivities
is appropriate. As usual, we multiply the equations that govern static equilibrium by
Lagrange multipliers (that we call the adjoint fields) and add them to $\omega^2$; thus

$$(\omega^2)^* = m_0\omega^2 + \sigma^a \bullet [\varepsilon - L_1(u) - \tfrac{1}{2}L_2(u)] + \varepsilon^a \bullet [\sigma - D(\varepsilon - \varepsilon^i)] + f \bullet u^a - \sigma \bullet [L_1(u^a) + L_{11}(u, u^a)], \tag{8.3.18}$$

where $m_0$ is the value of $Mu_1 \bullet u_1$ for the nominal value of $p$ (that is, $m_0$ does not
change with $p$). The constant $m_0$ is included to simplify the final expressions for the
adjoint field. We differentiate Eq. (8.3.18) and use Eq. (8.3.9) to get

$$m_0(\omega^2)_{,p} = (\omega^2)^*_{,p} = D_{,p}\varepsilon_1 \bullet \varepsilon_1 + 2\sigma_1 \bullet L_{11}(u_{,p}, u_1) + \sigma_{,p} \bullet L_2(u_1) - \omega^2M_{,p}u_1 \bullet u_1$$
$$+ \sigma^a \bullet [\varepsilon_{,p} - L_1(u_{,p}) - L_{11}(u, u_{,p})] + \varepsilon^a \bullet [\sigma_{,p} - D_{,p}(\varepsilon - \varepsilon^i) - D\varepsilon_{,p}]$$
$$- \sigma_{,p} \bullet [L_1(u^a) + L_{11}(u, u^a)] - \sigma \bullet L_{11}(u_{,p}, u^a). \tag{8.3.19}$$
Grouping together terms that involve displacement derivatives, strain derivatives and
stress derivatives we get

$$m_0(\omega^2)_{,p} = D_{,p}\varepsilon_1 \bullet \varepsilon_1 - \omega^2M_{,p}u_1 \bullet u_1 - \varepsilon^a \bullet D_{,p}(\varepsilon - \varepsilon^i)$$
$$- \sigma^a \bullet [L_1(u_{,p}) + L_{11}(u, u_{,p})] - \sigma \bullet L_{11}(u_{,p}, u^a) + 2\sigma_1 \bullet L_{11}(u_{,p}, u_1)$$
$$+ \varepsilon_{,p} \bullet [\sigma^a - D\varepsilon^a] + \sigma_{,p} \bullet [\varepsilon^a - L_1(u^a) - L_{11}(u, u^a) + L_2(u_1)]. \tag{8.3.20}$$
From Eq. (8.3.20) it is clear that in order to eliminate the derivatives of the static
equilibrium state the adjoint state should satisfy the following equations:
$$\varepsilon^a = L_1(u^a) + L_{11}(u, u^a) - L_2(u_1), \tag{8.3.21}$$
$$\sigma^a = D\varepsilon^a, \tag{8.3.22}$$

$$\sigma^a \bullet [L_1(\delta u) + L_{11}(u, \delta u)] + \sigma \bullet L_{11}(u^a, \delta u) - 2\sigma_1 \bullet L_{11}(u_1, \delta u) = 0. \tag{8.3.23}$$
Then the derivative of the frequency is given as

$$(\omega^2)_{,p} = \frac{D_{,p}\varepsilon_1 \bullet \varepsilon_1 - \omega^2M_{,p}u_1 \bullet u_1 - \varepsilon^a \bullet D_{,p}(\varepsilon - \varepsilon^i)}{Mu_1 \bullet u_1}. \tag{8.3.24}$$

The adjoint equations, which have a homogeneous part identical to that of Eqs.
(8.3.1)-(8.3.3) for $\omega = 0$, may be considered to be the field equations of an adjoint
structure for which the term $L_2(u_1)$ in Eq. (8.3.21) is an initial strain term, and the
last term in Eq. (8.3.23) corresponds to body-force loading. In a buckling problem
($\omega = 0$) the homogeneous part is singular, the adjoint fields are not unique, and
any multiple of the buckling mode $u_1$ can be added. Any convenient orthogonality
relation can be used to make the adjoint fields unique.
The derivative of the buckling eigenvalue is similarly given as

$$\mu^*_{,p} = -\frac{D_{,p}\varepsilon_1 \bullet \varepsilon_1 - D_{,p}\varepsilon^* \bullet \varepsilon^a}{2\sigma_1 \bullet L_{11}(u'^*, u_1) + \sigma'^* \bullet L_2(u_1)}. \tag{8.3.25}$$
Equation (8.3.25) is based on the buckling mode and the prebuckling state calculated
at $\mu = \mu^*$. The usual practice, however, is to estimate the buckling load by solving a
linearized eigenvalue problem based on a load $\mu < \mu^*$. It is shown in [4] that the error
introduced in the derivative $\mu^*_{,p}$ due to this approximation is of the order of $(\mu^* - \mu)^2$.

Example 8.3.2

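We repeat Example (8.3.1) using the adjoint approach. We need to recalculate the
two terms that depend on the derivative of the static solution. From that example
these are

$$2\sigma_1 \bullet L_{11}(u_{,p}, u_1) + \sigma_{,p} \bullet L_2(u_1) = 6.17142EA\bar w\bar w_{,A}L + 1.02857E\bar w^2L - 1.2E\varepsilon^iL. \tag{a}$$
Using the adjoint method these two terms are replaced by the term

$$\mathcal{A} = -\varepsilon^a \bullet D_{,p}(\varepsilon - \varepsilon^i)$$
in Eq. (8.3.24).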
The adjoint state, defined by Eqs. (8.3.21)-(8.3.23), has an initial strain and a
body force. The initial strain is $L_2(u_1) = w_{1,x}^2$. The corresponding equivalent nodal
force, $f^i$, is
$$f^iL\,\delta\bar w = \int_0^L w_{1,x}^2EA\,\delta\varepsilon_x\,dx.$$

Using expressions from Examples (8.2.1) and (8.3.1) for $\delta\varepsilon_x$ and $w_1$ we get

$$f^i = 1296\frac{EA\bar w}{L}\int_0^L(\bar x - \bar x^2)^4\,dx = 2.05714EA\bar w.$$

The body force is

$$f^bL\,\delta\bar w = 2\sigma_1 \bullet L_{11}(u_1, \delta u) = 2\int_0^L N_{x1}w_{1,x}\delta w_{,x}\,dx = 2EA\int_0^L w_{1,x}^2w_{,x}\delta w_{,x}\,dx$$

$$= 2592EA\bar w\,\delta\bar w\int_0^L(\bar x - \bar x^2)^4\,dx = 4.11428EAL\bar w\,\delta\bar w.$$

Altogether, the total nodal force is

$$f^a = f^i + f^b = 6.17142EA\bar w.$$

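This force has to be applied to the tangent structure. This means that if we use it to
replace the right-hand side of the tangent state equation, Eq. (b) of Example (8.2.1),
then we must use $\bar w^a$ to replace $\bar w'$ on the left side. That is

$$3.08571EA\bar w^2\bar w^a + 12(EI/L^2)\bar w^a - 1.2EA\varepsilon^i\bar w^a = 6.17142EA\bar w. \tag{b}$$
For later use we compare Eq. (b) to Eq. (d) of Example (8.2.1) and note that

$$\bar w^a = \frac{6.17142A\bar w_{,A}}{-1.02857\bar w^2 + 1.2\varepsilon^i}. \tag{c}$$

Once we have $\bar w^a$ from Eq. (b) we can calculate $\mathcal{A}$ as

$$\mathcal{A} = -\int_0^L \varepsilon^a_xE(\varepsilon_x - \varepsilon^i)\,dx.$$
From Eq. (8.3.21)

$$\varepsilon^a_x = w^a_{,x}w_{,x} - w_{1,x}^2 = 36(\bar x - \bar x^2)^2(\bar w^a\bar w - 1),$$

so that
$$\mathcal{A} = 36(1 - \bar w^a\bar w)\int_0^L(\bar x - \bar x^2)^2E[18\bar w^2(\bar x - \bar x^2)^2 - \varepsilon^i]\,dx = (1 - \bar w^a\bar w)EL(1.02857\bar w^2 - 1.2\varepsilon^i). \tag{d}$$
We can now calculate $\bar w^a$ from Eq. (b) and $\mathcal{A}$ from Eq. (d) to get the derivative of
$\omega^2$ without calculating $\bar w_{,A}$. To check that Eq. (d) gives the same result as Eq. (a) we
use Eq. (c) to obtain

$$(1 - \bar w^a\bar w) = \frac{6.17142A\bar w\bar w_{,A} + 1.02857\bar w^2 - 1.2\varepsilon^i}{1.02857\bar w^2 - 1.2\varepsilon^i}.$$

Substituting this expression into Eq. (d) we find Eq. (a). •••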


8.4 Static Shape Sensi rity

The calculation of sensitivity with respect to shape variation is more complicated
than that for stiffness variation. The present section is limited to shape sensitivity of
static response in the linear elastic range, and is based on Refs. [5-11]. Furthermore,
the discussion is limited to formulations without curvature; curved formulations such
as arches and shells are excluded. The reader is also referred to Ref. [2] for proofs of
many of the results presented here.
Two general approaches have been used for variational shape sensitivity. The
first and more popular is the material derivative approach, and the second one is the
domain parametrization approach, also known as the control volume approach. While
both methods are very general, the domain parametrization approach is simpler, and
is particularly powerful for finite element analysis with isoparametric elements. We
start this section with a discussion of these two methods, and then see how they can
be applied with the direct and adjoint methods.

8.4.1 The Material Derivative

Consider a shape variation field $\phi$ such that a material particle located at $x$ is
moved to $x^\phi$
$x^\phi = x + \phi(x, p)$, (8.4.1)
where p is a shape design variable. The coordinate x is typically referred to as the
material or Lagrangian coordinate in that it is associated with a material particle.
The variation changes the domain V and the boundary S of the structure as
shown in Figure 8.4.1.

Figure 8.4.1 Shape variation of structural domain.

Consider a function $f(x,p)$ defined on the changing structural domain V. We
denote the partial derivative $\partial f/\partial p$ of f with respect to p by $f_{,p}$. This derivative
measures the change in f at a fixed position in the structure, and is often referred
to as the local derivative. The derivative that measures the change in f at a fixed
material point also needs to take into account the change in x as p changes. This
derivative is called the material derivative or the total derivative of f, and is denoted
here by $f_p$
$f_p = f_{,p} + \nabla f^T x^\phi_{,p} = f_{,p} + \nabla f^T v$, (8.4.2)
where $\nabla f$ denotes the gradient of f in space, and

$v = x^\phi_{,p} = \phi_{,p}$ (8.4.3)

is often referred to as the shape velocity field. This terminology is based on viewing
p as a time-like variable, so that $x^\phi_{,p}$ is a velocity field. The components of v are
denoted by $v_k$ where k runs from 1 to the dimension of the problem, and $v_1 = v_x$,
$v_2 = v_y$, $v_3 = v_z$.
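The relation in Eq. (8.4.2) can be checked numerically. The following is a minimal sketch, assuming a one-dimensional bar stretched proportionately (a field of the form $\phi = x(p/L - 1)$) and an arbitrary smooth function f; the function names, the test function, and the numerical values are illustrative and not part of the text.

```python
import math

L = 2.0  # reference length of the bar (illustrative value)

def moved_position(x, p):
    """x_phi = x + phi(x, p) with the assumed field phi = x (p / L - 1)."""
    return x + x * (p / L - 1.0)

def f(x, p):
    """Arbitrary smooth field on the changing domain (illustrative only)."""
    return math.sin(x) * p ** 2

def material_derivative_fd(x, p, h=1e-6):
    """Follow the particle: central difference of f(x_phi(x, p), p) in p."""
    f_plus = f(moved_position(x, p + h), p + h)
    f_minus = f(moved_position(x, p - h), p - h)
    return (f_plus - f_minus) / (2.0 * h)

def material_derivative_formula(x, p, h=1e-6):
    """Local derivative plus gradient times shape velocity, as in Eq. (8.4.2)."""
    xs = moved_position(x, p)                             # current position of the particle
    f_local = (f(xs, p + h) - f(xs, p - h)) / (2.0 * h)   # f_{,p} at a fixed position
    grad_f = (f(xs + h, p) - f(xs - h, p)) / (2.0 * h)    # spatial gradient of f
    v = x / L                                             # shape velocity d(x_phi)/dp
    return f_local + grad_f * v

x0, p0 = 0.7, 2.3
print(material_derivative_fd(x0, p0), material_derivative_formula(x0, p0))
```

The two printed numbers should agree to within finite-difference error, which illustrates that the material derivative differs from the local derivative by the convective term $\nabla f^T v$.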

Consider now a vector function such as the displacement field u. For each com-
ponent $u_i$ of u we can use Eq. (8.4.2) to obtain the material derivative as

$u_{ip} = u_{i,p} + (\nabla u_i)^T v$. (8.4.4)

This equation is abbreviated as

$u_p = u_{,p} + (\nabla u)v$, (8.4.5)

where $\nabla u$ is a matrix called the deformation gradient with components given by

$(\nabla u)_{ij} = u_{i,j} \equiv \partial u_i/\partial x_j$. (8.4.6)

Note that a comma followed by an index subscript j denotes differentiation with
respect to $x_j$. From this definition we get that

$(\nabla u)v = u_{,j}v_j$, (8.4.7)

with repeated indices summed over the dimension of the problem, so that for the
two-dimensional case

$(\nabla u)v = u_{,1}v_1 + u_{,2}v_2$. (8.4.8)

Similarly for a tensor such as the strain tensor $\varepsilon$ the material derivative is given by

$\varepsilon_p = \varepsilon_{,p} + (\nabla\varepsilon)v = \varepsilon_{,p} + \varepsilon_{,i}v_i$. (8.4.9)

Typically the material derivative is more physically interesting than the local
derivative. For example, if we change the shape of a hole boundary to relieve stress
concentration at that boundary, we would want the derivative of the stress at the
boundary rather than at a point with fixed coordinates. Mathematically, the material
derivative is more complicated to handle than the local derivative. For example,
the local derivative commutes with differentiation with respect to coordinates while
the material derivative does not. Consider, for example, the strain field associated
with a displacement field u, and denote it as $\varepsilon(u)$. The strain is obtained from the
displacements by differentiation, and since we can change the order of differentiation
for local derivatives
$\varepsilon_{,p}(u) = \varepsilon(u_{,p})$, (8.4.10)
while we cannot write a similar equation for the material derivative $\varepsilon_p$.
In order to differentiate the virtual work equation with respect to p we need to
calculate derivatives of integrals over the volume and over the surface of the structure.
Let $I_v$ denote an integral over the domain of the structure

$I_v = \int_V f(x,p)\, dV$. (8.4.11)

The derivative of $I_v$ with respect to p is

$I_{vp} = \int_V (f_p + f\bar{V}_p)\, dV$, (8.4.12)

where $\bar{V}_p$ is the relative change in volume. It can be shown (e.g., [2]) that

$\bar{V}_p = v_{k,k}$. (8.4.13)

Recall that repeated indices are summed over the dimensionality of the problem, so
that in the three-dimensional case

$v_{k,k} = v_{1,1} + v_{2,2} + v_{3,3} = v_{x,x} + v_{y,y} + v_{z,z} = \mathrm{div}\, v$. (8.4.14)

The derivative of the surface integral

$I_s = \int_S f(x,p)\, dS$ (8.4.15)

is handled in a similar manner; thus

$I_{sp} = \int_S (f_p + f\bar{S}_p)\, dS$. (8.4.16)

The derivative of the surface element is given as

$(dS)_p = \bar{S}_p\, dS = -H n^T v\, dS$, (8.4.17)

where n is the vector normal to the boundary S, and H is the curvature of S in two
dimensions and twice the mean curvature in three dimensions.
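A one-dimensional check of the volume-integral derivative may help fix ideas. The sketch below assumes a bar occupying $0 \le x \le p$ that is stretched proportionately, so that the shape velocity at the current position is $v = x/p$ and $v_{k,k} = 1/p$; the integrand, the Simpson-rule helper, and all names are illustrative assumptions. It compares a finite-difference derivative of the integral with the integral of the material derivative of the integrand plus the integrand times the relative volume change.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def f(x, p):        # illustrative integrand
    return math.cos(x) * p + x ** 2

def f_local(x, p):  # f_{,p}: derivative at a fixed position
    return math.cos(x)

def f_x(x, p):      # spatial gradient of f
    return -math.sin(x) * p + 2.0 * x

def integral(p):
    """I_v(p): integral of f over the current domain 0 <= x <= p."""
    return simpson(lambda x: f(x, p), 0.0, p)

def integral_derivative(p):
    """Integral of (material derivative of f) + f * v_{k,k} over the current domain."""
    div_v = 1.0 / p  # relative volume change for v = x / p
    def integrand(x):
        material = f_local(x, p) + f_x(x, p) * (x / p)  # f_p = f_{,p} + grad f * v
        return material + f(x, p) * div_v
    return simpson(integrand, 0.0, p)

p0, h = 1.5, 1e-5
finite_difference = (integral(p0 + h) - integral(p0 - h)) / (2.0 * h)
print(finite_difference, integral_derivative(p0))
```

For this integrand the exact derivative is $\sin p + p\cos p + p^2$, which both numbers should reproduce closely.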

8.4.2 Domain Parametrization

The discussion of the domain parametrization is based on the work of Haber and
coworkers, and in particular [11]. With this approach the material coordinate vector
x is given in terms of some reference domain as
$x = x(r,p)$, (8.4.18)
where r is a coordinate vector in the reference domain $\Omega$ with boundary $\Gamma$, and p is
a shape parameter (see Figure 8.4.2). When isoparametric elements are used, it is
convenient to use the parent element as the reference domain for the actual element.
Specifically, for isoparametric elements the coordinate vector x in the element is
written as

$x = \sum_{i=1}^{\#\,\mathrm{nodes}} h_i(r)\, d_i(p)$, (8.4.19)

where $h_i$ are shape functions for the element, r is a vector of intrinsic coordinates,
and $d_i$ are vectors of nodal coordinates. Variations in geometry are represented by
variations of the nodal coordinates, with the shape functions held fixed.
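As an illustration of Eq. (8.4.19), the sketch below maps a point of a four-node bilinear parent element into a hypothetical rectangular element whose right edge is located at x = p. The element geometry, the node ordering, and the function names are assumptions made for this example only. Because the shape functions are held fixed, the shape velocity is obtained by interpolating the derivatives of the nodal coordinates.

```python
def shape_functions(r, s):
    """Bilinear shape functions of the 4-node parent square, r and s in [-1, 1]."""
    return [0.25 * (1 - r) * (1 - s), 0.25 * (1 + r) * (1 - s),
            0.25 * (1 + r) * (1 + s), 0.25 * (1 - r) * (1 + s)]

def nodal_coords(p):
    """Hypothetical element of unit height whose right edge sits at x = p."""
    return [(0.0, 0.0), (p, 0.0), (p, 1.0), (0.0, 1.0)]

def map_point(r, s, p):
    """x = sum_i h_i(r) d_i(p): material coordinates of the parent point (r, s)."""
    h = shape_functions(r, s)
    d = nodal_coords(p)
    x = sum(hi * di[0] for hi, di in zip(h, d))
    y = sum(hi * di[1] for hi, di in zip(h, d))
    return x, y

def shape_velocity(r, s):
    """v = sum_i h_i(r) d_{i,p}; only the two right-edge nodes move, with d_{,p} = (1, 0)."""
    h = shape_functions(r, s)
    return h[1] + h[2], 0.0

print(map_point(0.0, 0.0, 2.0))   # element centre
print(shape_velocity(1.0, 0.3))   # a point on the moving edge
```

Points on the moving edge have unit velocity in x, points on the fixed edge have zero velocity, and the interior is interpolated linearly in between.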

Figure 8.4.2 Domain parametrization approach: material configuration $p_0$ with actual
finite elements, reference configuration with parent finite elements, and material
configuration p.

The transformation between the reference domain and material domain is char-
acterized by the Jacobian of the transformation $J^E$, known as the Eulerian Jacobian,
and its inverse $J^{-E}$

$J^E_{ij} = \partial r_i/\partial x_j = r_{i,j}$ and $J^{-E}_{ij} = \partial x_i/\partial r_j = x_{i.j}$. (8.4.20)

Note that a comma followed by an index subscript (such as i or j) denotes differentia-
tion with respect to a material coordinate, while a dot followed by an index subscript
denotes differentiation with respect to a reference coordinate. The differential vol-
ume and area in the material configuration are expressed in terms of the reference
configuration using the determinant of the Eulerian Jacobian

$dV = \det(J^{-E})\, d\Omega$, $dS = \det(K^{-E})\, d\Gamma$, (8.4.21)
where $K^{-E}$ is a Jacobian of the transformation between the surface coordinates in
the reference and material configurations. Its determinant is given as

(8.4.22)

where $n_i$ are the components of the unit outward normal to the surface $\Gamma$ of
the reference domain, and repeated indices are summed. The derivative of $J^{-E}$ with
respect to p is obtained from its definition

(8.4.23)

while the derivative of $J^E$ requires using the formula for the derivative of an inverse

$(A^{-1})_{,p} = -A^{-1} A_{,p} A^{-1}$ (8.4.24)

to get
$J^E_{,p} = -J^E J^{-E}_{,p} J^E$. (8.4.25)
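The derivative of the Eulerian Jacobian follows from the standard identity for the derivative of a matrix inverse, $(A^{-1})_{,p} = -A^{-1}A_{,p}A^{-1}$. The sketch below checks this numerically for a hypothetical two-dimensional mapping $x_1 = p r_1 + 0.2 p r_2$, $x_2 = (1 + 0.5p) r_2$; the mapping and the helper names are assumptions chosen only to give a simple, invertible Jacobian.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mul2(m, n):
    """Product of two 2x2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def jac_inverse_eulerian(p):
    """J^{-E}_{ij} = dx_i/dr_j for the assumed mapping."""
    return [[p, 0.2 * p], [0.0, 1.0 + 0.5 * p]]

def jac_inverse_eulerian_dp(p):
    """Derivative of J^{-E} with respect to p, taken entry by entry."""
    return [[1.0, 0.2], [0.0, 0.5]]

def jac_eulerian_dp_formula(p):
    """J^E_{,p} = -J^E J^{-E}_{,p} J^E."""
    je = inv2(jac_inverse_eulerian(p))
    t = mul2(mul2(je, jac_inverse_eulerian_dp(p)), je)
    return [[-t[i][j] for j in range(2)] for i in range(2)]

def jac_eulerian_dp_fd(p, h=1e-6):
    """Central finite difference of J^E = (J^{-E})^{-1}."""
    plus = inv2(jac_inverse_eulerian(p + h))
    minus = inv2(jac_inverse_eulerian(p - h))
    return [[(plus[i][j] - minus[i][j]) / (2.0 * h) for j in range(2)]
            for i in range(2)]

print(jac_eulerian_dp_formula(2.0))
print(jac_eulerian_dp_fd(2.0))
```

The two matrices should agree to within finite-difference error.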

With the domain parametrization approach the displacements, strains and
stresses are considered to be functions of the reference coordinates r. Therefore,
when we evaluate their derivatives with respect to p we get derivatives for a con-
stant position r which are essentially the material derivatives of these quantities. A
function $f(x,p)$ is first rewritten in terms of the reference coordinates as $\hat{f}(r,p)$ and
then
$f_p = \hat{f}_{,p}$. (8.4.26)

Derivatives with respect to material coordinates have to be transformed to derivatives
with respect to reference coordinates using the chain rule of differentiation. Thus,
the linear strain displacement relationship becomes

$\varepsilon_{ij} = \frac{1}{2}(u_{i.k}J^E_{kj} + u_{j.k}J^E_{ki})$. (8.4.27)

This produces an explicit dependence of the strain on the shape parameter, and to
reflect that we rewrite Eq. (8.1.1) as

(8.4.28)

Derivatives of integrals are handled in a similar way to the material derivative ap-
proach. A volume integral is written in terms of the reference coordinates

$I_v = \int_V f(x,p)\, dV = \int_\Omega \hat{f}(r,p) \det(J^{-E})\, d\Omega$, (8.4.29)

where $\hat{f}$ is the new form of the function when it is written in terms of the reference
coordinates. Then
$I_{vp} = \int_\Omega (\hat{f}_{,p} + \hat{f}\bar{V}_p) \det(J^{-E})\, d\Omega$, (8.4.30)


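where
$\bar{V}_p = (\det(J^{-E}))_{,p}/\det(J^{-E})$. (8.4.31)

Similarly, for a surface integral $I_s$, Eq. (8.4.15), we get

$I_{sp} = \int_\Gamma (\hat{f}_{,p} + \hat{f}\bar{S}_p) \det(K^{-E})\, d\Gamma$, (8.4.32)

where
$\bar{S}_p = (\det(K^{-E}))_{,p}/\det(K^{-E})$. (8.4.33)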
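The two expressions for the relative volume change, the divergence of the shape velocity (material derivative approach) and the logarithmic derivative of $\det(J^{-E})$ (domain parametrization approach), can be compared numerically. The sketch below assumes the hypothetical mapping $x_1 = p r_1 + 0.2 p r_2$, $x_2 = (1 + 0.5p) r_2$ and evaluates the divergence through the chain rule $v_{k,k} = v_{k.j}J^E_{jk}$; the mapping and all names are illustrative.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def jac(p):
    """J^{-E}_{ij} = dx_i/dr_j for the assumed mapping."""
    return [[p, 0.2 * p], [0.0, 1.0 + 0.5 * p]]

def velocity_gradient_reference(p):
    """v_{k.j}: derivative of the shape velocity v = dx/dp with respect to r_j."""
    return [[1.0, 0.2], [0.0, 0.5]]

def volume_change_from_determinant(p, h=1e-6):
    """(det J^{-E})_{,p} / det J^{-E}, with a central finite difference in p."""
    ddet = (det2(jac(p + h)) - det2(jac(p - h))) / (2.0 * h)
    return ddet / det2(jac(p))

def volume_change_from_divergence(p):
    """v_{k,k} = v_{k.j} J^E_{jk}, the divergence of v in material coordinates."""
    je = inv2(jac(p))
    g = velocity_gradient_reference(p)
    return sum(g[k][j] * je[j][k] for k in range(2) for j in range(2))

print(volume_change_from_determinant(2.0), volume_change_from_divergence(2.0))
```

For p = 2 both functions should return 0.75, since $\det(J^{-E}) = p + 0.5p^2$ for this mapping.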

8.4.3 The Direct Method

To apply the direct method to shape sensitivity calculation we need to differentiate the
strain displacement relation, Eq. (8.1.1), Hooke's law, Eq. (8.1.4) and the equilibrium
equations, Eq. (8.1.6) with respect to p. We start with the strain displacement
relation and with the material derivative approach. Using Eqs. (8.4.9) and (8.4.10)
the differentiated strain-displacement relation is

$\varepsilon_p = \varepsilon_{,p} + (\nabla\varepsilon)v = L_1(u_{,p}) + (\nabla\varepsilon)v$. (8.4.34)

Using Eq. (8.4.5) we can write this as

$\varepsilon_p = L_1(u_p) - \bar{\varepsilon}$, (8.4.35)

where
$\bar{\varepsilon} = L_1[(\nabla u)v] - (\nabla\varepsilon)v$ (8.4.36)

is an initial strain associated with the sensitivity field. Even though Eq. (8.4.36)
appears to include strain gradients, these gradients cancel out and $\bar{\varepsilon}$ includes only
first derivatives of the displacement and shape velocity fields. For example, for the
three-dimensional case we obtain

$\bar{\varepsilon}_{ij} = \frac{1}{2}(u_{i,k}v_{k,j} + u_{j,k}v_{k,i})$. (8.4.37)

We can get another form of $\bar{\varepsilon}$ by using the domain parametrization approach.
Differentiating Eq. (8.4.27) we get

$(\varepsilon_{ij})_p = \frac{1}{2}(u_{pi.k}J^E_{kj} + u_{pj.k}J^E_{ki}) + \frac{1}{2}(u_{i.k}J^E_{kj,p} + u_{j.k}J^E_{ki,p}) = [L_1(u_p)]_{ij} - \bar{\varepsilon}_{ij}$, (8.4.38)

where
$\bar{\varepsilon}_{ij} = -\frac{1}{2}(u_{i.k}J^E_{kj,p} + u_{j.k}J^E_{ki,p})$. (8.4.39)

We assume that the elastic coefficients do not change with shape change, and
that there is no initial strain. Then the derivative of Hooke's law is

$\sigma_p = D\varepsilon_p$. (8.4.40)
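The cancellation of the strain gradients in the initial strain of Eq. (8.4.36) can be verified in index notation. The following is a sketch, assuming the linear strain operator $[L_1(u)]_{ij} = \frac{1}{2}(u_{i,j} + u_{j,i})$ and fields smooth enough for the order of partial differentiation to be interchanged:

```latex
\begin{align*}
[L_1((\nabla u)v)]_{ij}
  &= \tfrac{1}{2}\left[(u_{i,k}v_k)_{,j} + (u_{j,k}v_k)_{,i}\right]
   = \tfrac{1}{2}(u_{i,kj} + u_{j,ki})\,v_k
   + \tfrac{1}{2}(u_{i,k}v_{k,j} + u_{j,k}v_{k,i}), \\
[(\nabla\varepsilon)v]_{ij}
  &= \varepsilon_{ij,k}\,v_k
   = \tfrac{1}{2}(u_{i,jk} + u_{j,ik})\,v_k, \\
\bar{\varepsilon}_{ij}
  &= [L_1((\nabla u)v)]_{ij} - [(\nabla\varepsilon)v]_{ij}
   = \tfrac{1}{2}(u_{i,k}v_{k,j} + u_{j,k}v_{k,i}).
\end{align*}
```

The second-derivative terms of u cancel, and in one dimension the result reduces to $\bar{\varepsilon} = u_{,x}v_{,x}$.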

The derivative of the equations of equilibrium is
$(\sigma \bullet \delta\varepsilon)_p = (f \bullet \delta u)_p$. (8.4.41)
The term on the left side of Eq. (8.4.41) is a volume integral which according to Eqs.
(8.4.12) and (8.4.30) is the volume integral of the derivative of the integrand plus a
term which accounts for the change in the volume element. This translates to
$(\sigma \bullet \delta\varepsilon)_p = \sigma_p \bullet \delta\varepsilon + \sigma \bullet \delta\varepsilon_p + \sigma \bullet (\bar{V}_p \delta\varepsilon)$. (8.4.42)
The derivative of the virtual strain $\delta\varepsilon_p$ is obtained in a similar manner to Eq. (8.4.35)
as
$\delta\varepsilon_p = L_1(\delta u_p) - \delta\bar{\varepsilon}$, (8.4.43)
where with the material derivative approach
$\delta\bar{\varepsilon} = L_1[\nabla(\delta u)v] - (\nabla\delta\varepsilon)v$, (8.4.44)
while for the domain parametrization approach
$\delta\bar{\varepsilon}_{ij} = -\frac{1}{2}(\delta u_{i.k}J^E_{kj,p} + \delta u_{j.k}J^E_{ki,p})$. (8.4.45)
The derivative of the virtual work of the applied loads is more complicated because
this work is composed of volume and surface integrals
$f \bullet \delta u = f_b \bullet \delta u + T \bullet \delta u$, (8.4.46)
where $f_b$ denotes the body load vector, and T denotes the vector of applied tractions.
The first term on the right hand side of Eq. (8.4.46) is a volume integral, while the
second term is a surface integral. Differentiating the body force integral is straight-
forward. However, the traction term can be a problem if there are corners on the
boundary or if the loaded boundary is changing. We will assume that there are no
corners or changes in loaded boundary. Then we can differentiate Eq. (8.4.46) to get
$(f \bullet \delta u)_p = f_{bp} \bullet \delta u + f_b \bullet \delta u_p + f_b \bullet (\bar{V}_p \delta u) + T_p \bullet \delta u + T \bullet \delta u_p + T \bullet (\bar{S}_p \delta u)$. (8.4.47)
The virtual displacement $\delta u$ is arbitrary except that it needs to satisfy the kinematic
boundary conditions, which are assumed to be independent of p. We make sure that
$\delta u$ satisfies these boundary conditions as the shape changes by requiring that
$\delta u_p = 0$. (8.4.48)
Using Eq. (8.4.48), Eq. (8.4.43) becomes
$\delta\varepsilon_p = -\delta\bar{\varepsilon}$. (8.4.49)
Finally, using Eqs. (8.4.41), (8.4.42), (8.4.47) and (8.4.49) we get
$\sigma_p \bullet \delta\varepsilon = f_{bp} \bullet \delta u + f_b \bullet (\bar{V}_p \delta u) + T_p \bullet \delta u + T \bullet (\bar{S}_p \delta u) + \sigma \bullet \delta\bar{\varepsilon} - \sigma\bar{V}_p \bullet \delta\varepsilon$. (8.4.50)
The right hand side of Eq. (8.4.50) represents the body forces that need to be
applied to the structure (along with the initial strain $\bar{\varepsilon}$) in order for the solution to
be the sensitivity field. The pseudo load $f^p$ that needs to be applied to the original
structure to produce the sensitivity field includes the terms on the right hand side of
Eq. (8.4.50) as well as a pseudo force due to the initial strain $\bar{\varepsilon}$
$f^p \bullet \delta u = f_{bp} \bullet \delta u + f_b\bar{V}_p \bullet \delta u + T_p \bullet \delta u + T\bar{S}_p \bullet \delta u + \sigma \bullet \delta\bar{\varepsilon} - \sigma\bar{V}_p \bullet \delta\varepsilon + D\bar{\varepsilon} \bullet \delta\varepsilon$. (8.4.51)
When the curve separating the loaded and unloaded boundaries is changing, and
when the boundary has corners, there are additional terms (see [6]). By using Eq.
(8.4.51), we may write Eq. (8.4.50) as
$\sigma_p \bullet \delta\varepsilon = f^p \bullet \delta u - D\bar{\varepsilon} \bullet \delta\varepsilon$. (8.4.52)
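A short sketch of why the pseudo load can be applied to the original structure: substituting the differentiated strain-displacement relation, Eq. (8.4.35), and the differentiated Hooke's law, Eq. (8.4.40), into Eq. (8.4.52) removes the initial-strain term from both sides. The discretized form on the last line, with a stiffness matrix $K$ and nodal vectors, is notation assumed here for illustration and does not appear in the text.

```latex
\begin{align*}
\sigma_p &= D\varepsilon_p = D\left[L_1(u_p) - \bar{\varepsilon}\right], \\
D\left[L_1(u_p) - \bar{\varepsilon}\right] \bullet \delta\varepsilon
  &= f^p \bullet \delta u - D\bar{\varepsilon} \bullet \delta\varepsilon, \\
D L_1(u_p) \bullet \delta\varepsilon &= f^p \bullet \delta u
  \quad\Longrightarrow\quad K\,u_p = f^p .
\end{align*}
```

The left side is the same virtual work operator as in the original analysis, so the sensitivity field is obtained by re-solving the original structure with $f^p$ in place of the applied load.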

Example 8.4.1

Figure 8.4.3 Bar under self-weight loading: (a) reference domain; (b) material
configuration.

The bar shown in Figure (8.4.3) is loaded under its own weight. Calculate the
sensitivity of the solution to changes in the length of the bar (approximated by a
single finite element) using the direct method.
The loading in this case is a body force of constant magnitude $f = \rho A g$. The
exact solution for the displacement u and the member force N is given in terms of
the density $\rho$, the area A and the acceleration due to gravity g as
$N = \rho A g(L - x)$, $u = (\rho g/E)(Lx - x^2/2)$.
Using a single linear finite element we concentrate half of the body force at each node,
so that each node is loaded by $\rho A g L/2$. The finite-element solution is

$u_2 = \rho g L^2/2E$, $\varepsilon = \rho g L/2E$, $N = \rho A g L/2$,

so that the maximum displacement is correct, but the maximum member force is off
by a factor of 2. The derivatives of these two quantities with respect to L are

$u_{2L} = \rho g L/E$, $N_L = \rho A g/2$. (a)

To calculate the sensitivity field with the material derivative approach we need
to assume a shape variation field $\phi$. We assume that as the length of the bar changes,
all points in the bar are moved proportionately. Denoting the new length of the bar
by p we find
$x^\phi = x(p/L)$, or $\phi = x(p/L - 1)$,
and the shape velocity field
$v = \phi_{,p} = x/L$.
We now have

$\bar{V}_p = v_{k,k} = v_{,x} = 1/L$, $\bar{\varepsilon} = u_{,x}v_{,x} = \varepsilon/L$.

For the domain parametrization method we use a parent element of length one,
so that

$x = (1 - r)x_1 + r x_2$,

where in our case $x_1 = 0$, $x_2 = p$. The Jacobian of the transformation is a single
number

$J^{-E} = \partial x/\partial r = -x_1 + x_2 = p$.

Then $J^E = 1/p$, so that from Eq. (8.4.39)

$\bar{\varepsilon} = -u_{.1}J^E_{,p} = -\frac{\partial u}{\partial r}\left(-\frac{1}{p^2}\right) = \frac{u_2}{p^2}$,

which is the same as $\bar{\varepsilon}$ obtained from the material derivative approach (at p = L,
$u_2/L^2 = \varepsilon/L$). The relative volume change, $\bar{V}_p$, is

$\bar{V}_p = (\det(J^{-E}))_{,p}/\det(J^{-E}) = 1/p$,

which also agrees with the material derivative result. The first term in the pseudo
load expression Eq. (8.4.51) is zero, because the body load is constant. The second
term introduces a body force of $f/L = \rho A g/L$ which accounts for the effect of change
in volume element on the resultant of the original body force. This is equivalent to
an end load of $\rho A g/2$. The two terms associated with the tractions vanish because
we have no applied surface tractions. The next two terms are evaluated using
the fact that for the finite-element model the member force N and the strain $\varepsilon$ are
constant in the element

$\sigma \bullet \delta\bar{\varepsilon} - \sigma\bar{V}_p \bullet \delta\varepsilon = N(\delta u_2/L^2)L - N(1/L)(\delta u_2/L)L = 0$.

The last term is

$D\bar{\varepsilon} \bullet \delta\varepsilon = EA(\varepsilon/L)(\delta u_2/L)L = (\rho A g/2)\delta u_2$.

Altogether
$f^p \bullet \delta u = \rho A g\, \delta u_2$,
which indicates that $f^p$ is equal to a force of $\rho A g$. Under this force we get
$u_{2p} = \rho g L/E$,
which agrees with the results in Eq. (a) above. To calculate the derivative of the
member force, $N_p$, we first calculate $\varepsilon_p$ from Eq. (8.4.35):
$\varepsilon_p = L_1(u_p) - \bar{\varepsilon} = u_{p,x} - \varepsilon/L = u_{2p}/L - \varepsilon/L$,

so that
$\varepsilon_p = \rho g/2E$, $N_p = EA\varepsilon_p = \rho A g/2$,
which agrees with the result in Eq. (a) above. • • •
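The steps of Example 8.4.1 can be reproduced numerically. The sketch below assumes the one-element model of the example (node 1 fixed, half of the weight lumped at node 2) and illustrative numerical values for the density, area, gravity, modulus and length; the function names are hypothetical. It assembles the pseudo load term by term, solves for the sensitivity field, and compares with a finite difference of the finite-element solution.

```python
def fe_solution(L, rho, A, g, E):
    """Single linear element: tip displacement, strain and member force."""
    tip_load = rho * A * g * L / 2.0   # half of the body force lumped at node 2
    u2 = tip_load * L / (E * A)
    eps = u2 / L
    return u2, eps, E * A * eps

def direct_sensitivity(L, rho, A, g, E):
    """Pseudo load and sensitivities with respect to the bar length (p = L)."""
    u2, eps, N = fe_solution(L, rho, A, g, E)
    Vp = 1.0 / L            # relative volume change, v = x / L
    eps_bar = eps / L       # initial strain u_{,x} v_{,x}
    # Each term below is the coefficient of the virtual tip displacement.
    body = rho * A * g * Vp * L / 2.0                 # f_b Vp . du: end load rho*A*g/2
    stress = N * (1.0 / L**2) * L - N * Vp * (1.0 / L) * L  # sigma.d(eps_bar) - sigma Vp.d(eps)
    initial = E * A * eps_bar * (1.0 / L) * L         # D eps_bar . d(eps)
    fp = body + stress + initial
    u2p = fp * L / (E * A)                            # solve the original structure under fp
    eps_p = u2p / L - eps_bar                         # eps_p = L1(u_p) - eps_bar
    return fp, u2p, E * A * eps_p

L0, rho, A, g, E = 4.0, 2.0, 3.0, 9.81, 100.0
fp, u2p, Np = direct_sensitivity(L0, rho, A, g, E)
h = 1e-6
u2_fd = (fe_solution(L0 + h, rho, A, g, E)[0] - fe_solution(L0 - h, rho, A, g, E)[0]) / (2 * h)
N_fd = (fe_solution(L0 + h, rho, A, g, E)[2] - fe_solution(L0 - h, rho, A, g, E)[2]) / (2 * h)
print(fp, u2p, Np, u2_fd, N_fd)
```

With these values the pseudo load equals $\rho A g$, and the two sensitivities match the finite differences of the one-element solution.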

8.4.4 The Adjoint Method

Consider now the sensitivity of a displacement functional H given by Eq. (8.1.19).
Using Eq. (8.4.12) to obtain the derivative with respect to the first argument, we
differentiate H to obtain

$H_p = \int (h_p + h\bar{V}_p)\, dV + h_{,u} \bullet u_p$. (8.4.53)

To eliminate the displacement derivative term, we multiply the derivatives of the
governing equations, Eqs. (8.4.35), (8.4.40), and (8.4.52), by adjoint fields as Lagrange
multipliers and add them to $H_p$ to obtain

$H_p = \int (h_p + h\bar{V}_p)\, dV + h_{,u} \bullet u_p + \sigma^a \bullet [\varepsilon_p - L_1(u_p) + \bar{\varepsilon}]$
$+ \varepsilon^a \bullet (\sigma_p - D\varepsilon_p) - \sigma_p \bullet L_1(u^a) + f^p \bullet u^a - D\bar{\varepsilon} \bullet L_1(u^a)$
(8.4.54)
$= \int (h_p + h\bar{V}_p)\, dV + h_{,u} \bullet u_p - \sigma^a \bullet L_1(u_p) + \varepsilon_p \bullet (\sigma^a - D\varepsilon^a)$
$+ \sigma_p \bullet [\varepsilon^a - L_1(u^a)] + [\sigma^a - DL_1(u^a)] \bullet \bar{\varepsilon} + f^p \bullet u^a$.

From Eq. (8.4.54) we see that we can eliminate the response sensitivity terms by
defining the adjoint as we did in the stiffness variable case, Eqs. (8.1.24)-(8.1.26).
Then we get
$H_p = \int (h_p + h\bar{V}_p)\, dV + f^p \bullet u^a$. (8.4.55)

Equation (8.4.55) requires the evaluation of $f^p$ from the volume integrals in Eq.
(8.4.51). It is possible to transform $f^p \bullet u^a$ to surface integrals (e.g., [6], [7]). However,
there has been unfavorable computational experience with the surface version of the
adjoint method (e.g., [12]). Unfortunately, it is not always possible to tell which
method gives more accurate results, as demonstrated in the following example.
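As a small illustration of Eq. (8.4.55), the sketch below applies it to the one-element self-weight bar of Example 8.4.1. The functional is an assumption chosen for this illustration, $H = \int_0^L u\,dx$ (integrating over the length only, with $h = u$ having no explicit dependence on p), and the numerical values and function names are hypothetical. The adjoint load is the consistent nodal load of a unit distributed force, and the pseudo load is the force $\rho A g$ found in that example.

```python
def functional(L, rho, A, g, E):
    """H = integral of the one-element displacement u = u2 * x / L over the bar."""
    u2 = rho * g * L**2 / (2.0 * E)
    return u2 * L / 2.0

def adjoint_sensitivity(L, rho, A, g, E):
    """H_p = integral of (h_p + h Vp) + f^p . u^a, with the explicit derivative h_p = 0."""
    u2 = rho * g * L**2 / (2.0 * E)
    Vp = 1.0 / L                       # relative volume change for v = x / L
    explicit = (u2 * L / 2.0) * Vp     # integral of h * Vp with h = u
    adjoint_load = L / 2.0             # h_{,u} . du = integral of du -> nodal load at node 2
    ua2 = adjoint_load * L / (E * A)   # adjoint tip displacement of the same structure
    fp = rho * A * g                   # pseudo load from Example 8.4.1
    return explicit + fp * ua2

L0, rho, A, g, E = 4.0, 2.0, 3.0, 9.81, 100.0
h = 1e-6
fd = (functional(L0 + h, rho, A, g, E) - functional(L0 - h, rho, A, g, E)) / (2.0 * h)
print(adjoint_sensitivity(L0, rho, A, g, E), fd)
```

Both numbers should equal $3\rho g L^2/(4E)$, the derivative of $H = \rho g L^3/(4E)$ for the one-element model; no displacement sensitivity is computed along the way.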

Example 8.4.2

The cantilever beam shown in Figure (8.4.4) is modeled with rectangular plane
stress elements. The beam is composed of two materials with different Young's mod-
uli, and the position of the interface between the two is the design parameter p. The
sensitivity of the tip displacement with respect to the position of the interface was cal-
culated using six methods: (i) overall finite differences (OFD); (ii) the semi-analytical
method (SA); (iii) the discrete direct method (DD); (iv) the direct variational method
(DV); (v) the adjoint variational domain method (AVD); and (vi) the adjoint vari-
ational surface method (AVS). The first three methods are discussed in Chapter 7,
the next two in this chapter, and the last method in [6].


Figure 8.4.4 Geometry, loading and mesh definition of cantilever beam modeled by
plane stress elements (from [13]). Dimensions shown: L = 20 in., W = 2 in. Mesh
parameters: nx = number of elements along x; nys = number of elements along y
under the interface; nyh = number of elements along y above the interface.

Figure 8.4.5 Convergence of the tip displacement and its derivative for ny = 8.
Horizontal axis: number of elements in x direction. Curves: OFD (overall finite
differences), SA (semi-analytical), DD (direct discrete), DV (direct variational),
AVD (adjoint variational domain), AVS (adjoint variational surface).

Section 8.6: References

The convergence of the displacement and its derivative as the number of
elements along the axis of the beam is increased is shown in Figure (8.4.5). It is
clear that the derivatives converge more slowly than the displacement, which is to be
expected. It is also seen that, although several methods, including the direct methods
and the adjoint variational domain method, agree very well with the overall finite
difference method, they are not more accurate than the adjoint variational surface
method. They just converge to the correct value from a different direction. • • •

8.5 Exercises

1. The three-bar truss of Example 8.1.1 is loaded by heating member A by $\Delta T$
degrees instead of by mechanical loads. Use the direct method to calculate the
derivatives of the stresses in the three members with respect to the cross-sectional
area of member A in terms of A, l, E, $\Delta T$ and the coefficient of thermal expansion
$\alpha$.
2. Derive the expression for the adjoint loading in the case of linear structural
analysis for a functional $g(T_i)$, where $T_i$ are the components of the traction on the
boundary.
3. Derive the expression for $G_{,p}$ of Section 8.1.2 for the case of nonzero initial strain.
4. Using the results obtained in the previous problem calculate the derivatives of
the stress in member A of the three-bar truss of Problem 1 with respect to the two
cross-sectional areas using the adjoint method.
5. Show that Eqs. (8.1.27) and (8.1.40) are applicable also for the nonlinear case.
6. Calculate the derivative of the axial strain $\varepsilon_x$ in Example 8.2.1 with respect to A
using the direct and adjoint methods.
7. Derive Eq. (8.3.9).
8. Repeat Example 8.4.1 using the adjoint method.

8.6 References

[1] Cohen, G.A., "FASOR-A Program for Stress, Buckling, and Vibration of Shells
of Revolution," Advances in Engineering Software, 3 (4), pp. 155-162, 1981.
[2] Haug, E.J., Choi, K.K., and Komkov, V., Design Sensitivity Analysis of Structural
Systems, Academic Press, 1986.
[3] Mróz, Z., Kamat, M.P., and Plaut, R.H., "Sensitivity Analysis and Optimal De-
sign of Nonlinear Beams and Plates," J. Struct. Mech., 13 (3/4), pp. 245-266,
1985.
[4] Cohen, G.A., "Effect of Nonlinear Prebuckling State on the Postbuckling Behav-
ior and Imperfection Sensitivity of Elastic Structures," AIAA Journal, 6 (8), pp.
1616-1619, 1968.
[5] Mróz, Z., "Sensitivity Analysis and Optimal Design with Account for Varying
Shape and Support Conditions," In Computer Aided Optimal Design: Structural
and Mechanical Systems (C.A. Mota Soares, Editor), Springer-Verlag, 1987, pp.
407-438.
[6] Choi, K.K., "Shape Design Sensitivity Analysis and Optimal Design of Struc-
tural Systems," In Computer Aided Optimal Design: Structural and Mechanical
Systems (C.A. Mota Soares, Editor), Springer-Verlag, 1987, pp. 439-492.
[7] Yang, R-J., "A Three Dimensional Shape Optimization System-SHOP3D,"
Computers and Structures, 31 (8), pp. 881-890, 1989.
[8] Dems, K., and Haftka, R.T., "Two Approaches to Sensitivity Analysis for Shape
Variation of Structures," Mechanics of Structures and Machines, Vol. 16, No. 4,
pp. 501-522, 1988/89.
[9] Dems, K., and Mróz, Z., "Variational Approach by Means of Adjoint Systems to
Structural Optimization and Sensitivity Analysis, Part I: Variation of Material
Parameters within Fixed Domain," Int. J. Solids Struct., 19 (8), pp. 677-692,
1983, "Part II: Structure Shape Variation," 20, pp. 527-552, 1984.
[10] Phelan, D.G., and Haber, R.B., "Sensitivity Analysis of Linear Elastic Systems
Using Domain Parametrization and a Mixed Mutual Energy Principle," Com-
puter Methods in Applied Mechanics and Engineering, Vol. 77, pp. 31-59, 1989.
[11] Arora, J.S., and Cardoso, J.B., "A Variational Principle for Shape Design Sen-
sitivity Analysis," AIAA Paper 91-1213-CP, Proceedings AIAA/ASME/ASCE/
AHS/ASC 32nd Structures, Structural Dynamics and Material Conference, Bal-
timore, MD, April 8-10, 1991, Part 1, pp. 664-674.
[12] Choi, K.K. and Seong, H.G., "A Domain Method for Shape Design Sensitivity of
Built-Up Structures," Computer Methods in Applied Mechanics and Engineering,
Vol. 57, pp. 1-15, 1986.
[13] Haftka, R.T., and Barthelemy, B., "On the Accuracy of Shape Sensitivity Deriva-
tives," In: Eschenauer, H.A., and Thierauf, G. (eds), Discretization Methods and
Structural Optimization-Procedures and Applications, pp. 136-144, Springer-
Verlag, Berlin, 1989.

Chapter 9: Dual and Optimality Criteria Methods

In most of the analytically solved examples in Chapter 2, the key to the solution
is the use of an algebraic or a differential equation which forms the optimality con-
dition. For an unconstrained algebraic problem the simple optimality condition is
the requirement that the first derivatives of the objective function vanish. When the
objective function is a functional the optimality conditions are the Euler-Lagrange
equations (e.g., Eq. (2.2.13)). On the other hand, the numerical solution methods
discussed in chapters 4 and 5 (known as direct search methods) do not use the opti-
mality conditions to arrive at the optimum design. The reader may have wondered
why we do not have numerical methods that mimic the solution process for the prob-
lems described in Chapter 2. In fact, such numerical methods do exist, and they
are known as optimality criteria methods. One reason that the treatment of these
methods is delayed until this chapter is their limited acceptance in the optimization
community. While the direct search methods discussed in Chapters 4 and 5 are widely
used in many fields of engineering, science and management science, optimality crite-
ria methods have been used mostly for structural optimization, and even in this field
there are many practitioners who dispute their usefulness.

The usefulness of optimality criteria methods, however, becomes apparent when
we realize their close relationship with duality and dual solution methods (see Section
3.7). This relationship, established by Fleury, helps us understand that these methods
can be very efficient when the number of constraints is small compared to the number
of design variables. This chapter attempts to demonstrate the power of optimality
criteria methods and dual methods for the case where the number of constraints is
small. In particular, when there is only one constraint (plus possibly lower and upper
limits on the variables) there is little doubt that dual methods and optimality criteria
methods are the best solution approaches. The chapter begins with the discussion of
intuitive optimality criteria methods. These have motivated the development of the
more rigorous methods in use today. Then we discuss dual methods, and finally we
show that optimality criteria methods are closely related to dual methods.
9.1 Intuitive Optimality Criteria Methods

Optimality criteria methods consist of two complementary ingredients. The first is
the stipulation of the optimality criterion, which can be a rigorous mathematical state-
ment such as the Kuhn-Tucker conditions, or an intuitive one such as the stipulation
that the strain energy density in the structure is uniform. The second ingredient is
the algorithm used to resize the structure for the purpose of satisfying the optimal-
ity criterion. Again, a rigorous mathematical method may be used to achieve the
satisfaction of the optimality criterion, or one may devise an ad-hoc method which
sometimes works and sometimes does not. The division into intuitive and rigorous
methods is usually made on the basis of the chosen optimality criterion rather than
of the resizing algorithm. This convention will be employed in the following sections.

9.1.1 Fully Stressed Design

The Fully Stressed Design (FSD) technique is probably the most successful optimality
criteria method, and has motivated much of the initial interest in these methods. The
FSD technique is applicable to structures that are subject only to stress and minimum
gage constraints. The FSD optimality criterion can be stated as follows:

For the optimum design each member of the structure that is not at its minimum
gage is fully stressed under at least one of the design load conditions.

This optimality criterion implies that we should remove material from members
that are not fully stressed unless prevented by minimum gage constraints. This
appears reasonable, but it is based on an implicit assumption that the primary effect
of adding or removing material from a structural member is to change the stresses in
that member. If this assumption is violated, that is if adding material to one part of
the structure can have large effects on the stresses in other parts of the structure, we
may want to have members that are not fully stressed because they help to relieve
stresses in other members.

For statically determinate structures the assumption that adding material to a
member influences primarily the stresses in that member is correct. In fact, without
inertia or thermal loads there is no effect at all on stresses in other members. There-
fore, we can expect that the FSD criterion holds at the minimum weight design for
such structures, and this has been proven [1,2]. However, for statically indeterminate
structures the minimum weight design may not be fully stressed [3-6]. In most cases
of a structure made of a single material, there is a fully stressed design near the
optimum design, and so the method has been extensively used for metal structures,
especially in the aerospace industry (see for example, [7,8]). As illustrated by the
following example, the FSD method may not do as well when several materials are
employed.


Example 9.1.1

Figure 9.1.1 Two-bar structure.

A load P is transmitted by a rigid platform A-A to two axial members as shown
in Figure 9.1.1. The platform is kept perfectly horizontal so that the vertical displace-
ments of point C and of point D are identical. This may be accomplished by moving
the force P horizontally as the cross sections of members 1 and 2 are changed. The
two members are made from different steel alloys having the same Young's modulus
E but different densities $\rho_1$ and $\rho_2$, and different yield stresses $\sigma_{01}$ and $\sigma_{02}$, respec-
tively. The alloy with higher yield stress is also more brittle, and for this reason we
cannot use it in both members. We want to select the two cross-sectional areas $A_1$
and $A_2$ so as to obtain the minimum-mass design without exceeding the yield stress
in either member. Additionally, the cross-sectional areas are required to be larger
than a minimum gage value of $A_0$.
The mass, which is the objective function to be minimized, is

$m = l(\rho_1 A_1 + \rho_2 A_2)$,

where l is the length of the members. The stresses in the two members (based on the
assumption that the platform remains horizontal) are easily shown to be

$\sigma_1 = \sigma_2 = \frac{P}{A_1 + A_2}$.

Now assume that member one is made of a high-strength low-density alloy such that
$\sigma_{01} = 2\sigma_{02}$ and $\rho_1 = 0.9\rho_2$. In this case the critical constraint is

$\sigma_2 = \frac{P}{A_1 + A_2} \le \sigma_{02}$,


so that $A_1 + A_2 = P/\sigma_{02}$. The minimum mass design obviously will make maximum
use of the superior alloy by reducing the area of the second member to its minimum
gage value $A_2 = A_0$, so that $A_1 = P/\sigma_{02} - A_0$, provided that $P/\sigma_{02}$ is larger than
$2A_0$. This optimum design is not fully stressed as the stress in member 1 is only half
of the allowable and member 1 is not at minimum gage. The fully stressed design
(obtained by the stress-ratio technique which is described below) is: $A_1 = A_0$ and
$A_2 = P/\sigma_{02} - A_1$. In this design, member 2 is fully stressed and member 1 is at
minimum gage. This is, of course, an absurd design because we make minimal use
of the superior alloy and maximum use of the inferior one. For an illustration of the
effect on mass assume that
$P/\sigma_{02} = 20A_0$.
For this case the optimal design has $A_1 = 19A_0$, $A_2 = A_0$ and $m = 18.1\rho_2 A_0 l$. The
fully stressed design, on the other hand, has $A_1 = A_0$, $A_2 = 19A_0$ and $m = 19.9\rho_2 A_0 l$.
• • •
Besides the use of two materials, another essential feature of Example 9.1.1 is a
structure which is highly redundant, so that changing the area of one member has
a large effect on the stress in the other member. This example is simple enough so
that the optimum and fully-stressed designs can be found by inspection.
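The two designs of Example 9.1.1 can be compared with a few lines of arithmetic. In the sketch below areas are measured in units of $A_0$, stresses in units of $\sigma_{02}$, and mass in units of $\rho_2 A_0 l$; the function names are illustrative.

```python
def mass(a1, a2, rho1=0.9, rho2=1.0):
    """Mass in units of rho_2 * A_0 * l for areas given in units of A_0."""
    return rho1 * a1 + rho2 * a2

def stress_ratios(a1, a2, load=20.0, allow1=2.0, allow2=1.0):
    """Stress over allowable for each member; load in units of A_0 * sigma_02."""
    stress = load / (a1 + a2)   # both members carry the same stress
    return stress / allow1, stress / allow2

optimum = (19.0, 1.0)          # member 2 at minimum gage
fully_stressed = (1.0, 19.0)   # member 1 at minimum gage

for name, design in (("optimum", optimum), ("fully stressed", fully_stressed)):
    print(name, mass(*design), stress_ratios(*design))
```

Both designs satisfy the stress constraint of member 2 exactly and leave member 1 at half its allowable stress; they differ only in which member sits at minimum gage, and the fully stressed design is about 10% heavier.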
Figure 9.1.2 Ten-bar truss: (a) optimum design; (b) fully stressed design.

A more complex classical example (developed by Berke and Khot [9]) often used to
demonstrate the weakness of the FSD is the ten-bar truss shown in Figure 9.1.2. The
truss is made of aluminum (Young's modulus $E = 10^7$ psi and density $\rho = 0.1$ lb/in$^3$)
with all members having a minimum gage of 0.1 in$^2$. The yield stress is ±25000 psi
for all members except member 9. Berke and Khot have shown that for $\sigma_{09} \le 37500$
psi the optimum and FSD designs are identical, but for $\sigma_{09} \ge 37500$ psi the optimum
design weighs 1497.6 lb and member 9 is neither fully stressed nor at minimum gage.
The FSD design weighs 1725.2 lb, 15% heavier than the optimum, with member 9 at
minimum gage. The two designs are shown in Figure 9.1.2.

The FSD technique is usually complemented by a resizing algorithm based on
the assumption that the load distribution in the structure is independent of member
sizes. That is, the stress in each member is calculated, and then the member is
resized to bring the stresses to their allowable values assuming that the loads carried
by members remain constant (this is logical since the FSD criterion is based on a
similar assumption). For example, for truss structures, where the design variables are
often cross-sectional areas, the force in any member is $\sigma A$, where $\sigma$ is the axial stress
and $A$ the cross-sectional area. Assuming that $\sigma A$ is constant leads to the stress ratio
resizing technique
$$A_{new} = A_{old}\frac{\sigma}{\sigma_0}, \tag{9.1.1}$$
which gives the resized area $A_{new}$ in terms of the current area $A_{old}$, the current stress
$\sigma$, and the allowable stress $\sigma_0$. For a statically determinate truss, the assumption
that member forces are constant is exact, and Eq. (9.1.1) will bring the stress in
each member to its allowable value. If the structure is not statically determinate,
Eq. (9.1.1) has to be applied repeatedly until convergence to any desired tolerance is
achieved. Also, if $A_{new}$ obtained by Eq. (9.1.1) is smaller than the minimum gage,
the minimum gage is selected rather than the value given by Eq. (9.1.1). This
so-called stress-ratio technique is illustrated by the following example.

Example 9.1.2

For the structure of Example 9.1.1 we use the stress ratio formula and follow the
iteration history. We assume that the initial design has $A_1 = A_2 = A_0$, and that the
applied load is $p = 20A_0\sigma_{02}$. The iteration history is given in Table 9.1.1.

Table 9.1.1

Iteration   $A_1/A_0$   $A_2/A_0$   $\sigma_1/\sigma_{01}$   $\sigma_2/\sigma_{02}$
    1         1.00       1.00       5.00        10.00
    2         5.00      10.00       0.67         1.33
    3         3.33      13.33       0.60         1.20
    4         2.00      16.00       0.56         1.11
    5         1.11      17.78       0.53         1.059
    6         1.00      18.82       0.504        1.009
    7         1.00      18.99       0.500        1.0005

Convergence is fast, and if material 2 were lighter this would be the optimum
design. •••
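The iteration of Table 9.1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two members are taken to carry the same stress $\sigma = p/(A_1 + A_2)$ and the allowables are taken as $\sigma_{01} = 2\sigma_{02}$, which is consistent with the first row of the table. The function name `stress_ratio_history` and its arguments are hypothetical.

```python
def stress_ratio_history(load=20.0, allow=(2.0, 1.0), min_gage=1.0, n_iter=7):
    """Stress-ratio resizing, Eq. (9.1.1), for two members sharing one stress.

    Units: areas in A0, stresses and allowables in sigma_02, load in A0*sigma_02.
    Assumption (not stated in this section): both members see the same stress.
    """
    areas = [1.0, 1.0]                          # initial design A1 = A2 = A0
    rows = []
    for _ in range(n_iter):
        stress = load / sum(areas)              # common stress in both members
        ratios = [stress / a for a in allow]    # sigma_i / sigma_0i
        rows.append((areas[0], areas[1], ratios[0], ratios[1]))
        # Eq. (9.1.1), with the minimum gage enforced as a lower limit
        areas = [max(min_gage, area * r) for area, r in zip(areas, ratios)]
    return rows


for k, row in enumerate(stress_ratio_history(), start=1):
    print(k, " ".join(f"{value:9.4f}" for value in row))
```

Each printed row lists the iteration number, $A_1/A_0$, $A_2/A_0$, $\sigma_1/\sigma_{01}$ and $\sigma_2/\sigma_{02}$, so the output can be compared column by column with the table.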
As can be seen from Example 9.1.2 the convergence of the stress ratio technique
can be quite rapid, and this is a major attraction of the method. A more attractive
feature is that it does not require derivatives of stresses with respect to design
variables. When we have a structure with hundreds or thousands of members which
need to be individually sized, the cost of obtaining derivatives of all critical stresses
with respect to all design variables could be prohibitive. Practically all mathematical
programming algorithms require such derivatives, while the stress ratio technique

Chapter 9: Dual and Optimality Criteria Methods

does not. The FSD method is, therefore, very efficient for designing truss structures
subject only to stress constraints.
For other types of structures the stress ratio technique can be generalized by
pursuing the assumption that member forces are independent of member sizes. For
example, in thin wall construction, where only membrane stresses are important,
we would assume that $\sigma_{ij}t$ are constant, where $t$ is the local thickness and $\sigma_{ij}$ are
the membrane stress components. In such situations the stress constraint is often
expressed in terms of an equivalent stress $\sigma_e$ as
$$\sigma_e = f(\sigma_{ij}) \le \sigma_0. \tag{9.1.2}$$
For example, in a plane-stress problem the von Mises stress constraint for an isotropic
material is
$$\sigma_e^2 = \sigma_x^2 + \sigma_y^2 - \sigma_x\sigma_y + 3\tau_{xy}^2 \le \sigma_0^2. \tag{9.1.3}$$
In such a case the stress ratio technique becomes
$$t_{new} = t_{old}\frac{\sigma_e}{\sigma_0}. \tag{9.1.4}$$
In the presence of bending stresses the resizing equation is more complicated. This
is the subject of Exercise 3.
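As a small illustration of the thickness version of the stress ratio rule, the sketch below evaluates the von Mises equivalent stress of Eq. (9.1.3) and scales the thickness by $\sigma_e/\sigma_0$. The stress values, the function names, and the optional `t_min` lower limit (carried over from the minimum-gage rule for trusses) are assumptions made for the example only.

```python
import math


def von_mises(sx, sy, txy):
    """Plane-stress von Mises equivalent stress, Eq. (9.1.3)."""
    return math.sqrt(sx**2 + sy**2 - sx * sy + 3.0 * txy**2)


def resize_thickness(t_old, sx, sy, txy, sigma_allow, t_min=0.0):
    """Stress-ratio resizing of a membrane thickness.

    Assumes the stress resultants sigma_ij * t stay constant during resizing;
    t_min is a hypothetical minimum-gage limit.
    """
    return max(t_min, t_old * von_mises(sx, sy, txy) / sigma_allow)


# Hypothetical membrane stresses (psi) in a 0.1 in thick panel
stresses = (30000.0, 10000.0, 5000.0)
t_new = resize_thickness(0.1, *stresses, sigma_allow=25000.0)

# If sigma_ij * t is constant, every stress component scales by t_old / t_new
scaled = [s * 0.1 / t_new for s in stresses]
print(t_new, von_mises(*scaled))
```

Under the constant-resultant assumption the resized panel is exactly fully stressed; in a redundant structure the stresses would have to be recomputed and the step repeated.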
When the assumption that member forces remain constant is unwarranted the
stress ratio technique may converge slowly, and the FSD design may not be optimal.
This may happen when the structure is highly redundant (see Adelman et al. [10],
for example), or when loads depend on sizes (e.g., thermal loads or inertia loads).
The method can be generalized to deal with size-dependent loads (see, for example,
Adelman and Narayanaswami [11] for treatment of thermal loads), but not much can
be done to resolve the problems associated with redundancy. The combination of FSD
with the stress ratio technique is particularly inappropriate for designing structures
made of composite materials. Because composite materials are not isotropic the FSD
design may be far from optimum, and because of the redundancy inherent in the use
of composite materials, convergence can be very slow.
The success of FSD prompted extensions to optimization under displacement
constraints which became the basis of modern optimality criteria methods. Venkayya
[12] proposed a rigorous optimality criterion based on the strain energy density in the
structure. The criterion states that at the optimum design the strain energy of each
element bears a constant ratio to the strain energy capacity of the element. This was
the beginning of the more general optimality criteria methods discussed later.
The strain energy density criterion is rigorous under some conditions, but it has
also been applied to problems where it is not the exact optimality criterion. For
example, Siegel [13] used it for design subject to flutter constraints. Siegel proposed
that the strain energy density associated with the flutter mode should be constant
over the structure. In both [12] and [13] the optimality criterion was accompanied
by a simple resizing rule similar to the stress ratio technique.


9.1.2 Other Intuitive Methods

The simultaneous failure mode approach was an early design technique similar to
FSD in that it assumed that the lightest design is obtained when two or more modes
of failure occur simultaneously. It is also assumed that the failure modes that are
active at the optimum (lightest) design are known in advance.

Figure 9.1.3 Metal blade-stiffened panel with four design variables.

Consider, for example (from Stroud [14]), how this procedure is used to design
a metal blade-stiffened panel having the cross section shown in Figure 9.1.3. There
are four design variables $b_1$, $b_2$, $t_1$, $t_2$. Rules of thumb based on considerable
experience are first used to establish proportions, such as plate width-to-thickness ratios.
The establishment of these proportions eliminates two of the design variables. The
remaining two variables are then calculated by setting the overall buckling load and
local buckling load equal to the applied load. This approach results in two equations
for the two unknown design variables. The success of the method hinges on
the experience and insight of the engineer who sets the proportions and identifies
the resulting failure modes. For metal structures having conventional configurations,
insight has been gained through many tests. Limiting the proportions accomplishes
two goals: it reduces the number of design variables, and it prevents failure modes
that are difficult to analyze. This simplified design approach is, therefore, compatible
with simplified analysis capability.

9.2 Dual Methods

As noted in the introduction to this chapter, dual methods have been used to
examine the theoretical basis of some of the popular optimality criteria methods.
Historically optimality criteria methods preceded dual methods in their application
to optimum structural design. However, because of their theoretical significance we
will reverse the historical order and discuss dual methods first.


9.2.1 General Formulation

The Lagrange multipliers are often called the dual variables of the constrained
optimization problem. For linear problems the primal and dual formulations have been
presented in Chapter 3, and the role of dual variables as Lagrange multipliers is not
difficult to establish (see Exercise 1). If the primal problem is written as
$$\begin{aligned}
\text{minimize}\quad & c^T x \\
\text{subject to}\quad & Ax - b \ge 0, \\
& x \ge 0,
\end{aligned}$$
then the dual formulation in terms of the Lagrange multipliers is
$$\begin{aligned}
\text{maximize}\quad & \lambda^T b \\
\text{subject to}\quad & A^T\lambda - c \le 0, \\
& \lambda \ge 0.
\end{aligned}$$

There are several ways of generalizing the linear dual formulation to nonlinear
problems. In applications to structural optimization, the most successful has been
one due to Falk [15] as specialized to separable problems by Fleury [16].
The original optimization problem is called the primal problem and is of the form
$$\begin{aligned}
\text{minimize}\quad & f(x) \\
\text{subject to}\quad & g_j(x) \ge 0, \qquad j = 1, \ldots, n_g.
\end{aligned} \tag{9.2.1}$$
The necessary conditions for a local minimum of problem (9.2.1) at a point $x^*$ are that
there exist a vector $\lambda^*$ (with components $\lambda_1^*, \ldots, \lambda_{n_g}^*$) such that
$$g_j(x^*) \ge 0, \qquad j = 1, \ldots, n_g, \tag{9.2.2}$$
$$\lambda_j^* g_j(x^*) = 0, \qquad j = 1, \ldots, n_g, \tag{9.2.3}$$
$$\lambda_j^* \ge 0, \qquad j = 1, \ldots, n_g, \tag{9.2.4}$$
$$\frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g} \lambda_j^* \frac{\partial g_j}{\partial x_i} = 0, \qquad i = 1, \ldots, n. \tag{9.2.5}$$
Equations (9.2.3)-(9.2.5) are the Kuhn-Tucker conditions (see Chapter 5). They
naturally motivate the definition of a function $\mathcal{L}$ called the Lagrangian function
$$\mathcal{L}(x, \lambda) = f(x) - \sum_{j=1}^{n_g} \lambda_j g_j(x). \tag{9.2.6}$$
Equations (9.2.5) can then be viewed as stationarity conditions with respect to $x$ for
the Lagrangian function. Falk's dual formulation is
$$\begin{aligned}
\text{maximize}\quad & \mathcal{L}_m(\lambda) \\
\text{such that}\quad & \lambda_j \ge 0, \qquad j = 1, \ldots, n_g,
\end{aligned} \tag{9.2.7}$$

where
$$\mathcal{L}_m(\lambda) = \min_{x \in C} \mathcal{L}(x, \lambda), \tag{9.2.8}$$
and where $C$ is some closed convex set introduced to ensure the well conditioning of
the problem. For example, if we know that the solution is bounded, we may select $C$
to be
$$C = \{x : -r \le x_i \le r, \quad i = 1, \ldots, n\}, \tag{9.2.9}$$
where $r$ is a suitably large number. Under some restrictive conditions the solution of
(9.2.7) is identical to the solution of the original problem (9.2.1), and the optimum
value of $\mathcal{L}_m$ is identical to the optimum value of $f$. One set of conditions is for the
optimization problem to be convex (that is, $f(x)$ bounded and convex, and $g_j(x)$
concave), $f$ and $g_j$ to be twice continuously differentiable, and the matrix of second
derivatives of $\mathcal{L}(x, \lambda)$ with respect to $x$ to be nonsingular at $x^*$.
Under these conditions the convexity requirement also guarantees that we have
only one minimum. For the linear case the Falk dual leads to the dual formulation
discussed in Section 3.7 (Exercise 1).
In general, it does not make sense to solve (9.2.7), which is a nested optimization
problem, instead of (9.2.1) which is a single optimization problem. However, both the
maximization of (9.2.7) and the minimization of (9.2.8) are virtually unconstrained.
Under some circumstances these optimizations become very simple to execute. This
is the case when the objective function and the constraints are separable functions.

9.2.2 Application to Separable Problems

The optimization problem is called separable when both the objective function and
constraints are separable, that is
$$f(x) = \sum_{i=1}^{n} f_i(x_i), \tag{9.2.10}$$
$$g_j(x) = \sum_{i=1}^{n} g_{ji}(x_i), \qquad j = 1, \ldots, n_g. \tag{9.2.11}$$
The primal formulation does not benefit much from the separability. However, the
dual formulation does because $\mathcal{L}(x, \lambda)$ is also a separable function and can, therefore,
be minimized by a series of one-dimensional minimizations, and $\mathcal{L}_m(\lambda)$ is therefore
easy to calculate.

Example 9.2.1

Find the minimum of $f(x) = x_1^2 + x_2^2 + x_3^2$ subject to the two constraints
$$g_1(x) = x_1 + x_2 - 10 \ge 0,$$
$$g_2(x) = x_2 + 2x_3 - 8 \ge 0.$$

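Solution via dual method:
$$\mathcal{L}(x, \lambda) = x_1^2 + x_2^2 + x_3^2 - \lambda_1(x_1 + x_2 - 10) - \lambda_2(x_2 + 2x_3 - 8)
= L_1(x_1) + L_2(x_2) + L_3(x_3) + L_0, \tag{a}$$
where
$$\begin{aligned}
L_1(x_1) &= x_1^2 - \lambda_1 x_1, \\
L_2(x_2) &= x_2^2 - (\lambda_1 + \lambda_2)x_2, \\
L_3(x_3) &= x_3^2 - 2\lambda_2 x_3, \\
L_0 &= 10\lambda_1 + 8\lambda_2.
\end{aligned} \tag{b}$$
Each one of the functions $L_1$, $L_2$, $L_3$ can be minimized separately to get the minimum
of $\mathcal{L}(x, \lambda)$ with respect to $x$. The minimum of $L_1$ is found by setting its first derivative
to zero
$$2x_1 - \lambda_1 = 0,$$
so that $x_1 = \lambda_1/2$. Similarly we obtain $x_2 = (\lambda_1 + \lambda_2)/2$ and $x_3 = \lambda_2$. Substituting
back into $\mathcal{L}(x, \lambda)$ we get
$$\mathcal{L}_m(\lambda) = -0.5\lambda_1^2 - 1.25\lambda_2^2 - 0.5\lambda_1\lambda_2 + 10\lambda_1 + 8\lambda_2.$$
We now need to find the maximum of $\mathcal{L}_m(\lambda)$ subject to the constraints $\lambda_1 \ge 0$,
$\lambda_2 \ge 0$. Differentiating $\mathcal{L}_m(\lambda)$ we obtain
$$-\lambda_1 - 0.5\lambda_2 + 10 = 0, \qquad -0.5\lambda_1 - 2.5\lambda_2 + 8 = 0,$$
or
$$\lambda_1 = 9\tfrac{1}{3}, \qquad \lambda_2 = 1\tfrac{1}{3}, \qquad \mathcal{L}_m(\lambda) = 52.$$
We also have to check for a maximum along the boundary $\lambda_1 = 0$ or $\lambda_2 = 0$. If $\lambda_1 = 0$
$$\mathcal{L}_m(\lambda) = -1.25\lambda_2^2 + 8\lambda_2,$$
and this function attains its maximum for $\lambda_2 = 3.2$, $\mathcal{L}_m(\lambda) = 12.8$. For $\lambda_2 = 0$ we
get
$$\mathcal{L}_m(\lambda) = -0.5\lambda_1^2 + 10\lambda_1,$$
with the maximum attained at $\lambda_1 = 10$, $\mathcal{L}_m = 50$. We conclude that the maximum
is inside the domain. From the expressions for $x_1$, $x_2$, $x_3$ above we obtain
$$x_1 = 4\tfrac{2}{3}, \qquad x_2 = 5\tfrac{1}{3}, \qquad x_3 = 1\tfrac{1}{3}, \qquad f(x) = 52.$$
The equality of the maximum of $\mathcal{L}_m(\lambda)$ and the minimum of $f(x)$ is a useful check
that we obtained the correct solution. •••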
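The check described at the end of Example 9.2.1 is easy to reproduce numerically. The sketch below evaluates the dual function by the three one-dimensional minimizations, solves the two linear stationarity equations for the multipliers, and compares the dual and primal values. Function names are illustrative only.

```python
def x_of_lambda(lam1, lam2):
    """Minimizers of L1, L2, L3 for given multipliers."""
    return (lam1 / 2.0, (lam1 + lam2) / 2.0, lam2)


def dual_function(lam1, lam2):
    """L_m(lambda): the Lagrangian evaluated at its minimizer in x."""
    x1, x2, x3 = x_of_lambda(lam1, lam2)
    return (x1**2 + x2**2 + x3**2
            - lam1 * (x1 + x2 - 10.0)
            - lam2 * (x2 + 2.0 * x3 - 8.0))


# Stationarity of L_m:  lam1 + 0.5*lam2 = 10,  0.5*lam1 + 2.5*lam2 = 8
det = 1.0 * 2.5 - 0.5 * 0.5
lam1 = (10.0 * 2.5 - 0.5 * 8.0) / det
lam2 = (1.0 * 8.0 - 0.5 * 10.0) / det

x = x_of_lambda(lam1, lam2)
f = sum(v * v for v in x)
print("lambda =", (lam1, lam2))
print("x =", x)
print("dual value =", dual_function(lam1, lam2), " primal value =", f)
print("boundary maxima:", dual_function(0.0, 3.2), dual_function(10.0, 0.0))
```

Because both constraints are active at the optimum, the multiplier terms vanish there and the dual and primal values coincide.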


9.2.3 Discrete Design Variables

Because the dual method requires only one-dimensional minimization in the design
variable space, it has been used for cases where some of the design variables are
discrete (see Schmit and Fleury [17]). To demonstrate this approach we suppose all
the design variables are discrete. That is, the optimization problem may be written
as
$$\begin{aligned}
\text{minimize}\quad & f(x) = \sum_{i=1}^{n} f_i(x_i) \\
\text{such that}\quad & g_j(x) = \sum_{i=1}^{n} g_{ji}(x_i) \ge 0, \qquad j = 1, \ldots, n_g, \\
\text{and}\quad & x_i \in X_i, \qquad i = 1, \ldots, n.
\end{aligned} \tag{9.2.12}$$
The set $X_i = \{d_{i1}, d_{i2}, \ldots\}$ is a set of discrete values that the $i$th design variable can
take. The Lagrangian function is
$$\mathcal{L}(x, \lambda) = \sum_{i=1}^{n} L_i(x_i, \lambda), \tag{9.2.13}$$
where
$$L_i(x_i, \lambda) = f_i(x_i) - \sum_{j=1}^{n_g} \lambda_j g_{ji}(x_i), \qquad i = 1, \ldots, n. \tag{9.2.14}$$
For a given $\lambda$ we obtain $\mathcal{L}_m(\lambda)$ by minimizing each $L_i$ with respect to $x_i$ by running
through all the discrete values in the set $X_i$
$$\mathcal{L}_m(\lambda) = \sum_{i=1}^{n} \min_{x_i \in X_i} L_i(x_i, \lambda). \tag{9.2.15}$$
Note that for a given $x_i$, $L_i$ is a linear function of $\lambda$. The minimum over $x_i$ of $L_i$ is
a piecewise linear function, with the pieces joined along lines where $L_i$ has the same
value for two different values of $x_i$. If the set $X_i$ is ordered monotonically, and the
discrete values of $x_i$ are close, we can expect that these lines will be at intersections
where
$$L_i(d_{ij}, \lambda) = L_i(d_{i(j+1)}, \lambda). \tag{9.2.16}$$
Equation (9.2.16) defines boundaries in $\lambda$-space, which divide this space into regions
where $x$ is fixed to a particular choice of the discrete values. The use of these
boundaries in the solution of the dual problem is demonstrated in the following example
from Ref. [18].

Example 9.2.2

Figure 9.2.1 Two-bar truss

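For the two-bar truss shown in Figure 9.2.1, it is required to find the minimum
weight structure by selecting each of the cross-sectional areas $A_i$, $i = 1, 2$, from the
discrete set of areas
$$A = \{1, 1.5, 2\},$$
while at the same time satisfying the displacement constraints
$$u \le 0.75(FL/E), \qquad v \le 0.25(FL/E).$$
The truss is statically determinate, and the displacements are found to be
$$u = \frac{FL}{2E}\left(\frac{1}{A_1} + \frac{1}{A_2}\right), \qquad
v = \frac{FL}{2E}\left(\frac{1}{A_1} - \frac{1}{A_2}\right).$$
It is convenient to use $y_i = 1/A_i$ as design variables. Denoting the weight by $W$, and
the weight density by $\rho$, we can formulate the optimization problem as
$$\begin{aligned}
\text{minimize}\quad & \frac{W}{\rho L} = \frac{1}{y_1} + \frac{1}{y_2} \\
\text{such that}\quad & 1.5 - y_1 - y_2 \ge 0, \\
& 0.5 - y_1 + y_2 \ge 0, \\
\text{and}\quad & y_1, y_2 \in \{\tfrac{1}{2}, \tfrac{2}{3}, 1\}.
\end{aligned}$$
The Lagrangian function is
$$\mathcal{L}(y, \lambda) = L_1(y_1, \lambda) + L_2(y_2, \lambda),$$
and
$$\begin{aligned}
L_1(y_1, \lambda) &= -1.5\lambda_1 - 0.5\lambda_2 + 1/y_1 + (\lambda_1 + \lambda_2)y_1, \\
L_2(y_2, \lambda) &= 1/y_2 + (\lambda_1 - \lambda_2)y_2.
\end{aligned}$$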

The boundaries for changes in values of $y_1$, from Eq. (9.2.16), are
$$\frac{1}{1/2} + \tfrac{1}{2}(\lambda_1 + \lambda_2) = \frac{1}{2/3} + \tfrac{2}{3}(\lambda_1 + \lambda_2),$$
$$\frac{1}{2/3} + \tfrac{2}{3}(\lambda_1 + \lambda_2) = 1 + \lambda_1 + \lambda_2.$$
This yields
$$\lambda_1 + \lambda_2 = 3 \qquad \text{and} \qquad \lambda_1 + \lambda_2 = 1.5.$$
Similarly, we obtain the boundaries for changes in $y_2$ as
$$\lambda_1 - \lambda_2 = 3 \qquad \text{and} \qquad \lambda_1 - \lambda_2 = 1.5.$$

Figure 9.2.2 Regions in $(\lambda_1, \lambda_2)$ space for two-bar truss problem

These lines divide the positive quadrant of the $(\lambda_1, \lambda_2)$ plane into 6 regions with
the values of $y_1$ and $y_2$ in each region shown in Figure 9.2.2. We start the search for
the maximum of $\mathcal{L}_m$ at the origin, $\lambda = (0, 0)$. At this point $\mathcal{L}(y, \lambda) = 1/y_1 + 1/y_2$,
so that $\mathcal{L}(y, \lambda)$ is minimized by selecting the discrete values $y_1 = y_2 = 1$, as also
indicated by the figure. For the region where these values are fixed
$$\mathcal{L}_m = 2 + 0.5\lambda_1 - 0.5\lambda_2.$$
Obviously, to maximize $\mathcal{L}_m$ we increase $\lambda_1$ (we cannot reduce $\lambda_2$) until we get to the
boundary of the region at (1.5, 0) with $\mathcal{L}_m = 2.75$. We can now move into the region
where the values of $(y_1, y_2)$ are (2/3, 1), or into the region where these values are
(2/3, 2/3). In the former region
$$\mathcal{L}_m = 2.5 + \lambda_1/6 - 5\lambda_2/6,$$
and in the latter region
$$\mathcal{L}_m = 3 - \lambda_1/6 - \lambda_2/2.$$

In either region we cannot increase $\mathcal{L}_m$. Because the maximum of $\mathcal{L}_m$ is reached at a
point (1.5, 0) that belongs to three regions, we have three possibilities for the values of
the $y$'s. However, only one selection, $y^T = (2/3, 2/3)$, does not violate the constraints.
For this selection the value of the objective function is 3, which is different from the
maximum of $\mathcal{L}_m$, which was 2.75. This difference, which is called the duality gap, is
a reminder that the discrete problem is not convex, so that we are not sure that we
obtained a minimum of the objective function. For the present example we did find
one of the minima, another being (1/2, 1). •••
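The duality gap in Example 9.2.2 can be confirmed by brute force. The following sketch evaluates $\mathcal{L}_m(\lambda)$ from Eq. (9.2.15) by scanning the discrete set for each variable, searches a coarse grid in $\lambda$-space for its maximum, and enumerates all nine designs for the primal minimum. The grid spacing and range are arbitrary choices made for the illustration.

```python
Y_SET = (0.5, 2.0 / 3.0, 1.0)      # y_i = 1/A_i for A_i in {2, 1.5, 1}


def dual_function(lam1, lam2):
    """L_m(lambda), Eq. (9.2.15): each L_i is minimized over its discrete set."""
    l1 = min(-1.5 * lam1 - 0.5 * lam2 + 1.0 / y + (lam1 + lam2) * y
             for y in Y_SET)
    l2 = min(1.0 / y + (lam1 - lam2) * y for y in Y_SET)
    return l1 + l2


def feasible(y1, y2, tol=1e-12):
    """Both displacement constraints of the example, with a small tolerance."""
    return 1.5 - y1 - y2 >= -tol and 0.5 - y1 + y2 >= -tol


# Coarse grid search in the positive quadrant of lambda-space (illustrative)
grid = [0.25 * k for k in range(17)]
dual_max = max(dual_function(a, b) for a in grid for b in grid)

# Enumerate all discrete designs to get the true primal minimum
designs = [(1.0 / y1 + 1.0 / y2, y1, y2)
           for y1 in Y_SET for y2 in Y_SET if feasible(y1, y2)]
primal_min = min(weight for weight, _, _ in designs)
minimizers = [(y1, y2) for weight, y1, y2 in designs
              if abs(weight - primal_min) < 1e-9]

print("max of L_m on grid:", dual_max)
print("min weight W/(rho L):", primal_min, "at", minimizers)
print("duality gap:", primal_min - dual_max)
```

Exhaustive enumeration is practical here only because there are nine designs; for realistic problems the search in $\lambda$-space is the point of the method.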
To demonstrate that the procedure can occasionally provide a solution which is
not the minimum we repeat Example 9.2.1 with the condition that the variables
must be integers.

Example 9.2.3

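The problem formulation is now
$$\begin{aligned}
\text{minimize}\quad & f(x) = x_1^2 + x_2^2 + x_3^2 \\
\text{subject to}\quad & g_1(x) = x_1 + x_2 - 10 \ge 0, \\
& g_2(x) = x_2 + 2x_3 - 8 \ge 0, \\
\text{and}\quad & x_i \text{ are integers}, \qquad i = 1, 2, 3.
\end{aligned}$$
The Lagrangian is given by Eqs. (a) and (b) of Example 9.2.1, and the continuous
solution was obtained as
$$x_1 = 4\tfrac{2}{3}, \qquad x_2 = 5\tfrac{1}{3}, \qquad x_3 = 1\tfrac{1}{3}, \qquad f(x) = 52,$$
$$\lambda_1 = 9\tfrac{1}{3}, \qquad \lambda_2 = 1\tfrac{1}{3}, \qquad \mathcal{L}_m(\lambda) = 52.$$
We will look for the integer solution in the neighborhood of the continuous solution.
Therefore, we need the boundaries given by Eq. (9.2.16) for integer values of $x_i$ near
the continuous optimum. For $x_1$ we consider transitions between 3 and 4, and between
4 and 5. For these transitions Eq. (9.2.16), applied to $L_1$, yields
$$9 - 3\lambda_1 = 16 - 4\lambda_1 \qquad \text{and} \qquad 16 - 4\lambda_1 = 25 - 5\lambda_1,$$
or
$$\lambda_1 = 7, \qquad \lambda_1 = 9.$$
For $x_2$ we consider transitions between 4 and 5 and between 5 and 6.
Equation (9.2.16), applied to $L_2$, yields
$$\lambda_1 + \lambda_2 = 9 \qquad \text{and} \qquad \lambda_1 + \lambda_2 = 11.$$
Similarly, for $x_3$, Eq. (9.2.16) applied to $L_3$ for transitions between 0 and 1 and
between 1 and 2 gives
$$\lambda_2 = 0.5 \qquad \text{and} \qquad \lambda_2 = 1.5.$$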


Figure 9.2.3 $\lambda$-plane for Example 9.2.3. (Region labels recoverable from the figure:
$\mathcal{L} = 65 - \lambda_1 - 2\lambda_2$ for (5,6,2); $\mathcal{L} = 62 - \lambda_1$ for (5,6,1).)

These boundaries, and the values of $\mathcal{L}(x, \lambda)$ in some of the regions near the
continuous optimum are shown in Figure 9.2.3. We start the search for the optimum
at the continuous optimum values of $\lambda_1 = 9\tfrac{1}{3}$, $\lambda_2 = 1\tfrac{1}{3}$. For this region the values
of the $x_i$'s that minimize $\mathcal{L}$ are (5,5,1), and $\mathcal{L}_m = 51 + \lambda_2$. This indicates that
$\lambda_2$ should be increased. For $\lambda_2 = 1.5$ we reach the boundary of the region and
$\mathcal{L}_m = 52.5$. That value is attained for the entire boundary of the region marked
in heavy line in Figure 9.2.3. There are six adjacent regions to that boundary, and
using the expressions for $\mathcal{L}_m$ given in the figure we can check that $\mathcal{L}_m = 52.5$ is the
maximum. We now have six possible choices for the values of the $x_i$'s, as indicated
by the six regions that touch on the segment where $\mathcal{L}_m$ is maximal. The two leftmost
regions violate the first constraint, and the three bottom regions violate the second
constraint. Of the two regions that correspond to feasible designs, (5,5,2) has the lower
objective function, $f = 54$. The optimum, however, is at (4,6,1) with $f = 53$. •••
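The claims of Example 9.2.3 can be checked by direct enumeration. In the sketch below the dual function is evaluated with each $x_i$ restricted to integers, and the true integer optimum is found by brute force over a box of candidate values; the box limits are an assumption chosen wide enough to contain any design with $f \le 54$.

```python
from itertools import product


def objective(x):
    return sum(v * v for v in x)


def feasible(x):
    return x[0] + x[1] - 10 >= 0 and x[1] + 2 * x[2] - 8 >= 0


def dual_function(lam1, lam2, candidates=range(-5, 16)):
    """L_m(lambda) with each L_i minimized over integer values of x_i."""
    l1 = min(x * x - lam1 * x for x in candidates)
    l2 = min(x * x - (lam1 + lam2) * x for x in candidates)
    l3 = min(x * x - 2.0 * lam2 * x for x in candidates)
    return l1 + l2 + l3 + 10.0 * lam1 + 8.0 * lam2


# Brute-force integer optimum over an assumed search box
box = range(-10, 11)
best = min((x for x in product(box, repeat=3) if feasible(x)), key=objective)

print("L_m at continuous multipliers:", dual_function(28.0 / 3.0, 4.0 / 3.0))
print("L_m on the heavy segment:", dual_function(9.25, 1.5))
print("integer optimum:", best, "f =", objective(best))
print("dual-search design (5,5,2): f =", objective((5, 5, 2)))
```

The dual maximum of 52.5 is a lower bound on the integer optimum of 53, while the feasible design picked from the regions touching the maximal segment gives 54.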
While this example demonstrates that the method is not guaranteed to converge
to the optimum, it has been found useful in many applications. In particular, the
method has been applied extensively by Grierson and coworkers to the design of steel
frameworks using standard sections [19-21]. The reader is directed to Ref. [18] for
additional information on the implementation of automatic searches in $\lambda$-space for
the maximum, and for the case of mixed discrete and continuous variables.

9.2.4 Application with First Order Approximations

Many of the first order approximations discussed in Chapter 6 are separable. The
linear and the conservative approximations are also concave, and the reciprocal
approximation is concave in some cases. Therefore, if the objective function is convex
and separable the dual approach is attractive for the optimization of the approximate
problem. Assume, for example, that the reciprocal approximation is employed

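for the constraints, and the objective function is approximated linearly. That is, the
approximate optimization problem is
$$\begin{aligned}
\text{minimize}\quad & f(x) = f_0 + \sum_{i=1}^{n} f_i x_i \\
\text{subject to}\quad & g_j(x) = c_{0j} - \sum_{i=1}^{n} c_{ij}/x_i \ge 0, \qquad j = 1, \ldots, n_g,
\end{aligned} \tag{9.2.17}$$
where the constants $c_{ij}$ in (9.2.17) are calculated from the values of $f$ and $g_j$ and
their derivatives at a point $x_0$. That is
$$f_0 = f(x_0) - \sum_{i=1}^{n} x_{0i}\frac{\partial f}{\partial x_i}(x_0), \qquad
f_i = \frac{\partial f}{\partial x_i}(x_0), \tag{9.2.18}$$
and from Eq. (6.1.7)
$$c_{0j} = g_j(x_0) + \sum_{i=1}^{n} x_{0i}\frac{\partial g_j}{\partial x_i}(x_0), \qquad
c_{ij} = x_{0i}^2\frac{\partial g_j}{\partial x_i}(x_0). \tag{9.2.19}$$
This approximate problem is convex if all the $c_{ij}$'s are positive. Alternatively, the
problem is convex in terms of the reciprocals of the design variables if all the $f_i$'s are
positive. In either case we have a unique optimum. The Lagrangian function is now
$$\mathcal{L}(x, \lambda) = f_0 + \sum_{i=1}^{n} f_i x_i - \sum_{j=1}^{n_g} \lambda_j\left(c_{0j} - \sum_{i=1}^{n} c_{ij}/x_i\right). \tag{9.2.20}$$
The first step in the dual method is to find $\mathcal{L}_m(\lambda)$ by minimizing $\mathcal{L}(x, \lambda)$ over $x$.
Differentiating $\mathcal{L}(x, \lambda)$ with respect to $x$, we obtain
$$f_i - \sum_{j=1}^{n_g} \lambda_j c_{ij}/x_i^2 = 0, \tag{9.2.21}$$
so that
$$x_i = \left(\frac{1}{f_i}\sum_{j=1}^{n_g} \lambda_j c_{ij}\right)^{1/2}. \tag{9.2.22}$$
Substituting Eq. (9.2.22) into Eq. (9.2.20) we obtain
$$\mathcal{L}_m(\lambda) = f_0 + \sum_{i=1}^{n} f_i x_i(\lambda) - \sum_{j=1}^{n_g} \lambda_j\left(c_{0j} - \sum_{i=1}^{n} \frac{c_{ij}}{x_i(\lambda)}\right), \tag{9.2.23}$$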
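where $x_i(\lambda)$ is given by Eq. (9.2.22).
The maximization of $\mathcal{L}_m(\lambda)$ may be performed numerically, and then we need
the derivatives of $\mathcal{L}_m(\lambda)$. Using Eq. (9.2.21) we find
$$\frac{\partial \mathcal{L}_m}{\partial \lambda_j} = -c_{0j} + \sum_{i=1}^{n} c_{ij}/x_i(\lambda), \tag{9.2.24}$$
and
$$\frac{\partial^2 \mathcal{L}_m}{\partial \lambda_j \partial \lambda_k} = -\sum_{i=1}^{n} \left(c_{ij}/x_i^2\right)\frac{\partial x_i}{\partial \lambda_k}, \tag{9.2.25}$$
or, using Eq. (9.2.22)
$$\frac{\partial^2 \mathcal{L}_m}{\partial \lambda_j \partial \lambda_k} = -\frac{1}{2}\sum_{i=1}^{n} \frac{c_{ij}c_{ik}}{f_i x_i^3}. \tag{9.2.26}$$
With second derivatives so readily available it is reasonable to use Newton's method
for the maximization.
In general, when some of the $c_{ij}$'s are negative, Eq. (9.2.22) may yield imaginary
values. It is, therefore, safer to employ the conservative-convex approximation. This
is, indeed, the more common practice when the dual method is used [22].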
Example 9.2.4

The three-bar truss in Figure 9.2.4 is subjected to a combination of a horizontal and a
vertical load. For this example we assume that the vertical load is twice the horizontal
load, $p_H = p$, $p_V = 2p$. The horizontal load could be acting either to the right or to
the left, and for this reason we are seeking a symmetric design. The truss is to be
designed subject to a constraint that the displacement at the point of load application
does not exceed a value $d$ in either the horizontal or vertical directions. The design
variables are the cross-sectional areas $A_A$, $A_B$ and $A_C$ of the truss members. Because
of symmetry we assume that $A_A = A_C$. The objective $h$ selected for this example is
somewhat artificial, and is given as
$$h = A_A + 2A_B,$$
which is based on the assumption that the cost of member B is high.

Figure 9.2.4 Three-bar truss.

The constraints are
$$g_1 = 1 - u/d \ge 0,$$
$$g_2 = 1 - v/d \ge 0,$$
where $u$ and $v$ are the horizontal and vertical displacements, respectively. Assuming
that all three members have the same Young's modulus $E$, we may check that
$$u = \frac{4pl}{3EA_A}, \qquad v = \frac{2pl}{E(A_B + 0.25A_A)}.$$
We define a reference area $A_0 = 4pl/3Ed$ and normalized design variables $x_1 = A_A/A_0$,
$x_2 = A_B/A_0$, and we may now write the optimization problem as
$$\begin{aligned}
\text{minimize}\quad & f(x) = x_1 + 2x_2 \\
\text{subject to}\quad & g_1(x) = 1 - 1/x_1 \ge 0, \\
& g_2(x) = 1 - 1.5/(x_2 + 0.25x_1) \ge 0.
\end{aligned}$$
We now use the reciprocal approximation for $g_2(x)$ about an initial design point
$x_0^T = (1, 1)$
$$g_2(x_0) = -0.2,$$
$$\frac{\partial g_2}{\partial x_1}(x_0) = \left.\frac{0.375}{(x_2 + 0.25x_1)^2}\right|_{x_0} = 0.24,$$
$$\frac{\partial g_2}{\partial x_2}(x_0) = \left.\frac{1.5}{(x_2 + 0.25x_1)^2}\right|_{x_0} = 0.96,$$
so the reciprocal approximation $g_{2R}$ is
$$g_{2R}(x) = -0.2 + 0.24(x_1 - 1)/x_1 + 0.96(x_2 - 1)/x_2 = 1 - 0.24/x_1 - 0.96/x_2.$$
The approximate problem is
$$\begin{aligned}
\text{minimize}\quad & f = x_1 + 2x_2 \\
\text{subject to}\quad & g_1(x) = 1 - 1/x_1 \ge 0, \\
& g_{2R}(x) = 1 - 0.24/x_1 - 0.96/x_2 \ge 0.
\end{aligned}$$
In the notation of Eq. (9.2.17)
$$f_1 = 1, \quad f_2 = 2, \quad c_{11} = 1, \quad c_{21} = 0, \quad c_{12} = 0.24, \quad c_{22} = 0.96, \quad c_{01} = c_{02} = 1.$$
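$\mathcal{L}_m(\lambda)$ is maximized here using Newton's method, starting with an initial guess of
$\lambda_0^T = (1, 1)$. Then from Eq. (9.2.22)
$$x_1 = (1.24)^{1/2} = 1.113, \qquad x_2 = (0.48)^{1/2} = 0.693.$$
From Eq. (9.2.24)
$$\frac{\partial \mathcal{L}_m}{\partial \lambda_1} = -1 + \frac{1}{1.113} = -0.1015,$$
$$\frac{\partial \mathcal{L}_m}{\partial \lambda_2} = -1 + \frac{0.24}{1.113} + \frac{0.96}{0.693} = 0.60,$$
and from Eq. (9.2.26)
$$\frac{\partial^2 \mathcal{L}_m}{\partial \lambda_1^2} = -\frac{1}{2}\left(\frac{1}{1.113^3}\right) = -0.3626,$$
$$\frac{\partial^2 \mathcal{L}_m}{\partial \lambda_1 \partial \lambda_2} = -\frac{1}{2}\left(\frac{0.24}{1.113^3}\right) = -0.0870,$$
$$\frac{\partial^2 \mathcal{L}_m}{\partial \lambda_2^2} = -\frac{1}{2}\left(\frac{0.24^2}{1.113^3} + \frac{0.96^2}{2 \times 0.693^3}\right) = -0.7132.$$
Using Newton's method for maximizing $\mathcal{L}_m$, we have
$$\lambda_1 = \begin{Bmatrix} 1 \\ 1 \end{Bmatrix}
- \begin{bmatrix} -0.3626 & -0.0870 \\ -0.0870 & -0.7132 \end{bmatrix}^{-1}
\begin{Bmatrix} -0.1015 \\ 0.60 \end{Bmatrix}
= \begin{Bmatrix} 0.503 \\ 1.903 \end{Bmatrix},$$
so that
$$x_1 = (0.503 + 0.24 \times 1.903)^{1/2} = 0.980,$$
$$x_2 = (0.48 \times 1.903)^{1/2} = 0.956.$$
One additional iteration of Newton's method yields $\lambda_1 = 0.356$, $\lambda_2 = 2.85$, $x_1 = 1.02$,
$x_2 = 1.17$. We can check on the convergence by noting that the two Lagrange
multipliers are positive, so that we expect both constraints to be critical. Setting
$g_1(x) = 0$ and $g_{2R}(x) = 0$, we obtain $x_1 = 1$, $x_2 = 1.263$, $f = 3.526$ as the optimum
design for the approximate problem. Newton's method appears to converge quite
rapidly. The optimum of the original problem can be found by setting $g_1(x) = 0$ and
$g_2(x) = 0$ to obtain $x_1 = 1$, $x_2 = 1.25$, $f = 3.5$. •••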
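The Newton iteration of Example 9.2.4 can be sketched as follows, using Eqs. (9.2.22), (9.2.24) and (9.2.26) with the constants listed for the approximate problem. This is an illustration rather than a production implementation: it has no safeguard against multipliers for which Eq. (9.2.22) would require the square root of a negative number, and, because it carries full precision, later iterates need not match the rounded values quoted in the example digit for digit.

```python
import math

F = (1.0, 2.0)                  # f_1, f_2
C0 = (1.0, 1.0)                 # c_01, c_02
C = ((1.0, 0.24),               # c_11, c_12  (row i = variable, column j = constraint)
     (0.0, 0.96))               # c_21, c_22


def x_of_lambda(lam):
    """Eq. (9.2.22): design variables that minimize the Lagrangian."""
    return [math.sqrt(sum(l * c for l, c in zip(lam, C[i])) / F[i])
            for i in range(2)]


def gradient(lam):
    """Eq. (9.2.24)."""
    x = x_of_lambda(lam)
    return [-C0[j] + sum(C[i][j] / x[i] for i in range(2)) for j in range(2)]


def hessian(lam):
    """Eq. (9.2.26)."""
    x = x_of_lambda(lam)
    return [[-0.5 * sum(C[i][j] * C[i][k] / (F[i] * x[i] ** 3) for i in range(2))
             for k in range(2)] for j in range(2)]


def newton_step(lam):
    """One Newton step, lam - H^{-1} g, with an explicit 2x2 inverse."""
    g, h = gradient(lam), hessian(lam)
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    d0 = (h[1][1] * g[0] - h[0][1] * g[1]) / det
    d1 = (h[0][0] * g[1] - h[1][0] * g[0]) / det
    return [lam[0] - d0, lam[1] - d1]


lam = [1.0, 1.0]
for it in range(1, 9):
    lam = newton_step(lam)
    print(it, [round(v, 4) for v in lam], [round(v, 4) for v in x_of_lambda(lam)])
```

The iterates can be compared with the closed-form optimum of the approximate problem, $x_1 = 1$, $x_2 = 0.96/0.76$.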
Because dual methods operate in the space of Lagrange multipliers they are
particularly powerful when the number of constraints is small compared to the number
of design variables. The same is true for optimality criteria methods, which are
discussed next. These methods are indeed exceptional when we have only a single critical
constraint.

9.3 Optimality Criteria Methods for a Single Constraint

Optimality criteria methods originated with the work of Prager and his co-workers
(e.g., [23]) for distributed parameter systems, and in the work of Venkayya, Khot,
and Reddy ([24]) for discrete systems. They formulated optimality criteria such as the
uniform energy distribution criterion discussed earlier. Later the discrete optimality
criteria were generalized by Berke, Venkayya, Khot, and others (e.g., [25]-[27]) to
deal with general displacement constraints. The discussion here is limited to discrete
optimality criteria, and it is based to a large extent on Refs. [28] and [29]. The reader
interested in distributed optimality criteria is referred to a textbook by Rozvany [30],
who has contributed extensively to this field.
Optimality criteria methods are typically based on a rigorous optimality criterion
derived from the Kuhn-Tucker conditions, and a resizing rule which is heuristic.
Usually the resizing rule can be shown to be based on an assumption that the internal
loads in the structure are insensitive to the resizing process. This is the same
assumption that underlies the FSD approach and the accompanying stress-ratio resizing rule.
This assumption turns out to be equivalent in many cases to the assumption that the
reciprocal approximation is a good approximation for displacement constraints. This
connection between optimality criteria methods and the reciprocal approximation is
useful for a better understanding of the relationship between optimality criteria
methods and mathematical programming methods, and is discussed in the next section.

9.3.1 The Reciprocal Approximation for a Displacement Constraint

We start by showing that for some structural design problems the assumption of
constant internal loads is equivalent to the use of the reciprocal approximation for
the displacements. The equations of equilibrium of the structure are written as
$$Ku = f, \tag{9.3.1}$$
where $K$ is the stiffness matrix of the structure, $u$ is the displacement vector, and $f$
is the load vector.
Because the reciprocal approximation is used extensively in the following, we
introduce a vector $y$ of reciprocal variables, $y_i = 1/x_i$, $i = 1, \ldots, n$. The displacement
constraint is written in terms of the reciprocal design variables as
$$g(u, y) = \bar{z} - z^T u \ge 0, \tag{9.3.2}$$
where $\bar{z}$ is a displacement allowable, and $z^T u$ is a linear combination of the
displacement components. The reciprocal approximation is particularly appropriate for a
special class of structures defined by a stiffness matrix which is a linear homogeneous
function of the design variables $x_i$ (e.g., truss structures with cross-sectional areas
being design variables)
$$K = \sum_{i=1}^{n} K_i x_i = \sum_{i=1}^{n} K_i/y_i. \tag{9.3.3}$$
We also assume that the load is independent of the design variables. Under the above
conditions we will show that
$$g(u) = \bar{z} + \sum_{i=1}^{n} y_i\frac{\partial g}{\partial y_i}. \tag{9.3.4}$$
That is, what appears to be a first order approximation of $g$ is actually exact.
Equation (9.3.4) does not imply that the constraint is a linear function of the design
variables because $\partial g/\partial y_i$ depends on the design variables. To prove Eq. (9.3.4) we
use Eq. (7.2.8) for the derivative of a constraint, replacing $x$ by $y_i$. As the load vector
is independent of the design variables, Eq. (7.2.8) yields
$$\frac{\partial g}{\partial y_i} = -\lambda^T\left(\frac{\partial K}{\partial y_i}\right)u, \tag{9.3.5}$$
where $\lambda$ is obtained by solving Eq. (7.2.7). (Note, however, that in Eq. (7.2.7) $z$ is a
vector with components equal to $\partial g/\partial u_j$, while here $z$ is the negative of this vector,
see Eq. (9.3.2), so we have to replace $z$ by $-z$ in the solution for $\lambda$.) Also from Eq.
(9.3.3)
$$\frac{\partial K}{\partial y_i} = -\frac{K_i}{y_i^2}. \tag{9.3.6}$$
Using Eqs. (7.2.7), (9.3.3), (9.3.5) and (9.3.6) and the symmetry of $K$ we get
$$\sum_{i=1}^{n} \frac{\partial g}{\partial y_i}y_i = \sum_{i=1}^{n} \lambda^T\frac{K_i}{y_i^2}u\,y_i
= \lambda^T\left(\sum_{i=1}^{n} K_i/y_i\right)u = \lambda^T K u = -z^T u. \tag{9.3.7}$$
From Eqs. (9.3.7) and (9.3.2) we can see that Eq. (9.3.4) is indeed correct.
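Equation (9.3.4) can be verified numerically on a small redundant system. The sketch below is a hypothetical example, not one from the text: three springs in a chain between two supports, with two free displacements, stiffnesses $x_i = 1/y_i$, and made-up values for the load, $z$ and $\bar{z}$. The derivative $\partial g/\partial y_i$ is computed by the adjoint formula of Eqs. (9.3.5) and (9.3.6) and cross-checked by central finite differences.

```python
def solve2(k, b):
    """Solve a 2x2 linear system k x = b by Cramer's rule."""
    det = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    return [(b[0] * k[1][1] - k[0][1] * b[1]) / det,
            (k[0][0] * b[1] - b[0] * k[1][0]) / det]


# Hypothetical element matrices K_i for a support-spring-node-spring-node-spring-support chain
K_PARTS = [[[1.0, 0.0], [0.0, 0.0]],
           [[1.0, -1.0], [-1.0, 1.0]],
           [[0.0, 0.0], [0.0, 1.0]]]
LOAD = [1.0, 2.0]      # load vector f (assumed)
Z = [0.0, 1.0]         # constraint on the second displacement (assumed)
Z_BAR = 1.0            # displacement allowable (assumed)


def stiffness(y):
    """Eq. (9.3.3): K = sum_i K_i / y_i."""
    return [[sum(ki[r][c] / yi for ki, yi in zip(K_PARTS, y)) for c in range(2)]
            for r in range(2)]


def constraint(y):
    """Eq. (9.3.2): g = z_bar - z^T u."""
    u = solve2(stiffness(y), LOAD)
    return Z_BAR - (Z[0] * u[0] + Z[1] * u[1])


def constraint_gradient(y):
    """Eqs. (9.3.5)-(9.3.6): dg/dy_i = lambda^T K_i u / y_i^2, with K lambda = -z."""
    k = stiffness(y)
    u = solve2(k, LOAD)
    lam = solve2(k, [-Z[0], -Z[1]])
    grads = []
    for ki, yi in zip(K_PARTS, y):
        ku = [ki[0][0] * u[0] + ki[0][1] * u[1], ki[1][0] * u[0] + ki[1][1] * u[1]]
        grads.append((lam[0] * ku[0] + lam[1] * ku[1]) / yi**2)
    return grads


y = [0.5, 2.0, 1.25]
grads = constraint_gradient(y)
print("g(y)                   =", constraint(y))
print("z_bar + sum y_i dg/dy_i =", Z_BAR + sum(yi * gi for yi, gi in zip(y, grads)))
```

The two printed numbers agree even though the chain is statically indeterminate, which illustrates that Eq. (9.3.4) requires only the homogeneity of $K$ and size-independent loads.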
Equation (9.3.4) motivates the use of the reciprocal approximation for
displacement constraints. For statically determinate structures, under the assumptions used
to prove Eq. (9.3.4), we have an even stronger result: the derivatives $\partial g/\partial y_i$ are
constant, so that the reciprocal approximation is exact. We prove this assertion by
showing that if internal loads in the structure are independent of the design variables
then $\partial g/\partial y_i$ are constant. The internal loads in a statically determinate structure are,
of course, independent of design variables which control stiffness but not geometry
or loads.
We consider $K_i/y_i$ in Eq. (9.3.3) to be the contribution of the part of the structure
controlled by the $i$th design variable to the total stiffness matrix $K$. The forces acting
on that part of the structure are $f_i$
$$f_i = \frac{K_i}{y_i}u. \tag{9.3.8}$$
If the $i$th part of the structure is constrained against rigid body motion, the same
forces will be obtained from a reduced stiffness matrix $K_i'$ and a reduced displacement
vector $u_i'$
$$f_i = \frac{K_i' u_i'}{y_i}, \tag{9.3.9}$$
where $K_i'$ is obtained from $K_i$ by enforcing rigid body motion constraints, and $u_i'$ is
obtained from $u$ by removing the components of rigid body motion from the part
of $u$ which pertains to the $i$th part of the structure. Under these conditions $K_i'$ is
invertible, so that
$$u_i' = y_i\bar{u}_i, \tag{9.3.10}$$
where
$$\bar{u}_i = (K_i')^{-1}f_i. \tag{9.3.11}$$
Using Eqs. (9.3.6), (9.3.8) and (9.3.9), we now write Eq. (9.3.5) as
$$\frac{\partial g}{\partial y_i} = \frac{\lambda^T K_i u}{y_i^2} = \frac{\lambda^T K_i'}{y_i}\bar{u}_i. \tag{9.3.12}$$
The vector $\lambda^T K_i'/y_i$ is the internal force vector due to the (dummy) load $z$ (see Eq.
(7.2.7)), and is constant if we assume that internal forces are independent of design
variables. Also $f_i$ in Eq. (9.3.9) is constant, and so is $\bar{u}_i$ from Eq. (9.3.11). Therefore,
finally, from Eq. (9.3.12) $\partial g/\partial y_i$ is constant.
We will now consider the use of optimality criteria methods for a single
displacement constraint, based on the reciprocal approximation.
We will now consider the use of optimality criteria methods for a single displace-
ment constraint, based on the reciprocal approximation.

9.3.2 A Single Displacement Constraint

Because of the special properties of the reciprocal approximation for displacement
constraints, we pose the optimization problem in terms of reciprocal variables as
$$\begin{aligned}
\text{minimize}\quad & f(y) \\
\text{subject to}\quad & g(y) \ge 0.
\end{aligned} \tag{9.3.13}$$
For this problem, the Kuhn-Tucker condition is
$$\frac{\partial f}{\partial y_i} - \lambda\frac{\partial g}{\partial y_i} = 0, \qquad i = 1, \ldots, n. \tag{9.3.14}$$
In many cases the objective function is linear or almost linear in terms of the original
design variables $x_i$, and since $y_i = 1/x_i$, Eq. (9.3.14) is rewritten as
$$x_i^2\frac{\partial f}{\partial x_i} + \lambda\frac{\partial g}{\partial y_i} = 0, \tag{9.3.15}$$
so that
$$x_i = \left(-\lambda\frac{\partial g/\partial y_i}{\partial f/\partial x_i}\right)^{1/2}, \qquad i = 1, \ldots, n. \tag{9.3.16}$$
The Lagrange multiplier $\lambda$ is obtained from the requirement that the constraint
remains active (with a single inequality displacement constraint we can usually assume
that it is active). Setting the reciprocal approximation of the constraint to zero we
have
$$g_R = g(y_0) + \sum_{i=1}^{n} \frac{\partial g}{\partial y_i}(y_i - y_{0i}) = c_0 + \sum_{i=1}^{n} \frac{\partial g}{\partial y_i}\frac{1}{x_i} = 0, \tag{9.3.17}$$
where
$$c_0 = g(y_0) - \sum_{i=1}^{n} \frac{\partial g}{\partial y_i}y_{0i}. \tag{9.3.18}$$
Substituting from Eq. (9.3.16) into Eq. (9.3.17) we obtain
$$\lambda = \left[\frac{1}{c_0}\sum_{i=1}^{n}\left(-\frac{\partial f}{\partial x_i}\,\frac{\partial g}{\partial y_i}\right)^{1/2}\right]^2. \tag{9.3.19}$$
Equations (9.3.19) and (9.3.16) can now be used as an iterative resizing algorithm
for the structure.


The process starts with the calculation of the displacement constraint and its
derivatives, then $\lambda$ is calculated from Eq. (9.3.19) and the new sizes from Eq. (9.3.16).
The iteration is repeated, and if $\partial f/\partial x_i$ and $\partial g/\partial y_i$ are not too volatile the process
converges.
In most practical design situations we also have lower and upper limits on design
variables besides the displacement constraint, and the resizing algorithm must be
modified slightly. First, Eq. (9.3.16) is supplemented by the lower and upper bounds,
so that if it violates these limits the offending design variable is set at its limit. Second,
the set of design variables which are at their lower or upper limits is called the passive
set and denoted $I_p$, while the set including the rest of the variables is called the active
set and denoted $I_a$. Equation (9.3.17) is now written as
$$c_0^* + \sum_{i\in I_a}\frac{\partial g}{\partial y_i}\,\frac{1}{x_i} = 0, \tag{9.3.20}$$
where
$$c_0^* = c_0 + \sum_{i\in I_p}\frac{\partial g}{\partial y_i}\,\frac{1}{x_i}. \tag{9.3.21}$$
Equation (9.3.19) for $\lambda$ is similarly modified to
$$\lambda = \left[\frac{1}{c_0^*}\sum_{i\in I_a}\left(-\frac{\partial f}{\partial x_i}\,\frac{\partial g}{\partial y_i}\right)^{1/2}\right]^2. \tag{9.3.22}$$
The resizing process described in this section does not have step-size control. That
is, Eq. (9.3.16) could possibly result in a very large change in the design variables
from the initial values used to calculate the derivatives of f and g. The process
can be modified to have control on the amount of change in the design variables, as
discussed in the following sections. Including such control, Khot [28] showed that the
optimality criterion method discussed here can be made completely equivalent to the
gradient projection method applied together with the reciprocal approximation.
Example 9.3.1

We repeat example 9.2.2 with only a single displacement constraint on the vertical
displacement. Using the normalized design variables, we pose the mathematical
formulation of the problem as
$$\begin{aligned} &\text{minimize} && f(x) = x_1 + 2x_2 \\ &\text{subject to} && g(x) = 1 - \frac{1.5}{x_2 + 0.25x_1} \ge 0. \end{aligned}$$
We also add minimum gage requirements that $x_1 \ge 0.5$ and $x_2 \ge 0.5$.
The derivatives required for the resizing process are

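$$\frac{\partial f}{\partial x_1} = 1, \qquad \frac{\partial f}{\partial x_2} = 2,$$
$$\frac{\partial g}{\partial y_1} = -x_1^2\,\frac{\partial g}{\partial x_1} = -\frac{0.375\,x_1^2}{(x_2 + 0.25x_1)^2}, \qquad
\frac{\partial g}{\partial y_2} = -x_2^2\,\frac{\partial g}{\partial x_2} = -\frac{1.5\,x_2^2}{(x_2 + 0.25x_1)^2},$$
$$c_0 = g(y) - \frac{\partial g}{\partial y_1}\,y_1 - \frac{\partial g}{\partial y_2}\,y_2
= 1 - \frac{1.5}{x_2 + 0.25x_1} + \frac{0.375\,x_1}{(x_2 + 0.25x_1)^2} + \frac{1.5\,x_2}{(x_2 + 0.25x_1)^2} = 1.$$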

We start with an initial design $x_0 = (1, 1)^T$, and the iterative process is summarized
in Table 9.3.1.

Table 9.3.1
                                                           Eq. (9.3.16)
x1      x2       ∂g/∂y1     ∂g/∂y2     c0*      λ         x1       x2
1.0     1.0      -0.24      -0.96      1.0      3.518     0.92     1.30
0.92    1.30     -0.136     -1.083     1.0      3.387     0.68     1.35
0.68    1.35     -0.0751    -1.183     1.0      3.284     0.496    1.39
0.50    1.39     -0.0408    -1.263     0.918    2.997     0.350    1.376
0.50    1.376    -0.0416    -1.261     0.917    2.999     0.353    1.375

The design converged fast to $x_1 = 0.5$ (lower bound) and $x_2 = 1.375$ even though
the derivative of the constraint with respect to $y_1$ is far from constant. The large
variation in the derivative with respect to $y_1$ is due to the fact that the three bar
truss is highly redundant. This statement that one extra member constitutes high
redundancy may seem curious, but what we have here is a structure with 50% more
members than needed. •••
As can be seen from the example, the optimality criteria approach to the single
constraint problem works beautifully. Indeed, it is difficult to find a more suitable
method for dealing with this class of problems.
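The resizing loop of Example 9.3.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function name `resize` and the rule "a variable sitting at its lower bound is passive" are assumptions made here, and only the standard library is used.

```python
import math

def resize(x, lower):
    """One resizing step for Example 9.3.1 (hypothetical helper).

    Objective f = x1 + 2*x2, constraint g = 1 - 1.5/(x2 + 0.25*x1) >= 0.
    Assumption: variables currently at their lower bound form the passive set.
    """
    df_dx = [1.0, 2.0]
    d = x[1] + 0.25 * x[0]
    g = 1.0 - 1.5 / d
    dg_dx = [0.375 / d**2, 1.5 / d**2]
    # Reciprocal variables y_i = 1/x_i, so dg/dy_i = -x_i^2 * dg/dx_i.
    dg_dy = [-xi**2 * gi for xi, gi in zip(x, dg_dx)]
    # Eq. (9.3.18): c0 = g - sum(dg/dy_i * y_i), with y_i = 1/x_i.
    c0 = g - sum(gy / xi for gy, xi in zip(dg_dy, x))
    passive = [xi <= lo for xi, lo in zip(x, lower)]
    # Eq. (9.3.21): fold the passive terms into the constant.
    c0_star = c0 + sum(gy / xi for gy, xi, p in zip(dg_dy, x, passive) if p)
    # Eq. (9.3.22): multiplier from the active variables only.
    s = sum(math.sqrt(-fx * gy)
            for fx, gy, p in zip(df_dx, dg_dy, passive) if not p)
    lam = (s / c0_star) ** 2
    # Eq. (9.3.16), followed by the minimum gage limits.
    x_new = [max(math.sqrt(-lam * gy / fx), lo)
             for gy, fx, lo in zip(dg_dy, df_dx, lower)]
    return x_new, lam

x = [1.0, 1.0]
lower = [0.5, 0.5]
history = []
for _ in range(10):
    x, lam = resize(x, lower)
    history.append((x, lam))
```

Because the bound is re-applied with `max` after every step, a passive variable is released automatically if Eq. (9.3.16) ever returns a value above its minimum gage.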

9.3.3 Generalization for Other Constraints

The optimality criteria approach discussed in the previous section is very similar to
the dual method. In particular, Eq. (9.2.22) is a special case of Eq. (9.3.16). While
the derivations in the previous section were motivated by the match between
displacement constraints and the reciprocal approximation, they are clearly suitable for any
constraint that is reasonably approximated by the reciprocal approximation. In the
present section we generalize the approach of the previous section, and demonstrate
its application to more general constraints.

The optimality criterion for a single constraint
$$\frac{\partial f}{\partial x_i} - \lambda\frac{\partial g}{\partial x_i} = 0, \qquad i = 1,\dots,n, \tag{9.3.23}$$
may be written as
$$\lambda = \frac{\partial f}{\partial x_i}\bigg/\frac{\partial g}{\partial x_i}, \qquad i = 1,\dots,n. \tag{9.3.24}$$
The right-hand-side of Eq. (9.3.24) is a measure of the cost effectiveness of the ith
design variable in affecting the constraint. The denominator measures the effect of $x_i$
on the constraint, and the numerator measures the cost associated with it. Equation
(9.3.24) tells us that at the optimum all design variables are equally cost effective in
changing the constraint. Away from the optimum some design variables may be more
effective than others. A reasonable resizing technique is to increase the utilization of
the more effective variables and decrease that of the less effective ones. For example,
in the simple case where $x_i$, $\partial f/\partial x_i$ and $\partial g/\partial x_i$ are all positive, a possible resizing
rule is
$$x_i^{new} = x_i^{old}\,(\lambda e_i)^{1/\eta}, \tag{9.3.25}$$
where
$$e_i = \frac{\partial g/\partial x_i}{\partial f/\partial x_i} \tag{9.3.26}$$
is the effectiveness of the ith variable and $\eta$ is a step size parameter. A large value of
$\eta$ results in small changes to the design variables, which is appropriate for problems
where derivatives change fast. A small value of $\eta$ can accelerate convergence when
derivatives are almost constant, but can cause divergence otherwise. To estimate the
Lagrange multiplier we can require the constraint to be critical at the resized design.
Using the reciprocal approximation, Eq. (9.3.17), and substituting into it $x_i$ from
Eq. (9.3.25), we get
$$\lambda = \left[\frac{1}{c_0}\sum_{i=1}^{n} x_i\,\frac{\partial g}{\partial x_i}\,e_i^{-1/\eta}\right]^{\eta}, \tag{9.3.27}$$

with Co obtained from Eq. (9.3.18). A resizing rule of this type is used in the FASTOP
program [31] for the design of wing structures subject to a flutter constraint.
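The expression for $\lambda$ in Eq. (9.3.27) follows from three substitutions. A sketch of the algebra, assuming the derivatives are evaluated at the old design and $y_i = 1/x_i$:

```latex
\begin{align*}
0 &= c_0 + \sum_{i=1}^{n}\frac{\partial g}{\partial y_i}\,\frac{1}{x_i^{\mathrm{new}}},
  \qquad \frac{\partial g}{\partial y_i} = -x_i^2\,\frac{\partial g}{\partial x_i},
  \qquad x_i^{\mathrm{new}} = x_i\,(\lambda e_i)^{1/\eta}, \\
c_0 &= \lambda^{-1/\eta}\sum_{i=1}^{n} x_i\,\frac{\partial g}{\partial x_i}\,e_i^{-1/\eta}, \\
\lambda &= \left[\frac{1}{c_0}\sum_{i=1}^{n} x_i\,\frac{\partial g}{\partial x_i}\,e_i^{-1/\eta}\right]^{\eta}.
\end{align*}
```

The first line is the reciprocal approximation of Eq. (9.3.17) written at the resized design; the second collects the common factor $\lambda^{-1/\eta}$.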

Example 9.3.2

A container with an open top needs to have a minimum volume of 125 m$^3$. The cost
of the sides of the container is \$10/m$^2$, while the ends and the bottom cost \$15/m$^2$.
Find the optimum dimensions of the container.
We denote the width, length and height of the container as $x_1$, $x_2$, and $x_3$,
respectively. The design problem can be formulated then as
$$\begin{aligned} &\text{minimize} && f = 20x_2x_3 + 30x_1x_3 + 15x_1x_2 \\ &\text{such that} && g = x_1x_2x_3 - 125 \ge 0. \end{aligned}$$

The $e_i$'s for the three variables are given as
$$e_1 = \frac{x_2x_3}{30x_3 + 15x_2}, \qquad
e_2 = \frac{x_1x_3}{20x_3 + 15x_1}, \qquad
e_3 = \frac{x_1x_2}{20x_2 + 30x_1}.$$

We start with an initial design of a cube $x_1 = x_2 = x_3 = 5$ m, $f = \$1625$, and obtain
$\partial g/\partial x_1 = \partial g/\partial x_2 = \partial g/\partial x_3 = 25$, $c_0 = 375$ and $e_1 = 1/9$, $e_2 = 1/7$, $e_3 = 1/10$.
Selecting $\eta = 2$ we obtain from Eq. (9.3.27) $\lambda = 8.62$, and using Eq. (9.3.25) we get
$$x_1 = 5(8.62/9)^{1/2} = 4.893,$$
$$x_2 = 5(8.62/7)^{1/2} = 5.549,$$
$$x_3 = 5(8.62/10)^{1/2} = 4.642.$$
For the new values of the design variables we obtain $f = 1604$, $g = 1.04$, $e_1 = 0.1158$,
$e_2 = 0.1366$, $e_3 = 0.1053$, and $\lambda = 8.413$. The next iteration is then
$$x_1 = 4.893(8.413 \times 0.1158)^{1/2} = 4.829,$$
$$x_2 = 5.549(8.413 \times 0.1366)^{1/2} = 5.949,$$
$$x_3 = 4.642(8.413 \times 0.1053)^{1/2} = 4.370.$$

Finally, for these values of the design variables the effectivenesses are $e_1 = 0.1180$,
$e_2 = 0.1320$ and $e_3 = 0.1089$, $g = 0.54$, and $f = 1584$. We see that the maximum
difference between the $e_i$'s which started at 43 percent is now 21 percent. By continuing
the iterative process we find that the optimum design is $x_1 = 4.8075$, $x_2 = 7.2112$,
$x_3 = 3.6056$ and $f = 1560$. At the optimum $e_1 = e_2 = e_3 = 0.120$. Even though the
design variables change much from the values we obtained after two iterations, the
objective function changed by less than two percent, which was expected in view of
the close initial values of the $e_i$'s. •••
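The two hand iterations of Example 9.3.2 can be checked with a short script. The sketch below is illustrative only; the names `container_step` and `cost` are invented here, and the formulas are those of Eqs. (9.3.18) and (9.3.25)-(9.3.27) as used in the example.

```python
def container_step(x, eta=2.0):
    """One resizing step for the open-top container (hypothetical helper)."""
    x1, x2, x3 = x
    g = x1 * x2 * x3 - 125.0
    dg = [x2 * x3, x1 * x3, x1 * x2]                      # dg/dx_i
    df = [30 * x3 + 15 * x2, 20 * x3 + 15 * x1, 20 * x2 + 30 * x1]
    e = [dgi / dfi for dgi, dfi in zip(dg, df)]           # Eq. (9.3.26)
    # Eq. (9.3.18) with y_i = 1/x_i: c0 = g + sum(x_i * dg/dx_i).
    c0 = g + sum(xi * dgi for xi, dgi in zip(x, dg))
    s = sum(xi * dgi * ei ** (-1.0 / eta) for xi, dgi, ei in zip(x, dg, e))
    lam = (s / c0) ** eta                                 # Eq. (9.3.27)
    x_new = [xi * (lam * ei) ** (1.0 / eta)               # Eq. (9.3.25)
             for xi, ei in zip(x, e)]
    return x_new, lam, e

def cost(x):
    """Objective f = 20*x2*x3 + 30*x1*x3 + 15*x1*x2."""
    x1, x2, x3 = x
    return 20 * x2 * x3 + 30 * x1 * x3 + 15 * x1 * x2

start = [5.0, 5.0, 5.0]
first, lam1, e_start = container_step(start)
second, lam2, e_first = container_step(first)
```

Calling `container_step` repeatedly with different values of `eta` is one way to explore the trade-off between step size and stability described for the parameter $\eta$.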

9.3.4 Scaling-based Resizing

As noted in the previous section, Eq. (9.3.24) indicates that at the optimum all design
variables (which are not at their lower or upper bounds) are equally cost effective,
and that their cost effectiveness is equal to $1/\lambda$. It is possible, therefore, to estimate
$\lambda$ as an average of the reciprocal of the cost effectivenesses. Venkayya [29] proposed
to estimate $\lambda$ as
$$\lambda = \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} a_i e_i}, \tag{9.3.28}$$
where the $a_i$ represent some suitable weights (such as $\partial f/\partial x_i$). Equation (9.3.28) can
then be used in conjunction with a resizing rule, such as Eq. (9.3.25).
Unfortunately, the combination of Eq. (9.3.28) with a resizing rule does not
contain any mechanism for keeping the constraint active, and so the iterative process
will tend to drift either into the feasible or infeasible domains. Therefore, an estimate
of $\lambda$ from Eq. (9.3.28) must be accompanied by an additional step to ensure that the
design remains at the constraint boundary. One simple mechanism, used extensively
with optimality criteria formulations, is that of design variable scaling. One reason
for the popularity of scaling is that for the simple case represented by Eq. (9.3.3)
it is very easy to accomplish. It is easy to check from Eqs. (9.3.1) and (9.3.3) that
scaling the design variable vector by a scalar $\alpha$ to $\alpha x$ scales the displacement vector
to $(1/\alpha)u$. Venkayya [29] proposed the following procedure for the more general case.
Consider a constraint $g$ of the form
$$g(x) = \bar{z} - z(x) \ge 0, \tag{9.3.29}$$
where $z(x)$ represents some response quantity such as a displacement component,
and $\bar{z}$ is an upper bound for $z$. If at the current design $g \ne 0$, we would like to find
$\alpha$ so that
$$z(\alpha x) = \bar{z}. \tag{9.3.30}$$
Approximating $z(\alpha x)$ linearly about $x$ we get
$$z(\alpha x) \approx z(x) + \sum_{i=1}^{n}\frac{\partial z}{\partial x_i}(\alpha - 1)x_i = \bar{z}, \tag{9.3.31}$$
or
$$\alpha = 1 + \frac{\bar{z} - z}{\sum_{i=1}^{n}\frac{\partial z}{\partial x_i}x_i} = 1 - \frac{g}{\sum_{i=1}^{n}\frac{\partial g}{\partial x_i}x_i}. \tag{9.3.32}$$
If we use the reciprocal approximation in Eq. (9.3.31) we get instead
$$\alpha = \frac{\sum_{i=1}^{n}\frac{\partial g}{\partial x_i}x_i}{g + \sum_{i=1}^{n}\frac{\partial g}{\partial x_i}x_i}. \tag{9.3.33}$$
For the simple case represented by Eq. (9.3.3) and if the response quantity $z$ is
a stress or displacement component, the reciprocal scaling is exact. Furthermore,
$z(\alpha x) = (1/\alpha)z(x)$, so that Eq. (9.3.33) can be replaced by
$$\alpha = z/\bar{z} = 1 - g/\bar{z}. \tag{9.3.34}$$
Venkayya suggests that Eq. (9.3.32) be used when
$$\frac{1}{z}\sum_{i=1}^{n}\frac{\partial z}{\partial x_i}x_i \ge 0, \tag{9.3.35}$$
otherwise Eq. (9.3.33) is to be used. It can be readily checked that the scaling
equations for $\alpha$ in terms of $g$ are valid also for lower bound constraints of the form
$z - \bar{z} \ge 0$.
In combining the resizing step, Eq. (9.3.25), with the scaling step we must
consider whether we calculate new derivatives for each of these two operations. If we
do, then the number of derivative calculations will increase to two per iteration. In
most cases this is unnecessary. Unless the scaling step results in large changes in the
design variables we can calculate the Lagrange multiplier using derivatives obtained
before scaling.

Example 9.3.3

Consider again the container problem of Example (9.3.2). We will solve it again using
Eq. (9.3.28) for estimating $\lambda$, and also employ scaling.
We start with the same initial design as in Example (9.3.2) of $x_1 = x_2 = x_3 = 5$ m.
For this design $g = 0$, so that we do not need any scaling. We have $e_1 = 1/9$, $e_2 = 1/7$,
$e_3 = 1/10$, so that Eq. (9.3.28) with all the weights set to one gives us
$$\lambda = \frac{3}{1/9 + 1/7 + 1/10} = 8.475.$$
Then, using Eq. (9.3.25) with $\eta = 2$, we have
$$x_1 = 5(8.475/9)^{1/2} = 4.852,$$
$$x_2 = 5(8.475/7)^{1/2} = 5.502,$$
$$x_3 = 5(8.475/10)^{1/2} = 4.603.$$
For the new values of the design variables $g = -2.12$, $f = 1577$, $\partial g/\partial x_1 = 25.325$,
$\partial g/\partial x_2 = 22.334$, $\partial g/\partial x_3 = 26.695$, $e_1 = 0.1148$, $e_2 = 0.1355$, $e_3 = 0.1044$. In our
case $z = x_1x_2x_3$ and it is easy to check that Eq. (9.3.35) is satisfied, so that we use
Eq. (9.3.32) for scaling.
$$\alpha = 1 - \frac{-2.12}{25.325 \times 4.852 + 22.334 \times 5.502 + 26.695 \times 4.603} = 1.00576.$$
Scaling the design variables we get $x_1 = 4.880$, $x_2 = 5.533$, and $x_3 = 4.630$. For these
scaled variables $g = 0.015$, indicating that the scaling worked. For this scaled design
$f = 1595$ which is a truer measure of improvement than the $f = 1577$ of the unscaled
design, because the constraint is not violated. We next obtain
$$\lambda = \frac{3}{0.1148 + 0.1355 + 0.1044} = 8.457,$$
and resize to obtain
$$x_1 = 4.880(8.457 \times 0.1148)^{1/2} = 4.808,$$
$$x_2 = 5.533(8.457 \times 0.1355)^{1/2} = 5.923,$$
$$x_3 = 4.630(8.457 \times 0.1044)^{1/2} = 4.351.$$
For this design $g = -1.08$, $f = 1570$, $\partial g/\partial x_1 = 25.772$, $\partial g/\partial x_2 = 20.921$, $\partial g/\partial x_3 =
28.481$, $e_1 = 0.1175$, $e_2 = 0.1315$, $e_3 = 0.1084$. The scaling factor $\alpha$ is
$$\alpha = 1 - \frac{-1.08}{25.772 \times 4.808 + 20.921 \times 5.923 + 28.481 \times 4.351} = 1.0029.$$
Scaling the design variables we get $x_1 = 4.822$, $x_2 = 5.941$, and $x_3 = 4.364$. For these
values $g = 0.018$ and $f = 1579$. Note that convergence is faster than in Example
(9.3.2).
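The two scaling formulas, Eqs. (9.3.32) and (9.3.33), differ only in how the same sum $\sum_i (\partial g/\partial x_i)\,x_i$ is used. The sketch below (function names `scale_linear` and `scale_reciprocal` are made up for illustration) applies the linear rule to the container after its first resizing, and the reciprocal rule to a hypothetical constraint of the form $g = 1 - 16/(x_2 + 0.25x_1)$, whose response is inversely proportional to the design.

```python
def scale_linear(g, dg_dx, x):
    """Eq. (9.3.32): scale factor from a linear expansion in x."""
    s = sum(d * xi for d, xi in zip(dg_dx, x))
    return 1.0 - g / s

def scale_reciprocal(g, dg_dx, x):
    """Eq. (9.3.33): scale factor from the reciprocal approximation."""
    s = sum(d * xi for d, xi in zip(dg_dx, x))
    return s / (g + s)

# Container problem after the first resizing: g = x1*x2*x3 - 125.
x = [4.852, 5.502, 4.603]
g = x[0] * x[1] * x[2] - 125.0
dg = [x[1] * x[2], x[0] * x[2], x[0] * x[1]]
alpha = scale_linear(g, dg, x)
x_scaled = [alpha * xi for xi in x]
g_scaled = x_scaled[0] * x_scaled[1] * x_scaled[2] - 125.0

# Hypothetical check case: z = 16/(x2 + 0.25*x1) with upper bound 1.
xr = [1.0, 10.0]
d = xr[1] + 0.25 * xr[0]
gr = 1.0 - 16.0 / d
dgr = [4.0 / d**2, 16.0 / d**2]
alpha_r = scale_reciprocal(gr, dgr, xr)
gr_scaled = 1.0 - 16.0 / (alpha_r * d)   # constraint value at alpha_r * xr
```

In the second case the scaled design lands on the constraint boundary to rounding error, which illustrates the statement that reciprocal scaling is exact when the response scales as $1/\alpha$; in the first case a small positive residual remains because the linear rule is only approximate.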

9.4 Several Constraints

9.4.1 Reciprocal-Approximation Based Approach

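We start again by posing the optimization problem in terms of the reciprocal variables
$$\begin{aligned} &\text{minimize} && f(y) \\ &\text{subject to} && g_j(y) \ge 0, \qquad j = 1,\dots,n_g, \end{aligned} \tag{9.4.1}$$
so that the Kuhn-Tucker conditions are
$$\frac{\partial f}{\partial y_i} - \sum_{j=1}^{n_g}\lambda_j\frac{\partial g_j}{\partial y_i} = 0, \qquad i = 1,\dots,n. \tag{9.4.2}$$
As in the case of a single constraint we assume that $f$ is approximately linear in $x$,
so we replace the derivative with respect to $y$ by a derivative with respect to $x$ to get
$$x_k^2 f_k - \sum_{j=1}^{n_g} c_{kj}\lambda_j = 0, \tag{9.4.3}$$
where
$$f_k = \frac{\partial f}{\partial x_k}, \qquad c_{kj} = -\frac{\partial g_j}{\partial y_k}, \qquad k = 1,\dots,n. \tag{9.4.4}$$
This equation can be used to obtain $x_k$ as
$$x_k = \left(\frac{1}{f_k}\sum_{j=1}^{n_g} c_{kj}\lambda_j\right)^{1/2}, \qquad k = 1,\dots,n. \tag{9.4.5}$$
However, several other possibilities for using Eq. (9.4.3) have been proposed and
used. One resizing rule, called the exponential rule, is based on rewriting Eq. (9.4.3)
as
$$x_k^{new} = x_k\left(\frac{1}{x_k^2 f_k}\sum_{j=1}^{n_g}\lambda_j c_{kj}\right)^{1/\eta}, \qquad k = 1,\dots,n, \tag{9.4.6}$$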

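where the old value of $x_k$ is used on the right-hand side to produce a new estimate for
$x_k$. A linearized form of Eq. (9.4.6) can be obtained by using the binomial expansion
as
$$x_k^{new} = x_k + \Delta x_k, \qquad k = 1,\dots,n, \tag{9.4.7}$$
where
$$\Delta x_k = \frac{1}{\eta}\left(\frac{1}{x_k^2 f_k}\sum_{j=1}^{n_g}\lambda_j c_{kj} - 1\right)x_k, \qquad k = 1,\dots,n. \tag{9.4.8}$$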

It is clear from the form of the last two equations that $\eta$ is a damping or step-size
parameter. A high value of $\eta$ reduces the correction to the present design, prevents
oscillations, but can slow down progress towards the final design. A value of $\eta = 2$
corresponds to Eq. (9.4.5).

The main difficulty in the case of multiple constraints is the calculation of the
Lagrange multipliers. It is possible to use the dual method and calculate the Lagrange
multipliers using Newton's method. A second approach is to calculate them from the
condition that the critical constraints remain critical, similar to Eq. (9.3.17). Assume,
for example, that the $n_g$ constraints are all critical. Then the Lagrange multipliers
are found from the condition that
$$g_l(x) + \sum_{k=1}^{n}\frac{\partial g_l}{\partial x_k}\Delta x_k = 0, \qquad l = 1,\dots,n_g, \tag{9.4.9}$$
or
$$\sum_{k=1}^{n}\frac{c_{kl}}{x_k^2}\Delta x_k = -g_l(x), \qquad l = 1,\dots,n_g. \tag{9.4.10}$$
Using Eq. (9.4.8) for $\Delta x_k$ we have
$$\sum_{j=1}^{n_g}\left(\sum_{k=1}^{n}\frac{c_{kl}c_{kj}}{x_k^3 f_k}\right)\lambda_j = \sum_{k=1}^{n}\frac{c_{kl}}{x_k} - \eta\,g_l(x), \qquad l = 1,\dots,n_g. \tag{9.4.11}$$

Equation (9.4.11) is a system of linear equations for $\lambda$. Often the solution will yield
negative values for some of the Lagrange multipliers which may indicate that the
corresponding constraints should not be considered active. Several iterations with
revised sets of active constraints may be needed before a set of positive Lagrange
multipliers is found. Equation (9.4.11) may also be used to find starting values for a
solution with the dual approach.
Stress constraints can be dealt with using the above approach. However, in
many optimality criteria procedures they are handled instead by using the stress
ratio technique. Member sizes obtained by the stress ratio technique are then used
as minimum gages for the next optimality criteria iteration. The two approaches are
compared in the following example.

Example 9.4.1

Find the minimum-mass design of the truss in Figure 9.4.1 subject to a limit
of $d = 0.001l$ on the vertical displacement and a limit of $\sigma_0$ on the stresses. The
design variables are the cross-sectional areas of the members, $A_A$, $A_B$ and $A_C$, and
because of symmetry it is required that $A_A = A_C$. All members are made from the
same material having Young's modulus $E$, density $\rho$ and yield stress $\sigma_0 = 0.002E$.
After finding the optimum design we also want to estimate the effect of increasing
the displacement allowable to $1.25d$.

Figure 9.4.1 Three-bar truss (axes $x, u$ and $y, v$; applied loads $p$ and $8p$).

The truss was analyzed in example 6.1.2, and the vertical displacement and
stresses in the members were found to be
$$v = \frac{8pl}{E(A_B + 0.25A_A)},$$
$$\sigma_A = p\left(\frac{\sqrt{3}}{3A_A} + \frac{2}{A_B + 0.25A_A}\right),$$
$$\sigma_B = \frac{8p}{A_B + 0.25A_A},$$
$$\sigma_C = p\left(-\frac{\sqrt{3}}{3A_A} + \frac{2}{A_B + 0.25A_A}\right).$$

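The design problem may be written as
$$\begin{aligned} &\text{minimize} && m = \rho l(A_B + 4A_A) \\
&\text{subject to} && g_1 = 1 - \frac{v}{0.001l} \ge 0, \\
& && g_2 = 1 - \frac{\sigma_A}{\sigma_0} \ge 0, \qquad g_3 = 1 - \frac{\sigma_B}{\sigma_0} \ge 0, \\
& && g_4 = 1 - \frac{\sigma_C}{\sigma_0} \ge 0, \qquad g_5 = 1 + \frac{\sigma_C}{\sigma_0} \ge 0, \end{aligned}$$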

where the second constraint on $\sigma_C$ is needed because $\sigma_C$ could be negative. Defining
nondimensional design variables
$$x_1 = \frac{A_A\sigma_0}{p}, \qquad x_2 = \frac{A_B\sigma_0}{p},$$
we may rewrite the problem as
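$$\begin{aligned} &\text{minimize} && f(x) = 4x_1 + x_2 \\
&\text{such that} && g_1(x) = 1 - \frac{16}{x_2 + 0.25x_1} \ge 0, \\
& && g_2(x) = 1 - \frac{\sqrt{3}}{3x_1} - \frac{2}{x_2 + 0.25x_1} \ge 0, \\
& && g_3(x) = 1 - \frac{8}{x_2 + 0.25x_1} \ge 0, \\
& && g_4(x) = 1 + \frac{\sqrt{3}}{3x_1} - \frac{2}{x_2 + 0.25x_1} \ge 0, \\
& && g_5(x) = 1 - \frac{\sqrt{3}}{3x_1} + \frac{2}{x_2 + 0.25x_1} \ge 0. \end{aligned}$$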
Obviously, $g_1$ is always more critical than $g_3$, and $g_2$ is always more critical than either
$g_4$ or $g_5$, so that we need consider only $g_1$ and $g_2$. We solve the problem first by using
the stress ratio technique coupled with the optimality criterion for the displacement
constraint.
Using the stress-ratio technique we resize the areas as
$$(A_A)_{new} = \left(\frac{\sigma_A}{\sigma_0}\right)(A_A)_{old},$$
$$(A_B)_{new} = \left(\frac{\sigma_B}{\sigma_0}\right)(A_B)_{old},$$
or in terms of the nondimensional variables
$$(x_1)_{new} = [1 - g_2(x)]\,x_1,$$
$$(x_2)_{new} = [1 - g_3(x)]\,x_2.$$
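These values are now employed as minimum gage values for the optimality criteria
method applied to $g_1$ only, using Eqs. (9.3.19) and (9.3.16). For the calculations we
need the following derivatives
$$\frac{\partial g_1}{\partial y_1} = -x_1^2\,\frac{\partial g_1}{\partial x_1} = -\frac{4x_1^2}{(x_2 + 0.25x_1)^2},$$
$$\frac{\partial g_1}{\partial y_2} = -x_2^2\,\frac{\partial g_1}{\partial x_2} = -\frac{16x_2^2}{(x_2 + 0.25x_1)^2},$$
$$\frac{\partial f}{\partial x_1} = 4, \qquad \frac{\partial f}{\partial x_2} = 1,$$
$$c_0 = g(y) - \frac{\partial g}{\partial y_1}\,y_1 - \frac{\partial g}{\partial y_2}\,y_2
= 1 - \frac{16}{x_2 + 0.25x_1} + \frac{4x_1}{(x_2 + 0.25x_1)^2} + \frac{16x_2}{(x_2 + 0.25x_1)^2} = 1.$$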

We start at $x_0^T = (1, 10)$ and obtain
$$g_2 = 0.2275, \qquad g_3 = 0.2195, \qquad \frac{\partial g_1}{\partial y_1} = -0.03807, \qquad \frac{\partial g_1}{\partial y_2} = -15.23.$$

Applying the stress ratio technique we get $(x_1)_{new} = 0.7725$, $(x_2)_{new} = 7.805$. Because
of the large difference in the derivatives of $g_1$ with respect to $y_1$ and $y_2$ we expect the
optimality criteria approach to try and reduce $x_1$ further, so that the value obtained
from the stress ratio technique will end up as a minimum gage constraint. Therefore,
we consider $x_1$ to be a passive design variable (i.e. $x_1 \in I_p$). Then from Eqs. (9.3.20)
and (9.3.21) we have
$$c_0^* = 1 - 0.03807 = 0.9619, \qquad \lambda = \left(\frac{\sqrt{15.23}}{0.9619}\right)^2 = 16.46.$$
Finally from Eq. (9.3.16) we obtain $x_1 = 0.356$, $x_2 = 15.83$, confirming the
assumption that $x_1$ is controlled by the stress constraints. The iteration is continued in Table
9.4.1.

Table 9.4.1
                                 Stress ratio                              Eq. (9.3.16)
Iteration   x1       x2       (x1)new   (x2)new   c0*      λ       x1      x2
1           1.       10.      0.7725    7.805     0.9619   16.46   0.356   15.83
2           0.7725   15.83    0.6738    7.904     0.9880   16.00   0.193   15.81
3           0.6738   15.81    0.6617    7.916     0.9894   16.00   0.169   15.83

We next solve the same problem using the optimality criteria technique for both
constraints. We use Eq. (9.4.11) for calculating the Lagrange multipliers, and Eq.
(9.4.8) with $\eta = 2$ for updating the design variables. The iteration history is given in
Table 9.4.2.

Table 9.4.2
Iteration   x1       x2       g1        g2        λ1      λ2      Δx1       Δx2
1           1.       10.      -0.5610   0.2275    11.70   0.      -0.4443   3.906
2           0.5557   13.906   -0.1392   -0.1814   15.00   2.648   0.0897    1.694
3           0.6434   15.600   -0.0152   -0.0243   15.63   2.826   0.0160    0.231

Note that Tables 9.4.1 and 9.4.2 indicate convergence to the same design, with
$\lambda$ in Table 9.4.1 and $\lambda_1$ in Table 9.4.2 converging to 16.00. This value is the 'price'
of $g_1$. At the optimum design $g_1 = 0$ or $v = d$. If we increase the allowable
displacement to $1.25d$, then $g_1 = 0.2$, and the expected decrease in the objective
function is approximately $0.2 \times 16 = 3.2$. •••
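The first row of Table 9.4.2 can be reproduced with a small script that assembles and solves the 2 x 2 system of Eq. (9.4.11) and then applies the linearized step of Eq. (9.4.8). This is a sketch under stated assumptions rather than the procedure actually used for the table: the coefficients are taken as $c_{kj} = x_k^2\,\partial g_j/\partial x_k$, a negative multiplier is simply reset to zero without re-solving for a revised active set, and the names `constraints` and `oc_step` are invented for illustration.

```python
import math

def constraints(x):
    """g1 (displacement) and g2 (stress in member A) with their x-derivatives."""
    d = x[1] + 0.25 * x[0]
    g1 = 1.0 - 16.0 / d
    g2 = 1.0 - math.sqrt(3.0) / (3.0 * x[0]) - 2.0 / d
    # dg[j][k] = dg_j / dx_k
    dg = [[4.0 / d**2, 16.0 / d**2],
          [math.sqrt(3.0) / (3.0 * x[0]**2) + 0.5 / d**2, 2.0 / d**2]]
    return [g1, g2], dg

def oc_step(x, eta=2.0):
    """Multipliers from Eq. (9.4.11), step from Eq. (9.4.8)."""
    f = [4.0, 1.0]                       # df/dx_k for f = 4*x1 + x2
    g, dg = constraints(x)
    # Assumed definition: c[k][j] = -dg_j/dy_k = x_k^2 * dg_j/dx_k.
    c = [[x[k]**2 * dg[j][k] for j in range(2)] for k in range(2)]
    # Linear system A * lam = b of Eq. (9.4.11).
    A = [[sum(c[k][l] * c[k][j] / (x[k]**3 * f[k]) for k in range(2))
          for j in range(2)] for l in range(2)]
    b = [sum(c[k][l] / x[k] for k in range(2)) - eta * g[l] for l in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    lam = [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
           (A[0][0] * b[1] - A[1][0] * b[0]) / det]
    # Assumption: negative multipliers are set to zero, others kept as solved.
    lam = [max(v, 0.0) for v in lam]
    dx = [(sum(lam[j] * c[k][j] for j in range(2)) / (x[k]**2 * f[k]) - 1.0)
          * x[k] / eta for k in range(2)]
    return g, lam, dx

g, lam, dx = oc_step([1.0, 10.0])
```

A more careful implementation would drop the constraint with the negative multiplier and solve the reduced system again, as the discussion of revised active sets suggests.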


9.4.2 Scaling-based Approach

The Kuhn-Tucker conditions, Eqs. (9.2.5), can be written as
$$\sum_{j=1}^{n_g}\lambda_j e_{ij} = 1, \qquad i = 1,\dots,n, \tag{9.4.12}$$
where
$$e_{ij} = \frac{\partial g_j}{\partial x_i}\bigg/\frac{\partial f}{\partial x_i}, \qquad i = 1,\dots,n, \quad j = 1,\dots,n_g, \tag{9.4.13}$$
is the effectiveness parameter of the ith design variable with respect to the jth
constraint. Equation (9.4.12) indicates that at the optimum the effectivenesses of all
design variables, weighted by the Lagrange multipliers, are the same. This form of
weighting makes sense, since the Lagrange multipliers measure the importance of the
constraints in terms of their effect on the optimum value of the objective function.
Venkayya [29] suggests the generalization of Eq. (9.3.25) as
$$x_i^{new} = x_i^{old}\left(\sum_{j=1}^{n_g}\lambda_j e_{ij}\right)^{1/\eta}, \qquad i = 1,\dots,n, \tag{9.4.14}$$
for resizing the design variables. For the Lagrange multiplier evaluation he proposes
using estimates based on a single constraint, that is Eq. (9.3.28), which gives
$$\lambda_j = \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} a_i e_{ij}}, \qquad j = 1,\dots,n_g. \tag{9.4.15}$$
However, Lagrange multipliers are calculated only for the most critical constraints,
and are set to zero for the other constraints. Finally, scaling is used, based on the most
critical design constraint. This approach is demonstrated by repeating the previous
example.

Example 9.4.2

The minimization problem that we consider is
$$\begin{aligned} &\text{minimize} && f(x) = 4x_1 + x_2 \\
&\text{such that} && g_1(x) = 1 - \frac{16}{x_2 + 0.25x_1} \ge 0, \\
& && g_2(x) = 1 - \frac{\sqrt{3}}{3x_1} - \frac{2}{x_2 + 0.25x_1} \ge 0. \end{aligned}$$
We solve this problem assuming that a constraint is critical if after scaling its value
is less than 0.15. Starting with $x_1 = 1$, $x_2 = 10$, we get $g_1 = -0.5610$, $g_2 = 0.2275$,
so that we need to scale based on the first constraint. For this constraint we have
$$\frac{\partial g_1}{\partial x_1} = \frac{4}{(x_2 + 0.25x_1)^2} = 0.03807, \qquad \frac{\partial g_1}{\partial x_2} = \frac{16}{(x_2 + 0.25x_1)^2} = 0.1523,$$
$$e_{11} = \frac{\partial g_1/\partial x_1}{\partial f/\partial x_1} = 0.009518, \qquad e_{21} = \frac{\partial g_1/\partial x_2}{\partial f/\partial x_2} = 0.1523.$$
For this case $z = 1 - g$ so that the scaling test, Eq. (9.3.35) yields
$$\frac{1}{z}\sum_{i=1}^{n}\frac{\partial z}{\partial x_i}x_i = \frac{-1}{1 - g}\sum_{i=1}^{n}\frac{\partial g}{\partial x_i}x_i = \frac{-1}{1.561}(0.03807 \times 1 + 0.1523 \times 10) \le 0.$$
Therefore we use the reciprocal scaling, Eq. (9.3.33)
$$\alpha = \frac{0.03807 \times 1 + 0.1523 \times 10}{-0.561 + 0.03807 \times 1 + 0.1523 \times 10} = 1.561.$$
The scaled variables are $x_1 = 1.561$, $x_2 = 15.61$. If we check the constraints we find
that $g_1 = 0$, $g_2 = 0.5051$ so that the scaling is exact. This is because the structure
satisfies Eq. (9.3.3) so that we can use Eq. (9.3.34) which simplifies here to $\alpha = 1 - g$.
We now estimate $\lambda$ from Eq. (9.3.28) using $a_1 = a_2 = 1$ to get
$$\lambda = \frac{2}{0.009518 + 0.1523} = 12.36.$$
Next we resize the design variables using Eq. (9.3.25) with $\eta = 2$ to get
$$x_1 = 1.561(12.36 \times 0.009518)^{1/2} = 0.5354, \qquad x_2 = 15.61(12.36 \times 0.1523)^{1/2} = 21.42.$$
The large change in the design variables indicates that the value of $\eta = 2$ that we
used is too low, so we increase it to 4 and repeat the resizing
$$x_1 = 1.561(12.36 \times 0.009518)^{1/4} = 0.9142, \qquad x_2 = 15.61(12.36 \times 0.1523)^{1/4} = 18.28.$$
For these new values of the design variables we have $g_1 = 0.1357$, $g_2 = 0.2604$.
We expect that after scaling $g_2$ will be under 0.15, so that both constraints will be
considered critical. Therefore we calculate derivatives for both constraints.
$$\frac{\partial g_1}{\partial x_1} = 0.01167, \qquad \frac{\partial g_1}{\partial x_2} = 0.04669, \qquad e_{11} = 0.00292, \qquad e_{21} = 0.04669,$$
and
$$\frac{\partial g_2}{\partial x_1} = \frac{\sqrt{3}}{3x_1^2} + \frac{0.5}{(x_2 + 0.25x_1)^2} = 0.6923, \qquad \frac{\partial g_2}{\partial x_2} = \frac{2}{(x_2 + 0.25x_1)^2} = 0.00584,$$
$$e_{12} = 0.1731, \qquad e_{22} = 0.00584.$$
We first scale to obtain
$$\alpha = 1 - g_1 = 0.8643, \qquad x_1 = 0.7901, \qquad x_2 = 15.80.$$
We then calculate the Lagrange multipliers
$$\lambda_1 = 2/(0.00292 + 0.04669) = 40.32, \qquad \lambda_2 = 2/(0.1731 + 0.00584) = 11.18,$$
and resize using Eq. (9.4.14) with $\eta = 4$ (based on the experience of the previous
iteration)
$$x_1 = 0.7901(0.00292 \times 40.32 + 0.1731 \times 11.18)^{1/4} = 0.9457,$$
$$x_2 = 15.80(0.04669 \times 40.32 + 0.00584 \times 11.18)^{1/4} = 18.67.$$
The first few iterations are summarized in Table 9.4.3. The solution oscillates more
than in Example 9.4.1, and seems to drift away once it gets close to the optimum
of $x_1 = 0.6598$, $x_2 = 15.83$. The Lagrange multipliers are not converging to their
correct values because they are based on a single-constraint approximation. •••

Table 9.4.3
            Scaled                                          Resized
x1       x2      g1       g2       λ1      λ2      x1       x2
1.5610   15.61   0        0.5051   12.36   0       0.9142   18.28
0.7901   15.80   0        0.1443   40.32   11.18   0.9457   18.67
0.8004   15.80   0        0.1537   42.04   0       0.4688   18.51
0.6277   24.78   0.3584   0        0       3.017   0.7448   9.000
1.2974   15.68   0        0.4300   9.927   0       0.7598   18.36
0.6593   15.93   0.006    0        40.49   7.807   0.7910   18.77
0.6672   15.83   0        0.0096   42.34   8.453   0.8003   18.66
0.6789   15.83   0        0.0246   41.85   8.646   0.8143   18.66
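The multiplier estimate of Eq. (9.4.15) and the resizing rule of Eq. (9.4.14), as applied numerically in Example 9.4.2, take only a few lines. The sketch below uses illustrative function names (`multipliers`, `resize`) and quotes the effectiveness values of the second iteration from the example; unit weights are assumed when none are supplied.

```python
def multipliers(e, a=None):
    """Eq. (9.4.15): one single-constraint estimate per constraint.

    e[i][j] is the effectiveness of variable i for constraint j;
    a holds the weights a_i (assumed to be 1 if omitted).
    """
    n, ng = len(e), len(e[0])
    a = a or [1.0] * n
    return [sum(a) / sum(a[i] * e[i][j] for i in range(n)) for j in range(ng)]

def resize(x, e, lam, eta):
    """Eq. (9.4.14): x_i <- x_i * (sum_j lam_j * e_ij) ** (1/eta)."""
    return [xi * sum(l * eij for l, eij in zip(lam, ei)) ** (1.0 / eta)
            for xi, ei in zip(x, e)]

# Second iteration of Example 9.4.2 (values quoted from the example).
e = [[0.00292, 0.1731],
     [0.04669, 0.00584]]
lam = multipliers(e)
x_new = resize([0.7901, 15.80], e, lam, eta=4)
```

Setting an entry of `lam` to zero before calling `resize` mimics the rule that multipliers are kept only for the most critical constraints.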

9.4.3 Other Formulations

There are several other formulations of optimality criteria methods. These are
often tailored to treat specific constraints. An example is the treatment of stability
constraints by Khot in [32]. The stability eigenvalue problem is typically written as
$$K u_k - \mu_k K_G u_k = 0, \tag{9.4.16}$$
where $K$ is the stiffness matrix, $K_G$ is the geometric stiffness matrix, $\mu_k$ is the buckling
eigenvalue, and $u_k$ is the corresponding eigenvector or buckling mode. We assume
that the modes are normalized so that
$$u_k^T K_G u_k = 1, \tag{9.4.17}$$
and then the eigenvalue $\mu_k$ is given by
$$\mu_k = u_k^T K u_k. \tag{9.4.18}$$
The constraints on eigenvalues considered in [32] are of the form
$$g_j = \mu_j - \bar{\mu}_j \ge 0, \qquad j = 1,\dots,n_g. \tag{9.4.19}$$
The derivative of $g_j$ with respect to a design variable $x_i$ is obtained from Eq. (7.3.5)
as
$$\frac{\partial g_j}{\partial x_i} = u_j^T\frac{\partial K}{\partial x_i}u_j - \mu_j\,u_j^T\frac{\partial K_G}{\partial x_i}u_j. \tag{9.4.20}$$
The second term of the right-hand side of Eq. (9.4.20) is zero if the prebuckling
internal loads, and therefore $K_G$, do not depend on the design variables. Even when
the second term is not zero, there are many situations where it can be neglected.
Khot defines
$$b_{ij} = x_i^2\,\frac{\partial\mu_j}{\partial x_i} = x_i^2\,u_j^T\frac{\partial K}{\partial x_i}u_j. \tag{9.4.21}$$

If the stiffness matrix is a linear combination of the design variables
$$K = \sum_{i=1}^{n}\frac{\partial K}{\partial x_i}x_i, \tag{9.4.22}$$

then from Eqs. (9.4.18) and (9.4.21)
$$\mu_j = \sum_{i=1}^{n}\frac{b_{ij}}{x_i}, \tag{9.4.23}$$
and from Eq. (9.4.21)
$$\frac{\partial g_j}{\partial x_i} = \frac{\partial\mu_j}{\partial x_i} = \frac{b_{ij}}{x_i^2}. \tag{9.4.24}$$
Equations (9.4.23) and (9.4.24) together indicate that $b_{ij}$ could not be approximately
constant (if it were we should have a minus sign in one of these equations). However,
we can still proceed in the same manner as for displacement constraints, with the
optimality conditions written as
$$\frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g}\lambda_j\frac{\partial g_j}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{n_g}\lambda_j\frac{b_{ij}}{x_i^2} = 0, \tag{9.4.25}$$
so that
$$x_i = \left(\frac{1}{f_i}\sum_{j=1}^{n_g}\lambda_j b_{ij}\right)^{1/2}, \tag{9.4.26}$$
where $f_i = \partial f/\partial x_i$. We can use the more general form corresponding to Eq. (9.4.6)
$$x_i^{new} = x_i\left(\frac{1}{x_i^2 f_i}\sum_{j=1}^{n_g}\lambda_j b_{ij}\right)^{1/\eta}. \tag{9.4.27}$$
The calculation of the Lagrange multipliers then follows one of the methods suggested
in this section. In [32] the method leading to Eq. (9.4.11) was employed. The method
converged well for the truss examples in [32] even though the coefficients $b_{ij}$ can be
expected to change substantially with changes in the design.
To conclude this chapter we should note that it emphasized the relationship
between optimality criteria methods, dual methods and approximation concepts. There
are other treatments of optimality criteria both for specific and for general constraints.
The reader is directed to Refs. [33-34] for surveys of other works on optimality criteria
methods.
9.5 Exercises

1. Show that for the linear case the Falk dual leads to the dual formulation discussed
in Chapter 3.
2. The truss of Figure 9.2.4 is to be designed subject to stress and Euler buckling
constraints for two load conditions: a horizontal load of magnitude $p$; and a vertical
load of magnitude $2p$. The yield stress is $\sigma_0 = \alpha E$ where $E$ is Young's modulus and
$\alpha$ a proportionality constant. Assume that the moment of inertia of each member is
$I = \beta A^2$ where $\beta$ is a constant and $A$ the cross-sectional area. Write a program to
obtain a fully-stressed design of the truss, assuming that member A and member C
are identical, for various $\alpha$, $\beta$, $p$, $E$, and $l$. What is the design for $\alpha = 10^{-3}$, $\beta = 1.0$
and $\sigma_0 l^2/p = 10^5$?

3. Obtain the FSD resizing rule for a panel of thickness $t$ subject to in-plane loads
$n_x$, $n_y$, $n_{xy}$ and bending moments $m_x$, $m_y$, $m_{xy}$ (all per unit length) using the Tresca
(maximum shear stress) yield criterion.
4. Using the dual method find the minimum of f = Xl + X2X3 + x~ subject to the
constraint 10 - I/XI - 2X2X3 -1/x4 ~ 0 and Xi ~ 0, i = 1, ... ,4.
5. Write a computer program to solve Example 9.2.4. Perform enough iterations to
obtain the optimum design to three significant digits.
6. Repeat Example 9.2.3 when Xl and X2 can take only even integer values, and X3
can vary continuously.
7. Write a program to repeat Example 9.3.1 when the design is not symmetric, so
that we have three design variables. Member C is not subject to minimum gage
constraints, but members A and B are.
8. Find how small we can make $\eta$ in Example 9.3.2 without causing divergence of
the solution.
9. Solve Example 9.4.1 with the additional constraint that the horizontal displace-
ment does not exceed d = 0.0005l.
10. Complete Tables 9.4.1 and 9.4.2 for Example 9.4.1.
11. Use an optimality criteria method to design the truss of Figure 9.2.4 so that the
fundamental frequency is about 1 Hertz, and the second frequency above 3 Hertz.
Assume that all members have the same material properties.

9.6 References

[1] Mitchell, A.G.M., "The Limits of Economy of Material in Framed Structures,"
Phil. Mag., 6, pp. 589-597, 1904.
[2] Cilly, F.H., "The Exact Design of Statically Determinate Frameworks, and Ex-
position of its Possibility, but Futility," Trans. ASCE, 43, pp. 353-407, 1900.
[3] Schmit, L.A., "Structural Design by Systematic Synthesis," Proceedings 2nd
ASCE Conference on Electronic Computation, New York, pp. 105-132, 1960.
[4] Reinschmidt, K., Cornell, C.A., and Brotchie, J.F., "Iterative Design and
Structural Optimization," J. Struct. Div. ASCE, 92, ST6, pp. 281-318, 1966.
[5] Razani, R., "Behavior of Fully Stressed Design of Structures and its Relationship
to Minimum Weight Design," AIAA J., 3 (12), pp. 2262-2268, 1965.
[6] Dayaratnam, P. and Patnaik, S., "Feasibility of Full Stress Design," AIAA J., 7
(4), pp. 773-774, 1969.
[7] Lansing, W., Dwyer, W., Emerton, R. and Ranalli, E., "Application of
Fully-Stressed Design Procedures to Wing and Empennage Structures," J. Aircraft, 8
(9), pp. 683-688, 1971.
[8] Giles, G.L., Blackburn, C.L. and Dixon, S.C., "Automated Procedures for Sizing
Aerospace Vehicle Structures (SAVES)," AIAA Paper 72-332, presented at the
AIAA/ASME/SAE 13th Structures, Structural Dynamics and Materials
Conference, 1972.
[9] Berke, L. and Khot, N.S., "Use of Optimality Criteria for Large Scale Systems,"
AGARD Lecture Series No. 170 on Structural Optimization, AGARD-LS-70,
1974.
[10] Adelman, H.M., Haftka, R.T. and Tsach, U., "Application of Fully Stressed
Design Procedures to Redundant and Non-isotropic Structures," NASA TM-81842,
July 1980.
[11] Adelman, H.M. and Narayanaswami, R., "Resizing procedure for structures under
combined mechanical and thermal loading," AIAA J., 14 (10), pp. 1484-1486,
1976.
[12] Venkayya, V.B., "Design of Optimum Structures," Comput. Struct., 1, pp.
265-309, 1971.
[13] Siegel, S., "A Flutter Optimization Program for Aircraft Structural Design,"
Proc. AIAA 4th Aircraft Design, Flight Test and Operations Meeting, Los
Angeles, California, 1972.
[14] Stroud, W.J., "Optimization of Composite Structures," NASA TM-84544, August
1982.
[15] Falk, J.E., "Lagrange Multipliers and Nonlinear Programming," J. Math. Anal.
Appl., 19, pp. 141-159, 1967.
[16] Fleury, C., "Structural Weight Optimization by Dual Methods of Convex
Programming," Int. J. Num. Meth. Engng., 14 (12), pp. 1761-1783, 1979.
[17] Schmit, L.A., and Fleury, C., "Discrete-Continuous Variable Structural Synthesis
using Dual Methods," AIAA J., 18 (12), pp. 1515-1524, 1980.
[18] Schmit, L.A., and Fleury, C., "Discrete-Continuous Variable Structural Synthesis
using Dual Methods," Paper 79-0721, Proceedings of the AIAA/ASME/AHS 20th
Structures, Structural Dynamics and Materials Conference, St. Louis, MO, April
4-6, 1979.
[19] Grierson, D.E., and Lee, W.H., "Optimal Synthesis of Steel Frameworks Using
Standard Sections," J. Struct. Mech., 12 (3), pp. 335-370, 1984.

[20] Grierson, D.E., and Lee, W.H., "Optimal Synthesis of Frameworks under Elastic
and Plastic Performance Constraints Using Discrete Sections," J. Struct. Mech.,
14 (4), pp. 401-420, 1986.
[21] Grierson, D.E., and Cameron, G.E., "Microcomputer-Based Optimization of Steel
Structures in Professional Practice," Microcomputers in Civil Engineering, 4 (4),
pp. 289-296, 1989.
[22] Fleury, C., and Braibant, V., "Structural Optimization: A New Dual Method
Using Mixed Variables," Int. J. Num. Meth. Eng., 23, pp. 409-428, 1986.
[23] Prager, W., "Optimality Criteria in Structural Design," Proc. Nat. Acad. Sci.
USA, 61 (3), pp. 794-796, 1968.
[24] Venkayya, V.B., Khot, N.S., and Reddy, V.S., "Energy Distribution in an
Optimum Structural Design," AFFDL-TR-68-156, 1968.
[25] Berke, L., "An Efficient Approach to the Minimum Weight Design of Deflection
Limited Structures," AFFDL-TM-70-4-FDTR, 1970.
[26] Venkayya, V.B., Khot, N.S., and Berke, L., "Application of Optimality Criteria
Approaches to Automated Design of Large Practical Structures," Second
Symposium on Structural Optimization, AGARD-CP-123, pp. 3-1 to 3-19, 1973.
[27] Gellatly, R.A., and Berke, L., "Optimality Criteria Based Algorithm," Optimum
Structural Design, R.H. Gallagher and O.C. Zienkiewicz, eds., pp. 33-49, John
Wiley, 1972.
[28] Khot, N.S., "Algorithms Based on Optimality Criteria to Design Minimum
Weight Structures," Eng. Optim., 5, pp. 73-90, 1981.
[29] Venkayya, V.B., "Optimality Criteria: A Basis for Multidisciplinary
Optimization," Computational Mechanics, Vol. 5, pp. 1-21, 1989.
[30] Rozvany, G.I.N., Structural Design via Optimality Criteria: The Prager Approach
to Structural Optimization, Kluwer Academic Publishers, Dordrecht, Holland,
1989.
[31] Wilkinson, K. et al., "An Automated Procedure for Flutter and Strength Analysis
and Optimization of Aerospace Vehicles," AFFDL-TR-75-137, December 1975.
[32] Khot, N.S., "Optimal Design of a Structure for System Stability for a Specified
Eigenvalue Distribution," in New Directions in Optimum Structural Design (E.
Atrek, R.H. Gallagher, K.M. Ragsdell and O.C. Zienkiewicz, editors), pp. 75-87,
John Wiley, 1984.
[33] Venkayya, V.B., "Structural Optimization Using Optimality Criteria: A Review
and Some Recommendations," Int. J. Num. Meth. Engng., 13, pp. 203-228, 1978.
[34] Berke, L., and Khot, N.S., "Structural Optimization Using Optimality Criteria,"
Computer Aided Structural Design: Structural and Mechanical Systems (C.A.
Mota Soares, Editor), Springer Verlag, 1987.

10 Decomposition and Multilevel Optimization

10.1 The Relation between Decomposition and Multilevel Formulation

The resources required for the solution of an optimization problem typically increase
with the dimensionality of the problem at a rate which is more than linear. That is, if
we double the number of design variables in a problem, the cost of solution will typi-
cally more than double. Large problems may also require excessive computer memory
allocations. For these reasons we often seek ways of breaking a large optimization
problem into a series of smaller problems.
One of the more popular methods for achieving such a break-up is decomposition.
The process of decomposition consists of identifying relationships between design
variables and constraints that permit us to separate them into groups that are only
weakly interconnected. Once we have accomplished the process of decomposition we
need to identify an optimization method that would take advantage of the grouping
and replace the overall design with a series of optimizations of the individual groups,
coordinated so as to optimize the entire system.
The coordination process is often achieved by an optimization algorithm, and then
the overall optimization becomes a two-level optimization process. The coordination
level is usually referred to as the top level, and the small optimization problems are
called the subordinate level. Of course, it may be possible to break each one of the
groups in the subordinate level to further subgroups, so that we obtain a three-level
optimization, and so on. The multilevel structure generated through the process of
decomposition is usually characterized by a large number of daughter subproblems
in successive levels. When the decomposition process is depicted schematically (see
Figure 10.1.1a), the diagram has a wide-tree (or multiple branching) structure.

(a) wide-tree structure (b) narrow-tree structure

Figure 10.1.1 Multilevel-problem structures

Multilevel optimization is not only generated through decomposition. Some problems
have a natural multilevel structure with only one or a few daughter sublevels, that
is, they have a narrow-tree structure (see Figure 10.1.1b). For example, in some
cases it is possible to formulate the structural analysis as an optimization process
by minimizing the total potential energy of the structure. In this case the design
problem can be viewed as a two-level optimization problem, analysis being a single
daughter sublevel. Another example is optimization with different types of design
variables, such as sizing and shape variables, where it may be advantageous to deal
with them at different levels. Finally, in multidisciplinary optimization we may have
cases where it is advantageous to have sublevels corresponding to individual disci-
plinary optimizations coordinated at an upper level.
Because multilevel optimization techniques also have some drawbacks (discussed
below), we may seek to transform some multilevel problems (especially narrow-tree
problems) to a single-level structure. For example, for design problems where the
analysis is performed as a second-level optimization, it may be advantageous to use a
single level formulation. This single-level formulation is called simultaneous analysis
and design, and is discussed along with other narrow-tree multilevel problems in
Section 10.5.

10.2 Decomposition

The process of decomposition begins by the identification of groups of design
variables, so that variables in each group interact closely, but interact weakly with
the rest of the design variables (the strength of interaction between variables will be
defined shortly). Assuming that there are s such groups, the design variable vector x
is written as
$$x^T = (x_1^T, \ldots, x_s^T). \qquad (10.2.1)$$
The groups of design variables do not interact at all when the objective function is
separable in terms of the groups, that is

$$f(x) = \sum_{i=1}^{s} f_i(x_i), \qquad (10.2.2)$$

and each constraint depends only on variables from a single group. That is, if we
denote the vector of constraints associated with $x_i$ as $g_i$, the constraints may be
written as
$$g_i(x_i) \ge 0, \quad i = 1, \ldots, s. \qquad (10.2.3)$$

Figure 10.2.1 Block-diagonal and block-angular structures

This simple problem structure is diagrammed in Figure 10.2.1a. The rows in the
diagram represent the objective function and constraints, and the columns represent
the design variables. An 'x' in a block indicates that the objective function or the
constraint corresponding to the row of the block depends on the vector of design
variables associated with the column of that block. For a block-diagonal problem the
solution naturally breaks down to a series of problems

$$\begin{aligned}
&\text{minimize} && f_i(x_i) \\
&\text{such that} && g_i(x_i) \ge 0,
\end{aligned} \qquad (10.2.4)$$
which can be solved independently for i = 1, ..., s (that is, the problem is separable,
see Section 9.2.2). This is an ideal situation because we replace the solution of the
large problem with a series of smaller problems without the need for any coordination
between subproblems. This is also the simplest example of problem decomposition.
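
The grouping step behind a block-diagonal structure can be sketched in code. The following is only an illustration, not a procedure from the text: it assumes that the dependency pattern of Figure 10.2.1a is available as a table (the hypothetical name `incidence`), with one row per separable objective term or constraint and one column per design variable, and it uses a union-find pass to collect the rows and columns that share no variables with the rest of the problem.

```python
def independent_blocks(incidence):
    """Group functions (rows) and variables (columns) into uncoupled blocks.

    incidence[r][c] is truthy when function r depends on variable c.
    Returns a list of (rows, cols) pairs, one pair per block.
    """
    n_rows = len(incidence)
    n_cols = len(incidence[0])
    parent = list(range(n_rows + n_cols))  # union-find over rows, then columns

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for r in range(n_rows):
        for c in range(n_cols):
            if incidence[r][c]:
                parent[find(r)] = find(n_rows + c)  # merge row r with column c

    blocks = {}
    for r in range(n_rows):
        blocks.setdefault(find(r), ([], []))[0].append(r)
    for c in range(n_cols):
        blocks.setdefault(find(n_rows + c), ([], []))[1].append(c)
    return sorted(blocks.values(), key=lambda block: (block[1], block[0]))


# Hypothetical pattern: four functions of five variables forming three groups.
incidence = [
    [1, 1, 0, 0, 0],  # depends on variables 0 and 1
    [0, 0, 1, 0, 0],  # depends on variable 2 only
    [0, 0, 0, 1, 1],  # depends on variables 3 and 4
    [1, 0, 0, 0, 0],  # depends on variable 0, so it joins the first group
]
for rows, cols in independent_blocks(incidence):
    print("functions", rows, "share variables", cols)
```

Each block returned here could then be handed to a separate optimization of the form of Eq. (10.2.4). A non-separable objective row that depends on every variable would merge all blocks into one, which is why the objective must first be split into its terms $f_i$.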
It is extremely rare to encounter problems that have a simple block-diagonal
structure, but in many cases we have optimization problems where the coupling
between groups of variables is very weak. Coupling between groups of variables
means that some of the blank off-diagonal squares in Figure 10.2.1 fill up. Weak
coupling means that the derivatives in these off-diagonal squares are small compared
with the derivatives in the diagonal squares. In cases of weak coupling it may be
possible to proceed as if the problem form were block diagonal. However, instead of
optimizing each group of variables only once, we have to repeat the process several
times to account for the weak coupling between groups. For example, consider the
design of truss structures subject to stress and local buckling constraints. We can
design the cross-sectional parameters of each member of the truss separately to
satisfy the stress and local buckling constraints, assuming that member forces remain
constant. Of course, in a statically indeterminate truss, member forces will change,
so that we will need to iterate the process. This approach is a generalization of
the stress-ratio sizing technique for fully stressed design discussed in Chapter 9; it
can be applied to individual members as well as to substructures (see Giles [1] and
Sobieszczanski and Loendorf [2]). Furthermore, as for the stress-ratio technique, it is
possible for the process to converge to a non-optimal (though usually near-optimal)
design.
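
A minimal sketch of such a member-by-member iteration is given below. It is an illustration under stated assumptions, not the procedure of [1] or [2]: it borrows the member-stress expressions that this section gives for the three-bar truss of Example 10.2.2, written in terms of nondimensional areas $y_1 = A_A \sigma_0/p$ and $y_2 = A_B \sigma_0/p$, it keeps only the stress constraints, and the function names are hypothetical. Each pass resizes every member as if its force stayed fixed, and the passes are repeated because the forces in this statically indeterminate truss change with the areas.

```python
import math


def stress_ratios(y1, y2):
    """Ratios |stress| / allowable stress for members C and B."""
    shared = y2 + 0.25 * y1  # stiffness term that couples the two members
    ratio_c = math.sqrt(3.0) / (3.0 * y1) + 2.0 / shared
    ratio_b = 8.0 / shared
    return ratio_c, ratio_b


def stress_ratio_sizing(y1, y2, tol=1e-10, max_iter=100):
    """Repeat the member-by-member resizing until the areas stop changing."""
    for iteration in range(1, max_iter + 1):
        ratio_c, ratio_b = stress_ratios(y1, y2)
        # Resize each member separately, as if its force remained constant.
        y1_new, y2_new = y1 * ratio_c, y2 * ratio_b
        change = max(abs(y1_new - y1), abs(y2_new - y2))
        y1, y2 = y1_new, y2_new
        if change < tol:
            break
    return y1, y2, iteration


y1, y2, passes = stress_ratio_sizing(1.0, 8.0)
print(f"y1 = {y1:.4f}, y2 = {y2:.4f} after {passes} passes")
```

The iteration converges here to the design in which both members are fully stressed. As the text cautions, a fully stressed design is not guaranteed to be the minimum-mass design in general.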
A more common situation is where the subproblems are interconnected through
a small number of design variables. We denote the coupling design variable vector,
involved in the interaction between groups, as y. Then the minimization problem is
written as
$$\begin{aligned}
&\text{minimize} && f_0(y) + \sum_{i=1}^{s} f_i(x_i, y) \\
&\text{such that} && g_0(y) \ge 0, \\
&\text{and} && g_i(x_i, y) \ge 0, \quad i = 1, \ldots, s,
\end{aligned} \qquad (10.2.5)$$
where $g_0$ is a vector of global constraints. The connectivity matrix is diagrammed in
Figure 10.2.1b, and is said to have a block-angular form. The subsystem variables,
$x_i$, are often called local variables, while the coupling variables, y, are called global
variables. Besides the block-diagonal and block-angular problem structures there are
other cases that are suited to decomposition. The reader is referred to Barthelemy
[3] for a more complete discussion of problem structures which favor decomposition.
One case where the block-angular problem structure is obtained naturally is in
the limit design of structures subject to several load cases (see Section 3.1). Consider
a truss with r members made from a single material and subject to s load cases, given
in terms of the nodal load vectors $p^i$, i = 1, ..., s. The equations of equilibrium under
these loads may be written as
$$E n^i = p^i, \quad i = 1, \ldots, s, \qquad (10.2.6)$$
where $n^i$ denotes the member force vector for the ith load case, and E is a matrix
of direction cosines. For the limit design problem of the truss we need to enforce the
yield constraints under each load case as
$$A_j \sigma_C \le n_j^i \le A_j \sigma_T, \quad j = 1, \ldots, r, \quad i = 1, \ldots, s, \qquad (10.2.7)$$
where $\sigma_T$ and $\sigma_C$ denote the yield stress in tension and compression, respectively, $A_j$
is the cross-sectional area of the jth member, and $n_j^i$ denotes the force in member j
under the ith load case. The limit design problem for minimum weight design of the
truss can then be formulated as
$$\begin{aligned}
&\text{minimize} && m = \sum_{j=1}^{r} \rho A_j L_j \\
&\text{subject to} && E n^i = p^i, \\
&\text{and} && A_j \sigma_C \le n_j^i \le A_j \sigma_T,
\end{aligned} \qquad (10.2.8)$$
where $\rho$ denotes the density and $L_j$ the length of the jth member. In
this problem the member forces and cross-sectional areas are the design variables. In
this case the member forces for the ith load case, $n^i$, play the role of the local variable
vectors $x_i$ since $n^i$ appears only in the constraints associated with the ith load case.
The cross-sectional areas play the role of the coupling vector y since they appear in
the objective function and in the constraints for all load conditions.
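
The block-angular pattern of problem (10.2.8) can be tabulated at the level of variable groups. The short sketch below is only an illustration with hypothetical function names; it marks which groups of variables (the areas, and the force vector of each load case) enter the objective and the constraint set of each load case.

```python
def limit_design_pattern(num_load_cases):
    """Group-level dependency table for problem (10.2.8).

    Columns: areas A (coupling vector y), then n^1 ... n^s (local vectors x_i).
    Rows: the mass, then the constraint group of each load case.
    """
    s = num_load_cases
    header = ["A"] + [f"n^{i}" for i in range(1, s + 1)]
    rows = [("mass", [True] + [False] * s)]
    for i in range(1, s + 1):
        # Load case i involves the areas (yield limits) and only its own forces.
        depends = [True] + [k == i for k in range(1, s + 1)]
        rows.append((f"load case {i}", depends))
    return header, rows


def format_pattern(header, rows):
    """Render the table with X for a dependency and . for none."""
    lines = [" " * 12 + " ".join(f"{name:>4}" for name in header)]
    for label, depends in rows:
        marks = " ".join(f"{'X' if flag else '.':>4}" for flag in depends)
        lines.append(f"{label:<12}" + marks)
    return "\n".join(lines)


print(format_pattern(*limit_design_pattern(3)))
```

The first column is full because the areas appear everywhere, while the remaining columns form a diagonal band, which is the block-angular form of Figure 10.2.1b.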

Example 10.2.1

The three-bar truss in Figure 10.2.2 is to be designed for minimum mass so as not to
collapse under two load systems: a vertical load of magnitude 8p and a horizontal
load of magnitude p. We assume that the truss can collapse not only due to yield, but
also due to Euler buckling of the compression members. The post-buckling behavior
is assumed to be flat (that is, constant load with increasing deformation), so that the
buckling stress can be substituted for the yield stress in Eq. (10.2.7) for members
in compression. The design variables are the cross-sectional areas and moments of
inertia of the members (assumed to be independent).

(overall geometry; section a-a)

Figure 10.2.2 Three-bar tubular truss in compression

The horizontal load can act either to the right or to the left, and so we require a
symmetric design, $A_A = A_C$ and $I_A = I_C$. We assume that the material properties
of the members are identical, and that under the horizontal load member B will not
be critical in tension. Denoting the two load cases by superscripts H and V, we
formulate the limit design problem as
$$\begin{aligned}
&\text{minimize} && m = \rho l (4A_A + A_B) \\
&\text{such that} && 0.866(n_A^H - n_C^H) = p, \\
& && n_B^H + 0.5(n_A^H + n_C^H) = 0, \\
& && n_A^H \le \sigma_T A_A, \quad -n_B^H \le \frac{\pi^2 E I_B}{l^2}, \quad -n_C^H \le \frac{\pi^2 E I_A}{4l^2}, \\
&\text{and} && 0.866(n_A^V - n_C^V) = 0, \\
& && n_B^V + 0.5(n_A^V + n_C^V) = -8p, \\
& && -n_A^V \le \frac{\pi^2 E I_A}{4l^2}, \quad -n_B^V \le \frac{\pi^2 E I_B}{l^2}, \quad -n_C^V \le \frac{\pi^2 E I_A}{4l^2}.
\end{aligned}$$

The block diagram for the problem is shown in Figure 10.2.3, with a detailed
variable-by-variable diagram in (a), and a variable-group diagram in (b). The diagram
shows that the optimization problem has a block angular form, with the
cross-sectional properties being the coupling variables, and the member forces for
each load case being the local variables. • • •
A block angular form can be used in various ways, discussed later, to replace
the overall optimization problem by a series of smaller problems. Aside from its
value in decomposition, a block angular form also has other computational benefits.
The main advantage is that derivative calculation is inexpensive because constraints
depend only on a limited number of design variables. Therefore, it is worthwhile to try
to induce such a block angular structure by proper choice of design variables, even
if we use a standard optimization algorithm to solve the problem. This is illustrated
in the following example.

Example 10.2.2

The three-bar truss in Figure 10.2.2 is now to be designed for minimum weight in the
elastic range by varying the radius and the thickness of the members. The two loads
are now assumed to act simultaneously, so that we consider only a single load case.
Because of symmetry we assume that members A and C are identical so that the
design variables are $r_A$, $t_A$, $r_B$, and $t_B$. We assume that the thicknesses of the tubes
are small compared to the radii, so that the cross-sectional areas are approximated
as
$$A_A = 2\pi r_A t_A, \qquad A_B = 2\pi r_B t_B.$$
Displacement, stress and buckling constraints are applied. The vertical displacement
v is restricted to be less than $0.001l$. The stress in each member should be less than
$\sigma_0 = 0.002E$, where E is Young's modulus, and $\sigma_0$ is the yield stress in tension
and compression, $\sigma_0 = 10^5 p/l^2$. Additionally, the members should not buckle. This
means that the stress in each member is limited to be below the shell-buckling stress
of $0.605Et/r$, where r is the radius of the member and t its thickness, and the stress
must also be below the Euler buckling stress of $\pi^2 E r^2/(2L^2)$, where L is the length of
the member.

(a) Variable-by-variable diagram. Columns: $A_A$, $I_A$, $A_B$, $I_B$, $n_A^H$, $n_B^H$, $n_C^H$,
$n_A^V$, $n_B^V$, $n_C^V$. Rows: mass; horizontal load: horizontal equilibrium, vertical
equilibrium, yielding A, buckling B, buckling C; vertical load: horizontal equilibrium,
vertical equilibrium, buckling A, buckling B, buckling C.

(b) Variable-group diagram:

                               Cross-sectional   n_A^H, n_B^H,   n_A^V, n_B^V,
                               variables         n_C^H           n_C^V
mass                           X
Horizontal load constraints    X                 X
Vertical load constraints      X                                 X

Figure 10.2.3 Block diagram for Example 10.2.1

The truss was analyzed in Example 6.1.2 for a vertical tensile force, and it is easy
to change the sign of that force and obtain
$$\begin{aligned}
v &= -\frac{8pl}{E(A_B + 0.25A_A)}, \\
\sigma_A &= p\left(\frac{\sqrt{3}}{3A_A} - \frac{2}{A_B + 0.25A_A}\right), \\
\sigma_B &= -\frac{8p}{A_B + 0.25A_A}, \\
\sigma_C &= -p\left(\frac{\sqrt{3}}{3A_A} + \frac{2}{A_B + 0.25A_A}\right).
\end{aligned}$$
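We assume that the yield stress is the same in compression and in tension, and then
member C will always be more critical than member A, so that the design problem
may be written as
$$\begin{aligned}
&\text{minimize} && m = \rho l (A_B + 4A_A) \\
&\text{such that} && 1 + \frac{v}{0.001l} \ge 0, \\
& && 1 + \frac{\sigma_B}{\sigma_0} \ge 0, \quad
\frac{0.605 E t_B}{r_B \sigma_0} + \frac{\sigma_B}{\sigma_0} \ge 0, \quad
\frac{\pi^2 E r_B^2}{2l^2 \sigma_0} + \frac{\sigma_B}{\sigma_0} \ge 0, \\
& && 1 + \frac{\sigma_C}{\sigma_0} \ge 0, \quad
\frac{0.605 E t_A}{r_A \sigma_0} + \frac{\sigma_C}{\sigma_0} \ge 0, \quad
\frac{\pi^2 E r_A^2}{8l^2 \sigma_0} + \frac{\sigma_C}{\sigma_0} \ge 0.
\end{aligned}$$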

As posed the problem is fully coupled in that each constraint depends on all four
design variables (note that the stresses in each member depend on the area and
hence on the thickness and radius of the other member). However, it is simple to
decouple the members and construct a block angular problem structure by changing
design variables. We select the cross-sectional areas as the coupling variables (y),
and then either the radii or the thicknesses of the members can be the local or
subsystem variables. In this example, let us use the two radii as the local variables.
The thicknesses may then be obtained from the radii and cross-sectional areas. We
define nondimensional area variables as
$$y_1 = \frac{A_A \sigma_0}{p}, \qquad y_2 = \frac{A_B \sigma_0}{p},$$
and then the mass, the displacement, and the stresses may be written in terms of $y_1$
and $y_2$ only. The buckling constraints also require the radii. Defining the
nondimensional radii as
$$x_1 = \frac{r_A}{l}, \qquad x_2 = \frac{r_B}{l},$$
we can write the buckling stress limits for member B as
$$\frac{0.605 E t_B}{r_B} = \frac{0.605 E A_B}{2\pi r_B^2} = 4.814 \times 10^{-4}\, \frac{\sigma_0 y_2}{x_2^2}$$
and
$$\frac{\pi^2 E r_B^2}{2l^2} = \frac{\pi^2 E}{2\sigma_0}\, \sigma_0 x_2^2 = 2467\, \sigma_0 x_2^2.$$
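Using similar expressions for member C, we can now write the design problem as
$$\begin{aligned}
&\text{minimize} && m = (\rho l p/\sigma_0)(4y_1 + y_2) \\
&\text{such that} && g_1(y) = 1 - \frac{16}{y_2 + 0.25y_1} \ge 0, && \text{(displacement)} \\
& && g_2(y) = 1 - \frac{8}{y_2 + 0.25y_1} \ge 0, && \text{(stress in B)} \\
& && g_3(y) = 1 - \frac{\sqrt{3}}{3y_1} - \frac{2}{y_2 + 0.25y_1} \ge 0, && \text{(stress in C)} \\
&\text{and} && g_{11}(x_1, y) = 4.814 \times 10^{-4}\, \frac{y_1}{x_1^2} - \frac{\sqrt{3}}{3y_1} - \frac{2}{y_2 + 0.25y_1} \ge 0, && \text{(shell buckling C)} \\
& && g_{12}(x_1, y) = 616.9\, x_1^2 - \frac{\sqrt{3}}{3y_1} - \frac{2}{y_2 + 0.25y_1} \ge 0, && \text{(Euler buckling C)} \\
& && g_{21}(x_2, y) = 4.814 \times 10^{-4}\, \frac{y_2}{x_2^2} - \frac{8}{y_2 + 0.25y_1} \ge 0, && \text{(shell buckling B)} \\
& && g_{22}(x_2, y) = 2467\, x_2^2 - \frac{8}{y_2 + 0.25y_1} \ge 0. && \text{(Euler buckling B)}
\end{aligned}$$
The problem now has the requisite block angular structure. • • •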
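
The decoupled constraints of this example are simple enough to evaluate directly. The sketch below is an illustration with hypothetical function names: it codes the global constraints $g_1$ to $g_3$ and the local buckling constraints with the coefficients 4.814 x 10^-4, 616.9 and 2467 quoted in the formulation, and for given areas y it computes the range of nondimensional radius that satisfies both buckling constraints of a member. Because each local radius appears only in its own two constraints, this check can be made one member at a time.

```python
import math

SHELL = 4.814e-4   # shell-buckling coefficient quoted in the formulation
EULER_B = 2467.0   # Euler coefficient for member B (length l)
EULER_C = 616.9    # Euler coefficient for member C (length 2l)


def member_stress_ratios(y1, y2):
    """Compressive stress over yield stress in members C and B."""
    shared = y2 + 0.25 * y1
    s_b = 8.0 / shared
    s_c = math.sqrt(3.0) / (3.0 * y1) + 2.0 / shared
    return s_c, s_b


def global_constraints(y1, y2):
    """g1 (displacement), g2 (stress in B), g3 (stress in C)."""
    s_c, s_b = member_stress_ratios(y1, y2)
    return 1.0 - 16.0 / (y2 + 0.25 * y1), 1.0 - s_b, 1.0 - s_c


def local_constraints(x, y_member, stress_ratio, euler_coeff):
    """Shell- and Euler-buckling constraints of one member."""
    shell = SHELL * y_member / x**2 - stress_ratio
    euler = euler_coeff * x**2 - stress_ratio
    return shell, euler


def radius_interval(y_member, stress_ratio, euler_coeff):
    """Radii satisfying both buckling constraints, or None if there are none."""
    x_min = math.sqrt(stress_ratio / euler_coeff)       # Euler: radius large enough
    x_max = math.sqrt(SHELL * y_member / stress_ratio)  # shell: radius small enough
    return (x_min, x_max) if x_min <= x_max else None


for y1, y2 in [(1.0, 16.0), (1.3, 16.0)]:
    s_c, s_b = member_stress_ratios(y1, y2)
    print("y =", (y1, y2),
          "member C radii:", radius_interval(y1, s_c, EULER_C),
          "member B radii:", radius_interval(y2, s_b, EULER_B))
```

For the first of the two trial designs the interval for member C is empty, so no choice of radius can make that member feasible for those areas. This is a small instance of the lower-level infeasibility that has to be handled when the areas are chosen at a separate level.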
Now consider the case of a more complex truss structure composed of s tubular
members, designed for minimum mass and subject to stress, displacement, and local
buckling constraints. The stresses will be calculated from a finite element model.
For optimization we will need the derivatives of the stresses with respect to design
variables, and this derivative calculation can be the major cost in the optimization
process, especially if derivatives are calculated by finite differences. If the radii and
thicknesses of the members are used as design variables, then the problem is fully
coupled, in that a change in each design variable may affect the stresses in all
members. We will need to calculate derivatives of the stresses in the members with
respect to 2s design variables. If, on the other hand, we use the decomposition approach
employed for the three-bar truss, the cross-sectional areas and the radii are the design
variables. The partial derivatives of the stresses with respect to the member radii are
taken for fixed values of the corresponding areas (this is, of course, possible because
the thicknesses are not specified). So these derivatives of stresses with respect to
radii are zero, and we need to calculate only the s partial derivatives of stresses with
respect to areas.
A similar approach may be used for frame type structures. The portal frame shown
in Figure 10.2.4, for example, was introduced by Sobieski et al. [4] for demonstrating
multilevel optimization concepts. Each one of the three beams has an I cross-section
defined by 6 design variables. Constraints are imposed on stresses and displacements
under the loads shown in the figure. If the detail (local) design variables are used,
the stress and displacement constraints are fully coupled, in that they are affected by
each one of the 18 design variables. However, if we choose the cross-sectional area A
and the moment of inertia I of each beam as design variables, we can eliminate 2 of
the local design variables for each beam. Now all the constraints depend on the areas
and moments of inertia, but the other four variables for each beam influence only the
stresses in that same beam. It is possible to apply the same approach to a planar
frame with s members, and have 2s coupling (y) design variables, and s subsystems.

(portal frame under a load P = 50000 N, with dimensions of 500 cm and 1000 cm;
cross-section A-A of the I beams showing b1, b2, t1, t2, t3 and h; not to scale)

Figure 10.2.4 Decomposition of portal frame

For both truss and frame problems decomposition is achieved by recognizing that
the effect of one member on the rest of the structure can be expressed in terms of
a small number of parameters (areas for a truss, areas and moments of inertia for
a planar frame). These parameters become "global" or coupling variables, and are
used to eliminate an equal number of local variables.
Thareja and Haftka [5] employed a similar approach for composite panels, using
panel membrane stiffnesses as global variables. However, for more general structures,
it may not be easy to select global variables that decompose the design problem.
Another difficulty associated with the decomposition is the elimination of the
local variables in terms of global variables. For panel problems, as well as for complex
truss and frame cross-sectional forms, it is impossible to find analytical expressions
for eliminating local variables and replacing them with global variables. It is possible
to keep both local and global variables, and supplement the problem with equality
constraints that guarantee the consistency of the global variables with the local
variables. However, this approach often tends to make the optimization problem more
ill-conditioned as well as increase the number of design variables (e.g., [6]). In many
cases it is possible, instead, to eliminate local design variables in terms of global ones
even if analytical expressions for the elimination are not available.


Consider, for example, a generalization of the truss and frame cases where each
subsystem has a set of global variables that are used to eliminate a number of the
subsystem variables. For the sake of simplicity we will consider a single subsystem,
and omit the subscript associated with it. That is, let x be the vector of subsystem
variables (such as the radius and thickness for the truss tube member), and let y be
the part of the global variable vector associated with that subsystem (such as the
cross-sectional area for that truss member).
We assume that we can identify a subset of x that can be eliminated in terms of
y and denote it as $x_E$, and denote the rest of the local variables (to be retained) as
$x_R$. The relationship between y, $x_E$, and $x_R$ is given as
$$h(y, x_E, x_R) = 0. \qquad (10.2.9)$$
This relationship cannot always be solved analytically to yield an expression for $x_E$
in terms of y and $x_R$, but it can be solved numerically (e.g., by Newton's method).
The numerical solution for $x_E$ is usually inexpensive, because Eq. (10.2.9) is a small
system of algebraic equations. It is important, however, to choose $x_E$ such that the
system has a solution, that is, the Jacobian $\partial h/\partial x_E$ must be nonsingular.
If we replace x by y and $x_R$ as design variables without having an analytical
expression for the eliminated variables, our main difficulty will be in calculating
derivatives of the objective function and constraints with respect to the new set of
design variables. Consider, for example, a constraint function
$$g(x) = g(x_R, x_E) = \hat{g}(x_R, y). \qquad (10.2.10)$$
We need to calculate the derivatives of $\hat{g}$ without having an explicit expression for
it. This is easily accommodated using implicit differentiation. Differentiating Eq.
(10.2.10) we get
$$\begin{aligned}
\frac{\partial \hat{g}}{\partial x_R} &= \frac{\partial g}{\partial x_R} + \frac{\partial g}{\partial x_E} \frac{\partial x_E}{\partial x_R}, \\
\frac{\partial \hat{g}}{\partial y} &= \frac{\partial g}{\partial x_E} \frac{\partial x_E}{\partial y}.
\end{aligned} \qquad (10.2.11)$$
Note the difference between $\partial g/\partial x_R$ and $\partial \hat{g}/\partial x_R$. The first is a derivative of the
constraint with $x_E$ held constant, while the second is a derivative of the constraint
with y held constant.
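To be able to evaluate the derivatives from Eq. (10.2.11) we need the derivatives
$\partial x_E/\partial x_R$ and $\partial x_E/\partial y$. These are obtained by differentiating Eq. (10.2.9) as
$$\begin{aligned}
\frac{\partial h}{\partial y} + \frac{\partial h}{\partial x_E} \frac{\partial x_E}{\partial y} &= 0, \\
\frac{\partial h}{\partial x_R} + \frac{\partial h}{\partial x_E} \frac{\partial x_E}{\partial x_R} &= 0,
\end{aligned} \qquad (10.2.12)$$
which can be solved to yield
$$\begin{aligned}
\frac{\partial x_E}{\partial y} &= -\left[\frac{\partial h}{\partial x_E}\right]^{-1} \frac{\partial h}{\partial y}, \\
\frac{\partial x_E}{\partial x_R} &= -\left[\frac{\partial h}{\partial x_E}\right]^{-1} \frac{\partial h}{\partial x_R}.
\end{aligned} \qquad (10.2.13)$$

This process is illustrated in the following example.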

Example 10.2.3

Consider again the portal frame of Figure 10.2.4. The natural global variables are the
cross-sectional areas and moments of inertia. Denoting the area and moment of inertia
of a typical member by A and I, respectively, and assuming that the thicknesses are
much smaller than the other dimensions we have for Eq. (10.2.9)
$$\begin{aligned}
h_1 &= b_1 t_1 + b_2 t_2 + H t_3 - A = 0, \\
h_2 &= t_3 H^3/12 + (b_1 t_1 + b_2 t_2) H^2/4 - (b_1 t_1 - b_2 t_2)^2 H^2/(4A) - I = 0.
\end{aligned} \qquad (a)$$
Assume that we have a local constraint which requires (say, to avoid unreasonable
geometries) that the web accounts for at least 20 percent of the total area, that is
$$g = t_3 H - 0.2(b_1 t_1 + b_2 t_2 + t_3 H) \ge 0. \qquad (b)$$
Assume further that we use the area and moment of inertia to eliminate the variables
$t_1$ and $t_3$. That is, here $t_1$ and $t_3$ are the components of $x_E$ and $b_1$, $b_2$, $t_2$ and H are
the components of $x_R$. After the elimination of the two local variables the constraint
may be written as
$$\hat{g}(A, I, b_1, b_2, t_2, H) \ge 0.$$
We want to demonstrate that we do not need to have an explicit form for $\hat{g}$ to be
able to evaluate it and its derivatives. To evaluate $\hat{g}$ for a given set of its arguments
we first solve Eqs. (a) for $t_1$ and $t_3$, and then we evaluate g from (b) and note that
$$\hat{g}(A, I, b_1, b_2, t_2, H) = g(b_1, b_2, t_1, t_2, t_3, H).$$
Consider now, for example, the derivative of $\hat{g}$ with respect to the area A. From
Eq. (10.2.11)
$$\frac{\partial \hat{g}}{\partial A} = \frac{\partial g}{\partial t_1} \frac{\partial t_1}{\partial A} + \frac{\partial g}{\partial t_3} \frac{\partial t_3}{\partial A}.$$
We obtain $\partial t_1/\partial A$ and $\partial t_3/\partial A$ by differentiating Eqs. (a) with respect to A
$$\begin{aligned}
b_1 \frac{\partial t_1}{\partial A} + H \frac{\partial t_3}{\partial A} - 1 &= 0, \\
\frac{b_1 H^2}{4}\left[1 - \frac{2(b_1 t_1 - b_2 t_2)}{A}\right] \frac{\partial t_1}{\partial A} + \frac{H^3}{12} \frac{\partial t_3}{\partial A} + \frac{(b_1 t_1 - b_2 t_2)^2 H^2}{4A^2} &= 0.
\end{aligned} \qquad (c)$$
For example, consider a nominal design with $t_1 = t_2 = t_3 = t$ and $b_1 = b_2 = H$. For
this initial design $A = 3Ht$, $I = 7tH^3/12$, and $g = 0.4Ht$. We start by solving Eqs.
(c) for $\partial t_1/\partial A$ and $\partial t_3/\partial A$ to obtain
$$\frac{\partial t_1}{\partial A} = -\frac{1}{2H}, \qquad \frac{\partial t_3}{\partial A} = \frac{3}{2H},$$
and then
$$\frac{\partial \hat{g}}{\partial A} = 1.3.$$

As a check we can change the area by a small amount $\Delta A$ without changing the
other arguments of $\hat{g}$. This can be accomplished by changing $t_1$ by $(\partial t_1/\partial A)\Delta A =
-0.5\Delta A/H$ and changing $t_3$ by $(\partial t_3/\partial A)\Delta A = 1.5\Delta A/H$. We then check that the
moment of inertia I does not change (to first order in $\Delta A$), and that g changes by
approximately $1.3\Delta A$. • • •
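
The numerical route described in this example can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it takes relations (a) for a thin-walled I section, writes the web-area constraint as the web area minus 20 percent of the total area, solves (a) for $t_1$ and $t_3$ by Newton's method, and obtains the derivative of the constraint with respect to A from Eqs. (10.2.13) and (10.2.11). All function names, the starting guess and the fixed number of Newton iterations are choices made here.

```python
def residuals(t1, t3, A, I, b1, b2, t2, H):
    """Relations (a): area and moment-of-inertia residuals h1 and h2."""
    flange_sum = b1 * t1 + b2 * t2
    flange_diff = b1 * t1 - b2 * t2
    h1 = flange_sum + H * t3 - A
    h2 = (t3 * H**3 / 12 + flange_sum * H**2 / 4
          - flange_diff**2 * H**2 / (4 * A) - I)
    return h1, h2


def jacobian_xe(t1, A, b1, b2, t2, H):
    """Matrix dh/d(t1, t3); it must be nonsingular for the elimination."""
    flange_diff = b1 * t1 - b2 * t2
    dh2_dt1 = b1 * H**2 / 4 * (1 - 2 * flange_diff / A)
    return (b1, H), (dh2_dt1, H**3 / 12)


def solve_2x2(matrix, rhs):
    """Cramer's rule for a 2 x 2 linear system."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return (d * rhs[0] - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det


def eliminate(A, I, b1, b2, t2, H, iterations=30):
    """Newton solution of relations (a) for the eliminated variables t1, t3."""
    t1 = t3 = t2  # starting guess (an assumption of this sketch)
    for _ in range(iterations):
        h1, h2 = residuals(t1, t3, A, I, b1, b2, t2, H)
        d1, d3 = solve_2x2(jacobian_xe(t1, A, b1, b2, t2, H), (-h1, -h2))
        t1, t3 = t1 + d1, t3 + d3
    return t1, t3


def g_hat(A, I, b1, b2, t2, H):
    """Web-area constraint as a function of the new design variables."""
    t1, t3 = eliminate(A, I, b1, b2, t2, H)
    return t3 * H - 0.2 * (b1 * t1 + b2 * t2 + t3 * H)


def dg_hat_dA(A, I, b1, b2, t2, H):
    """Derivative of g_hat with respect to A by implicit differentiation."""
    t1, t3 = eliminate(A, I, b1, b2, t2, H)
    flange_diff = b1 * t1 - b2 * t2
    dh_dA = (-1.0, flange_diff**2 * H**2 / (4 * A**2))
    # Eq. (10.2.13): d(t1, t3)/dA = -[dh/dx_E]^(-1) dh/dA
    dt1, dt3 = solve_2x2(jacobian_xe(t1, A, b1, b2, t2, H),
                         (-dh_dA[0], -dh_dA[1]))
    # Eq. (10.2.11): chain rule through the eliminated variables
    return -0.2 * b1 * dt1 + 0.8 * H * dt3


H, t = 1.0, 0.01                      # nominal design: b1 = b2 = H, all t equal
A, I = 3 * H * t, 7 * t * H**3 / 12
step = 1e-6
finite_difference = (g_hat(A + step, I, H, H, t, H)
                     - g_hat(A - step, I, H, H, t, H)) / (2 * step)
print(dg_hat_dA(A, I, H, H, t, H), finite_difference)
```

For the nominal design both numbers should agree with the value 1.3 found in the example, the second one to within the accuracy of the finite-difference step.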

When it is difficult to eliminate local variables by using the global variables,
we may want to use both types of variables. As noted before, the use of equality
constraints to enforce compatibility between local and global variables may lead to
ill-conditioning. Instead Schmit and co-workers (e.g., [7]) used the objective function
of the lower-level problems as a means of enforcing compatibility. That objective
function was made to be a measure of the discrepancy between the lower level and
upper level variables. This approach (as well as the use of equality constraints) trans-
fers the problem of the compatibility between lower-level and upper-level variables
from the formulation or decomposition stage to the solution stage. The solution of a
decomposed problem is discussed in the following sections.

10.3 Coordination and Multilevel Optimization

Once a problem has been transformed to have a block-angular form, we realize
important savings in the cost of calculating sensitivity derivatives. However, it may
be possible to gain additional savings by employing an optimization method which
capitalizes on the special form of the problem (in particular on the smaller size of the
subproblems).

A natural approach to the problem is to use a nested or two-level optimization
procedure where the optimization of the subsystem variables, $x_i$, is nested inside an
upper-level optimization of the global variables, y. In some cases the two levels of
optimization can be uncoordinated, with the optimization process simply shuttling
back and forth between the upper-level and the lower-level optimization. If changes
in the global variables affect local constraints only weakly, this process can converge
fast (but not necessarily to the optimum). For example, Kirsch [8] developed a three-
level optimization procedure for reinforced concrete structures which relies on such
an iterative procedure.

In many cases, however, the optimization process at the two levels has to be
coordinated. For linear problems Dantzig and Wolfe ([9] and [10]) and Rosen ([11]
and [12]) developed two-level algorithms for the block-angular problem, Eq. (10.2.5).
For nonlinear problems, one possible approach is known as the model-coordination
method. Here we describe a version based on derivatives of the optima of subsystems
with respect to upper-level variables. Consider the block-angular problem, Eq.
(10.2.5). We start by replacing it by the following two-level problem
$$\begin{aligned}
&\text{minimize} && f_0(y) + \sum_{i=1}^{s} f_i^*(y) \\
&\text{such that} && g_0(y) \ge 0, \\
&\text{where} && f_i^*(y) = \min_{x_i} f_i(x_i, y) \\
& && \text{such that } g_i(x_i, y) \ge 0.
\end{aligned} \qquad (10.3.1)$$


This problem can be solved in two stages. First, an initial guess for y is selected, and
each of the s sublevels is optimized for the corresponding $x_i$. Then, the sensitivities of
the optima of each sublevel with respect to changes in y are calculated (as described
in Section 5.4). Finally, these sensitivities are used to change the coupling or top-level
variables (y), in one or more iterations.
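
The steps above can be sketched on a toy problem. The example below is a deliberately simplified illustration, not the algorithm of any cited reference: the two subsystem functions, the step size and all names are assumptions, and the sublevel problems have no active constraints, so the sensitivity of each sublevel optimum with respect to y reduces to the partial derivative of $f_i$ with respect to y evaluated at the sublevel optimum. With active lower-level constraints the sensitivity would also involve Lagrange multipliers.

```python
import math


def golden_section(func, lo, hi, tol=1e-10):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b, d = d, c
            c = b - ratio * (b - a)
        else:
            a, c = c, d
            d = a + ratio * (b - a)
    return 0.5 * (a + b)


# Each subsystem: (f_i(x, y), partial derivative of f_i with respect to y).
SUBSYSTEMS = [
    (lambda x, y: (x - y) ** 2 + x ** 2,
     lambda x, y: -2.0 * (x - y)),
    (lambda x, y: (x + y - 2.0) ** 2 + x ** 2,
     lambda x, y: 2.0 * (x + y - 2.0)),
]


def two_level(y=0.0, step=0.2, iterations=40):
    """Upper-level steepest descent on y with nested sublevel optimizations."""
    for _ in range(iterations):
        gradient = 2.0 * y  # derivative of the top-level term f0(y) = y**2
        for f, df_dy in SUBSYSTEMS:
            # Sublevel optimization for the current value of y.
            x_opt = golden_section(lambda x: f(x, y), -10.0, 10.0)
            # Sensitivity of the sublevel optimum with respect to y.
            gradient += df_dy(x_opt, y)
        y -= step * gradient
    return y


print(two_level())
```

For this toy problem the sublevel optima can be found by hand, which gives the overall objective $2y^2 - 2y + 2$ with its minimum at y = 0.5, so the printed value can be checked against that result.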
One of the difficulties associated with such a two-level approach is that for some
values of y there may be no feasible solution to some of the sublevel problems. For
linear programming, Rosen's algorithm [12] starts by finding a feasible solution. For
nonlinear problems it is difficult to ensure that for a given value of the vector y all
subproblems have feasible solutions, even though it is possible to add constraints to
the upper-level problem that help prevent lower-level infeasibility (see Kirsch [13]).
Additionally, the use of sensitivities of the subsystems to changes in the top-level
variables has one serious drawback: these sensitivities may not be continuous (see
Barthelemy and Sobieski [14]). This is demonstrated in the following example.
Example 10.3.1

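Consider again the three-bar truss of Example 10.2.1. As shown in that example,
the problem has a block-angular form, with the areas and moments of inertia being
the global design variables, and member forces the local variables. The upper-level
optimization in a two-level approach for this problem can be formulated as follows:
$$\begin{aligned}
&\text{minimize} && m = \rho l (4A_A + A_B) \\
&\text{such that} && p_c^H - p \ge 0, \\
& && p_c^V - p \ge 0,
\end{aligned}$$
where $p_c^H$ and $p_c^V$ denote the collapse values of p for the horizontal and vertical load
cases, respectively. These collapse values are obtained from the solution of two
sublevel optimization problems. For the horizontal load we solve
$$\begin{aligned}
&\text{maximize} && p_c^H \\
&\text{such that} && 0.866(n_A^H - n_C^H) = p_c^H, \\
& && n_B^H + 0.5(n_A^H + n_C^H) = 0, \\
& && n_A^H \le \sigma_T A_A, \quad -n_B^H \le \frac{\pi^2 E I_B}{l^2}, \quad -n_C^H \le \frac{\pi^2 E I_A}{4l^2}.
\end{aligned}$$

Similarly, for the vertical load we solve
$$\begin{aligned}
&\text{maximize} && p_c^V \\
&\text{such that} && 0.866(n_A^V - n_C^V) = 0, \\
& && n_B^V + 0.5(n_A^V + n_C^V) = -8p_c^V, \\
& && -n_A^V \le \frac{\pi^2 E I_A}{4l^2}, \quad -n_B^V \le \frac{\pi^2 E I_B}{l^2}, \quad -n_C^V \le \frac{\pi^2 E I_A}{4l^2}.
\end{aligned}$$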
To optimize the upper-level problem we will need derivatives of the two collapse loads
with respect to the cross-sectional areas and moments of inertia. We will consider
only the derivatives of the horizontal collapse load $p_c^H$. The problem is simple enough
that the solution for the collapse load can be found by inspection. If $I_B$ is large
enough, so that member B is not critical, then collapse will be reached when members
A and C reach their maximum (yield or buckling) loads, and from the horizontal
equation of equilibrium we get

From the vertical equation of equilibrium we can then check that at this load member
B will be indeed below its failure load if

If, on the other hand, $I_B < I_{B0}$, then members C and B will reach their maximum
load first, and using the two equations of equilibrium we find that

\[
p_c^H = \frac{0.866\,\pi^2 E}{l^2}\,(2I_B + 0.5I_A) .
\]

It is easy to check that when $I_B = I_{B0}$ both expressions for the collapse load give
identical results, so that $p_c^H$ is a continuous function of $I_B$. The derivative of $p_c^H$ with
respect to $I_B$, on the other hand, is not continuous. When $I_B > I_{B0}$ this derivative
is zero, as the collapse load is independent of the properties of member B when that
member is not critical. For $I_B < I_{B0}$ we get
\[
\frac{\partial p_c^H}{\partial I_B} = \frac{1.732\,\pi^2 E}{l^2} .
\]

This discontinuity in the derivative can pose difficulties to most optimization algorithms,
especially if the optimum design is in the vicinity of $I_B = I_{B0}$. • • •
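The kink in the collapse load can be checked numerically with a minimal sketch. The values of E, l and I_A and the constant P_AC (the collapse load governed by members A and C) are placeholders assumed purely for illustration; they are not taken from the example. The two regimes described above are combined with min, which reproduces a load that grows linearly with I_B up to the transition value and is constant beyond it.

```python
import math

# Assumed illustrative data (hypothetical, not from the text).
E = 1.0      # Young's modulus
L = 1.0      # reference length l
I_A = 1.0    # moment of inertia of members A and C
K = 0.866 * math.pi**2 * E / L**2
P_AC = 3.0 * K   # placeholder value of the load governed by members A and C


def collapse_load(I_B):
    """Horizontal collapse load: the smaller of the two candidate values."""
    p_BC = K * (2.0 * I_B + 0.5 * I_A)   # members C and B critical
    return min(P_AC, p_BC)


# Transition value of I_B at which both expressions coincide.
I_B0 = (P_AC / K - 0.5 * I_A) / 2.0


def forward_slope(I_B, h):
    """One-sided finite-difference estimate of d(p_c^H)/d(I_B)."""
    return (collapse_load(I_B + h) - collapse_load(I_B)) / h


h = 1e-6
slope_below = forward_slope(I_B0 - 2.0 * h, h)   # both points below I_B0
slope_above = forward_slope(I_B0 + h, h)         # both points above I_B0
print(slope_below, slope_above)
```

The slope just below the transition equals 1.732π²E/l², while just above it the slope vanishes, which is the jump a gradient-based upper-level optimizer would encounter.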

10.4 Penalty and Envelope Function Approaches

One way of avoiding the difficulties of the two-level approach discussed above is
to use an exterior or extended interior penalty-function method (see Section 5.7)
for the objective function at the lower levels. The penalty function approach allows
us to accept values of the upper-level variables ($y$) for which the lower-level
variables ($x_i$) have no feasible solution. Indeed, the penalty associated with constraint
violation at the lower levels will eventually drive the upper-level design variables away
from regions with no lower-level feasible solutions. Also, the extended penalty function
smooths the discontinuities associated with the derivatives of the lower-level optima,
especially when the lower-level optimization is not performed with extreme values of
the penalty parameter. Finally, the use of a penalty function solves the difficulty that
occurs when the lower-level variables do not contribute to the objective function.
Consider the block-angular problem described by Eq. (10.2.5). Using a penalty
function approach we replace the constrained problem with

\[
\text{minimize} \quad \Phi(y, x, r) = f_0(y) + P_v[g_0(y), r] + \sum_{i=1}^{s} \left( f_i(x_i, y) + P_v[g_i(x_i, y), r] \right) , \tag{10.4.1}
\]

where $P_v$ is the penalty associated with a vector of constraints. For example, if $g$ is a
constraint vector with $m$ components we will often use a cumulative penalty function

\[
P_v(g, r) = \sum_{j=1}^{m} p(g_j, r) , \tag{10.4.2}
\]

where p denotes some penalty function such as the convenient extended interior
penalty function (see Section 5.7)

for $g_j \ge g_0$ ,
(10.4.3)
for $g_j < g_0$ .

The transition parameter $g_0$ depends on $r$ as

\[
g_0 = g_{00}\, r^{1/2} , \tag{10.4.4}
\]

where $g_{00}$ is a constant. The problem described by Eq. (10.4.1) is solved for a series
of values of $r$ such that $r \to 0$. A multilevel version of this formulation is

\[
\begin{aligned}
\text{minimize} \quad & f_0(y) + P_v[g_0(y), r] + \sum_{i=1}^{s} \phi_i(y, r) \\
\text{where} \quad & \phi_i(y, r) = \min_{x_i,\, r_i} \left\{ f_i(x_i, y) + P_v[g_i(x_i, y), r_i] \right\} .
\end{aligned} \tag{10.4.5}
\]

The series of values for the subsystem penalty parameters $r_i$ tend to zero together
with the global penalty parameter $r$.
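To make the role of the transition parameter concrete, the following sketch implements one widely used quadratic extended interior penalty. Its exact form is an assumption here (the expression of Eq. (10.4.3) is not reproduced above): an inverse barrier r/g for g ≥ g0, continued for g < g0 by the quadratic that matches its value and first two derivatives at g0. The function name and the default g00 = 0.1 are hypothetical.

```python
def extended_interior_penalty(g, r, g00=0.1):
    """Assumed quadratic extended interior penalty p(g, r).

    g   : constraint value (g >= 0 means feasible)
    r   : penalty parameter, driven towards zero
    g00 : constant in the transition rule g0 = g00 * r**0.5
    """
    g0 = g00 * r**0.5
    if g >= g0:
        return r / g                     # ordinary inverse barrier
    ratio = g / g0
    # Quadratic extension, finite even for violated constraints (g < 0).
    return (r / g0) * (3.0 - 3.0 * ratio + ratio**2)


r = 0.01
g0 = 0.1 * r**0.5
inside = extended_interior_penalty(g0, r)            # on the barrier branch
just_below = extended_interior_penalty(g0 - 1e-9, r)  # on the extension
violated = extended_interior_penalty(-g0, r)          # infeasible point
print(inside, just_below, violated)
```

Because the penalty stays finite for violated constraints, an upper-level design with no feasible lower-level solution still returns a usable (large) function value rather than failing.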
The method of varying the penalty parameters of the subsystems defines the
particular multilevel algorithm. One attractive approach is to perform each sublevel
optimization for only a single value of $r_i$, arguing that there is no point in striving
for an exact sublevel optimum before the upper-level variables have settled close to
their final values. That single value of the penalty parameter for each subsystem
can then be gradually reduced towards zero as the optimization proceeds. Reference
[15] shows that when all subsystems use the same penalty parameter, the multilevel
optimization is completely equivalent to the single-level approach. This means that
the same series of intermediate designs is obtained on the way to the final optimum,
and the calculations performed could be made to be identical. The process can be
viewed as a two-level optimization, or a single-level optimization where the block-angular
form is utilized to reduce the amount of computation and permit parallel
operations.
Note that even when other techniques are used to solve multilevel optimization
problems it is common practice to use approximate or partially converged solutions
of the sublevel optimizations.

Example 10.4.1

Consider the two-level formulation of the elastic design of the three-bar truss in
Example 10.2.2. For this simple example it is convenient to use a vector penalty
function $P_v$ which is equal to the penalty associated with the most critical constraint,

\[
P_v(g, r) = p\left[\min_i (g_i), r\right] .
\]

In general this penalty approach may create discontinuity problems when the most
critical constraint changes identity. For our problem, though, this does not happen.
The penalty function formulation is then

\[
\begin{aligned}
\text{minimize} \quad & \phi = m(y) + P_v[g_1(y), g_2(y), g_3(y), r] + \phi_1(y, r) + \phi_2(y, r) \\
\text{where} \quad & \phi_1(y, r) = \min_{x_1} P_v[g_{11}(x_1, y), g_{12}(x_1, y), r] , \\
& \phi_2(y, r) = \min_{x_2} P_v[g_{21}(x_2, y), g_{22}(x_2, y), r] ,
\end{aligned} \tag{10.4.6}
\]

where the mass and constraint functions are given in Example 10.2.2. Note that the
local variables $x_1$ and $x_2$ do not contribute to the mass, so that the formulation of
Eq. (10.3.1) would not have any objective function at the lower level, and the lower-level
problems would only require finding a feasible solution.
With this penalty function formulation, the lower-level objectives $\phi_1$ and $\phi_2$ each
contain the contributions of two constraints. Because the penalty is based on the
most critical constraint the lower-level optimum occurs when these two constraints
are equally critical. For the first subsystem we get $g_{11} = g_{12}$ which yields

For the second subsystem we get $g_{21} = g_{22}$ or

\[
x_2 = 0.02102\, y_2^{1/4} .
\]

With these relationships we can now solve the upper-level problem as a single-level
optimization problem. • • •

Instead of a penalty function it is possible to use an envelope function which
replaces a vector of constraints with a single envelope constraint. Sobieski and co-workers
have made extensive use of the Kreisselmeier-Steinhauser (KS) envelope constraint
(see Chapter 5) for multilevel formulations (e.g., Sobieski et al. [16]). The KS
envelope constraint replaces the constraint vector $g$ by $KS(g)$ where

\[
KS(g) = g_{\min} - \frac{1}{\rho} \log\left[ \sum_{i=1}^{m} \exp[\rho(g_{\min} - g_i)] \right] ,
\]

where $g_i$ are the components of $g$, $\rho$ is a factor that plays the same role as the penalty
parameter and $g_{\min}$ is the value of the most critical constraint. It is easy to show that

\[
g_{\min} \ge KS(g) \ge g_{\min} - (1/\rho)\log(m) , \tag{10.4.7}
\]

where $m$ is the number of constraints (components of $g$). As $\rho$ is increased $KS(g)$
approaches the value of $g_{\min}$. The negative of $KS$ can be used instead of $P_v$ in the
penalty formulation.
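The bounds of Eq. (10.4.7) are easy to verify numerically. The sketch below assumes the convention that constraints are satisfied when g_i ≥ 0, so the most critical constraint is the smallest component; the function name ks and the sample constraint values are hypothetical.

```python
import math


def ks(g, rho):
    """Kreisselmeier-Steinhauser envelope of a list of constraint values.

    Shifting by g_min keeps every exponent non-positive, which avoids
    overflow for large rho.
    """
    g_min = min(g)
    total = sum(math.exp(rho * (g_min - gi)) for gi in g)
    return g_min - math.log(total) / rho


g = [0.3, 0.1, 0.5]          # hypothetical constraint values
m = len(g)
results = {rho: ks(g, rho) for rho in (1.0, 10.0, 100.0)}
for rho, value in results.items():
    lower = min(g) - math.log(m) / rho
    print(rho, lower, value, min(g))
```

As rho grows the envelope tightens towards the most critical constraint, at the cost of sharper curvature near points where the critical constraint changes identity.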

10.5 Narrow-Tree Multilevel Problems

While in many cases the objective of decomposition is to produce a problem with
many daughter sublevels (Figure 10.1.1a), there are many cases where we have a
narrow-tree structure with one or few daughter sublevels (Figure 10.1.1b). In some
cases there is an advantage to pursue the solution using multilevel optimization.
However, in other cases it may be better to convert the multilevel problem into a
single-level one.

10.5.1 Simultaneous Analysis and Design

Interest in converting a two-level optimization problem into a single-level problem
has been particularly evident in the area of simultaneous structural analysis and
design. The simultaneous analysis and design (SAND) approach seeks to change
the nested approach typical of traditional structural optimization. In the nested
approach the structure is analyzed for a trial design, the sensitivity of the response
with respect to structural sizes is then calculated, and the sizes are modified based
on these sensitivities to obtain the next trial design. The structural analysis is nested
inside the optimization procedure, repeated again and again for a sequence of trial
designs. The SAND approach seeks to perform the analysis and design as a single
problem with response variables added to structural sizes as unknowns to be treated
all in a similar way.
The two-level form of the traditional nested approach is evident in problems where
the structural analysis can be formulated as an optimization problem. For example,
limit design of structures can be formulated as weight minimization subject to con-
straints on the collapse loads. These collapse loads are the solution of a maximization
problem. In Example 10.3.1 we saw the two-level format of a limit design problem
that was formulated as a single-level problem in Example 10.2.1. The single-level
formulation had cross-sectional areas (structural sizes) and member forces (structural
response) as design variables.
In the case of limit design the single-level formulation, that is the SAND approach,
is the method of choice in engineering practice. However, in the elastic range the
nested approach is the rule. The problem of minimum-weight design subject to
displacement and stress constraints in the elastic range can be formulated as

\[
\begin{aligned}
\text{minimize} \quad & W(x) \\
\text{such that} \quad & g_j(u, x) \ge 0 , \qquad j = 1, \ldots, m ,
\end{aligned} \tag{10.5.1}
\]
where the displacement field $u$ can be obtained as the solution to the minimization
of the potential energy $U$ given in terms of the stiffness matrix $K$ and the load vector
$f$
\[
\text{minimize} \quad U = \tfrac{1}{2}\, u^T K(x)\, u - u^T f . \tag{10.5.2}
\]
The common approach is to solve this problem as a two-level optimization since
the solution to the energy minimization problem is obtained simply by solving the
equations of equilibrium $K(x)u = f$.
The SAND approach of using the equations of equilibrium as equality constraints
and treating both structural sizes and displacements as design variables was attempted
in the 1960s by Fox and Schmit [17] using a conjugate gradient (CG) technique
for the optimization. However, the CG method could not deal effectively with
the equality constraints associated with the equations of equilibrium because the stiffness
matrix generated by a finite-element model is typically ill-conditioned. Gaussian
elimination techniques lose accuracy when applied to ill-conditioned equations, but
this can be tolerated if the number of digits used in the computer arithmetic is high
enough (most finite-element computations are done in double precision). The effect
of ill-conditioning on iterative methods such as the CG method is to slow down
convergence.
Recent advances in optimization methods such as preconditioned CG methods,
however, improve the efficiency of the SAND approach, and make it competitive
for three-dimensional problems that result in poorly banded stiffness matrices. As
a result there has been a revival of interest in SAND approaches (see Haftka [18],
Smaoui and Schmit [19], Ringertz [20], and Haftka and Kamat [21]). Overall, the
SAND method eliminates the need for continually reanalysing the structure, at the
expense of solving a larger optimization problem (including displacements as design
variables). It is, therefore, most appropriate to use SAND in problems with a very
large number of structural design variables, where the addition of displacement variables
has a small effect on the total number of design variables.
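The difference between the nested and SAND formulations can be seen on a deliberately tiny worked case. The single axially loaded bar below, with length $L$, modulus $E$, density $\gamma$, load $f$, cross-sectional area $A$ and allowable displacement $u_{\max}$, is an illustrative assumption and does not come from the examples of this chapter.

```latex
% Nested form: u is eliminated through the equilibrium equation (EA/L) u = f,
% so the only design variable is A.
\begin{aligned}
\text{minimize}_{A} \quad & W = \gamma L A \\
\text{such that} \quad & u_{\max} - \frac{fL}{EA} \ge 0 .
\end{aligned}

% SAND form: A and u are both unknowns and equilibrium is an equality constraint.
\begin{aligned}
\text{minimize}_{A,\,u} \quad & W = \gamma L A \\
\text{such that} \quad & \frac{EA}{L}\,u - f = 0 , \\
& u_{\max} - u \ge 0 .
\end{aligned}

% Both forms share the optimum
A^{*} = \frac{fL}{E\,u_{\max}} , \qquad u^{*} = u_{\max} .
```

In the nested form every trial value of $A$ requires an analysis to obtain $u$, whereas in the SAND form equilibrium need only be satisfied at convergence, at the price of one extra unknown per displacement degree of freedom.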
The SAND method is not the method of choice when there are many load cases
because in that case the number of displacement design variables becomes very large.
However, Chibani [22] employed SAND in this case using a two-level approach and
geometric programming to alleviate the computational burden. The method is also
very useful in topology optimization where the traditional nested approach runs into
trouble when the elimination of parts of the structure can render the stiffness matrix
singular (see Bendsøe et al. [23]).
It is not always possible to transform a two-level problem into a single-level
one. Consider, for example, the problem of maximizing the lowest frequency $\omega_1$ of a
structure subject to the constraint that its weight $W$ does not exceed a limit $W_u$. A
two-level formulation of the problem is

\[
\begin{aligned}
\text{maximize} \quad & \omega_1(x) \\
\text{such that} \quad & W_u - W(x) \ge 0 ,
\end{aligned} \tag{10.5.3}
\]

where $\omega_1$ is the solution of the lower-level minimization of the Rayleigh quotient

\[
\omega_1^2 = \min_{u} \frac{u^T K u}{u^T M u} , \tag{10.5.4}
\]

with $M$ being the mass matrix and $u$ the eigenvector corresponding to $\omega_1$. It is not
possible to replace this two-level problem by the single-level problem

\[
\begin{aligned}
\text{find $x$ and $u$ to maximize} \quad & \omega_1^2 = \frac{u^T K u}{u^T M u} \\
\text{such that} \quad & W_u - W(x) \ge 0 ,
\end{aligned} \tag{10.5.5}
\]

because in the above formulation the optimization will choose the eigenvector corresponding
to the highest, rather than the lowest, frequency. It is still possible to convert
this frequency maximization problem to a SAND single-level approach [24] by using
the Kuhn-Tucker conditions of the problem, but the process is more complex and
more computationally costly than the nested approach of Eqs. (10.5.3) and (10.5.4).
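The reason the single-level form fails can be seen on a two-degree-of-freedom system. In this sketch the stiffness and mass matrices are hypothetical values chosen only for illustration; the Rayleigh quotient is scanned over all directions of u, and its minimum and maximum are compared with the two eigenvalues of the system.

```python
import math

# Hypothetical 2-DOF stiffness and mass matrices (illustrative only).
K = [[2.0, -1.0], [-1.0, 1.0]]
M = [[1.0, 0.0], [0.0, 1.0]]


def quadratic_form(A, u):
    """Return u^T A u for a 2x2 matrix A."""
    return sum(u[i] * A[i][j] * u[j] for i in range(2) for j in range(2))


def rayleigh(u):
    """Rayleigh quotient u^T K u / u^T M u."""
    return quadratic_form(K, u) / quadratic_form(M, u)


# The quotient does not depend on the length of u, so scanning the
# direction angle over half a circle covers every trial vector.
n = 20000
values = [rayleigh((math.cos(math.pi * k / n), math.sin(math.pi * k / n)))
          for k in range(n)]
omega2_min = min(values)   # what the lower-level minimization returns
omega2_max = max(values)   # what a maximization over u would return

# Exact eigenvalues of this K with M = I, for comparison.
exact_low = (3.0 - math.sqrt(5.0)) / 2.0
exact_high = (3.0 + math.sqrt(5.0)) / 2.0
print(omega2_min, exact_low, omega2_max, exact_high)
```

Minimizing over u recovers the lowest eigenvalue, while letting the optimizer maximize over u, as in Eq. (10.5.5), drives the quotient to the highest one.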

10.5.2 Other Applications

One of the common applications of the multilevel approach to a problem with a narrow-tree
form is in combined sizing and geometry optimization. Typically, the geometrical
design variables are selected to be the upper-level variables, and the sizing variables to
be the lower-level variables. The motivation for this approach has been the disparate
nature of the two types of variables that can lead to numerical difficulties when they
are treated together as a single group of design variables. Typical applications have
been to truss (e.g., [25-28]) and frame design (e.g., [29-31]) problems.

10.6 Decomposition in Response and Sensitivity Calculations

Systems that have block-angular structures in terms of the design optimization
problem will usually have a similar structure in the analysis problem. That is, if we
denote the response of the $s$ subsystems by $u_i$, $i = 1, \ldots, s$, we can often find a set of
global response variables $w$ which decouples the response computations (that is the
analysis) of the individual subsystems. That is, the equations governing the response
of the system can be written as

\[
\begin{aligned}
r_0(u_1, \ldots, u_s, w) &= 0 , \\
r_i(u_i, w) &= 0 , \qquad i = 1, \ldots, s .
\end{aligned} \tag{10.6.1}
\]

We can take advantage of this block-angular structure in the solution procedure.
For example, consider the use of Newton's method for solving the system. Given an
initial estimate for the solution we compute a correction to that estimate from a
first-order Taylor series expansion

\[
\begin{aligned}
r_0 + r_{0,1}\Delta u_1 + \cdots + r_{0,s}\Delta u_s + r_{0,0}\Delta w &= 0 , \\
r_i + r_{i,i}\Delta u_i + r_{i,0}\Delta w &= 0 , \qquad i = 1, \ldots, s ,
\end{aligned} \tag{10.6.2}
\]
where a comma followed by $i$ indicates a derivative with respect to $u_i$ and a comma
followed by a zero indicates a derivative with respect to $w$. For example, $r_{0,i}$ indicates
a matrix with its jth row consisting of the derivatives of the jth component of $r_0$ with
respect to the components of $u_i$. All quantities are evaluated at the initial estimate.
An examination of Eq. (10.6.2) shows that we can first express $\Delta u_i$ in terms of $\Delta w$
as
\[
\Delta u_i = -r_{i,i}^{-1}\left( r_i + r_{i,0}\Delta w \right) , \tag{10.6.3}
\]
and then substitute into Eq. (10.6.2) to obtain

\[
\left[ r_{0,0} - \sum_{i=1}^{s} r_{0,i}\, r_{i,i}^{-1}\, r_{i,0} \right] \Delta w = -r_0 + \sum_{i=1}^{s} r_{0,i}\, r_{i,i}^{-1}\, r_i . \tag{10.6.4}
\]

That is, the problem can be reduced to a solution of a system, Eq. (10.6.4), of the
order of $w$, and then the individual subsystem responses, $u_i$, can be calculated, as
needed, from Eq. (10.6.3).
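The elimination steps of Eqs. (10.6.2)-(10.6.4) can be sketched for a toy system with two scalar subsystem responses and one scalar coupling variable. The residual functions and all coefficients below are hypothetical and chosen only so that the iteration can be followed by hand; with scalar blocks the inverses reduce to divisions.

```python
def residuals(u, w):
    """Hypothetical block-angular system with s = 2 scalar subsystems."""
    r1 = u[0] - w**2 - 1.0                      # r_1(u_1, w)
    r2 = u[1] - 0.5 * w - 2.0                   # r_2(u_2, w)
    r0 = w - 0.1 * u[0] - 0.2 * u[1] - 0.3      # r_0(u_1, u_2, w)
    return r0, [r1, r2]


def newton_block_elimination(u, w, iterations=20):
    """Newton's method in which each step solves only the reduced w system."""
    for _ in range(iterations):
        r0, r = residuals(u, w)
        r_ii = [1.0, 1.0]           # d r_i / d u_i
        r_i0 = [-2.0 * w, -0.5]     # d r_i / d w
        r_0i = [-0.1, -0.2]         # d r_0 / d u_i
        r_00 = 1.0                  # d r_0 / d w
        # Reduced system of Eq. (10.6.4), here a single scalar equation.
        reduced = r_00 - sum(r_0i[i] * r_i0[i] / r_ii[i] for i in range(2))
        rhs = -r0 + sum(r_0i[i] * r[i] / r_ii[i] for i in range(2))
        dw = rhs / reduced
        # Back-substitution of Eq. (10.6.3), one subsystem at a time.
        du = [-(r[i] + r_i0[i] * dw) / r_ii[i] for i in range(2)]
        w = w + dw
        u = [u[i] + du[i] for i in range(2)]
    return u, w


u_sol, w_sol = newton_block_elimination([0.0, 0.0], 0.0)
print(u_sol, w_sol)
```

Each subsystem contributes to the reduced system independently, so in a larger problem the terms of the two sums could be assembled in parallel.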
The same procedure can be used to calculate the sensitivity of the response with
respect to design variables. Assume now that the system depends also on a design
parameter $x$. That is, we have

\[
\begin{aligned}
r_0(u_1, \ldots, u_s, w, x) &= 0 , \\
r_i(u_i, w, x) &= 0 , \qquad i = 1, \ldots, s .
\end{aligned} \tag{10.6.5}
\]

Differentiating the system with respect to $x$ we get

\[
\begin{aligned}
\frac{\partial r_0}{\partial x} + r_{0,1}\frac{\partial u_1}{\partial x} + \cdots + r_{0,s}\frac{\partial u_s}{\partial x} + r_{0,0}\frac{\partial w}{\partial x} &= 0 , \\
\frac{\partial r_i}{\partial x} + r_{i,i}\frac{\partial u_i}{\partial x} + r_{i,0}\frac{\partial w}{\partial x} &= 0 , \qquad i = 1, \ldots, s .
\end{aligned} \tag{10.6.6}
\]

We can now express $\partial u_i/\partial x$ in terms of $\partial w/\partial x$ and reduce the problem to a system
of the same order as that of $w$.

The typical example of the above approach is substructuring. For a displacement-based
finite element formulation $w$ is the vector of boundary degrees of freedom,
and $u_i$ is the vector of interior degrees of freedom of the ith substructure. However,
another important example is from the area of multidisciplinary design. There each
subsystem may represent a different disciplinary analysis of the same system. The
$u_i$ are disciplinary responses that do not influence the other disciplinary analyses
while the $w$ vector includes all the response quantities that affect more than one
discipline. In this case, however, the components of $w$ can typically be identified
with one discipline or another, so that it is convenient to divide $w$ into subvectors
$w_i$, $i = 1, \ldots, s$, where $w_i$ consists of the response variables of the ith discipline which
affect the response calculations in one or more other disciplines.
Sobieski [32] developed the following procedure for calculating the sensitivity of
a multidisciplinary system with respect to design variables and called it the global
sensitivity equation (GSE). In describing the GSE procedure we assume that the
response calculations in each discipline are performed by some analytical or software
blocks (or 'black boxes') or even experimental tools that can be described as

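\[
\begin{aligned}
w_i &= r_i(w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_s, x) , \\
u_i &= t_i(w_1, \ldots, w_s, x) .
\end{aligned} \tag{10.6.7}
\]

That is, $r_i$ is a procedure for calculating $w_i$ given the response of the other disciplines
and a vector $x$ of design variables. Similarly $t_i$ represents a procedure for calculating
the response $u_i$. Equation (10.6.7) represents a system of coupled nonlinear equations
in the $w_i$, $i = 1, \ldots, s$. The solution of this system can proceed, for example, by the
use of Newton's method, so that given an initial estimate $w_i^0$ for the $w_i$'s we can find
a correction $\Delta w_i$ by solving
\[
J \Delta w = \Delta r , \tag{10.6.8}
\]
where

\[
J = \begin{bmatrix}
I & -r_{1,2} & \cdots & -r_{1,s} \\
-r_{2,1} & I & \cdots & -r_{2,s} \\
\vdots & \vdots & \ddots & \vdots \\
-r_{s,1} & -r_{s,2} & \cdots & I
\end{bmatrix} , \qquad
\Delta w = \begin{Bmatrix} \Delta w_1 \\ \Delta w_2 \\ \vdots \\ \Delta w_s \end{Bmatrix} , \qquad
\Delta r = \begin{Bmatrix} \Delta r_1 \\ \Delta r_2 \\ \vdots \\ \Delta r_s \end{Bmatrix} , \tag{10.6.9}
\]
and where
\[
\Delta r_i = r_i(w_1^0, \ldots, w_s^0, x) - w_i^0 . \tag{10.6.10}
\]
After we converge to the solution for $w$ we can then find the $u_i$ from Eq. (10.6.7).
The calculation of sensitivity with respect to a design parameter proceeds in a similar
manner. Differentiating Eq. (10.6.7) with respect to a component of $x$ we get

\[
J \left\{ \frac{dw}{dx} \right\} = \left\{ \frac{\partial r}{\partial x} \right\} . \tag{10.6.11}
\]

The special structure of the Jacobian $J$ permits us to reduce the order of the equations
by eliminating one of the $w_i$'s as illustrated in the example below.
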
The GSE approach requires the derivatives of the individual disciplinary responses
with respect to the input of all the other disciplines. The cost of these calculations can
be very large when the front of interaction between disciplines is large. In comparing
the cost of the GSE approach to that of finite-difference calculation of the derivatives,
a key parameter is the number of design variables. For a large number of design vari-
ables, the GSE method tends to be more efficient than the finite-difference method,
while for a small number of design variables finite-differences are less expensive. For
a more detailed discussion of the cost issues, as well as the pathological cases when
the GSE matrix may be singular, the reader is referred to [32].
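The global sensitivity equation can be sketched for two scalar disciplines and a single design variable. The disciplinary functions r1 and r2 below are hypothetical stand-ins for black boxes, and their partial derivatives are written out analytically; the GSE result is then compared with a central finite difference of the fully converged coupled solution.

```python
def r1(w2, x):
    """Hypothetical discipline 1: returns w1 given w2 and the design variable."""
    return x + 0.5 * w2


def r2(w1, x):
    """Hypothetical discipline 2: returns w2 given w1 and the design variable."""
    return 0.2 * w1**2 + x


def solve_coupled(x, iterations=200):
    """Converge the coupled system by fixed-point iteration."""
    w1, w2 = 0.0, 0.0
    for _ in range(iterations):
        w1 = r1(w2, x)
        w2 = r2(w1, x)
    return w1, w2


def gse_sensitivity(x):
    """Solve [[1, -a], [-b, 1]] {dw/dx} = {dr/dx} for the two disciplines."""
    w1, _ = solve_coupled(x)
    a = 0.5             # partial of r1 with respect to w2
    b = 0.4 * w1        # partial of r2 with respect to w1
    c1, c2 = 1.0, 1.0   # partials of r1 and r2 with respect to x
    det = 1.0 - a * b
    return (c1 + a * c2) / det, (c2 + b * c1) / det


def finite_difference_sensitivity(x, h=1e-6):
    """Central difference of the converged coupled solution."""
    plus = solve_coupled(x + h)
    minus = solve_coupled(x - h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))


x0 = 0.5
gse = gse_sensitivity(x0)
fd = finite_difference_sensitivity(x0)
print(gse, fd)
```

The GSE route needs only local partial derivatives of each discipline plus one small linear solve, whereas the finite-difference route repeats the whole coupled analysis for every perturbed design variable.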
As noted before, the major difficulty associated with using multilevel techniques
is in finding a way to decompose the problem so that it would have the requisite
hierarchical structure. Successful decomposition breaks the problem into elements that
have only narrow fronts of interaction. For multidisciplinary analysis and sensitivity
we seek ways to narrow the front of interaction between disciplines. The following
example of integrated aerodynamic-structural wing analysis and sensitivity calculations
illustrates the use of a reduced-basis technique for achieving this goal.

Example 10.6.1

Consider the aeroelastic analysis of an aircraft wing. The flow field around the wing
is calculated based on the shape of the wing. Then pressures and loads are calculated
from flow velocities, and these are used to calculate structural displacements which
in turn change the shape of the wing. The solution for this coupled problem is often
performed iteratively, starting with the flow field around a rigid wing, continuing
with the loads and displacements associated with this flow field, updating the shape
of the wing based on these displacements, and so on. This approach, called fixed-point
iteration, may be preferable to Newton's method if the calculation of the Jacobian
is expensive. However, if we need also the sensitivity of the aeroelastic response to
design parameters, it may be advantageous to use Newton's method instead of the
fixed-point iteration. The feasibility of using Newton's method depends on the width
of the front of interaction. To focus on the question of the front of interaction we start
without consideration of design variables and examine the solution of the aeroelastic
interaction.
We assume that we have an aerodynamic 'black box' that solves for the flow
field represented, say, by the velocity vector, $v$, given the shape of the wing which is
represented by a shape vector, $s$,
\[
v = b_a(s) , \tag{a}
\]
where $b_a$ denotes the application of the aerodynamic black box. Next we have a force
black box which translates the flow velocities into aerodynamic loads $f_a$ that can be
used in the structural analysis
(b)
The next black box is the structural analysis package which combines the aerodynamic
loads with inertia loads and calculates the displacement vector $u$
\[
u = b_s(f_a) . \tag{c}
\]
Finally, we have an interpolation black box which updates the shape of the wing
based on the displacement field
(d)
At first glance, the system described by Eqs. (a)-(d) appears to be fully coupled.
Solving this system by Newton's method appears to be impractical because of the
huge size of the Jacobian. The flow field vector v and the displacement vector u
usually have thousands or tens of thousands of components. However, the vectors
fa and s can have a fairly small number of components, and we can reduce the
problem size enormously by combining the first two and last two black boxes. The
first combination gives us the aerodynamic forces in terms of the shape of the wing

(e)
and the second combination gives us the shape of the wing as a function of the
aerodynamic forces
(f)
We note that the variables $f_a$ and $s$ play the role of $w_1$ and $w_2$ in Eq. (10.6.7), while
$v$ and $u$ play the role of $u_1$ and $u_2$.
The above approach of using only $f_a$ and $s$ as interaction variables leads to a
great reduction in the number of cross-derivatives that need to be calculated. However,
the number of components of $f_a$ and $s$ is often several dozens, and calculating
the Jacobian can still be prohibitively expensive. Further reduction in the number of
required derivatives is achieved by using a reduced-basis technique to represent the
displacements for the purpose of describing the aeroelastic interaction. The displacement
vectors are assumed to be adequately represented by a linear combination of
mode shapes (often vibration modes) as

\[
u = Uq , \tag{g}
\]

where $U$ is a matrix of modes and $q$ a vector of modal amplitudes. The order of the
vector $q$ is typically much smaller than that of $u$ or even $f_a$. Furthermore, for the
reduced-basis structural analysis we now do not need $f_a$ but instead the generalized
load vector $f_a^*$ given as (see Eq. (7.4.30))

(h)
The reduced-basis (or modal) structural analysis black box is now described schematically
as
(i)
It is now most efficient to group our four black boxes in a slightly different order to
make $f_a^*$ and $q$ the interaction variables. That is, the generalized aerodynamic forces
are given in terms of the modal amplitudes as

(j)

and $r_2$ is simply equal to $b_s^*$.

For the Newton iteration, Eq. (10.6.8), we need to calculate the derivatives
$\partial r_1/\partial q$ and $\partial r_2/\partial f_a^*$, which appear with a minus sign in the off-diagonal blocks
$J_{12}$ and $J_{21}$ of the Jacobian. These are cross derivatives, in that they are derivatives of the
aerodynamic forces with respect to the shape changes due to structural displacements,
and derivatives of shape change due to structural displacements with respect to the
aerodynamic loads. Both derivatives are matrices, and it is convenient to label them as
$A$ and $S$, respectively. The component $a_{ij}$ of the matrix $A$ is the derivative of the ith
component of $f_a^*$ with respect to the jth component of $q$, $q_j$. Similarly the component $s_{ij}$ of
the matrix $S$ is the derivative of $q_i$ with respect to the jth component of $f_a^*$. These
derivatives are often calculated by finite differences. For example, if we perturb $q_j$ and
recalculate $f_a^*$ from Eq. (j) we can estimate the jth column of the matrix $A$ as the
difference in $f_a^*$ divided by the perturbation in $q_j$.

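Equation (10.6.8) can now be written as

\[
\begin{bmatrix} I & -A \\ -S & I \end{bmatrix}
\begin{Bmatrix} \Delta f_a^* \\ \Delta q \end{Bmatrix}
= \begin{Bmatrix} \Delta r_1 \\ \Delta r_2 \end{Bmatrix} . \tag{k}
\]

Because of the special structure of Eq. (k) we can eliminate either $\Delta q$ or $\Delta f_a^*$. For
example, if $\Delta f_a^*$ has more components than $\Delta q$, it may be advantageous to eliminate
$\Delta f_a^*$ by using the first row of Eq. (k)

\[
\Delta f_a^* = A \Delta q + \Delta r_1 , \tag{l}
\]

and substituting it into the second row to get

\[
(I - SA)\Delta q = \Delta r_2 + S \Delta r_1 . \tag{m}
\]

The solution of the aeroelastic interaction via Newton's method will consist then of
solving Eq. (m) for $\Delta q$, followed by the calculation of $\Delta f_a^*$ from Eq. (l), and then
updating $q$ and $f_a^*$ and calculating new $A$ and $S$ to repeat the iteration.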

The calculation of sensitivity with respect to a design parameter $x$ will proceed
along the same line. Equation (10.6.11) will become

\[
\begin{bmatrix} I & -A \\ -S & I \end{bmatrix}
\begin{Bmatrix} \dfrac{d f_a^*}{dx} \\[2mm] \dfrac{dq}{dx} \end{Bmatrix}
= \begin{Bmatrix} \dfrac{\partial r_1}{\partial x} \\[2mm] \dfrac{\partial r_2}{\partial x} \end{Bmatrix} . \tag{n}
\]

While the reduced-basis technique approximates the aeroelastic interaction, it does
not require that we also approximate the calculation in each individual discipline.
After we find $f_a^*$ and $q$ from the coupled analysis, we do not need to use Eq. (g) to
calculate the displacements. Instead we can calculate the actual aerodynamic forces
$f_a$ corresponding to displacements $Uq$, and then calculate the displacement from the
full structural analysis, Eq. (c). • • •
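The Newton iteration of Eqs. (k)-(m) can be sketched for a case with two generalized force components and a single modal amplitude, so that the reduced system of Eq. (m) is a scalar equation. The two black boxes below are hypothetical algebraic stand-ins for the aerodynamic and modal structural analyses, and A and S are estimated by forward finite differences as described in the example.

```python
def r1(q):
    """Hypothetical aerodynamic box: generalized forces for modal amplitude q."""
    return [1.0 + 0.3 * q, 0.5 + 0.2 * q**2]


def r2(f):
    """Hypothetical modal structural box: modal amplitude for forces f."""
    return 0.4 * f[0] + 0.1 * f[1]


def newton_aeroelastic(q=0.0, f=(0.0, 0.0), iterations=20, h=1e-6):
    f = list(f)
    for _ in range(iterations):
        f_here = r1(q)
        q_here = r2(f)
        # Column of A: change in generalized forces per unit change in q.
        A = [(fp - f0) / h for fp, f0 in zip(r1(q + h), f_here)]
        # Row of S: change in q per unit change in each force component.
        S = []
        for j in range(2):
            f_pert = list(f)
            f_pert[j] += h
            S.append((r2(f_pert) - q_here) / h)
        # Right-hand sides: disciplinary output minus current estimate.
        dr1 = [f_here[i] - f[i] for i in range(2)]
        dr2 = q_here - q
        # Eq. (m): (I - S A) dq = dr2 + S dr1, a scalar equation here.
        SA = sum(S[i] * A[i] for i in range(2))
        dq = (dr2 + sum(S[i] * dr1[i] for i in range(2))) / (1.0 - SA)
        # Eq. (l): df = A dq + dr1.
        df = [A[i] * dq + dr1[i] for i in range(2)]
        q += dq
        f = [f[i] + df[i] for i in range(2)]
    return q, f


q_sol, f_sol = newton_aeroelastic()
print(q_sol, f_sol)
```

Eliminating the force increments leaves a linear system whose size equals the number of modal amplitudes, which is the motivation for choosing the smaller of the two interaction vectors as the one retained.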


10.7 Exercises

1. Consider the three-bar truss of Figure 10.2.2. The cross-sectional areas and moments
of inertia of the three members are given, and we want to optimize the geometry of the
truss to minimize the weight subject to the constraint that the truss does not collapse
under either load case (consider both yielding and Euler buckling). Formulate the
problem in a block-angular form.
2. Consider the portal frame of Figure 10.2.4. Formulate the minimum weight design
of the frame subject to stress constraints and a horizontal displacement limit of 10 cm.
The design variables are the cross-sectional dimensions for each of the three beams.
Define global design variables to reduce the problem to a block-angular form.
3. Calculate the derivatives of !J in Example 10.2.3 with respect to its other five
arguments.
4. Obtain the solution of Example 10.4.1.
5. Solve Example 10.4.1 using the KS function.
6. Formulate the elastic design problem of the three-bar truss (Example 10.2.2) as a
simultaneous analysis and design problem.

10.8 References

[1] Giles, G.L., "Procedure for Automating Aircraft Wing Structural Design," J. of
the Structural Division, ASCE, 97 (ST1), pp. 99-113, 1971.
[2] Sobieszczanski, J., and Loendorf, D., "A Mixed Optimization Method for Automated
Design of Fuselage Structures," J. of Aircraft, 9 (12), pp. 805-811, 1972.
[3] Barthelemy, J.-F.M., "Engineering Design Applications of Multilevel Optimization
Methods," in Computer-Aided Optimum Design of Structures: Applications
(eds. C.A. Brebbia and S. Hernandez), Springer-Verlag, pp. 113-122, 1989.
[4] Sobieszczanski-Sobieski, J., James, B.B., and Dovi, A.B., "Structural Optimization
by Multilevel Decomposition," AIAA J., 23, 11, pp. 1775-1782, 1985.
[5] Thareja, R.R., and Haftka, R.T., "Efficient Single-Level Solution of Hierarchical
Problems in Structural Optimization," AIAA J., 28, 3, pp. 506-514, 1990.
[6] Thareja, R., and Haftka, R.T., "Numerical Difficulties Associated with using
Equality Constraints to Achieve Multilevel Decomposition in Structural Optimization,"
AIAA Paper No. 86-0854CP, Proceedings of the AIAA/ASME/ASCE/AHS
27th Structures, Structural Dynamics and Materials Conference, San Antonio,
Texas, May 1986, pp. 21-28.
[7] Schmit, L.A., and Mehrinfar, M., "Multilevel Optimum Design of Structures with
Fiber-Composite Stiffened Panel Components," AIAA J., 20, 1, pp. 138-147, 1982.
[8] Kirsch, U., "Multilevel Optimal Design of Reinforced Concrete Structures,"
Engineering Optimization, 6, pp. 207-212, 1983.
[9] Dantzig, G.B., and Wolfe, P., "The Decomposition Algorithm for Linear Program,"
Econometrica, 29, No. 4, pp. 767-778, 1961.
[10] Dantzig, G.B., "A Decomposition Principle for Linear Programs," in Linear
Programming and Extensions, Princeton Press, 1963.
[11] Rosen, J.B., "Primal Partition Programming for Block Diagonal Matrices,"
Numerische Mathematik, 6, pp. 250-260, 1964.
[12] Geoffrion, A.M., "Elements of Large-Scale Mathematical Programming," in
Perspectives on Optimization (A.M. Geoffrion, editor), Addison Wesley, pp. 25-64,
1972.
[13] Kirsch, U., "An Improved Multilevel Structural Synthesis Method," J. Structural
Mechanics, 13 (2), pp. 123-144, 1985.
[14] Barthelemy, J.-F.M., and Sobieszczanski-Sobieski, J., "Extrapolation of Optimum
Designs based on Sensitivity Derivatives," AIAA J., 21, pp. 797-799, 1983.
[15] Haftka, R.T., "An Improved Computational Approach for Multilevel Optimum
Design," J. of Structural Mechanics, 12, 2, pp. 245-261, 1984.
[16] Sobieszczanski-Sobieski, J., James, B.B., and Riley, M.F., "Structural Sizing by
Generalized, Multilevel Optimization," AIAA J., 25, 1, pp. 139-145, 1987.
[17] Fox, R.L., and Schmit, L.A., "Advances in the Integrated Approach to Structural
Synthesis," J. of Spacecraft and Rockets, 3 (6), pp. 858-866, 1966.
[18] Haftka, R.T., "Simultaneous Analysis and Design," AIAA J., 23, 7, pp. 1099-1103,
1985.
[19] Smaoui, H., and Schmit, L.A., "An Integrated Approach to the Synthesis of
Geometrically Non-linear Structures," International Journal for Numerical Methods
in Engineering, 26, pp. 555-570, 1988.
[20] Ringertz, U.T., "Optimization of Structures with Nonlinear Response," Engineering
Optimization, 14, pp. 179-188, 1989.
[21] Haftka, R.T., and Kamat, M.P., "Simultaneous Nonlinear Structural Analysis
and Design," Computational Mechanics, 4, 6, pp. 409-416, 1989.
[22] Chibani, L., Optimum Design of Structures, Springer-Verlag, Berlin, Heidelberg,
1989.
[23] Bendsøe, M.P., Ben-Tal, A., and Haftka, R.T., "New Displacement-Based Methods
for Optimal Truss Topology Design," AIAA Paper 91-1215, Proceedings,
AIAA/ASME/ASCE/AHS/ASC 32nd Structures, Structural Dynamics and Materials
Conference, Baltimore, MD, April 8-10, 1991, Part 1, pp. 684-696.
[24] Shin, Y., Haftka, R.T., and Plaut, R.H., "Simultaneous Analysis and Design for
Eigenvalue Maximization," AIAA J., 26, 6, pp. 738-744, 1988.
[25] Pedersen, P., "On the Minimum Mass Layout of Trusses," AGARD Conference
Proceedings, No. 36 on Symposium on Structural Optimization, Turkey, October,
1969, pp. 11.1-11.17, 1970.
[26] Vanderplaats, G.N., and Moses, F., "Automated Design of Trusses for Optimum
Geometry," J. of the Structural Division, ASCE, 98, ST3, pp. 671-690, 1972.
[27] Spillers, W.R., "Iterative Design for Optimal Geometry," J. of the Structural
Division, ASCE, 101, ST7, pp. 1435-1442, 1975.
[28] Kirsch, U., "Synthesis of Structural Geometry using Approximation Concepts,"
Computers and Structures, 15, 3, pp. 303-314, 1982.
[29] Ginsburg, S., and Kirsch, U., "Design of Protective Structures against Blast," J.
of the Structural Division, ASCE, 109 (6), pp. 1490-1506, 1983.
[30] Kirsch, U., "Multilevel Synthesis of Standard Building Structures," Engineering
Optimization, 7, pp. 105-120, 1984.
[31] Kirsch, U., "A Bounding Procedure for Synthesis of Prestressed Systems,"
Computers and Structures, 20 (5), pp. 885-895, 1985.
[32] Sobieszczanski-Sobieski, J., "Sensitivity of Complex, Internally Coupled
Systems," AIAA Journal, 28 (1), pp. 153-160, 1990.

Optimum Design of Laminated Composite Structures 11

Because of their superior mechanical properties compared to single phase materials,
laminated fibrous composite materials are finding a wide range of applications
in structural design, especially for lightweight structures that have stringent
stiffness and strength requirements. Designing with laminated composites, on the other
hand, has become a challenge for the designer because of a wide range of parameters
that can be varied, and because the complex behavior and multiple failure modes of
these structures require sophisticated analysis techniques. Finding an efficient com-
posite structural design that meets the requirements of a certain application can be
achieved not only by sizing the cross-sectional areas and member thicknesses, but
also by global or local tailoring of the material properties through selective use of
orientation, number, and stacking sequence of laminae that make up the composite
laminate. The increased number of design variables is both a blessing and a curse
for the designer, in that he has more control to fine-tune his structure to meet
design requirements, but only if he can figure out how to select those design variables.
The possibility of achieving an efficient design that is safe against multiple failure
mechanisms, coupled with the difficulty in selecting the values of a large set of design
variables makes structural optimization an obvious tool for the design of laminated
composite structures.
Because of the need for sophisticated analysis tools for most realistic applications,
designing with laminated composites has largely relied on procedures that simply
coupled those analyses with black-box optimizers. However, the peculiarities
associated with optimization of composites are best understood through simple
examples. In this chapter we emphasize examples that focus on basic concepts.

11.1 Mechanical Response of a Laminate

While laminated composite materials are attractive replacements for metallic
materials for many structural applications that require high stiffness-to-weight and high
strength-to-weight ratios, the analysis and design of these materials are considerably


more complex than those of metallic structures. One of the complexities in formu-
lating the analysis of a laminated composite material is due to material anisotropy
that requires an increased number of material constants for characterization of the
mechanical response of the laminate. The generalized Hooke's law for an anisotropic
material is given in terms of 21 independent stiffness coefficients. It is this aspect
of composite materials which makes them attractive for optimal design and tailoring
purposes. However, for a general structure with a three-dimensional stress state it is
very difficult to solve the governing equations. Fortunately, most composite structures
are plate-type structures which are composed of layers or plies of orthotropic material
which can be characterized in terms of a smaller number of stiffness constants. In
the following section, the basic equations that govern the mechanical response of an
orthotropic lamina are summarized.

Figure 11.1.1 An orthotropic lamina with off-axis principal material directions.

11.1.1 Orthotropic Lamina

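For an orthotropic material with the axes of orthotropy 1-2 aligned with the x-y
coordinate axes ($\theta = 0$ in Fig. 11.1.1), the stress-strain relation in the principal
material directions is given by the following set of equations with 9 independent
constants.

$$
\begin{Bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \\ \tau_{23} \\ \tau_{31} \\ \tau_{12} \end{Bmatrix}
=
\begin{bmatrix}
C_{11} & C_{12} & C_{13} & 0 & 0 & 0 \\
C_{12} & C_{22} & C_{23} & 0 & 0 & 0 \\
C_{13} & C_{23} & C_{33} & 0 & 0 & 0 \\
0 & 0 & 0 & C_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & C_{55} & 0 \\
0 & 0 & 0 & 0 & 0 & C_{66}
\end{bmatrix}
\begin{Bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \\ \gamma_{23} \\ \gamma_{31} \\ \gamma_{12} \end{Bmatrix} .
\tag{11.1.1}
$$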

Furthermore, by assuming a plane stress state in each of the layers in the 1-2 principal
material plane, we have

$$\sigma_3 = 0, \quad \tau_{23} = 0, \quad \text{and} \quad \tau_{31} = 0 , \tag{11.1.2}$$


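which reduces the stress-strain relations to [1]

$$
\begin{Bmatrix} \sigma_1 \\ \sigma_2 \\ \tau_{12} \end{Bmatrix}
=
\begin{bmatrix} Q_{11} & Q_{12} & 0 \\ Q_{12} & Q_{22} & 0 \\ 0 & 0 & Q_{66} \end{bmatrix}
\begin{Bmatrix} \epsilon_1 \\ \epsilon_2 \\ \gamma_{12} \end{Bmatrix} ,
\tag{11.1.3}
$$

where the $Q_{ij}$'s are called the reduced stiffnesses and are given in terms of four
independent engineering material constants in principal material directions as

$$
Q_{11} = \frac{E_1}{1 - \nu_{12}\nu_{21}}, \quad
Q_{22} = \frac{E_2}{1 - \nu_{12}\nu_{21}}, \quad
Q_{12} = \frac{\nu_{12}E_2}{1 - \nu_{12}\nu_{21}} = \frac{\nu_{21}E_1}{1 - \nu_{12}\nu_{21}}, \quad
\text{and} \quad Q_{66} = G_{12} .
\tag{11.1.4}
$$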
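As a minimal sketch of Eq. (11.1.4), the reduced stiffnesses can be evaluated directly from the four engineering constants. The function name and the numerical ply properties below are illustrative assumptions, not data from the text; the minor Poisson's ratio is obtained from the reciprocity relation $\nu_{12}E_2 = \nu_{21}E_1$ implied by the two expressions for $Q_{12}$.

```python
def reduced_stiffnesses(E1, E2, nu12, G12):
    """Reduced stiffnesses Q11, Q22, Q12, Q66 of Eq. (11.1.4)."""
    # Minor Poisson's ratio from reciprocity, nu12 * E2 = nu21 * E1, which is
    # what makes the two expressions for Q12 in Eq. (11.1.4) identical.
    nu21 = nu12 * E2 / E1
    denom = 1.0 - nu12 * nu21
    Q11 = E1 / denom
    Q22 = E2 / denom
    Q12 = nu12 * E2 / denom
    Q66 = G12
    return Q11, Q22, Q12, Q66


# Assumed illustrative properties (GPa), typical of a graphite/epoxy ply.
Q11, Q22, Q12, Q66 = reduced_stiffnesses(E1=181.0, E2=10.3, nu12=0.28, G12=7.17)
print(Q11, Q22, Q12, Q66)
```

Because $1 - \nu_{12}\nu_{21}$ is slightly less than one, $Q_{11}$ and $Q_{22}$ come out slightly larger than $E_1$ and $E_2$.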
Since the orthotropic layers are generally rotated with respect to reference coor-
dinate axes, the stress-strain relations given in the principal directions of material
orthotropy Eq. (11.1.3) must be transformed to these axes. This transformation
produces
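$$
\begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix}
=
\begin{bmatrix}
\bar{Q}_{11} & \bar{Q}_{12} & \bar{Q}_{16} \\
\bar{Q}_{12} & \bar{Q}_{22} & \bar{Q}_{26} \\
\bar{Q}_{16} & \bar{Q}_{26} & \bar{Q}_{66}
\end{bmatrix}
\begin{Bmatrix} \epsilon_x \\ \epsilon_y \\ \gamma_{xy} \end{Bmatrix} ,
\tag{11.1.5}
$$

where the transformed reduced stiffnesses $\bar{Q}_{ij}$ are related to the $Q_{ij}$ by

$$
\begin{aligned}
\bar{Q}_{11} &= Q_{11}\cos^4\theta + 2(Q_{12} + 2Q_{66})\sin^2\theta\cos^2\theta + Q_{22}\sin^4\theta , \\
\bar{Q}_{12} &= (Q_{11} + Q_{22} - 4Q_{66})\cos^2\theta\sin^2\theta + Q_{12}(\sin^4\theta + \cos^4\theta) , \\
\bar{Q}_{22} &= Q_{11}\sin^4\theta + 2(Q_{12} + 2Q_{66})\sin^2\theta\cos^2\theta + Q_{22}\cos^4\theta , \\
\bar{Q}_{16} &= (Q_{11} - Q_{12} - 2Q_{66})\sin\theta\cos^3\theta + (Q_{12} - Q_{22} + 2Q_{66})\sin^3\theta\cos\theta , \\
\bar{Q}_{26} &= (Q_{11} - Q_{12} - 2Q_{66})\sin^3\theta\cos\theta + (Q_{12} - Q_{22} + 2Q_{66})\sin\theta\cos^3\theta , \\
\bar{Q}_{66} &= (Q_{11} + Q_{22} - 2Q_{12} - 2Q_{66})\sin^2\theta\cos^2\theta + Q_{66}(\sin^4\theta + \cos^4\theta) .
\end{aligned}
\tag{11.1.6}
$$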

Equations (11.1.6) are the basic building blocks of the classical lamination theory
which will be discussed next. These equations, however, can be put into a simpler
form in terms of the angular orientation of the principal axis of orthotropy with
respect to the reference x-y coordinate system. Tsai and Pagano [2] defined the
following material properties that are invariant with respect to ply orientation

$$
\begin{aligned}
U_1 &= \tfrac{1}{8}(3Q_{11} + 3Q_{22} + 2Q_{12} + 4Q_{66}) , \\
U_2 &= \tfrac{1}{2}(Q_{11} - Q_{22}) , \\
U_3 &= \tfrac{1}{8}(Q_{11} + Q_{22} - 2Q_{12} - 4Q_{66}) , \\
U_4 &= \tfrac{1}{8}(Q_{11} + Q_{22} + 6Q_{12} - 4Q_{66}) , \\
U_5 &= \tfrac{1}{8}(Q_{11} + Q_{22} - 2Q_{12} + 4Q_{66}) .
\end{aligned}
\tag{11.1.7}
$$

Using various trigonometric identities, we can write the transformed reduced stiff-
nesses of Eq. (11.1.6) as
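$$
\begin{aligned}
\bar{Q}_{11} &= U_1 + U_2\cos 2\theta + U_3\cos 4\theta , \\
\bar{Q}_{12} &= U_4 - U_3\cos 4\theta , \\
\bar{Q}_{22} &= U_1 - U_2\cos 2\theta + U_3\cos 4\theta , \\
\bar{Q}_{16} &= -\tfrac{1}{2}U_2\sin 2\theta - U_3\sin 4\theta , \\
\bar{Q}_{26} &= -\tfrac{1}{2}U_2\sin 2\theta + U_3\sin 4\theta , \\
\bar{Q}_{66} &= U_5 - U_3\cos 4\theta .
\end{aligned}
\tag{11.1.8}
$$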
The above form of the reduced stiffnesses is simpler than those shown in Eq.
(11.1.6) in terms of the ply orientation and, therefore, is useful for design optimization
purposes where derivatives of the stiffnesses with respect to the orientation variables
are needed.
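The equivalence of the two forms can be checked numerically. The sketch below compares the power form of Eq. (11.1.6) with the multiple-angle form of Eq. (11.1.8) for the 11, 12, 22, and 66 terms. The function names and the reduced stiffness values are illustrative assumptions; the 16 and 26 terms are left out because their sign depends on the sense in which the ply angle is measured.

```python
import math


def qbar_from_powers(Q11, Q22, Q12, Q66, theta):
    """Qbar_11, Qbar_12, Qbar_22, Qbar_66 from the power form, Eq. (11.1.6)."""
    c, s = math.cos(theta), math.sin(theta)
    Qb11 = Q11 * c**4 + 2 * (Q12 + 2 * Q66) * s**2 * c**2 + Q22 * s**4
    Qb12 = (Q11 + Q22 - 4 * Q66) * s**2 * c**2 + Q12 * (s**4 + c**4)
    Qb22 = Q11 * s**4 + 2 * (Q12 + 2 * Q66) * s**2 * c**2 + Q22 * c**4
    Qb66 = (Q11 + Q22 - 2 * Q12 - 2 * Q66) * s**2 * c**2 + Q66 * (s**4 + c**4)
    return Qb11, Qb12, Qb22, Qb66


def invariants(Q11, Q22, Q12, Q66):
    """Orientation-invariant properties U1..U5 of Eq. (11.1.7)."""
    U1 = (3 * Q11 + 3 * Q22 + 2 * Q12 + 4 * Q66) / 8
    U2 = (Q11 - Q22) / 2
    U3 = (Q11 + Q22 - 2 * Q12 - 4 * Q66) / 8
    U4 = (Q11 + Q22 + 6 * Q12 - 4 * Q66) / 8
    U5 = (Q11 + Q22 - 2 * Q12 + 4 * Q66) / 8
    return U1, U2, U3, U4, U5


def qbar_from_invariants(U, theta):
    """The same four terms from the multiple-angle form, Eq. (11.1.8)."""
    U1, U2, U3, U4, U5 = U
    c2, c4 = math.cos(2 * theta), math.cos(4 * theta)
    return (U1 + U2 * c2 + U3 * c4, U4 - U3 * c4,
            U1 - U2 * c2 + U3 * c4, U5 - U3 * c4)


# Assumed reduced stiffnesses (GPa) of a graphite/epoxy ply: Q11, Q22, Q12, Q66.
Q = (181.8, 10.35, 2.90, 7.17)
theta = math.radians(30.0)
print(qbar_from_powers(*Q, theta))
print(qbar_from_invariants(invariants(*Q), theta))
```

In the multiple-angle form the derivative with respect to $\theta$ is immediate, for example $\partial\bar{Q}_{11}/\partial\theta = -2U_2\sin 2\theta - 4U_3\sin 4\theta$, which is the practical advantage for sensitivity calculations.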

Figure 11.1.2 Laminate stacking convention.

11.1.2 Classical Laminated Plate Theory

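Classical lamination theory (CLT) assumes that the N orthotropic layers described
above are perfectly bonded together, as in Fig. 11.1.2, with a non-shear-deformable,
infinitely thin bondline. Kirchhoff plate theory is used, which assumes a linear
through-the-thickness variation of the in-plane displacements,

$$u = u_0 - z\frac{\partial w_0}{\partial x}, \quad v = v_0 - z\frac{\partial w_0}{\partial y} , \tag{11.1.9}$$

and vanishing through-the-thickness strain components, $\epsilon_z = \gamma_{xz} = \gamma_{yz} = 0$ and
$w = w_0$. The strain distribution is, therefore

$$
\begin{Bmatrix} \epsilon_x \\ \epsilon_y \\ \gamma_{xy} \end{Bmatrix}
=
\begin{Bmatrix} \epsilon_x^0 \\ \epsilon_y^0 \\ \gamma_{xy}^0 \end{Bmatrix}
+ z \begin{Bmatrix} \kappa_x \\ \kappa_y \\ \kappa_{xy} \end{Bmatrix} ,
\tag{11.1.10}
$$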

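where the superscript 0 indicates the mid-plane strains, and the $\kappa$'s are
the mid-plane curvatures. Therefore, the stresses in the kth ply can be expressed in
terms of the reduced stiffnesses of that particular ply by substituting Eq. (11.1.10)
into the stress-strain relationship, Eq. (11.1.5)

$$
\begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix}_k
=
\begin{bmatrix}
\bar{Q}_{11} & \bar{Q}_{12} & \bar{Q}_{16} \\
\bar{Q}_{12} & \bar{Q}_{22} & \bar{Q}_{26} \\
\bar{Q}_{16} & \bar{Q}_{26} & \bar{Q}_{66}
\end{bmatrix}_k
\left(
\begin{Bmatrix} \epsilon_x^0 \\ \epsilon_y^0 \\ \gamma_{xy}^0 \end{Bmatrix}
+ z \begin{Bmatrix} \kappa_x \\ \kappa_y \\ \kappa_{xy} \end{Bmatrix}
\right) .
\tag{11.1.11}
$$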

Figure 11.1.3 Stress and moment resultants in a laminate.

The net stress resultant and moment resultant (stress couple) per unit length of
the cross section acting at a point in the laminate, see Fig. 11.1.3, are obtained by
through-the-thickness integration of the stresses in each ply,

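$$
\begin{Bmatrix} N_x \\ N_y \\ N_{xy} \end{Bmatrix}
= \int_{-h/2}^{h/2} \begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix}_k dz
= \sum_{k=1}^{N} \int_{z_{k-1}}^{z_k} \begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix}_k dz ,
\tag{11.1.12}
$$

$$
\begin{Bmatrix} M_x \\ M_y \\ M_{xy} \end{Bmatrix}
= \int_{-h/2}^{h/2} \begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix}_k z\, dz
= \sum_{k=1}^{N} \int_{z_{k-1}}^{z_k} \begin{Bmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{Bmatrix}_k z\, dz .
\tag{11.1.13}
$$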

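Substituting the stress-strain relations of Eq. (11.1.11), we obtain the following
constitutive relations for the laminate,

$$
\begin{Bmatrix} N_x \\ N_y \\ N_{xy} \end{Bmatrix}
= [A] \begin{Bmatrix} \epsilon_x^0 \\ \epsilon_y^0 \\ \gamma_{xy}^0 \end{Bmatrix}
+ [B] \begin{Bmatrix} \kappa_x \\ \kappa_y \\ \kappa_{xy} \end{Bmatrix} ,
\tag{11.1.14}
$$

and

$$
\begin{Bmatrix} M_x \\ M_y \\ M_{xy} \end{Bmatrix}
= [B] \begin{Bmatrix} \epsilon_x^0 \\ \epsilon_y^0 \\ \gamma_{xy}^0 \end{Bmatrix}
+ [D] \begin{Bmatrix} \kappa_x \\ \kappa_y \\ \kappa_{xy} \end{Bmatrix} ,
\tag{11.1.15}
$$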


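where

$$A_{ij} = \sum_{k=1}^{N} (\bar{Q}_{ij})_k (z_k - z_{k-1}) , \tag{11.1.16}$$

$$B_{ij} = \frac{1}{2}\sum_{k=1}^{N} (\bar{Q}_{ij})_k (z_k^2 - z_{k-1}^2) , \tag{11.1.17}$$

$$D_{ij} = \frac{1}{3}\sum_{k=1}^{N} (\bar{Q}_{ij})_k (z_k^3 - z_{k-1}^3) . \tag{11.1.18}$$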
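The three summations of Eqs. (11.1.16)-(11.1.18) translate directly into a loop over the plies. The sketch below assumes the standard classical lamination theory forms of these sums; the function name, the argument layout (one 3 x 3 transformed reduced stiffness matrix per ply, plus the N + 1 interface coordinates $z_0, \ldots, z_N$), and the two-ply example are illustrative assumptions.

```python
def abd_matrices(qbars, z):
    """A, B, D matrices from per-ply 3x3 Qbar matrices and interfaces z_0..z_N."""
    n = len(qbars)
    if len(z) != n + 1:
        raise ValueError("need N + 1 interface coordinates for N plies")
    A = [[0.0] * 3 for _ in range(3)]
    B = [[0.0] * 3 for _ in range(3)]
    D = [[0.0] * 3 for _ in range(3)]
    for k in range(n):
        z0, z1 = z[k], z[k + 1]  # bottom and top surface of ply k+1
        for i in range(3):
            for j in range(3):
                q = qbars[k][i][j]
                A[i][j] += q * (z1 - z0)
                B[i][j] += q * (z1 ** 2 - z0 ** 2) / 2.0
                D[i][j] += q * (z1 ** 3 - z0 ** 3) / 3.0
    return A, B, D


# Hypothetical two-ply example: identical plies with a unit stiffness matrix,
# total thickness h = 2, so A = h*I, B = 0 and D = (h^3/12)*I.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A, B, D = abd_matrices([identity, identity], [-1.0, 0.0, 1.0])
print(A[0][0], B[0][0], D[0][0])
```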

11.1.3 Bending, Extension, and Shear Coupling

The A and D matrices are the extensional and flexural stiffness matrices, respec-
tively. The A matrix relates the in-plane stress resultants to the mid-plane strains,
and the D matrix relates the moment resultants to the curvatures. The B matrix,
on the other hand, relates the in-plane stress resultants to the curvatures and mo-
ment resultants to the mid-plane strains, and hence is called the bending-extension
coupling matrix. This coupling matrix can be a useful tool in designing laminates
for certain structural applications. If it is undesirable, the B matrix can be avoided
by a symmetric placement of the plies with different orientations with respect to the
mid-plane of a laminate. However, as noted by Caprino and Crivelli-Visconti [3] and
by Gunnink [4], symmetry is a sufficient but not a necessary condition to avoid
coupling. It is shown by Kandil and Verchery [5] that a certain class of laminates, such as
laminates consisting of two symmetric sub-laminates with equal numbers of plies and
equal but arbitrary fiber orientations, $\theta_1$ and $\theta_2$, for which the minimum number of
layers is eight $[\theta_1/\theta_2/\theta_2/\theta_1/\theta_2/\theta_1/\theta_1/\theta_2]$, possess no bending-extension coupling. This
may be important for design optimization purposes because symmetric placement of
the plies may restrict certain combinations of the in-plane and bending stiffnesses.
In addition to the bending-extension coupling, certain elements of the A, B, and
D matrices result in coupling response. When the $A_{16}$ and $A_{26}$ terms are not zero,
there is a shear-extension coupling. The existence of $D_{16}$ and $D_{26}$ terms induces
bending-twisting coupling, and bending-shear coupling as well as extension-twisting
coupling results from non-zero $B_{16}$ and $B_{26}$ terms. Again, by proper selection of the
laminate, these coupling terms can be eliminated. For example, by using negative
angle plies for every positive angle ply used in the laminate one can eliminate the
shear-extension coupling. Such laminates are referred to as balanced laminates.
However, these same terms can also be manipulated to tailor the response of a laminate
to the needs of a specified design application, as in the case of aeroelastic tailoring
(see Section 11.4.2).
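The claim that the eight-layer unsymmetric laminate has no bending-extension coupling can be checked without any material data. Assuming plies of one material and equal thickness, every ply at a given angle shares the same $\bar{Q}_{ij}$, so B is a sum over orientations of $\bar{Q}_{ij}(\theta)$ times the first moment $\tfrac{1}{2}\sum (z_k^2 - z_{k-1}^2)$ of the plies at that orientation. If each first moment vanishes, B vanishes for any $\theta_1$ and $\theta_2$. The function name and the string labels below are illustrative assumptions.

```python
def first_moment_by_angle(sequence, ply_thickness=1.0):
    """Sum of (z_k^2 - z_{k-1}^2)/2 over the plies of each orientation label."""
    n = len(sequence)
    # Interface coordinates z_0 ... z_N, measured from the mid-plane.
    z = [(-n / 2.0 + k) * ply_thickness for k in range(n + 1)]
    moments = {}
    for k, label in enumerate(sequence):
        moments[label] = moments.get(label, 0.0) + 0.5 * (z[k + 1] ** 2 - z[k] ** 2)
    return moments


# Eight-layer stacking sequence quoted in the text; "t1", "t2" stand for theta_1, theta_2.
print(first_moment_by_angle(["t1", "t2", "t2", "t1", "t2", "t1", "t1", "t2"]))
# A two-layer unsymmetric stack for contrast: the first moments do not vanish.
print(first_moment_by_angle(["t1", "t2"]))
```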
The A, B, and D matrices are commonly used in the literature in the form
defined in Eqs. (11.1.16)-(11.1.18) together with the definitions of $\bar{Q}_{ij}$ given by
Eq. (11.1.6). However, for some design procedures, the use of sines and cosines of
multiple angles (see Eqs. 11.1.8) proved to be more useful, especially for derivation

of the sensitivities of these matrices with respect to the angular orientation design
variables. Starting with the integral form of the Eqs. (11.1.16)-(11.1.18), for example,

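$$\{A_{11}, B_{11}, D_{11}\} = \int_{-h/2}^{h/2} \bar{Q}_{11}\{1, z, z^2\}\, dz , \tag{11.1.19}$$

and assuming each layer to be of the same material, we have

$$
\{A_{11}, B_{11}, D_{11}\} = U_1\left\{h, 0, \frac{h^3}{12}\right\}
+ U_2 \int_{-h/2}^{h/2} \cos 2\theta\, \{1, z, z^2\}\, dz
+ U_3 \int_{-h/2}^{h/2} \cos 4\theta\, \{1, z, z^2\}\, dz .
\tag{11.1.20}
$$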
Similar expressions can be found for the other stiffness terms, and are summarized
in Table 11.1.1 where the expressions for the V's are the following
Table 11.1.1: A, B, D Matrices in Terms of Lamina Invariants

                        V0{A,B,D}   V1{A,B,D}   V2{A,B,D}   V3{A,B,D}   V4{A,B,D}
{A11, B11, D11}         U1          U2          0           U3          0
{A22, B22, D22}         U1          -U2         0           U3          0
{A12, B12, D12}         U4          0           0           -U3         0
{A66, B66, D66}         U5          0           0           -U3         0
2{A16, B16, D16}        0           0           -U2         0           -2U3
2{A26, B26, D26}        0           0           -U2         0           2U3

$$
\begin{aligned}
V_{0\{A,B,D\}} &= \left\{h, 0, \frac{h^3}{12}\right\} , \\
V_{1\{A,B,D\}} &= \int_{-h/2}^{h/2} \cos 2\theta\, \{1, z, z^2\}\, dz , \\
V_{2\{A,B,D\}} &= \int_{-h/2}^{h/2} \sin 2\theta\, \{1, z, z^2\}\, dz , \\
V_{3\{A,B,D\}} &= \int_{-h/2}^{h/2} \cos 4\theta\, \{1, z, z^2\}\, dz , \\
V_{4\{A,B,D\}} &= \int_{-h/2}^{h/2} \sin 4\theta\, \{1, z, z^2\}\, dz .
\end{aligned}
\tag{11.1.21}
$$

The above set of integrals can again be replaced by summations.
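Because the ply angle is constant within each ply, every integral in Eq. (11.1.21) reduces to a sum of the trigonometric factor times $(z_k - z_{k-1})$, $(z_k^2 - z_{k-1}^2)/2$, or $(z_k^3 - z_{k-1}^3)/3$. A minimal sketch of that summation follows; the function name, the dictionary layout, and the example laminate are illustrative assumptions.

```python
import math


def v_integrals(angles_deg, z):
    """Return {name: (V_A, V_B, V_D)} for V0..V4 of Eq. (11.1.21), ply by ply."""
    funcs = {
        "V0": lambda t: 1.0,
        "V1": lambda t: math.cos(2 * t),
        "V2": lambda t: math.sin(2 * t),
        "V3": lambda t: math.cos(4 * t),
        "V4": lambda t: math.sin(4 * t),
    }
    out = {}
    for name, f in funcs.items():
        VA = VB = VD = 0.0
        for k, deg in enumerate(angles_deg):
            w = f(math.radians(deg))     # constant within ply k+1
            z0, z1 = z[k], z[k + 1]
            VA += w * (z1 - z0)
            VB += w * (z1 ** 2 - z0 ** 2) / 2
            VD += w * (z1 ** 3 - z0 ** 3) / 3
        out[name] = (VA, VB, VD)
    return out


# Hypothetical [45/-45]_s laminate with unit ply thickness (h = 4).
V = v_integrals([45, -45, -45, 45], [-2.0, -1.0, 0.0, 1.0, 2.0])
for name in sorted(V):
    print(name, V[name])
```

For this balanced symmetric example the A and B parts of $V_2$ vanish while the D part does not, which, through Table 11.1.1, is the bending-twisting coupling mentioned in Section 11.1.3.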


11.2 Laminate Design

The laminate stiffness matrices described in the previous section can be manipulated
both by changing the number of layers and their orientations. Therefore, use of
these quantities as design variables enables us to change the material properties of a
laminate as well as its thickness. In many practical applications, bending-extension
and shear-extension coupling is undesirable. Consequently, most laminates in use
today are symmetric and balanced to eliminate these couplings. Balanced symmetric
laminates are also much easier to analyze. For example, analysis of a laminate with
bending-extension coupling is difficult because out-of-plane deformations associated
with in-plane loads may be large and, therefore, require nonlinear analysis capability.
Therefore, most optimization work to date has been limited to balanced symmetric
laminates. In the remainder of this chapter only such laminates are considered.
Most commercially available composite materials come in fixed ply thickness.
Furthermore, most of the data available for laminate behavior is limited to ply
orientations of 0-, 90-, and ±45-deg. For these reasons, laminate design is primarily an
integer programming problem. However, most of the available optimization software
is for continuous-valued design variables, and the past work on laminate optimization
is based on the use of such variables. The total thicknesses of contiguous plies of
the same orientation, referred to as the ply thickness variables, were commonly used
as design variables. Ply orientations were also occasionally used as design variables,
with orientations taking any value between 0- and 90-deg. The final ply thicknesses
(or orientations) can be rounded off to integer multiples of the commercially available
ply thickness (or conventional ply orientations). However, for a large number of design
variables, finding a rounded-off design that does not violate any constraint is often
difficult. Also, the problem must be formulated with a given stacking sequence, rather
than letting the optimization obtain the best stacking sequence. For these reasons,
there is a growing interest in the application of integer programming methods to
laminate design. We start this chapter with a description of approaches that
implement traditional continuous-valued variables, with integer programming applications
described in section 11.3.
There are a number of design considerations for optimization of laminated plates
depending on the intended application. One of the key considerations in terms of
analysis and design is whether the plate is designed for in-plane or out-of-plane re-
sponse. For the sake of simplicity we review these two cases separately.
11.2.1 Design of Laminates for In-plane Response

Ply Thickness Variables: One of the earliest efforts in designing laminates for in-plane
strength and stiffness requirements is due to Schmit and Farshi [6] who considered
a symmetric balanced laminate with fixed ply orientations. The thicknesses of the
individual layers $t_i$, $i = 1, \ldots, I$ with different prescribed orientations were used as
design variables. Because of the symmetric laminate restriction, only the thicknesses
of one half, I, of the total number of layers, N, are used. The laminate is under the
action of combined membrane force resultants, $N_{xk}$, $N_{yk}$, $N_{xyk}$, $k = 1, \ldots, K$ where K
is the number of load cases.

The optimization problem is formulated as the following:

$$\text{minimize} \quad W = \sum_{i=1}^{I} 2\rho_i t_i \tag{11.2.1}$$

$$\text{subject to} \quad g^S_{ijk} = 1 - \left(P_j^{(i)}\epsilon_{1ik} + Q_j^{(i)}\epsilon_{2ik} + R_j^{(i)}\gamma_{12ik}\right) \ge 0 , \tag{11.2.2}$$

$$A_{11} \ge A_{11l}, \quad A_{22} \ge A_{22l}, \quad A_{66} \ge A_{66l} , \tag{11.2.3}$$

$$\text{and} \quad t_i \ge 0 , \tag{11.2.4}$$

$$\text{for} \quad i = 1, \ldots, I, \quad j = 1, \ldots, J, \quad k = 1, \ldots, K , \tag{11.2.5}$$

where $P_j^{(i)}$, $Q_j^{(i)}$, $R_j^{(i)}$ are coefficients which define the jth boundary of a failure
envelope for each layer (i) in the strain space, and the $\epsilon_{1ik}$, $\epsilon_{2ik}$, and $\gamma_{12ik}$ are the principal
material-direction strains in the ith layer under the kth load condition. For a simple
maximum strain criterion, which puts bounds on the maximum values of the strains
in the principal material directions, the failure envelope has 6 facets with P, Q, and
R defined as the inverse of the normal and shearing failure strains in the longitudinal
and transverse directions to the fibers in tension and compression. Equations (11.2.3)
prescribe lower limits $A_{11l}$, $A_{22l}$, and $A_{66l}$ of the membrane stiffnesses of the laminate.
The approach used by Schmit and Farshi transforms the nonlinear programming
problem described in Eqs. (11.2.1)-(11.2.5) into a sequence of linear programs (see
section 6.1). The inequality constraint Eq. (11.2.2) representing the strength criterion
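is a nonlinear function of the thickness variables and, therefore, is linearized as

$$
g^S_{ijkL}(\mathbf{t}) = g^S_{ijk}(\mathbf{t}_0) - \sum_{l=1}^{I} (t_l - t_{0l})
\left( P_j^{(i)}\frac{\partial \epsilon_{1ik}}{\partial t_l}
+ Q_j^{(i)}\frac{\partial \epsilon_{2ik}}{\partial t_l}
+ R_j^{(i)}\frac{\partial \gamma_{12ik}}{\partial t_l} \right) ,
\tag{11.2.6}
$$

where the derivatives of the principal strains in the ith layer are related to the
derivatives of the laminate strains through the transformation relations

$$\frac{\partial \boldsymbol{\epsilon}_{ik}}{\partial t_l} = \mathbf{T}_i \frac{\partial \boldsymbol{\epsilon}^0_k}{\partial t_l} , \tag{11.2.7}$$

where $\boldsymbol{\epsilon}_{ik} = (\epsilon_{1ik}, \epsilon_{2ik}, \gamma_{12ik})^T$, and $\mathbf{T}_i$ is the transformation matrix for the ith layer
defined by

$$
\mathbf{T}_i = \begin{bmatrix}
\cos^2\theta_i & \sin^2\theta_i & \cos\theta_i\sin\theta_i \\
\sin^2\theta_i & \cos^2\theta_i & -\cos\theta_i\sin\theta_i \\
-2\cos\theta_i\sin\theta_i & 2\cos\theta_i\sin\theta_i & \cos^2\theta_i - \sin^2\theta_i
\end{bmatrix} .
\tag{11.2.8}
$$

For a given in-plane loading condition $\mathbf{N}_k = (N_{xk}, N_{yk}, N_{xyk})^T$, the derivative of
the laminate strains with respect to the thickness variables can be determined by
differentiating Eq. (11.1.14), $\mathbf{N}_k = \mathbf{A}\boldsymbol{\epsilon}^0_k$, to obtain

$$
\frac{\partial \mathbf{N}_k}{\partial t_l} = \frac{\partial \mathbf{A}}{\partial t_l}\boldsymbol{\epsilon}^0_k
+ \mathbf{A}\frac{\partial \boldsymbol{\epsilon}^0_k}{\partial t_l} = 0 .
\tag{11.2.9}
$$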
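Since the A matrix is a linear function of the thickness variables (see Eq. 11.1.16),
the derivative is simply equal to the transformed reduced stiffnesses of the lth layer

$$\frac{\partial \mathbf{A}}{\partial t_l} = \bar{\mathbf{Q}}_l , \tag{11.2.10}$$

so that from Eq. (11.2.9) we have

$$\frac{\partial \boldsymbol{\epsilon}^0_k}{\partial t_l} = -\mathbf{A}^{-1}\bar{\mathbf{Q}}_l\boldsymbol{\epsilon}^0_k . \tag{11.2.11}$$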
Equation (11.2.6) together with (11.2.7) and (11.2.11) can be used to form the linear
approximations at any stage of the design optimization.
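The strain sensitivity of Eq. (11.2.11) can be checked against finite differences. The sketch below is deliberately reduced to a cross-ply laminate (0-deg and 90-deg layers only) with no shear load, so that only the 2 x 2 normal-stress block of the stiffness matrices is needed. It also assumes $\mathbf{A} = \sum_i t_i \bar{\mathbf{Q}}_i$, with $t_i$ read as the total thickness at orientation i, so that $\partial\mathbf{A}/\partial t_l = \bar{\mathbf{Q}}_l$ as in Eq. (11.2.10). All names and numerical values are illustrative assumptions.

```python
def solve2(M, r):
    """Solve the 2x2 system M x = r by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]


# Assumed normal-stress blocks (GPa) of the transformed reduced stiffnesses
# of a 0-deg and a 90-deg layer; shear is uncoupled here and left out.
QBAR = [[[181.8, 2.90], [2.90, 10.35]],
        [[10.35, 2.90], [2.90, 181.8]]]


def a_matrix(t):
    """A = sum_i t_i * Qbar_i, with t_i the total thickness at orientation i."""
    return [[sum(ti * Q[r][c] for ti, Q in zip(t, QBAR)) for c in range(2)]
            for r in range(2)]


def strains(t, N):
    """Mid-plane strains (eps_x, eps_y) from N = A eps."""
    return solve2(a_matrix(t), N)


def strain_sensitivity(t, N, l):
    """d(eps)/d(t_l) = -A^{-1} Qbar_l eps, Eq. (11.2.11)."""
    eps = strains(t, N)
    rhs = [-(QBAR[l][r][0] * eps[0] + QBAR[l][r][1] * eps[1]) for r in range(2)]
    return solve2(a_matrix(t), rhs)


print(strain_sensitivity([1.0, 0.5], [1.0, 0.2], 0))
```

Each sensitivity costs one extra solve with the already available A matrix, which is what makes the repeated linearizations of the sequential linear programming approach inexpensive.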
In addition to the constraint approximation, Schmit and Farshi also used a con-
straint deletion technique by including only those constraints that are potentially
critical at each stage of the constraint approximations.
Table 11.2.1: Minimum weight laminates with stiffness constraints loaded in
axial compression.

Layup           Layer    Orientation   Initial Design   Final Design   Final Design   Number of Plies
                Number   Angle (deg)   t_i (in.)        t_f (in.)      (%)            (rounded)
[0/±45/90]_s    1          0           0.032281         0.018793       28.96          4
                2        +45           0.032281         0.023048       35.52          6
                3        -45           0.032281         0.023048       35.52          6
                4         90           0.032281         0.000000        0             0
                Σ t_i                  0.129124         0.064890

[0/±45]_s       1          0           0.034194         0.012555       21.12          3
                2        +45           0.034194         0.023441       39.44          6
                3        -45           0.034194         0.023441       39.44          6
                Σ t_i                  0.102583         0.059438

Results of optimal designs for various conventional laminates with 0-deg, ±45-deg,
and 90-deg ply orientations under various combinations of in-plane normal and
shear loads presented in Ref. [6] demonstrate the importance of the choice of laminate
stacking sequence on the optimum design. For example, for a laminate under uniaxial
stress and limits on shear stiffness, it does make a difference whether we select a
$[0/\pm 45/90]_s$ laminate or $[0/\pm 45]_s$ laminate even though at the end of the design
iterations the thickness of the 90-deg plies of the first laminate vanishes. Results
of these laminates obtained from Ref. [6] are summarized in Table 11.2.1. The
final design of the first laminate has a critical strength constraint for the 90-deg


ply. Compared to the second laminate $[0/\pm 45]_s$, it is about 9% thicker due to
an additional O-deg ply required for the first laminate to prevent violation of the
strength constraint in the 90-deg layers. In order to achieve a true optimal solution,
therefore, the designer has to repeat the optimization process with different laminate
definitions, especially by removing the layer(s) that converge to their lower bounds.
However, the fact that a layer assumes a value different from its lower bound may not
mean that the particular layer is essential for the optimal design. That is, it is quite
possible that once a layer with a thickness different from its lower bound is removed,
the optimization procedure can resize the remaining layers to achieve a weight lower
than the one achieved before. This can make the design procedure difficult, because
of the need to try all possible combinations of preselected angles. However, for most
practical applications the presence of plies with fibers running in prescribed directions
(such as fibers transverse to the load direction) is desirable. Therefore, lower limits
which are generally different than zero are imposed, and ply removal is not an option.
Multiple load conditions also tend to produce designs where ply removal may not be
possible.
Ply Orientation Variables: In order to find the laminate stacking sequence which
is best suited to the load condition under consideration, the ply orientations of the
laminate as well as the ply thicknesses need to be used as design variables. Indeed
many design codes treat both as design variables. In order to demonstrate the use
of ply orientations as design variables, however, we concentrate on examples with
only ply orientation variables. For optimization problems formulated as minimum-
weight designs, the objective function is independent of the ply orientations. This
might cause difficulties in converging to an optimum solution with some optimiza-
tion algorithms. An alternative to the weight objective function minimization is the
maximization of the laminate strength as demonstrated by Park [7] and Massard [8].
A quadratic first-ply failure (FPF) criterion based on an approximate failure
envelope in the strain space [9] is used by Park [7] for laminates under various
in-plane loading conditions ($N_x$, $N_y$, $N_{xy}$). This approximate failure envelope is given
by

$$\epsilon_x^2 + \epsilon_y^2 + \tfrac{1}{2}\gamma_{xy}^2 = b_0^2 , \tag{11.2.12}$$

where $b_0$ is defined solely in terms of the stiffness and strength properties in the
principal material directions. The objective function to be minimized is defined as

$$f = \epsilon_x^2 + \epsilon_y^2 + \tfrac{1}{2}\gamma_{xy}^2 , \tag{11.2.13}$$

which represents the square of the norm of the strain vector. The smaller the
objective function value, the larger the loads that can be applied to the laminate before
the failure envelope is violated and, therefore, the stronger the laminate in FPF. One
key feature of this approximate strain failure envelope is that it applies to laminate
strains and does not require ply-level strain calculations. Only balanced symmetric
laminates are considered in Ref. [7], and six different laminates were studied, five
of which were the following conventional layups: $[-\theta, +\theta]_s$, $[-\theta, 0, +\theta]_s$, $[-\theta, 90, +\theta]_s$,
$[-\theta, 0, 90, +\theta]_s$, and $[-\theta, -45, +45, +\theta]_s$. The sixth laminate was called a continuous
laminate, and was assumed to have fiber orientation changing linearly from the
top surface to the mid-plane of the laminate covering a range from $-\theta$ to $+\theta$-deg
orientations.
Results in [7] showed that under combined loading the best laminate, according
to the FPF criterion, for large longitudinal loading without shear is the $[-\theta, 0, +\theta]_s$
type, and for large shear loading without the longitudinal load, the best is the
$[-\theta, -45, +45, +\theta]_s$ laminate. The optimum angle for the $[-\theta, 0, +\theta]_s$ laminate
depended on the magnitude of the transverse load $N_y$, and was equal to 0-deg for $N_y = 0$.
As the transverse load is increased, the optimum angle reached 45-deg for $N_y = N_x/2$,
and was equal to 60-deg for $N_y = N_x$. Similarly, for the $[-\theta, -45, +45, +\theta]_s$ laminate
(with shear loading and no axial loading), the optimal angle was 45-deg for $N_y = 0$.
As the transverse load $N_y$ increased, the optimal angle increased and reached a value
of about 73-deg for $N_y = N_{xy}$. The continuous laminate proved to have the best
overall performance under combined longitudinal and shear loadings with a range
±65-deg for $N_y = 0$.
The above results were intuitively appealing in that the fibers were mostly placed
in a direction parallel to the applied loads. But such intuition may not always lead
to optimal designs when working with composite materials. Consider, for example,
using Hill's yield stress criterion interpreted for composite materials by Tsai [10],

$$
f = \left(\frac{\sigma_1}{X}\right)^2 - \frac{\sigma_1\sigma_2}{X^2}
+ \left(\frac{\sigma_2}{Y}\right)^2 + \left(\frac{\tau_{12}}{S}\right)^2 \le 1 ,
\tag{11.2.14}
$$

for the strength prediction of a unidirectional composite. The quantities X, Y are the
normal strengths in directions parallel and transverse to the fibers, and S is the shear
strength of a ply. Brandmaier showed [11] that if the transverse normal strength Y
is less than the shear strength S, optimal placement of the fibers is not along the
principal stress directions, but depends on the values of the strength quantities as well
as the applied stresses. This can be demonstrated (see Exercise 1) by expressing the
stresses in the principal material directions in terms of the applied stresses $\sigma_x$, $\sigma_y$,
$\tau_{xy}$, and the fiber orientation $\theta$, and equating the derivative of Eq. (11.2.14) with
respect to the fiber orientation to zero.
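The effect can be seen numerically by scanning the fiber angle instead of differentiating. The sketch below evaluates the left-hand side of Eq. (11.2.14) after rotating the applied stresses into the material axes. The strengths (with Y < S) and the stress state, whose principal directions are the x and y axes, are assumed values chosen for illustration; this is not Brandmaier's own example.

```python
import math


def tsai_hill(theta_deg, sx, sy, txy, X, Y, S):
    """Left-hand side of Eq. (11.2.14) for fibers at theta_deg from the x axis."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    s1 = sx * c * c + sy * s * s + 2 * txy * s * c      # along the fibers
    s2 = sx * s * s + sy * c * c - 2 * txy * s * c      # transverse to the fibers
    t12 = (sy - sx) * s * c + txy * (c * c - s * s)     # in-plane shear
    return (s1 / X) ** 2 - s1 * s2 / X ** 2 + (s2 / Y) ** 2 + (t12 / S) ** 2


# Assumed strengths (MPa) with Y < S, and an assumed stress state with no
# applied shear, so the principal stress directions are 0 and 90 deg.
X, Y, S = 1500.0, 40.0, 68.0
sx, sy, txy = 10.0, -10.0, 0.0
best = min(range(0, 91), key=lambda a: tsai_hill(a, sx, sy, txy, X, Y, S))
print(best, tsai_hill(best, sx, sy, txy, X, Y, S), tsai_hill(0, sx, sy, txy, X, Y, S))
```

For this stress state the transverse stress dominates the criterion when the fibers follow a principal direction, so the scan settles on an off-axis angle where the load is carried in shear instead.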
A Graphical Tool for Optimum Design: A graphical procedure introduced by
Miki [12, 13] for the design of laminates with prescribed in-plane stiffness properties is
a highly practical tool for design optimization. The procedure is suitable for multiple
balanced angle-ply laminates of the type $[(\pm\theta_I)_{N_I}/(\pm\theta_{I-1})_{N_{I-1}}/\cdots/(\pm\theta_1)_{N_1}]_s$ where
the total number of plies in the laminate is $N = 2\sum_{i=1}^{I} N_i$. In addition to the
balanced angle-ply sub-laminates, one unidirectional lamina with principal material
axes aligned with the axes of the laminate can be included into the stacking sequence.
The major effort of this design procedure is the construction of a lamination
parameter diagram which describes the allowable region of lamination parameters
$V_1^*$ and $V_3^*$. These parameters are obtained by normalizing the in-plane components
$V_{1A}$ and $V_{3A}$ in Eq. (11.1.21) by the total laminate thickness. For a laminate
of total thickness h, in which the volume fraction of the plies with $\pm\theta_i$ orientation
angles is $v_i$, the lamination parameters are given as

$$
V_1^* = \frac{1}{h}V_{1A} = \sum_{k=1}^{I} v_k\cos 2\theta_k , \quad \text{and} \quad
V_3^* = \frac{1}{h}V_{3A} = \sum_{k=1}^{I} v_k\cos 4\theta_k ,
\tag{11.2.15}
$$
where

$$v_i = \frac{2(z_i - z_{i-1})}{h} , \quad \text{and} \quad \sum_{i=1}^{I} v_i = 1 . \tag{11.2.16}$$

Because of the normalization, the values of the lamination parameters are always
bounded, $-1 \le V_1^*, V_3^* \le 1$. For a laminate with only one fiber orientation angle,
the lamination parameters are

$$V_1^* = \cos 2\theta , \quad \text{and} \quad V_3^* = \cos 4\theta , \tag{11.2.17}$$

and the two parameters are related as

$$V_3^* = 2V_1^{*2} - 1 . \tag{11.2.18}$$

Figure 11.2.1 Lamination parameter diagram of a $[\pm\theta]_s$ laminate.

Values of all possible combinations of the lamination parameters are, therefore,
located along the curve ABC in Fig. 11.2.1 described by Eq. (11.2.18). Note that the
points A, B, and C correspond to laminates with [0], $[\pm 45]_s$, and [90] ply orientations,
respectively. For each design point along the curve ABC the values of the effective
engineering elastic constants can be obtained from

$$
E_x = \frac{A_{11}A_{22} - A_{12}^2}{hA_{22}} , \quad
E_y = \frac{A_{11}A_{22} - A_{12}^2}{hA_{11}} , \quad
G_{xy} = \frac{A_{66}}{h} , \quad
\nu_{xy} = \frac{A_{12}}{A_{22}} ,
\tag{11.2.19}
$$

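where the elements of the extensional stiffness matrix of the laminate are determined
from the following equations from Table 11.1.1

$$
\begin{aligned}
A_{11} &= h(U_1 + U_2V_1^* + U_3V_3^*) , \\
A_{22} &= h(U_1 - U_2V_1^* + U_3V_3^*) , \\
A_{12} &= h(U_4 - U_3V_3^*) , \\
A_{66} &= h(U_5 - U_3V_3^*) ,
\end{aligned}
\tag{11.2.20}
$$

and where the $U_i$ are the orientation-invariant material properties, Eq. (11.1.7).
If the laminate consists of two or more fiber orientations, then it is shown by Miki
[12] that Eq. (11.2.18) becomes an inequality

$$V_3^* \ge 2V_1^{*2} - 1 . \tag{11.2.21}$$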
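A short numerical check makes the geometry of the diagram concrete. The sketch below evaluates $V_1^*$ and $V_3^*$ from Eq. (11.2.15) and measures how far a lamination point lies above the parabola $V_3^* = 2V_1^{*2} - 1$; the margin is zero for a single orientation and, since each $\cos 4\theta_k = 2\cos^2 2\theta_k - 1$ and the volume fractions are convex weights, non-negative for any mixture. The function names and example laminates are illustrative assumptions.

```python
import math


def lamination_parameters(angles_deg, fractions):
    """V1*, V3* of Eq. (11.2.15) for +/-theta_k sub-laminates with volume fractions v_k."""
    assert abs(sum(fractions) - 1.0) < 1e-12, "volume fractions must sum to 1"
    V1 = sum(v * math.cos(2 * math.radians(a)) for a, v in zip(angles_deg, fractions))
    V3 = sum(v * math.cos(4 * math.radians(a)) for a, v in zip(angles_deg, fractions))
    return V1, V3


def margin(V1, V3):
    """Height of the point above the parabola V3* = 2 V1*^2 - 1 (zero on curve ABC)."""
    return V3 - (2 * V1 ** 2 - 1)


print(margin(*lamination_parameters([30], [1.0])))              # single angle
print(margin(*lamination_parameters([0, 90], [0.5, 0.5])))      # two angles
print(margin(*lamination_parameters([15, 60], [0.25, 0.75])))   # two angles
```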

The allowable region of the lamination parameters is the area bounded by the curve
ABC in Fig. 11.2.1, independent of the number of different ply orientations. Any
point inside the lamination parameter diagram, therefore, corresponds to laminates
with two or more fiber orientations. Because a point is defined by two parameters,
this means that only two orientation angles $\theta_1$ and $\theta_2$ are sufficient for designing
laminates for prescribed stiffness requirements. For balanced angle-ply laminates with
more than two orientations, there will be many combinations of the ply orientations
that will produce the same lamination parameters and, therefore, the same stiffness
properties. Each point inside the design space is called a lamination point, and
corresponds to a laminate with specific stiffness properties. It is also possible to
restrict permissible values of the various effective engineering stiffnesses ($E_x$, $E_y$, $G_{xy}$,
and $\nu_{xy}$) graphically. This is achieved by introducing contours of constant effective
engineering stiffnesses, obtained from Eqs. (11.2.19) and (11.2.20), for each of the
engineering constants

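$$
E_x \text{ contours:} \quad
V_3^* = \frac{U_2^2V_1^{*2} - U_2E_xV_1^* + E_xU_1 - U_1^2 + U_4^2}{U_3(2U_1 + 2U_4 - E_x)} ,
\tag{11.2.22}
$$

$$
E_y \text{ contours:} \quad
V_3^* = \frac{U_2^2V_1^{*2} + U_2E_yV_1^* + E_yU_1 - U_1^2 + U_4^2}{U_3(2U_1 + 2U_4 - E_y)} ,
\tag{11.2.23}
$$

$$
\nu_{xy} \text{ contours:} \quad
V_3^* = \frac{\nu_{xy}U_2V_1^* - \nu_{xy}U_1 + U_4}{(1 + \nu_{xy})U_3} ,
\tag{11.2.24}
$$

$$
G_{xy} \text{ contours:} \quad
V_3^* = \frac{U_5 - G_{xy}}{U_3} .
\tag{11.2.25}
$$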
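The contour expressions can be checked against the effective constants themselves. The sketch below assumes the usual balanced-symmetric definitions $E_x = (A_{11}A_{22} - A_{12}^2)/(hA_{22})$, $E_y = (A_{11}A_{22} - A_{12}^2)/(hA_{11})$, $G_{xy} = A_{66}/h$, $\nu_{xy} = A_{12}/A_{22}$ for Eq. (11.2.19), with $A_{ij}/h$ taken from Table 11.1.1. It evaluates the constants at a lamination point and confirms that the $E_y$, $\nu_{xy}$, and $G_{xy}$ contours through that point return the same $V_3^*$. The invariants are assumed values typical of a graphite/epoxy ply and the function names are illustrative.

```python
# Assumed Tsai-Pagano invariants (GPa), typical of a graphite/epoxy ply.
U1, U2, U3, U4, U5 = 76.37, 85.73, 19.71, 22.61, 26.88


def effective_constants(V1, V3):
    """Ex, Ey, Gxy, nu_xy at a lamination point, with A_ij/h from Table 11.1.1."""
    a11 = U1 + U2 * V1 + U3 * V3
    a22 = U1 - U2 * V1 + U3 * V3
    a12 = U4 - U3 * V3
    a66 = U5 - U3 * V3
    Ex = (a11 * a22 - a12 ** 2) / a22
    Ey = (a11 * a22 - a12 ** 2) / a11
    return Ex, Ey, a66, a12 / a22


def v3_on_ey_contour(Ey, V1):
    """V3* on the contour of constant Ey, Eq. (11.2.23)."""
    num = U2 ** 2 * V1 ** 2 + U2 * Ey * V1 + Ey * U1 - U1 ** 2 + U4 ** 2
    return num / (U3 * (2 * U1 + 2 * U4 - Ey))


def v3_on_nu_contour(nu, V1):
    """V3* on the contour of constant nu_xy, Eq. (11.2.24)."""
    return (nu * U2 * V1 - nu * U1 + U4) / ((1 + nu) * U3)


def v3_on_gxy_contour(Gxy):
    """V3* on the contour of constant Gxy, Eq. (11.2.25)."""
    return (U5 - Gxy) / U3


Ex, Ey, Gxy, nu = effective_constants(0.3, 0.1)
print(Ex, Ey, Gxy, nu)
print(v3_on_ey_contour(Ey, 0.3), v3_on_nu_contour(nu, 0.3), v3_on_gxy_contour(Gxy))
```

Superimposing such contours for prescribed limits on the engineering constants is what outlines the feasible part of the diagram.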

Contours of constant laminate effective engineering properties are shown in Fig.
11.2.2. The figure indicates that, if no other constraints are specified, the maximum
values of $E_x$, $E_y$, and $G_{xy}$ are all achieved for lamination points located on
the boundary of the design space which require only one lamination angle. As
expected, the maximum $E_x$ and $E_y$ are obtained for 0-deg and 90-deg laminates, and
the maximum shear stiffness for the $[\pm 45]_s$ laminate. However, determination of the value

Figure 11.2.2 Contours of constant effective engineering elastic properties.

of lamination angle $[\pm\theta]_s$ that maximizes the effective Poisson's ratio is not
straightforward and is a function of the lamina properties via Eqs. (11.1.7) and (11.1.4). For
example, for T300/5208 graphite/epoxy and Scotchply 1002 glass/epoxy materials
the laminates that produce the maximum Poisson's ratio are $[\pm 25]_s$ and $[\pm 31]_s$,
respectively.

For design problems where one or more of the effective engineering constants are
constrained, appropriate contours can be superimposed to identify the feasible design
space and the lamination point that maximizes (or minimizes) the desired stiffness
property (see Exercise 2).


11.2.2 Design of Laminates for Flexural Response

Ply Thickness Variables: For rectangular laminated plates under in-plane compres-
sive loads, the strength constraint becomes unimportant if the size of the plate is
large compared to the thickness. For such plates, elastic stability and vibration,
which are governed by the flexural rigidities of the plate must be considered. One of
the earliest studies that included the elastic stability constraint during the optimal
design of composite plates is by Schmit and Farshi [14].

Figure 11.2.3 Laminated panel under in-plane loads.

For a symmetrically laminated, balanced orthotropic laminate with only thickness


design variables, the elastic stability constraint is imposed in the form
(11.2.26)
where t is a vector of design variables which are the thicknesses of individual layers in
a laminate, and ,\ is the buckling load factor. For a balanced, symmetric laminate of
dimensions a by b with simply supported boundaries under-in plane loads an assumed
displacement function of the form
N M
( ) = "L....JL....J
W X,Y
" w:mnsm-a-sm-b-'
. m7rX • rl7rY (11.2.27)
n=l m=l

gives M x N of these constraints representing the ].,[ x N possible modes of buckling


associated with the transverse displacement patterns. This form of the displacements
satisfies the boundary conditions exactly. A truncated series can be used for an ap-
proximate solution of the differential equation governing the buckling of a rectangular
orthotropic plate
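$$ D_{11}\frac{\partial^4 w}{\partial x^4} + 2(D_{12} + 2D_{66})\frac{\partial^4 w}{\partial x^2\,\partial y^2} + D_{22}\frac{\partial^4 w}{\partial y^4} = \lambda\left(N_x\frac{\partial^2 w}{\partial x^2} + N_y\frac{\partial^2 w}{\partial y^2} + 2N_{xy}\frac{\partial^2 w}{\partial x\,\partial y}\right), \tag{11.2.28} $$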

Section 11.2: Laminate Design
where $N_x$, $N_y$, and $N_{xy}$ are equal to the applied design loads. Substituting Eq.
(11.2.27) into the equilibrium equation and applying Galerkin's method leads to an
eigenvalue problem of the form
$$ \mathbf{K}\mathbf{w} = \lambda\,\mathbf{K}_G\,\mathbf{w}, \tag{11.2.29} $$
where the eigenvector is composed of the unknown coefficients of the displacement
function, $\mathbf{w} = \{w_{11} \ldots w_{1N}\; w_{21} \ldots w_{2N}\; \ldots\; w_{MN}\}^T$. The matrices $\mathbf{K}$ and $\mathbf{K}_G$
are given as
$(m, p = 1, \ldots, M;\; n, q = 1, \ldots, N)$, (11.2.30)
where

(11.2.31)

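$(m, p = 1, \ldots, M;\; n, q = 1, \ldots, N)$, (11.2.32)
where
$$ \Delta_{mnpq} = \begin{cases} 0 & \text{if } p = m \text{ or } q = n, \\ \dfrac{mnpq\,\delta_{pm}\,\delta_{qn}}{(p^2 - m^2)(q^2 - n^2)} & \text{otherwise,} \end{cases} $$
$$ \delta_{pm} = \begin{cases} 1 & \text{if } (p + m) \text{ odd} \\ 0 & \text{if } (p + m) \text{ even} \end{cases} \quad\text{and}\quad \delta_{qn} = \begin{cases} 1 & \text{if } (q + n) \text{ odd} \\ 0 & \text{if } (q + n) \text{ even.} \end{cases} $$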
The indices $p$ and $q$ are used as counters for the equations, and $m$ and $n$ are the
indices for the coefficients $w_{mn}$ in each one of the equations. Therefore, no
summation is implied over the indices $m$, $n$, $p$, $q$ in the calculation of elements of the
two matrices.
For a simply supported plate under biaxial compression loads only ($N_{xy} = 0$),
the plate buckles when the load amplitude parameter $\lambda$ reaches a critical value $\lambda_{cr}$
given as

(11.2.33)

where $m$ and $n$ are the numbers of half waves in the $x$ and $y$ directions, respectively,
that minimize $\lambda_{cr}$.
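The search over half-wave numbers is easy to automate. The following is a minimal sketch, assuming the standard closed-form load factor for a simply supported orthotropic plate with compressive loads entered as positive numbers, $\lambda(m,n) = \pi^2[D_{11}(m/a)^4 + 2(D_{12}+2D_{66})(m/a)^2(n/b)^2 + D_{22}(n/b)^4]/[N_x(m/a)^2 + N_y(n/b)^2]$. The function name, the search limits `m_max` and `n_max`, and the stiffness values in the usage lines are illustrative assumptions.

```python
import math

def critical_load_factor(D11, D12, D22, D66, a, b, Nx, Ny, m_max=10, n_max=10):
    """Return (lambda_cr, m, n) with the smallest buckling load factor.

    Assumes a simply supported orthotropic plate with Nxy = 0 and
    compressive loads Nx, Ny given as positive numbers (sketch only).
    """
    best = None
    for m in range(1, m_max + 1):
        for n in range(1, n_max + 1):
            load = Nx * (m / a) ** 2 + Ny * (n / b) ** 2
            if load <= 0.0:
                continue  # this mode is not driven by compression
            stiff = (D11 * (m / a) ** 4
                     + 2.0 * (D12 + 2.0 * D66) * (m / a) ** 2 * (n / b) ** 2
                     + D22 * (n / b) ** 4)
            lam = math.pi ** 2 * stiff / load
            if best is None or lam < best[0]:
                best = (lam, m, n)
    return best

# Illustrative, isotropic-like stiffnesses: D11 = D22 = D12 + 2*D66 = 1.
lam, m, n = critical_load_factor(1.0, 0.3, 1.0, 0.35, a=1.0, b=1.0, Nx=1.0, Ny=0.0)
print(lam, m, n)
```

For the isotropic-like square plate in the usage lines, the critical mode is the single half wave in each direction, and doubling the plate length shifts the critical mode to two half waves along the load direction.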
As they did for the strength constraint Eq. (11.2.2), Schmit and Farshi used a
linear approximation for the buckling constraints in the form

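$$ g_b(\mathbf{t}) = 1 - \lambda_b(\mathbf{t}_0) - \sum_{i=1}^{I} (t_i - t_{0i}) \left.\frac{\partial \lambda_b}{\partial t_i}\right|_{\mathbf{t}=\mathbf{t}_0}. \tag{11.2.34} $$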


Noting from Eq. (11.2.32) that the matrix $\mathbf{K}_G$ is independent of the design variables,
and using Eq. (7.3.5), we can show that the derivatives of the $k$th buckling load factor
are given by

(11.2.35)

Since the matrix K is a function of the flexural stiffnesses, an explicit expression for
the derivatives of K with respect to the design variables can be written as

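$$ \frac{\partial K_{pq}}{\partial t_i} = \frac{ab}{4}\left[\delta_{mp}\,\delta_{nq}\,\frac{\partial \vartheta_{mn}}{\partial t_i}\right], \tag{11.2.36} $$

where

$$ \frac{\partial \vartheta_{mn}}{\partial t_i} = \pi^4\left[\frac{\partial D_{11}}{\partial t_i}\left(\frac{m}{a}\right)^4 + 2\left(\frac{\partial D_{12}}{\partial t_i} + 2\frac{\partial D_{66}}{\partial t_i}\right)\left(\frac{m}{a}\right)^2\left(\frac{n}{b}\right)^2 + \frac{\partial D_{22}}{\partial t_i}\left(\frac{n}{b}\right)^4\right]. \tag{11.2.37} $$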
The partial derivatives of the flexural stiffnesses can be related to the partial derivatives
of the in-plane stiffness matrix $\mathbf{A}$. For a quasi-homogeneous laminate in which
the bending-twisting coupling terms, $D_{16}$ and $D_{26}$, are ignored (these terms vanish
as the number of ply groups increases), the in-plane and flexural moduli are related
by (see page 204 of Ref. 9)
$$ D_{ij} = \frac{h^2}{12} A_{ij}, \tag{11.2.38} $$
where h is the laminate thickness. The partial derivatives of the flexural stiffnesses
are, therefore, given by

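$$ \frac{\partial D_{rs}}{\partial t_i} = \frac{1}{12}\left[\left(\frac{\partial A_{rs}}{\partial t_i}\right)h^2 + 2A_{rs}h\right], \qquad r, s = 1, 2, 6, \tag{11.2.39} $$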

where the derivative of the A matrix is given by Eq. (11.2.10).


Graphical Buckling Optimization: Just as with the in-plane lamination diagram
discussed earlier, a diagram can be constructed, as shown by Miki [15], for designing
laminates for buckling response. We define flexural lamination parameters as
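$$ W_1^* = \frac{12\,V_{1D}}{h^3} = \sum_{k=1}^{I} s_k \cos 2\theta_k \quad\text{and}\quad W_3^* = \frac{12\,V_{3D}}{h^3} = \sum_{k=1}^{I} s_k \cos 4\theta_k, \tag{11.2.40} $$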

where I = N /2, and


(11.2.41 )

Miki shows [15] that a relation of the same form as Eq. (11.2.21) is obtained

(11.2.42)


Therefore, any balanced symmetric angle-ply laminate with multiple orientations can
be represented as a point in a region bounded by
$$ W_3^* = 2W_1^{*2} - 1, \tag{11.2.43} $$
where the designs on the boundary correspond to designs with only one lamination
angle, $[(\pm\theta)_I]_s$, and
$$ W_1^* = \cos 2\theta, \quad\text{and}\quad W_3^* = \cos 4\theta. \tag{11.2.44} $$
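As an illustration, the flexural lamination parameters of a given stack can be evaluated numerically and checked against the bounding parabola. This is a sketch under two assumptions that are not taken from the text: the weights are $s_k = (k/I)^3 - ((k-1)/I)^3$ for equal-thickness ply groups counted from the mid-plane outward, and the region is closed from above by $W_3^* \le 1$, which follows from $|\cos 4\theta| \le 1$. The function names are hypothetical.

```python
import math

def flexural_lamination_parameters(half_stack_deg):
    """Return (W1, W3) for a balanced symmetric angle-ply laminate.

    half_stack_deg lists the +/- theta ply-group angles of one half of the
    laminate, from the mid-plane outward. The weights
    s_k = (k/I)^3 - ((k-1)/I)^3 are an assumption of this sketch.
    """
    I = len(half_stack_deg)
    W1 = W3 = 0.0
    for k, theta in enumerate(half_stack_deg, start=1):
        s_k = (k / I) ** 3 - ((k - 1) / I) ** 3
        t = math.radians(theta)
        W1 += s_k * math.cos(2.0 * t)
        W3 += s_k * math.cos(4.0 * t)
    return W1, W3

def is_feasible(W1, W3, tol=1e-9):
    """True if the point lies between the parabola W3 = 2*W1**2 - 1 and W3 = 1."""
    return 2.0 * W1 ** 2 - 1.0 - tol <= W3 <= 1.0 + tol

print(flexural_lamination_parameters([30, 30, 30]))  # single angle: on the parabola
print(flexural_lamination_parameters([0, 45]))       # two angles: inside the region
```

A single-angle laminate lands on the boundary, while a laminate with two different angles lands strictly inside it, in line with the statement above.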

The diagram for the flexural lamination parameters can be used for designing
laminates for maximum buckling load under uniaxial and biaxial loads. For pre-
scribed values of the m and n, and a fixed ratio of applied transverse load to axial
load it can be shown, by manipulating Eq. (11.2.33), that the contours of the critical
load parameter $\lambda_{cr}$ are straight lines in the flexural lamination diagram. However,
a difficulty in using flexural lamination parameters in designing laminates with maximum
buckling load is that $m$ and $n$ are seldom known a priori. Since these two
numbers depend on the design variables, as well as the plate aspect ratio and the ap-
plied loads, it is not always possible to predict them accurately. For further discussion
of the use of the flexural lamination parameter diagram for buckling maximization
see Ref. [15]. Also, the following analytical discussion of the use of ply orientation
variables for buckling problem explains the role of m and n.
Ply Orientation Variables: A number of researchers carried out analytical investigations
of the optimization of various flexural response quantities such as vibration
frequency [16-18], structural compliance [19], and buckling response [20] of simply
supported laminated plates. For a plate with length $a$ and width $b$, Pedersen [20]
defined a parameter $\phi$ which is proportional to the square of the natural frequency and
to the buckling load, and inversely proportional to the out-of-plane displacements. The
quantity $\phi$, composed of a linear combination of the non-dimensional bending stiffnesses
$d_{ij}$ ($i, j = 1, 2, 6$), is defined as
$$ \phi = d_{11} + 2\eta^2 (d_{12} + 2d_{66}) + \eta^4 d_{22}, \tag{11.2.45} $$
where $\eta$ is a mode parameter defined as the ratio of the longitudinal and transverse
half-wave lengths by
$$ \eta = \frac{na}{mb}, \tag{11.2.46} $$
with $m$ and $n$ being the modal half-wave numbers in the $x$ and $y$ directions, respectively
(see Eq. 11.2.27). The non-dimensional bending stiffnesses, $d_{ij}$, are defined in
terms of the flexural stiffnesses as

(11.2.47)

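For a laminate with fixed ply thicknesses, the maximization of the buckling load or
the natural frequency, or minimization of the displacements, is achieved by obtaining
the stationary value of $\phi$ with respect to the ply orientations. That is,
$$ \frac{\partial\phi}{\partial\theta} = \frac{\partial d_{11}}{\partial\theta} + 2\eta^2\left(\frac{\partial d_{12}}{\partial\theta} + 2\frac{\partial d_{66}}{\partial\theta}\right) + \eta^4\frac{\partial d_{22}}{\partial\theta} = 0. \tag{11.2.48} $$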
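Restricting the laminate to be a balanced, symmetric angle-ply laminate and
ignoring the bending-twisting coupling terms, we can put the bending stiffness matrix
from Table 11.1.1 into a summation form
$$ \begin{Bmatrix} D_{11} \\ D_{22} \\ D_{12} \\ D_{66} \end{Bmatrix} = \frac{h^3}{12} \begin{bmatrix} U_1 & W_1^* & W_3^* \\ U_1 & -W_1^* & W_3^* \\ U_4 & 0 & -W_3^* \\ U_5 & 0 & -W_3^* \end{bmatrix} \begin{Bmatrix} 1 \\ U_2 \\ U_3 \end{Bmatrix}, \tag{11.2.49} $$
where $W_1^*$ and $W_3^*$ are defined by Eq. (11.2.40). Using Eqs. (11.2.40), (11.2.48), and
(11.2.49) we have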

(11.2.50)

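The stationary values of $\phi$ correspond to

$$ \theta_k = 0, \quad\text{or}\quad |\theta_k| = 90, \tag{11.2.51} $$

$$ \text{or}\quad |\theta_k| = \frac{1}{2}\cos^{-1}\left(\frac{U_2(\eta^4 - 1)}{4U_3(1 - 6\eta^2 + \eta^4)}\right). \tag{11.2.52} $$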

The existence of multiple values of the fiber orientation that yield stationary values
for the quantity $\phi$ indicates local optima. The first two roots are independent of the
material properties and the geometry. The solution in Eq. (11.2.52), on the other
hand, contains the material properties and the mode parameter $\eta$, and is valid in a
range (see Muc [21]) $\eta^2_{\min} < \eta^2 < \eta^2_{\max}$ where

$$ \eta^2_{\max} = \frac{6 \pm \sqrt{36 + 4[(U_2/4U_3)^2 - 1]}}{2[(U_2/4U_3) + 1]}, \tag{11.2.53} $$
when $\theta$ reaches 0 and 90-deg, respectively.
The optimal values of the fiber angles for two different values of the ratio $U_2/4U_3$,
obtained from Eq. (11.2.52), are presented in Figure 11.2.4. The range of $U_2/4U_3$
values used in the figure covers many commercially available composites,
including Graphite-Epoxy, Boron-Epoxy, Glass-Epoxy, and Aramid-Epoxy. Clearly
the optimal fiber orientation is insensitive to the material properties, but strongly
influenced by the mode shape parameter. For small or large values of the mode
parameter $\eta$ the optimal orientation is either $\theta_k$ = 0-deg or $\theta_k$ = 90-deg, and the
optimal orientation is independent of the position of the layer in the laminate.
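The dependence of the optimal angle on the mode parameter can be sketched numerically from Eq. (11.2.52). The code below is an illustration, not the authors' procedure. It assumes that the interior root is the maximizer between the two limits of validity, that the upper limit is the root of Eq. (11.2.52) with $\cos 2\theta = -1$, and that the lower limit is its reciprocal (which follows from setting $\cos 2\theta = 1$ in the same equation). The function name and arguments are hypothetical.

```python
import math

def optimal_angle_deg(eta, u_ratio):
    """Fiber angle (degrees) maximizing phi for a given mode parameter.

    eta     : mode parameter n*a/(m*b)
    u_ratio : material ratio U2/(4*U3), assumed positive
    """
    x = eta ** 2
    root = math.sqrt(36.0 + 4.0 * (u_ratio ** 2 - 1.0))
    x_max = (6.0 + root) / (2.0 * (u_ratio + 1.0))  # eta^2 where theta reaches 90 deg
    x_min = 1.0 / x_max                             # eta^2 where theta reaches 0 deg
    if x <= x_min:
        return 0.0
    if x >= x_max:
        return 90.0
    c = u_ratio * (x ** 2 - 1.0) / (1.0 - 6.0 * x + x ** 2)
    c = max(-1.0, min(1.0, c))  # guard against round-off near the range ends
    return 0.5 * math.degrees(math.acos(c))

for eta in (0.3, 0.8, 1.0, 1.4, 2.0):
    print(eta, optimal_angle_deg(eta, 1.16))
```

With either material ratio, a mode parameter of one (equal half-wave lengths in both directions) gives a 45-deg optimum, and the angle moves toward 0-deg or 90-deg as the mode parameter decreases or increases.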
The influence of the mode parameter $\eta$ on the value of the optimal fiber orientation
needs to be investigated further. The minimum value of $\phi$, which corresponds to the
buckling mode shape with the lowest buckling load, is obtained for transverse wavelength
parameter $n = 1$, but it is not always clear what value of the longitudinal wavelength
parameter $m$ leads to the lowest value of the parameter $\phi$. For plate aspect ratios
$r = a/b$ less than a critical value $(r_{cr})_1$ the wave number $m = 1$ gives the lowest value.
For $r > (r_{cr})_1$ the wave number is determined such that it minimizes $\phi$. The points


[Plot: optimal fiber angle $\theta$ (deg) versus mode parameter $\eta = na/mb$, with curves for $U_2/4U_3 = 0.96$ and $U_2/4U_3 = 1.16$]

Figure 11.2.4 Optimal ply orientation as a function of the mode parameter $\eta$.
of intersection of the curves of $\phi$ for mode parameters $m$ and $m + 1$ give the
critical values of the plate aspect ratio [15]

(11.2.54)

where $\bar{m} = m + 1$ is the wave number of the adjacent mode shape. However, it is
demonstrated by Miki [15] that, in the range $(r_{cr})_m < r < (r_{cr})_{\bar{m}}$, laminates designed
by assuming the mode shape to be $m$ lead to a laminate which has a lower buckling
load corresponding to mode $\bar{m}$. Similarly, laminates designed by assuming the mode
shape to be $\bar{m}$ give a laminate which has a lower buckling load corresponding to mode
$m$. This indicates that at the optimum both buckling loads are the same. In the
range of $r$ where two successive modes are simultaneously active, the optimum value
of the fiber orientation is determined from $\phi(m) = \phi(\bar{m})$ and is given by

$$ \cos 2\theta = \frac{U_2(r^4 + m^2\bar{m}^2) \pm \sqrt{U_2^2(r^4 + m^2\bar{m}^2)^2 - 8U_3(U_1 - U_3)(r^4 - m^2\bar{m}^2)^2}}{4U_3(r^4 - m^2\bar{m}^2)}. \tag{11.2.55} $$
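Equation (11.2.55) can be evaluated directly. The sketch below assumes a single-angle laminate and keeps only the roots with $|\cos 2\theta| \le 1$; it does not check whether the aspect ratio actually lies in the range where the two modes are simultaneously active. The function name and the material invariants used in the usage lines (in GPa, roughly typical of a graphite/epoxy system) are illustrative assumptions.

```python
import math

def simultaneous_mode_angle_deg(r, m, U1, U2, U3):
    """Angles (degrees) at which modes m and m + 1 give equal buckling loads.

    Evaluates Eq. (11.2.55) for aspect ratio r = a/b and returns every
    root that corresponds to a real fiber angle.
    """
    p = (m * (m + 1)) ** 2            # m^2 * mbar^2 with mbar = m + 1
    r4 = r ** 4
    if math.isclose(r4, p):
        return [45.0]                 # limit of the formula as r^4 -> m^2 * mbar^2
    disc = (U2 * (r4 + p)) ** 2 - 8.0 * U3 * (U1 - U3) * (r4 - p) ** 2
    if disc < 0.0:
        return []
    angles = []
    for sign in (1.0, -1.0):
        c = (U2 * (r4 + p) + sign * math.sqrt(disc)) / (4.0 * U3 * (r4 - p))
        if -1.0 <= c <= 1.0:
            angles.append(0.5 * math.degrees(math.acos(c)))
    return angles

# Assumed, illustrative invariants (GPa).
print(simultaneous_mode_angle_deg(1.5, 1, 76.37, 85.73, 19.71))
```

For these assumed values the single admissible root for $r = 1.5$ and $m = 1$ lies a few degrees below 45-deg, which is in line with the oscillation about 45-deg described for aspect ratios greater than unity.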
The optimal orientation of the fibers, including the interaction of the adjacent
modes, for a T300/5208 Graphite/Epoxy laminate as a function of the plate aspect

[Plot: optimal fiber angle $\theta$ (deg) versus plate aspect ratio $r = a/b$ from 0 to 5, with regions labeled m = 1; m = 1, 2; m = 2; m = 2, 3; and m ≥ 3]

Figure 11.2.5 Optimal ply orientation as a function of plate aspect ratio.

ratio is shown in Figure 11.2.5. For aspect ratios greater than unity, the optimal
angle oscillates around 45-deg. The amplitude of the oscillations decreases as the
aspect ratio $r$ is increased. Therefore, for all practical purposes, for aspect ratios
$r > 4$ the optimal angle can be assumed to be $\theta_{opt}$ = 45-deg.
If the laminate is loaded under biaxial compression [20], for small aspect ratios,
$r < 1.5$, the optimal fiber angle is similar to the case of uniaxial compression. For
aspect ratios larger than 1.5, the value of the optimal angle increases rapidly as the
ratio of the transverse load to the axial load ($N_y/N_x$) increases. For $N_y \geq 4N_x$, the
optimal fiber orientation is 90-deg.
Importance of Laminate Stacking Sequence: When ply thickness design variables
are used, the stacking sequence is selected ahead of time. As for in-plane loads, the
optimum design can be influenced by the choice of whether or not to include
a particular ply orientation. However, for flexural response, the stacking sequence
is more important because it strongly affects the D matrix while it has no effect on
the A matrix. Fortunately, as shown below, the optimum design is insensitive to the
choice of stacking sequence.
If the relative positions of the boundaries between the plies are $\xi_k = z_k/h$ for a
laminate with $N$ plies, then

(k=1, ... ,N-1); (11.2.56)

the derivative of $\phi$ with respect to the ply boundary variable is

$$ \frac{\partial\phi}{\partial\xi_k} = \frac{\partial d_{11}}{\partial\xi_k} + 2\eta^2\left(\frac{\partial d_{12}}{\partial\xi_k} + 2\frac{\partial d_{66}}{\partial\xi_k}\right) + \eta^4\frac{\partial d_{22}}{\partial\xi_k} = 0. \tag{11.2.57} $$

Since the contribution of the individual layers to the overall D matrix depends only
on the distance of the layers to the laminate mid-plane, the derivative of the D matrix
is expressed as
$$ \frac{\partial D_{ij}}{\partial\xi_k} = 2\xi_k^2\left(D_{ijk} - D_{ij\,k+1}\right). \tag{11.2.58} $$

Here $D_{ijk}$ depends only on the properties and orientation of the $k$-th layer and (assuming
the adjacent layers to be made of the same material so that the constant $U$
terms are omitted) is defined by
$$ \mathbf{D}_k = h^3 \begin{bmatrix} U_2\cos 2\theta_k + U_3\cos 4\theta_k & -U_3\cos 4\theta_k & 0 \\ -U_3\cos 4\theta_k & -U_2\cos 2\theta_k + U_3\cos 4\theta_k & 0 \\ 0 & 0 & -U_3\cos 4\theta_k \end{bmatrix}. \tag{11.2.59} $$
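Then, as shown by Cheng Kengtung [19], the derivative of the function $\phi$ can be
expressed as

$$ \frac{\partial\phi}{\partial\xi_k} = 2\xi_k^2\left[-U_2(1 - \eta^4) - 2U_3(1 - 6\eta^2 + \eta^4)(\cos 2\theta_k + \cos 2\theta_{k+1})\right](\cos 2\theta_k - \cos 2\theta_{k+1}). \tag{11.2.60} $$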
Since the sign of the derivative of $\phi$ is independent of the position of the boundary,
we choose either the minimum or the maximum thickness for the $k$-th ply depending
on the sign of the derivative. For example, if $\partial\phi/\partial\xi_k > 0$ we will use $\xi_k = \xi_{k\,\max}$ in
order to maximize the buckling load. Furthermore, some specific combinations of the
neighboring ply orientations lead to stationary values for $\phi$, see Eq. (11.2.60),
indicating possible local minima. These roots are
and (11.2.61 )

(11.2.62)

If the total thickness of the two plies is kept constant, the derivative is zero for
these angles whatever the location of the boundary between the plies. Therefore,
the buckling load is independent of the thickness distribution of the adjacent plies.
Moreover, for a square laminate, $\eta = 1$, $\phi$ is constant for

(11.2.63)

whatever the material properties.


Shin et al. showed [22] that for a symmetric laminate with fixed total thickness,
the order of ply orientations can also be permuted in any desired way without
changing the D matrix (see Exercise 3). The individual ply thicknesses do change,

of course. In practice, the requirement of an integer number of plies forces changes
in the D matrix, but if the total thickness is large compared to the single-ply thickness
this effect will be small. This is demonstrated in Table 11.2.2, taken from [22], which
presents six permutations of a plate made of 0, 90 and 45 plies. The total thickness
for all six permutations is the same (normalized to one) and all have the same D
matrix and the same buckling load. If the total number of plies is 50, the buckling
loads of the six laminates vary by less than one percent (see Ref. 22).

Table 11.2.2: Optimum Designs with Equivalent D Matrix

Stacking Sequence    Ply thicknesses (rounded values† in parentheses)
[0/90/45]s           0.0366 (0.04)    0.1539 (0.16)    0.8095 (0.80)
[0/45/90]s           0.0366 (0.04)    0.2496 (0.24)    0.7139 (0.72)
[45/0/90]s           0.2228 (0.20)    0.0634 (0.08)    0.7139 (0.72)
[45/90/0]s           0.2228 (0.20)    0.3044 (0.32)    0.4729 (0.48)
[90/45/0]s           0.1399 (0.12)    0.3872 (0.40)    0.4729 (0.48)
[90/0/45]s           0.1399 (0.12)    0.0506 (0.04)    0.8095 (0.84)
† Ply thicknesses are rounded such that each laminate has a total of 50 plies.
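The equivalence of the six designs can be checked numerically. The following sketch assumes that the three thicknesses in each row are listed in the same order as the angles in the stacking sequence, outermost group first, and it compares the share of the $z^2$-weighted thickness integral taken by each orientation. Equal shares for every orientation imply equal D matrices when all plies are of the same material. The helper name `bending_weights` is hypothetical.

```python
# Unrounded ply-group thicknesses from Table 11.2.2, outermost group first
# (the ordering is an assumption of this sketch).
designs = {
    (0, 90, 45): (0.0366, 0.1539, 0.8095),
    (0, 45, 90): (0.0366, 0.2496, 0.7139),
    (45, 0, 90): (0.2228, 0.0634, 0.7139),
    (45, 90, 0): (0.2228, 0.3044, 0.4729),
    (90, 45, 0): (0.1399, 0.3872, 0.4729),
    (90, 0, 45): (0.1399, 0.0506, 0.8095),
}

def bending_weights(angles, thicknesses):
    """Share of the z^2-weighted thickness integral held by each angle."""
    total = sum(thicknesses)
    outer = 1.0                      # normalized distance from the mid-plane
    weights = {}
    for angle, t in zip(angles, thicknesses):
        inner = outer - t / total
        weights[angle] = outer ** 3 - inner ** 3
        outer = inner
    return weights

all_weights = {angles: bending_weights(angles, t) for angles, t in designs.items()}
for angles, w in all_weights.items():
    print(angles, {a: round(w[a], 4) for a in (0, 45, 90)})
```

The printed shares agree across the six rows to within the four-digit precision of the tabulated thicknesses, which supports the assumed ordering.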

The insensitivity of the design to the choice of stacking sequence disappears when
strength is also a consideration. In such cases the choice of stacking sequence is
critical, and this topic is discussed in the next section.

11.3 Stacking Sequence Design

The methods presented in the previous section yield results that are valuable for understanding
the basic trends in laminate design. However, one of the major difficulties
of a realistic design situation is the need for a practical laminate which is generally
made up of plies with only 0-deg, 90-deg and ±45-deg orientations (or occasionally
orientations with 15-deg increments between 0- and 90-deg), and thicknesses which
are integer multiples of the ply thickness. Of course, deciding the number of plies of a
specified orientation is not sufficient to define a laminate; the through-the-thickness
location of each ply must be decided as well. This means that the basic design problem
is to determine the stacking sequence of the composite laminate, a problem which
calls for discrete programming techniques. In the following, we introduce various
approaches that address this problem.

11.3.1 Graphical Stacking Sequence Design

The lamination parameter diagrams introduced in section 11.2 can be used for designing
laminates with predetermined ply orientation angles. It is shown by Miki
and Sugiyama [23] that the feasible region for laminates with fixed ply angles is a
polygon with vertices located on the envelope of the lamination parameter diagram.
If the design point is on the periphery of the diagram, the laminate is an angle ply

Section 11.3: Stacking Sequence Design

laminate with one fiber orientation. Therefore, given a set of permissible integer ply
orientations, vertices of the polygons are placed at those locations that correspond
to the selected angles. For example, the design spaces for laminates made up of plies
with 0-deg, ±45-deg, and 90-deg orientations and 0-deg, ±30-deg, ±60-deg, and 90-deg
orientations are shown in Fig. 11.3.1-a and 11.3.1-b, respectively. For laminates
with 0, ±45, and 90-deg plies, the design space is a triangle with vertices at A, B,
and C as shown in the figure. For ply orientations of 0-deg, ±30-deg, ±60-deg, and
90-deg, the design space is a trapezoid.

[Diagram panels: a) 0-, ±45-, and 90-deg plies; b) 0-, ±30-, ±60-, and 90-deg plies]

Figure 11.3.1 In-plane lamination diagram for laminates with integer ply orientations.

Points along the edges and interior points of the polygons correspond to laminates
with combinations of two or more ply orientations, and their number is determined
by the total number of layers in the laminate. If the total number of layers is $N$
and $I = N/2$, then in addition to the vertices, we obtain $I - 1$ equally spaced design
points along the edges and along the internal lines that join two vertices. From the
nodes we obtained along the edges, we also draw lines parallel to the lines that join
vertices. If such a line terminates at another discrete design point at the opposite
end of the polygon, then it is easy to label the design that would be in the interior
by looking at the designs at the two end points. For example, for an eight-ply (total)
laminate with 0-deg, ±45-deg, and 90-deg angles (triangular design space), there are
five equally spaced design points with fiber orientations varying incrementally from
one vertex to another as shown in Fig. 11.3.1-a. Note that the design points inside
the triangular region also follow an incremental pattern, but are combinations of the
three available angles. Design points for a laminate with a total of six layers are shown
in Fig. 11.3.1-b. Labeling of those designs is left to the reader (see Exercise 4).
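The discrete design points can be enumerated with a few lines of code. This sketch assumes the in-plane lamination parameters are the fraction-weighted averages $V_1^* = \sum_i \nu_i \cos 2\theta_i$ and $V_3^* = \sum_i \nu_i \cos 4\theta_i$, with fractions $\nu_i = j_i/I$ for integer counts $j_i$ that sum to $I$. The function name and the tuple layout of the result are hypothetical.

```python
import math
from itertools import product

def design_points(angles_deg, I):
    """List (counts, V1, V3) for every split of I ply groups among the angles."""
    points = []
    for counts in product(range(I + 1), repeat=len(angles_deg)):
        if sum(counts) != I:
            continue
        V1 = sum(c * math.cos(math.radians(2 * a))
                 for c, a in zip(counts, angles_deg)) / I
        V3 = sum(c * math.cos(math.radians(4 * a))
                 for c, a in zip(counts, angles_deg)) / I
        # Round away the ~1e-17 residue of cos(90 deg).
        points.append((counts, round(V1, 12), round(V3, 12)))
    return points

# Eight-ply laminate (I = 4) with 0-, +/-45-, and 90-deg plies.
for counts, V1, V3 in design_points((0, 45, 90), 4):
    print(counts, V1, V3)
```

For the eight-ply case the enumeration yields five equally spaced points on each edge of the triangle, including its vertices, plus three interior points.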
Just as for the in-plane lamination diagram, it is possible to construct the flexural
lamination diagram for a laminate with prescribed fiber orientations. The boundaries

of the design space are the same as for the in-plane parameters; the prescribed angles are on
the envelope of the lamination diagram and form the vertices of a polygon. However,
in this case the design points, which are combinations of the given angles, are not
equally spaced (although combinations of the angles corresponding to two vertices
are still located along the edge that connects these vertices) but are located through
the use of Eq. (11.2.40).

11.3.2 Penalty Function Formulation

Buckling Design: The procedure described in section 5.7.4 for the use of a penalty
function to achieve designs with discrete valued variables is demonstrated in this
section for buckling maximization of laminates with fiber orientation variables. In
order to establish results that can be used to compare with integer orientation designs,
a series of results was generated for the continuous problems, see Gurdal and Haftka
[24]. This was achieved by turning off the penalty terms for the non-discrete values
of the design variables.
The problems solved are for $a$ = 20 in by $b$ = 10 in (50.8 cm x 25.4 cm) rectangular
plates of specified numbers of plies and fiber orientation design variables. The critical
eigenvalues are maximized for an applied compressive load of $N_x = 1$ with varying $N_y/N_x$
ratios.
[Plot: optimum fiber angle $\theta$ (deg) versus load ratio $N_y/N_x$ from 0 to 2.5, with curves for midplane layers and surface layers]

Figure 11.3.2 Optimum continuous fiber orientations for maximum buckling load.

Plates with four different thicknesses corresponding to 8-, 12-, 16-, and 24-ply laminates
were designed. The optimal orientations of the surface layer fibers (indicated
by dashed lines) and the layer adjacent to the mid-plane (solid lines) are shown in
Fig. 11.3.2 for each of the four laminates. For uniaxial compression, $N_y = 0$ or
$N_x = 0$ (or $N_y/N_x > 2.5$), the laminates have the same fiber orientation through
the entire thickness, ±45-deg and 90-deg, respectively. For intermediate
load ratios, the fiber angles at the surface layer are larger than those of the mid-plane layers,
with the difference being largest for the thick 24-ply laminates. However, the fiber
orientation of the surface layers appears to depend only on the load ratio, and not
on the laminate thickness.
[Plot: buckling load reduction (%) versus load ratio $N_y/N_x$ from 0 to 2.5 for 24-ply, 16-ply, 12-ply, and 8-ply laminates]

Figure 11.3.3 Buckling load reduction for laminates with 0-deg, ±45-deg, and 90-deg
plies.

Next, the same design cases were repeated using discrete fiber orientations of 0-,
±45-, and 90-deg. Solutions were obtained with the penalty function approach, and
checked by the branch-and-bound approach described in section 11.3.3. Plies with
+45-deg orientation were required to be adjacent to -45-deg plies so as to minimize
bending-twisting coupling. For the penalty function approach, it was convenient to
require also the plies with 0- and 90-deg orientations to appear in pairs. Plots of
the percentage reduction in buckling load due to the restriction to discrete orientations
are shown in Fig. 11.3.3 for the four laminates. Discrete valued designs are
accompanied by a substantial buckling load reduction over at least a portion of
the load ratio range considered. The largest penalty (about 22% reduction) occurred
for $N_y/N_x = 0.5$ and the thin 8-ply and 12-ply laminates. However, buckling load
reductions associated with different thicknesses appeared to be quite random.
The laminate stacking sequences obtained for the discrete valued designs are

Table 11.3.1: Optimum stacking sequence for 8-ply laminates under biaxial compression.

Ny/Nx   Continuous Optima     Penalty Approach   Global Optima
0.0     [±45]_2s              [±45]_2s
0.25    [±53.7/±49.8]_s       [±45]_2s
0.50    [±64.3/±53.2]_s       [±45]_2s
0.75    [±70.0/±58.6]_s       [90_2/±45]_s
1.00    [±73.5/±65.8]_s       [90_2/±45]_s
1.50    [±79.4/±70.5]_s       [90_2/±45]_s
2.00    [±83.4/±78.1]_s       [90_2/±45]_s
2.50    [±89.2/±88.4]_s       [90_4]_s

Table 11.3.2: Optimum stacking sequence for 16-ply laminates under biaxial compression.

Ny/Nx   Continuous Optima        Penalty Approach   Global Optima
0.0     [±45]_4s                 [±45]_4s
0.25    [±52.2/.../±46.5]_s      [±45]_4s           [±45_2/90_4]_s
0.50    [±65.3/.../±60.0]_s      [90_2/±45_3]_s     [±45/90_6]_s
0.75    [±70.9/.../±52.3]_s      [90_2/±45_3]_s     [90_2/±45_2/90_2]_s
1.00    [±74.9/.../±52.6]_s      [90_4/±45_2]_s     [90_2/±45/90_4]_s
1.50    [±80.0/.../±64.1]_s      [90_6/±45]_s       [90_4/±45_2]_s
2.00    [±83.9/.../±71.8]_s      [90_6/±45]_s       [90_4/±45/90_2]_s
2.50    [±89.2/.../±87.9]_s      [90_8]_s

presented in Tables 11.3.1 and 11.3.2 for the 8-ply and the 16-ply laminates. Included
in the tables are the laminate stacking sequences for the continuous valued designs,
the discrete designs obtained by using the modified penalty method, and the global
optimal designs. If the design obtained by the penalty function approach is the same as
the global optimal design, the entry under the Global Optima column is left blank.
The penalty approach is unable to reach the global optimum in some cases, especially
for laminates with large numbers of plies. In every case, the discrete designs obtained
by the penalty function approach followed a pattern such that the orientations of the
outer plies were larger than those of the plies close to the mid-plane; this was similar to
the trend observed for the continuous designs. Global optimal designs, on the other
hand, had orientations that were more random. The differences in buckling loads
ranged up to 14%, and illustrate the danger of looking for the discrete optimum near
the continuous one.
11.3.3 Integer Linear Programming Formulation

The normalized integrals used as design variables for the graphical procedure, see Eqs.
(11.2.14) and (11.2.39), may not be a good choice for more general design problems.
In order to define the integrals that are needed for characterizing the laminate, a new
set of variables that define the existence of a given orientation layer or the orientation
of a specified layer was proposed by Haftka and Walsh [25]. Such variables are referred

to as ply-identity design variables. For example, if we have four possible orientations
and $N$ plies, we can use $N$ design variables that take the values of 1 to 4 to define
the stacking sequence. If symmetry is used this number can be reduced to $N/2$.
It is also possible to use zero-one ply-identity design variables. For example, if
the laminate is made up of 0-deg, 90-deg, and ±45-deg plies the stacking sequence
can be defined in terms of four sets of ply-orientation-identity variables $o_i$, $n_i$, $f_i^p$ and
$f_i^m$, $i = 1, \ldots, N/2$, that are zero-one integer variables. The variable $o_i$, $n_i$, $f_i^p$ or
$f_i^m$ is equal to one if there is a 0-deg, 90-deg, 45-deg or -45-deg ply, respectively, in
the $i$th layer.
The advantage of these zero-one ply-identity variables is that the integrals, and
therefore the A and D matrices, are linear functions of these variables. The integrals
$V_{0A}$, $V_{1A}$ and $V_{3A}$ are given in terms of the ply-identity variables and the thickness
of a single ply $t$ as

$$ V_{0A} = \int_{-h/2}^{h/2} dz = 2t\sum_{k=1}^{N/2}(o_k + n_k + f_k^p + f_k^m), $$

$$ V_{1A} = \int_{-h/2}^{h/2} \cos 2\theta\,dz = 2t\sum_{k=1}^{N/2}(o_k - n_k), \tag{11.3.1} $$

$$ V_{3A} = \int_{-h/2}^{h/2} \cos 4\theta\,dz = 2t\sum_{k=1}^{N/2}(o_k + n_k - f_k^p - f_k^m). $$

For the flexural response, the integrals $V_{0D}$, $V_{1D}$ and $V_{3D}$ are expressed as

$$ V_{1D} = \frac{2t^3}{3}\sum_{k=1}^{N/2} p_k\cos 2\theta_k\left[\left(\frac{z_k}{t}\right)^3 - \left(\frac{z_{k-1}}{t}\right)^3\right] = \frac{2t^3}{3}\sum_{k=1}^{N/2}\left[k^3 - (k-1)^3\right](o_k - n_k), \tag{11.3.2} $$

$$ V_{3D} = \frac{2t^3}{3}\sum_{k=1}^{N/2} p_k\cos 4\theta_k\left[\left(\frac{z_k}{t}\right)^3 - \left(\frac{z_{k-1}}{t}\right)^3\right] = \frac{2t^3}{3}\sum_{k=1}^{N/2}\left[k^3 - (k-1)^3\right](o_k + n_k - f_k^p - f_k^m), $$

where $f_k^p$ and $f_k^m$ do not appear in the expressions for $V_{1A}$ and $V_{1D}$ since the cosine of
90 degrees is equal to zero. The variable $p_k$ in Eq. (11.3.2) is unity if the $k$th ply is
occupied and zero if it is empty. Constraints are applied during the optimization to
ensure that $p_k$ can be zero only for the outermost plies.
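The linearity in the ply-identity variables makes these integrals simple to evaluate. The sketch below follows Eqs. (11.3.1) and (11.3.2) for a fully occupied half-laminate (all $p_k = 1$), with layers numbered from the mid-plane outward as implied by $z_k = kt$. The function name and the dictionary layout of the result are hypothetical.

```python
def lamination_integrals(stack_deg, t):
    """Evaluate V0A, V1A, V3A, V1D, V3D from zero-one ply-identity variables.

    stack_deg lists the plies of one half of a symmetric laminate from the
    mid-plane outward; allowed entries are 0, 90, 45 and -45. t is the
    thickness of a single ply.
    """
    V = {"V0A": 0.0, "V1A": 0.0, "V3A": 0.0, "V1D": 0.0, "V3D": 0.0}
    for k, angle in enumerate(stack_deg, start=1):
        o = 1 if angle == 0 else 0      # 0-deg identity variable
        n = 1 if angle == 90 else 0     # 90-deg identity variable
        fp = 1 if angle == 45 else 0    # +45-deg identity variable
        fm = 1 if angle == -45 else 0   # -45-deg identity variable
        if o + n + fp + fm != 1:
            raise ValueError(f"unsupported ply angle: {angle}")
        w = k ** 3 - (k - 1) ** 3       # cubic weight of the k-th layer
        V["V0A"] += 2.0 * t * (o + n + fp + fm)
        V["V1A"] += 2.0 * t * (o - n)
        V["V3A"] += 2.0 * t * (o + n - fp - fm)
        V["V1D"] += (2.0 * t ** 3 / 3.0) * w * (o - n)
        V["V3D"] += (2.0 * t ** 3 / 3.0) * w * (o + n - fp - fm)
    return V

print(lamination_integrals([45, -45, 90, 0], t=1.0))
```

For an all-0-deg stack the flexural integral reduces to $h^3/12$, which is a convenient sanity check on the cubic weights.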
Stacking Sequence for Buckling Design: Since the buckling load for symmetric
laminates under biaxial loads is a linear function of the flexural lamination parameters,
which are linear functions of the ply-identity variables (see Eqs. (11.2.32) and
(11.2.48)), the problem can be posed as a linear integer programming
problem.
Two formulations for the optimization problem are possible. The first is the
optimization of a laminate with a fixed thickness for maximum buckling load, and the
second is the optimization of a laminate for minimum thickness for a given buckling
load. For the first optimization problem the lowest buckling load $\lambda^*$, over values
of $m$ and $n$, is maximized. The objective $\lambda^*$ is not a smooth function of the design
variables, and the standard device (see section 2.4) for removing this problem is to
add $\lambda^*$ as a design variable and require it to be less than or equal to each $\lambda_{cr}(m,n)$.
Thus, the optimization problem is formulated as
$$ \begin{aligned} &\text{find} && \lambda^*, \text{ and } o_i,\ n_i,\ f_i^p,\ f_i^m, \quad i = 1, \ldots, N/2, \\ &\text{to maximize} && \lambda^* \\ &\text{such that} && \lambda^* \le \lambda_{cr}(m,n), \quad m = 1, \ldots, m_f, \quad n = 1, \ldots, n_f, \\ & && o_i + n_i + f_i^p + f_i^m = 1, \quad i = 1, \ldots, N/2, \\ &\text{and} && \sum_{i=1}^{N/2} \left(f_i^p - f_i^m\right) = 0. \end{aligned} \tag{11.3.3} $$

The minimization over $m$ and $n$ is performed by checking all values of $m$ between
1 and $m_f$, and all values of $n$ between 1 and $n_f$. The last constraint in Eq. (11.3.3)
ensures that the number of 45-deg and -45-deg plies is the same, so that the laminate
is balanced. The optimization problem of Eq. (11.3.3) is an integer linear programming
problem, and the methods described in section 3.9 can be applied.
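For laminates with only a few plies, the problem of Eq. (11.3.3) can be explored by plain enumeration instead of an integer programming solver. The sketch below is such a brute-force search, not the branch-and-bound procedure referred to in the text. It assumes the standard lamination-parameter form of the flexural stiffnesses with bending-twisting coupling ignored, the closed-form load factor for a simply supported plate with compressive loads taken as positive, and illustrative material invariants roughly typical of graphite/epoxy. All names and numerical values are assumptions, and units are assumed to be consistent.

```python
import math
from itertools import product

# Illustrative material invariants; assumed values, not taken from the text.
U1, U2, U3, U4, U5 = 76.37, 85.73, 19.71, 22.61, 26.88

def flexural_stiffness(stack, t):
    """D11, D12, D22, D66 of a symmetric laminate (coupling terms ignored).

    stack lists one half of the laminate from the mid-plane outward.
    """
    V0 = V1 = V3 = 0.0
    for k, angle in enumerate(stack, start=1):
        w = (2.0 * t ** 3 / 3.0) * (k ** 3 - (k - 1) ** 3)
        V0 += w
        V1 += w * math.cos(math.radians(2 * angle))
        V3 += w * math.cos(math.radians(4 * angle))
    D11 = U1 * V0 + U2 * V1 + U3 * V3
    D22 = U1 * V0 - U2 * V1 + U3 * V3
    D12 = U4 * V0 - U3 * V3
    D66 = U5 * V0 - U3 * V3
    return D11, D12, D22, D66

def lowest_load_factor(stack, t, a, b, Nx, Ny, m_f=6, n_f=6):
    """Smallest lambda_cr(m, n) over m = 1..m_f and n = 1..n_f."""
    D11, D12, D22, D66 = flexural_stiffness(stack, t)
    lam_min = math.inf
    for m in range(1, m_f + 1):
        for n in range(1, n_f + 1):
            load = Nx * (m / a) ** 2 + Ny * (n / b) ** 2
            if load <= 0.0:
                continue
            stiff = (D11 * (m / a) ** 4
                     + 2.0 * (D12 + 2.0 * D66) * (m / a) ** 2 * (n / b) ** 2
                     + D22 * (n / b) ** 4)
            lam_min = min(lam_min, math.pi ** 2 * stiff / load)
    return lam_min

def best_stack(n_half, t, a, b, Nx, Ny):
    """Enumerate all balanced half-stacks and keep the one with the largest load factor."""
    best_lam, best = -math.inf, None
    for stack in product((0, 45, -45, 90), repeat=n_half):
        if stack.count(45) != stack.count(-45):
            continue  # balance constraint of Eq. (11.3.3)
        lam = lowest_load_factor(stack, t, a, b, Nx, Ny)
        if lam > best_lam:
            best_lam, best = lam, stack
    return best_lam, best

print(best_stack(4, t=0.005, a=20.0, b=10.0, Nx=1.0, Ny=0.0))
```

The number of candidate stacks grows as $4^{N/2}$, so enumeration is practical only for thin laminates; that growth is the reason a branch-and-bound strategy is attractive for the thicker laminates considered here.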
For the dual problem of weight minimization of a laminate capable of sustaining
a specified load without buckling, the total number of layers must be variable. This
seems to contradict the use of ply-identity variables, which requires $N$ to be known
in advance. A remedy for this contradiction is to start with a number of layers large
enough so that the initial design does not buckle, but permit some of the plies to be
empty ($o_i + n_i + f_i^p + f_i^m \le 1$). Of course, plies that are permitted to be empty must
be the outer plies of the laminate in order to maintain integrity of the laminate. The
formulation takes the form
$$ \begin{aligned} &\text{find} && o_i,\ n_i,\ f_i^p,\ f_i^m, \quad i = 1, \ldots, N/2, \\ &\text{to minimize} && \sum_{i=1}^{N/2} \left(o_i + n_i + f_i^p + f_i^m\right) \\ &\text{such that} && \lambda_{cr}(m,n) \ge 1, \quad m = 1, \ldots, m_f, \quad n = 1, \ldots, n_f, \\ & && o_i + n_i + f_i^p + f_i^m \le 1, \quad i = 1, \ldots, N/2, \\ & && \sum_{i=1}^{N/2} \left(f_i^p - f_i^m\right) = 0, \\ &\text{and} && o_i + n_i + f_i^p + f_i^m \le o_{i-1} + n_{i-1} + f_{i-1}^p + f_{i-1}^m, \end{aligned} \tag{11.3.4} $$
where the last constraint ensures that the empty plies are on the outside.
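The linear constraints of Eq. (11.3.4) are easy to verify for a candidate set of zero-one variables. The following sketch checks occupancy, balance, and the requirement that empty plies lie on the outside; it deliberately leaves out the buckling constraint. Lists are indexed from the mid-plane outward, and the function name is hypothetical.

```python
def satisfies_ply_constraints(o, n, fp, fm):
    """Check the linear constraints of the weight-minimization formulation.

    o, n, fp, fm are equal-length lists of zero-one ply-identity variables,
    indexed from the mid-plane outward. The buckling constraint is not checked.
    """
    occupancy = [oi + ni + pi + mi for oi, ni, pi, mi in zip(o, n, fp, fm)]
    at_most_one = all(p in (0, 1) for p in occupancy)
    balanced = sum(fp) == sum(fm)
    # Occupancy may only drop from 1 to 0 when moving outward.
    empty_outside = all(occupancy[i] <= occupancy[i - 1]
                        for i in range(1, len(occupancy)))
    return at_most_one and balanced and empty_outside

# Half-laminate [0, +45, -45, empty] counted from the mid-plane outward.
print(satisfies_ply_constraints([1, 0, 0, 0], [0, 0, 0, 0],
                                [0, 1, 0, 0], [0, 0, 1, 0]))
```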


In general, the solution of the weight minimization problem is not unique. For a
minimum weight design with $N^*$ layers, it is possible to change the orientations of the
fibers and come up with designs that will have the same weight but different buckling
loads. Out of those feasible designs, ideally, one would like to choose the one that has
the largest margin for the buckling constraint. This can be achieved by subtracting
a small fraction of $\lambda_{cr}$ from the objective function, so that the modified objective
function serves the dual purpose of minimizing weight while maximizing the buckling
load. For results on weight minimization designs, the reader is referred to Haftka and
Walsh [25]. In the following paragraphs, results for buckling maximization will be
presented.

[Plot: buckling load reduction (%) versus load ratio $N_y/N_x$ from 0 to 2.5 for 24-ply, 16-ply, 12-ply, and 8-ply laminates]

Figure 11.3.4 Buckling load reduction for globally optimal laminates with 0-deg, ±45-deg,
and 90-deg plies.

For the results presented in this section the solution of Eq. (11.3.3) is generated
with the LINDO program [26], which employs the branch-and-bound algorithm
described in section 3.9.1. First we present the biaxial load cases that were reported
earlier in Tables 11.3.1 and 11.3.2 as global optima. A plot similar to the one shown
in Fig. 11.3.3, this time for the global optimum designs obtained through the use of
the linear integer programming approach, is shown in Fig. 11.3.4 for comparison. In
general, the buckling load reduction is slightly smaller for
most of the laminates. For example, the worst buckling load reduction (compared to
the continuous designs) is still for the 8-ply laminate for a load ratio of $N_y/N_x = 0.5$,
but it is only about 18% as compared to 22%. Also, there is an orderly progression
with increasing laminate thickness. The smallest and the largest buckling load

reductions are associated with the 24-ply and the 8-ply laminates, respectively.
When the number of contiguous plies with the same orientation angle is large,
composite laminates are known to experience matrix cracking. Therefore, it is desirable
to limit the number of such contiguous plies. We demonstrate the use of such
a constraint on the design obtained for $N_y/N_x = 2$. We start with the design that
was presented in Table 11.3.2, $[90_4/\pm45/90_2]_s$, for which we imposed the constraint that
the plies with different orientations appear in pairs. The critical load factor for this
optimal design was $\lambda_{cr} = 36.19$. Next, we relax this requirement and redesign the
plate so that we can have single plies with different orientations adjacent to one another.
This yields a design which has 5 contiguous 90-deg plies, $[90_5/+45/-45/90]_s$.
The critical load factor for this design is $\lambda_{cr} = 36.84$, a 1.8% increase compared to
the design which restricts each orientation to be in pairs. The fact that 45-deg plies
appear in a pair is of course coincidental. We then implement the contiguous ply
requirement by adding the constraint

$$ n_4 + n_5 + n_6 + n_7 + n_8 \le 4. \tag{11.3.5} $$

The design obtained with this constraint is $[90_4/+45/90_2/-45]_s$ and has a slightly
smaller load factor, $\lambda_{cr} = 36.59$, compared to the previous design. However, it still
has a slightly larger load factor compared to the design from Table 11.3.2, but violates
the requirement that off-axis angles appear in pairs. By introducing a constraint of
the form
if - Jr';-l = 0, i = 1,2, ... , (I -1),
( 11.3.6)
where if' = 0, and If = 0,
designs that have the 45-deg plies in plus and minus pairs can be achieved, without
requiring the 0- and the 90-deg plies to be in pairs, and without exceeding 4 contiguous
plies with the same orientation. In this particular case we obtain again the design
presented in Table 11.3.2.
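
The contiguity requirement discussed above can be checked directly on a candidate stacking sequence. The following is a minimal sketch, not the integer programming constraint itself: the function names are hypothetical, each half laminate is listed from the outer surface to the plane of symmetry, and a ±45 pair is written as +45 followed by -45.

```python
def expand_symmetric(half_sequence):
    """Return the full ply list of a symmetric laminate from one half."""
    return list(half_sequence) + list(reversed(half_sequence))


def max_contiguous(plies):
    """Length of the longest run of identical orientation angles."""
    longest = run = 0
    previous = object()  # sentinel that compares unequal to every angle
    for angle in plies:
        run = run + 1 if angle == previous else 1
        longest = max(longest, run)
        previous = angle
    return longest


# The three 16-ply designs quoted in the text for Ny/Nx = 2 (half laminates).
designs = {
    "paired":      [90, 90, 90, 90, 45, -45, 90, 90],
    "unpaired":    [90, 90, 90, 90, 90, 45, -45, 90],
    "constrained": [90, 90, 90, 90, 45, 90, 90, -45],
}

if __name__ == "__main__":
    for name, half in designs.items():
        print(name, max_contiguous(expand_symmetric(half)))
```

With these inputs the paired and constrained designs have at most 4 contiguous plies, while the unpaired design has 5, which is what motivates Eq. (11.3.5). Note that the check is applied to the full laminate, so runs that cross the plane of symmetry are counted.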
Stiffness and Buckling Design: In some cases it may be desirable to impose
constraints on the stiffness of the plate. For example, a constraint requiring $A_{11}$ to
be no smaller than a prescribed minimum value can be written as

(11.3.7)

As shown in [25] this constraint can be expressed as a linear function of the ply
identity design variables similar to the buckling constraint. Therefore, it can be used
as a constraint in the problem formulated by Eqs. (11.3.3). The effect of introducing
a minimum stiffness requirement is checked for $N_y/N_x = 2$. The optimum laminate
for this case was dominated by 90-deg plies, and has only 16 percent of the axial
stiffness $A_{11}$ of an all 0-deg laminate. A requirement that $A_{11}$ be at least 50 percent
of that of the unidirectional laminate was added, with and without the requirement of no
more than four contiguous plies. The results are compared to the original design in
Fig. 11.3.5. It is seen that the stiffness requirement is satisfied by putting 0-deg plies
near the plane of symmetry where they have only a minimal effect on the bending
stiffnesses, and hence on the buckling load. The reduction in the buckling load is
about 8 percent. For this design, adding the requirement of no more than 4 contiguous
plies had a substantial effect (7 percent reduction) on the buckling load.

[Figure: three laminate cross sections with a key distinguishing 90-deg, 45-deg, and 0-deg
plies. Without constraints: $(90_5/45/{-45}/90)_s$, $\lambda_{cr} = 36.84$. With stiffness
constraint: $(90_4/0_4)_s$, $\lambda_{cr} = 33.77$. With stiffness and contiguous ply
constraints: $(90_3/0_2/90/0_2)_s$, $\lambda_{cr} = 31.36$.]

Figure 11.3.5 Effect of stiffness requirement on laminate design.
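
The stiffness percentages quoted above can be checked with a few lines of classical lamination theory. The sketch below is illustrative only: the ply properties are assumed graphite-epoxy values that are not given in this section, all plies are assumed to have equal thickness, and the function names are hypothetical.

```python
import math

# Assumed graphite-epoxy ply properties in psi (illustrative, not from this section).
E1, E2, G12, NU12 = 18.5e6, 1.89e6, 0.93e6, 0.3


def qbar11(theta_deg):
    """Transformed reduced stiffness Qbar_11 of a ply oriented at theta."""
    nu21 = NU12 * E2 / E1
    denom = 1.0 - NU12 * nu21
    q11, q22, q12, q66 = E1 / denom, E2 / denom, NU12 * E2 / denom, G12
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    return q11 * c**4 + 2.0 * (q12 + 2.0 * q66) * s**2 * c**2 + q22 * s**4


def a11_ratio(half_sequence):
    """A11 of a laminate divided by A11 of an all 0-deg laminate of equal thickness."""
    total = sum(qbar11(theta) for theta in half_sequence)
    return total / (len(half_sequence) * qbar11(0.0))


if __name__ == "__main__":
    print(a11_ratio([90] * 5 + [45, -45, 90]))  # design without stiffness constraint
    print(a11_ratio([90] * 4 + [0] * 4))        # design with stiffness constraint
```

Under these assumed properties the unconstrained design gives a ratio of about 0.16 and the $(90_4/0_4)_s$ design about 0.55, while replacing one of the four 0-deg plies by a 90-deg ply drops the ratio below 0.5. Because $A_{11}$ does not depend on the position of a ply through the thickness, only the ply counts matter here; the stacking order affects the bending stiffnesses and hence the buckling load.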
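Stacking Sequence for Strength and Buckling Design: In the absence of applied
shear loads, the laminate strains $\epsilon_x$ and $\epsilon_y$ can be calculated (for a load
factor $\lambda = 1$) from

$$\epsilon_x = \frac{A_{22} N_x - A_{12} N_y}{A_{11} A_{22} - A_{12}^2} \quad \text{and} \quad \epsilon_y = \frac{A_{11} N_y - A_{12} N_x}{A_{11} A_{22} - A_{12}^2}. \tag{11.3.8}$$

The strains for the kth ply may be calculated from the transformation

$$\begin{aligned}
\epsilon_1^k &= \cos^2\theta_k\,\epsilon_x + \sin^2\theta_k\,\epsilon_y, \\
\epsilon_2^k &= \sin^2\theta_k\,\epsilon_x + \cos^2\theta_k\,\epsilon_y, \\
\gamma_{12}^k &= \sin 2\theta_k\,(\epsilon_y - \epsilon_x).
\end{aligned} \tag{11.3.9}$$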
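
The strain calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the laminate is balanced and symmetric so that the in-plane loads and midplane strains are related through $A_{11}$, $A_{12}$, and $A_{22}$ only, no shear load is applied, and the function names are hypothetical.

```python
import math


def laminate_strains(a11, a12, a22, nx, ny):
    """Midplane strains for unit load factor, solving [A] {eps} = {N} without shear."""
    det = a11 * a22 - a12 ** 2
    eps_x = (a22 * nx - a12 * ny) / det
    eps_y = (a11 * ny - a12 * nx) / det
    return eps_x, eps_y


def ply_strains(theta_deg, eps_x, eps_y):
    """Strains along the fibers, transverse to them, and in shear for a ply at theta."""
    theta = math.radians(theta_deg)
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    eps_1 = c2 * eps_x + s2 * eps_y
    eps_2 = s2 * eps_x + c2 * eps_y
    gamma_12 = math.sin(2.0 * theta) * (eps_y - eps_x)
    return eps_1, eps_2, gamma_12


if __name__ == "__main__":
    ex, ey = laminate_strains(2.0, 0.5, 1.0, 1.0, 0.5)  # arbitrary illustrative numbers
    for angle in (0, 45, 90):
        print(angle, ply_strains(angle, ex, ey))
```

For a 0-deg ply the ply strains coincide with the laminate strains, for a 90-deg ply they are interchanged, and for a 45-deg ply both normal strains equal the average of $\epsilon_x$ and $\epsilon_y$ while the shear strain is their difference.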

Even though the extensional stiffnesses $A_{ij}$ are linear functions of the design variables,
the strains calculated by Eq. (11.3.8) are nonlinear functions of these variables. These
strains can be linearized, as shown by Nagendra et al. [27], by a linear Taylor series
in $A_{ij}$. We have

$$\epsilon_L = \epsilon(x_0) + \sum_{ij} \left.\frac{\partial \epsilon}{\partial A_{ij}}\right|_{x_0} (A_{ij} - A_{ij0}), \tag{11.3.10}$$

where $\epsilon$ is a typical strain component ($\lambda = 1$), $\epsilon_L$ is its linear approximation, and
$A_{ij0}$ and $A_{ij}$ are the extensional stiffnesses calculated at the nominal design point $x_0$
and neighboring designs, respectively. The derivatives of the strain with respect
to the extensional stiffnesses at the nominal design point are calculated in terms of
the midplane strains and the extensional stiffnesses at the nominal design. The linear
strain approximation can thus be constructed along a particular fiber orientation and
transverse to it by evaluating the strains $\epsilon_1$, $\epsilon_2$, and $\gamma_{12}$ for each orientation (since the
orientation is chosen a priori, either 0° or 45°) in terms of the midplane strains using
Eq. (11.3.9). For example, the strains along and transverse to the 45° fibers and in
shear can be derived as

$$\epsilon_1 = \frac{1}{2}(\epsilon_x + \epsilon_y), \quad \epsilon_2 = \frac{1}{2}(\epsilon_x + \epsilon_y), \quad \gamma_{12} = \epsilon_y - \epsilon_x. \tag{11.3.11}$$

The derivatives needed for the strain approximation of Eq. (11.3.10) can then be
obtained by differentiating Eq. (11.3.11). For example, the derivative of the strain
along the 45° fiber with respect to $A_{11}$ can be written as

$$\frac{\partial \epsilon_1}{\partial A_{11}} = \frac{1}{2}\,\frac{(A_{12} - A_{22})}{(A_{11} A_{22} - A_{12}^2)}\,\epsilon_x, \tag{11.3.12}$$

where $A_{ij}$ are the extensional stiffnesses at the nominal design point. Similar strain
derivatives with respect to $A_{22}$ and $A_{12}$ can be derived. The extensional stiffnesses are
a linear function of the ply-identity design variables, thus the strain approximation
is a linear function of the ply-identity variables. It is also important to note that the
strains are initially calculated based on some reference value of the load. In order
to implement the strain constraint they have to be multiplied by the value of the
buckling load multiplier $\lambda_c$, which is also a function of the design variables,

$$\lambda_c \epsilon_i < \epsilon_i^a, \tag{11.3.13}$$

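where $\epsilon_i^a$ is the strain allowable. The strain constraint of Eq. (11.3.13) can be
linearized by moving $\lambda_c$ to the right-hand side and expanding $1/\lambda_c$ in a linear Taylor
series to obtain

$$\epsilon_i < \epsilon_i^a \left[\frac{1}{\lambda_0} - \frac{\lambda_c - \lambda_0}{\lambda_0^2}\right], \tag{11.3.14}$$

where $\lambda_0$ is the buckling load factor for the nominal design.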
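
The derivative in Eq. (11.3.12) and the first-order expansion of $1/\lambda_c$ can both be checked numerically. The sketch below is a hedged illustration rather than the procedure of Ref. [27]: it assumes the midplane strains follow from the in-plane loads through $A_{11}$, $A_{12}$, and $A_{22}$ only (balanced symmetric laminate, no shear), uses arbitrary numbers, and all function names are hypothetical.

```python
def midplane_strains(a11, a12, a22, nx, ny):
    """Midplane strains for unit load factor (no shear, A16 = A26 = 0 assumed)."""
    det = a11 * a22 - a12 ** 2
    return (a22 * nx - a12 * ny) / det, (a11 * ny - a12 * nx) / det


def eps1_at_45(a11, a12, a22, nx, ny):
    """Strain along the 45-deg fibers: the average of eps_x and eps_y."""
    eps_x, eps_y = midplane_strains(a11, a12, a22, nx, ny)
    return 0.5 * (eps_x + eps_y)


def d_eps1_d_a11(a11, a12, a22, nx, ny):
    """Analytical derivative of the 45-deg fiber strain with respect to A11."""
    eps_x, _ = midplane_strains(a11, a12, a22, nx, ny)
    return 0.5 * (a12 - a22) / (a11 * a22 - a12 ** 2) * eps_x


def linearized_reciprocal(lam, lam0):
    """First-order Taylor expansion of 1/lam about the nominal value lam0."""
    return 1.0 / lam0 - (lam - lam0) / lam0 ** 2


if __name__ == "__main__":
    a11, a12, a22, nx, ny = 2.0, 0.5, 1.0, 1.0, 0.5  # arbitrary illustrative values
    h = 1e-6
    finite_difference = (eps1_at_45(a11 + h, a12, a22, nx, ny)
                         - eps1_at_45(a11 - h, a12, a22, nx, ny)) / (2.0 * h)
    print(finite_difference, d_eps1_d_a11(a11, a12, a22, nx, ny))
    print(linearized_reciprocal(110.0, 100.0), 1.0 / 110.0)
```

The central difference and the analytical expression should agree to many digits. Since $1/\lambda$ is convex for positive $\lambda$, its tangent never exceeds the exact value, so the linearized right-hand side of the strain constraint is never larger than the exact one and the linearization errs on the conservative side.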


The linear strain constraint of Eq. (11.3.14) can now be added to the problem
formulation of Eqs. (11.3.3) for designing laminates that are buckling and strength
failure resistant. Since the formulation involves a local approximation for the strength
constraint, sequential linear programming needs to be used. In using sequential linear
programming, imposing move limits is generally recommended so that designs generated
based on approximate constraints remain in or near the feasible design space. In the
case of zero/one ply-identity variables, imposing move limits on the design variables is
not practical. Hence move limits were applied as bounds on the extensional stiffnesses
$A_{ij}$ expressed in terms of the ply-identity variables. This requires addition of six more
constraints to the problem

$$A_{ij}^{L} \le A_{ij} \le A_{ij}^{U}, \quad i,j = 1,2. \tag{11.3.15}$$

[Figure: laminate cross sections about the symmetry plane, with a key distinguishing
90-deg, 45-deg, and 0-deg plies, for two load cases. For Ny = 0.25 lb/in: LINDO optimum
design with buckling constraint, λcr = 13441.85; LINDO optimum design with buckling and
strain failure constraints, λcr = 12622.44; global optimum design with buckling and strain
failure constraints, λcr = 12674.84. For Ny = 0.5 lb/in: λcr = 9999.13, 9998.18, and
9998.18 for the same three designs.]

Figure 11.3.6 Maximum buckling load designs with strength constraints.

Designs with strength constraints were obtained for laminates that are thicker
than those considered in the previous cases so that the buckling loads are likely to
violate the strain failure constraints. Design results for 48-ply laminates under two
different combinations of biaxial loads ($N_y/N_x = 0.25$ and $N_y/N_x = 0.5$), for
$N_x = 1$ lb/in (175 N/m), are presented in Fig. 11.3.6, along with the results for designs
with no strain constraint. Since the method used involves local approximations, the
final design may be a locally optimal design. Designs with a higher confidence of
being globally optimum can be generated by using one of the probabilistic search
algorithms for nonlinear programming problems with discrete valued design variables
(see Chapter 4). The last design in each of the load cases presented in Fig. 11.3.6
was generated using the genetic algorithm discussed in Section 4.4.2 and verified to be
the global optimum design. Compared to the design without the strength failure
constraint, the failure load factor decreased by 6.05% for $N_y = 0.25$. Although the
design for this load case was only a local maximum, the load factor differed from the
global optimum design only by a fraction of a percent. For the load ratio of 0.5, the
design without the strain constraint violated the shear strength by 7%. The design
obtained from the sequential integer linear programming approach was also the global
optimum.

11.3.4 Probabilistic Search Methods

Probabilistic search methods such as simulated annealing and genetic algorithms have
a number of parameters that can be tuned to tailor the method to the problem at
hand. For simulated annealing these parameters include the initial temperature and
the rate of cooling. For genetic algorithms the tuning parameters are the probabilities
of the various genetic operators, such as mutation, as well as population size and
convergence criteria. The design of unstiffened laminates using Classical Lamination
theory is a good problem for tuning such parameters because it is so computationally
inexpensive to optimize.
For simulated annealing Lombardi [28] studied the effect of initial temperature
and cooling rate on the performance of the algorithm for the buckling load maximization
problem described in the previous section. The performance of the algorithm was
judged by two criteria: computational cost and reliability in finding the global optimum.
The problem tends to have a large number of solutions (stacking sequences)
with very similar buckling loads. For this reason, a success was defined as a solution
which is within 0.1% of the maximum buckling load. Results were obtained for 32-ply
plates where plies were grouped in stacks of two 0-deg, 90-deg or ±45-deg plies.
For symmetric laminates this requires defining the angles of 8 stacks for a total
of $3^8 = 6561$ possibilities. The simulated annealing algorithm required about 1000
analyses for high reliability, which is a sizable fraction of the design space. However,
when the number of plies was increased from 32 to 64, the number of required analyses
increased only to about 3000, while the number of possible designs increased to
$3^{16} \approx 43$ million.
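
To make the discussion concrete, the following is a minimal simulated annealing sketch for the 8-stack problem; it is not the algorithm or the parameter set studied in [28]. The ply properties, plate dimensions, loads, initial temperature, cooling rate, and acceptance rule based on the relative change in load factor are all assumptions chosen for illustration. The buckling load factor is taken from the classical solution for a simply supported, specially orthotropic plate under biaxial compression, and bending-twisting coupling of the ±45 stacks is ignored.

```python
import itertools
import math
import random

# Assumed ply properties and plate data (illustrative; not given in this section).
E1, E2, G12, NU12 = 18.5e6, 1.89e6, 0.93e6, 0.3   # psi
STACK_T = 0.010                                    # thickness of a two-ply stack, in
A_LEN, B_LEN, NX, NY = 20.0, 5.0, 1.0, 0.5         # in, in, lb/in, lb/in
ANGLES = (0, 45, 90)                               # 45 denotes a +45/-45 stack


def stack_stiffness(theta_deg):
    """Return Qbar11, Qbar12, Qbar22, Qbar66 of a stack at the given angle."""
    nu21 = NU12 * E2 / E1
    d = 1.0 - NU12 * nu21
    q11, q22, q12, q66 = E1 / d, E2 / d, NU12 * E2 / d, G12
    c2 = math.cos(math.radians(theta_deg)) ** 2
    s2 = 1.0 - c2
    u = q12 + 2.0 * q66
    return (q11 * c2 * c2 + 2.0 * u * s2 * c2 + q22 * s2 * s2,
            (q11 + q22 - 4.0 * q66) * s2 * c2 + q12 * (s2 * s2 + c2 * c2),
            q11 * s2 * s2 + 2.0 * u * s2 * c2 + q22 * c2 * c2,
            (q11 + q22 - 2.0 * q12 - 2.0 * q66) * s2 * c2 + q66 * (s2 * s2 + c2 * c2))


QBAR = {angle: stack_stiffness(angle) for angle in ANGLES}


def buckling_factor(half):
    """Critical load factor of a symmetric laminate; half lists stacks from the outside in."""
    n = len(half)
    d11 = d12 = d22 = d66 = 0.0
    for k, angle in enumerate(half):
        z_out, z_in = (n - k) * STACK_T, (n - k - 1) * STACK_T
        w = 2.0 * (z_out ** 3 - z_in ** 3) / 3.0   # both halves of the laminate
        q = QBAR[angle]
        d11 += w * q[0]
        d12 += w * q[1]
        d22 += w * q[2]
        d66 += w * q[3]
    best = math.inf
    for m in range(1, 21):          # half-waves along the length
        for nn in range(1, 4):      # half-waves across the width
            ma2, nb2 = (m / A_LEN) ** 2, (nn / B_LEN) ** 2
            lam = math.pi ** 2 * (d11 * ma2 ** 2 + 2.0 * (d12 + 2.0 * d66) * ma2 * nb2
                                  + d22 * nb2 ** 2) / (ma2 * NX + nb2 * NY)
            best = min(best, lam)
    return best


def anneal(n_stacks=8, t0=0.05, cooling=0.995, steps=1000, seed=0):
    """Maximize the buckling factor; returns the best sequence found and its factor."""
    rng = random.Random(seed)
    current = [rng.choice(ANGLES) for _ in range(n_stacks)]
    f_cur = buckling_factor(current)
    best, f_best = list(current), f_cur
    temp = t0
    for _ in range(steps):
        trial = list(current)
        i = rng.randrange(n_stacks)
        trial[i] = rng.choice([a for a in ANGLES if a != trial[i]])
        f_try = buckling_factor(trial)
        rel = (f_try - f_cur) / f_cur
        # Always accept improvements; accept a worse design with Boltzmann probability.
        if rel >= 0.0 or rng.random() < math.exp(rel / temp):
            current, f_cur = trial, f_try
            if f_cur > f_best:
                best, f_best = list(current), f_cur
        temp *= cooling
    return best, f_best


def brute_force_best(n_stacks=8):
    """Enumerate all 3**n_stacks designs (6561 for 8 stacks) for comparison."""
    return max(buckling_factor(s) for s in itertools.product(ANGLES, repeat=n_stacks))


if __name__ == "__main__":
    sequence, factor = anneal()
    print(sequence, factor, factor / brute_force_best())
```

For 8 stacks the full enumeration is cheap enough to serve as a reference, which is how a success rate such as "within 0.1% of the maximum" can be measured. For 16 stacks the $3^{16} = 43{,}046{,}721$ designs make enumeration far more expensive, which is the regime where the modest growth in the number of annealing analyses becomes attractive.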
Le Riche and Haftka [29] solved the same buckling maximization problem for 48-
and 64-ply laminates using genetic algorithms. Tuning the probabilities of the genetic
operators as well as the population size could reduce substantially the number of
required analyses. For 48-ply laminates, for example, the number of required analyses
was found to be about 200-300. One advantage of the genetic algorithm is that
it yields several near optimal designs, rather than one optimum. For example, for a
plate with $a = 20$ in, $b = 5$ in, $N_x = 1$ lb/in, and $N_y = 0.5$ lb/in two of the best designs
were $[90_2, \pm45_2, 90_2, \pm45, 90_2, \pm45_6]_s$ and $[\pm45, 90_4, \pm45, 90_2, \pm45_5, 90_2, \pm45]_s$. The
first laminate has a buckling load of $\lambda_c = 9998$, while the second buckles at $\lambda_c = 9976$.
For a designer, the differences between the laminates, such as the presence of ±45-deg
plies on the outside, or the reduced percentage of 90-deg plies in the second laminate
may be more important than the 0.2% difference in buckling loads.

11.4 Design Applications

11.4.1 Stiffened Plate Design

Laminated plates stiffened by longitudinal and transverse members are one of the
most common structural components. Use of stiffeners makes it possible to resist
highly directional loads, and to introduce multiple load paths that may provide pro-
tection against damage and crack growth under both compressive and tensile loads.
The biggest advantage of the stiffeners, though, is the increased bending stiffness
of the panel with a minimum of additional material, which makes these structures
highly desirable for out-of-plane loads and destabilizing compressive loads. In addi-
tion to placement of the stiffeners to resist directional loads, the use of composite
materials makes it possible to further tailor the stiffness and strength characteristics
of the individual elements (such as webs, flanges, and skin) of a stiffened plate to meet
various structural requirements. This local tailoring is achieved through selection of
ply orientations and thicknesses for the different sections of the plate. Also the use
of composite materials makes it possible to adopt stiffener cross-sectional geometries
which may be expensive to manufacture using metallic materials.
However, the complex behavior of stiffened composite plates makes it difficult
to adopt the simplifying assumptions used for the analysis of flat laminates which
often lead to closed-form solutions. Therefore, design optimization of such plates
typically requires use of numerical algorithms. In this section we will discuss the
design of stiffened composite plates under compressive and shear loadings, and subject
to mainly buckling constraints.
In one of the early studies of optimum design of stiffened plates, Stroud and
Agranoff [30] considered a longitudinally stiffened plate composed of an assembly
of orthotropic plate elements. The plate configurations were limited to corrugated
and hat-stiffened plates, but the same procedure used in Ref. 30 can be extended to
other geometries such as the ones shown in Figure 11.4.1. The simplified analysis was
based on buckling of orthotropic plates with simply supported boundary conditions.
Both global and local modes of buckling were considered. The global buckling analysis
modeled the stiffened plate as an orthotropic plate with smeared stiffeners, assumed to
buckle as a wide column. For local buckling, each element of the plate was considered
separately as a narrow strip of orthotropic plate with simply supported boundary
conditions along the lines of attachment to adjacent elements. That is, the rotational
restraint between panel elements such as stiffener and skin was ignored, and the
continuity of the buckling mode shapes between different elements was not accounted
for. Equations for the buckling loads resulting from these assumptions are presented
in Table 11.4.1 for plates loaded by compressive and shear loads.

Figure 11.4.1 Examples of typical stiffened panel concepts.

The local buckling equations in the table are applied to each of the plate elements
of width $b$ and length $L$. The length $L$ of each element is assumed to be much larger
than the width of the elements for both longitudinal compression and shear loadings.
The $D_{ij}$'s are the bending stiffness coefficients (Eq. 11.1.18) of the respective plate
elements. For global buckling under longitudinal compression, the panel is treated
as a wide column with the loaded edges simply supported and the unloaded edges
free. The longitudinal stiffness of the column is equal to the smeared longitudinal
stiffness of the panel, $EI$. For the shear loading case, the stiffened panel is modeled
as a uniform thickness orthotropic laminate (with smeared orthotropic properties,
$D_1$, $D_2$, and $D_3$) infinitely long in the transverse direction and simply supported
along the loaded edges. The smeared stiffness terms ($EI$, $D_1$, $D_2$, and $D_3$) in the
global buckling relations strongly depend on the cross-sectional configuration of the
stiffeners. The calculation of these smeared stiffnesses for complicated stiffened panel
geometries is quite involved and requires various kinematic assumptions depending
on the applied loads. The derivation of some of the smeared stiffness terms is
demonstrated in Ref. 30 for corrugated and hat-stiffened panels.
The design problem of Ref. 30 was formulated as a mathematical programming
problem with panel mass per unit width being the objective function. Design variables
were the element widths and thicknesses of the layers that make up the elements. The
design constraints were buckling load, strength and stiffness requirements, and lower
and upper bounds on some of the panel dimensions. A general purpose optimization
code AESOP [35], which is based on exterior penalty function formulation, was used
for the design optimization.

Table 11.4.1 Overall and Local Buckling Equations from Reference 30

Loading                    Equation                                                                                   Reference

Global Buckling
Longitudinal compression   $N_{x,cr} = \pi^2 EI / L^2$                                                                 Eq. (92), [31]; Eq. (3), [32]
Shear                      $\zeta = \sqrt{D_1 D_2} / D_3$                                                              Eqs. (2.2.2-21), (2.2.2-22), [33];
                           For $\zeta > 1$: $N_{xy,cr} = (2/L)^2 (D_1^3 D_2)^{1/4} (8.125 + 5.05/\zeta)$               pp. 468-471, [34]
                           For $\zeta < 1$: $N_{xy,cr} = (2/L)^2 \sqrt{D_1 D_3}\,(11.7 + 0.532\zeta + 0.938\zeta^2)$
Combined                   $N_x / N_{x,cr} + (N_{xy} / N_{xy,cr})^2 = 1$                                               Eq. (105.8), [34]

Local Buckling
Longitudinal compression   $N_{x,cr} = (2\pi^2/b^2) [(D_{11} D_{22})^{1/2} + D_{12} + 2 D_{66}]$                        Eq. (92), [31]; Eq. (3), [32]
Shear                      $\zeta = \sqrt{D_{11} D_{22}} / (D_{12} + 2 D_{66})$                                        Eqs. (2.2.2-21), (2.2.2-22), [33];
                           For $\zeta > 1$: $N_{xy,cr} = (2/b)^2 (D_{11} D_{22}^3)^{1/4} (8.125 + 5.05/\zeta)$         pp. 468-471, [34]
                           For $\zeta < 1$: $N_{xy,cr} = (2/b)^2 \sqrt{D_{22} (D_{12} + 2 D_{66})}\,(11.7 + 0.532\zeta + 0.938\zeta^2)$
Combined                   $N_x / N_{x,cr} + (N_{xy} / N_{xy,cr})^2 = 1$                                               Eq. (105.8), [34]
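
The local buckling relations for a long, simply supported orthotropic plate element are simple enough to code directly. The following is a minimal sketch assuming the standard closed-form expressions for such a strip (uniaxial compression, shear, and the parabolic compression-shear interaction); the function names are hypothetical and compression is taken as positive.

```python
import math


def local_compression(d11, d12, d22, d66, b):
    """N_x,cr of a long simply supported orthotropic strip of width b."""
    return 2.0 * math.pi ** 2 / b ** 2 * (math.sqrt(d11 * d22) + d12 + 2.0 * d66)


def local_shear(d11, d12, d22, d66, b):
    """N_xy,cr of a long simply supported orthotropic strip of width b."""
    d3 = d12 + 2.0 * d66
    zeta = math.sqrt(d11 * d22) / d3
    if zeta > 1.0:
        return (2.0 / b) ** 2 * (d11 * d22 ** 3) ** 0.25 * (8.125 + 5.05 / zeta)
    return (2.0 / b) ** 2 * math.sqrt(d22 * d3) * (11.7 + 0.532 * zeta + 0.938 * zeta ** 2)


def combined_load_factor(nx, nxy, nx_cr, nxy_cr):
    """Positive load factor lam satisfying lam*nx/nx_cr + (lam*nxy/nxy_cr)**2 = 1."""
    r_x, r_s = nx / nx_cr, abs(nxy) / nxy_cr
    if r_s == 0.0:
        if r_x <= 0.0:
            raise ValueError("no destabilizing load applied")
        return 1.0 / r_x
    # Positive root of r_s**2 * lam**2 + r_x * lam - 1 = 0.
    return (-r_x + math.sqrt(r_x ** 2 + 4.0 * r_s ** 2)) / (2.0 * r_s ** 2)


if __name__ == "__main__":
    # Isotropic check case: D11 = D22 = D, D12 = nu*D, D66 = (1 - nu)*D/2.
    d, nu, b = 1.0, 0.3, 2.0
    print(local_compression(d, nu * d, d, 0.5 * (1.0 - nu) * d, b))
    print(local_shear(d, nu * d, d, 0.5 * (1.0 - nu) * d, b))
```

For an isotropic plate the compression formula reduces to the familiar $4\pi^2 D / b^2$, which gives a quick sanity check. The two shear branches meet almost continuously at $\zeta = 1$ (coefficients 13.175 and 13.17), so the choice of branch at exactly $\zeta = 1$ is immaterial.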

A more rigorous design procedure [36], based on the stiffened panel buckling and
vibration analysis code VIPASA [37, 38] and the mathematical programming code
CONMIN [39], which uses the method of feasible directions (see Section 5.6),
was introduced to improve on some of the assumptions made in Ref. 30. The analysis
code VIPASA is capable of computing buckling loads of structures comprised of flat
rectangular plate elements connected together along their longitudinal edges. As
opposed to the procedure used in Ref. 30, the analysis accounts for the physical
connection between the adjacent elements by maintaining the continuity of the buckle
patterns across the intersection of neighboring plate elements. Buckling solutions are
based on exact thin-plate equations with $D_{16}$ and $D_{26}$ anisotropic stiffness terms so
that bending-twisting coupling is allowed. Individual plate elements may be isotropic,
orthotropic, or anisotropic. However, the laminates that make up the elements are
limited to balanced symmetric layups such that bending-extension and extension-shearing
couplings are eliminated. Another limitation of the analysis is the buckling
boundary conditions. Although the unconnected longitudinal edges may take various
boundary conditions, the boundary conditions along the loaded edges are limited
to simply supported conditions. Any combination of longitudinal, transverse, and
shearing loads that are constant along the length of the panel may be applied, see
Fig. 11.4.2. However, as will be discussed later, in the case of applied shear loads
the limitation of the simply supported boundary condition at the loaded edge may
result in inaccuracies in the buckling load calculations.

Figure 11.4.2 Loading conditions and initial imperfections.

The VIPASA analysis program was eventually used by Stroud and Anderson as
the basis of a design code PASCO [40, 41] which is commonly used for preliminary design
of uniaxially-stiffened panel structures. PASCO uses the nonlinear mathematical
programming code CONMIN [39] for optimization. The design problem is formulated
so as to minimize the panel mass for a given set of loadings. Constraints include upper
and lower bounds on design variables, lower bounds on material strength and
buckling loads, lower and upper bounds on overall bending, extensional, and shear
stiffnesses, and lower bounds on vibration frequencies. In addition to the design condition
described for VIPASA analysis ($N_x$, $N_y$, $N_{xy}$), PASCO includes applied bending
moment ($M_x$), lateral pressure ($p$), overall bow-type initial imperfections, and temperature
loadings. The effects of the bending strains, resulting from the applied
bending moment, pressure, initial imperfection, or the temperature, are included in
the strain failure analysis by superimposing them on the uniform strains resulting
from the in-plane loads. The bending strains resulting from the applied pressure
and bow-type imperfections are calculated based on a beam-column approach [42] by
calculating the corresponding bending moment at the panel midlength. This maximum
bending moment is conservatively assumed to act over the entire panel length.
This approach is in line with the VIPASA requirement that the prebuckling stress
distribution be constant along the panel length. For more detailed discussion of the
bending moments see Ref. 40. Use of multiple sets of design conditions is also allowed
in PASCO. The set of design variables consists of the widths, $b$, the ply thicknesses,
$t$, and orientations, $\theta$, of any of the plate elements that make up the panel. Reducing
the number of design variables by linking some of the element dimensions
or ply orientations through linear relations is also possible. PASCO is also capable
of implementing approximations for the buckling and vibration constraints through
first-order Taylor series expansion of those constraints, and of setting move limits for the
design variables. This aspect of the code makes it computationally efficient and very
attractive for preliminary design purposes, and lets the designer compare various
design concepts in a cost-effective manner.

Example 11.4.1

[Figure: cross sections of the five design configurations. a) Tailored corrugated panel.
b) Corrugated panel with continuous laminate. c) Hat-stiffened panel. d) Blade-stiffened
plate. e) Unstiffened flat plate.]

Figure 11.4.3 Design configurations.

This example by Swanson and Gürdal [43] is a comparison of the structural
efficiencies of optimally designed composite wing rib panel configurations typical of a
center-wing-box fuel-cell closeout rib of large transport-type aircraft. Rib dimensions
of 28 inches high by 80 inches wide are used. The panel configurations are chosen
to be practical and applicable to cost-effective manufacturing techniques. These
configurations are shown in Fig. 11.4.3, and include a tailored corrugated panel, a
corrugated panel with a continuous laminate throughout its length and width, and a
hat-stiffened panel. A corrugated panel is relatively easy to manufacture since it has
continuous plies which run throughout the configuration and form integral stiffeners
without requiring fasteners. It is also suitable for the thermoforming process, which is
a potentially economical manufacturing technique for thermoplastic materials. Also
included are a blade-stiffened panel, which is the most commonly used concept for
wing rib applications, and a flat unstiffened plate which is used as a baseline
configuration for comparison.

The constraints considered in this example include those associated with material
strength, buckling, and geometric limits. The material failure criterion chosen is the
maximum strain failure criterion. The buckling criterion implemented is based on a
common design practice used for wing structures that does not allow the components
to buckle at design limit loads. Thus, the design of the wing rib does not consider
any postbuckling load-carrying capability of the panel.

The design variables are the thicknesses of plies with different ply orientations in
the different sections of the panels. Conventional ply angles of ±45-deg, 0-deg, and
90-deg orientations are chosen. Also, detailed cross-sectional dimensions are used as
sizing variables to determine the best cross-sectional geometry. Hercules AS4/3502
preimpregnated graphite-epoxy tape is chosen as a typical graphite-epoxy material.

[Figure: laminate of the tailored corrugated panel, with 0° plies of thickness 2t2 between
45° layers of thickness 2t1.]

Figure 11.4.4 Tailored corrugated panel model.

The geometry of the repeating elements is typically defined by the plate element
width design variables $b_1$ through $b_4$ as shown in Fig. 11.4.3. For the corrugated
panels, for example, both the upper and lower corrugation caps are assumed to be
of equal width due to symmetry. The plate element widths, $b_2$ and $b_3$, define the
corrugated panel web angle. The panel webs are made of only ±45-deg plies, Fig.
11.4.4, that run continuously across the width of the cross section. Such continuous
plies help reduce manufacturing costs and eliminate stress concentrations that could
occur at the ±45-deg ply termination points. In the plate elements which make up the
caps, 0-deg plies are included between the layers of ±45-deg fibers. Thus, the entire
laminate is defined by two thickness design variables, $t_1$ and $t_2$, relating to the 45-deg
and 0-deg plies, respectively. Cross-sectional details of the other configurations can
be obtained from Ref. 43.
The loads considered in Ref. 43 are combined in-plane axial compression ($N_x$),
shear ($N_{xy}$), and pressure ($p$) loads with magnitudes typical of an inboard wing rib
fuel closeout cell for a large transport aircraft. In the present example a load index of
$N_x/L$, where $L$ is the panel length, is used with values ranging from 0.3 to 1000 lb/in².
This range includes loadings above and below typical rib loads so that design trends
for panels for other subcomponents, such as a wing skin, are covered.

[Figure: structural efficiency plotted against the load index Nx/L (lb/in²) on a logarithmic
axis from 1.0 to 1000.0.]

Figure 11.4.5 Structural efficiency of axial compression loaded panels.

The effect of axial compression load intensity on the structural efficiency and
geometry of all the panel configurations considered in the present study is shown in
Fig. 11.4.5. The tailored corrugated panel concept with different laminates in the
corrugation crowns and webs is the most structurally efficient configuration. The
corrugated panel concept with a continuous laminate is the next most structurally
efficient concept, followed by the blade-stiffened panel concept, the hat-stiffened panel
concept, and the unstiffened flat panels (see Fig. 11.4.5). The weight differences in
this load range are due largely to the modeling of the laminates that define the
panel geometry. Each configuration is modeled such that a minimum number of plies
necessary to define the geometry is used, and that number differs for each model.
For low axial load intensity, all configurations, excluding the unstiffened plate, are
constrained by the same minimum gage ply thickness of 0.005 inches on all the plies.
Therefore, the weight of a panel is almost directly proportional to the number of
layers in the cross section and is independent of the intensity of the load. •••

One drawback of PASCO is possible inaccuracy in modeling the boundary conditions
under shear loads. Boundary conditions on the panel ends perpendicular to the
stiffeners are assumed to be simply supported and cannot be changed. Without the
shearing loads, the buckling pattern consists of a series of straight nodal lines that
coincide with the loaded edges of the panel. When shear is applied to the panels,
the buckling pattern consists of a series of skewed nodal lines and the buckling load
calculated for this load case may deviate from the buckling load of a simply supported
plate. In particular, if a single buckling half-wave of length $\lambda$ forms along the
panel length, $L$, the PASCO analysis can severely underestimate the buckling load.
An optional smeared stiffness solution [38] is included in PASCO for the $\lambda = L$
case to provide a more accurate solution when a shear load is present. The smeared
stiffness approach was shown [44] to be an improved solution but not always a conservative
one. Additionally, in order to achieve an optimally designed stiffened panel
configuration, the full cross-sectional detail must be retained to account for local
stiffener buckling, while at the same time maintaining the simple support boundary
conditions at the loaded edges. The smeared stiffener solution in PASCO does
not account for such detail. An improved analysis exists in the VICON (VIPASA
with CONstraints) program [45, 46] which modifies the VIPASA buckling analysis
to include supports at arbitrary locations along the panel length through the use of
Lagrange multipliers. By specifying the supports at intervals corresponding to the
ends of the desired panel length, the simple support boundary conditions can be enforced
at the panel ends when shear is applied. The VICON analysis has recently
been implemented in a design code VICONOPT by Butler and Williams [47].

The design requirement that does not allow buckling of the panels at the limit
load is appropriate for wing and empennage cover panels because of nonstructural
considerations such as maintaining a good aerodynamic surface. However, fuselage
panels of metallic aircraft structures are commonly designed to buckle below their
ultimate loads. The lack of sufficient information on the postbuckling response of
composite panels hindered the application of such a design philosophy in the past.
Realization of the possible weight saving kindled interest in designing postbuckled
panels in recent years (see Dickson et al. [48, 49] and Shin et al. [50]). A nonlinear
theory for the prediction of behavior of locally imperfect stiffened panels has
been incorporated by Bushnell into the design optimization program PANDA2 [51].
However, because of the complexity and serious computational cost involved in postbuckling
analysis of stiffened panel structures, optimal design of such panels is still
far from being a routine practice.


11.4.2 Aeroelastic Tailoring

Another major area of design optimization application is the aeroelastic tailoring of
aircraft wing structures which involve aeroelastic constraints. Aeroelastic tailoring
involves the use of structural deformations to improve the structural and aerodynamic
characteristics of a lifting surface. A suggested standard definition [52] is:
Aeroelastic tailoring is the embodiment of directional stiffness into an aircraft
structural design to control aeroelastic deformation, static or dynamic, in such a
fashion as to affect the aerodynamic and structural performance of that aircraft in a
beneficial way.
The beneficial behavior characteristics are those associated with the aeroelastic
twist, aeroelastic camber, improved flutter and divergence speeds, reduced aeroelastic
roll control losses, and increased strength [53].
The subject of aeroelastic tailoring has gained popularity during the past decade
because of advancements in the structural optimization field and increased use of
composite materials in aircraft structures. Composite wing designs are often more
flexible than metal ones, which makes them more susceptible to aeroelastic effects.
However, composite materials often provide the designer with an opportunity to improve
aerodynamic performance by tailoring the material response, through the use
of ply thickness and orientation design variables, to generate favorable aeroelastic
effects. While there is an increased flexibility in tailoring the design, the increased
number of design variables and complex response characteristics of composite materials
make the difficult wing design problem even more difficult [54, 55]. This is
where the use of advanced optimization techniques comes into the picture. Although it is
still considered costly, application of rigorous optimization algorithms to the detailed
structural model of a lifting surface may make it possible to achieve the desired performance
improvements. Many of the early studies, however, relied on simplification
of the structural model to make the design affordable. These simplifications included,
in some cases, beam models for the structural representation. A survey of the applications
of structural optimization techniques to problems of design under aeroelastic
constraints is presented by Haftka [56].
One of the early efforts in introducing structural optimization into aeroelastic tailoring
is the TSO program developed by McCullers and Lynch [57]. The program originated
under the name WASP (Wing Aeroelastic Synthesis Procedure) [58], and uses
a mathematical programming procedure based on a penalty approach (see Section
5.7) for converting the constrained problem into a series of unconstrained problems.
The unconstrained minimizations are performed via the Davidon-Fletcher-Powell algorithm
(see Section 4.2). Modeling of the wing structure is based on plate analysis
with a Ritz solution technique. The objective function may be any combination
of weight, lift curve slope, control surface effectiveness, flutter speed, fundamental
natural frequency, or deflections. The design variables are coefficients of polynomials
which control both the orientations of the various plies and their thicknesses.
Use of a polynomial description of the design parameters along with the Ritz procedure
makes the application of the mathematical programming method manageable
for optimization purposes. The TSO program was used for several design studies for
aeroelastic tailoring applications to existing aircraft [54, 58, 59].
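The combination of a penalty formulation with Davidon-Fletcher-Powell (DFP) minimizations described above can be illustrated on a toy problem. The following is a minimal sketch and not the TSO implementation: the two-variable problem, the exterior quadratic penalty, the backtracking line search, the penalty schedule, and all function names are assumptions chosen for illustration.

```python
def penalized(x, r):
    """Exterior quadratic penalty function and gradient for a toy problem:
    minimize x0^2 + x1^2 subject to g(x) = 1 - x0 - x1 <= 0."""
    violation = max(0.0, 1.0 - x[0] - x[1])
    phi = x[0] ** 2 + x[1] ** 2 + r * violation ** 2
    grad = [2.0 * x[0] - 2.0 * r * violation,
            2.0 * x[1] - 2.0 * r * violation]
    return phi, grad


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def identity(n):
    return [[float(i == j) for j in range(n)] for i in range(n)]


def dfp_minimize(x, r, tol=1e-6, max_iter=500):
    """Unconstrained minimization of the penalized function by DFP."""
    n = len(x)
    H = identity(n)                      # approximation of the inverse Hessian
    phi, grad = penalized(x, r)
    for _ in range(max_iter):
        if dot(grad, grad) ** 0.5 < tol:
            break
        d = [-dot(H[i], grad) for i in range(n)]
        slope = dot(d, grad)
        if slope >= 0.0:                 # safeguard: restart with steepest descent
            H = identity(n)
            d = [-g for g in grad]
            slope = dot(d, grad)
        step = 1.0
        while True:                      # backtracking (Armijo) line search
            x_new = [xi + step * di for xi, di in zip(x, d)]
            phi_new, grad_new = penalized(x_new, r)
            if phi_new <= phi + 1e-4 * step * slope or step < 1e-12:
                break
            step *= 0.5
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(grad_new, grad)]
        sy = dot(s, y)
        if sy > 1e-10 * (dot(s, s) * dot(y, y)) ** 0.5:
            Hy = [dot(H[i], y) for i in range(n)]
            yHy = dot(y, Hy)
            for i in range(n):           # DFP update of the inverse Hessian
                for j in range(n):
                    H[i][j] += s[i] * s[j] / sy - Hy[i] * Hy[j] / yHy
        x, phi, grad = x_new, phi_new, grad_new
    return x


def penalty_method(x0, penalties=(1.0, 10.0, 100.0, 1000.0)):
    """Solve a sequence of unconstrained problems with increasing penalty."""
    x = list(x0)
    for r in penalties:
        x = dfp_minimize(x, r)           # warm start from the previous solution
    return x


x_opt = penalty_method([2.0, 0.5])
print(x_opt)  # approaches the constrained optimum (0.5, 0.5) as r grows
```

For this toy problem the minimizer of the penalized function is x0 = x1 = r/(1 + 2r), so each stage lands slightly outside the feasible region and the sequence approaches the constrained optimum from the infeasible side, which is the characteristic behavior of an exterior penalty.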
Another popular program for the design of lifting surfaces subject to strength
and aeroelastic constraints is the finite element based program FASTOP developed
by Grumman [60]. The program employs optimality criteria methods (see Chapter
9), and is also capable of handling flutter constraints. Optimality criteria meth-
ods are very efficient for designs subject to a single constraint. Thus, despite the
costly finite element analysis involved, the cost of optimization was kept manageable
through the use of sequential treatment of constraints. First the stress constraints
are treated by the non-optimal Fully Stressed Design (FSD, see Section 9.1), fol-
lowed by a 'uniform-cost-effectiveness' optimality criterion (Section 9.3) for each of
the aeroelastic constraints. The process is repeated with the strength and aeroelastic
constraints until convergence is achieved. Design variables are limited to thickness
or cross-sectional areas, and ply orientations are not allowed to change during the
design.
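The stress-ratio resizing behind a Fully Stressed Design step can be sketched as follows. This is an illustrative sketch only, not FASTOP's procedure: it assumes member forces that do not change with the sizing (a statically determinate structure), a single allowable stress, and a minimum gage; the function name and data are hypothetical.

```python
def fully_stressed_design(forces_by_load_case, sigma_allow, a_min, a_init, n_iter=5):
    """Resize each member so its worst-case stress equals the allowable.

    forces_by_load_case: list of load cases, each a list of member forces.
    Member forces are assumed independent of the areas (statically
    determinate structure), so the stress in member i is force / area.
    """
    areas = list(a_init)
    for _ in range(n_iter):
        for i, area in enumerate(areas):
            worst_stress = max(abs(case[i]) / area for case in forces_by_load_case)
            # Stress-ratio rule, with a minimum-gage lower bound.
            areas[i] = max(a_min, area * worst_stress / sigma_allow)
    return areas


forces = [[1000.0, -500.0, 10.0],   # load case 1
          [200.0, -800.0, 5.0]]     # load case 2
areas = fully_stressed_design(forces, sigma_allow=100.0, a_min=0.5,
                              a_init=[1.0, 1.0, 1.0])
print(areas)
```

Under these assumptions the first two members end up fully stressed in their critical load case, while the third is held at minimum gage. For a statically indeterminate structure the member forces would have to be recomputed by a new analysis after every resizing.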
A more recent finite element based design program is ASTROS (Automated
Structural Optimization System) [61] developed by Northrop under an Air Force
contract. ASTROS is designed as an automated procedure to address interdisci-
plinary requirements during preliminary design of aerospace structures. The struc-
tural analysis module of ASTROS is derived from the public domain version of the
NASTRAN finite element code and forms the core of the procedure. The structural
analysis module is used to obtain structural response to applied mechanical, gravita-
tional, aerodynamic, induced thermal, and time dependent loads. Design constraints
include limits on stresses, strains, displacements, modal frequencies, flutter response,
aeroelastic lift effectiveness, and aileron effectiveness. Design variables that can be
used in the process are element areas and thicknesses, structural inertias, and concentrated
masses. Membrane and bending elements used in the structural analysis
provide full-composite modeling capability. Individual ply thicknesses of the mate-
rial can be used as design variables, but the ply orientation design variables are not
allowed. In order to reduce the number of design variables and to assure physically
meaningful dimensions, design variable linking is used. The design variable linking
is implemented together with a procedure that divides the design variables into two
groups that are identified as global and local variables. A global design variable can
be specified as a weighted sum of a number of local design variables. Similar to
TSO, shape function type of linking can be used to define shapes such as a smooth
thickness variation along the span direction. The design optimization module used
in ASTROS is the ADS (Automated Design Synthesis) [62] program. All sensitivities
of the objective function and of the constraints are calculated based on analytical
derivatives. Both direct and adjoint-variable methods (see Chapter 7) are available.
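Shape-function linking of the kind mentioned above can be sketched in a few lines. The sketch assumes the common linking convention in which each local (element) thickness is a weighted sum of a small number of global variables, t = P v; the spanwise stations, the two shape functions (constant and linear taper), and the function names are illustrative assumptions and are not taken from ASTROS.

```python
def linking_matrix(stations):
    """Rows of P for two shape functions: a constant and a linear taper.

    stations: nondimensional spanwise coordinates in [0, 1] (0 = root).
    """
    return [[1.0, 1.0 - eta] for eta in stations]


def local_from_global(P, v):
    """Local thicknesses t = P v."""
    return [sum(p_ij * v_j for p_ij, v_j in zip(row, v)) for row in P]


def global_sensitivity(P, dW_dt):
    """Chain rule: dW/dv = P^T dW/dt."""
    n_global = len(P[0])
    return [sum(P[i][j] * dW_dt[i] for i in range(len(P))) for j in range(n_global)]


stations = [0.0, 0.5, 1.0]
P = linking_matrix(stations)
v = [0.1, 0.4]                       # two global variables control three thicknesses
t = local_from_global(P, v)
print(t)                             # smooth taper from root to tip

# If the weight is the sum of (panel area * thickness), dW/dt is the panel area.
panel_areas = [2.0, 2.0, 2.0]
print(global_sensitivity(P, panel_areas))
```

Two global variables here control three local thicknesses and guarantee a smooth taper; the same matrix carries sensitivities from the local to the global variables.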

11.5 Design Uncertainties

Although composite materials provide a vast, and probably so far underutilized,
freedom in tailoring structural response to suit the needs of the designer, they also display
certain problems uncharacteristic of conventional materials. Optimally designed
structures are known to be sensitive to changes in load conditions and imperfections.
Because of the increased number of variables, which enable designers to tailor the design
closer to the desired specifications, this sensitivity may be heightened for composite
structures. The simplest example of sensitivity to changes in the load condition is the
case of a laminate designed to carry uniaxial loads [63]. For this application, it can
easily be demonstrated that the best design is the one that has all the layers oriented
along the load direction. It is also well known that this design is extremely poor for
carrying loads transverse to the fiber direction. Therefore, any change in the direction
of the applied design load is likely to result in a failure, whereas a similar design
made of a conventional isotropic material would be capable of carrying a transverse
load of magnitude equal to the original design load.
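The sensitivity of a unidirectional laminate to the load direction can be made concrete with the Tsai-Hill criterion. The sketch below evaluates the uniaxial strength of a unidirectional lamina loaded at an angle to the fibers; the strength values are hypothetical round numbers for a graphite-epoxy-like material, chosen only for illustration, and the function name is an assumption.

```python
import math


def off_axis_strength(theta_deg, X, Y, S):
    """Uniaxial strength of a unidirectional lamina loaded at theta_deg to the
    fibers, from the Tsai-Hill criterion.

    With sigma_1 = s*c^2, sigma_2 = s*s^2, tau_12 = -s*s*c (c, s = cos, sin),
    the criterion gives
        1/s^2 = c^4/X^2 + (1/S^2 - 1/X^2) s^2 c^2 + s^4/Y^2.
    """
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    inv_sq = (c ** 4 / X ** 2
              + (1.0 / S ** 2 - 1.0 / X ** 2) * s ** 2 * c ** 2
              + s ** 4 / Y ** 2)
    return 1.0 / math.sqrt(inv_sq)


# Hypothetical strengths in MPa (illustrative values, not from the text).
X, Y, S = 1500.0, 40.0, 70.0
for angle in (0.0, 5.0, 15.0, 45.0, 90.0):
    print(angle, round(off_axis_strength(angle, X, Y, S), 1))
```

With these assumed strengths, a misalignment of only a few degrees between the load and the fibers removes a large fraction of the axial strength, and the transverse strength is a small fraction of it, whereas an isotropic material would show no such dependence on direction.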
Another complication in designing optimal composite structures is sometimes
the difficulty in identifying and imposing proper strength constraints. Not only are the
load and stress distributions functions of the ply thickness and fiber orientation
variables, but the strength properties are also dependent on these variables. Failure of
composite laminates is largely due to highly localized stresses. The number of possible
local failure modes is large, and these failure modes are generally micromechanically
governed and complex. Fiber breaking, matrix cracking, fiber-matrix debonding, and
separation of individual layers can result in surface and through-the-thickness cracks,
splits, and delaminations. Under compressive loads, even the instability of fibers on
a microscopic scale (often referred to as fiber microbuckling) was proposed as a failure
mechanism, although more recent studies suggest that compression failures of
high-performance composites are strength-related failures. Furthermore,
failure modes can interact with one another making the strength prediction even
more difficult.
Some of the basic assumptions used for simplification of the laminate stress anal-
ysis that reduce the three-dimensional nature of the laminated composites to two
dimensions may also cause loss of information important for failure predictions. It is
well known that laminated composite plates can locally display a three-dimensional
stress state. The most common examples of these three-dimensional effects are free-
edge stresses, and interlaminar stresses at the stiffener-skin interface of stiffened pan-
els. It is important that designers be aware of such local effects during the formulation
of the optimization problem and include appropriate constraints to account for them.
It is fair to say that some of the design-related issues of composite failures
are not well understood. Sometimes strength quantities that are needed for imple-
mentation of a certain stress constraint may not be available. For example, based
on their experience with metallic materials, designers often look for a compressive
material strength limit that they can include in an optimization problem. It can be
argued that the compressive failure strength is a highly problem-dependent quantity,
rather than a material strength parameter. In some applications, the lack of under-
standing and availability of predictive models for certain design considerations may
hamper the design effort. For example, unlike metallic materials, composites have
been found to be sensitive to low-velocity impact loadings. Currently, there is no
predictive model that can realistically be used for designing laminates under impact
damage conditions. Some of these topics are still under development and constitute
a major effort in the area of mechanics of composite materials.

Given these difficulties, designers sometimes resort to practical guidelines. Rather
than using ply orientation angles as design variables, designers often fix them to
prescribed practical angles such as 0-deg, ±45-deg, and 90-deg. Even if the applied
loading is highly directional, such as panels under uniaxial loadings, presence of plies
other than the ones aligned along the load direction provides increased safety for off-
design load conditions such as unexpected transverse loads. In order to assure that
the thickness design variables associated with those plies that are placed based on
intuitive guidelines do not disappear, either lower bounds on those thicknesses are used
or additional loads are specified. For example, application of a certain percentage of
the axial load as shear load leads to non-zero thickness for ±45-deg layers even if the
lower bound on those layers is zero.
The selection of a stacking sequence for a laminate is also guided by intuitive
considerations. For example, use of ±45-deg plies as the outside layers of a laminate
is preferred because of damage tolerance considerations. Another practical guideline
is not to allow more than 4 identical contiguous plies. This guideline helps to reduce
the interlaminar stresses between plies with different orientations. In order to satisfy
such ply stacking sequence rules, an iterative procedure may be used as outlined in
Ref. [64]. If the branch-and-bound algorithm with ply identity variables is used, this
requirement can easily be implemented through the use of Eq. (11.3.5) as described
earlier.
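The contiguity guideline above is easy to check for a candidate stacking sequence. The following sketch is a simple checker, not the branch-and-bound formulation of Eq. (11.3.5); the function names, the default limit of four plies, and the handling of symmetric laminates by mirroring the half stack are assumptions made for illustration.

```python
def longest_run(stack):
    """Length of the longest block of identical contiguous ply angles."""
    longest = 0
    current = 0
    previous = object()          # sentinel that equals no ply angle
    for angle in stack:
        current = current + 1 if angle == previous else 1
        longest = max(longest, current)
        previous = angle
    return longest


def satisfies_contiguity(half_stack, limit=4, symmetric=True):
    """True if no more than `limit` identical plies are stacked together.

    For a symmetric laminate the half stack is mirrored about the midplane,
    so identical plies next to the midplane count double.
    """
    full = list(half_stack)
    if symmetric:
        full = full + full[::-1]
    return longest_run(full) <= limit


print(satisfies_contiguity([45, -45, 0, 0, 90]))    # True
print(satisfies_contiguity([45, -45, 0, 0, 0]))     # False: six 0-deg plies at the midplane
```

A check of this kind could serve as a filter on candidate designs inside an iterative procedure; plies adjacent to the midplane of a symmetric laminate are the easiest place to violate the rule unintentionally.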

11.6 Exercises

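1. For a unidirectional laminate under uniform applied stresses, $\sigma_x$, $\sigma_y$, and $\tau_{xy}$,
show that the stationary values of the Tsai-Hill function

$$ f(\theta) = \frac{\sigma_1^2}{X^2} - \frac{\sigma_1 \sigma_2}{X^2} + \frac{\sigma_2^2}{Y^2} + \frac{\tau_{12}^2}{S^2} \qquad (11.6.1) $$

are achieved for

$$ a \cos 2\theta + \sin 2\theta = 0 , $$
and
$$ a \sin 2\theta - \cos 2\theta = b , $$
where
$$ a = \frac{2\tau_{xy}}{\sigma_y - \sigma_x} , \quad \text{and} \quad b = \frac{\sigma_y + \sigma_x}{\sigma_y - \sigma_x} \, \frac{1 - \alpha^2}{\beta^2 - \alpha^2 - 2} , $$
and
$$ \alpha = X/Y , \quad \text{and} \quad \beta = X/S , $$
where $\sigma_1$, $\sigma_2$, and $\tau_{12}$ are the stresses in the principal material directions for a fiber angle $\theta$, X and Y are the normal strengths in the directions parallel and transverse to the fibers,
and S is the shear strength.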
2. Using the graphical procedure described in section 2, determine the orientations
and thickness ratio of a balanced angle-ply symmetric, $[(\pm\theta_1)_n/(\pm\theta_2)_m]_s$, T300-5208
graphite-epoxy laminate with maximum effective shear stiffness $G_{xy}$. The laminate
must also meet the following stiffness requirements
$$ E_x \ge 17.5 \times 10^6 \text{ psi}, \quad E_y \ge 5.8 \times 10^6 \text{ psi}, \quad \text{and} \quad 0.1 \le \nu_{xy} \le 0.3 . $$
Engineering properties of T300-5208 graphite-epoxy material along its principal material
directions are
$$ E_1 = 26.25 \times 10^6 \text{ psi}, \quad E_2 = 1.49 \times 10^6 \text{ psi}, \quad G_{12} = 1.04 \times 10^6 \text{ psi}, \quad \text{and} \quad \nu_{12} = 0.28 . $$
3. Show that a quasi-isotropic laminate $[0_i, 90_i, -45_i, +45_i]$ can be replaced by
$[90_j, 0_k, -45_i, +45_i]$ with an identical D matrix by suitably selecting j and k so
that j + k = 2i (note: j and k may be non-integers).

Figure 11.6.1 Blade stiffened panel under uniaxial compression.

4. For a laminate that is made up of an integer number of plies with 0-, ±30-, ±60-,
and 90-deg orientations, the design space is shown in Fig. 11.3.1-b.
a) Complete the figure by putting the stacking sequences of laminates next to the
appropriate discrete design points on the figure.
b) If the laminate is required to have a Poisson's ratio $\nu_{xy}$ greater than 0.3,
determine the stacking sequence that maximizes the transverse modulus $E_y$.

5. The skin laminate of a simply supported blade stiffened panel shown in Figure
11.6.1 is a $[\pm 45_n]_s$ construction, and the stiffeners are made of unidirectional laminae.
Determine the longitudinal smeared stiffness $EI$ which can be used for the global
buckling load calculation presented in Table 11.3.1. Assuming the thicknesses of
individual plies to be continuously variable, determine the minimum weight design
for an axial compression of $N_x = 10000$ lb/in. Consider only buckling constraints.

11.7 References

[1] Jones, R. M., Mechanics of Composite Materials, McGraw-Hill Book Co., New
York, pp. 45-57, 1975.
[2] Tsai, S. W., and Pagano, N. J., "Invariant Properties of Composite Materials,"
in Composite Materials Workshop (Eds. Tsai, S. W., Halpin, J. C., and Pagano, N. J.),
Technomic Publishing Co., Westport, pp. 233-253, 1968.
[3] Caprino, G., and Crivelli Visconti, I., "A Note on Specially Orthotropic Laminates,"
J. Comp. Matls., 16, pp. 395-399, 1982.
[4] Gunnink, J. W., "Comment on A Note on Specially Orthotropic Laminates," J.
Comp. Matls., 17, pp. 508-510, 1983.
[5] Kandil, N., and Verchery, G., "New Methods of Design for Stacking Sequences of
Laminates," Proceedings of the International Conference on "Computer Aided
Design in Composite Material Technology," Eds. Brebbia, C. A., de Wilde, W.
P., and Blain, W. R., pp. 243-257, 1988.
[6] Schmit, L. A., and Farshi, B., "Optimum Laminate Design for Strength and
Stiffness," Int. J. Num. Meth. Engng., 7, pp. 519-536, 1973.
[7] Park, W. J., "An Optimal Design of Simple Symmetric Laminates Under the
First Ply Failure Criterion," J. Comp. Matls., 15, pp. 341-355, 1982.
[8] Massard, T. N., "Computer Sizing of Composite Laminates for Strength," J.
Reinf. Plastics and Composites, 3, pp. 300-345, 1984.
[9] Tsai, S. W., and Hahn, H. T., Introduction to Composite Materials, Technomic
Publishing Co., Inc., Lancaster, Pa., pp. 315-325, 1980.
[10] Tsai, S. W., "Strength Theories of Filamentary Structures," in R. T. Schwartz
and H. S. Schwartz (eds.), Fundamental Aspects of Fiber Reinforced Plastic,
Wiley Interscience, New York, pp. 3-11, 1968.
[11] Brandmaier, H. E., "Optimum Filament Orientation Criteria," J. Composite Materials,
4, pp. 422-425, 1970.
[12] Miki, M., "Material Design of Composite Laminates with Required In Plane Elastic
Properties," Progress in Science and Engineering of Composites, Eds. T.
Hayashi, K. Kawata, and S. Umekawa, ICCM-IV, Tokyo, Vol. 2, pp. 1725-1731,
1982.
[13] Miki, M., "A Graphical Method for Designing Fibrous Laminated Composites
with Required In-plane Stiffness," Trans. JSCM, 9, 2, pp. 51-55, 1983.
[14] Schmit, L. A., and Farshi, B., "Optimum Design of Laminated Fibre Composite
Plates," Int. J. Num. Meth. Engng., 11, pp. 623-640, 1977.
[15] Miki, M., "Optimum Design of Laminated Composite Plates Subject to Axial
Compression," Composites '86: Recent Advances in Japan and the United States,
Eds. Kawata, K., Umekawa, S., and Kobayashi, A., Proc. Japan-U.S. CCM-III,
Tokyo, pp. 673-680, 1986.
[16] Bert, C. W., "Optimal Design of a Composite-Material Plate to Maximize its
Fundamental Frequency," J. Sound and Vibration, 50 (2), pp. 229-237, 1977.
[17] Rao, S. S., and Singh, K., "Optimum Design of Laminates with Natural Frequency
Constraints," J. Sound and Vibration, 67 (1), pp. 101-112, 1979.
[18] Mesquita, L., and Kamat, M. P., "Optimization of Stiffened Laminated Composite
Plates with Frequency Constraints," Eng. Opt., 11, pp. 77-88, 1987.
[19] Cheng Kengtung, "Sensitivity Analysis and a Mixed Approach to the Optimization
of Symmetric Layered Composite Plates," Eng. Opt., 9, pp. 233-248, 1986.
[20] Pedersen, P., "On Sensitivity Analysis and Optimal Design of Specially Orthotropic
Laminates," Eng. Opt., 11, pp. 305-316, 1987.
[21] Muc, A., "Optimal Fiber Orientation for Simply-Supported, Angle-Ply Plates
Under Biaxial Compression," Comp. Struc., 9, pp. 161-172, 1988.
[22] Shin, Y. S., Haftka, R. T., Watson, L. T., and Plaut, R. H., "Design of Laminated
Plates for Maximum Buckling Load," J. Composite Materials, 23, pp. 348-369,
1989.
[23] Miki, M., and Sugiyama, Y., "Optimum Design of Laminated Composite Plates
Using Lamination Parameters," Proceedings of the AIAA/ASME/ASCE/AHS/ASC
32nd Structures, Structural Dynamics, and Materials Conference, Baltimore,
MD, Part I, pp. 275-283, April 1991.
[24] Gürdal, Z., and Haftka, R. T., "Optimization of Composite Laminates," presented
at the NATO Advanced Study Institute on Optimization of Large Structural Systems,
Berchtesgaden, Germany, Sept. 23 - Oct. 4, 1991.
[25] Haftka, R. T., and Walsh, J. L., "Stacking-Sequence Optimization for Buckling of
Laminated Plates by Integer Programming," AIAA Journal (in press).
[26] Schrage, L., Linear, Integer and Quadratic Programming with LINDO, 4th Edition,
The Scientific Press, Redwood City, CA, 1989.
[27] Nagendra, S., Haftka, R. T., and Gürdal, Z., "Optimization of Laminate Stacking
Sequence with Stability and Strain Constraints," submitted for presentation at
the AIAA/ASME/ASCE/AHS/ASC 33rd Structures, Structural Dynamics,
and Materials Conference, Dallas, TX, April 1992.
[28] Lombardi, M., "Ottimizzazione di Lastre in Materiale Composito con l'uso di
un Metodo di Annealing Simulato," Tesi di Laurea, Department of Structural
Mechanics, University of Pavia, 1990.
[29] Le Riche, R., and Haftka, R. T., "Optimization of Laminate Stacking-Sequence for
Buckling Load Maximization by Genetic Algorithm," submitted for presentation
at the AIAA/ASME/ASCE/AHS/ASC 33rd Structures, Structural Dynamics,
and Materials Conference, Dallas, TX, April 1992.
[30] Stroud, W. J., and Agranoff, N., "Minimum-mass Design of Filamentary Composite
Panels Under Combined Loadings: Design Procedure Based on Simplified
Buckling Equations," NASA TN D-8257, 1976.
[31] Timoshenko, S., Theory of Elastic Stability, McGraw-Hill, New York, 1936.
[32] Stein, M., and Mayers, J., "Compressive Buckling of Simply Supported Curved
Plates and Cylinders of Sandwich Construction," NACA TN 2601, 1952.
[33] Advanced Composites Design Guide, Vols. I-V, Third Edition, U.S. Air Force,
Jan. 1973.
[34] Lekhnitskii, S. G., Anisotropic Plates, translated by Tsai, S. W., and Cheron,
T., Gordon and Breach Sci. Publ., Inc., New York, 1968.
[35] Hague, D. S., and Glatt, C. R., "A Guide to the Automated Engineering and
Scientific Optimization Program, AESOP," NASA CR-73201, April 1968.
[36] Stroud, W. J., Agranoff, N., and Anderson, M. S., "Minimum-Mass Design of
Filamentary Composite Panels Under Combined Loads: Design Procedure Based
on a Rigorous Buckling Analysis," NASA TN D-8417, July 1977.
[37] Wittrick, W. H., and Williams, F. W., "Buckling and Vibration of Anisotropic
or Isotropic Plate Assemblies Under Combined Loadings," Int. J. Mech. Sci., 16,
4, pp. 209-239, April 1974.
[38] Plank, R. J., and Williams, F. W., "Critical Buckling of Some Stiffened Panels in
Compression, Shear and Bending," Aeronautical Q., XXV, Part 3, pp. 165-179,
August 1974.
[39] Vanderplaats, G. N., "CONMIN - A Fortran Program for Constrained Function
Minimization, User's Manual," NASA TM X-52, 282, 1973.
[40] Stroud, W. J., and Anderson, M. S., "PASCO: Structural Panel Analysis and Sizing
Code, Capability and Analytical Foundations," NASA TM 80181, November
1981.
[41] Anderson, M. S., Stroud, W. J., Durling, B. J., and Hennessy, K. W., "PASCO:
Structural Panel Analysis and Sizing Code, User's Manual," NASA TM 80182,
November 1981.
[42] Giles, G. L., and Anderson, M. S., "Effects of Eccentricities and Lateral Pressure
on the Design of Stiffened Compression Panels," NASA TN D-6784, June 1972.
[43] Swanson, G. D., and Gürdal, Z., "Structural Efficiency Study of Graphite-Epoxy
Aircraft Rib Structures," J. Aircraft, 27 (12), pp. 1011-1020, 1990.
[44] Stroud, W. J., Greene, W. H., and Anderson, M. S., "Buckling Loads of Stiffened
Panels Subjected to Combined Longitudinal Compression and Shear: Results Obtained
With PASCO, EAL, and STAGS Computer Programs," NASA TP 2215,
January 1984.
[45] Williams, F. W., and Kennedy, D., "User's Guide to VICON, VIPASA with Constraints,"
Department of Civil Engineering and Building Technology, University
of Wales Institute of Science and Technology, August 1984.
[46] Williams, F. W., and Anderson, M. S., "Incorporation of Lagrangian Multipliers
into an Algorithm for Finding Exact Natural Frequencies or Critical Buckling
Loads," Int. J. Mech. Sci., 25, 8, pp. 579-584, 1983.
[47] Butler, R., and Williams, F. W., "Optimum Design Features of VICONOPT, an
Exact Buckling Program for Prismatic Assemblies of Anisotropic Plates," Proceedings
of the AIAA/ASME/ASCE/AHS/ASC 31st Structures, Structural Dynamics,
and Materials Conference, Long Beach, CA, Part 2, pp. 1289-1299, 1990.
[48] Dickson, J. N., Cole, R. T., and Wang, J. T. S., "Design of Stiffened Composite
Panels in the Postbuckling Range," in Fibrous Composites in Structural Design,
Eds. Lenoe, E. M., Oplinger, D. W., and Burke, J. J., Plenum Press, New York,
pp. 313-327, 1980.
[49] Dickson, J. N., and Biggers, S. B., "Design and Analysis of a Stiffened Composite
Fuselage Panel," NASA CR-159302, August 1980.
[50] Shin, D. K., Gürdal, Z., and Griffin, O. H., Jr., "Minimum-Weight Design of
Laminated Composite Plates for Postbuckling Performance," Proceedings of the
AIAA/ASME/ASCE/AHS/ASC 32nd Structures, Structural Dynamics, and Materials
Conference, Baltimore, Maryland, Part I, pp. 257-266, 1991.
[51] Bushnell, D., "PANDA2 - Program for Minimum Weight Design of Stiffened,
Composite, Locally Buckled Panels," Comput. Struct., 25 (4), pp. 469-605, 1987.
[52] Shirk, M. H., Hertz, T. J., and Weisshaar, T. A., "Aeroelastic Tailoring - Theory,
Practice, and Promise," J. Aircraft, 23 (1), pp. 6-18, 1986.
[53] Lynch, R. W., and Rogers, W. A., "Aeroelastic Tailoring of Composite Materials
to Improve Performance," Proceedings of the AIAA/ASME/SAE 17th Structures,
Structural Dynamics and Materials Conference, King of Prussia, PA, May
5-7, pp. 61-68, 1976.
[54] McCullers, L. A., "Automated Design of Advanced Composite Structures," Proceedings
of the ASME Structural Optimization Symposium, AMD-7, pp. 119-133,
1974.
[55] McCullers, L. A., and Lynch, R. W., "Dynamic Characteristics of Advanced Filamentary
Composite Structures," AFFDL-TR-73-111, vol. II, Sept. 1974.
[56] Haftka, R. T., "Structural Optimization with Aeroelastic Constraints: A Survey
of US Applications," Int. J. of Vehicle Design, 7 (3/4), pp. 381-392, 1986.
[57] McCullers, L. A., and Lynch, R. W., "Composite Wing Design for Aeroelastic
Tailoring Requirements," Air Force Conference on Fibrous Composites in Flight
Vehicle Design, Sept. 1972.
[58] Fant, J. A., "An Advanced Composite Wing for the F-16," paper presented at
the 22nd National SAMPE Symposium and Exhibition, San Diego, pp. 773-783,
April 1977.
[59] Gimmestad, D., "Aeroelastic Tailoring of a Composite Winglet for KC-135,"
AIAA Paper No. 81-0607, presented at the AIAA/ASME/ASCE/AHS 22nd
Structures, Structural Dynamics and Materials Conference, Atlanta, GA, Part
2, pp. 373-376, April 1981.
[60] Wilkinson, K., Markowitz, J., Lerner, E., George, D., and Batill, S. M., "FASTOP:
A Flutter and Strength Optimization Program for Lifting Surface Structures,"
J. Aircraft, 14 (6), pp. 581-587, 1977.
[61] Neill, D. J., Johnson, E. H., and Canfield, R., "ASTROS - A Multidisciplinary Automated
Structural Design Tool," J. Aircraft, 27, 12, pp. 1021-1027, 1990.
[62] Vanderplaats, G. N., "ADS - A Fortran Program for Automated Design Synthesis,"
NASA-CR-177985, Sept. 1985.
[63] Stroud, W. J., "Optimization of Composite Structures," NASA TM 84544, August
1982.
[64] Nagendra, S., Haftka, R. T., Gürdal, Z., and Starnes, J. H., Jr., "Design of a
Blade-Stiffened Composite Panel with a Hole," Composite Structures, Vol. 18
(3), pp. 195-219, 1991.

Name Index

Aarts, E. 148, 157 Bert, C.W. 465
Abadie, J. 177, 206 Bertsekas, D.P. 199, 207
Adali, S. 20 Biggers, S.B. 467
Adelman, H.M. 226,251,302, 303, 352 Bindolino, G. 290,303
385 Bjorck, A. 172, 206
Agranoff, N. 451,466 Blackburn, C.L. 385
Anderson, M.S. 454, 466, 467 Blain, W.R. 464
Aragon, C.R. 157 Booker, L. 151,158
Armand, J.L. 57,67,69 Botkin, M.E. 251
Arora, J.S. 21, 176, 206, 254, 346 Box, G.E.P. 250
Ashley, H. 20 Brach, R.M. 68
Atrek, E. 253 Braibant, V. 20, 214, 250, 386
Avriel, M. 131, 154 Brama, T. 253
Axelsson, O. 155 Brandt, A. 19
Brandmaier, H.E. 426, 464
Balasubramanyam, K. 61, 69 Brill, E.D. 157
Balling, R.J. 149, 158 Brotchie, J.F. 384
Barnett, R.L. 38, 66 Broyden, C.G. 138-140, 155, 156
Barthelemy, B. 303,346 Bruno, R.J. 158
Barthelemy, J-F. M. 206,220,250, 251, Burke, J.J. 467
304, 390, 400, 412, 413 Burns, N.H. 113
Bartholomew, P. 252 Burrell, B. P. 114
Baruh, H. 304 Bushnell, D. 20, 458, 467
Batill, S.M. 468 Butler, R. 458, 467
Beale, E.M.L. 136
Beckers, P. 20 Calladine, C.R. 74, 113
Belsare, S. 253 Calo, J.M. 304
Bendsøe, M.P. 20, 48, 66, 240, 251, 252 Calo, J.M. 304
406, 413 Cameron, G.E. 386
Bennett, J.A. 251 Canfield, R.A. 253, 468
Ben-Tal, A. 413 Caprino, G. 420,464
Berke, L. 252,350, 366, 385, 386 Cardani, C. 277, 303
Cardoso, J.B. 346 Fadel, G.M. 221, 251
Carpentier, J. 177, 206 Falk, J.E. 354, 385
Cauchy, A. 132, 155 Fant, J.A. 468
Cerny, V. 147, 157 Farshi, B. 209, 210, 249, 422, 430, 431
Chang, K-J. 251 464, 465
Charmichael, D.G. 21 Fiacco, V. 188, 206
Charnes, A. 113 Flaggs, D.L. 20
Chen, D. H. 124, 125, 155 Fletcher, R. 127, 134, 137, 140, 155, 156
Chen, G.-S. 149, 158 201, 207
Cheng, Kentung 437, 465 Fleury, C. 20, 215, 250, 252, 254, 354
Cherkaev, A.V. 69 357, 385, 386
Chibani, L. 405, 413 Fomin, S.V. 66
Choi, KK. 303, 345, 346 Fox, R. L. 405, 413
Chon, C.T. 303 Frauenthal, J.C. 57, 67
Cilly, F.H. 384 Freedman, B. A. 114
Cobb, W.G.C. 253 Friedmann, P.P. 20
Cohen, G.A. 345 Fuchs, M.B. 213, 225, 249-251
Cohn, M.Z. 77, 113
Cole, R.T. 467 Gajewski, A. 57,68
Cornell, C.A. 384 Garfinkel, R.S. 105, 114
Crivelli Visconti, I. 464 Ge, R. 157
Curtis, A.R. 144, 156 Gelfand, I.M. 66
Gelatt, C.D. 157
Dahlquist, G. 172, 206 Gellatly, R.A. 252, 386
Dailey, R.L. 281, 303 Geoffrion, A.M. 413
Dantzig, G. 88, 113, 399, 413 George, D. 468
Davidon, W.C. 127, 155 Ghalib, M.A. 66
Dayaratnam, P. 385 Ghosh, S.K. 77, 113
de Wilde, W.P. 464 Giles, G.L. 251, 385, 390, 412, 467
Decker, D.W. 157 Gill, P.E. 145, 155, 157, 170, 206, 302
De Jong, KA. 151, 158 Gimmestad, D. 468
Dems, K. 346 Ginsburg, S. 414
Dennis, J.E. 142, 155, 156 Glatt, C.R. 466
Desai, R. 251 Goldberg, D.E. 146, 150, 152, 157, 158
Dickson, J.N. 458, 467 Goldfarb, D. 140, 156
Dixon, S.C. 385 Gomrooy, R.E. 252
Doig, A.G. 107, 114 Grace, D.W. 125, 155
Dorn, W.C. 241,252 Grandhi, R.V. 20, 254
Dovi, A.R. 412 Greenberg, H.J. 113, 252
Draper, N.R. 250 Greene, W.H. 299, 304, 467
Dupree, D.M. 252 Grierson, D.E. 361, 385, 386
Durling, B.J. 466 Griewank, A.O. 157
Dwyer, W. 385 Griffin, O.H., Jr. 207, 467
Gunnink, J.W. 420,464
Edgeworth 6, 20 Giirdal, Z. 207, 440, 456, 465, 467
Elperin, T. 149, 158 468
Emerton, R. 385
Eschenauer, H.A. 21, 346
Haber, R.B. 337, 346 Kao, P.-J. 251
Haftka, R.T. 20, 21, 57, 60, 67, 110, 114 Karmarkar, N. 100-104, 113
191, 206, 227, 250-254, 277 Kawata, K. 465
302-304, 346, 385, 396, 405, 412 Keller, J.B. 54, 66, 67
413, 440, 445, 450, 465, 466, 468 Kelley, C.T. 157
Hague, D.S. 466 Kennedy, D. 467
Hahn, H.T. 464 Khot, N.S. 252, 350, 365, 366, 369
Haj Ali, R.M. 213, 250 382, 385, 386
Hajela, P. 20, 152, 158 Kiefer, J. 154
Haley, S.B. 225, 251 Kikuchi, N. 240, 251
Hancock, H. 66 Kincaid, RK. 149, 158
Hansen, E. 157 Kirkpatrick, S. 147,157
Hansen, S.R. 250 Kirsch, U. 20,83, 113,223,242,251,
Hariran, M. 253 252, 399, 400, 413, 414
Hartley, R.L. 158 Kiusalaas, J. 68, 253
Haug, E.J. 19, 179, 206, 276, 303, 345 Kobayashi, A. 379
Hayashi, T. 465 Kodiyalam, S. 250
Hayduk, R.J. 115, 154 Kohn, R.V. 240, 251
Hennessy, K.W. 466 Komkov, V. 303, 345
Hertz, T.J. 467 Korst, J. 157
Hestenes, M.R. 134, 155, 199, 207 Koski, J. 21
Hext, G. R. 123, 124, 155 Kovacs, L.B. 114
Hildebrand, F.B. 66 Kramer, M.A. 304
Himsworth, F. R. 123, 124, 155 Kreisselmeier, G. 160,206,291,304,404
Holland, J.H. 150, 158 Kruzelecki, J. 20
Holmes, A.M.C. 20 Kwok, H.H. 156
Holnicki-Szulc, J. 225, 251
Hornbuckle, J.C. 67 Laarhoven, P.J.M. van 146, 157
Huang, N.C. 67, 139, 141 Land, A.H. 107, 114
Lansing, W. 385
Icerman, L.J. 60,68 Lawler, E.L. 114
Iott, J. 302 Lee, W.H. 385, 386
Irvine, H. M. 113 Lekhnitskii, S.G. 466
Isakson, G. 253 Le Riche, R. 450, 466
Lerner, E. 468
James, B.B. 412, 413 Lev, O. E. 19
Johnson, D.S. 147,157 Lin, T.Y. 113
Johnson, E.H. 253, 468 Lodier, B. 57, 67
Johnson, E.L. 114 Loendorf, D.D. 390, 412
Johnson, L.W. 54, 67 Lombardi, M. 450, 466
Johnson, O.G. 155 Lowder, H.E. 213, 250
Jones, RM. 464 Luenberger, D. 113, 155
Junkins, J.L. 156 Lurie, K.A. 69
Kamat, M.P. 57, 60, 67, 68, 69, 115, 154 Lust, R.V. 249, 250
156, 157, 345, 405, 413, 465 Lynch, R.W. 252, 459, 467, 468
Kandil, N. 420, 464 Majid, K.I. 84, 113
Kao, J.-J. 157 Makky, S.M. 40, 66
Mangiavacchi, A. 68 Noor, A.K. 213, 250
Mantegazza, P. 277, 290, 303 Olhoff, N. 20, 54, 57, 60, 67, 68, 69, 251
Markowitz, J. 468 Ojalvo, I.U. 278, 303
Massard, T.N. 425, 464 Onada 18, 21
Massonet, C.E. 79, 113 Ong, T.G. 251
Masur, E.F. 54, 67 Oren, S.S. 155
Matthies, H. 143, 156 Osyczka, A. 21
May, S.A. 149, 158
Mayers, J. 466 Padula, S.L. 149, 158
McCormick, G.P. 188,206 Paeng, J.K. 253
McCormick, P.J. 20 Pagano, N.J. 417,464
McCullers, L.A. 252, 459, 467, 468 Palmer, A.C. 21
McGeoch, L.A. 157 Pan, T.-S. 158
Mead, R. 124, 155 Parbery, RD. 61, 69
Mehrinfar, M. 412 Pardo, H. 252
Meketon, M.S. 114 Pareto, V. 6, 20
Mesquita, L. 465 Parimi, S.R. 77, 113
Metropolis, N. 146, 147, 157 Paris, G.H. 113
Micchelli, C.A. 155 Park, W.J. 425, 464
Miele, A. 68 Parme, RL. 113
Miki, M. 426, 432, 438, 464, 465 Pars, L.A. 66
Mills-Curran, W.C. 213, 249, 281, 303 Patnaik, S. 385
Mitchell, A.G.M. 384 Paul, G. 155
Miura, H. 218, 243, 250, 252-254 Pedersen P. 19, 414, 433, 465
Moe, J. 206 Pfeffer, J.T. 157
Mohanty, B.P. 68 Phelan, D.G. 346
More, J.J. 156 Pierson, B.L. 19,68
Morris, D. 83, 113 Plank, RJ. 466
Moses, F 18,21,414 Plaut, RH. 54, 60, 67, 68, 345, 413, 465
Mota Soares C.A. 346,386 Polak, E. 137, 155
Mróz, Z. 60, 68, 345, 346 Powell, M.J.D. 117, 124, 127, 136, 137
Muc, A. 434,465 142, 144, 154-156,202,207
Munksgaard, N. 155 Powell, S. 114
Murray, W. 145, 155, 157, 206, 302 Prager, W. 19, 54, 60, 66, 68, 365, 386
Murthy, D.V. 227, 251, 277, 290, 303, 304 Prasad, B. 20, 57, 60, 67, 214, 215, 250
Nachlas, J.A. 251 253
Nagendra, G. 253 Pritchard, J.I. 226, 251
Nagendra, S. 447, 465, 468 Rabitz, H. 304
Nahar, S. 149, 158 Ranalli, E. 385
Narayanaswami, R. 352, 385 Rao, S.S. 19, 152, 158, 465
Neal, B. G. 113 Rasmussen, H. 54, 67
Neill, D.J. 253, 468 Rasmussen, J. 240, 252
Nelder, J.A. 124, 155 Razani, R. 385
Nelson, R.B. 277, 303 Reddy, G.B. 253
Nemhauser, G.L. 105, 114 Reddy, J.N. 66
Niordson, F.I. 19, 67, 68 Reddy, V.S. 366, 386

Reeves, C.M. 134, 137, 155 Singh, K. 465
Reid, J.K 144, 156 Smaoui, H. 405, 413
Reinschmidt, KF. 241, 252, 384 Smith, C.V. 67
Riley, KM. 206 Sobieszczanski-Sobieski, J. 20, 160, 174
Riley, M.F. 251, 304,413 206, 213, 249, 253, 390, 395, 400, 404
Rinaldi, G. 114 408, 412-414
Ringertz, V.T. 405, 413 Spendley, W. 123, 124, 155
Rizzo, T. 250 Spillers, W.R. 61, 69, 414
Rogers, J.L., Jr. 253 Stadler, W. 6, 20, 21
Rogers, L.C. 277, 303 Starnes, J.H. 191, 206, 250, 468
Rogers, W.A. 467 Stein, M. 466
Rosen, J .B. 176, 178, 206, 399, 400, 413 Steinberg, Y. 225, 251
Rosenbluth, A.W. 157 Steinhauser, R. 160, 206, 291, 304, 404
Rosenbluth, M.N. 157 Stiefel, E. 134, 155
Rozvany, G.I.N. 240, 251, 366, 386 Storaasli, O.O. 213, 249
Russel, A.D. 241, 252 Strang, G. 101, 114, 143, 156, 240, 251
Rutenbar, R.A. 157 Stroud, W.J. 353, 385, 451, 454, 466, 467
468
Sahni, S. 158 Sugiyama, Y. 438, 465
Salajeghah, E. 250 Sutter, T.R. 278, 303
Salama, M. 158 Swanson, G.D. 456, 467
Saleem, Z. 125, 155 Szeto, W.T. 251
Salinas, D. 67 Szu, H. 152, 158
Samtani, M.P. 152, 158
Sandridge, C. A. 304 Tadikonda, S. 304
Saunders, M.A. 302 Tadjbaksh, I. 54, 66
Save, M.A. 79, 113 Taye, S. 223, 251
Schevon, C. 157 Taylor, J.E. 20, 48, 54, 60, 65-67, 68
Schmit, L.A. 19, 209, 210, 218, 241-243 Teller, A.H. 157
249, 250, 252, 254, 357, 384, 385 Teller, E. 157
399, 405, 412, 413, 422, 430, 431 Thareja, R. 254, 396, 412
464, 465 Thierauf, G. 346
Schnabel, R.B. 142, 145, 155 Tischler, V.A. 252
Schrage, L. 114, 465 Timoshenko, S. 466
Schubert, L.K. 143, 156 Todd, M.J. 114
Seong, H.G. 346 Toint, Ph.L. 143, 144, 156
Shanno, D.F. 140, 142, 143, 156 Tomlin, J.A. 114
Shephard, M.S. 251 Topping, B.H.V. 242, 252
Sheu, C.Y. 19, 67, 241, 252 Tsach, V. 385
Shield, R.T. 60, 68 Tsai, S.W. 417, 464
Shin, D.K. 195, 207, 458, 467 Tseng, C.H. 254
Shin, Y.S. 413, 437, 465 Turner, H.K. 67
Shirk, M.H. 467 Turner, M.J. 60, 65, 68
Shore, C.P. 214, 250
Shragowitz, E.V. 158 Vanderbei 114
Siegel, S. 352, 385 VandenBrink, D.J. 157
Simitses, G.J. 57, 67, 68 Vanderplaats, G.N. 206, 219, 238, 239, 248


250, 253, 254, 414, 466, 468


Vecchi, M.P. 157
Venkayya, V.B. 19, 158, 252, 352, 365, 366
373, 380, 385, 386
Verchery, G. 420, 464
Wallerstein, D. 253
Walsh, G.R. 155
Walsh, J.L. 110, 114, 253, 303, 442, 445
465
Wang, B.P. 278, 303
Wang, J.T.S. 467
Ward, P. 253
Washizu, K. 66
Wasiutynski, Z. 19
Watson, L.T. 156, 157, 250, 465
Weisshaar, T.A. 467
Wellen, H.K. 253
Wilde, D.J. 1, 19
Wilkinson, J.H. 252, 303
Wilkinson, K. 252, 386, 468
Williams, D. 299, 304
Williams, F.W. 458, 466, 467
Wittrick, W.H. 466
Wolfe, P. 206, 399, 413
Woo, T.H. 215, 250
Wood, D.E. 114
Wright, M.H. 155, 206, 302
Wu, A.K. 68
Wu, C.C. 21
Yang, R.J. 251, 346
Yerry, M.A. 251
Zarghamee, M.S. 68
Zeleny, M. 20
Zeman, P. 113
Zoutendijk, G. 206
Zyczkowski, M. 20, 57, 68

Subject Index

ACCESS program 242 ASTROS program 243, 460


active constraints 13
beam problems 2,38,40, 77
active set 13
BFGS update 140
adjoint method (see sensitivity
bisection technique 121
derivatives)
block-angular decomposition 387, 389-399
ADS program 243, 460
block-diagonal decomposition 387, 389
analysis (see also simultaneous
bracketing 116
analysis and design)
branch and bound algorithm 107
.nonlinear equations 142
buckling problems 52-57, 61-63, 382
.reanalysis 222-225
440-458
.system of linear equations 142
buckling load sensitivity 274-283, 323-333
approximations
.conservative-convex 213-214 cable problems 26, 31
.explicit 210 calculus of variations 2, 29-61
.generic 211-222 central-difference derivative
.global 210, 219-222 approximation 256
.higher order 215 circular plate, frequency 60
.linear 210, 211 classical tools 23-69
.linear force 219, 239 collapse load 72
.local 210, 211-219 collapse mechanisms 80
.local-global 221 column problems 52-57, 61
.mid-range 219-222 composite laminates 415-468
.multipoint 221 .aeroelastic tailoring 459-460
.quadratic 215 .buckling problems 440-458
.reciprocal 213, 361, 366, 369, 375 .classical lamination theory 417, 418
.reciprocal quadratic 215 .coupling between bending, extension
.two-point 221 and shear 420
approximations and move limits 210, 229 .corrugated panel 456
arch problems 77 .design of stiffened panels 451-458
artificial variables 94 .design uncertainties 460-462
ASOP program 242 .genetic algorithm 450

.graphical solution 426-429, .reciprocal quadratic 215


432-433, 438-440 .two-point 221
.integer linear programming 442-450 constraint normalization 12
.hat-stiffened panels 456 constraint relaxation 232
.laminate design for flexural conservative approximation 213-214, 363
response 430-438 conservative concave approximation 214
.laminate design for in-plane convex approximation 214, 363
response 422-429 convex function 166, 214
.mechanical response 415-421 convex linearization 215, 363
.orthotropic laminates 416 convex polytope 87
.penalty function formulation 440-442 convex problems 166-168
.ply-orientation design variables convexity 166
425, 433 coordination of multilevel optimization
.ply-thickness design variables 387, 399-401
422, 430 critical point constraint 292
.probabilistic search methods 450 cubic interpolation 121, 123
.quadratic first-ply failure 425 decomposition methods
.removing plies 424 • block angular structure 389-399
.simulated annealing 450 .block diagonal structure 389
.smeared stiffness 452 .computational benefits 392-399
.stacking convention 418 .global or coupling variables 396
.stacking sequence design 436-451 .implicit elimination of variables
.stiffened plate design 451-458 396-399
.stiffness design 446 .local variables 396
.Tsai-Hill yield criterion 426 .relation to multilevel methods 387-388
.wing-rib panel configurations 456 .response calculation 406-411
concave function 166 .sensitivity calculation 406-411
condition error 256 .substructures 390
conjugate directions (see Powell's derivative calculations (see sensitivity
conjugate directions method) derivatives)
conjugate gradient methods 134-137 DESAP Program 242
155, 156, 405 design model 211
CONMIN program 184, 243 design-variable normalization 12
constraint approximations domain parametrization 337-339
.conservative-convex 213-214 DFP update 139
.explicit 210 differential calculus 23-28
.generic 211-222 discrete-valued design variables 105, 195-
.global 210, 219-222 198, 357-361
.higher order 215 dual methods 353-365
.linear 210, 211 .Falk's dual 354, 355
.linear force 219, 239 .first order approximations 361
.local 210, 211-219 .integer programming 357-361
.local-global 221 .linear programming 96-100, 354
.mid-range 219-222 .separable problems 355
.multipoint 221 dummy load method 35, 264
.quadratic 215 dynamic compliance 60
.reciprocal 213, 361, 366, 369, 375 dynamic programming 21

eigenvalue sensitivity 276-290 generalized reduced gradient method 177


.adjoint or modal technique 277, 289 generic approximations 211-222
.direct approach 277 genetic algorithms 146, 149-152, 450
.eigenvectors 276, 277 .reproduction 151
.modified modal method 278 .crossover 151
.Nelson's method 277 .mutation 151
.non-hermitian 283-289 geometric programming 21
.nonlinear problems 290 geometrical optimization 239
.normalization condition 289 global approximations 210, 219-222
.reduced-basis approach 285 global optimization 145
.repeated eigenvalues 276, 281-283 global sensitivity equation 408
.vibration 276-283 golden section search 118-120
eigenvalue reanalysis 226-228 gradient projection method 176-182, 369
elastically supported column 67 Green's function method 295
envelope functions 160, 291, 404 ground structure 240
equality constraints 9 Hessian matrix 24, 137, 138
Euler-Bernoulli beams 57 .inverse Hessian approximations
Euler-Bernoulli columns 52 138-140
Euler-Lagrange equations 31, 32, 37 .rank one updates 138
explicit approximation 210 .rank two updates 139
extended interior penalty function .sparse update techniques 156
190-192 .sparsity 143
exterior penalty function 187-190 higher-order approximations 215
extreme point 87 Huang's family of updates 139, 141
fast reanalysis techniques 222-228 I-DEAS program 243
FASOR program 305 IDESIGN program 243
FASTOP program 242, 371, 460 implicit approximation 210
feasible direction 163 inactive constraints 13
.method 182-186 inequality constraints 9
feasible domain 13 infeasible domain 13
Fibonacci search 118-119 integer programming 4
finite difference approximations 256-263 .branch and bound algorithm 107
.accuracy and derivative magnitude .linear programming 104-110
261 .penalty function method 195-198,
.accuracy and step size selection 259 440-442
.central-difference approximation 256 integrated analysis and design 21
.condition error 256 interior penalty function 190, 191
.iteratively solved problems 259-261 internal boundaries or holes 240
.forward-difference approximation 256 intervening variables 212
.round-off error 256 Karmarkar's Algorithm 100-104
.truncation error 256 Kreisselmeier-Steinhauser function 160,
flutter sensitivity derivatives 290 291, 404
Fourier series 61 Kuhn-Tucker conditions 12, 99, 161-170
frame structures 77-81 366
fully stressed design 15, 348-352, 390 Kuhn-Tucker multipliers (see Lagrange
functions of one variable 115-123 multipliers)
Gauss-Newton method 193 Lagrange multipliers 13, 33, 34, 37-39

161-173, 354, 368 linear approximation 210, 211


.as dual variables 354 linear force approximation 219, 239
.as price of constraints 174 line search 14, 115-123
.computation of 170-173, 371, 376 linear static response reanalysis 222-225
.technique 34, 162 linear programming 10
Lagrangian function 162, 234, 354 .duality in 96-100
laminate design .graphical solution 86-88
.aeroelastic tailoring 459-460 .integer programming 104-110
.buckling problems 440-458 .standard form 88
.classical lamination theory 417, 418 .use of 72-86
.coupling between bending, local approximation 210-219
extension and shear 420 local constraints 44
.corrugated panel 456 local-global approximation 221
.design of stiffened panels 451-458 logarithmic derivative 261
.design uncertainties 460-462 lower bound theorem 74
.genetic algorithm 450 marginal prices 174
.graphical solution 426-429, material derivative 334
432-433, 438-440 mathematical programming 3
.integer linear programming 442-450 mesh generators 240
.hat-stiffened panels 456 Metropolis algorithm (see simulated
.design for flexural response 430-438 annealing)
.design for in-plane response 422-429 mid-range approximations 219-222
.mechanical response 415-421 min-max approach 44-49
.orthotropic laminates 416 move limits 210, 229
.penalty function formulation 440-442 multicriteria optimization 5-9, 20, 21
.ply-orientation design variables multilevel techniques
425, 433 .coordination 387, 399-401
.ply-thickness design variables .decomposition 387-399
422, 430 .derivatives of subsystem optima
.probabilistic search methods 450 399-401
.quadratic first-ply failure 425 .discontinuous sensitivities 400
.removing plies 424 .envelope function approach 401, 404
.simulated annealing 450 .narrow-tree problems 387, 404-406
.smeared stiffness 452 .penalty function approach 401-404
.stacking convention 418 .wide-tree structure 387
.stacking sequence design 436-451 multiple objective functions 5-9
.stiffened plate design 451-458 multiplier methods 198-201
.stiffness design 446 multipoint approximations 221
.Tsai-Hill yield criterion 426 necessary conditions for optimality 24, 49
.wing-rib panel configurations 456 Nelson's method 277
limit analysis and design 72-81 NEWSUMT Program 243
.arches 77 NEWSUMT-A Program 243
.beams 77-79 NISAOPT program 243
.frames 77-81 Newton's method 122-123, 137-138,
.trusses 73-77 157, 274
limit load 72 Newton-type methods 157
limit load sensitivity 274-275, 323-326 normalization of design variables

and constraints 12 .OPTCOMP 242


Objective function 5 .OPTIMUM 242
.linear 10 .OPTSYS 243
.multiple 5-9 .PANDA2 372
one dimensional line search 14 .PARS 242
OPSTAT Program 242 .PASCO 454-458
OPT Program 243 .PROSSS 242
OPTCOMP Program 242 .SHAPE 243
optimality criteria methods .SPAR 242
.displacements constraints 366-370 .STARS 242
.fully stressed design 348-352 .STROPT 243
.intuitive 348-353 .TSO 242, 459
.Lagrange multiplier estimation 371 .WASP 242, 459
376 PANDA2 program 458
.scaling based 372-374, 380-381 PARS Program 242
.several constraints 375-382 PASCO program 454-458
.single constraint 365-374 penalty function
.stability constraints 382 .asymptotic behavior 188, 192
.stress-ratio technique 351, 378, 390 .exterior 187-190
.uniform cost-effectiveness resizing .interior 190, 191
371, 460 penalty function methods 186-198
.uniform strain-energy density 352 .extrapolation procedure 188, 192
OPTIMUM Program 242 .extended interior 190-192
optimization packages 242-243 .ill-conditioning 189
.ABAQUS 243 .integer programming problems
.ACCESS 242 195-198, 440-442
.ADS 243, 460 .unconstrained minimization 193-195
.ANSYS 243 plastic design (see limit analysis
.ASKA 243 and design)
.ASOP 242 plate problems 4, 20, 56, 60
.ASTROS 243, 460 ply-orientation variables 425, 433
.CONMIN 184, 243 ply-thickness variables 422, 430
.DESAP 242 positive definiteness 24
.DOC 243 Powell's conjugate directions method 124
.DOT 243 127-131
.EAL 242 preconditioned conjugate gradient
.FASTOP 242, 371, 460 methods 137, 155, 405
.GENESIS 243 prestressed concrete design 81-83
.I-DEAS 243 probabilistic search 145-152
.IDESIGN 243 (see genetic algorithms and
.NASTRAN 243 simulated annealing)
.NEWSUMT 243 projected Lagrangian methods 201-204
.NEWSUMT-A 243 projection matrix 171
.NISA II 243 PROSSS Program 242
.NISAOPT 243 pseudo loads 271, 306, 308
.OPSTAT 242 QR factorization 172
.OPT 243 quadratic extended penalty function 191

quadratic interpolation 117, 123 .direct approach 277


quadratic approximation 215 .eigenvectors 276, 277
quadratic programming 169-170 .modified modal method 278
quasi-Newton methods 138-145, 156, 157 .Nelson's method 277
.rank-one updates 138-139 .non-hermitian 283-289
.rank-two updates 139-140 .nonlinear problems 290
Rayleigh quotient 52, 53, 58, 64, 65, 227 .normalization condition 289
reanalysis techniques 222-228 .reduced-basis approach 285
reciprocal approximation 213, 361, 366 .repeated eigenvalues 276, 281-283
369, 375 .vibration 276-283
reciprocal quadratic approximation 215 sensitivity derivatives of limit load
reduced gradient method 176-182 274-275
response surface 219 sensitivity derivatives of optimum
Rosen's decomposition algorithm 400 solutions 173-175
Rosen's gradient projection method sensitivity derivatives of static response
176-177 263-275, 306-327
safeguarded polynomial interpolation 123 .accuracy problems 269, 343
sandwich beams 60 .analytical first derivatives 263-268
scaling-based resizing 372-374, 380-381 .adjoint method 264, 274, 312, 343
second derivatives of static response 268 .computational effort 267
semi-analytical method 269-273 .direct method 264, 274, 308, 319, 338
sensitivity derivatives .nonlinear analysis 273, 318
.adjoint method 264, 274, 277, 294 .second derivatives 268
312, 320, 331, 343 .semi-analytical method 269-273
.buckling load 274-283, 323-333 sensitivity derivatives of transient
.decomposition 406-411 response 291-300
.direct method 264, 274, 277, 293, 308 .adjoint method 294
319, 327, 339 .critical point constraint 292
.dummy-load method 264 .direct method 293
.finite-difference approximations .equivalent constraints 291
256-263 .equivalent integrated constraint 291
.flutter 290 .Green's function method 294-298
.global sensitivity equation 408 .linear structural dynamics 298-300
.linear static analysis 263-273 .mode-acceleration method 299
306-317 .mode-displacement method 298
.linear structural dynamics 298-300 separable problems 355
.limit loads 274, 323-326 sequential approximate optimization 209
.nonlinear analysis 273, 290, 318 sequential linear programming 210
.second derivatives 268 228-236, 423
.shape sensitivity 269-273, 334-345 sequential nonlinear approximate
.structural dynamics 298-300 optimization 236-239
.variational sensitivity (see sequential quadratic programming
variational sensitivity analysis) 201-204
.vibration 276-283, 327-333 sequential simplex method 123-127
sensitivity derivatives of eigenvalue series solution 61-63
problems 276-290 seventy-two-bar truss 247
.adjoint or modal technique 277, 289 SHAPE program 243

shape optimization 5, 20, 239-242 371, 460


shape sensitivity derivatives 269-273 unimodal function 118
333-345 univariate search 116
shell problems 20 updated versus fixed modes 285-289
simplex method 88-96 usable feasible direction 163, 183
.artificial variables 94 variable-metric methods 138, 155, 156
.sequential 123-127 variational sensitivity analysis 305-346
.tableau 93 .adjoint method 312, 320, 331, 343
simulated annealing 146-149, 450 .direct method 308, 319, 327, 339
.cooling schedule 147 .linear static analysis 306-319
.Metropolis algorithm 146, 147 .limit loads 323-326
simultaneous analysis and design 11, 15 .nonlinear static analysis 318-322
21, 404-406 .static shape sensitivity 334-345
simultaneous mode design approach 353 .unfavorable computational
slack variables 88 experience 343
stability problems 52-57, 61-63, 382 .vibration and buckling 323-333
440-458 vibration problems 57, 60
stacking sequence effects in composite vibration sensitivity calculation 276-283
design 436-451 327-333
stiffened panel design 451-458 .effect of updating modes 285-289
STARS program 242 .reduced-basis approach 285
statically determinate trusses 83 VICON program 458
steepest descent method 132-134 VICONOPT program 458
stress ratio technique 351, 378, 390 VIPASA program 453
STROPT program 243 WASP Program 242, 459
structures of maximum stiffness 50, 51 wing design 194
sufficient conditions for optimality 24
49-61, 164
surplus variables 88
ten-bar truss 237, 244, 350
test problems 244-247
.ten-bar truss 237, 238, 244, 350
.twenty-five-bar truss 245
.seventy-two-bar truss 247
topological optimization 20, 239-242
transient response sensitivity 291-300
truncation error 256
truss problems 73-77, 83-85, 395
Tsai-Hill yield criterion 426
TSO Program 242, 459
twenty-five-bar truss 245
two-point approximations 221
unconstrained optimization 115-157
.functions of one variable 115-123
.functions of several variables 123-142
.with penalty functions 193-195
uniform cost-effectiveness criterion
Mechanics
SOLID MECHANICS AND ITS APPLICATIONS
Series Editor: G.M.L. Gladwell
Aims and Scope of the Series
The fundamental questions arising in mechanics are: Why?, How?, and How much? The aim of this
series is to provide lucid accounts written by authoritative researchers giving vision and insight in
answering these questions on the subject of mechanics as it relates to solids. The scope of the series
covers the entire spectrum of solid mechanics. Thus it includes the foundation of mechanics;
variational formulations; computational mechanics; statics, kinematics and dynamics of rigid and
elastic bodies; vibrations of solids and structures; dynamical systems and chaos; the theories of
elasticity, plasticity and viscoelasticity; composite materials; rods, beams, shells and membranes;
structural control and stability; soils, rocks and geomechanics; fracture; tribology; experimental
mechanics; biomechanics and machine design.
1. R.T. Haftka, Z. Gürdal and M.P. Kamat: Elements of Structural Optimization. 2nd rev. ed.,
1990 ISBN 0-7923-0608-2
2. J.J. Kalker: Three-Dimensional Elastic Bodies in Rolling Contact. 1990
ISBN 0-7923-0712-7
3. P. Karasudhi: Foundations of Solid Mechanics. 1991 ISBN 0-7923-0772-0
4. N. Kikuchi: Computational Methods in Contact Mechanics. (forthcoming)
ISBN 0-7923-0773-9
5. Y.K. Cheung and A.Y.T. Leung: Finite Element Methods in Dynamics. (forthcoming)
ISBN 0-7923-1313-5
6. J.F. Doyle: Static and Dynamic Analysis of Structures. With an Emphasis on Mechanics and
Computer Matrix Methods. 1991 ISBN 0-7923-1124-8; Pb 0-7923-1208-2
7. O.O. Ochoa and J.N. Reddy: Finite Element Modelling of Composite Structures.
(forthcoming) ISBN 0-7923-1125-6
8. M.H. Aliabadi and D.P. Rooke: Numerical Fracture Mechanics. ISBN 0-7923-1175-2
9. J. Angeles and C.S. López-Cajún: Optimization of Cam Mechanisms. 1991
ISBN 0-7923-1355-0
10. D.E. Grierson, A. Franchi and P. Riva: Progress in Structural Engineering. 1991
ISBN 0-7923-1396-8
11. R.T. Haftka and Z. Gürdal: Elements of Structural Optimization. 3rd rev. and exp. ed. 1992
ISBN 0-7923-1504-9; Pb 0-7923-1505-7

Kluwer Academic Publishers - Dordrecht / Boston / London


FLUID MECHANICS AND ITS APPLICATIONS
Series Editor: R. Moreau
Aims and Scope of the Series
The purpose of this series is to focus on subjects in which fluid mechanics plays a fundamental
role. As well as the more traditional applications of aeronautics, hydraulics, heat and mass transfer
etc., books will be published dealing with topics which are currently in a state of rapid development,
such as turbulence, suspensions and multiphase fluids, super and hypersonic flows and
numerical modelling techniques. It is a widely held view that it is the interdisciplinary subjects that
will receive intense scientific attention, bringing them to the forefront of technological advancement.
Fluids have the ability to transport matter and its properties as well as transmit force,
therefore fluid mechanics is a subject that is particularly open to cross fertilisation with other
sciences and disciplines of engineering. The subject of fluid mechanics will be highly relevant in
domains such as chemical, metallurgical, biological and ecological engineering. This series is
particularly open to such new multidisciplinary domains.
1. M. Lesieur: Turbulence in Fluids. 2nd rev. ed., 1990 ISBN 0-7923-0645-7
2. O. Metais and M. Lesieur (eds.): Turbulence and Coherent Structures. 1991
ISBN 0-7923-0646-5
3. R. Moreau: Magnetohydrodynamics. 1990 ISBN 0-7923-0937-5
4. E. Coustols (ed.): Turbulence Control by Passive Means. 1990 ISBN 0-7923-1020-9
5. A. A. Borissov (ed.): Dynamic Structure of Detonation in Gaseous and Dispersed Media.
1991 ISBN 0-7923-1340-2
6. K.-S. Choi (ed.): Recent Developments in Turbulence Management. 1991
ISBN 0-7923-1477-8



Mechanics
From 1990, books on the subject of mechanics will be published under two series:
FLUID MECHANICS AND ITS APPLICATIONS
Series Editor: R.J. Moreau
SOLID MECHANICS AND ITS APPLICATIONS
Series Editor: G.M.L. Gladwell
Prior to 1990, the books listed below were published in the respective series indicated below.

MECHANICS: DYNAMICAL SYSTEMS


Editors: L. Meirovitch and G.Æ. Oravas

1. E.H. Dowell: Aeroelasticity of Plates and Shells. 1975 ISBN 90-286-0404-9


2. D.G.B. Edelen: Lagrangian Mechanics of Nonconservative Nonholonomic Systems.
1977 ISBN 90-286-0077-9
3. J.L. Junkins: An Introduction to Optimal Estimation of Dynamical Systems. 1978
ISBN 90-286-0067-1
4. E.H. Dowell (ed.), H.C. Curtiss Jr., R.H. Scanlan and F. Sisto: A Modern Course in
Aeroelasticity. Revised and enlarged edition see under Volume 11
5. L. Meirovitch: Computational Methods in Structural Dynamics. 1980
ISBN 90-286-0580-0
6. B. Skalmierski and A. Tylikowski: Stochastic Processes in Dynamics. Revised and
enlarged translation. 1982 ISBN 90-247-2686-7
7. P.C. Müller and W.O. Schiehlen: Linear Vibrations. A Theoretical Treatment of Multi-
degree-of-freedom Vibrating Systems. 1985 ISBN 90-247-2983-1
8. Gh. Buzdugan, E. Mihailescu and M. Radeş: Vibration Measurement. 1986
ISBN 90-247-3111-9
9. G.M.L. Gladwell: Inverse Problems in Vibration. 1987 ISBN 90-247-3408-8
10. G.I. Schueller and M. Shinozuka: Stochastic Methods in Structural Dynamics. 1987
ISBN 90-247-3611-0
11. E.H. Dowell (ed.), H.C. Curtiss Jr., R.H. Scanlan and F. Sisto: A Modern Course in
Aeroelasticity. Second revised and enlarged edition (of Volume 4). 1989
ISBN Hb 0-7923-0062-9; Pb 0-7923-0185-4
12. W. Szemplińska-Stupnicka: The Behavior of Nonlinear Vibrating Systems. Volume I:
Fundamental Concepts and Methods: Applications to Single-Degree-of-Freedom
Systems. 1990 ISBN 0-7923-0368-7
13. W. Szemplińska-Stupnicka: The Behavior of Nonlinear Vibrating Systems. Volume II:
Advanced Concepts and Applications to Multi-Degree-of-Freedom Systems. 1990
ISBN 0-7923-0369-5
Set ISBN (Vols. 12-13) 0-7923-0370-9

MECHANICS OF STRUCTURAL SYSTEMS


Editors: J.S. Przemieniecki and G.Æ. Oravas

1. L. Fryba: Vibration of Solids and Structures under Moving Loads. 1970


ISBN 90-01-32420-2
2. K. Marguerre and K. Wolfel: Mechanics of Vibration. 1979 ISBN 90-286-0086-8
3. E.B. Magrab: Vibrations of Elastic Structural Members. 1979 ISBN 90-286-0207-0
4. R.T. Haftka and M.P. Kamat: Elements of Structural Optimization. 1985
Revised and enlarged edition see under Solid Mechanics and Its Applications. Volume 1
5. J.R. Vinson and R.L. Sierakowski: The Behavior of Structures Composed of Composite
Materials. 1986 ISBN Hb 90-247-3125-9; Pb 90-247-3578-5
6. B.E. Gatewood: Virtual Principles in Aircraft Structures. Volume 1: Analysis. 1989
ISBN 90-247-3754-0
7. B.E. Gatewood: Virtual Principles in Aircraft Structures. Volume 2: Design, Plates,
Finite Elements. 1989 ISBN 90-247-3755-9
Set (Gatewood 1 + 2) ISBN 90-247-3753-2

MECHANICS OF ELASTIC AND INELASTIC SOLIDS


Editors: S. Nemat-Nasser and G.Æ. Oravas

1. G.M.L. Gladwell: Contact Problems in the Classical Theory of Elasticity. 1980


ISBN Hb 90-286-0440-5; Pb 90-286-0760-9
2. G. Wempner: Mechanics of Solids with Applications to Thin Bodies. 1981
ISBN 90-286-0880-X
3. T. Mura: Micromechanics of Defects in Solids. 2nd revised edition, 1987
ISBN 90-247-3343-X
4. R.G. Payton: Elastic Wave Propagation in Transversely Isotropic Media. 1983
ISBN 90-247-2843-6
5. S. Nemat-Nasser, H. Abé and S. Hirakawa (eds.): Hydraulic Fracturing and Geothermal
Energy. 1983 ISBN 90-247-2855-X
6. S. Nemat-Nasser, R.J. Asaro and G.A. Hegemier (eds.): Theoretical Foundation for
Large-scale Computations of Nonlinear Material Behavior. 1984 ISBN 90-247-3092-9
7. N. Cristescu: Rock Rheology. 1988 ISBN 90-247-3660-9
8. G.I.N. Rozvany: Structural Design via Optimality Criteria. The Prager Approach to
Structural Optimization. 1989 ISBN 90-247-3613-7

MECHANICS OF SURFACE STRUCTURES


Editors: W.A. Nash and G.Æ. Oravas

1. P. Seide: Small Elastic Deformations of Thin Shells. 1975 ISBN 90-286-0064-7


2. V. Panc: Theories of Elastic Plates. 1975 ISBN 90-286-0104-X
3. J.L. Nowinski: Theory of Thermoelasticity with Applications. 1978
ISBN 90-286-0457-X
4. S. Lukasiewicz: Local Loads in Plates and Shells. 1979 ISBN 90-286-0047-7
5. C. Firt: Statics, Formfinding and Dynamics of Air-supported Membrane Structures.
1983 ISBN 90-247-2672-7
6. Y. Kai-yuan (ed.): Progress in Applied Mechanics. The Chien Wei-zang Anniversary
Volume. 1987 ISBN 90-247-3249-2
7. R. Negruţiu: Elastic Analysis of Slab Structures. 1987 ISBN 90-247-3367-7
8. J.R. Vinson: The Behavior of Thin Walled Structures. Beams, Plates, and Shells. 1988
ISBN Hb 90-247-3663-3; Pb 90-247-3664-1
MECHANICS OF FLUIDS AND TRANSPORT PROCESSES
Editors: R.J. Moreau and G.Æ. Oravas

1. J. Happel and H. Brenner: Low Reynolds Number Hydrodynamics. With Special


Applications to Particular Media. 1983 ISBN Hb 90-01-37115-9; Pb 90-247-2877-0
2. S. Zahorski: Mechanics of Viscoelastic Fluids. 1982 ISBN 90-247-2687-5
3. J.A. Sparenberg: Elements of Hydrodynamics Propulsion. 1984 ISBN 90-247-2871-1
4. B.K. Shivamoggi: Theoretical Fluid Dynamics. 1984 ISBN 90-247-2999-8
5. R. Timman, A.J. Hermans and G.C. Hsiao: Water Waves and Ship Hydrodynamics. An
Introduction. 1985 ISBN 90-247-3218-2
6. M. Lesieur: Turbulence in Fluids. Stochastic and Numerical Modelling. 1987
ISBN 90-247-3470-3
7. L.A. Lliboutry: Very Slow Flows of Solids. Basics of Modeling in Geodynamics and
Glaciology. 1987 ISBN 90-247-3482-7
8. B.K. Shivamoggi: Introduction to Nonlinear Fluid-Plasma Waves. 1988
ISBN 90-247-3662-5
9. V. Bojarevičs, Ya. Freibergs, E.I. Shilova and E.V. Shcherbinin: Electrically Induced
Vortical Flows. 1989 ISBN 90-247-3712-5
10. J. Lielpeteris and R. Moreau (eds.): Liquid Metal Magnetohydrodynamics. 1989
ISBN 0-7923-0344-X

MECHANICS OF ELASTIC STABILITY


Editors: H. Leipholz and G.Æ. Oravas

1. H. Leipholz: Theory of Elasticity. 1974 ISBN 90-286-0193-7


2. L. Librescu: Elastostatics and Kinetics of Anisotropic and Heterogeneous Shell-type
Structures. 1975 ISBN 90-286-0035-3
3. C.L. Dym: Stability Theory and Its Applications to Structural Mechanics. 1974
ISBN 90-286-0094-9
4. K. Huseyin: Nonlinear Theory of Elastic Stability. 1975 ISBN 90-286-0344-1
5. H. Leipholz: Direct Variational Methods and Eigenvalue Problems in Engineering.
1977 ISBN 90-286-0106-6
6. K. Huseyin: Vibrations and Stability of Multiple Parameter Systems. 1978
ISBN 90-286-0136-8
7. H. Leipholz: Stability of Elastic Systems. 1980 ISBN 90-286-0050-7
8. V.V. Bolotin: Random Vibrations of Elastic Systems. 1984 ISBN 90-247-2981-5
9. D. Bushnell: Computerized Buckling Analysis of Shells. 1985 ISBN 90-247-3099-6
10. L.M. Kachanov: Introduction to Continuum Damage Mechanics. 1986
ISBN 90-247-3319-7
11. H.H.E. Leipholz and M. Abdel-Rohman: Control of Structures. 1986
ISBN 90-247-3321-9
12. H.E. Lindberg and A.L. Florence: Dynamic Pulse Buckling. Theory and Experiment.
1987 ISBN 90-247-3566-1
13. A. Gajewski and M. Zyczkowski: Optimal Structural Design under Stability Constraints.
1988 ISBN 90-247-3612-9
MECHANICS: ANALYSIS
Editors: V.J. Mizel and G.Æ. Oravas

1. M.A. Krasnoselskii, P.P. Zabreiko, E.I. Pustylnik and P.E. Sobolevskii: Integral
Operators in Spaces of Summable Functions. 1976 ISBN 90-286-0294-1
2. V.V. Ivanov: The Theory of Approximate Methods and Their Application to the
Numerical Solution of Singular Integral Equations. 1976 ISBN 90-286-0036-1
3. A. Kufner, O. John and S. Fučík: Function Spaces. 1977 ISBN 90-286-0015-9
4. S.G. Mikhlin: Approximation on a Rectangular Grid. With Application to Finite
Element Methods and Other Problems. 1979 ISBN 90-286-0008-6
5. D.G.B. Edelen: Isovector Methods for Equations of Balance. With Programs for
Computer Assistance in Operator Calculations and an Exposition of Practical Topics of
the Exterior Calculus. 1980 ISBN 90-286-0420-0
6. R.S. Anderssen, F.R. de Hoog and M.A. Lukas (eds.): The Application and Numerical
Solution of Integral Equations. 1980 ISBN 90-286-0450-2
7. R.Z. Has'minskii: Stochastic Stability of Differential Equations. 1980
ISBN 90-286-0100-7
8. A.I. Vol'pert and S.I. Hudjaev: Analysis in Classes of Discontinuous Functions and
Equations of Mathematical Physics. 1985 ISBN 90-247-3109-7
9. A. Georgescu: Hydrodynamic Stability Theory. 1985 ISBN 90-247-3120-8
10. W. Noll: Finite-dimensional Spaces. Algebra, Geometry and Analysis. Volume I. 1987
ISBN Hb 90-247-3581-5; Pb 90-247-3582-3

MECHANICS: COMPUTATIONAL MECHANICS


Editors: M. Stern and G.Æ. Oravas

1. T.A. Cruse: Boundary Element Analysis in Computational Fracture Mechanics. 1988


ISBN 90-247-3614-5

MECHANICS: GENESIS AND METHOD


Editor: G.Æ. Oravas

1. P.-M.-M. Duhem: The Evolution of Mechanics. 1980 ISBN 90-286-0688-2

MECHANICS OF CONTINUA
Editors: W.O. Williams and G.Æ. Oravas

1. C.-C. Wang and C. Truesdell: Introduction to Rational Elasticity. 1973


ISBN 90-01-93710-1
2. P.J. Chen: Selected Topics in Wave Propagation. 1976 ISBN 90-286-0515-0
3. P. Villaggio: Qualitative Methods in Elasticity. 1977 ISBN 90-286-0007-8
MECHANICS OF FRACTURE
Editors: G.C. Sih

1. G.C. Sih (ed.): Methods of Analysis and Solutions of Crack Problems. 1973
ISBN 90-01-79860-8
2. M.K. Kassir and G.C. Sih (eds.): Three-dimensional Crack Problems. A New Solution
of Crack Solutions in Three-dimensional Elasticity. 1975 ISBN 90-286-0414-6
3. G.C. Sih (ed.): Plates and Shells with Cracks. 1977 ISBN 90-286-0146-5
4. G.C. Sih (ed.): Elastodynamic Crack Problems. 1977 ISBN 90-286-0156-2
5. G.C. Sih (ed.): Stress Analysis of Notch Problems. Stress Solutions to a Variety of
Notch Geometries used in Engineering Design. 1978 ISBN 90-286-0166-X
6. G.C. Sih and E.P. Chen (eds.): Cracks in Composite Materials. A Compilation of Stress
Solutions for Composite System with Cracks. 1981 ISBN 90-247-2559-3
7. G.C. Sih (ed.): Experimental Evaluation of Stress Concentration and Intensity Factors.
Useful Methods and Solutions to Experimentalists in Fracture Mechanics. 1981
ISBN 90-247-2558-5

MECHANICS OF PLASTIC SOLIDS


Editors: J. Schroeder and G.Æ. Oravas

1. A. Sawczuk (ed.): Foundations of Plasticity. 1973 ISBN 90-01-77570-5


2. A. Sawczuk (ed.): Problems of Plasticity. 1974 ISBN 90-286-0233-X
3. W. Szczepiński: Introduction to the Mechanics of Plastic Forming of Metals. 1979
ISBN 90-286-0126-0
4. D.A. Gokhfeld and O.F. Cherniavsky: Limit Analysis of Structures at Thermal Cycling.
1980 ISBN 90-286-0455-3
5. N. Cristescu and I. Suliciu: Viscoplasticity. 1982 ISBN 90-247-2777-4

