Gujarat Technological University
Module 1: Introduction
Historical Development
The existence of optimization methods can be traced to the days of Newton, Lagrange, and
Cauchy. The development of differential calculus methods for optimization was possible because
of the contributions of Newton and Leibniz to calculus. The foundations of the calculus of
variations, which deals with the minimization of functionals, were laid by Bernoulli, Euler,
Lagrange, and Weierstrass. The method of optimization for constrained problems, which involves
the addition of unknown multipliers, became known by the name of its inventor, Lagrange.
Cauchy made the first application of the steepest descent method to solve unconstrained
optimization problems. By the middle of the twentieth century, high-speed digital computers
made the implementation of complex optimization procedures possible and stimulated further
research on newer methods. Spectacular advances followed, producing a massive literature on
optimization techniques. This advancement also resulted in the emergence of several well
defined new areas in optimization theory.
Some of the major developments in the area of numerical methods of unconstrained
optimization are outlined here with a few milestones.
• Development of the simplex method by Dantzig in 1947 for linear programming
problems.
• The enunciation of the principle of optimality by Bellman in 1957 for dynamic
programming problems.
• Work by Kuhn and Tucker in 1951 on the necessary and sufficient conditions for the
optimal solution of programming problems, which laid the foundation for later research
in nonlinear programming.
• The contributions of Zoutendijk and Rosen to nonlinear programming during the early
1960s, which have been very significant.
• Work of Carroll and of Fiacco and McCormick, which allowed many difficult problems
to be solved by using the well-known techniques of unconstrained optimization.
• Development of geometric programming in the 1960s by Duffin, Zener, and Peterson.
• Gomory's pioneering work in integer programming, one of the most exciting and
rapidly developing areas of optimization, since most real-world applications fall under
this category of problems.
• Development of stochastic programming techniques by Dantzig and by Charnes and
Cooper, who solved problems by assuming design parameters to be independent and
normally distributed.
The necessity to optimize more than one objective or goal while satisfying the physical
limitations led to the development of multi-objective programming methods. Goal programming
is a well-known technique for solving specific types of multi-objective optimization problems.
Goal programming was originally proposed for linear problems by Charnes and Cooper in
1961. The foundation of game theory was laid by von Neumann in 1928 and since then the
technique has been applied to solve several mathematical, economic and military problems. Only
during the last few years has game theory been applied to solve engineering problems.
Simulated annealing, genetic algorithms, and neural network methods represent a new
class of mathematical programming techniques that have come into prominence during the last
decade. Simulated annealing is analogous to the physical process of annealing of metals and
glass. The genetic algorithms are search techniques based on the mechanics of natural selection
and natural genetics. Neural network methods are based on solving the problem using the
computing power of a network of interconnected ‘neuron’ processors.
Objective Function
As already stated, the objective function is the mathematical function one wants to maximize
or minimize, subject to certain constraints. Many optimization problems have a single objective
function. (When they do not, they can often be reformulated so that they do.) The two exceptions
are:
• No objective function. In some cases (for example, design of integrated circuit
layouts), the goal is to find a set of variables that satisfies the constraints of the model.
The user does not particularly want to optimize anything and so there is no reason to
define an objective function. This type of problem is usually called a feasibility
problem.
• Multiple objective functions. In some cases, the user may like to optimize a number of
different objectives concurrently. For instance, in the optimal design of a door or
window panel, it would be good to minimize weight and maximize strength
simultaneously. Usually, the different objectives are not compatible; the variables that
optimize one objective may be far from optimal for the others. In practice, problems
with multiple objectives are reformulated as single-objective problems by either
forming a weighted combination of the different objectives or by treating some of the
objectives as constraints.
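The weighted-combination reformulation mentioned above can be sketched in a few lines. The panel objectives below (a weight that grows with thickness t and a strength that saturates) and the weights are hypothetical illustrations, not taken from the text.

```python
# Weighted-sum scalarization: combine two objectives into one.
# Functions and weights are illustrative assumptions only.

def panel_weight(t):          # objective 1: minimize weight (grows with t)
    return 2.0 * t

def panel_strength(t):        # objective 2: maximize strength
    return 10.0 * t - t ** 2

def combined_objective(t, w1=0.5, w2=0.5):
    # Minimize weight and maximize strength: negate the strength term
    # so that a single minimization handles both goals.
    return w1 * panel_weight(t) - w2 * panel_strength(t)

# Crude search over candidate thicknesses (a grid, for illustration).
candidates = [0.5 * i for i in range(1, 11)]
best_t = min(candidates, key=combined_objective)
```

Varying the weights w1 and w2 trades one objective off against the other, which is exactly why the reformulation to a single objective is practical.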
The optimization problem can be stated in its standard form as
Find X = (x1, x2, …, xn) which minimizes f(X)
subject to gi(X) ≤ 0, i = 1, 2, …, m
lj(X) = 0, j = 1, 2, …, p        (1.1)
where X is an n-dimensional vector called the design vector, f(X) is called the objective
function, and gi(X) and lj(X) are known as the inequality and equality constraints, respectively.
The number of variables n and the number of constraints m and/or p need not be related in
any way. This type of problem is called a constrained optimization problem.
If the locus of all points satisfying f(X) = c, a constant, is considered, it forms a family
of surfaces in the design space called the objective function surfaces. When these are drawn
together with the constraint surfaces, as shown in Fig 1, the optimum point (here a maximum)
can be identified. This graphical identification is possible only when the number of design
variables is two. With three or more design variables, because of the complexity of the
objective function surface, the problem has to be solved as a mathematical problem and this
visualization is not possible.
Problems with no constraints are called unconstrained optimization problems. The field of
unconstrained optimization is quite a large and prominent one, for which many algorithms
and software packages are available.
Variables
These are essential: if there are no variables, we cannot define the objective function or the
problem constraints. In many practical problems, the design variables cannot be chosen
arbitrarily; they have to satisfy certain specified functional and other requirements.
Constraints
Constraints are not essential. It's been argued that almost all problems really do have
constraints. For example, any variable denoting the "number of objects" in a system can only
be useful if it is less than the number of elementary particles in the known universe! In
practice, though, answers that make good sense in terms of the underlying physical or
economic criteria can often be obtained without putting constraints on the variables.
Design constraints are restrictions that must be satisfied to produce an acceptable design.
Constraints can be broadly classified as:
1) Behavioral or Functional constraints: These represent limitations on the behavior or
performance of the system.
2) Geometric or Side constraints: These represent physical limitations on design
variables such as availability, fabricability, and transportability.
For example, for the retaining wall design shown in the Fig 2, the base width W cannot be
taken smaller than a certain value due to stability requirements. The depth D below the
ground level depends on the soil pressure coefficients Ka and Kp. Since these constraints
depend on the performance of the retaining wall, they are called behavioral constraints. The
number of anchors Ni provided along a cross section cannot be any real number; it has to be
a whole number. Similarly, the thickness of the reinforcement used is controlled by the
supplies available from the manufacturer. Hence these are side constraints.
Constraint Surfaces
Consider the optimization problem presented in eq. 1.1 with only the inequality constraints
gi(X) ≤ 0. The set of values of X that satisfy the equation gi(X) = 0 forms a boundary surface
in the design space called a constraint surface. This is an (n-1)-dimensional subspace, where
n is the number of design variables. The constraint surface divides the design space
into two regions: one with gi(X) < 0 (feasible region) and the other in which gi(X) > 0
(infeasible region). The points lying on the hypersurface satisfy gi(X) = 0. The collection
of all the constraint surfaces gi(X) = 0, i = 1, 2, …, m, which separates the acceptable region
from the unacceptable region, is called the composite constraint surface.
Fig 3 shows a hypothetical two-dimensional design space where the feasible region is
denoted by hatched lines. The two-dimensional design space is bounded by straight lines as
shown in the figure. This is the case when the constraints are linear. However, constraints
may be nonlinear as well, and the design space will then be bounded by curves. A design
point that lies on one or more constraint surfaces is called a bound point, and the
associated constraint is called an active constraint. Free points are those that do not lie on any
constraint surface. The design points that lie in the acceptable or unacceptable regions can be
classified as follows:
1. Free and acceptable point
2. Free and unacceptable point
3. Bound and acceptable point
4. Bound and unacceptable point.
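The four-way classification above can be sketched directly in code. The constraints below (a unit-square feasible region) and the tolerance used to decide whether a point lies on a constraint surface are illustrative assumptions.

```python
# Classify a design point as free/bound and acceptable/unacceptable,
# given inequality constraints of the form g_i(X) <= 0.
TOL = 1e-9

def classify(point, constraints):
    values = [g(point) for g in constraints]
    on_surface = any(abs(v) <= TOL for v in values)   # bound vs free
    feasible = all(v <= TOL for v in values)          # acceptable vs not
    kind = "bound" if on_surface else "free"
    status = "acceptable" if feasible else "unacceptable"
    return f"{kind} and {status} point"

# Hypothetical square feasible region: 0 <= x1 <= 1, 0 <= x2 <= 1.
g = [lambda X: -X[0], lambda X: X[0] - 1.0,
     lambda X: -X[1], lambda X: X[1] - 1.0]

print(classify((0.5, 0.5), g))   # free and acceptable point
print(classify((1.0, 0.5), g))   # bound and acceptable point
print(classify((2.0, 0.5), g))   # free and unacceptable point
```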
Problem formulation
Problem formulation is normally the most difficult part of the process. It is the selection of
design variables, constraints, objective function(s), and models of the discipline/design.
Selection of design variables
A design variable is a quantity, taking a numeric or binary value, that is controllable from
the point of view of the designer. For instance, the thickness of a structural member can be
considered a design variable. Design variables can be continuous (such as the length of a
cantilever beam), discrete (such as the number of reinforcement bars used in a beam), or
Boolean. Design problems with continuous variables are normally solved more easily.
Design variables are often bounded, that is, they have maximum and minimum values.
Depending on the adopted method, these bounds can be treated as constraints or separately.
Selection of constraints
A constraint is a condition that must be satisfied to render the design feasible. An
example of a constraint in beam design is that the resistance offered by the beam at the points
of loading must be equal to or greater than the weight of the structural member and the load
supported. In addition to physical laws, constraints can reflect resource limitations, user
requirements, or bounds on the validity of the analysis models. Constraints can be used
explicitly by the solution algorithm or can be incorporated into the objective, by using
Lagrange multipliers.
Objectives
An objective is a numerical value that is to be maximized or minimized. For example, a
designer may wish to maximize profit or minimize weight. Many solution methods work only
with single objectives. When using these methods, the designer normally weights the various
objectives and sums them to form a single objective. Other methods allow multi-objective
optimization (module 8), such as the calculation of a Pareto front.
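The Pareto-front idea mentioned above can be illustrated with a small dominance filter. The candidate designs and their two objective values (both to be minimized) are hypothetical.

```python
# Extract the Pareto front from a set of candidate designs.
# Each design is a tuple of objective values, all to be minimized.

def dominates(a, b):
    # a dominates b if a is no worse in every objective and strictly
    # better in at least one.
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(designs):
    return [d for d in designs
            if not any(dominates(other, d) for other in designs if other != d)]

# Hypothetical designs with objectives (weight, cost).
designs = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0), (3.0, 4.0)]
front = pareto_front(designs)   # (3.0, 4.0) is dominated by (2.0, 3.0)
```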
Models
The designer has to also choose models to relate the constraints and the objectives to the
design variables. These models are dependent on the discipline involved. They may be
empirical models, such as a regression analysis of aircraft prices, theoretical models, such as
from computational fluid dynamics, or reduced-order models of either of these. In choosing
the models the designer must trade-off fidelity with the time required for analysis.
The multidisciplinary nature of most design problems complicates model choice and
implementation. Often several iterations are necessary between the disciplines’ analyses in
order to find the values of the objectives and constraints. As an example, the aerodynamic
loads on a bridge affect the structural deformation of the supporting structure. The structural
deformation in turn changes the shape of the bridge and hence the aerodynamic loads. Thus,
it can be considered as a cyclic mechanism. Therefore, in analyzing a bridge, the
aerodynamic and structural analyses must be run a number of times in turn until the loads and
deformation converge.
Representation in standard form
Once the design variables, constraints, objectives, and the relationships between them have
been chosen, the problem can be expressed as shown in equation 1.1
Maximization problems can be converted to minimization problems by multiplying the
objective by -1. Constraints can be reversed in a similar manner. Equality constraints can be
replaced by two inequality constraints.
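The two conversions just described can be sketched in code; the functions involved are illustrative assumptions.

```python
# Convert a maximization problem to minimization, and replace an
# equality constraint by two inequalities, as described above.

def to_minimization(f_max):
    # max f(x) is equivalent to min -f(x)
    return lambda x: -f_max(x)

def equality_as_inequalities(l):
    # l(x) = 0 is equivalent to l(x) <= 0 and -l(x) <= 0
    return (lambda x: l(x), lambda x: -l(x))

f = lambda x: -(x - 3.0) ** 2      # maximize: peak at x = 3
f_min = to_minimization(f)         # minimize: valley at x = 3

g1, g2 = equality_as_inequalities(lambda x: x - 2.0)
# x = 2 satisfies both g1(x) <= 0 and g2(x) <= 0; x = 2.5 violates g1.
```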
Problem solution
The problem is normally solved by choosing appropriate techniques from those available in
the field of optimization. These include gradient-based algorithms, population-based
algorithms, and others. Very simple problems can sometimes be expressed linearly; in that
case the techniques of linear programming are applicable.
Gradient-based methods
• Newton's method
• Steepest descent
• Conjugate gradient
• Sequential quadratic programming
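As a minimal illustration of the gradient-based family, here is a steepest-descent sketch on a simple quadratic; the test function, step size, and iteration count are illustrative assumptions.

```python
# Steepest descent on f(x, y) = x**2 + 4*y**2, whose gradient is
# (2x, 8y) and whose minimum is at the origin.

def grad(x, y):
    return 2.0 * x, 8.0 * y

def steepest_descent(x, y, step=0.1, iters=200):
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy   # move against the gradient
    return x, y

x_star, y_star = steepest_descent(5.0, 3.0)   # converges toward (0, 0)
```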
Population-based methods
• Genetic algorithms
• Particle swarm optimization
Other methods
• Random search
• Grid search
• Simulated annealing
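From the third group, a minimal simulated-annealing sketch follows; the objective, cooling schedule, and move size are illustrative assumptions.

```python
# Simulated annealing for minimizing f(x) = x**2, starting from x = 8.
# Worse moves are accepted with a probability that shrinks as the
# "temperature" cools, mimicking the annealing of metals.
import math
import random

def anneal(f, x, temp=10.0, cooling=0.95, iters=500, seed=0):
    rng = random.Random(seed)
    best = x
    for _ in range(iters):
        candidate = x + rng.uniform(-1.0, 1.0)    # random neighbor
        delta = f(candidate) - f(x)
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temp).
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            x = candidate
        if f(x) < f(best):
            best = x
        temp *= cooling
    return best

x_best = anneal(lambda x: x * x, 8.0)   # ends near the minimum at 0
```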
Most of these techniques require a large number of evaluations of the objectives and the
constraints. The disciplinary models are often very complex and can take a significant
amount of time for a single evaluation. The solution can therefore be extremely time-consuming.
Many of the optimization techniques are adaptable to parallel computing. Much of the current
research is focused on methods of decreasing the computation time.
The following steps summarize the general procedure used to formulate and solve
optimization problems. Some problems may not require that the engineer follow the steps in
the exact order, but each of the steps should be considered in the process.
1) Analyze the process itself to identify the process variables and specific characteristics
of interest, i.e., make a list of all the variables.
2) Determine the criterion for optimization and specify the objective function in terms of
the above variables together with coefficients.
3) Develop via mathematical expressions a valid process model that relates the input-output
variables of the process and associated coefficients. Include both equality and
inequality constraints. Use well-known physical principles such as mass balances,
energy balances, empirical relations, implicit concepts, and external restrictions.
Identify the independent and dependent variables to get the number of degrees of
freedom.
4) If the problem formulation is too large in scope:
break it up into manageable parts, or
simplify the objective function and the model
5) Apply a suitable optimization technique to the mathematical statement of the problem.
6) Examine the sensitivity of the result to changes in the values of the parameters in the
problem and the assumptions.
The classical optimization techniques are useful in finding the optimum solution, i.e., the
unconstrained maxima or minima, of continuous and differentiable functions. These are
analytical methods and make use of differential calculus in locating the optimum solution.
The classical methods have limited scope in practical applications, as practical problems
often involve objective functions that are not continuous and/or differentiable. Yet the study
of these classical techniques of optimization forms a basis for developing most of the
numerical techniques that have evolved into advanced techniques more suitable to today's
practical problems.
These methods assume that the function is twice differentiable with respect to the
design variables and that the derivatives are continuous. Three main types of problems can be
handled by the classical optimization techniques, viz., single-variable functions, multivariable
functions with no constraints, and multivariable functions with both equality and inequality
constraints. For problems with equality constraints, the Lagrange multiplier method can be
used. If the problem has inequality constraints, the Kuhn-Tucker conditions can be used to
identify the optimum solution. These methods lead to a set of nonlinear simultaneous
equations that may be difficult to solve. The other methods of optimization include:
• Linear programming: studies the case in which the objective function f is linear and
the set A is specified using only linear equalities and inequalities. (A is the design
variable space)
• Integer programming: studies linear programs in which some or all variables are
constrained to take on integer values.
• Quadratic programming: allows the objective function to have quadratic terms,
while the set A must be specified with linear equalities and inequalities.
• Stochastic programming: studies the case in which some of the constraints or
parameters depend on random variables.
• Dynamic programming: studies the case in which the optimization strategy is based
on splitting the problem into smaller sub-problems.
• Combinatorial optimization: is concerned with problems where the set of feasible
solutions is discrete or can be reduced to a discrete one.
• Infinite-dimensional optimization: studies the case when the set of feasible solutions
is a subset of an infinite-dimensional space, such as a space of functions.
• Constraint satisfaction: studies the case in which the objective function f is constant
(this is used in artificial intelligence, particularly in automated reasoning).
Most of these techniques will be discussed in subsequent modules.
• Genetic algorithms
A genetic algorithm (GA) is a search technique used in computer science to find
approximate solutions to optimization and search problems. Specifically, it falls into
the category of local search techniques and is therefore generally an incomplete
search. Genetic algorithms are a particular class of evolutionary algorithms that use
techniques inspired by evolutionary biology such as inheritance, mutation, selection,
and crossover (also called recombination).
Genetic algorithms are typically implemented as a computer simulation in which a
population of abstract representations (called chromosomes) of candidate solutions
(called individuals) to an optimization problem evolves toward better solutions.
Traditionally, solutions are represented in binary as strings of 0s and 1s, but other
encodings are also possible. The evolution starts from a population of completely
random individuals and occurs in generations. In each generation, the fitness of the
whole population is evaluated, multiple individuals are stochastically selected from
the current population (based on their fitness), and modified (mutated or recombined)
to form a new population. The new population is then used in the next iteration of the
algorithm.
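The generational loop described above can be sketched in a compact GA. The fitness function here (counting 1-bits in the string, the classic "OneMax" toy problem) and all parameters are illustrative assumptions.

```python
# Minimal genetic algorithm: binary chromosomes, fitness evaluation,
# fitness-based (tournament) selection, one-point crossover, mutation.
import random

def run_ga(bits=16, pop_size=20, generations=60, mutation=0.02, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    fitness = sum                            # number of 1s in the string
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            # Tournament selection of two parents based on fitness.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, bits)     # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with a small mutation probability.
            child = [b ^ 1 if rng.random() < mutation else b
                     for b in child]
            new_pop.append(child)
        pop = new_pop                        # next generation
    return max(pop, key=fitness)

best = run_ga()   # evolves toward the all-ones string
```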
Stationary points
For a continuous and differentiable function f(x), a stationary point x* is a point at which the
slope of the function vanishes, i.e. f'(x*) = 0, where x* belongs to the domain of definition
of f(x).
Relative and Global Optimum
A function is said to have a relative or local minimum at x = x* if f (x*) ≤ f (x* + h) for all
sufficiently small positive and negative values of h, i.e. in the near vicinity of the point x*.
Similarly, a point x* is called a relative or local maximum if f(x*) ≥ f(x* + h) for all values
of h sufficiently close to zero.
A function is said to have a global or absolute minimum at x = x* if f(x*) ≤ f(x) for all x in
the domain over which f(x) is defined. Similarly, it is said to have a global or absolute
maximum at x = x* if f(x*) ≥ f(x) for all x in the domain over which f(x) is defined.
Figure 2 shows the global and local optimum points.
Functions of a single variable
Consider the function f(x) defined for a ≤ x ≤ b. We wish to find the value x* ∈ [a, b] at
which f(x) attains its optimum.
Necessary condition: For a single-variable function f(x) defined for x ∈ [a, b] which has a
relative maximum at x = x*, x* ∈ [a, b], if the derivative f'(x) = df(x)/dx exists as a finite
number at x = x*, then f'(x*) = 0.
Sufficient condition: Let f'(x*) = f''(x*) = … = f^(n-1)(x*) = 0, but f^(n)(x*) ≠ 0. Then f(x*)
is a minimum value of f(x) if n is even and f^(n)(x*) is positive, and a maximum value if n is
even and f^(n)(x*) is negative, with f(x) concave around x*. When n is odd, h^n changes sign
with the change in the sign of h and hence the point x* is neither a maximum nor a minimum.
In this case the point x* is called a point of inflection.
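The n-even/n-odd rule above can be checked on two classic examples at x* = 0: f(x) = x**4, whose first nonzero derivative there is the fourth (n even and positive, hence a minimum), and f(x) = x**3, whose first nonzero derivative is the third (n odd, hence a point of inflection). The derivative values below are written out analytically.

```python
# Classify a stationary point from the list of derivative values
# f'(x*), f''(x*), f'''(x*), ... at the point, per the rule above.

def classify_stationary(derivatives):
    # derivatives[k] holds the (k+1)-th derivative of f at x*.
    for n, value in enumerate(derivatives, start=1):
        if value != 0:
            if n % 2 == 1:
                return "point of inflection"   # n odd
            return "minimum" if value > 0 else "maximum"
    return "undetermined"

# f(x) = x**4 at 0: f' = f'' = f''' = 0, f'''' = 24 > 0, n = 4 even
print(classify_stationary([0, 0, 0, 24]))   # minimum
# f(x) = x**3 at 0: f' = f'' = 0, f''' = 6, n = 3 odd
print(classify_stationary([0, 0, 6]))       # point of inflection
```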
Necessary conditions
Perturbations from points of local minima in any direction
result in an increase in the response function f(x), i.e. the slope of the function is zero at this
point of local minima. Similarly, at maxima and points of inflection as the slope is zero, the
first derivatives of the function with respect to the variables are zero.
Sufficient conditions
Consider the following second-order derivatives of a function f(x1, x2):
∂²f/∂x1², ∂²f/∂x1∂x2, ∂²f/∂x2²
The Hessian matrix H is made using these second-order derivatives:
H = [ ∂²f/∂x1²     ∂²f/∂x1∂x2
      ∂²f/∂x2∂x1   ∂²f/∂x2²  ]
a) If H is positive definite, then the point X = [x1, x2] is a point of local minima.
b) If H is negative definite, then the point X = [x1, x2] is a point of local maxima.
c) If H is neither, then the point X = [x1, x2] is neither a point of maxima nor minima.
A square matrix is positive definite if all its eigenvalues are positive, and it is negative
definite if all its eigenvalues are negative. If some of the eigenvalues are positive and some
negative, then the matrix is neither positive definite nor negative definite.
To calculate the eigenvalues λ of a square matrix A, the following characteristic equation is
solved:
|A − λI| = 0
The above rules give the sufficient conditions for the optimization problem of two variables.
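The eigenvalue test above can be carried out in closed form for two variables, since the characteristic equation of a symmetric 2×2 matrix is a quadratic. The illustrative function f(x1, x2) = x1² + x1·x2 + 2·x2² has the constant Hessian H = [[2, 1], [1, 4]].

```python
# Classify a stationary point of a two-variable function from the
# eigenvalues of its symmetric 2x2 Hessian [[a, b], [b, d]].
import math

def eig2x2_symmetric(a, b, d):
    # Roots of |H - lambda*I| = 0 for a symmetric 2x2 matrix:
    # lambda = (a + d)/2 +/- sqrt(((a - d)/2)**2 + b**2)
    mean = (a + d) / 2.0
    disc = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return mean - disc, mean + disc

def classify(a, b, d):
    lo, hi = eig2x2_symmetric(a, b, d)
    if lo > 0:
        return "local minimum"      # all eigenvalues positive
    if hi < 0:
        return "local maximum"      # all eigenvalues negative
    return "neither"                # mixed signs (or zero)

print(classify(2.0, 1.0, 4.0))      # local minimum
```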
Kuhn-Tucker Conditions
It was previously established that for both an unconstrained optimization problem and an
optimization problem with an equality constraint the first-order conditions are sufficient for a
global optimum when the objective and constraint functions satisfy appropriate
concavity/convexity conditions. The same is true for an optimization problem with inequality
constraints.
The Kuhn-Tucker conditions are both necessary and sufficient if the objective function is
concave and each constraint is linear or each constraint function is concave, i.e. the problems
belong to a class called the convex programming problems.
Consider the following optimization problem:
Minimize f(X) subject to gj(X) ≤ 0, j = 1, 2, …, m
The Kuhn-Tucker conditions for a point X* to be a local minimum can be stated as
∂f/∂xi + Σj λj ∂gj/∂xi = 0,   i = 1, 2, …, n
λj gj(X*) = 0,   j = 1, 2, …, m
gj(X*) ≤ 0,   j = 1, 2, …, m
λj ≥ 0,   j = 1, 2, …, m        (1)
In the case of minimization problems, if the constraints are of the form gj(X) ≥ 0, then the λj
have to be nonpositive in (1). On the other hand, if the problem is one of maximization with
the constraints in the form gj(X) ≥ 0, then the λj have to be nonnegative.
It may be noted that sign convention has to be strictly followed for the Kuhn-Tucker
conditions to be applicable.
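The Kuhn-Tucker conditions can be verified numerically at a candidate point. The illustrative problem below is: minimize f(X) = x1² + x2² subject to g(X) = 2 − x1 − x2 ≤ 0, whose optimum X* = (1, 1) has multiplier λ = 2; the problem and candidate are hypothetical examples, not from the text.

```python
# Check stationarity, feasibility, complementary slackness, and the
# sign condition (lambda >= 0 for the g <= 0 convention used above).

def kt_satisfied(x1, x2, lam, tol=1e-9):
    grad_f = (2.0 * x1, 2.0 * x2)        # gradient of f = x1^2 + x2^2
    grad_g = (-1.0, -1.0)                # gradient of g = 2 - x1 - x2
    g = 2.0 - x1 - x2
    stationarity = all(abs(df + lam * dg) <= tol
                       for df, dg in zip(grad_f, grad_g))
    feasibility = g <= tol
    complementary = abs(lam * g) <= tol
    sign = lam >= 0
    return stationarity and feasibility and complementary and sign

print(kt_satisfied(1.0, 1.0, 2.0))   # True: (1, 1) is the optimum
print(kt_satisfied(0.0, 0.0, 0.0))   # False: the point is infeasible
```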