Lecture Note On Optimization
Introduction
"What's new?" is an interesting and broadening eternal question, but one which, if pursued exclusively, results only in an endless parade of trivia and fashion, the silt of tomorrow. I would like, instead, to be concerned with the question "What is best?", a question which cuts deeply rather than broadly, a question whose answers tend to move the silt downstream.

Robert M. Pirsig, Zen and the Art of Motorcycle Maintenance (1974)

Mathematical optimization is the formal title given to the branch of computational science that seeks to answer the question "What is best?" for problems in which the quality of any answer can be expressed as a numerical value. Such problems arise in all areas of mathematics, the physical, chemical and biological sciences, engineering, architecture, economics, and management, and the range of techniques available to solve them is nearly as wide. The primary objective of this course is to provide a broad overview of standard optimization techniques and their application to practical problems.
Aims
When using optimization techniques, one must:

- understand clearly where optimization fits into the problem;
- be able to formulate a criterion for optimization;
- know how to simplify a problem to the point at which formal optimization is a practical proposition;
- have sufficient understanding of the theory of optimization to select an appropriate optimization strategy, and to evaluate the results which it returns.
1. Definitions
The goal of an optimization problem can be stated as follows: find the combination of parameters (independent variables) which optimizes a given quantity, possibly subject to some restrictions on the allowed parameter ranges. The quantity to be optimized (maximized or minimized) is termed the objective function; the parameters which may be changed in the quest for the optimum are called control or decision variables; the restrictions on allowed parameter values are known as constraints. A maximum of a function f is a minimum of -f. Thus, the general optimization problem may be stated mathematically as:

    minimize    f(x),         x = (x1, x2, ..., xn)^T
    subject to  ci(x) = 0,    i = 1, 2, ..., m'                    (1.1)
                ci(x) ≥ 0,    i = m' + 1, ..., m,

where f(x) is the objective function, x is the column vector of the n independent control variables, and {ci(x)} is the set of constraint functions. Constraint equations of the form ci(x) = 0 are termed equality constraints, and those of the form ci(x) ≥ 0 are inequality constraints. Taken together, f(x) and {ci(x)} are known as the problem functions. If inequality constraints are simply a restriction on the allowed values of a control variable, e.g. minimum and maximum possible dimensions,

    xi_min ≤ xi ≤ xi_max,                                          (1.2)

these are known as bounds.
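To make the notation of (1.1) concrete, here is a minimal sketch (not part of the original notes) of how a small constrained problem might be posed and solved numerically; it assumes NumPy and SciPy, and the particular objective, constraints and starting point x0 are arbitrary illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

# Objective function f(x) of the control variables x = (x1, x2)^T.
def f(x):
    return (x[0] - 1.0)**2 + (x[1] - 2.0)**2

# Constraints in the form of (1.1): one equality c1(x) = 0 and one
# inequality c2(x) >= 0 (SciPy uses the same "c(x) >= 0" convention).
constraints = [
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - 2.0},  # c1(x) = x1 + x2 - 2 = 0
    {"type": "ineq", "fun": lambda x: x[0]},               # c2(x) = x1 >= 0
]

result = minimize(f, x0=np.array([0.0, 0.0]), constraints=constraints)
print(result.x, result.fun)   # optimal control variables and objective value
```

When constraints are supplied in this way, SciPy selects a sequential quadratic programming routine (SLSQP) by default.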
2. Classifications
There are many optimization algorithms available to the engineer, and many methods are appropriate only for certain types of problem. Thus, it is important to be able to recognize the characteristics of a problem in order to identify an appropriate solution technique. Within each class of problems there are different minimization methods, varying in computational requirements, convergence properties, and so on. Optimization problems are classified according to the mathematical characteristics of the objective function, the constraints and the control variables. Probably the most important characteristic is the nature of the objective function. If the relationship between f(x) and the control variables is of a particular form, such as linear, e.g.

    f(x) = b^T x + c,                                              (2.1)

where b is a constant-valued vector and c is a constant, or quadratic, e.g.

    f(x) = x^T A x + b^T x + c,                                    (2.2)
where A is a constant-valued matrix, special methods exist that are guaranteed to locate the optimal solution very efficiently. These, along with other classifications, are summarized in Table 2.1.

Table 2.1: Optimization Problem Classifications.

    Characteristic                 Property                                              Classification
    Number of control variables    One                                                   Univariate
                                   More than one                                         Multivariate
    Type of control variables      Continuous real numbers                               Continuous
                                   Integers                                              Integer or Discrete
                                   Both continuous real numbers and integers             Mixed Integer
    Problem functions              Linear functions of the control variables             Linear
                                   Quadratic functions of the control variables          Quadratic
                                   Other nonlinear functions of the control variables    Nonlinear
    Problem formulation            Subject to constraints                                Constrained
                                   Not subject to constraints                            Unconstrained
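As a concrete illustration of the special forms (2.1) and (2.2), the following minimal NumPy sketch evaluates a linear and a quadratic objective; the values of A, b and c are arbitrary choices, not taken from the notes.

```python
import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])    # constant-valued matrix (quadratic term)
b = np.array([1.0, -3.0])     # constant-valued vector (linear term)
c = 4.0                       # constant scalar

def f_linear(x):
    """Linear objective f(x) = b^T x + c, as in (2.1)."""
    return b @ x + c

def f_quadratic(x):
    """Quadratic objective f(x) = x^T A x + b^T x + c, as in (2.2)."""
    return x @ A @ x + b @ x + c

x = np.array([0.5, 1.5])
print(f_linear(x), f_quadratic(x))
```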
3. Problem Formulation
The optimal solution to a problem is the one with the best combination of the parameters which fulfils the functions set out in the problem specifications. In engineering, optimization is often performed in the context of design, where "best" frequently means the cheapest solution that does the job. The overall aim is thus to minimize the cost (or some other performance attribute) while maximizing the functionality. The task is to formulate the optimization problem in precise mathematical terms and then to solve for the optimal solution. The five main steps are:

1. Create a basic configuration
2. Identify the decision variables
3. Establish the objective function
4. Identify any constraints
5. Select and apply an optimization method
Numerical methods. An initial trial solution is selected, either using common sense or at random, and the objective function is evaluated. A move is made to a new point (the second trial solution) and the objective function is evaluated again. If it is smaller than the value for the first trial solution, the new point is retained and another move is made. The process is repeated until the minimum is found. Search methods are used when: the number of variables and constraints is large; the problem functions (objective and constraint) are highly nonlinear; or the problem functions (objective and constraint) are implicit in terms of the decision/control variables, making the evaluation of derivative information difficult. The most appropriate method will depend on the type (classification) of problem to be solved.
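The trial-and-move idea can be sketched in a few lines. The following is an illustrative random local search (not a method prescribed by these notes), assuming NumPy; the objective and starting point are arbitrary.

```python
import numpy as np

def random_search(f, x0, step=0.1, iters=2000, seed=0):
    """Propose random trial moves and keep a move only if it reduces f."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(iters):
        trial = x + step * rng.standard_normal(x.shape)   # new trial solution
        f_trial = f(trial)
        if f_trial < fx:          # retain the move only if it improves the objective
            x, fx = trial, f_trial
    return x, fx

# Example: minimise a simple quadratic bowl with its minimum at (3, 3).
x_best, f_best = random_search(lambda x: np.sum((x - 3.0)**2), x0=[0.0, 0.0])
print(x_best, f_best)
```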
Step 4: Constraints
4. Optimality Conditions
4.1 Definitions and Background Theory
The goal of the optimization process is to:

    minimize f(x) subject to x ∈ S,                                (4.1)
where S is the set of feasible values (the feasible region) for x defined by the constraint equations (if any), i.e. the values of x for which all constraints are satisfied. Obviously, for an unconstrained problem S is infinitely large.

The gradient vector of f(x),

    g(x) = ∇f(x) = (∂f/∂x1, ∂f/∂x2, ..., ∂f/∂xn)^T,                (4.2)

denotes the direction in which the function will increase most per unit distance travelled. The Hessian of f(x) is an n × n symmetric matrix giving the spatial variation of the gradient:

    H(x) = ∇²f(x) = [ ∂²f/∂x1²      ...   ∂²f/∂x1∂xn ]
                    [     ...       ...       ...     ]            (4.3)
                    [ ∂²f/∂xn∂x1    ...   ∂²f/∂xn²    ]
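When analytic derivatives are awkward to obtain, g(x) and H(x) can be estimated numerically. The following finite-difference sketch (an added illustration, assuming NumPy; the step sizes and test function are arbitrary) shows one simple way to do this.

```python
import numpy as np

def grad(f, x, h=1e-6):
    """Central-difference estimate of the gradient vector g(x)."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

def hessian(f, x, h=1e-4):
    """Finite-difference estimate of the n x n symmetric Hessian H(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        H[:, i] = (grad(f, x + e) - grad(f, x - e)) / (2.0 * h)
    return 0.5 * (H + H.T)   # symmetrise to suppress rounding asymmetry

f = lambda x: x[0]**2 + 3.0 * x[0] * x[1] + 2.0 * x[1]**2
print(grad(f, [1.0, 1.0]))      # approximately [5, 7]
print(hessian(f, [1.0, 1.0]))   # approximately [[2, 3], [3, 4]]
```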
d is a feasible direction at x if an arbitrarily small move from x in the direction d remains in the feasible region, i.e. if there exists an ε0 > 0 such that

    x + εd ∈ S    for all 0 < ε < ε0.                              (4.4)

[Figure: feasible directions at a point on a constraint boundary; directions crossing the boundary lead into the infeasible space.]
An absolute or global minimum exists at x* if

    f(x*) ≤ f(y)    for all y ∈ S, y ≠ x*.                         (4.5)

If we can replace ≤ with < then this is a strict or strong global minimum.
A relative or weak local minimum exists at x* if an arbitrarily small move from x* in any feasible direction results in f(x) either staying constant or increasing, i.e.

    f(x*) ≤ f(y)    for all y = x* + εd ∈ S, y ≠ x*.               (4.6)
If f(x) is a smooth function with continuous first and second derivatives for all feasible x, then a point x* is a stationary point of f(x) if

    g(x*) = ∇f(x*) = 0.                                            (4.7)
Figure 4.1 illustrates the different types of stationary points for unconstrained univariate functions.

Figure 4.1: Types of Minima for Unconstrained Optimization Problems (f(x) with a weak local minimum, strong local minima and the global minimum).

As shown in Figure 4.2, the situation is slightly more complex for constrained optimization problems. The presence of a constraint boundary, in Figure 4.2 in the form of a simple bound on the permitted values of the control variable, can cause the global minimum to be an extreme value, an extremum (i.e. an endpoint), rather than a true stationary point.

Figure 4.2: Types of Minima for Constrained Optimization Problems (f(x) divided by a constraint into feasible and infeasible regions, with strong local minima and the global minimum).

A function is unimodal if it has a single minimum value, with a path from every other feasible point to the minimum for which the gradient is < 0. An example is shown in Figure 4.3.

Figure 4.3: An Example of a Unimodal Function (Rosenbrock's Function f(x) = 100 (x2 - x1²)² + (1 - x1)²), plotted over x1 and x2 with the global minimum marked.
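As a quick check of the stationary-point condition (4.7) on this example (an added sketch, assuming NumPy; the analytic derivatives below are the standard ones for Rosenbrock's function):

```python
import numpy as np

def rosenbrock(x):
    """f(x) = 100 (x2 - x1^2)^2 + (1 - x1)^2"""
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

def rosenbrock_grad(x):
    """Analytic gradient of Rosenbrock's function."""
    return np.array([
        -400.0 * x[0] * (x[1] - x[0]**2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0]**2),
    ])

x_star = np.array([1.0, 1.0])     # the global minimum
print(rosenbrock(x_star))         # 0.0
print(rosenbrock_grad(x_star))    # [0. 0.]  -> g(x*) = 0, so x* is a stationary point
```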
A function is strongly unimodal if a straight line in control variable space from every point to the minimum has a gradient < 0. An example is shown in Figure 4.4.
Figure 4.4: An Example of a Strongly Unimodal Function (global minimum marked).
A first-order necessary condition for a minimum at x* is

    ∇f(x*)^T d ≥ 0                                                 (4.8)

for all possible feasible directions d at x*, since any move from the minimum must cause the function to increase or stay the same. ∇f(x*)^T d = 0 if x* is an interior point, a point not on the boundary of the feasible region. Clearly, this condition also applies to a maximum and to a saddle point, since the gradient at both these points is zero. Thus this condition is necessary but not sufficient for a local minimum. A second-order condition is needed.

4.2.1 Sufficient Condition for a Minimum

Any continuous function can be approximated in the neighbourhood of any point by a Taylor series. This series can thus be used to establish necessary and sufficient criteria for a minimum. For the single variable case:
    f(x) = f(x*) + (x - x*) f'(x*) + ½ (x - x*)² f''(x*) + R,      (4.9)

where R is the remainder of the expansion, and is small compared with the first terms of the series, f' = df/dx and f'' = d²f/dx². Let x = x* + d and f(x) = f(x*) + Δf.
If x* is a minimum and an interior point, then f'(x*) = 0 and Δf ≥ 0. Thus, from the Taylor series (neglecting R):

    Δf = ½ d² f''(x*).                                             (4.10)

Hence

    f''(x*) ≥ 0

is a sufficient second-order condition for a minimum. (If f''(x*) > 0 then it is a strict or strong minimum.) For a function of several variables, the Taylor series has the form

    f(x) = f(x*) + ∇f(x*)^T (x - x*) + ½ (x - x*)^T H(x*) (x - x*) + R.   (4.11)

Let x* be an interior minimum, x = x* + d and f(x) = f(x*) + Δf. Then f(x) ≥ f(x*) and ∇f(x*) = 0. Thus, from the Taylor series (neglecting R):

    f(x*) + Δf = f(x*) + ½ d^T H(x*) d,

so that

    d^T H(x*) d ≥ 0.

Thus, the conditions for a local minimum are:

    ∇f(x*)^T d ≥ 0                                                 (4.8)
    d^T H(x*) d ≥ 0                                                (4.12)
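As a brief worked illustration of these two conditions (an added example, not from the original notes): for f(x) = x1² + 3 x1 x2 + 4 x2², the gradient ∇f(x) = (2 x1 + 3 x2, 3 x1 + 8 x2)^T vanishes only at x* = (0, 0), and the constant Hessian H = [2 3; 3 8] has positive leading minors (2 and 2×8 - 3² = 7), so d^T H d > 0 for every d ≠ 0. Both (4.8) and (4.12) therefore hold, and x* is a strict local minimum; since f is quadratic, it is also the global minimum.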
Equation (4.12) implies that the function is convex at x*, and will be satisfied if the Hessian H(x*) is a positive definite matrix.

4.2.2 Convex Functions

A function is said to be convex at x* if for all arbitrarily small moves δx = x - x* the value of the function is greater than or equal to the first two terms of the Taylor expansion. If the function is convex everywhere, i.e. if for all x and y

    f(y) ≥ f(x) + ∇f(x)^T (y - x),                                 (4.13)

then f is called a convex function and x* is a global minimum. An example of a convex function is shown in Figure 4.5.
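A minimal sketch (an added illustration, assuming NumPy; the function and points are arbitrary) of testing the inequality (4.13) at one pair of points for a simple convex function:

```python
import numpy as np

def tangent_bound_holds(f, grad_f, x, y):
    """Check f(y) >= f(x) + grad_f(x)^T (y - x), as in (4.13), at one pair of points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return f(y) >= f(x) + grad_f(x) @ (y - x)

# Convex example: f(x) = x1^2 + x2^2 with gradient (2 x1, 2 x2)^T.
f = lambda x: x[0]**2 + x[1]**2
grad_f = lambda x: np.array([2.0 * x[0], 2.0 * x[1]])

print(tangent_bound_holds(f, grad_f, [1.0, -2.0], [3.0, 0.5]))   # True
```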
Figure 4.5: An Example of a Convex Function (points x and y, with f(x), f(y) and the separation y - x marked).

4.2.3 Positive Definite Hessian

The Hessian H(x*) is a positive definite matrix if the determinant of each leading principal minor matrix of H is positive. For example, a 4 × 4 Hessian matrix will be positive definite if the determinant of each of these nested upper-left submatrices is positive:

    [X]    [X X]    [X X X]    [X X X X]
           [X X]    [X X X]    [X X X X]
                    [X X X]    [X X X X]
                               [X X X X]
If H(x*) is positive definite then all its eigenvalues are positive. If |H(x*)| = 0 then higher-order terms need to be evaluated to determine whether x* is a local minimum or not. The Hessian of a quadratic function is the same everywhere. So, for a quadratic, a strict local minimum at an interior point is a global minimum.
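The two equivalent checks of positive definiteness, via the leading principal minor determinants and via the eigenvalues, can be carried out numerically; a small added sketch assuming NumPy:

```python
import numpy as np

def is_positive_definite(H):
    """Return (minor_test, eigenvalue_test) for a symmetric matrix H."""
    H = np.asarray(H, dtype=float)
    n = H.shape[0]
    # Sylvester's criterion: all leading principal minors must be positive.
    minor_test = all(np.linalg.det(H[:k, :k]) > 0 for k in range(1, n + 1))
    # Equivalent test: all eigenvalues must be positive.
    eigenvalue_test = bool(np.all(np.linalg.eigvalsh(H) > 0))
    return minor_test, eigenvalue_test

print(is_positive_definite(np.array([[2.0, 3.0],
                                     [3.0, 8.0]])))   # (True, True)
print(is_positive_definite(np.array([[2.0, 3.0],
                                     [3.0, 4.0]])))   # (False, False): determinant is -1
```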