Chapter 4. Optimization

The document outlines the concepts of optimization, including minimization and maximization, and the interpretation of first and second derivatives. It discusses critical points, conditions for maxima and minima, and the differences between relative and absolute extrema. Additionally, it covers unconstrained and constrained optimization, including the use of Lagrange multipliers for equality constraints.



Outline
General Ideas of Optimization
Interpreting the First Derivative
Interpreting the Second Derivative
Unconstrained Optimization
Constrained Optimization

Optimization
There are two ways of examining optimization.
Minimization
 In this case you are looking for the lowest point on the
function.
Maximization
 In this case you are looking for the highest point on the
function.

[Figure: parabola y = f(x) = x² – 8x + 20, with y-intercept 20 and its minimum at x = 4.]
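As a quick numerical check of the questions on the next few slides, the short sketch below (a Python/sympy example added for illustration; sympy is an assumed dependency, not part of the original slides) evaluates the slope of this parabola to the left of, at, and to the right of its minimum at x = 4.

import sympy as sp

x = sp.symbols('x')
f = x**2 - 8*x + 20              # the parabola from the figure above
fprime = sp.diff(f, x)           # f'(x) = 2x - 8

# slope to the left of, at, and to the right of the minimum x* = 4
for point in (2, 4, 6):
    print(point, fprime.subs(x, point))   # prints -4, 0, 4: negative, zero, positive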

Questions Regarding the Minimum
What is the sign of the slope when you are to the
left of the minimum point?
Another way of saying this is what is f’(x) when x < x*?
Note: x* denotes the point where the function is at a
minimum.

Questions Regarding the Minimum Cont.
What is the sign of the slope when you are to the
right of the minimum point?
Another way of saying this is what is f’(x) when x > x*?
What is the sign of the slope when you are at the
minimum point?
Another way of saying this is what is f’(x) when x = x*?

[Figure: parabola y = f(x) = –x² + 8x, with maximum value 16 at x = 4 and an x-intercept at x = 8.]

Questions Regarding the Maximum
What is the sign of the slope when you are to the
left of the maximum point?
Another way of saying this is what is f’(x) when x < x*?
Note: x* denotes the point where the function is at a
maximum.

Questions Regarding the Maximum
Cont.
What is the sign of the slope when you are to the
right of the maximum point?
Another way of saying this is what is f’(x) when x > x*?
What is the sign of the slope when you are at the
maximum point?
Another way of saying this is what is f’(x) when x = x*?

Interpreting the First Derivative
The first derivative of a function, as shown previously, is the slope of the curve evaluated at a particular point.
In essence it tells you the instantaneous rate of change of the function at that point.
Knowing the slope of the function can tell you where a
maximum or a minimum exists on a curve.
 Why?

Defining Critical Point
A point x* on a function is said to be a critical point if the derivative of the function evaluated at x* is zero, i.e., f'(x*) = 0.

Question
Can the derivative tell you whether you are at a
maximum or a minimum?
The answer is yes if you examine the slope of the function
around the critical point, i.e., the point where the
derivative is zero.
An easier way to determine whether you have a maximum or a minimum is to examine the second derivative of the function.

The Second Derivative
The second derivative of a function f(x) is the
derivative of the function f’(x), where f’(x) is the
derivative of f(x).
The second derivative can tell you whether the function is
concave or convex at the critical point.
The second derivative can be denoted by f’’(x).

Concavity and the Second
Derivative
The maximum of a function f(x) occurs when a
critical point x* is at a concave portion of the
function.
This is equivalent to saying that f’’(x*) < 0.
If f’’(x) < 0 for all x, then the function is said to be
concave.

Convexity and the Second
Derivative
The minimum of a function f(x) occurs when a
critical point x* is at a convex portion of the
function.
This is equivalent to saying that f’’(x*) > 0.
If f’’(x) > 0 for all x, then the function is said to be
convex.

Special Case of the Second
Derivative
Suppose you have a function f(x) with a critical point at x*.
What does it mean when the second derivative is equal to zero, i.e., f''(x*) = 0?
At such a point the second derivative cannot tell you whether you have a maximum or a minimum.
Often in this case the critical point is an inflection (saddle) point, which is neither a maximum nor a minimum.

Example of Special Case of the
Second Derivative
Suppose y = f(x) = x³, then f'(x) = 3x² and f''(x) = 6x.
This implies that x* = 0 and f''(x* = 0) = 0.

[Figure: graph of y = f(x) = x³, which has a critical point at x = 0 that is neither a maximum nor a minimum.]
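A small sympy check (an illustrative sketch, not from the original slides) confirms that x* = 0 is a critical point of f(x) = x³ at which the second derivative also vanishes, so the second-derivative test is inconclusive.

import sympy as sp

x = sp.symbols('x')
f = x**3
fprime = sp.diff(f, x)           # 3x^2
fsecond = sp.diff(f, x, 2)       # 6x

print(sp.solve(fprime, x))            # [0]: the only critical point is x* = 0
print(fsecond.subs(x, 0))             # 0: the second-derivative test is inconclusive
print(f.subs(x, -1), f.subs(x, 1))    # -1 and 1: f changes sign, so x* = 0 is neither a max nor a min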

Unconstrained Optimization
An unconstrained optimization problem is one
where you only have to be concerned with the
objective function you are trying to optimize.
An objective function is a function that you are trying to
optimize.
 None of the variables in the objective function are
constrained.

First and Second Order Condition
For a Maximum
The first order condition for a maximum at a point
x* on the function f(x) is when f’(x*) = 0.
The second order condition for a maximum at a
point x* on the function f(x) is when f’’(x*) < 0.

First and Second Order Condition
For a Minimum
The first order condition for a minimum at a point
x* on the function f(x) is when f’(x*) = 0.
The second order condition for a minimum at a
point x* on the function f(x) is when f’’(x*) > 0.

Example of Using First and Second
Order Conditions
Suppose you have the following function:
f(x) = x³ – 6x² + 9x
Then the first order condition to find the critical points is:
f'(x) = 3x² – 12x + 9 = 0
This implies that the critical points are at x = 1 and x = 3.

Example of Using First and Second
Order Conditions Cont.
The next step is to determine whether the critical points are maxima or minima.
This can be determined using the second order condition.
f''(x) = 6x – 12 = 6(x – 2)

Example of Using First and Second
Order Conditions Cont.
Testing x = 1 implies:
f''(1) = 6(1 – 2) = –6 < 0.
Hence at x = 1, we have a maximum.
Testing x = 3 implies:
f''(3) = 6(3 – 2) = 6 > 0.
Hence at x = 3, we have a minimum.
Are these the absolute maximum and minimum of the function f(x)?
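The same calculation can be reproduced symbolically; the sketch below (again assuming sympy is available) finds the critical points of f(x) = x³ – 6x² + 9x and classifies each one with the second derivative.

import sympy as sp

x = sp.symbols('x')
f = x**3 - 6*x**2 + 9*x
fprime = sp.diff(f, x)               # 3x^2 - 12x + 9
fsecond = sp.diff(f, x, 2)           # 6x - 12

for xstar in sp.solve(fprime, x):    # critical points: x = 1 and x = 3
    curvature = fsecond.subs(x, xstar)
    kind = "maximum" if curvature < 0 else "minimum"
    print(xstar, curvature, kind)    # 1 -> -6 (maximum), 3 -> 6 (minimum)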

Relative Vs. Absolute Extremum
A relative extremum is a point that is locally greater than or less than all points around it.
A relative extremum can be found by using the first order condition.
An absolute extremum is a point that is either
absolutely greater than or less than all other
points, i.e., f(x*) > f(x) for all x not equal to x* for a
maximum and f(x*) < f(x) for all x not equal to x*
for a minimum.

Finding the Absolute Extremum
To find the absolute extremum, you need to compare all the critical points on the function, as well as any potential end points of the function such as ∞ and –∞.
When evaluating a polynomial function at ∞, the value of the function is determined by its highest-order term.

Finding the Absolute Extremum Cont.
Some properties of ∞:
∞ + ∞ = ∞
∞ - ∞ is undefined
c·∞ = ∞, where c is any value greater than zero
∞ * ∞ = ∞
∞ * (-∞ ) = -∞
From the previous example, the candidate extremum points occur at x = –∞, 1, 3, and ∞.
The absolute maximum occurs as x → ∞ and the absolute minimum as x → –∞; that is, the function is unbounded in both directions, so x = 1 and x = 3 are only relative extrema.
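The behaviour at the "end points" ±∞ can be checked with symbolic limits (same sympy assumption as above); the highest-order term x³ dominates, so the function is unbounded in both directions.

import sympy as sp

x = sp.symbols('x')
f = x**3 - 6*x**2 + 9*x

print(sp.limit(f, x, sp.oo))     # oo: unbounded above as x -> infinity
print(sp.limit(f, x, -sp.oo))    # -oo: unbounded below as x -> -infinity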

Unconstrained Optimization: Two
Variables
Suppose you have a function y = f(x1,x2), then to
find the critical points, you can use the
following first order condition:

f_x1 = ∂f(x1*, x2*)/∂x1 = 0
f_x2 = ∂f(x1*, x2*)/∂x2 = 0

Unconstrained Optimization: Two
Variables Cont.
The second order conditions are more complex: you have to examine the second derivative with respect to each of the variables, as well as the cross derivative.
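For a concrete two-variable illustration (the function below is a hypothetical example chosen for this sketch, not one from the slides), the first order conditions can be solved symbolically and the second order conditions checked through the Hessian matrix of own and cross second derivatives.

import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + x2**2 - 4*x1 - 2*x2          # hypothetical example function

# first order conditions: both partial derivatives equal zero
foc = [sp.diff(f, x1), sp.diff(f, x2)]
crit = sp.solve(foc, [x1, x2])           # {x1: 2, x2: 1}

# second order conditions: Hessian of own and cross second derivatives
H = sp.hessian(f, (x1, x2))              # [[2, 0], [0, 2]]
print(crit, H.is_positive_definite)      # positive definite -> the critical point is a minimum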

Constrained Optimization
Constrained Optimization is said to occur when
one or more of the variables in the objective
function is constrained by some function.
Hence a constrained optimization problem will have an objective function and a set of constraints.

Constrained optimization
In the presence of constraints, a (local)
optimum does not need to be a stationary point
of the objective function!
Consider the 1-dimensional examples with
feasible region of the form a ≤ x ≤ b
Local optima are either
Stationary and feasible
Boundary points
Constrained optimization
We will study how to characterize local optima
for
multi-dimensional optimization problems
with more complex constraints
We will start by considering problems with only
equality constraints
We will also assume that the objective and constraint
functions are continuous and differentiable
Constrained optimization: equality
constraints
A general equality constrained multi-
dimensional NLP is:

max f(x) = f(x1, ..., xn)
subject to
g1(x1, ..., xn) = b1
g2(x1, ..., xn) = b2
⋮
gm(x1, ..., xn) = bm
Constrained optimization: equality
constraints
The Lagrangian approach is to associate a
Lagrange multiplier λi with the i th constraint
We then form the Lagrangian by adding
weighted constraint violations to the objective
function:
L(x1, ..., xn, λ1, ..., λm) = f(x1, ..., xn) + ∑i=1..m λi (bi − gi(x1, ..., xn))
or, in vector form, L(x, λ) = f(x) + λ′(b − g(x))
Constrained optimization: equality
constraints
Now consider the stationary points of the
Lagrangian:
∂L(x, λ)/∂xj = ∂f(x)/∂xj − ∑i=1..m λi ∂gi(x)/∂xj = 0,   j = 1, ..., n
∂L(x, λ)/∂λi = bi − gi(x) = 0,   i = 1, ..., m
The 2nd set of conditions says that x needs to satisfy
the equality constraints!
The 1st set of conditions generalizes the unconstrained
stationary point condition!
Constrained optimization: equality
constraints
Let (x*,λ*) maximize the Lagrangian
Then it should be a stationary point of L
g(x*)=b, i.e., x* is a feasible solution to the original
optimization problem
Furthermore, for all feasible x and all λ
L(x*, λ*) ≥ L(x, λ)
f(x*) + λ*′(b − g(x*)) ≥ f(x) + λ′(b − g(x))
f(x*) ≥ f(x)
So x* is optimal for the original problem!!
Constrained optimization: equality
constraints
Conclusion: we can find the optimal solution to
the constrained problem by considering all
stationary points of the unconstrained
Lagrangian problem
i.e., by finding all solutions to

∂L(x, λ)/∂xj = ∂f(x)/∂xj − ∑i=1..m λi ∂gi(x)/∂xj = 0,   j = 1, ..., n
∂L(x, λ)/∂λi = bi − gi(x) = 0,   i = 1, ..., m
Constrained optimization: equality
constraints
As a byproduct, we get the interesting
observation that
L(x*, λ*) = f(x*) + λ*′(b − g(x*)) = f(x*)
We will use this later when interpreting the
values of the multipliers λ*
Constrained optimization: equality
constraints
Note: if
the objective function f is concave
all constraint functions gi are linear
Then any stationary point of L is an optimal
solution to the constrained optimization
problem!!
A similar result holds for minimization problems when f is convex
Constrained optimization: equality
constraints
An example:

max −x1² − x2²
s.t. x1 + x2 = 1
Constrained optimization: equality
constraints
Then
L(x1, x2, λ) = −x1² − x2² + λ(1 − x1 − x2)
             = −x1² − x2² + λ − λx1 − λx2
First order conditions:

−2x1 − λ = 0            x1 = −λ/2            x1 = 1/2
−2x2 − λ = 0     or     x2 = −λ/2     or     x2 = 1/2
1 − x1 − x2 = 0         x1 + x2 = 1          λ = −1
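The same stationary point can be found symbolically; the sketch below (assuming sympy) reproduces x1 = x2 = 1/2 and λ = −1 by solving the first order conditions of the Lagrangian.

import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam')

f = -x1**2 - x2**2
g = x1 + x2                           # constraint g(x) = b with b = 1
L = f + lam*(1 - g)                   # Lagrangian L(x, lam)

# stationarity in x1, x2 and lam; the lam-equation recovers the constraint
eqs = [sp.diff(L, v) for v in (x1, x2, lam)]
print(sp.solve(eqs, [x1, x2, lam]))   # {x1: 1/2, x2: 1/2, lam: -1}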
Constrained optimization:
Inequality Constraints
We will still assume that the objective and constraint
functions are continuous and differentiable
We will assume all constraints are “≤ ”
constraints
We will also look at problems with both equality and
inequality constraints
Constrained optimization:
inequality constraints
A general inequality constrained multi-
dimensional NLP is:
max f(x) = f(x1, ..., xn)
subject to
g1(x1, ..., xn) ≤ b1
g2(x1, ..., xn) ≤ b2
⋮
gm(x1, ..., xn) ≤ bm
Constrained optimization:
inequality constraints
In the case of inequality constraints, we also
associate a multiplier λi with the i th constraint
As in the case of equality constraints, these
multipliers can be interpreted as shadow prices
Constrained optimization:
inequality constraints
Without derivation or proof, we will look at a set of necessary conditions, called Karush-Kuhn-Tucker or KKT conditions, for a given point, say x, to be an optimal solution to the NLP
These are valid when a certain condition
(“constraint qualification”) is verified.
The latter will be assumed for now.
Constrained optimization:
inequality constraints
By necessity, an optimal point should satisfy
the KKT-conditions.
However, not all points that satisfy the KKT-
conditions are optimal!

The characterization holds under certain conditions on the constraints
The so-called "constraint qualification conditions"
In most cases these are satisfied
 For example: if all constraints are linear
Constrained optimization:
KKTconditions
If x is an optimal solution to the NLP (in max-form), it must be feasible, and:
there must exist a vector of multipliers λ satisfying

∂f(x)/∂xj − ∑i=1..m λi ∂gi(x)/∂xj = 0,   j = 1, ..., n
λi (bi − gi(x)) = 0,   i = 1, ..., m
gi(x) ≤ bi,   i = 1, ..., m
λi ≥ 0,   i = 1, ..., m
Constrained optimization:
KKT conditions
The second set of KKT conditions is

λi (bi − gi(x)) = 0,   i = 1, ..., m

This is comparable to the complementary slackness conditions from LP!

if λi > 0 then gi(x) = bi
if gi(x) < bi then λi = 0
Constrained optimization:
KKT conditions
This can be interpreted as follows:
Additional units of the resource bi only have value if the available units are used fully in the optimal solution
Finally, note that increasing bi enlarges the feasible region, and therefore cannot decrease the optimal objective value
Therefore, λi ≥ 0 for all i
Constrained optimization:
KKTconditions
Derive similar sets of KKT-conditions for
A minimization problem
A problem having ≥ -constraints
A problem having a mixture of constraints (≤ , =, ≥ )
Constrained optimization:
sufficient conditions
If
f is a concave function
g1, …, gm are convex functions
then any solution x* satisfying the KKT conditions is an optimal solution to the NLP
A similar result can be formulated for
minimization problems
Constrained optimization:
inequality constraints
An example:

max −x1² − x2²
s.t. −x1 − x2 ≤ −1
Constrained optimization:
inequality constraints
The KKT conditions are:

−2x1 + λ = 0
−2x2 + λ = 0
λ (−1 + x1 + x2) = 0
−x1 − x2 ≤ −1
λ ≥ 0

with solution x1 = 1/2, x2 = 1/2, λ = 1.
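The optimum can also be confirmed numerically. The sketch below (assuming scipy is available) minimizes x1² + x2², the negative of the objective, subject to x1 + x2 ≥ 1, and recovers the KKT point x1 = x2 = 1/2.

import numpy as np
from scipy.optimize import minimize

# maximize -x1^2 - x2^2  is the same as  minimize x1^2 + x2^2
objective = lambda x: x[0]**2 + x[1]**2

# -x1 - x2 <= -1 rewritten in scipy's "fun(x) >= 0" form: x1 + x2 - 1 >= 0
constraint = {'type': 'ineq', 'fun': lambda x: x[0] + x[1] - 1}

res = minimize(objective, x0=np.array([0.0, 0.0]), constraints=[constraint])
print(res.x)     # approximately [0.5, 0.5], matching the KKT solution above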
Constrained optimization:
inequality constraints
With multiple inequality constraints:

max −x1² − x2²
s.t. −2x1 − x2 ≤ −1
     −x1 − 2x2 ≤ −1
Constrained optimization:
inequality constraints
The KKT conditions are:

−2x1 + 2λ1 + λ2 = 0
−2x2 + λ1 + 2λ2 = 0
λ1 (−1 + 2x1 + x2) = 0
λ2 (−1 + x1 + 2x2) = 0
−2x1 − x2 ≤ −1
−x1 − 2x2 ≤ −1
λ1, λ2 ≥ 0

with solution x1 = 1/3, x2 = 1/3, λ1 = 2/9, λ2 = 2/9.
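Because the problem is symmetric, both constraints are binding at the optimum; under that assumption the stationarity and binding-constraint equations can be solved directly. This sympy sketch (an added check, not part of the slides) returns x1 = x2 = 1/3 and λ1 = λ2 = 2/9, with both multipliers nonnegative as required.

import sympy as sp

x1, x2, l1, l2 = sp.symbols('x1 x2 l1 l2')

# stationarity for max -x1^2 - x2^2 with g1 = -2x1 - x2 <= -1 and g2 = -x1 - 2x2 <= -1,
# plus both constraints assumed binding (holding with equality)
eqs = [
    -2*x1 + 2*l1 + l2,      # derivative of the Lagrangian with respect to x1
    -2*x2 + l1 + 2*l2,      # derivative of the Lagrangian with respect to x2
    -2*x1 - x2 + 1,         # g1 binding: -2x1 - x2 = -1
    -x1 - 2*x2 + 1,         # g2 binding: -x1 - 2x2 = -1
]
print(sp.solve(eqs, [x1, x2, l1, l2]))   # {x1: 1/3, x2: 1/3, l1: 2/9, l2: 2/9}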
Constrained optimization:
inequality constraints
Another example:

max −x1² − x2²
s.t. −5x1 − x2 ≤ −5
     −3x1 − 2x2 ≤ −6
Constrained optimization:
inequality constraints
The KKT conditions are:

−2x1 + 5λ1 + 3λ2 = 0
−2x2 + λ1 + 2λ2 = 0
λ1 (−5 + 5x1 + x2) = 0
λ2 (−6 + 3x1 + 2x2) = 0
−5x1 − x2 ≤ −5
−3x1 − 2x2 ≤ −6
λ1, λ2 ≥ 0

with solution x1 = 18/13, x2 = 12/13, λ1 = 0, λ2 = 12/13.
A Word on Constraint Qualification:
It has to be satisfied before we can apply the KKT theorem
It comes in several flavors
We only focus on the following:
The gradients of the binding constraint functions, including those corresponding to non-negativity, have to be linearly independent
When the constraints are all linear, the constraint
qualification is satisfied.
A Word on Constraint Qualification:
An example:

max f(x1, x2) = x1
s.t. x2 − (1 − x1)³ ≤ 0    (1)
     x1 ≥ 0
     x2 ≥ 0
A Word on Constraint Qualification:

If x1 > 1 then inequality (1) above implies x2 < 0.
Therefore x1 must be ≤ 1 for a feasible solution.
Since (x1, x2) = (1, 0) is feasible and f(x1, x2) = x1, the maximum is achieved at (1, 0).
A Word on Constraint Qualification:
On the other hand, the KKT conditions are:

1 − 3λ1 (1 − x1)² = −µ1
µ1 ≥ 0    (2)

When evaluated at (1, 0) they yield µ1 = −1, contradicting (2).
A Word on Constraint Qualification:
In other words, we have an optimal solution that does not satisfy the KKT conditions.
In fact, the KKT conditions do not apply here because the constraint qualification is violated.
A Word on Constraint Qualification:
Indeed, if we let
g1(x1, x2) = x2 − (1 − x1)³, g2(x1, x2) = −x1, g3(x1, x2) = −x2,
we get
1·∇g1(1, 0) + 0·∇g2(1, 0) + 1·∇g3(1, 0) = (0, 0)′,
showing that the gradients at (1, 0) are not linearly independent.
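The gradient computation can be reproduced symbolically (same sympy assumption), confirming that the gradients of the constraints that are binding at (1, 0) are linearly dependent.

import sympy as sp

x1, x2 = sp.symbols('x1 x2')

g1 = x2 - (1 - x1)**3
g2 = -x1
g3 = -x2

def grad_at_optimum(g):
    # gradient of g evaluated at the optimal point (1, 0)
    return sp.Matrix([sp.diff(g, x1), sp.diff(g, x2)]).subs({x1: 1, x2: 0})

print(grad_at_optimum(g1), grad_at_optimum(g2), grad_at_optimum(g3))      # (0,1), (-1,0), (0,-1)
print(grad_at_optimum(g1) + 0*grad_at_optimum(g2) + grad_at_optimum(g3))  # zero vector: linearly dependent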
