
Optimization

Theory and Applications

Khaled HAMIDI
Table of contents

Part I

Chapter 1  Optimization without Constraints
  1.1  Optimization Problem
  1.2  Optimum of Multivariable Function
       1.2.1  Global optimum point
       1.2.2  Using increments ∆x and ∆f(x)
       1.2.3  Link between maximization and minimization
       1.2.4  Local optimum point
       Exercises
  1.3  Optimality Conditions
       1.3.1  Existence of optimum point (sufficient conditions)
       1.3.2  Optimum point as critical point
       1.3.3  Taylor's approximation of the increment
       1.3.4  Necessary conditions for the differentiable case
       1.3.5  Second-order sufficient conditions for the differentiable case
       1.3.6  First derivative test for univariable function
       1.3.7  Second derivative test
       1.3.8  Conditions under concavity and convexity
       Exercises
  1.4  Small practical examples
       1.4.1  What dimensions for advertising cardboard at a lower cost
       1.4.2  What speed for a minimum expenditure of a crossing
       1.4.3  What production for optimum profit
       1.4.4  Which optimal trajectory for a skier
       1.4.5  Optimum Profit

Chapter 2  Optimization under Equality Constraints
Chapter 3  Optimization under Inequality Constraints
Chapter 4  Numerical Optimization
Chapter 5  Dynamic Optimization
Chapter 6  Calculus of Variations
Chapter 7  References
Appendix A  Tests and Controls


Part I

Chapter 1

Optimization without Constraints

This chapter primarily addresses the problem of unconstrained optimization of a multivariable scalar-
valued function. It aims to be both a review and an extension of the mathematical analysis course on
functions of two variables. The review aspect concerns the theorems of existence and the search for
optimal values, while the extension aspect involves generalizing to the case of multiple variables and
applying this to the study of economic agents’ behavior.

1.1 Optimization Problem

Let f (x) be a function of x, where f : D ⊆ Rn −→ R. Optimizing f over D consists of finding, when they exist, elements x ∈ D where f attains its optimal values. In practice : (i) f describes
an objective to be achieved, such as maximizing profits or minimizing losses ; (ii) D describes the
conditions under which the objective must be met, and is interchangeably referred to as the constraint
domain, the admissible domain, the feasible domain, or simply the domain of possibilities ; (iii) x ∈ D
is then an admissible solution, a feasible solution, a possibility, or sometimes a program.

An optimization problem is classified based on the structure of the triplet (x, f, D) : (i) When the
variables x themselves are functions of another variable t, x = x(t), the problem is referred to as
functional optimization, infinite-dimensional optimization, or optimal control ; when the variables are
integers, it is called combinatorial optimization or integer programming ; when they are real numbers,
it is called continuous optimization. (ii) When the function f is linear and the set D is defined by linear equalities and inequalities, it is referred to as linear programming ; when f is quadratic,
it is called quadratic programming, and so on. (iii) When the function f itself is a vector, it is referred
to as multi-objective optimization or multi-criteria decision-making.

In this chapter, as in the next ones, we will limit ourselves to the optimization of a real scalar-valued
function of several variables subject to m equality constraints and k inequality constraints :

min_{x ∈ D} f (x) ; D = {x ∈ Rn | g(x) = 0, h(x) ≤ 0}    (1.1)


where g(x) is a vector of m functions and h(x) is a vector of k functions. We will focus on the case
where the functions f , g, and h are at least twice continuously differentiable.

1.2 Optimum of Multivariable Function

We begin by recalling some basic definitions and results. A multivariable scalar-valued function depends on more than one variable. It maps a set of inputs to a single output. In symbols, we have :
f : D ⊆ Rn −→ R ; which maps x ≡ (x1 , . . . , xn ) ∈ D to y ≡ f (x) ∈ R.

f : D ⊆ Rn −→ R ; x ∈ D 7−→ y = f (x) ∈ R (1.2)

D is the domain of f , the set of all possible input values for which f is defined ; f (D) = {f (x) | x ∈
D} is the image or the range set of D by f , the set of all possible output values the function f can
produce. The graph of f over D is the Rn+1 subset :

Γ(f, D) = {(x1 , . . . , xn , y) ∈ Rn+1 | y = f (x1 , . . . , xn ) ; x ∈ D} (1.3)

1.2.1 Global optimum point Let y = f (x) be as in (1.2). If there exists x̂ ∈ D such that f (x̂) = min f (D), then (x̂, ŷ), where ŷ = f (x̂), is a global minimum point of the graph of f over D. In symbols :

(∃x̂ ∈ D) : f (x̂) = min f (D) (1.4)

Regarding this definition, we note the following remarks : (i) The point (x̂, ŷ) is a free or unconstrained global minimum if x̂ is interior to D ; otherwise, it is a boundary or constrained global minimum. (ii) It is unique or strict if x̂ is unique.

Example 1.1 (standard parabolic function). Let D = R and f (x) = x². Then f (D) = [0, +∞[, which means that : (i) As f (D) has no upper bound, the problem max{f (x) | x ∈ D} has no solution x̂ ; (ii) as f (D) has a lower bound that it reaches at the unique x̂ = 0, the problem min{f (x) | x ∈ D} has the unique solution x̂ = 0 (the equation f (x) = 0 has the unique solution x̂ = 0). Thus, the point A = (x̂, ŷ) = (0, 0) is a unique free global minimum point.


Example 1.2 (variance of the Bernoulli distribution). Let D = [0, 1] and f (x) = x(1 − x). Then f (D) = [0, 1/4] (draw its graph), which means that : (i) The problem max{f (x) | x ∈ D} has the unique solution x̂ = 1/2 (the equation f (x) = 1/4 leads to −(x − 1/2)² = 0, which has the unique solution x̂ = 1/2). The point A = (1/2, 1/4) is a unique free global maximum point of f over D. (ii) The problem min{f (x) | x ∈ D} has two distinct solutions, x̂ = 0 leading to f (0) = 0, and x̂ = 1 leading to f (1) = 0 (the equation f (x) = 0 has the two solutions x̂B = 0 and x̂C = 1). Each of the points B = (0, 0) and C = (1, 0) is a boundary global minimum, because they are not interior to D = [0, 1].

1.2.2 Using increments ∆x and ∆f (x) To implement the optimum definition (1.4), one can check if there exists at least one x̂ ∈ D such that f (x) − f (x̂) ≥ 0 for all x ∈ D and x ≠ x̂. Thus, the image f (D) must be an ordered set. One way to proceed, which will be useful later, is to reformulate the definition in terms of increments. Let ∆x be the increment from x̂ to x = x̂ + ∆x for x ≠ x̂, and ∆f (x̂) = f (x̂ + ∆x) − f (x̂) the increment of f induced by ∆x :

∆x = x − x̂ ; ∆f (x̂) = f (x̂ + ∆x) − f (x̂)

Thus, ∆f (x̂) is the total increment of f induced by ∆x ̸= 0 (note that ∆x must be ̸= 0). The
definition of a minimum point, as seen in (1.4), can be reformulated as follows : if there exists x̂ ∈ D
such that (x̂, ŷ) is a minimum point, then any non-zero increment from x̂ results in a positive or null
increment of the function. Formally, we have :

(∃ x̂ ∈ D) (∀ x = x̂ + ∆x ∈ D) : (∆x ̸= 0) ⇒ ∆f (x̂) ≥ 0 (1.5)

(i) In the case of ∆f (x̂) > 0 (strict inequality), the optimum point is said to be strict. This is the case
where x̂ is unique, which implies that strict optimum points are unique and vice versa. (ii) If instead
of equation (1.4), we consider the problem f (x̂) = max f (D), then (x̂, ŷ) defines a global maximum
point of the graph of f over D. In terms of increments, the definition becomes :

(∃ x̂ ∈ D) (∀ x = x̂ + ∆x ∈ D) : (∆x ̸= 0) ⇒ ∆f (x̂) ≤ 0 (1.6)

Let’s apply the increment rule to our two previous examples :


• The standard parabola example (1.1), where f (x) = x² and D = R, leads to ∆f (x̂) = (x̂ + ∆x)² − x̂² = 2x̂ ∆x + (∆x)². At the point A = (x̂, ŷ) = (0, 0), the function increment is ∆f (x̂A) = (∆x)² > 0 for all ∆x ≠ 0. Therefore, A is the unique free minimum point.

• The Bernoulli-variance example (1.2), where f (x) = x(1 − x) and D = [0, 1], leads to ∆f (x̂) = (1 − 2x̂)∆x − (∆x)². (i) At A = (x̂, ŷ) = (1/2, 1/4), the function increment is ∆f (x̂A) = −(∆x)² < 0 for all ∆x ≠ 0, which implies that A is the unique free global maximum point of f over D. (ii) At B = (0, 0), we have ∆f (x̂B) = (1 − ∆x)∆x, which is ≥ 0 only for 0 < ∆x ≤ 1, or equivalently for x̂ < x ≤ 1 + x̂ as ∆x = x − x̂. According to the domain D = [0, 1], this reduces to x̂ < x ≤ 1. (iii) At C = (1, 0), we have ∆f (x̂C) = −(1 + ∆x)∆x, which is ≥ 0 only for −1 ≤ ∆x < 0, or equivalently for x̂ − 1 ≤ x < x̂. According to the domain D = [0, 1], this reduces to x < x̂.
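As a quick numerical illustration (our own sketch, not part of the original text), the sign of the increment around a candidate point can be checked on a grid of perturbations ∆x ; the helper name and the tolerance below are ours.

# Numerical check of the increment rule (illustrative sketch).
import numpy as np

def increment_signs(f, x_hat, deltas):
    """Signs of f(x_hat + dx) - f(x_hat) for a grid of nonzero dx."""
    return np.sign([f(x_hat + dx) - f(x_hat) for dx in deltas])

deltas = np.linspace(-0.5, 0.5, 101)
deltas = deltas[np.abs(deltas) > 1e-12]      # the increment rule requires dx != 0

# Standard parabola f(x) = x^2 on R : all increments at x_hat = 0 are positive.
parabola = lambda x: x**2
print(np.all(increment_signs(parabola, 0.0, deltas) > 0))    # True -> global minimum

# Bernoulli variance f(x) = x(1 - x) on [0, 1] : all increments at x_hat = 1/2 are negative.
bernoulli = lambda x: x * (1.0 - x)
feasible = deltas[(0.5 + deltas >= 0.0) & (0.5 + deltas <= 1.0)]
print(np.all(increment_signs(bernoulli, 0.5, feasible) < 0)) # True -> global maximum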

The two following examples deal with functions of two variables. The first one is the quadratic distance function, and the second one is the probability density function of the standard bivariate normal distribution.

Example 1.3 (quadratic distance, unique global minimum point). The function f (x) = (x1 − 1)2 + (x2 + 1)2 ,
defined on D = R2 , has a unique global minimum that it reaches at x̂ = (1, −1). This is the point A =
(1, −1, 0). Indeed, we can write f (x) = ∥x − x̂∥2 , and conclude that the function f measures the squared
distance between x and x̂. The increment ∆f (x̂) induced by ∆x is :

∆f (x̂) = ∥x̂ + ∆x − x̂∥2 − ∥x̂ − x̂∥2 = ∥∆x∥2 > 0

We deduce that ∆f (x̂) > 0 for all ∆x ̸= 0. Hence, A = (x̂1 , x̂2 , ŷ) = (1, −1, 0) is
the unique free global minimum point of the graph of f over D. In fact, since f is a
squared distance, this implies that f (D) = [0, ∞[ and consequently min{f (D)} =
0 and max{f (D)} → ∞. The only element of D that satisfies f (x) = 0 is
x̂ = (1, −1). Furthermore, max f (D) being ∞ indicates that f has no global
maximum over D.

Example 1.4 (unique global maximum, using ratio rather than increment). The probability density function of the standard bivariate normal distribution is f (x) = (1/2π) exp(−∥x∥²/2), where ∥x∥² = x1² + x2². It is defined on D = R², and has a unique free global maximum point that it reaches at x̂ = (0, 0). Note that, using the ratio at x̂ = (0, 0), we get :

f (x̂ + ∆x)/f (x̂) = exp(−∥x̂ + ∆x∥²/2 + ∥x̂∥²/2) = exp(−∥∆x∥²/2)


It follows that f (x̂ + ∆x)/f (x̂) < 1, which implies that f (x̂ + ∆x) − f (x̂) < 0, and finally that ∆f (x̂) < 0 for all ∆x ≠ 0. We can prove this in another way by considering the increment of log f, which is log f (x̂ + ∆x) − log f (x̂) = −∥∆x∥²/2 < 0. Since this logarithmic increment is strictly negative for all ∆x ≠ 0, we again have f (x̂ + ∆x)/f (x̂) < 1 at x̂ = (0, 0), and we deduce that ∆f (x̂) < 0 for all ∆x ≠ 0. Furthermore, since the solution x̂ is unique and interior to D, we conclude that A = (x̂, f (x̂)) is the unique free global maximum point of the graph of f. Note that f (D) = ]0, 1/2π] and max{f (D)} = 1/2π, and that f (D), not being left-closed, has no minimum.

1.2.3 Link between maximization and minimization When searching for an optimal value
of a function y = f (x) over x in a set D, we look for a point x̂ in D such that ŷ = f (x̂) is an
optimum for f . Before proceeding further, and in order to standardize our notation, we observe that
a maximization problem is always equivalent to a minimization problem, and vice versa. In other
words, we have :

max{f (x) | x ∈ D} = − min{−f (x) | x ∈ D} (1.7)

Indeed, if ∃ x̂ ∈ D such that f (x̂) = max f (D), then f (x) ≤ f (x̂) for
all x ∈ D, which means that −f (x) ≥ −f (x̂) for all x ∈ D, and thus
−f (x̂) = min{−f (D)}. Ultimately, we have f (x̂) = − min{−f (D)}.
The proof of the equivalence in the other direction relies on the fact that
−(−f ) = f . This reasoning also holds if we replace D with an open
neighborhood of x̂.

Notice that the solution x̂ maximizes f (x) and minimizes −f (x) simul-
taneously. Therefore, in the definitions and theorems that follow, we will
primarily refer to the minimization problem in equation (1.1).

1.2.4 Local optimum point What about locally optimal values ? When the definitions presented
above, in (1.4) and (1.5), are only valid in the neighborhood of x̂, locally, we speak of a local optimum
point. Consider the function y = f (x) : D ⊂ Rn −→ R, and the open set Oδ = {x ∈ Rn | ∥x− x̂∥ <
δ} with a real δ > 0 (neighborhood of x̂). We say that (x̂, ŷ) is a local minimum point of the graph of
Γ(f, D) if we can find a δ > 0 and a x̂ ∈ D such that :

f (x̂) = min f (Oδ ∩ D) (1.8)


Remarks : (i) The local minimum point is free or unconstrained if Oδ ∩ D = Oδ ; otherwise it is a boundary or constrained local minimum point. (ii) If instead of (1.8), we consider the equation f (x̂) = max f (Oδ ∩ D), then (x̂, ŷ) is a local maximum point of Γ(f, D). (iii) A global optimum point is also a local optimum point, but a local optimum point is not always a global optimum point. (iv) A local optimum point is unique when x̂ is unique.

A local minimum point can be redefined using the increment method seen in (1.5) : any local increment ∆x such that 0 < ∥∆x∥ < δ results in a non-negative increment of f :

(∃ x̂ ∈ D) (∃ δ > 0) (∀x = x̂ + ∆x ∈ D) : (0 < ∥∆x∥ < δ) =⇒ ∆f (x̂) ≥ 0 (1.9)

Note that x̂ + ∆x must be in D, and that in the case of a univariable function, the inequality 0 < ∥∆x∥ < δ is equivalent to −δ < ∆x < δ and ∆x ≠ 0.

Example 1.5 (unique local minimum, unique local maximum). The function f (x) = (1/3)x³ − 2x, defined on D = R, has a unique free local minimum point at x̂B = √2 (point B), and a unique free local maximum point at x̂D = −√2 (point D). The increments of f near ±√2 are given by :

∆f (x̂B) = (∆x + 3√2)(∆x)²/3 ; ∆f (x̂D) = (∆x − 3√2)(∆x)²/3

For the minimum point B, the sign of ∆f (x̂B) remains strictly positive as long as ∆x + 3√2 > 0. Thus, we can find a real δ > 0 (precisely 0 < δ ≤ 3√2) such that ∆f (x̂B) > 0 when 0 < |∆x| < δ. In other words, we have f (√2) = min f (Oδ) where Oδ = {x ∈ R | |x − √2| < δ} ⊂ D. Thus, B = (√2, f (√2)) is a unique free local minimum point.

Using the same proof, we can state that the increment of f near x̂D = −√2 is strictly negative as long as ∆x < 3√2 and ∆x ≠ 0. Then, for all 0 < δ ≤ 3√2, we have ∆f (x̂D) < 0 as soon as 0 < |∆x| < δ. The set Oδ = {x ∈ R | |x + √2| < δ} ⊂ D thus satisfies f (x̂D) = max f (Oδ), and consequently D = (−√2, f (−√2)) is a unique free local maximum point. Note that since f is an odd function, the local maximum could also be obtained by point symmetry.

Why an open neighborhood ? The reason for reducing the neighborhood to an open set of x̂ is that
we are certain that x̂ has neighbors in any direction. If the neighborhood O is included in D, the
optimum point is free, otherwise the optimum point is said to be constrained or boundary.


Example 1.6 (boundary local optimum). In the previous example, the point C satisfies f (x̂C) = f (x̂D) where x̂C = 2√2. However, the behavior of f in the neighborhood of x̂C reveals that ∆f (x̂C) = (∆x + 3√2)² ∆x/3, which means ∆f (x̂C) has the same sign as ∆x. It appears that the point C = (2√2, f (2√2)) is not a free maximum, since the function increases to the right of x̂C = 2√2 and decreases to the left. It is a constrained maximum with the additional condition that x ≤ 2√2 (a constraint). By the same reasoning, we conclude that A = (−2√2, f (−2√2)) is a local minimum bound by the condition that x ≥ −2√2. In the case of x̂C, the domain becomes D = {x ∈ R | x ≤ 2√2} = ]−∞, 2√2] and Oδ ⊄ D ; in the case of x̂A, we have D = {x ∈ R | x ≥ −2√2} = [−2√2, ∞[ and again Oδ ⊄ D.

Exercises

Exercise 1.1 (linear function increment). Show that the increment of a linear function, f (x) = a + bx, for a change ∆x is linear and independent of x. Analyse the behavior of the increment ∆f (x), and discuss whether the function has any local or global optima.

Exercise 1.2 (quadratic function increment). Show that the increment of the parabolic function f (x) = q0 + q1 x + (q2/2) x² (with q2 ≠ 0) at x̂ can be expressed as ∆f (x̂) = (q1 + q2 x̂)∆x + (q2/2)(∆x)². Then, demonstrate that at x̂ = −q1/q2, the increment becomes independent of the linear term and is solely determined by the quadratic term. Conclude that the graph of f has a global optimum point at x̂ = −q1/q2, which is a minimum if q2 > 0 and a maximum if q2 < 0.

Exercise 1.3 (particular quadratic cost function). Consider a cost function C(x) = 100 + 50x + 2x², where x represents the level of production and C(x) is the total cost of production. The constants a = 100, b = 50, and c = 2 represent fixed costs, variable costs per unit of production, and the increasing cost per additional unit of production, respectively. (i) Show that the increment in cost, ∆C(x), for a change in production by ∆x, is : ∆C(x) = (50 + 4x)∆x + 2(∆x)². (ii) Next, demonstrate that at the production level x̂ = −50/(2 · 2) = −50/4 = −12.5, the increment in cost becomes independent of the linear term and is solely determined by the quadratic term 2(∆x)². This means that at x̂ = −12.5 the increment in cost is purely quadratic and does not depend on the exact level of production x. (iii) Finally, conclude that the cost function C(x) has a global optimum at x̂ = −50/4 = −12.5, which is a minimum since the quadratic coefficient is positive (in the notation of the previous exercise, q2 = 4 > 0) ; note that this critical point lies outside the economically meaningful region x ≥ 0, on which the cost is strictly increasing.

Exercise 1.4 (absolute value of a quadratic function). Plot the function f (x) = |1 − x²|. Using the increment rule, prove that f has global minimum points at x̂ = ±1 and a local maximum point at x̂ = 0. Discuss its behavior as x tends to infinity.

Exercise 1.5 (increment of cosine function). The increment of the cosine function for a change ∆x, at x̂ is
given by ∆ cos(x̂) = cos(x̂ + ∆x) − cos(x̂), where x̂ is the initial angle and ∆x is the change in the angle.
Show that this increment can be expressed as : ∆ cos(x̂) = − cos x̂ (1 − cos ∆x) − sin x̂ sin ∆x. Calculate the
increment ∆ cos(x̂) at x̂ = 0, x̂ = π/4, and x̂ = π/2. Analyse the behavior of ∆ cos(x̂) at x̂ = 0, x̂ = π/4,
and x̂ = π/2 for very small but non-zero ∆x. Does cosine have a local or global optimum, and if so, where
does it occur ?


Exercise 1.6 (regular quadratic form). Let Q(x) be a quadratic function in the multivariate case, given by :

Q(x) = Q0 + x′q + x′Qx/2    (1.10)

where Q0 is a constant scalar, q is a constant vector, Q is a regular symmetric matrix (Q is invertible and
Q = Q′ ). Show that the increment ∆Q(x̂) can be written as :

∆Q(x̂) = (q + Qx̂)′∆x + ∆x′Q∆x/2    (1.11)

where (q + Qx̂)′∆x represents the linear term and ∆x′Q∆x/2 represents the quadratic term. Find the critical
point (x̂, ŷ) by setting the coefficient of the linear term to zero, and solve for x̂. At the critical point, show that
the increment ∆Q(x̂) becomes purely quadratic, independent of the linear term. Analyse the nature of the
critical point based on the definiteness of the matrix Q : (i) if Q is positive definite, explain why Q(x) has a
global minimum at x̂; (ii) if Q is negative definite, explain why Q(x) has a global maximum at x̂; and (iii) if Q
is indefinite, explain why Q(x) has a saddle point at x̂.

Solution 1.1. Given Q(x) = Q0 + x′q + x′Qx/2, where Q is a regular symmetric matrix, the increment ∆Q(x̂) when x changes by ∆x is ∆Q(x̂) = Q(x̂ + ∆x) − Q(x̂). Expanding and simplifying ∆Q(x̂) leads to : ∆Q(x̂) = (q + Qx̂)′∆x + ∆x′Q∆x/2. Thus, the increment consists of the linear term (q + Qx̂)′∆x and the quadratic term ∆x′Q∆x/2. To find the critical point x̂, set the coefficient of the linear term to zero :

q + Qx̂ = 0 ⇒ x̂ = −Q⁻¹q ⇒ Q(x̂) = Q0 − x̂′Qx̂/2    (1.12)

At this critical point (x̂, ŷ), the increment becomes purely quadratic : ∆Q(x̂) = ∆x′Q∆x/2. The nature of the critical point based on the definiteness of Q is : (i) If Q is positive definite, ∆x′Q∆x > 0 for all ∆x ≠ 0, so Q(x) has a global minimum at x̂. (ii) If Q is negative definite, ∆x′Q∆x < 0 for all ∆x ≠ 0, so Q(x) has a global maximum at x̂. (iii) If Q is indefinite, ∆x′Q∆x can take both positive and negative values for different directions of ∆x, so Q(x) has a saddle point at x̂. We conclude that the critical point is a global minimum if Q is positive definite, a global maximum if Q is negative definite, and a saddle point if Q is indefinite.
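The result of Solution 1.1 can be checked numerically. The sketch below is our own illustration (the matrix Q and vector q are arbitrarily chosen for the example) : it computes x̂ = −Q⁻¹q and classifies the critical point through the eigenvalues of Q.

# Critical point of Q(x) = Q0 + x'q + x'Qx/2 and its nature (illustrative sketch).
import numpy as np

Q0 = 1.0
q = np.array([1.0, -2.0])
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # symmetric, positive definite (chosen for the example)

x_hat = -np.linalg.solve(Q, q)       # x_hat = -Q^{-1} q
grad = q + Q @ x_hat                 # gradient at x_hat, should be ~0
eigvals = np.linalg.eigvalsh(Q)      # eigenvalues of the (symmetric) Hessian

print("x_hat =", x_hat)
print("gradient at x_hat =", grad)
if np.all(eigvals > 0):
    print("Q positive definite : global minimum at x_hat")
elif np.all(eigvals < 0):
    print("Q negative definite : global maximum at x_hat")
else:
    print("Q indefinite : saddle point at x_hat")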

1.3 Optimality Conditions

The optimization problem of f over D is framed by three theorems. The theorem of Karl Wilhelm Theodor Weierstrass (1815-1897) provides sufficient conditions for the existence of an optimal point, the theorem of Pierre de Fermat (1601-1665) gives necessary conditions for a point to be optimal, and the convexity theorem provides sufficient conditions for a critical point to be a global optimum.


1.3.1 Existence of optimum point (sufficient conditions) Let us begin with the first theorem
and pose the following question : under what conditions on the pair (f, D) does there exist a point
x̂ ∈ D such that (x̂, ŷ), where ŷ = f (x̂), is an optimal point ? If the set D is finite, the image f (D)
is also finite, in which case f admits and attains its optimal values over D. Outside this trivial case,
the Weierstrass theorem provides the sufficient conditions for the existence of an optimum point.

Theorem 1.1 (Weierstrass, existence of optimal values). Let (i) D ⊂ Rn be a non-empty compact set (closed and bounded), and (ii) f : D −→ R be a continuous function on D. Then f attains its optimal values on D. (In other words, there exist a and b in D such that f (a) ≤ f (x) ≤ f (b) for all x ∈ D ; equivalently, f (D) is a compact set whose minimum is f (a) and whose maximum is f (b).)

Example 1.7 (Constrained optimum in linear programming, Simplex). Let the function f (x) = 2x1 − x2 be
defined on the domain D = {x ∈ R2 | x1 + x2 ≤ 2, x1 ≥ 0, x2 ≥ 0}, represented by the blue background in
the figure below. This is an optimization problem of a linear function subject to linear constraints, called a
linear programming problem.

The domain D has three particular points (vertices), namely : a = (0, 2), b = (2, 0), and c = (0, 0). It is closed and bounded since all the inequalities defining it are broad inequalities and their right-hand sides are all finite. According to the Weierstrass theorem, the function f attains its optimal values over D.

The function is represented at x = a, x = b, and x = c with f (a) = −2, f (b) = 4, and f (c) = 0. It
increases in the direction of its gradient f˙(x) = (2, −1)′ (shown in red on the first plot). From the
graph, referred to as the simplex, we have min{f (D)} = f (a) = −2 and max{f (D)} = f (b) = 4, so that
−2 = f (a) ≤ f (x) ≤ f (b) = 4 for all x ∈ D. However, these two optimal points, being on the boundary of
D, are constrained.
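A quick numerical cross-check of Example 1.7 (our own illustration, not part of the original text) can be done with scipy.optimize.linprog, which minimizes a linear objective under linear constraints ; maximization is handled by negating the objective, as in (1.7).

# Verify the vertices found in Example 1.7 (illustrative sketch).
from scipy.optimize import linprog
import numpy as np

A_ub = np.array([[1.0, 1.0]])   # x1 + x2 <= 2
b_ub = np.array([2.0])
bounds = [(0, None), (0, None)] # x1 >= 0, x2 >= 0

# Minimize f(x) = 2 x1 - x2 : expected solution a = (0, 2), f(a) = -2.
res_min = linprog(c=[2.0, -1.0], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res_min.x, res_min.fun)

# Maximize f(x) : minimize -f(x). Expected solution b = (2, 0), f(b) = 4.
res_max = linprog(c=[-2.0, 1.0], A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res_max.x, -res_max.fun)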

The Weierstrass theorem only states the sufficient conditions for the existence of optimal values. As
a result, if these conditions are not met, no conclusion can be drawn : simply abandoning a single
condition of the theorem does not ensure the existence of optimal values, and optimal values may
exist without any of the conditions being satisfied.

Example 1.8 (Case of an unbounded domain). Let D = R and f (x) = 2x³ − x + 2. The function is continuous on D, but this domain is not bounded, which means the first condition of the theorem is not satisfied (D is not compact). Since the image is f (D) = R, it is evident that f does not have an optimum.


Example 1.9 (Case of a non-closed domain). Let D =] − 1, 1[ and f (x) = 2x + 1. The function is continuous
on D, but this domain is not closed. Indeed, since −1 < x < 1, we deduce that −2 < 2x < 2, and subsequently
−1 < f (x) < 3. It follows that f (D) is bounded but does not have an optimum.

Example 1.10 (Non-continuous function on a non-compact domain). Let D = R+ = [0, ∞[ and f (x) = 1 if x is rational, and f (x) = 0 otherwise. Thus, D is not compact (it is closed but not bounded), and the function f is not continuous on D. However, f (D) = {0, 1}, and f attains its global maximum at each rational number and its global minimum at each irrational number.

1.3.2 Optimum point as critical point The search for an optimal point of a multivariable function
involves determining the points on its graph where its increments exhibit special behavior. These
points can be observed where the function is not differentiable, as well as where its derivatives are
zero. They are called critical points or singular points.

Let y = f (x) : D ⊆ Rn −→ R be a multivariable scalar-valued function, and f˙(x̂) its gradient evaluated at x̂. If there exists x̂ interior to D such that f˙(x̂) is either zero or does not exist, then (x̂, ŷ) is a critical point of Γ(f, D). A critical point corresponding to f˙(x̂) = 0 is usually called a stationary point.

Example 1.11 (critical value corresponding to an undefined gradient). The function f (x) = |x| is defined and continuous on D = R. By equivalently writing f (x) = √(x²), it follows that its derivative f˙(x) = x/|x| is not defined at x̂ = 0. Therefore, A = (0, 0) is a critical point of Γ(f, D).

f˙(x) = −1 if x < 0 ; f˙(x) = +1 if x > 0 ; ∆f (x) = |x + ∆x| − |x|

However, at x̂ = 0 the increment ∆f (x̂) = |∆x| is > 0 for any ∆x ≠ 0. Thus, A is a free unique global minimum point. It is easy to notice that the set f (D) = [0, ∞[ has the minimum value 0 = f (0), and does not have a maximum value. (The derivative is undefined at x̂ = 0, but it takes the value +1 to the right of x̂ = 0, and −1 to its left. It changes sign there, from negative to positive. What conclusion can we draw from this ?)

Example 1.12 (critical point as a stationary point). The function f (x) = √|1 − x²| is defined and continuous on D = R. Graphically, it can be seen that both points A = (−1, 0) and B = (1, 0) are global minimum points, and C = (0, 1) is a local maximum point. Its derivative is :

f˙(x) = x/√|1 − x²| = x/f (x) if |x| > 1 ; f˙(x) = −x/√|1 − x²| = −x/f (x) if |x| < 1

It does not exist if |x| = 1, and is zero for x = 0. Its critical points are A = (−1, 0), B = (1, 0), and C = (0, 1). Analytically, the increments at these points are :

∆f (x̂A) = √|∆x − 2| √|∆x| ; ∆f (x̂B) = √|∆x + 2| √|∆x| ; ∆f (x̂C) = √|(∆x)² − 1| − 1

(i) The increments ∆f (x̂A) and ∆f (x̂B) are positive or zero for any increment ∆x ≠ 0. If we are at A and the variable x increases by ∆x = 2, we arrive at point B ; conversely, if we are at B and the variable x decreases by ∆x = −2, we arrive at point A. Then, if ∆x ≠ 0 and ∆x ≠ ±2, these increments are strictly positive, which makes f reach its global minimum value at A and B. (ii) At C = (0, 1), the derivative of f exists and is zero. Thus, C is a stationary critical point. The increment ∆f (x̂C) is strictly negative if √|(∆x)² − 1| < 1, or equivalently if −1 < (∆x)² − 1 < 1, or further if 0 < |∆x| < √2. Thus, we can find δ > 0 such that the inequality 0 < |∆x| < δ ≤ √2 implies the inequality ∆f (x̂C) < 0. Any δ ∈ ]0, √2] will work. (iii) Finally, since f (D) = [0, ∞[, we deduce that max{f (D)} → +∞, and therefore there is no global maximum point.

Example 1.13 (infinite set of global minimum points). The function f (x) = (x1 − x2)²/2 has partial derivatives everywhere on D = R². Its gradient is f˙(x) = (x1 − x2, −x1 + x2)′. The system f˙(x) = 0 has an infinite number of solutions, corresponding to the subspace of R² spanned by the vector v = (1, 1) (it is the line with equation x2 = x1, passing through the origin and parallel to v). A = (x̂, ŷ) = (α, α, 0), α ∈ R, is the set of stationary points of the graph of f. It is the subspace of R³ spanned by the vector (1, 1, 0). The function increment at x̂ = α v (α ∈ R) :

∆f (x̂) = (∆x1 − ∆x2)²/2 ≥ 0

is positive or zero for all ∆x ≠ 0. Thus, each point A = α(1, 1, 0) is a global minimum point of the graph of f on D. Finally, since f (D) = [0, ∞[, the function does not have a global maximum point.

Example 1.14 (infinite set of global maximum points). The function f (x) = −|2x1 − x2| is defined and continuous everywhere on D = R². To easily compute its partial derivatives, we can rewrite it as f (x) = −√((2x1 − x2)²) and deduce that :

f˙(x) = [(2x1 − x2)/|2x1 − x2|] (−2, 1)′

The gradient f˙(x) does not exist for any x ∈ D such that x2 = 2x1. It equals f˙(x) = (2, −1)′ if 2x1 < x2, and f˙(x) = −(2, −1)′ if 2x1 > x2. Thus, the graph of f has an infinite number of critical points A = (x1, 2x1, f (x1, 2x1)) = (x1, 2x1, 0) where x̂ = (x1, 2x1). This is the subspace of R³ spanned by the vector (1, 2, 0).

In terms of increments, we have ∆f (x) = −|(2x1 − x2) + (2∆x1 − ∆x2)| + |2x1 − x2|, which leads to ∆f (x̂) = −|2∆x1 − ∆x2| ≤ 0. It is seen that for any ∆x ≠ 0, the increment ∆f (x̂) is negative or zero. Thus, each point A = (x1, 2x1, 0) is a global maximum point. Finally, since f (D) = ]−∞, 0], we have f (x̂) = max{f (D)} and min{f (D)} → −∞ (the function does not have a minimum point).

Example 1.15 (inflection point). The function f (x) = (x − 1)3 defined on D = R has the derivative
f˙(x) = 3(x − 1)2 which vanishes at x = 1. However, the critical point A = (1, f (1)) = (1, 0) is not an
optimum point. In fact, we have ∆f (x) = (x + ∆x − 1)3 − (x − 1)3 , which implies:

∆f (x̂) = (∆x)3 = (∆x)2 ∆x

In other words, the increment of the function at x̂ = 1 has the same sign as the increment of x. Moreover, the fact that A = (x̂, f (x̂)) is a critical point does not imply that f (1) = max{f (D)} or that f (1) = min{f (D)}. In fact, the image set
f (D) =] − ∞, +∞[ does not have an optimal value. It is important to note that
the derivative does not change sign near x̂ = 1, and that at x̂, the rate of change
∆f (x̂)/∆x = (∆x)2 is strictly positive for any ∆x ̸= 0.

Example 1.16 (saddle point). Let D = R² and f (x) = (x1² − x2²)/2. The function has partial derivatives everywhere on D. Its gradient is f˙(x) = (x1, −x2)′. It is zero at x̂ = (0, 0). However, the critical point A = (x̂, f (x̂)) = (0, 0, 0) is not an optimum point. In fact, since :

∆f (x) = [(∆x1)² − (∆x2)²]/2 + x1 ∆x1 − x2 ∆x2

we deduce that ∆f (x̂) = [(∆x1)² − (∆x2)²]/2. The increment is positive along the x1 direction, and negative along the x2 direction. As shown in the figure, A = (0, 0, 0) is a saddle point. Therefore, since f (D) = ]−∞, ∞[, the graph of f does not have a global optimum point.


In light of the previous examples, the notion of optimum point is connected to that of critical point : an optimum point is necessarily a critical point ; however, a critical point is not always an optimum point, as summarized by the following theorem.

Theorem 1.2 (optimum point as a critical point). Let : (i) y = f (x) be a scalar-valued function defined on
D ⊆ Rn , and (ii) x̂ be an interior point of D. If (x̂, ŷ) is a point of optimum of f over D, then it is necessarily a
critical point. In other words, either f˙(x̂) = 0 or f˙(x̂) does not exist. The converse is not always true.

How can we determine whether a critical point of the graph of f on D is a point of maximum, a
point of minimum, an inflection point, or a saddle point ? In other words, once a critical point is
located, what conditions can help us specify its nature ?

1.3.3 Taylor's approximation of the increment In the case of a differentiable function, the solution to the unconstrained optimization problem is based on the study of the behavior of ∆f (x̂) in the neighborhood of x̂, using Taylor's series approximation. Taylor's theorem recalls that : if f : D ⊆ Rn −→ R is differentiable at x̂ interior to D, then ∆f (x̂) can be approximated by a polynomial whose coefficients are expressed in terms of the derivatives of f evaluated at x̂. Its first order, or linear, Taylor expansion is :

∆f (x̂) = [f˙(x̂)]′ ∆x + ρ1 (∆x) (1.13)

where ρ1 (∆x)/∥∆x∥ tends to zero as ∥∆x∥ tends to zero. For a twice differentiable f at x̂ interior
to D, the second order, or quadratic Taylor approximation, is :

∆f (x̂) = ∆x′f˙(x̂) + ∆x′f¨(x̂)∆x/2 + ρ2(∆x)    (1.14)

where ρ2 (∆x)/∥∆x∥2 tends to zero as ∥∆x∥2 tends to zero.
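To make the role of the remainders ρ1 and ρ2 concrete, the short sketch below (our own numerical illustration, with an arbitrarily chosen smooth test function) compares the true increment with its first and second order Taylor approximations for shrinking ∆x.

# First and second order Taylor approximations of the increment (illustrative sketch).
import numpy as np

def f(x):    return np.exp(x[0]) + x[0] * x[1]**2
def grad(x): return np.array([np.exp(x[0]) + x[1]**2, 2.0 * x[0] * x[1]])
def hess(x): return np.array([[np.exp(x[0]), 2.0 * x[1]],
                              [2.0 * x[1],   2.0 * x[0]]])

x_hat = np.array([0.5, -1.0])
direction = np.array([1.0, 2.0])

for t in [1e-1, 1e-2, 1e-3]:
    dx = t * direction
    true_inc = f(x_hat + dx) - f(x_hat)
    taylor1 = grad(x_hat) @ dx
    taylor2 = taylor1 + 0.5 * dx @ hess(x_hat) @ dx
    # rho1 = O(||dx||^2), rho2 = O(||dx||^3) : the error of the quadratic model shrinks faster.
    print(t, abs(true_inc - taylor1), abs(true_inc - taylor2))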

1.3.4 Necessary conditions for the differentiable case A necessary condition for local optimal-
ity helps identify all potential candidate points for optimality. It is a statement of the form : if a point
is a local minimum, then this point must satisfy such condition. Thus, necessary conditions are those
that must be satisfied for a point to be optimum, but do not guarantee that it is an optimum.


Using the first order Taylor expansion (1.13), and the fact that an optimal point is necessarily a critical point, we obtain the first-order necessary condition for a point to be optimum in terms of the gradient : Let y = f (x) : D ⊆ Rn −→ R be a continuously differentiable function at x̂ interior to D. If (x̂, ŷ) is a local minimum point of Γ(f, D), then the gradient of f at x̂ must be zero :

∆f (x̂) ≥ 0 =⇒ f˙(x̂) = 0 (1.15)

Using the second order Taylor expansion (1.14), we obtain the second-order necessary condition for a critical point to be optimum in terms of the hessian matrix : Let y = f (x) : D ⊆ Rn −→ R be a twice continuously differentiable function at x̂ interior to D. If (x̂, ŷ) is a local minimum point of Γ(f, D), and the gradient of f at x̂ is zero, then the hessian matrix f¨(x̂) must also be positive semi-definite :

(∆f (x̂) ≥ 0 ; f˙(x̂) = 0) =⇒ f¨(x̂) ⪰ 0 (1.16)

1.3.5 Second-order sufficient conditions for the differentiable case A sufficient condition for local optimality allows one to declare that a point is indeed a local optimum. It is a statement of the form : if a point satisfies such a condition, then this point is a local optimum. Thus, sufficient optimality conditions are those that guarantee a point is an optimum, assuming they hold.

Let y = f (x) : D ⊆ Rn −→ R be a twice continuously differentiable function at x̂ interior to D. If f˙(x̂) is zero and f¨(x̂) is a positive definite matrix, then (x̂, ŷ) is a strict local minimum point.

(f˙(x̂) = 0 ; f¨(x̂) ≻ 0) =⇒ ∆f (x̂) > 0 (1.17)

1.3.6 First derivative test for univariable function Let y = f (x) : D ⊆ R → R be differ-
entiable at x̂ interior to D, such that f˙(x̂) = 0. Then, (x̂, ŷ) is : (i) a local maximum point if f˙(x)
changes from positive to negative at x̂, (ii) a local minimum if f˙(x) changes from negative to positive
at x̂, (iii) an inflection point if f˙(x) retains the same sign on both sides of x̂. The following figure illustrates these three cases :


[Figure : sign of f˙ around x̂ in the three cases. Local minimum : f˙ < 0, then f˙ = 0, then f˙ > 0. Local maximum : f˙ > 0, then f˙ = 0, then f˙ < 0. Inflection point : f˙ = 0 but f˙ keeps the same sign on both sides.]

Example 1.17. The derivative of the function f (x) = (x − 1)3 , seen in the inflection point example (1.15), is
f˙(x) = 3(x − 1)2 . It vanishes and retains the same sign in the vicinity of x̂ = 1, which proves that (1, 0) is
an inflection point. For reference, we established in that example that ∆f (x̂) = (∆x)3 = (∆x)2 · ∆x. Thus,
∆f (x̂) and ∆x evolve in the same direction.

Example 1.18. Consider the function f (x) = (1/3)x³ − 2x (see example 1.5). Its derivative is f˙(x) = x² − 2. To find the critical points, we set f˙(x) = 0. The equation x² − 2 = 0 implies x̂ = ±√2. Applying the first derivative test, we conclude that :

• The sign of f˙(x) changes from positive to negative as we move through x̂ = −√2, indicating that f (x) has a local maximum at x = −√2.

• The sign of f˙(x) changes from negative to positive as we move through x̂ = √2, indicating that f (x) has a local minimum at x = √2.

1.3.7 Second derivative test The decision rule is much simpler if the function also has continuous second derivatives at x̂. Let y = f (x) be twice differentiable at x̂ interior to D, and such that f˙(x̂) = 0. Then, (x̂, ŷ) is : (i) a local maximum point if all the eigenvalues of f¨(x̂) are strictly negative ; (ii) a local minimum point if all the eigenvalues of f¨(x̂) are strictly positive ; (iii) a saddle point if f¨(x̂) has non-vanishing eigenvalues with different signs. (iv) If f¨(x̂) has at least one vanishing eigenvalue, the test is inconclusive.
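A minimal sketch of this eigenvalue test (our own illustration) : classify a stationary point from the eigenvalues of its Hessian, here for the saddle function of Example 1.16 and the quadratic distance of Example 1.3.

# Second derivative (eigenvalue) test at a stationary point (illustrative sketch).
import numpy as np

def classify(hessian_at_x_hat):
    eig = np.linalg.eigvalsh(hessian_at_x_hat)
    if np.all(eig > 0):  return "local minimum"
    if np.all(eig < 0):  return "local maximum"
    if np.any(eig > 0) and np.any(eig < 0): return "saddle point"
    return "inconclusive (some eigenvalue vanishes)"

# f(x) = (x1^2 - x2^2)/2 : gradient (x1, -x2) vanishes at (0, 0), Hessian diag(1, -1).
print(classify(np.diag([1.0, -1.0])))    # saddle point
# f(x) = (x1 - 1)^2 + (x2 + 1)^2 : Hessian 2 I at the stationary point (1, -1).
print(classify(2.0 * np.eye(2)))         # local minimum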

Example 1.19 (case of univariable function). The function f (x) = 13 x3 − 2x from the previous examples (1.5
and 1.18 ) has the second derivative f¨(x) = 2x, which is zero and changes sign at x = 0 (see figure). Therefore,
(0, f (0)) is an inflection point.

1.3.8 Conditions under concavity and convexity Let D be a nonempty convex subset of Rn, and let y = f (x) : D −→ R be a convex function on D. (i) Corresponding to each x̂ interior to D, there exists an n-vector a such that :


∆f (x̂) ≥ a′ ∆x (1.18)

for all x̂ + ∆x ∈ D. (ii) If, in addition, f is differentiable at x̂ interior to D, then :

∆f (x̂) ≥ [f˙(x̂)]′ ∆x (1.19)

According to the previous results, if (x̂, ŷ) is a local minimum point of a convex function f over a convex domain D, then it is also a global minimum point.

Exercises

Exercise 1.7 (probability density function of the standard normal distribution). The probability density function of the standard normal distribution is f (x) = (2π)^(−1/2) exp(−x²/2). Verify the following results : it is symmetric about the vertical line at x = 0 and f (0) ≈ 0.3989. The first and second derivatives satisfy f˙(x) = −x f (x) and f¨(x) = (x² − 1) f (x). Thus, the density function is increasing for x < 0, decreasing for x > 0, has a unique global maximum at x = 0, and inflection points at x = ±1. It does not have a minimum.

Solution 1.2. The domain of f is D = R. In particular, f is an even function on D, as it is symmetric with respect to the vertical axis : f (x) = f (−x). We can restrict its study to the interval [0, ∞[. Its value at 0 is f (0) = 1/√(2π) ≈ 0.399, and its limit as x → ∞ is zero. Therefore, the line with equation y = 0 is a horizontal asymptote to the graph of f. Being an element of the exponential family, f is continuously differentiable everywhere on D. Its first and second derivatives are :

f˙(x) = −x · e^(−x²/2)/√(2π) ; f¨(x) = (x − 1)(x + 1) · e^(−x²/2)/√(2π)

It follows that f˙(x) = −x · f (x) and f¨(x) = (x² − 1) · f (x), which makes the study of its variations locally quite simple. Since f˙(x) = −x f (x) and f (x) > 0, the first derivative is zero at x = 0, and changes sign there from positive to negative.

This is enough to show that f reaches its local maximum value f (0) ≈ 0.399 at x̂ = 0. The second derivative test confirms this result : the value of f¨(x) at x̂ = 0 is strictly negative, f¨(0) = −f (0) ≈ −0.399. Moreover, the second derivative is zero and changes sign at x̂ = ±1. Since f (±1) = 1/√(2πe) ≈ 0.242, the graph has the two inflection points B = (−1, 0.242) and C = (1, 0.242).
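The identities f˙(x) = −x f (x) and f¨(x) = (x² − 1) f (x) used in Solution 1.2 can also be verified symbolically ; the following sketch (our own illustration) uses sympy.

# Symbolic check of the derivatives of the standard normal density (illustrative sketch).
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.exp(-x**2 / 2) / sp.sqrt(2 * sp.pi)

print(sp.simplify(sp.diff(f, x) + x * f))                 # 0  ->  f'(x) = -x f(x)
print(sp.simplify(sp.diff(f, x, 2) - (x**2 - 1) * f))     # 0  ->  f''(x) = (x^2 - 1) f(x)
print(sp.solve(sp.diff(f, x, 2), x))                      # [-1, 1] : inflection points
print(sp.N(f.subs(x, 0), 4), sp.N(f.subs(x, 1), 4))       # ~0.3989 and ~0.2420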


Exercise 1.8 (Probability density function of the Standard Cauchy-Lorentz Distribution). Repeat the questions
from the previous exercise for the probability density function of the standard Cauchy distribution, f (x) =
[π(1 + x2 )]−1 .

Solution 1.3. The probability density function of the standard Cauchy distribution is f (x) = [π(1 + x²)]⁻¹. It is defined everywhere, continuous, and differentiable on R. Its first and second derivatives are :

f˙(x) = −2x/[π(1 + x²)²] = −2πx f²(x) ; f¨(x) = 2(3x² − 1)/[π(1 + x²)³] = 2π²(3x² − 1) f³(x)

• The first derivative is zero at x̂ = 0, and changes sign there from positive to negative. So A = (0, 1/π) is a local maximum point. (Note that f¨(0) = −2/π < 0.)

• The second derivative is zero and changes sign at x = ±1/√3 ≈ ±0.577. As f (±1/√3) = 3/(4π) ≈ 0.239, the inflection points are (−0.577, 0.239) and (0.577, 0.239).

1.4 Small practical examples

1.4.1 What dimensions for advertising cardboard at a lower cost An advertising card must contain c = 54 cm² of printed text. The margins imposed are b = 1 cm at the top and bottom of the page, and a = 1.5 cm on each side of the text. Knowing that the price of cardboard is proportional to its surface area, what will be the dimensions of the cheapest possible box ?

Let us refer to the dimensions of the box by x and y. The desired dimensions minimize the surface area of the cardboard : A = xy. The text therefore occupies an area equal to (y − 2b)(x − 2a) = c. It follows that y = c/(x − 2a) + 2b with the conditions x − 2a > 0 and y − 2b > 0 (which ensure that c is > 0). Subsequently, we obtain, with the same conditions, the function A(x), as well as its first and second derivatives, which are :

A(x) = cx/(x − 2a) + 2bx ; Ȧ(x) = 2b − 2ac/(x − 2a)² ; Ä(x) = 4ac/(x − 2a)³

In view of the second derivative, the function A(x) is strictly convex for x > 2a and therefore admits a global minimum point. This is one of the critical points whose abscissa is the solution of Ȧ(x) = 0, or after reduction, of the equation 2(bx² − 4abx + 4a²b − ac) = 0. Solving bx² − 4abx + 4a²b − ac = 0 for x > 2a leads to the reduced discriminant ∆′ = 4a²b² − 4a²b² + abc = abc. Hence the two solutions x = 2a ± √(ac/b), of which we retain only x̂ = 2a + √(ac/b). Thus, the coordinates of the global minimum point, (2a + √(ac/b), 4ab + c + 4√(abc)), become (x̂, A(x̂)) = (12, 96) for a = 3/2, b = 1 and c = 54. So, the dimensions x = 12 and y = A(x̂)/x̂ = 8 of the box are at the lowest cost.
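A numerical confirmation of the cardboard result (our own sketch, using scipy) : minimize A(x) = cx/(x − 2a) + 2bx over x > 2a and recover x̂ = 12, A(x̂) = 96 and y = 8.

# Numerical check of the advertising-cardboard example (illustrative sketch).
from scipy.optimize import minimize_scalar

a, b, c = 1.5, 1.0, 54.0                      # margins (cm) and text area (cm^2)
A = lambda x: c * x / (x - 2 * a) + 2 * b * x # cardboard area as a function of its width x

res = minimize_scalar(A, bounds=(2 * a + 1e-6, 100.0), method='bounded')
x_hat = res.x
print(round(x_hat, 3), round(A(x_hat), 3), round(A(x_hat) / x_hat, 3))  # ~12, 96, 8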

1.4.2 What speed for a minimum expenditure of a crossing A ship has to travel a distance L in kilometres. Among all the expenses, there are those of fuel and those of personnel. The hourly expenditure on fuel is proportional to the square of the speed : it is of the form C = Kv². The hourly pay of the staff is independent of the speed and is equal to P. For what speed, in km/h, is the total expenditure minimum ? If the ship goes too fast, the fuel expenditure increases while the time-dependent personnel expenditure decreases ; if it goes slowly, we observe the opposite situation. So there is a compromise speed v, in kilometres per hour.

Let n be the number of hours of crossing. The distance, in kilometres, is then L = nv. The hourly expenditure is Kv² + P, which gives the total expenditure D = n(Kv² + P). With n = L/v, the total expense becomes D(v) = KLv + PL/v. The marginal expenditure is then Ḋ(v) = KL − PL/v². It vanishes at v = √(P/K). Since the second derivative D̈(v) = 2PL/v³ is > 0 for v > 0, the point with abscissa v̂ = √(P/K) is a global minimum. Thus, the velocity that minimizes the total expenditure is equal to the square root of the ratio of the hourly staff pay to the fuel coefficient.
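As a quick check of the compromise speed (our own sketch, with arbitrary illustrative values of K, P and L), the grid minimizer of D(v) = KLv + PL/v coincides with √(P/K).

# Numerical check of the optimal crossing speed (illustrative sketch, assumed constants).
import numpy as np

K, P, L = 0.3, 120.0, 500.0                 # fuel coefficient, hourly pay, distance (assumed values)
D = lambda v: K * L * v + P * L / v         # total expenditure as a function of speed

v_grid = np.linspace(1.0, 100.0, 100000)
v_num = v_grid[np.argmin(D(v_grid))]        # grid minimizer
v_formula = np.sqrt(P / K)                  # closed-form minimizer

print(round(v_num, 2), round(v_formula, 2)) # both ~ 20.0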

1.4.3 What production for optimum profit Let C(Q) = 4 + 10Q − 3Q2 + Q3 /3 be the total
cost induced by the production of Q units of a commodity, and R(Q) = 179Q − 3Q2 the revenue
generated by Q. Then, Ċ(Q) is the marginal cost, and Ṙ(Q) is the marginal income. Profit is defined
as the total income net of the total cost :

B(Q) = R(Q) − C(Q)

The Q that maximizes profit is given by the first and second order conditions. The first-order condition ensures that the marginal receipt is equal to the marginal cost, i.e. Ḃ(Q) = Ṙ(Q) − Ċ(Q) = 0, which translates to Ṙ(Q) = Ċ(Q). (The figure plots R(Q), C(Q) and the profit B(Q) = −4 + 169Q − Q³/3, with the break-even points B = 0 and the region B > 0.)


To determine the optimum profit, we first look for the critical points of Ḃ(Q) = 0. With B(Q) =
−4 + 169 Q − Q3 /3, we get Ḃ(Q) = 169 − Q2 = 0. The equation 169 − Q2 = 0 has two solutions
Q̂ = ±13 ; from which we deduce that the optimum is reached in Q̂ = 13, because the second
derivative B̈(Q) = −2Q is negative for Q ≥ 0. The maximum profit is therefore given by the level
of production Q = 13, i.e. B(13) = 1820 − 1078/3 = 4382/3 ≈ 1 460.67.
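The production-level result can be reproduced numerically ; the sketch below (our own illustration) maximizes B(Q) = R(Q) − C(Q) on a grid and recovers Q̂ = 13 and the maximum profit.

# Numerical check of the optimal production level (illustrative sketch).
import numpy as np

C = lambda Q: 4 + 10 * Q - 3 * Q**2 + Q**3 / 3   # total cost
R = lambda Q: 179 * Q - 3 * Q**2                 # total revenue
B = lambda Q: R(Q) - C(Q)                        # profit

Q_grid = np.linspace(0.0, 30.0, 300001)
Q_hat = Q_grid[np.argmax(B(Q_grid))]
print(round(Q_hat, 3), round(B(Q_hat), 2))       # ~13.0 and ~1460.67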

1.4.4 Which optimal trajectory for a skier A downhill skier wants to prepare his off-piste route before putting on his skis. The relief he plans to descend is modelled by the function :

f (x) = −x1³/3 − x1 x2 − x2² + (3/2)x1

Can you help him locate any peaks of the relief, specifying their nature between optimum and saddle points ? Our skier decides to start from point c = (1, −1). Wanting to optimize his course, he constantly chooses to orient himself along the steepest slope, i.e. in the direction opposite to the gradient. In which direction should he go ? In your approach and reasoning, determine the possible critical points and the nature of these points. Your answers should be analytical, numerical, and supported by graphs.

The function is everywhere differentiable on R². Its gradient and Hessian are :

f˙(x) = (−x1² − x2 + 3/2, −x1 − 2x2)′ ; f¨(x) = −[[2x1, 1], [1, 2]]

The critical points of the graph of the function are those at which the gradient is either zero or does not exist. Solving f˙(x) = 0 leads to the system {x1 = −2x2 ; −4x2² − x2 + 3/2 = 0}. The second equation gives x2 = −3/4 or x2 = 1/2. The function thus reaches its critical points at a = (−1, 1/2) and b = (3/2, −3/4). These points are A = (a, f (a)) = (−1, 1/2, −11/12) and B = (b, f (b)) = (3/2, −3/4, 27/16).
  

The matrix of second derivatives makes it possible to specify the nature of each of these points. We
have :

" # " #
2 −1 −3 −1
f¨(a) = ; f¨(b) =
−1 −2 −1 −2

At a, the hessian matrix admits the eigenvalues ±√5. Since one is positive and the other negative, A is a saddle point. At b, the eigenvalues of the hessian are −(5 ± √5)/2.


Since they are both negative, then B is a local maximum point. The function does not admit a global
optimum value since at the limit, it tends towards ±∞.

Starting from c = (1, −1), in order to optimize his course by constantly orienting himself along the steepest slope, the skier must go in the direction opposite to the gradient evaluated at c. In fact, the directional derivative at c along a unit vector u is :

f˙(c | u) = ∥f˙(c)∥ cos(f˙(c), u) , with f˙(c) = (3/2, 1)′

To have cos(f˙(c), u) = −1 (steepest descent), we must choose u = −f˙(c)/∥f˙(c)∥ = −(3, 2)/√13, so that the angle between the vectors u and f˙(c) is equal to π. Starting from the point C = (c, f (c)) of the graph, the skier must therefore move in the direction u in the (x1, x2)-plane.
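The critical points and the steepest-descent direction of the skier example can be cross-checked numerically ; the following sketch (our own illustration) evaluates the gradient and Hessian given above and classifies each critical point by the eigenvalues of its Hessian.

# Numerical check of the skier example (illustrative sketch).
import numpy as np

f    = lambda x: -x[0]**3 / 3 - x[0] * x[1] - x[1]**2 + 1.5 * x[0]
grad = lambda x: np.array([-x[0]**2 - x[1] + 1.5, -x[0] - 2 * x[1]])
hess = lambda x: -np.array([[2 * x[0], 1.0], [1.0, 2.0]])

for p in (np.array([-1.0, 0.5]), np.array([1.5, -0.75])):       # critical points a and b
    eig = np.linalg.eigvalsh(hess(p))
    kind = "saddle point" if eig[0] * eig[-1] < 0 else ("local maximum" if eig[-1] < 0 else "local minimum")
    print(p, f(p), np.round(grad(p), 12), np.round(eig, 3), kind)

c = np.array([1.0, -1.0])
u = -grad(c) / np.linalg.norm(grad(c))                           # steepest-descent direction at c
print(np.round(u, 4))                                            # ~ -(3, 2)/sqrt(13)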

1.4.5 Optimum Profit A monopoly firm faces two demand functions for the same product,
emanating from two distinct markets Q1 = 40 − 2P1 and Q2 = 20 − 0.5P2 . Knowing that the cost
of production is C = 25 + 4(Q1 + Q2 ), determine, in each of the following cases, the price that
maximizes the firm’s profit : (i) the firm practices price discrimination between the two markets ; (ii)
The firm charges a single price in both markets.

The two demand functions are Q1(P1) = 40 − 2P1 and Q2(P2) = 20 − 0.5P2. Let us denote by
Q = Q1 + Q2 the total quantity. It follows that Q(P1 , P2 ) = Q1 (P1 ) + Q2 (P2 ) = 60 − 2P1 − 0.5P2 .
Also, the total cost function is C(Q) = 25 + 4Q = −2P2 − 8P1 + 265. The profit function is defined
as total revenue minus total cost. This function, its gradient and its Hessian are :

B(P) = −(P2² − 44P2 + 4P1² − 96P1 + 530)/2 ; Ḃ(P) = (48 − 4P1, 22 − P2)′ ; B̈(P) = [[−4, 0], [0, −1]]

Obviously, the first-order necessary conditions give the critical point A = (P1, P2, B(P1, P2)) with :

(P1, P2) = (12, 22)


Second derivatives are not complicated for this example. The Hessian B̈ is a diagonal matrix whose
first element is −4 and the second −1. It is therefore a negative definite matrix, which means that in
P = (12, 22), the function B(P ) reaches its maximum, and this is a global maximum. Substituting
this point in the two demand functions, we find (Q1 , Q2 ) = (16; 9). Thus, the equilibrium quantities
and prices in the two markets are (Q1 , P1 ) = (16, 12) for the first and (Q2 , P2 ) = (9, 22) for the
second. The optimal (maximum) total profit is B (P1 = 12, P2 = 22) = 265. If the company does
not practice price discrimination in both markets, it will sell with a single price: P1 = P2 = p. In this
case, its profit function becomes a function of a single variable, and is equal to :

B(P1 = p, P2 = p) = −(5p² − 140p + 530)/2

whose first derivative is 70 − 5p and whose second derivative is −5. Under the constraint P1 = P2, the profit is maximum at p = 70/5 = 14. With this price, the maximum profit is B(P1 = 14, P2 = 14) = 225, which is less than the profit the firm would get by discriminating. Moreover, the (optimal) equilibrium quantities constrained to P1 = P2 are (Q1, Q2) = (12, 13). The two regimes are summarized below :

                 Free maximum (different prices)     Constrained maximum (same price)
Price            (P1, P2) = (12, 22)                 P1 = P2 = 14
Quantity         (Q1, Q2) = (16, 9)                  (Q1, Q2) = (12, 13)
Profit           B(12, 22) = 265                     B(14, 14) = 225
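The two pricing regimes can be compared numerically ; the sketch below (our own illustration) maximizes the profit with and without the single-price constraint on a price grid.

# Price discrimination vs single price for the monopoly example (illustrative sketch).
import numpy as np

Q1 = lambda p1: 40 - 2 * p1
Q2 = lambda p2: 20 - 0.5 * p2
profit = lambda p1, p2: p1 * Q1(p1) + p2 * Q2(p2) - (25 + 4 * (Q1(p1) + Q2(p2)))

p = np.linspace(0.0, 40.0, 401)                    # price grid with step 0.1

# Discrimination : maximize over the whole (P1, P2) grid.
P1g, P2g = np.meshgrid(p, p)
vals = profit(P1g, P2g)
i = np.unravel_index(np.argmax(vals), vals.shape)
print(P1g[i], P2g[i], vals[i])                     # 12.0, 22.0, 265.0

# Single price P1 = P2 = p.
j = np.argmax(profit(p, p))
print(p[j], profit(p[j], p[j]))                    # 14.0, 225.0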



Chapter 2

Optimization under Equality Constraints

Chapter 3

Optimization under Inequality Constraints

Chapter 4

Numerical Optimization

Chapter 5

Dynamic Optimization

Chapter 6

Calculus of Variations

Chapter 7

References

Appendix A

Tests and Controls

