MML 4

B.M.S. COLLEGE OF ENGINEERING, BENGALURU – 560 019
Autonomous college, affiliated to VTU
DEPARTMENT OF MATHEMATICS

Unit-4
UNIVARIATE OPTIMIZATION
Convex sets, convex functions and separating hyperplanes
Convex set definition: A set 𝐶 is convex if the line segment between any two points of 𝐶 lies in 𝐶. That is, a set 𝐶 ⊆ ℝⁿ is convex if, for all 𝑥, 𝑦 ∈ 𝐶 and all 𝜆 ∈ [0, 1], 𝜆𝑥 + (1 − 𝜆)𝑦 ∈ 𝐶.
A point of the form 𝜆𝑥 + (1 − 𝜆)𝑦 with 𝜆 ∈ [0, 1] is called a convex combination of 𝑥 and 𝑦. As 𝜆 varies over [0, 1], it traces out the line segment between 𝑥 and 𝑦.
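The definition can be illustrated numerically. The sketch below (a hypothetical illustration, not one of the exercises) samples convex combinations of two points of the disk of radius 2 and confirms that each combination stays inside the disk:

```python
# Convexity check by sampling: every convex combination
# lam*x + (1-lam)*y of two points of the disk ||u|| <= 2 stays in the disk.
import math

def in_disk(p, r=2.0):
    return math.hypot(p[0], p[1]) <= r + 1e-12

x, y = (2.0, 0.0), (-1.2, 1.6)           # two points on the boundary
assert in_disk(x) and in_disk(y)

for k in range(11):                       # lam = 0.0, 0.1, ..., 1.0
    lam = k / 10
    z = (lam*x[0] + (1-lam)*y[0], lam*x[1] + (1-lam)*y[1])
    assert in_disk(z), f"convexity violated at lam={lam}"
print("all convex combinations lie in the disk")
```

Sampling does not prove convexity, but it is a quick sanity check before writing the algebraic argument.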

Hyperplane definition: For a given scalar 𝑑 ∈ ℝ, the set of all (𝑥₁, 𝑥₂, …, 𝑥ₙ) ∈ ℝⁿ such that 𝑎₁𝑥₁ + 𝑎₂𝑥₂ + ⋯ + 𝑎ₙ𝑥ₙ = 𝑑 (i.e. 𝑓(𝑋) = 𝑑) is called a hyperplane. That is, if 𝒏 = (𝑎₁, 𝑎₂, …, 𝑎ₙ) and 𝑑 is a scalar, then the hyperplane is the set Η = {𝑋 ∈ ℝⁿ : 𝒏 ⋅ 𝑋 = 𝑑}.
Note: The vector 𝒏 is normal to the hyperplane. (Analyse: why?)
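The note can be verified numerically: for any two points 𝑋₁, 𝑋₂ on Η, the difference 𝑋₁ − 𝑋₂ is a direction lying inside the hyperplane, and 𝒏 ⋅ (𝑋₁ − 𝑋₂) = 𝑑 − 𝑑 = 0. A minimal sketch, with an arbitrarily chosen 𝒏 and 𝑑 (not tied to any exercise):

```python
# For the hyperplane n . X = d, any two of its points X1, X2 satisfy
# n . (X1 - X2) = 0, i.e. n is orthogonal to every in-plane direction.
n, d = (2.0, -1.0, 3.0), 6.0            # hypothetical normal and offset

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

X1 = (3.0, 0.0, 0.0)                    # 2*3 - 0 + 0 = 6 -> on the plane
X2 = (0.0, 0.0, 2.0)                    # 0 - 0 + 3*2 = 6 -> on the plane
assert dot(n, X1) == d and dot(n, X2) == d

diff = tuple(a - b for a, b in zip(X1, X2))
print(dot(n, diff))                     # -> 0.0
```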

Definition: The hyperplane Η = [𝑓: 𝑑] separates two sets 𝐴 and 𝐵 if one of the following holds:
i. 𝑓(𝐴) ≤ 𝑑 and 𝑓(𝐵) ≥ 𝑑, or
ii. 𝑓(𝐴) ≥ 𝑑 and 𝑓(𝐵) ≤ 𝑑.
(Here 𝑓(𝐴) ≤ 𝑑 means 𝑓(𝑥) ≤ 𝑑 for every 𝑥 ∈ 𝐴.)
If all the weak inequalities above are replaced by strict inequalities, then Η is said to strictly separate 𝐴 and 𝐵.

Examples:
1. Show that the intersection of two convex sets is a convex set. Is the union of two convex sets
necessarily convex? Justify your answer.
2. Show that the unit ball 𝐵 = {𝑢 ∈ ℝⁿ : ‖𝑢‖ < 1} is a convex set.
3. Show that 𝑆 = {(𝑥₁, 𝑥₂) : 2𝑥₁ + 3𝑥₂ = 7} ⊆ ℝ² is a convex set.
4. Show that 𝑆 = {(𝑥₁, 𝑥₂) : 𝑥₁² + 𝑥₂² ≤ 4} is a convex set.
5. Show that 𝑆 = {(𝑥₁, 𝑥₂, 𝑥₃) : 𝑥₁² + 𝑥₂² + 𝑥₃² ≤ 1} is a convex set.
6. Show that 𝑆 = {(𝑥₁, 𝑥₂, 𝑥₃) : 2𝑥₁ − 𝑥₂ + 𝑥₃ ≤ 4} is a convex set.
7. Let 𝐴 = [3, 4]ᵀ and 𝑣 = [1, −6]ᵀ, and let Η = {𝑋 : 𝐴 ⋅ 𝑋 = 12}. Find an implicit description of the
parallel hyperplane Η₁ = Η + 𝑣.
8. Let 𝑣₁ = [1, 3]ᵀ and 𝑣₂ = [4, 0]ᵀ. Find an implicit description of the hyperplane Η that passes through
𝑣₁ and 𝑣₂.
9. Let 𝑣₁ = [1, 1, 3]ᵀ, 𝑣₂ = [2, 4, 1]ᵀ and 𝑣₃ = [−1, −2, 5]ᵀ. Find an implicit description of the hyperplane Η that
passes through 𝑣₁, 𝑣₂ and 𝑣₃.
10. Let 𝐴 = {[2, −1, 5]ᵀ, [3, 1, 3]ᵀ, [−1, 6, 0]ᵀ}, 𝐵 = {[0, 5, −1]ᵀ, [1, −3, −2]ᵀ, [2, 2, 1]ᵀ}, and 𝒏 = [3, 1, −2]ᵀ. Find a hyperplane Η
with normal 𝒏 that separates 𝐴 and 𝐵. Is there any hyperplane that separates 𝐴 and 𝐵 strictly?

Local and Global Optima


Definition: Let 𝑋 ⊆ ℝ², let 𝑓 : 𝑋 → ℝ be a function, and let 𝑝 ∈ 𝑋.
1. 𝑓 has a local maximum at 𝑝 if there exists a neighbourhood 𝑈 of 𝑝 in 𝑋 such that 𝑓(𝑝) ≥ 𝑓(𝑥)
for all 𝑥 ∈ 𝑈.
2. 𝑓 has a local minimum at 𝑝 if there exists a neighbourhood 𝑈 of 𝑝 in 𝑋 such that 𝑓(𝑝) ≤ 𝑓(𝑥)
for all 𝑥 ∈ 𝑈.
3. 𝑓 has a global maximum at 𝑝 if 𝑓(𝑝) ≥ 𝑓(𝑥) for all 𝑥 ∈ 𝑋, in which case we call 𝑓(𝑝) the
global maximum value of 𝑓.
4. 𝑓 has a global minimum at 𝑝 if 𝑓(𝑝) ≤ 𝑓(𝑥) for all 𝑥 ∈ 𝑋, in which case we call 𝑓(𝑝) the
global minimum value of 𝑓.
Second derivative test: Let 𝑧 = 𝑓(𝑥, 𝑦) be a differentiable function. For a critical point (𝑎, 𝑏), let

𝐷 = 𝑓𝑥𝑥(𝑎, 𝑏) 𝑓𝑦𝑦(𝑎, 𝑏) − (𝑓𝑥𝑦(𝑎, 𝑏))²

be the discriminant.
 If 𝐷 > 0 and 𝑓𝑥𝑥(𝑎, 𝑏) > 0, then 𝑓 has a local minimum at (𝑎, 𝑏).
 If 𝐷 > 0 and 𝑓𝑥𝑥(𝑎, 𝑏) < 0, then 𝑓 has a local maximum at (𝑎, 𝑏).
 If 𝐷 < 0, then (𝑎, 𝑏) is a saddle point of 𝑓.
 If 𝐷 = 0, then the test gives no information.
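As a sketch of the test applied to 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦² − 4𝑥 + 2𝑦 (exercise 1 of this unit), assuming SymPy is available:

```python
# Second derivative test on f(x, y) = x^2 + y^2 - 4x + 2y using SymPy.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 4*x + 2*y

# Critical points: solve f_x = 0, f_y = 0.
crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], [x, y], dict=True)
print(crit)                                  # [{x: 2, y: -1}]

a, b = crit[0][x], crit[0][y]
fxx = sp.diff(f, x, 2).subs({x: a, y: b})
fyy = sp.diff(f, y, 2).subs({x: a, y: b})
fxy = sp.diff(f, x, y).subs({x: a, y: b})
D = fxx*fyy - fxy**2
print(D, fxx)                                # 4 2 -> D > 0, fxx > 0: local minimum
```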
To find optimum points of multivariate functions of more than two variables, we compute the Hessian matrix at each critical point. The Hessian matrix of a function of three variables 𝑓(𝑥, 𝑦, 𝑧) is given by

𝐻 = [ 𝑓𝑥𝑥  𝑓𝑥𝑦  𝑓𝑥𝑧
      𝑓𝑦𝑥  𝑓𝑦𝑦  𝑓𝑦𝑧
      𝑓𝑧𝑥  𝑓𝑧𝑦  𝑓𝑧𝑧 ]

Second derivative test: Find the critical points, substitute them into the Hessian matrix, and compute its eigenvalues.

 If all the eigenvalues are strictly positive, then the critical point is a local minimum.
 If all the eigenvalues are strictly negative, then the critical point is a local maximum.
 If 𝐻 has both positive and negative eigenvalues and all are non-zero, then the critical point is a saddle point.
 If any eigenvalue is zero, then the test gives no information.
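The eigenvalue test can be sketched on 𝑓(𝑥, 𝑦, 𝑧) = 𝑥² + 𝑦² + 𝑧² + 𝑥𝑦 + 𝑦𝑧 + 𝑧𝑥 (exercise 3 below), again assuming SymPy:

```python
# Eigenvalue test on f(x, y, z) = x^2 + y^2 + z^2 + xy + yz + zx with SymPy.
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 + y**2 + z**2 + x*y + y*z + z*x
v = [x, y, z]

H = sp.hessian(f, v)             # constant here: [[2,1,1],[1,2,1],[1,1,2]]
crit = sp.solve([sp.diff(f, s) for s in v], v)   # {x: 0, y: 0, z: 0}
eigs = H.eigenvals()             # {eigenvalue: multiplicity} -> {4: 1, 1: 2}
print(crit, eigs)
# All eigenvalues strictly positive, so (0, 0, 0) is a local minimum.
```

Since the Hessian here is constant, one evaluation settles every critical point; for non-quadratic functions it must be re-evaluated at each critical point.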

Examples:
For the following multivariate function:
i) Find all the stationary points of the function.
ii) Find the Hessian matrix.
iii) Classify the stationary points.
1. 𝑓(𝑥, 𝑦) = 𝑥² + 𝑦² − 4𝑥 + 2𝑦
2. 𝑓(𝑥, 𝑦, 𝑧) = 𝑥𝑦 + 𝑦𝑧 + 𝑧𝑥 − 4𝑥 + 2𝑦
3. 𝑓(𝑥, 𝑦, 𝑧) = 𝑥² + 𝑦² + 𝑧² + 𝑥𝑦 + 𝑦𝑧 + 𝑧𝑥
4. 𝑓(𝑥, 𝑦, 𝑧) = −𝑥³ + 3𝑥𝑧 + 2𝑦 − 𝑦² − 3𝑧²
5. 𝑓(𝑥, 𝑦, 𝑧) = 𝑥² + 𝑦² + 𝑧²

Optimization using gradient descent method


Approximate the minimum point of the following functions near the given point using the
gradient descent/ascent method (perform three iterations):

1. 𝑓(𝑥, 𝑦) = 3𝑥² + 𝑦² near (1, 3)
2. 𝑓(𝑥, 𝑦) = 𝑥² + 𝑥𝑦 + 𝑦² near (1, 1/2)
3. 𝑓(𝑥, 𝑦) = 4𝑥² − 4𝑥𝑦 + 2𝑦² near (0, 0)
4. 𝑓(𝑥, 𝑦) = (3/4)(𝑥 − 3/2)² + (𝑦 − 2)² + (1/4)𝑥𝑦 near (5, 4)
5. 𝑓(𝑥, 𝑦) = 4𝑥² − 8𝑥𝑦 + 6𝑦² near (1, 1)
6. 𝑓(𝑥, 𝑦) = 𝑥 − 𝑦 + 2𝑥² + 2𝑥𝑦 + 𝑦² near (0, 0)

Optimization using Newton Raphson method


1. Minimize 𝑓(𝑥₁, 𝑥₂) = 𝑥₁ − 𝑥₂ + 2𝑥₁² + 2𝑥₁𝑥₂ + 𝑥₂² by taking the starting point as
𝑋₁ = [0, 0]ᵀ.
2. Minimize 𝑓(𝑥₁, 𝑥₂) = 𝑥₁² + 4𝑥₂² by taking the starting point as 𝑋₁ = [1, 1]ᵀ.
3. Minimize 𝑓(𝑥₁, 𝑥₂) = 𝑥₁² + 𝑥₂² + 𝑥₁𝑥₂ + 10(𝑥₁ + 𝑥₂) by taking the starting point as
𝑋₁ = [1, 1]ᵀ.
4. Minimize 𝑓(𝑥₁, 𝑥₂) = 𝑥₁² − 𝑥₁𝑥₂ + 3𝑥₂² starting at 𝑋₁ = [1, 2]ᵀ.
5. Minimize 𝑓(𝑥₁, 𝑥₂) = [1 + (𝑥₁ + 𝑥₂ − 5)²][1 + (3𝑥₁ − 2𝑥₂)²] taking the initial points
(10, 10) and (2, 2).
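The Newton-Raphson update 𝑋ₖ₊₁ = 𝑋ₖ − 𝐻⁻¹ ∇𝑓(𝑋ₖ) can be sketched on problem 1. Because that objective is quadratic, the Hessian is constant and a single step lands exactly on the minimum:

```python
# One Newton-Raphson step on f(x1, x2) = x1 - x2 + 2x1^2 + 2x1x2 + x2^2
# from X1 = (0, 0). For a quadratic, X2 = X1 - H^(-1) grad f(X1) is exact.

def grad(p):
    x1, x2 = p
    return [1 + 4*x1 + 2*x2, -1 + 2*x1 + 2*x2]

H = [[4.0, 2.0], [2.0, 2.0]]                 # constant Hessian
detH = H[0][0]*H[1][1] - H[0][1]*H[1][0]     # = 4
Hinv = [[ H[1][1]/detH, -H[0][1]/detH],
        [-H[1][0]/detH,  H[0][0]/detH]]

p = [0.0, 0.0]
g = grad(p)
p = [p[0] - (Hinv[0][0]*g[0] + Hinv[0][1]*g[1]),
     p[1] - (Hinv[1][0]*g[0] + Hinv[1][1]*g[1])]
print(p, grad(p))                            # -> [-1.0, 1.5] [0.0, 0.0]
```

The gradient vanishes at the new point, confirming the minimum at (−1, 3/2). For the non-quadratic problem 5, the same update must be repeated until the gradient is small enough.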

Legendre transformation
Verify the Legendre transformation condition 𝑓(𝑥) = 𝑓∗∗(𝑥) for the following functions:
1. 𝑓(𝑥) = 𝑒ˣ
2. 𝑓(𝑥) = 𝑐𝑥²
3. 𝑓(𝑥) = 𝑥²
4. 𝑓(𝑥) = 𝑐𝑥
5. 𝑓(𝑥) = |𝑥|
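Recall 𝑓∗(𝑝) = supₓ (𝑝𝑥 − 𝑓(𝑥)). A sketch of the verification for 𝑓(𝑥) = 𝑥² (problem 3), computing each supremum via the stationarity condition, which is valid here because the objective in 𝑥 is concave and smooth:

```python
# Verify f** = f for f(x) = x^2 via the Legendre transform
# f*(p) = sup_x (p*x - f(x)), using SymPy and the stationarity condition.
import sympy as sp

x, p = sp.symbols('x p', real=True)
f = x**2

xs = sp.solve(sp.diff(p*x - f, x), x)[0]       # x = p/2 maximises p*x - f(x)
f_star = sp.simplify((p*x - f).subs(x, xs))    # f*(p) = p**2/4
ps = sp.solve(sp.diff(x*p - f_star, p), p)[0]  # p = 2x
f_star_star = sp.simplify((x*p - f_star).subs(p, ps))
print(f_star, f_star_star)                     # p**2/4 x**2
```

For the non-differentiable case 𝑓(𝑥) = |𝑥| the stationarity shortcut fails, and the supremum must be evaluated directly (it is 0 for |𝑝| ≤ 1 and +∞ otherwise).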
