Lec 18
So the relevance of optimization to machine learning is very high. As we saw in the first
week, the basic idea behind most of machine learning is that you want to build models of
data: models that take some input and map it to output data. Now usually our maps depend
on certain parameters, and the way we improve our models, as you will see next week, is
through something called training, that is, you give more and more data and try to improve
your parameters.
So usually you would like to know how much your output changes depending on the
parameters, so you have some quantity, which is a vector quantity, that is changing based on
some other vector quantity. Most of our machine learning depends on finding the best or
optimal model for some given set of data, so most machine learning problems can usually be
rewritten as optimization problems.
So what we will be doing in the next series of videos is to introduce, as well as review, some
optimization techniques; some of you will be familiar with some of these ideas already.
So the function f in a general optimization task can either be a scalar, and this is called a
single objective optimization problem, or f can itself be a vector, in which case it is a multi
objective optimization problem. In this course we are going to restrict ourselves to the single
objective optimization problem; even that is an involved problem.
So we will be dealing only with that, and this is actually true of most practical machine
learning anyway: we try to define a cost function or an objective function f which is a scalar.
x, in general, remember, is a vector, and typically we are going to deal with the case where f
goes from R n to R. An example of such an f could be f of a 3-dimensional vector x given by
x 1 squared plus x 2 squared plus x 3 squared; here f goes from R 3 to R.
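(As a small illustration, not from the lecture itself, this example objective could be written in Python as below; the function name and the sample input are just placeholders.)

    import numpy as np

    def f(x):
        """Example objective f: R^3 -> R, f(x) = x1^2 + x2^2 + x3^2."""
        x = np.asarray(x, dtype=float)
        return float(x[0]**2 + x[1]**2 + x[2]**2)

    print(f([1.0, 2.0, 3.0]))  # 1 + 4 + 9 = 14.0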
Now even though the general optimization task is either to maximize or minimize a function,
we will typically talk only about minimization, because all optimization problems can be
recast as minimization problems. Why? Because if it is a maximization problem you simply
minimize minus f of x. So whenever I talk in the next few slides, as well as in the next video,
I will only be talking about minimization, because maximization is a trivial change obtained
by simply changing the sign.
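(A minimal sketch of this sign trick, assuming scipy; the example function and the use of minimize_scalar are my own illustration, not from the lecture.)

    from scipy.optimize import minimize_scalar

    def f(x):
        # A function we want to MAXIMIZE; its maximum value is 5 at x = 2.
        return -(x - 2.0)**2 + 5.0

    # Maximizing f is the same as minimizing -f.
    res = minimize_scalar(lambda x: -f(x))
    print(res.x)     # ~2.0, the maximizer of f
    print(-res.fun)  # ~5.0, the maximum value of f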
Now here is some notation. The optimal or minimal solution will be written as x star; this
star denotes optimal. Now notice the term arg min: min of f of x would simply mean the
minimum value of f, while arg min of f of x is that x which results in the minimum of f. Just
to give you an example, if f of x is, let us say, x squared plus 1, then the minimum of f is 1,
but the arg min of f is the value of x that gave you f equal to 1, which is x equal to 0. So we
will be using this notation quite often: arg min is that argument, or that value of x, which
gives us the minimum of f of x.
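(A small numeric sketch of the min versus arg min distinction, assuming numpy; the grid search is only for illustration.)

    import numpy as np

    f = lambda x: x**2 + 1.0

    # Evaluate f on a grid and compare min(f) with arg min(f).
    xs = np.linspace(-3.0, 3.0, 601)
    vals = f(xs)

    print(vals.min())           # min f(x)     -> 1.0
    print(xs[np.argmin(vals)])  # arg min f(x) -> 0.0 (the x that achieves it)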
(Refer Slide Time: 05:02)
So here is a quick review of scalar optimization. As you remember, if you plot some function
f of x versus x it is in general going to be a curve, and you are going to have various minima.
For now we are going to look at the unconstrained problem; unconstrained means there are
no constraints on x, no limits on x, and we are looking at x belonging to the whole of the real
line. We will look at the constrained case in the next video; in this video we are only looking
at unconstrained problems, so we will assume x can go from minus infinity to plus infinity.
So in such a case you could have a global minimum and a global maximum, and you could
also have a local minimum and a local maximum; that is, locally, if I just put a box here, all
the values around the local minimum are greater than the local minimum, but this might not
be the global maximum or the global minimum.
Now it can be shown, although we are not going to show it here, that all these extrema,
whether local minima or local maxima, have the property that f prime of x equals 0 in the
unconstrained case. Such points are called stationary points or critical points. A stationary
point, as I have just shown, could be a local minimum, a local maximum, or something called
a saddle point.
Now how do we figure out whether it is a local minimum or a local maximum? Typically you
look at the second and higher derivatives; we will look at just the second derivative case here.
If f double prime of x, the second derivative, is positive, for example here, then it is a local
minimum. For example, if you look at the slope here, you will see the slope of the slope: as I
move away from here the slope increases, which means that it is a minimum here.
So here the slope is 0 and here the slope is positive; that is why, when del square f del x
square is positive, it is a local minimum. As an exercise you can try to prove this; this is an
optional exercise for those who are interested, you can try to prove it using the Taylor series.
Similarly, if f double prime of x is less than 0 then it is a local maximum; once again it
follows the same idea. Now it can happen that your f double prime of x is actually 0, and in
such a case it could be a saddle point; for example, if you look at f of x equal to x cube
around x equal to 0, this is precisely what happens.
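(Continuing the sympy sketch above, here is one way the second-derivative test might be applied at each stationary point; the x cube case shows the f double prime equal to 0 situation mentioned here. The example functions are my own.)

    import sympy as sp

    x = sp.symbols('x')

    for f in (x**2 + 1, -x**2, x**3):
        fp, fpp = sp.diff(f, x), sp.diff(f, x, 2)
        for x0 in sp.solve(fp, x):           # stationary points: f'(x0) = 0
            curvature = fpp.subs(x, x0)
            if curvature > 0:
                label = "local minimum"
            elif curvature < 0:
                label = "local maximum"
            else:
                label = "f'' = 0: test inconclusive (possible saddle/inflection)"
            print(f, "at x =", x0, "->", label)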
Now what is happening here? This is like the shape of a horse's saddle, as we will see in
multiple dimensions as well. You will see that in this direction there is an increase and in this
direction there is a decrease; it is sort of a combination of this curve and this curve. So from
one side it looks like a local minimum and from another side it looks like a local maximum.
This can happen when f double prime of x is equal to 0, and in such a case it could be a
saddle point. All of you are familiar with the notion of a global maximum and minimum;
this is the absolute maximum or the absolute minimum that you get over all of space.
So now let us look at the multivariate case. In this case you are again trying to find the x that
minimizes f of x, but x now belongs to R n instead of simply belonging to R. Once again we
are looking at the unconstrained problem; there are no constraints on x. Now, as we saw in
the derivatives and gradients slides, since x is now a vector quantity, instead of simply
evaluating df dx you now have to evaluate the gradient of f.
So in analogy to what we saw earlier, at any local extremum, for example here, the gradient
will be 0. Remember this is the zero vector, which means del f del x 1 will be 0, del f del x 2
will be 0, and so on and so forth; if x is an n dimensional vector, del f del x n will also be 0.
Once again these are called stationary points or critical points, and like in the 1 dimensional
case you could have a local minimum, a local maximum or a saddle point.
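(A small numeric sketch of the condition that the gradient is the zero vector at a stationary point; the function, the finite-difference check and the test points are my own illustration.)

    import numpy as np

    def f(x):
        # f(x) = x1^2 + x2^2 + x3^2, minimized at the origin.
        x = np.asarray(x, dtype=float)
        return float(np.dot(x, x))

    def numerical_gradient(f, x, h=1e-6):
        """Central-difference approximation of the gradient of f at x."""
        x = np.asarray(x, dtype=float)
        g = np.zeros_like(x)
        for i in range(x.size):
            e = np.zeros_like(x)
            e[i] = h
            g[i] = (f(x + e) - f(x - e)) / (2 * h)
        return g

    print(numerical_gradient(f, [0.0, 0.0, 0.0]))   # ~[0, 0, 0] at the stationary point
    print(numerical_gradient(f, [1.0, -2.0, 0.5]))  # 2x = [2, -4, 1] away from it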
Some examples are given here: this is a local as well as a global maximum; here, for
example, is a local minimum which is not a global minimum, because there are lower values
further on; and this is an example of a classic saddle point, in one direction it is a local
maximum and in another direction it is a local minimum, so that is what a typical saddle
point looks like.
Now how you find out whether a stationary point is a local minimum, a local maximum or a
saddle point depends on the Hessian rather than the simple second derivative; remember, for
vectors the generalization of the second derivative is the Hessian. So as we saw in the
previous slides, the Hessian is a matrix, and unlike before I cannot simply say the Hessian is
positive; that has no meaning because it is a full matrix.
So instead we ask when the Hessian is positive definite. You might remember this from the
linear algebra slides: what does positive definite mean? Positive definite means all
eigenvalues of H are positive. Note this is stronger than positive semi definite; all eigenvalues
of H have to be strictly positive. And remember, since the Hessian is a symmetric matrix we
are guaranteed to have real eigenvalues, so this condition is meaningful.
So if the Hessian is positive definite then it is a local minimum; if the Hessian is negative
definite, which would mean all eigenvalues are less than 0, then it is a local maximum; and if
the matrix is indefinite, meaning it is neither positive definite nor negative definite, so some
eigenvalues are positive and some are negative or even zero, then it is a saddle point. Thank
you.
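(As a final sketch, my own illustration rather than part of the lecture, the eigenvalue test on the Hessian can be coded directly. For f(x, y) = x squared minus y squared the Hessian at the origin has eigenvalues 2 and -2, so it is indefinite and the origin is a saddle point.)

    import numpy as np

    def classify_critical_point(H, tol=1e-10):
        """Classify a critical point from the eigenvalues of its symmetric Hessian H."""
        eigvals = np.linalg.eigvalsh(H)   # real eigenvalues, since H is symmetric
        if np.all(eigvals > tol):
            return "local minimum (H positive definite)"
        if np.all(eigvals < -tol):
            return "local maximum (H negative definite)"
        return "saddle point or degenerate (H indefinite / singular)"

    # Hessian of f(x, y) = x^2 - y^2 at the origin: eigenvalues 2 and -2.
    H_saddle = np.array([[2.0, 0.0],
                         [0.0, -2.0]])
    # Hessian of f(x, y) = x^2 + y^2 at the origin: eigenvalues 2 and 2.
    H_min = np.array([[2.0, 0.0],
                      [0.0, 2.0]])

    print(classify_critical_point(H_saddle))  # saddle point
    print(classify_critical_point(H_min))     # local minimum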