
Top Optimisation Methods In Machine Learning

PUBLISHED ON JULY 15, 2020 IN DEVELOPERS CORNER (HTTPS://ANALYTICSINDIAMAG.COM/CATEGORY/DEVELOPERS_CORNER/)
BY RAM SAGAR (HTTPS://ANALYTICSINDIAMAG.COM/AUTHOR/RAM-SAGAR/)


“All the impressive achievements of deep learning amount to just curve fitting.”

– Judea Pearl

Machine learning in its most reduced form is sometimes referred to as glorified curve fitting. In a way, it is true: machine learning models are typically founded on the principle of convergence, fitting data to a model. Whether this approach will lead to AGI is still a debatable subject. For now, however, deep neural networks are the best available solution, and they use optimisation methods to arrive at the target.

Fundamental optimisation methods are typically categorised into first-order, high-order and derivative-free methods. The methods one usually comes across fall into the first-order category, such as gradient descent and its variants.


According to an extensive survey of optimisation methods by Shiliang et al. (https://arxiv.org/pdf/1906.06821.pdf), here are a few of the methods one frequently encounters in machine learning:

Gradient Descent

The gradient descent method is the most popular optimisation method. The idea is to update the variables iteratively in the direction opposite to the gradient of the objective function. With every update, the method guides the model towards the target and gradually converges to the optimal value of the objective function.
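To make the update rule concrete, here is a minimal sketch of batch gradient descent on a least-squares objective; the data, learning rate and function name are illustrative assumptions rather than anything from the original article.

```python
import numpy as np

def gradient_descent(X, y, lr=0.1, n_iters=500):
    """Minimise the mean squared error of a linear model y ≈ X @ w."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        grad = 2 / len(y) * X.T @ (X @ w - y)   # gradient of the objective
        w -= lr * grad                          # step opposite to the gradient
    return w

# Toy data: y = 3*x0 - 2*x1 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -2.0]) + 0.01 * rng.normal(size=200)
print(gradient_descent(X, y))   # close to [ 3. -2.]
```

Every update here touches the full dataset, which is exactly the cost the stochastic variant below avoids.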


Stochastic Gradient Descent 


Stochastic gradient descent (SGD) was proposed to address the computational complexity involved in each iteration of gradient descent on large-scale data. The update rule is the standard SGD step

θ ← θ − η ∇J(θ; x_i, y_i),

where (x_i, y_i) is a single randomly chosen training sample and η is the learning rate. Taking the parameter values and adjusting them iteratively to reduce the loss function, using gradients computed by back-propagation, is how neural networks are trained.

In this method, one randomly selected sample is used to update the parameters per iteration, instead of directly calculating the exact value of the gradient over the full dataset. The stochastic gradient is an unbiased estimate of the real gradient. This optimisation method (https://analyticsindiamag.com/how-stochastic-gradient-descent-is-solving-optimisation-problems-in-deep-learning/) reduces the update time for dealing with large numbers of samples and removes a certain amount of computational redundancy. Read more here (https://analyticsindiamag.com/why-learning-rate-is-crucial-in-deep-learning/).
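A minimal sketch under the same illustrative least-squares setup as above: one randomly drawn sample stands in for the full-batch gradient at every step.

```python
import numpy as np

def sgd(X, y, lr=0.05, n_iters=2000, seed=0):
    """Stochastic gradient descent: one randomly drawn sample per update."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        i = rng.integers(len(y))                # pick a single sample at random
        grad = 2 * (X[i] @ w - y[i]) * X[i]     # unbiased estimate of the full gradient
        w -= lr * grad
    return w
```

With the toy data from the previous sketch, sgd(X, y) lands close to the same weights, at a fraction of the per-step cost and with noisier progress.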

Adaptive Learning Rate Method 


Learning rate is one of the key hyperparameters that undergo optimisation. The learning rate decides how large each update step is and, in effect, whether the model skips over certain portions of the data. If the learning rate is high, the model might miss the subtler aspects of the data; if it is low, training is slower but the behaviour is usually more desirable for real-world applications. The learning rate has a great influence on SGD, and setting the right value can be challenging. Adaptive methods were proposed to perform this tuning automatically.

The adaptive variants of SGD have been widely used in DNNs. Methods like AdaDelta, RMSProp and Adam use exponential averaging to provide effective updates and simplify the calculation.

AdaGrad: weights with a high gradient will have a low learning rate, and vice versa
RMSProp: adjusts AdaGrad so that the learning rate is no longer monotonically decreasing
Adam: almost similar to RMSProp, but with momentum
Alternating Direction Method of Multipliers (ADMM): another alternative to stochastic gradient descent (SGD)

The difference between gradient descent and AdaGrad is that the learning rate is no longer fixed; it is computed from all the historical gradients accumulated up to the latest iteration, as sketched below. Read more here (https://analyticsindiamag.com/a-lowdown-on-alternatives-to-gradient-descent-optimization-algorithms/).
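The per-parameter scaling is easiest to see in code. Below is a minimal sketch of the AdaGrad and RMSProp updates applied to an arbitrary gradient function; the step sizes, decay factor and toy objective are illustrative assumptions.

```python
import numpy as np

def adagrad(grad_fn, w0, lr=0.5, n_iters=500, eps=1e-8):
    """AdaGrad: per-parameter steps scaled by the accumulated squared gradients."""
    w, acc = np.asarray(w0, dtype=float).copy(), np.zeros_like(w0, dtype=float)
    for _ in range(n_iters):
        g = grad_fn(w)
        acc += g ** 2                           # full history of squared gradients
        w -= lr * g / (np.sqrt(acc) + eps)      # large history -> smaller effective step
    return w

def rmsprop(grad_fn, w0, lr=0.05, beta=0.9, n_iters=500, eps=1e-8):
    """RMSProp: exponential average of squared gradients, so steps do not vanish."""
    w, avg = np.asarray(w0, dtype=float).copy(), np.zeros_like(w0, dtype=float)
    for _ in range(n_iters):
        g = grad_fn(w)
        avg = beta * avg + (1 - beta) * g ** 2
        w -= lr * g / (np.sqrt(avg) + eps)
    return w

# Toy objective f(w) = (w0 - 3)^2 + 10 * (w1 + 1)^2, minimised at (3, -1)
grad_fn = lambda w: np.array([2 * (w[0] - 3), 20 * (w[1] + 1)])
print(adagrad(grad_fn, np.zeros(2)), rmsprop(grad_fn, np.zeros(2)))
```

AdaGrad divides by the square root of the entire accumulated history, so its effective step can only shrink; RMSProp's exponential average forgets old gradients, which keeps the step size from vanishing. Adam adds a momentum term on top of the RMSProp-style scaling.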

Conjugate Gradient Method


The conjugate gradient (CG) approach is used for solving large-scale linear systems of equations and nonlinear optimisation problems. First-order methods have a slow convergence speed, whereas second-order methods are resource-heavy. Conjugate gradient optimisation is an intermediate algorithm: it combines the advantages of first-order information while approaching the convergence speeds of high-order methods.
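As an illustration, here is a minimal sketch of the classic conjugate gradient iteration for a symmetric positive-definite linear system Ax = b; the function name and the small example system are illustrative.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iters=None):
    """Solve A x = b for symmetric positive-definite A, using only matrix-vector products."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x                       # residual
    p = r.copy()                        # first search direction
    rs_old = r @ r
    for _ in range(max_iters or len(b)):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)       # exact step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p   # next direction, conjugate to the previous ones
        rs_old = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))         # matches np.linalg.solve(A, b)
```

Each new search direction is chosen to be conjugate to the previous ones, so in exact arithmetic the method terminates in at most n iterations for an n-by-n system.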

Know more about gradient methods here (https://mlfromscratch.com/optimizers-explained/#/).

Derivative-Free Optimisation 
Some optimisation problems cannot be approached through gradients because the derivative of the objective function may not exist or is not easy to calculate. This is where derivative-free optimisation comes into the picture. It uses heuristic algorithms that choose approaches which have already worked well, rather than deriving solutions systematically. Classical simulated annealing, genetic algorithms and particle swarm optimisation are a few such examples.
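As a sketch of one such heuristic, here is a minimal simulated annealing loop that only ever evaluates the objective, never its derivative; the cooling schedule, step size and the multimodal test function are illustrative assumptions.

```python
import numpy as np

def simulated_annealing(f, x0, temp=1.0, cooling=0.995, step=0.5, n_iters=5000, seed=0):
    """Random search that sometimes accepts worse moves, less often as the temperature cools."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    best_x, best_f = x.copy(), fx
    for _ in range(n_iters):
        cand = x + step * rng.normal(size=x.shape)          # random neighbour
        fc = f(cand)
        # Always accept improvements; accept worse points with Boltzmann probability
        if fc < fx or rng.random() < np.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x.copy(), fx
        temp *= cooling                                     # cool down
    return best_x, best_f

# Multimodal test function (Rastrigin-like), queried without any derivatives
f = lambda x: np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x) + 10)
print(simulated_annealing(f, np.array([3.0, -2.5])))
```

Early on, worse candidates are accepted fairly often, which helps the search escape local minima; as the temperature cools, the loop behaves more and more greedily.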

Zeroth Order Optimisation


Zeroth order optimisation (https://analyticsindiamag.com/zeroth-order-optimisation-and-its-applications-in-deep-learning/) was introduced recently to address the shortcomings of derivative-free optimisation. Derivative-free optimisation methods find it difficult to scale to large-size problems and suffer from a lack of convergence rate analysis.

Zeroth order advantages include:

Ease of implementation, with only a small modification of commonly used gradient-based algorithms
Computationally efficient approximations to derivatives when they are difficult to compute (sketched below)
Convergence rates comparable to first-order algorithms
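A minimal sketch of the approach, assuming the objective can only be queried and not differentiated: a two-point random-direction estimate of the gradient is plugged into an ordinary gradient descent loop. The sample count, smoothing parameter and toy objective are illustrative.

```python
import numpy as np

def zo_gradient(f, x, rng, mu=1e-3, n_samples=20):
    """Two-point zeroth-order gradient estimate: only function queries, no derivatives."""
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.normal(size=x.shape)                      # random probing direction
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_samples

def zo_gradient_descent(f, x0, lr=0.1, n_iters=300, seed=0):
    """Plug the estimated gradient into an ordinary gradient descent loop."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x -= lr * zo_gradient(f, x, rng)
    return x

f = lambda x: np.sum((x - np.array([1.0, -2.0])) ** 2)   # minimum at (1, -2)
print(zo_gradient_descent(f, np.zeros(2)))                # approximately [ 1. -2.]
```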

For Meta Learning


The meta-optimiser is popular within the meta learning (https://analyticsindiamag.com/how-to-make-meta-learning-more-effective/) regime. The purpose of meta learning is to achieve fast learning, which, in turn, makes gradient descent more accurate in the optimisation. The optimisation process itself can be regarded as a learning problem: the update is learned and predicted rather than prescribed by a fixed gradient descent algorithm. Owing to the similarity between the gradient update in backpropagation and the LSTM cell-state update, the LSTM is often used as the meta-optimiser.

The model-agnostic meta learning algorithm (MAML) is another method; it learns the parameters of models trained with gradient descent on tasks that include classification, regression and reinforcement learning. The basic idea is to train on multiple tasks at the same time, obtain a combined gradient direction from the different tasks, and thereby learn a common base model.
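As a toy sketch of the MAML idea, assume each task τ is a one-dimensional quadratic loss f_τ(θ) = (θ - c_τ)²; the task distribution, step sizes and function names below are illustrative and not from the article.

```python
import numpy as np

# Each task τ is a quadratic loss f_τ(θ) = (θ - c_τ)^2 with its own optimum c_τ.
def loss(theta, c):
    return (theta - c) ** 2

def grad(theta, c):
    return 2 * (theta - c)

def maml(task_centres, theta0=0.0, inner_lr=0.1, outer_lr=0.01, n_iters=2000, seed=0):
    """Learn an initialisation theta from which one inner gradient step adapts well."""
    rng = np.random.default_rng(seed)
    theta = theta0
    for _ in range(n_iters):
        batch = rng.choice(task_centres, size=4)
        meta_grad = 0.0
        for c in batch:
            theta_adapted = theta - inner_lr * grad(theta, c)     # inner (task) step
            # Outer gradient of loss(theta_adapted, c) w.r.t. theta, through the inner step
            meta_grad += grad(theta_adapted, c) * (1 - 2 * inner_lr)
        theta -= outer_lr * meta_grad / len(batch)                # meta (outer) step
    return theta

tasks = np.array([-2.0, 0.0, 1.0, 5.0])
print(maml(tasks))   # an initialisation close to the mean of the task optima
```

The outer gradient is taken through the inner adaptation step, so the learned initialisation is one from which a single gradient step already does well on a freshly sampled task; for these quadratic tasks that works out to roughly the mean of the task optima.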

These are a few of the frequently used optimisation methods. Apart from these, there are other methods, which can be found in this study (https://arxiv.org/pdf/1906.06821.pdf).


That said, the problem of optimisation for deep learning doesn't end here, because not all problems come under convex optimisation. Non-convex optimisation is one of the main difficulties in the optimisation problem.

One approach is to transform the non-convex optimisation into a convex optimisation problem, and then use convex optimisation methods. The other is to use special optimisation methods such as projected gradient descent, alternating minimisation, the expectation-maximisation algorithm, and stochastic optimisation and its variants.
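As a sketch of the first of those special methods, here is projected gradient descent under an assumed box constraint; the objective, bounds and helper names are illustrative.

```python
import numpy as np

def projected_gradient_descent(grad_fn, project, x0, lr=0.1, n_iters=200):
    """Take a gradient step, then project back onto the feasible set."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = project(x - lr * grad_fn(x))
    return x

# Minimise ||x - (3, -4)||^2 subject to the box constraint -1 <= x_i <= 1
grad_fn = lambda x: 2 * (x - np.array([3.0, -4.0]))
project = lambda x: np.clip(x, -1.0, 1.0)
print(projected_gradient_descent(grad_fn, project, np.zeros(2)))  # -> [ 1. -1.]
```

The projection after every step keeps the iterate feasible, which is all that separates the method from plain gradient descent.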
