
Optimizers: Lion vs Adam

Introduction
Deep Learning is a sub-field of machine learning that allows machines to process data in a manner loosely inspired by the human brain. The backbone of DL is a network of nodes connected to each other in layers, and a stack of these layers forms a neural network. Input data passes through several layers and is progressively refined to make accurate predictions. Ideally, a neural network takes data (features) in through an input layer and produces results from an output layer. Between these two layers is where the major processing and fine-tuning takes place, in the hidden layers. In this post, I am focusing on understanding the algorithms responsible for this fine-tuning between layers. These algorithms are called Optimizers.
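
To picture that flow of data, here is a minimal sketch in plain NumPy; the layer sizes, random inputs and ReLU activation are illustrative assumptions, not something prescribed by this post:

    import numpy as np

    # Toy forward pass: input layer -> one hidden layer -> output layer.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4,))            # input features
    W1 = rng.normal(size=(8, 4))         # weights: input -> hidden
    b1 = np.zeros(8)                     # biases for the hidden layer
    W2 = rng.normal(size=(1, 8))         # weights: hidden -> output
    b2 = np.zeros(1)                     # bias for the output layer

    hidden = np.maximum(0, W1 @ x + b1)  # hidden layer with ReLU activation
    output = W2 @ hidden + b2            # prediction from the output layer
    print(output)

The weights and biases in this sketch are exactly the attributes an optimizer adjusts during training.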

What are Optimizers?


Optimizers are algorithms that tweak certain attributes of your neural network, such as the weights and the learning rate, in order to reduce losses. We are touching four very technical terms from the Deep Learning world, so let's try and understand what they mean.
1. Weights: They control the strength of the connection between two consecutive nodes and decide how much one layer affects the next. This helps us understand how the input layer contributed to the results produced by the output layer.
2. Biases: These are constants that shift the level at which an activation function is triggered; that function is responsible for whether a neuron is activated or not. A bias acts like the constant term in a linear equation: an additional parameter that adjusts the output.
3. Learning Rate: It is important for deep learning models to be able to take in new and updated data and train on it. The learning rate is the step size of each parameter update, and it controls how quickly a model can adapt to change.
4. Losses: A loss measures how far off our predicted value is from the target value. It helps us understand the accuracy of the model.
Optimizers are used to minimize these differences, called losses, by adjusting the parameters discussed above.
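
To make those four terms concrete, here is a minimal sketch of a plain gradient descent update fitting a straight line; the data, the learning rate and the step count are illustrative assumptions chosen for this example:

    import numpy as np

    # Fit y = 2x + 1 with a single weight and bias.
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y_true = 2.0 * x + 1.0

    w, b = 0.0, 0.0          # weight and bias, the parameters being tuned
    learning_rate = 0.1      # how big each update step is

    for step in range(100):
        y_pred = w * x + b
        loss = np.mean((y_pred - y_true) ** 2)       # how far off we are (mean squared error)
        grad_w = np.mean(2 * (y_pred - y_true) * x)  # gradient of the loss w.r.t. the weight
        grad_b = np.mean(2 * (y_pred - y_true))      # gradient of the loss w.r.t. the bias
        w -= learning_rate * grad_w                  # adjust the parameters
        b -= learning_rate * grad_b                  # to reduce the loss

    print(w, b, loss)  # w and b approach 2 and 1 as the loss shrinks

Every optimizer discussed below is, at heart, a smarter version of that final "adjust the parameters" step.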
There are several optimizers to pick from; some commonly used ones are the Gradient Descent optimizer, the Adam optimizer, and Stochastic Gradient Descent with Momentum. As a beginner I always thought that simply increasing the number of epochs would yield better results, but that is not true. Optimizers need to be picked keeping in mind which parameters we are tuning and how well they adapt to the amount of data we plan to feed the model. We are going to discuss one such optimizer, called "Adam".

What does Adam do?


Adam is an abbreviation for Adaptive Moment Estimation. It is a combination of two other optimizers, Stochastic Gradient Descent with Momentum and RMSprop. To understand what these two algorithms bring to the table, imagine a hill: if we are trying to get to its lowest point, the gradient descent algorithm does the job. We aim for the lowest point because that is where the losses are small. On the way down we may have to cross high ground as well; RMSprop decides how big each step must be for us to reach that point, neither too big nor too small. With SGD we navigate the hill in smaller portions and use momentum to keep us pointed in the right direction. Adam combines RMSprop's ability to adapt the learning rate with SGD with Momentum's ability to keep moving in the right direction. Its primary focus is to adjust per-parameter learning rates for better model accuracy. While Adam does a good job with clean data, it does not do very well with noisy data, i.e., the effective learning rate fluctuates heavily when it encounters noisy gradients. Researchers may have found just the right solution for that.

What does Lion do?


While Adam might seem like an ideal optimizer despite its negatives, researchers have recently come up with a new optimizer called the Lion optimizer (EvoLved Sign Momentum), which addresses the disadvantages of the Adam optimizer. This algorithm was discovered by Google Brain together with the University of California, Los Angeles (UCLA), and it has proven to be better than Adam in several ways. The Lion optimizer focuses on tracking momentum while leveraging the sign operation (which reduces every update to plus or minus one), helping the algorithm keep moving in a consistent direction despite noisy data. The simplicity of this algorithm also makes it memory efficient, since it keeps only a single momentum buffer. But don't let that simplicity make you question its accuracy; in several instances it has proven to perform better than Adam.
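
The Lion update is short enough to sketch directly. Below is a minimal NumPy version; the toy gradient, learning rate and weight decay value are illustrative assumptions, while the sign-of-momentum step and the single momentum buffer follow the algorithm as its authors describe it:

    import numpy as np

    def lion_step(theta, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, weight_decay=0.0):
        # Every entry of the update is +1 or -1, regardless of gradient magnitude,
        # which keeps the step size uniform even when the gradients are noisy.
        update = np.sign(beta1 * m + (1 - beta1) * grad)
        theta = theta - lr * (update + weight_decay * theta)
        m = beta2 * m + (1 - beta2) * grad       # the single momentum buffer Lion maintains
        return theta, m

    # Toy usage on the same assumed quadratic loss as in the Adam sketch.
    theta = np.zeros(3)
    m = np.zeros_like(theta)
    for _ in range(100):
        grad = 2 * (theta - np.array([1.0, -2.0, 0.5]))
        theta, m = lion_step(theta, grad, m, lr=0.01)
    print(theta)

Compared with the Adam sketch, Lion stores one buffer instead of two and never divides by a squared-gradient estimate, which is where its memory savings and noise tolerance come from.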

Conclusion
In conclusion, selecting the right optimizer depends on more than one factor. Each kind of dataset may call for a specific type of optimizer, and a good deal of trial and error helps us understand how these algorithms behave and make better choices.
