GRADIENT DESCENT OPTIMIZATION
Gradient descent is an optimization algorithm in machine learning used to minimize a function by
iteratively moving towards the minimum value of that function.
We essentially use this algorithm when we have to find the least possible values that can
satisfy a given cost function. In machine learning, more often than not we try
to minimize loss functions (like mean squared error). By minimizing the loss function, we can
improve our model, and gradient descent is one of the most popular algorithms used for this
purpose.
The graph above shows how exactly a gradient descent algorithm works.
We first take a point on the cost function and begin moving in steps towards the
minimum point. The size of the step, or how quickly we converge to the minimum
point, is defined by the learning rate.
We can cover more ground with a higher learning rate, but at the risk of overshooting the
minima. On the other hand, small steps / smaller learning rates will take a
lot of time to reach the lowest point.
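As a concrete illustration of these steps, here is a minimal sketch (our own, not part of the original notes) of gradient descent on the simple cost function f(w) = (w - 3)^2, whose minimum lies at w = 3. The starting point, learning rate, and number of steps are illustrative assumptions.

# A minimal sketch (illustrative, not from the notes) of gradient descent
# on f(w) = (w - 3)**2, whose minimum is at w = 3.

def grad(w):
    # Derivative of f(w) = (w - 3)**2 is 2 * (w - 3).
    return 2.0 * (w - 3.0)

w = 0.0              # starting point on the cost function
learning_rate = 0.1  # step size; too large a value risks overshooting the minimum
num_steps = 50

for step in range(num_steps):
    w = w - learning_rate * grad(w)   # move against the gradient

print(w)  # converges close to 3.0, the minimum point

With a learning rate of 0.1 the iterates approach the minimum smoothly; if the learning rate in this sketch were raised above 1.0, the same loop would overshoot and diverge, which is exactly the trade-off described above.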
Now, the direction in which the algorithm has to move is also important. We calculate this by
using derivatives. You need to be familiar with derivatives from calculus. A derivative is
basically calculated as the slope of the graph at any particular point.
We get that by finding the tangent line to the graph at that point. The steeper
the tangent, the more steps would be needed to reach the minimum point;
a less steep tangent suggests that fewer steps are required to reach the minimum.
Figure: the error surface over the w0-w1 plane forms a parabola with a single global minimum for every weight vector; the negated gradient gives the direction in the w0-w1 plane producing the steepest descent.
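The derivative's sign tells us which way to move: a positive slope means the minimum lies to the left, a negative slope means it lies to the right. The short sketch below (our own illustration; the function and finite-difference step are assumptions) estimates the slope of the tangent numerically and picks the direction accordingly.

# Illustrative sketch: estimate the slope of the tangent at a point and use its
# sign to choose the direction of movement. The function and step h are assumptions.

def f(w):
    return (w - 3.0) ** 2

def numerical_slope(w, h=1e-6):
    # Central-difference approximation of the derivative (slope of the tangent) at w.
    return (f(w + h) - f(w - h)) / (2.0 * h)

w = 5.0
slope = numerical_slope(w)
direction = -1.0 if slope > 0 else 1.0  # step opposite to the sign of the slope
print(slope, direction)  # slope is positive at w = 5, so we step towards smaller w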
STOCHASTIC GRADIENT DESCENT
The word stochastic refers to a system or process that involves randomness.
Hence, in stochastic gradient descent, a few samples are selected randomly instead of the
whole data set for each iteration.
Stochastic gradient descent is a type of gradient descent that processes one training example per
iteration. Over a training epoch it works through the examples in the dataset and updates the
model's parameters after each example, one at a time.
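To make this one-example-per-update procedure concrete, here is a minimal sketch of stochastic gradient descent fitting a line y ≈ w * x + b to a tiny synthetic dataset. The dataset, learning rate, and number of epochs are illustrative assumptions rather than part of the original notes.

import random

# Illustrative SGD sketch for a linear model y ≈ w * x + b on synthetic data.

data = [(x, 2.0 * x + 1.0) for x in range(10)]  # true relation: y = 2x + 1
w, b = 0.0, 0.0
learning_rate = 0.01

for epoch in range(200):
    random.shuffle(data)          # visit the examples in a random order each epoch
    for x, y in data:             # one training example per parameter update
        error = (w * x + b) - y
        # Gradients of the squared error (w*x + b - y)**2 with respect to w and b.
        w -= learning_rate * 2.0 * error * x
        b -= learning_rate * 2.0 * error

print(w, b)  # approaches w ≈ 2, b ≈ 1

Because each update looks at only a single example, every step is cheap but noisy; the parameters wander around the minimum rather than descending smoothly, which is the behaviour discussed below.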
As it requires only one training example at a time, it is easier to fit in the allocated
memory. However, it loses some computational efficiency in comparison to batch
gradient descent, because the frequent updates add overhead.
Further, due to the frequent updates, the gradient estimate is noisy. However, this noise
can sometimes be helpful in escaping local minima and finding the global minimum.
Advantages of stochastic gradient descent: It is easier to fit in the available memory. It is
relatively faster to compute than batch gradient descent. It is more efficient for large datasets.
Disadvantages of stochastic gradient descent: SGD requires a number of hyperparameters, such
as the regularization parameter and the number of iterations.
In Gradient Descent, there is a term called “batch” which denotes the total number of samples
from a dataset that is used for calculating the gradient for each iteration.
In typical Gradient Descent optimization, like Batch Gradient Descent, the batch is taken to
be the whole dataset. Although using the whole dataset is really useful for getting to the
minima in a less noisy and less random manner, the problem arises when our dataset gets big.
Suppose you have a million samples in your dataset; if you use a typical Gradient Descent
optimization technique, you will have to use all of the one million samples for completing
one iteration of Gradient Descent, and this has to be done for every iteration
until the minima are reached. Hence, it becomes computationally very expensive to perform.
This problem is solved by Stochastic Gradient Descent. SGD uses only a single sample,
i.e., a batch size of one, to perform each iteration.
The samples are randomly shuffled, and one is selected for performing each iteration.
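To make the "batch" terminology concrete, the sketch below (our own illustration, with an invented dataset and learning rate) compares one parameter update of batch gradient descent, which averages the gradient over every sample, against one update of SGD, which uses a single randomly chosen sample.

import random

# Illustrative comparison: one update of batch gradient descent (batch = whole
# dataset) versus one update of SGD (batch size = 1) for the model y ≈ w * x.

data = [(x, 2.0 * x) for x in range(1, 51)]  # 50 samples with true w = 2
learning_rate = 0.0001
w = 0.0

# Batch gradient descent: a single iteration touches all len(data) samples.
batch_grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
w_batch = w - learning_rate * batch_grad

# Stochastic gradient descent: a single iteration touches one random sample.
x, y = random.choice(data)
sgd_grad = 2.0 * (w * x - y) * x
w_sgd = w - learning_rate * sgd_grad

print(w_batch, w_sgd)  # both move towards w = 2; the SGD step used 1 sample instead of 50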