Stochastic Gradient Descent

Uploaded by

mt391446

What is a Gradient?

The gradient of a function is the vector of its partial derivatives. It measures
how much the function's output changes in response to a small change in each input.
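
As a rough illustration of this definition, the sketch below approximates a gradient numerically by nudging each input slightly and measuring the change in the output. The function `f` and the helper name `numerical_gradient` are assumptions made for this example only.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x (a list of floats) by central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h   # nudge input i slightly up
        x_minus[i] -= h  # and slightly down
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function: f(x0, x1) = x0^2 + 3*x1
    return x[0] ** 2 + 3 * x[1]


grad = numerical_gradient(f, [2.0, 1.0])
print(grad)  # close to the analytic gradient [2*x0, 3] = [4.0, 3.0]
```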

Stochastic Gradient Descent (SGD)

 

Gradient Descent is an iterative optimization process that searches for an
objective function’s optimum value (minimum or maximum). It is one of the most
widely used methods for adjusting a model’s parameters in order to reduce a cost
function in machine learning projects.
The primary goal of gradient descent is to find the model parameters that
minimize the cost function on the training data, with the aim that the model
also performs well on unseen test data. The gradient is a vector pointing in the
direction of the function’s steepest increase at a particular point. By
repeatedly moving in the opposite direction of the gradient, the algorithm
gradually descends towards lower values of the function until it reaches a
minimum.
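
The idea of stepping against the gradient can be sketched on a one-dimensional function. The cost f(w) = (w - 3)^2, the starting point, and the learning rate below are assumptions chosen only to keep the example small.

```python
def gradient_descent(grad_fn, w0, learning_rate=0.1, n_steps=100):
    """Repeatedly step in the direction opposite to the gradient."""
    w = w0
    for _ in range(n_steps):
        w = w - learning_rate * grad_fn(w)
    return w


def grad(w):
    # Derivative of the hypothetical cost f(w) = (w - 3)^2
    return 2 * (w - 3)


w_min = gradient_descent(grad, w0=0.0)
print(w_min)  # approaches 3, the minimizer of f
```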
Types of Gradient Descent:
Typically, there are three types of Gradient Descent:
1. Batch Gradient Descent
2. Stochastic Gradient Descent
3. Mini-batch Gradient Descent
Here we are discussing Stochastic Gradient Descent (SGD).

Stochastic Gradient Descent (SGD):

Stochastic Gradient Descent (SGD) is a variant of the Gradient
Descent algorithm that is used for optimizing machine learning models. It
addresses the computational inefficiency of traditional Gradient Descent
methods when dealing with large datasets in machine learning projects.
In SGD, instead of using the entire dataset for each iteration, only a single
random training example (or a small batch) is selected to calculate the gradient
and update the model parameters. This random selection introduces randomness
into the optimization process, hence the term “stochastic” in Stochastic Gradient
Descent.
The advantage of using SGD is its computational efficiency, especially when
dealing with large datasets. By using a single example or a small batch, the
computational cost per iteration is significantly reduced compared to traditional
Gradient Descent methods that require processing the entire dataset.
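
To make the cost difference concrete, the following sketch computes a full-batch gradient and a single-example gradient for a one-parameter model y ≈ w*x with a squared-error cost. The toy data and all function names are assumptions for illustration.

```python
import random

random.seed(0)
# Hypothetical toy dataset: y is roughly 2*x plus a little noise
xs = [random.uniform(-1, 1) for _ in range(1000)]
ys = [2.0 * x + random.gauss(0, 0.1) for x in xs]


def sample_gradient(w, x, y):
    # Gradient of 0.5 * (w*x - y)^2 with respect to w, for one example
    return (w * x - y) * x


def full_gradient(w):
    # Batch gradient: averages the per-example gradient over the whole dataset
    return sum(sample_gradient(w, x, y) for x, y in zip(xs, ys)) / len(xs)


w = 0.0
g_full = full_gradient(w)                  # touches all 1000 examples
i = random.randrange(len(xs))
g_one = sample_gradient(w, xs[i], ys[i])   # touches a single example
print(g_full, g_one)
```

The single-example gradient is a noisy estimate of the batch gradient: averaged over every possible choice of example, it equals the batch gradient, but each individual evaluation costs a fraction of the work.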
Stochastic Gradient Descent Algorithm
1. Initialization: Randomly initialize the parameters of the model.
2. Set Parameters: Determine the number of iterations and the learning rate
(alpha) for updating the parameters.
3. Stochastic Gradient Descent Loop: Repeat the following steps until the
model converges or reaches the maximum number of iterations:
a. Shuffle the training dataset to introduce randomness.
b. Iterate over each training example (or a small batch) in the shuffled
order.
c. Compute the gradient of the cost function with respect to the model
parameters using the current training example (or batch).
d. Update the model parameters by taking a step in the direction of the
negative gradient, scaled by the learning rate.
e. Evaluate the convergence criteria, such as the change in the cost function
between iterations or the magnitude of the gradient.
4. Return Optimized Parameters: Once the convergence criteria are met or
the maximum number of iterations is reached, return the optimized model
parameters.
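
The steps above can be sketched for a simple linear model y ≈ w*x + b with a squared-error cost. This is a minimal illustration under assumed settings: the model, the synthetic data, the tolerance `tol`, and the function name `sgd_linear_regression` are not part of the original notes.

```python
import random


def sgd_linear_regression(data, learning_rate=0.05, n_epochs=50, tol=1e-8, seed=0):
    rng = random.Random(seed)
    # Initialization: random starting parameters
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data = list(data)
    prev_cost = None
    for _ in range(n_epochs):
        rng.shuffle(data)                    # a. shuffle the training data
        for x, y in data:                    # b. visit each example in turn
            error = (w * x + b) - y          # c. gradient of 0.5 * error^2
            w -= learning_rate * error * x   # d. step against the gradient
            b -= learning_rate * error
        # e. convergence check: change in mean cost between epochs
        cost = sum(0.5 * ((w * x + b) - y) ** 2 for x, y in data) / len(data)
        if prev_cost is not None and abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
    return w, b                              # return the optimized parameters


# Hypothetical synthetic data drawn around the line y = 3x + 1
data_rng = random.Random(1)
inputs = [data_rng.uniform(-1, 1) for _ in range(200)]
data = [(x, 3.0 * x + 1.0 + data_rng.gauss(0, 0.05)) for x in inputs]

w, b = sgd_linear_regression(data)
print(w, b)  # should land near 3 and 1
```

Note that the convergence check here costs one extra pass over the data per epoch; in practice it is often run less frequently or on a held-out subset.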
In SGD, since only one sample from the dataset is chosen at random for each
iteration, the path taken by the algorithm to reach the minimum is usually noisier
than that of Batch Gradient Descent. This matters little in practice: the exact
path is unimportant as long as the algorithm reaches the minimum, and it does so
with a significantly shorter training time.
The path taken by Batch Gradient Descent:
[Figure: Batch gradient optimization path]

The path taken by Stochastic Gradient Descent:
[Figure: Stochastic gradient optimization path]

Note that, because SGD is noisier than Batch Gradient Descent, it usually takes
a higher number of iterations to reach the minimum. Even so, each iteration is so
much cheaper that SGD is typically far less expensive overall. Hence, in
most scenarios, SGD is preferred over Batch Gradient Descent for optimizing a
learning algorithm.
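
One way to see this trade-off is to give both methods the same budget of per-example gradient evaluations and compare how far each gets. The sketch below does this on an assumed toy problem (y ≈ 2*x, one parameter); the budget, learning rate, and data are illustrative choices, and results on real problems will vary.

```python
import random

rng = random.Random(0)
# Hypothetical toy data: y is roughly 2*x
xs = [rng.uniform(-1, 1) for _ in range(1000)]
ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]
n = len(xs)
learning_rate = 0.1
budget = 5 * n  # total per-example gradient evaluations allowed

# Batch Gradient Descent: each iteration spends n evaluations
w_batch = 0.0
batch_iterations = budget // n
for _ in range(batch_iterations):
    g = sum((w_batch * x - y) * x for x, y in zip(xs, ys)) / n
    w_batch -= learning_rate * g

# SGD: each iteration spends one evaluation
w_sgd = 0.0
sgd_iterations = budget
for _ in range(sgd_iterations):
    i = rng.randrange(n)
    w_sgd -= learning_rate * (w_sgd * xs[i] - ys[i]) * xs[i]

batch_err = abs(w_batch - 2.0)
sgd_err = abs(w_sgd - 2.0)
print(batch_iterations, batch_err)
print(sgd_iterations, sgd_err)
```

With this setup SGD performs many more iterations than Batch Gradient Descent, yet for the same amount of gradient computation it ends much closer to the true slope.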

Advantages of Stochastic Gradient Descent

Speed: Each SGD update is faster than an update in Batch Gradient Descent or
Mini-Batch Gradient Descent, since it uses only one example to update the
parameters.
Memory Efficiency: Since SGD updates the parameters for each training
example one at a time, it is memory-efficient and can handle large datasets that
cannot fit into memory.
Avoidance of Local Minima: The noisy updates in SGD can help it escape
shallow local minima, although convergence to a global minimum is not guaranteed.
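
The memory-efficiency point can be illustrated with a generator that yields one example at a time, standing in for a dataset streamed from disk. The generator `stream_examples`, the linear model, and the data are assumptions for this sketch; only the current example is held in memory during training.

```python
import random


def stream_examples(n, seed=0):
    """Yield (x, y) pairs one at a time, as a stand-in for reading from disk."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.uniform(-1, 1)
        yield x, 4.0 * x - 2.0 + rng.gauss(0, 0.05)


w, b = 0.0, 0.0
learning_rate = 0.05
for x, y in stream_examples(20000):
    error = (w * x + b) - y
    w -= learning_rate * error * x   # update from this single example
    b -= learning_rate * error

print(w, b)  # should land near 4 and -2
```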

Disadvantages of Stochastic Gradient Descent

Noisy updates: The updates in SGD are noisy and have a high variance,
which can make the optimization process less stable and lead to oscillations
around the minimum.
Slow Convergence: SGD may require more iterations to converge to the
minimum since it updates the parameters for each training example one at a
time.
Sensitivity to Learning Rate: The choice of learning rate can be critical in
SGD since using a high learning rate can cause the algorithm to overshoot the
minimum, while a low learning rate can make the algorithm converge slowly.
Less Accurate: Due to the noisy updates, SGD may not converge to the exact
minimum and can settle on a suboptimal solution. This can be mitigated
by using techniques such as learning rate scheduling and momentum-based
updates.
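
The two mitigation techniques mentioned above can be sketched together. The example below combines an inverse-time learning rate decay with a classical momentum term on an assumed one-parameter problem; the schedule, the hyperparameter values, and the function name `sgd_momentum` are illustrative choices, not prescriptions from the original notes.

```python
import random


def sgd_momentum(data, base_lr=0.1, decay=0.01, momentum=0.9, n_epochs=30, seed=0):
    rng = random.Random(seed)
    data = list(data)
    w, velocity, step = 0.0, 0.0, 0
    for _ in range(n_epochs):
        rng.shuffle(data)
        for x, y in data:
            # Learning rate scheduling: shrink the step size over time
            lr = base_lr / (1.0 + decay * step)
            grad = (w * x - y) * x
            # Momentum: blend the previous update direction with the new gradient
            velocity = momentum * velocity - lr * grad
            w += velocity
            step += 1
    return w


# Hypothetical toy data: y is roughly 2*x
data_rng = random.Random(1)
inputs = [data_rng.uniform(-1, 1) for _ in range(200)]
data = [(x, 2.0 * x + data_rng.gauss(0, 0.1)) for x in inputs]

w = sgd_momentum(data)
print(w)  # should land near 2
```

The decaying learning rate damps the oscillations around the minimum as training proceeds, while the momentum term smooths out the noise of individual updates.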
