
Stochastic Gradient Descent (SGD)

What is Gradient Descent


Gradient descent is an iterative optimization algorithm used to minimize a
loss function, which represents how far the model’s predictions are from
the actual values. The main goal is to adjust the parameters of a model
(weights, biases, etc.) so that the error is minimized.
• The update rule for the traditional gradient descent algorithm is:
  θ = θ − η · ∇θ J(θ)
  where θ are the model parameters, η is the learning rate, and ∇θ J(θ) is the
  gradient of the loss J computed over the entire training set.
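As a minimal illustration (assuming a linear model with squared-error loss; the toy data and variable names below are illustrative, not from the slides), the full-batch update in Python/NumPy could look like:

  import numpy as np

  # Toy data: 100 examples with 3 features (illustrative only)
  X = np.random.randn(100, 3)
  true_w = np.array([2.0, -1.0, 0.5])
  y = X @ true_w + 0.1 * np.random.randn(100)

  w = np.zeros(3)    # model parameters (theta)
  eta = 0.1          # learning rate

  for epoch in range(200):
      # Gradient of the mean squared error, computed over ALL data points
      grad = 2 * X.T @ (X @ w - y) / len(y)
      w -= eta * grad  # theta <- theta - eta * gradient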

Stochastic Gradient Descent (SGD)


• Stochastic Gradient Descent (SGD) is an optimization algorithm used in machine
learning, particularly when dealing with large datasets. It is a variant of the
traditional gradient descent algorithm but offers several advantages in
terms of efficiency and scalability, making it the go-to method for many
deep-learning tasks.
• To understand SGD, it’s essential to first comprehend the concept
of gradient descent.
Need for Stochastic Gradient Descent

• For large datasets, computing the gradient using all data points can be
slow and memory-intensive. This is where SGD comes into play.
• Instead of using the full dataset to compute the gradient at each step, SGD
uses only one random data point (or a small batch of data points) at each
iteration. This makes the computation much faster.
Path followed by batch gradient descent vs. path followed by SGD:

[Figure: optimization path followed by gradient descent vs. optimization path followed by SGD]
Working of Stochastic Gradient Descent

• In Stochastic Gradient Descent, the gradient is calculated for each training
example (or a small subset of training examples) rather than the entire
dataset.
• The update rule becomes:
  θ = θ − η · ∇θ J(θ; x_i, y_i)
  where (x_i, y_i) is a single randomly chosen training example.

The key difference from traditional gradient descent is that, in SGD, the
parameter updates are made based on a single data point, not the entire
dataset.
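
A minimal sketch of the same linear-regression example trained with SGD (the toy data and variable names are illustrative, not from the slides); note that each update touches only one example:

  import numpy as np

  X = np.random.randn(100, 3)                 # toy data (illustrative)
  true_w = np.array([2.0, -1.0, 0.5])
  y = X @ true_w + 0.1 * np.random.randn(100)

  w = np.zeros(3)    # model parameters (theta)
  eta = 0.01         # learning rate

  for epoch in range(20):
      order = np.random.permutation(len(y))   # shuffle examples each epoch
      for i in order:
          x_i, y_i = X[i], y[i]
          # Gradient of the squared error on a SINGLE example (x_i, y_i)
          grad = 2 * x_i * (x_i @ w - y_i)
          w -= eta * grad  # theta <- theta - eta * gradient on one example

Because the gradient comes from one example at a time, the path toward the minimum is noisier than in full-batch gradient descent, but each step is far cheaper to compute.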
Advantages of Stochastic Gradient Descent:
• Efficiency: each update uses only one example (or a small batch), so steps are fast even on very large datasets.
• Memory Efficiency: the whole dataset does not have to be loaded to compute an update.
• Escaping Local Minima: the noise in the per-example updates can help the optimizer move out of shallow local minima.
• Online Learning: the model can continue to be updated as new data points arrive.
Applications of Stochastic Gradient Descent

SGD and its variants are widely used across various domains of machine
learning:
• Deep Learning
• Natural Language Processing (NLP)
• Computer Vision
• Reinforcement Learning
