
Types of Gradient Descent

1. Batch Gradient Descent:

o Computes the gradient using the entire dataset.

o Advantages: Stable updates and accurate convergence.

o Disadvantages: Computationally expensive for large datasets.

2. Stochastic Gradient Descent (SGD):

o Uses a single randomly selected sample for each gradient update.

o Advantages: Faster updates, works well for large datasets.

o Disadvantages: Noisy updates can lead to fluctuations around the minimum.

3. Mini-Batch Gradient Descent:

o Combines the best of batch and stochastic methods by using a small random subset (a mini-batch) of the dataset.

o Advantages: Faster than batch gradient descent and more stable than SGD; all three update strategies are contrasted in the sketch after this list.

o Disadvantages: Introduces the batch size as an additional hyperparameter to tune.
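
All three variants apply the same update rule, w <- w - lr * grad(L(w)); they differ only in how much data is used to estimate the gradient at each step. The following NumPy sketch is added for illustration and is not part of the original notes; the function names, the least-squares setting, and the learning rates are assumptions.

# Minimal sketch (illustrative only): batch, stochastic, and mini-batch
# gradient descent for least-squares linear regression, y ~ X @ w.
import numpy as np

def gradient(X, y, w):
    # Gradient of the mean squared error loss with respect to w.
    return 2 * X.T @ (X @ w - y) / len(y)

def batch_gd(X, y, lr=0.1, epochs=100):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * gradient(X, y, w)            # full dataset per update
    return w

def sgd(X, y, lr=0.01, epochs=100):
    w = np.zeros(X.shape[1])
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        for i in rng.permutation(len(y)):      # one random sample per update
            w -= lr * gradient(X[i:i+1], y[i:i+1], w)
    return w

def mini_batch_gd(X, y, lr=0.05, epochs=100, batch_size=32):
    w = np.zeros(X.shape[1])
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        idx = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            b = idx[start:start + batch_size]  # small random subset per update
            w -= lr * gradient(X[b], y[b], w)
    return w

On synthetic data the three functions recover roughly the same weights; the SGD iterates fluctuate the most from step to step, which matches the noisy-update behaviour described above.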

Challenges in Gradient Descent

1. Local Minima:

o The loss function may have multiple local minima, especially in non-convex problems like deep learning.

2. Saddle Points:

o Points where the gradient is zero but that are neither minima nor maxima; these can slow down convergence.

3. Vanishing/Exploding Gradients:

o In deep networks, gradients may become extremely small or large, causing issues during backpropagation.

4. Choosing the Learning Rate:

o A poor choice of learning rate can result in:

▪ Too small: Slow convergence.

▪ Too large: Overshooting or divergence (a small numerical illustration follows this list).
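
A tiny numerical illustration (an assumed example, not from the original notes): when minimizing f(w) = w^2, whose gradient is 2w, each update multiplies w by (1 - 2*lr), so the learning rate directly controls whether the iterates shrink slowly, shrink quickly, or blow up.

# Illustrative only: effect of the learning rate when minimizing f(w) = w**2,
# whose gradient is 2*w. Each update multiplies w by (1 - 2*lr).
def run(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(run(0.01))   # ~0.667  -> too small: slow convergence
print(run(0.45))   # ~1e-20  -> well chosen: fast convergence
print(run(1.1))    # ~38.3   -> too large: |w| grows each step (divergence)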

Variants of Gradient Descent

To address the above challenges, advanced optimization algorithms have been developed:

1. Momentum:

o Adds a momentum term to smooth updates and prevent oscillations.

2. RMSProp:

o Adjusts the learning rate for each parameter based on recent gradient magnitudes.

3. Adam (Adaptive Moment Estimation):

o Combines Momentum and RMSProp to adapt the learning rate for each parameter (the update rules for all three are sketched below).
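
As a rough sketch of the per-step update rules (assumed notation and default constants, not taken from the original document): g is the current gradient, lr the learning rate, t the 1-based step count, and v, s, m hold the running statistics each method maintains.

import numpy as np

# Illustrative update rules only; real implementations (e.g. in deep learning
# frameworks) add details such as weight decay and learning-rate schedules.

def momentum_step(w, g, v, lr=0.01, beta=0.9):
    v = beta * v + g                        # accumulate a velocity (momentum) term
    return w - lr * v, v

def rmsprop_step(w, g, s, lr=0.001, beta=0.9, eps=1e-8):
    s = beta * s + (1 - beta) * g**2        # running average of squared gradients
    return w - lr * g / (np.sqrt(s) + eps), s

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * g         # first moment (momentum-like)
    v = beta2 * v + (1 - beta2) * g**2      # second moment (RMSProp-like)
    m_hat = m / (1 - beta1**t)              # bias correction for zero initialization
    v_hat = v / (1 - beta2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

Adam's bias correction (the m_hat and v_hat terms) compensates for the zero initialization of the moment estimates during the first steps.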

Applications in Deep Learning

• Training Neural Networks: Gradient descent is combined with backpropagation to compute gradients layer by layer and optimize weights (a minimal training-loop sketch follows this list).

• Reinforcement Learning: Used to optimize policies.

• Natural Language Processing: Optimizes embeddings and neural architectures.

• Image Processing: Trains deep convolutional networks.
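
As an illustration of the first point, a typical PyTorch training loop couples backpropagation (loss.backward()) with a gradient-descent optimizer step. The model, data shapes, and hyperparameters below are assumptions made for the example, not part of the original document.

import torch
import torch.nn as nn

# Illustrative training loop: SGD with momentum on random data.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.MSELoss()

X = torch.randn(256, 10)
y = torch.randn(256, 1)

for epoch in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    loss = loss_fn(model(X), y)  # forward pass
    loss.backward()              # backpropagation: gradients layer by layer
    optimizer.step()             # gradient-descent parameter update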

Advantages of Gradient Descent

• Simple and easy to implement.

• Flexible and works for various machine learning and deep learning models.

• Scales well for large datasets when mini-batch or stochastic variants are used.

Disadvantages of Gradient Descent

• Sensitive to the choice of the learning rate.

• May get stuck in local minima or saddle points.

• Requires significant computational resources for large datasets and complex models.

Conclusion

Gradient Descent is a cornerstone optimization algorithm in machine learning and deep learning.
By iteratively adjusting model parameters to minimize the loss function, it enables models to learn
patterns in data efficiently. Despite its challenges, improvements like momentum and adaptive learning
rates have made Gradient Descent robust for large-scale applications in modern AI systems.
