Types of Gradient Descent
1. Batch Gradient Descent:
o Computes the gradient over the entire dataset before each parameter update.
2. Stochastic Gradient Descent (SGD):
o Updates the parameters after each individual training example.
3. Mini-Batch Gradient Descent:
o Combines the best of batch and stochastic methods by using a small random subset (mini-batch) of the dataset for each update, as sketched below.
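As a rough illustration, here is a minimal NumPy sketch of mini-batch gradient descent for a linear-regression loss; the function name, hyperparameter values, and synthetic data are illustrative choices, not part of the original notes:

```python
import numpy as np

def minibatch_gradient_descent(X, y, lr=0.01, batch_size=32, epochs=100):
    """Minimize mean squared error for a linear model y ~ X @ w using mini-batches."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    for _ in range(epochs):
        # Shuffle once per epoch, then step through small random subsets (mini-batches).
        idx = np.random.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient of the MSE loss computed on this mini-batch only.
            grad = 2.0 / len(batch) * Xb.T @ (Xb @ w - yb)
            w -= lr * grad
    return w

# Example usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=500)
print(minibatch_gradient_descent(X, y))  # should approach true_w
```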
Challenges of Gradient Descent
1. Local Minima:
o The loss function may have multiple local minima, especially for the non-convex objectives found in deep learning, so the algorithm can settle in a suboptimal solution.
2. Saddle Points:
o Points where the gradient is zero but that are neither minima nor maxima; these flat regions can slow down convergence.
3. Vanishing/Exploding Gradients:
o In deep networks, gradients may become extremely small or extremely large, causing updates to stall (vanishing) or diverge (exploding) during backpropagation.
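For intuition on the vanishing-gradient case specifically, the toy calculation below (an illustrative sketch, not from the notes, assuming sigmoid activations with pre-activations near zero) multiplies per-layer derivative terms and shows the gradient factor collapsing as depth grows:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# In backpropagation the gradient reaching early layers contains a product of
# per-layer derivative terms; with sigmoid activations each term is at most 0.25,
# so the product shrinks exponentially with depth (the "vanishing" effect).
pre_activations = np.zeros(20)  # assume pre-activations near 0 at every layer
grad = 1.0
for depth, z in enumerate(pre_activations, start=1):
    grad *= sigmoid_derivative(z)
    if depth in (1, 5, 10, 20):
        print(f"depth {depth:2d}: gradient factor ~ {grad:.2e}")
```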
To address the above challenges, advanced optimization algorithms have been developed:
1. Momentum:
o Accelerates convergence by accumulating an exponentially decaying average of past gradients, which dampens oscillations and helps the update push through flat regions.
2. RMSProp:
o Adjusts the learning rate for each parameter based on recent gradient magnitudes (update rules for both methods are sketched below).
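The following is a minimal sketch of both update rules; the function names, hyperparameter defaults, and the quadratic example loss are assumptions made for illustration rather than definitions from the notes:

```python
import numpy as np

def momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    """Momentum: accumulate an exponentially decaying average of past gradients."""
    velocity = beta * velocity + grad
    w = w - lr * velocity
    return w, velocity

def rmsprop_step(w, grad, sq_avg, lr=0.001, beta=0.9, eps=1e-8):
    """RMSProp: scale the step per parameter by recent gradient magnitudes."""
    sq_avg = beta * sq_avg + (1.0 - beta) * grad ** 2
    w = w - lr * grad / (np.sqrt(sq_avg) + eps)
    return w, sq_avg

# Example: one step on the quadratic loss L(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
grad = 2.0 * w
print(momentum_step(w, grad, velocity=np.zeros_like(w)))
print(rmsprop_step(w, grad, sq_avg=np.zeros_like(w)))
```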
Advantages of Gradient Descent
o Flexible and works for various machine learning and deep learning models.
o Scales well to large datasets when the mini-batch or stochastic variants are used.
Conclusion
Gradient Descent is a cornerstone optimization algorithm in machine learning and deep learning.
By iteratively adjusting model parameters to minimize the loss function, it enables models to learn
patterns in data efficiently. Despite its challenges, improvements like momentum and adaptive learning
rates have made Gradient Descent robust for large-scale applications in modern AI systems.