Deep Learning from Scratch: Theory + Practical
FAHAD HUSSAIN, MCS, MSCS, DAE(CIT)
Computer Science Instructor at a well-known international center
Machine Learning and Deep Learning Practitioner
For further assistance, code and slides: https://fahadhussaincs.blogspot.com/
YouTube Channel: https://www.youtube.com/fahadhussaintutorial
Stochastic Gradient Descent
The word "stochastic" refers to a system or process that involves random probability. In Stochastic Gradient Descent, a few samples are selected at random for each iteration instead of the whole dataset. In Gradient Descent, the term "batch" denotes the number of samples from the dataset used to compute the gradient in each iteration. In typical Gradient Descent optimization, such as Batch Gradient Descent, the batch is the entire dataset. Using the whole dataset helps reach the minima in a less noisy, less random manner, but it becomes a problem when the dataset gets really huge. A small sketch of the difference follows.
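A minimal sketch, assuming a simple mean-squared-error objective on synthetic data, of how a Batch Gradient Descent step uses the whole dataset while an SGD step uses one randomly chosen sample per iteration (the data, variable names, and learning rate here are illustrative, not taken from the slides):

    import numpy as np

    # Synthetic data: y = 3x + noise, model prediction y_hat = w * x
    rng = np.random.default_rng(0)
    X = rng.normal(size=200)
    y = 3.0 * X + rng.normal(scale=0.1, size=200)

    def gradient(w, xs, ys):
        # Gradient of 0.5 * mean((w*x - y)^2) with respect to w
        return np.mean((w * xs - ys) * xs)

    lr = 0.05
    w_batch, w_sgd = 0.0, 0.0
    for step in range(200):
        # Batch Gradient Descent: the "batch" is the whole dataset
        w_batch -= lr * gradient(w_batch, X, y)

        # Stochastic Gradient Descent: one randomly selected sample per iteration
        i = rng.integers(len(X))
        w_sgd -= lr * gradient(w_sgd, X[i:i+1], y[i:i+1])

    print(w_batch, w_sgd)  # both should approach the true slope of 3.0

Each SGD step touches only a single sample, so it is far cheaper per iteration; the path to the minimum is noisier, but the method stays practical when the dataset is very large.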
Stochastic Gradient Descent
Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g., differentiable or subdifferentiable).
[Figure: convex loss function]
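The standard per-iteration update, stated here for reference rather than taken from the slides (w denotes the parameters, eta the learning rate, and L_{i_t} the loss on the sample drawn at random in step t), can be written in LaTeX as:

    % SGD update: pick a random sample i_t, move against that sample's gradient
    \[
      w_{t+1} = w_t - \eta \, \nabla L_{i_t}(w_t)
    \]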