
Gradient Descent

Gradient Descent is one of the most commonly used optimization algorithms for training machine learning models. It works by minimizing the error between a model's actual and predicted outputs, and it is also the standard method for training neural networks.
In machine learning, optimization is the task of minimizing a cost function parameterized by the model's parameters. The main objective of gradient descent is to minimize this cost function, ideally a convex one, through iterative parameter updates. Once optimized, these models serve as powerful tools for artificial intelligence and a wide range of computer science applications.
What is Gradient Descent or Steepest Descent?
Gradient Descent, also known as Steepest Descent, is one of the most commonly used iterative optimization algorithms in machine learning for training machine learning and deep learning models. It works by finding a local minimum of a differentiable function.

Main objective
- To minimize the cost function through iterative parameter updates.
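
To make this concrete, here is a minimal sketch of gradient descent in Python, minimizing an illustrative convex function f(x) = (x - 3)^2; the function, starting point, and learning rate are assumptions chosen for demonstration, not values from the text.

```python
# A minimal sketch of gradient descent on the illustrative convex
# function f(x) = (x - 3)^2, whose gradient is f'(x) = 2 * (x - 3).

def gradient_descent(grad, start, learning_rate=0.1, n_iters=50):
    """Repeatedly step opposite to the gradient for n_iters iterations."""
    x = start
    for _ in range(n_iters):
        x = x - learning_rate * grad(x)  # move downhill
    return x

minimum = gradient_descent(grad=lambda x: 2 * (x - 3), start=0.0)
print(minimum)  # converges toward x = 3, the function's minimum
```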
What is a Cost Function?

The cost function measures the difference, or error, between the actual values and the predicted values at the model's current parameters, expressed as a single real number.

It improves learning efficiency by providing feedback to the model, so that the error can be minimized and a local or global minimum can be found.

The loss function refers to the error of one training example, while the cost function calculates the average error across the entire training set.
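
As an illustration of the loss/cost distinction above, here is a short Python sketch using squared error; the function names and the toy data are illustrative assumptions.

```python
# Loss vs. cost, sketched with squared error.

def loss(y_true, y_pred):
    """Loss: the error of a single training example."""
    return (y_true - y_pred) ** 2

def cost(ys_true, ys_pred):
    """Cost: the average loss over the entire training set (MSE)."""
    return sum(loss(t, p) for t, p in zip(ys_true, ys_pred)) / len(ys_true)

actual = [1.0, 2.0, 3.0]
predicted = [1.1, 1.9, 3.2]
print(cost(actual, predicted))  # a single real number summarizing the error
```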
How does Gradient Descent work?

Consider a simple linear model:

Y = mX + c

where 'm' represents the slope of the line and 'c' represents the intercept on the y-axis.

At the starting point the slope of the cost curve is steep, but as new parameters are generated the steepness gradually reduces. Each update moves the parameters a small step in the direction opposite to the gradient (for a parameter θ, learning rate α, and cost J: θ ← θ − α · ∂J/∂θ). At the lowest point the gradient approaches zero; this is called the point of convergence.
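
Putting this together, here is a sketch of gradient descent fitting Y = mX + c by minimizing the mean squared error; the data, learning rate, and iteration count are illustrative assumptions, not values from the text.

```python
# Batch gradient descent fitting Y = mX + c to toy data.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

m, c = 0.0, 0.0  # initial parameters
alpha = 0.01     # learning rate
n = len(xs)

for _ in range(5000):
    # Gradients of the mean squared error with respect to m and c.
    grad_m = (-2 / n) * sum(x * (y - (m * x + c)) for x, y in zip(xs, ys))
    grad_c = (-2 / n) * sum((y - (m * x + c)) for x, y in zip(xs, ys))
    # Update rule: step opposite the gradient; steps shrink as the
    # gradient flattens near the point of convergence.
    m -= alpha * grad_m
    c -= alpha * grad_c

print(m, c)  # approaches m = 2, c = 1
```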
Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a variant of gradient descent in which the model is updated after processing one training example at a time, rather than the whole dataset at once. This makes it easier to handle large datasets, since it requires far less memory. However, because it updates so frequently, its path to the minimum is noisier and convergence can be less stable than with batch gradient descent. Despite this, the noise can sometimes help SGD escape a local minimum and reach the global minimum.

Advantages:
1. Uses less memory, since only one example is processed at a time.
2. More efficient for large datasets.
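
For comparison, here is a sketch of SGD fitting the same linear model, updating the parameters after each individual example; the shuffling scheme, learning rate, and epoch count are assumptions for illustration.

```python
# Stochastic gradient descent for Y = mX + c: one update per example.
import random

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

m, c = 0.0, 0.0
alpha = 0.01

for _ in range(2000):                      # epochs
    order = list(range(len(xs)))
    random.shuffle(order)                  # visit examples in random order
    for i in order:
        error = ys[i] - (m * xs[i] + c)    # residual for one example
        m += alpha * 2 * error * xs[i]     # per-example (noisy) update
        c += alpha * 2 * error

print(m, c)  # noisily approaches m = 2, c = 1
```

Note that each update uses a single (x, y) pair, which is exactly what makes the trajectory noisy but the memory footprint small.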
