
Introduction to Gradient Descent

Gradient descent is a powerful optimization algorithm used in machine learning to minimize a cost function. It iteratively adjusts the parameters of a model to find the optimal values that best fit the data.

by Amir Ali
Understanding the Concept

Iteration
Gradient descent works by taking small steps in the direction of the negative gradient of the cost function, the direction of steepest descent.

Learning Rate
The learning rate determines the size of each step, and it's crucial to find the right balance to ensure convergence.

Convergence
The algorithm continues iterating until it reaches a point where the changes in the parameters are negligible, indicating a minimum has been found.
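To make the loop concrete, here is a minimal sketch in Python on a toy one-dimensional cost f(x) = (x - 3)^2, whose gradient is 2(x - 3). The cost function, starting point, learning rate, and tolerance are illustrative choices, not part of the original slides.

```python
# Minimal gradient descent sketch on f(x) = (x - 3)^2, minimum at x = 3.
# All values here are illustrative assumptions.

def gradient_descent(grad, x0, learning_rate=0.1, tol=1e-6, max_iters=1000):
    x = x0
    for _ in range(max_iters):
        step = learning_rate * grad(x)   # step along the negative gradient
        x -= step
        if abs(step) < tol:              # parameter change negligible: converged
            break
    return x

x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # approximately 3.0
```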
Cost Function and Optimization

1 Cost Function
The cost function is a mathematical expression that quantifies the error between the predicted and actual values.

2 Optimization
Gradient descent aims to find the values of the model parameters that minimize the cost function.

3 Convexity
For convex cost functions, gradient descent with a suitably small learning rate is guaranteed to converge to the global minimum.

4 Non-convex Functions
For non-convex functions, gradient descent may only find a local minimum, which may not be the global minimum.
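As a concrete illustration, the sketch below uses one common convex choice of cost, mean squared error for a linear model, together with its gradient. The slides do not fix a particular cost function, so this choice is an assumption.

```python
import numpy as np

# A hypothetical MSE cost for a linear model y_hat = X @ w, and its
# gradient with respect to the parameter vector w.

def mse_cost(w, X, y):
    residuals = X @ w - y                  # predicted minus actual values
    return np.mean(residuals ** 2)

def mse_gradient(w, X, y):
    n = len(y)
    return (2.0 / n) * X.T @ (X @ w - y)   # vector of partial derivatives
```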
Gradient Computation

1 Partial Derivatives
The gradient is the vector of partial derivatives of the cost function with respect to each parameter.

2 Chain Rule
When the model is complex, the gradient is computed using the chain rule to differentiate the composite functions.

3 Numerical Approximation
In some cases, the gradient may be difficult to compute analytically, so it can be approximated numerically.
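One standard way to do the numerical approximation is central differences, sketched below; the function name and the step size eps are illustrative, not from the slides.

```python
import numpy as np

# Approximate the gradient of a scalar function f at w by central
# differences, one coordinate at a time.

def numerical_gradient(f, w, eps=1e-6):
    grad = np.zeros_like(w)
    for i in range(len(w)):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[i] += eps
        w_minus[i] -= eps
        grad[i] = (f(w_plus) - f(w_minus)) / (2 * eps)  # partial derivative i
    return grad
```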
Updating the Parameters

Parameter Update
In each iteration, the parameters are updated by subtracting the product of the gradient and the learning rate from the current values.

Learning Rate
The learning rate controls the step size and can significantly impact the convergence of the algorithm.

Batch Size
In some cases, the gradient is computed using a subset of the training data (a batch), rather than the entire dataset.
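The following is a hedged sketch of one such mini-batch update, w ← w − learning_rate · gradient, assuming an MSE cost for a linear model; the batch size and learning rate are illustrative defaults.

```python
import numpy as np

# One mini-batch gradient descent step, assuming an MSE cost.

def minibatch_step(w, X, y, learning_rate=0.01, batch_size=32):
    # Sample a random subset (batch) of the training data.
    idx = np.random.choice(len(y), size=min(batch_size, len(y)), replace=False)
    Xb, yb = X[idx], y[idx]
    grad = (2.0 / len(yb)) * Xb.T @ (Xb @ w - yb)  # MSE gradient on the batch
    return w - learning_rate * grad                # subtract learning_rate * gradient
```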
Convergence and Stopping Criteria

Convergence Criteria
Gradient descent stops when the changes in the parameter values or the cost function become negligible, indicating a minimum has been reached.

Maximum Iterations
A maximum number of iterations can be set as a stopping criterion, in case the algorithm does not converge naturally.

Early Stopping
To avoid overfitting, the algorithm can be stopped when the performance on a validation set starts to deteriorate.

Regularization
Regularization techniques can be used to prevent overfitting and ensure the algorithm converges to a suitable minimum.
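The sketch below combines three of these criteria in one training loop: a parameter-change tolerance, an iteration cap, and patience-based early stopping on a validation cost. The callables grad_fn and val_cost_fn, and all thresholds, are hypothetical.

```python
import numpy as np

# Training loop with three stopping criteria: convergence tolerance,
# maximum iterations, and early stopping on validation cost.

def train(w, grad_fn, val_cost_fn, learning_rate=0.01,
          tol=1e-8, max_iters=10_000, patience=10):
    best_val, stale = np.inf, 0
    for _ in range(max_iters):                   # maximum-iterations criterion
        w_new = w - learning_rate * grad_fn(w)   # gradient descent update
        if np.linalg.norm(w_new - w) < tol:      # negligible parameter change
            return w_new
        w = w_new
        val_cost = val_cost_fn(w)
        if val_cost < best_val:
            best_val, stale = val_cost, 0
        else:
            stale += 1
            if stale >= patience:                # validation cost deteriorating
                break                            # early stopping
    return w
```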
Example Application: Linear Regression

Housing Prices
Gradient descent can be used to find the optimal coefficients of a linear regression model to predict housing prices based on features like square footage and number of bedrooms.

Stock Predictions
Gradient descent can also be applied to predict stock prices based on historical data and other financial indicators.

Medical Diagnostics
Gradient descent is used in machine learning models for medical diagnosis, predicting the likelihood of a patient having a certain condition based on their symptoms and test results.
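To make the housing-price example concrete, here is a toy end-to-end fit of linear regression coefficients by full-batch gradient descent. The two features mirror the example above, but the data is synthetic and all hyperparameters are invented for illustration.

```python
import numpy as np

# Synthetic housing data: price driven by square footage and bedrooms.
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 3000, size=100)
beds = rng.integers(1, 6, size=100).astype(float)
price = 150 * sqft + 10_000 * beds + rng.normal(0, 5_000, size=100)

# Standardize features so a single learning rate suits all parameters,
# then add an intercept column.
X = np.column_stack([sqft, beds])
X = (X - X.mean(axis=0)) / X.std(axis=0)
X = np.column_stack([np.ones(len(X)), X])

# Full-batch gradient descent on the MSE cost.
w = np.zeros(3)
for _ in range(5_000):
    grad = (2 / len(price)) * X.T @ (X @ w - price)
    w -= 0.01 * grad

print(w)  # learned intercept and coefficients
```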
Advantages and Limitations of Gradient Descent

Advantages
- Efficient for large-scale optimization problems
- Works well with high-dimensional data
- Converges to the global minimum for convex functions, given a suitable learning rate
- Can be parallelized for faster computation

Limitations
- Can get stuck in local minima for non-convex functions
- Sensitive to the choice of learning rate
- May require many iterations to converge, especially for complex models
- May not perform well with sparse or noisy data
