GRADIENT DESCENT ALGORITHM IN MACHINE LEARNING
Deepanshu Thakur- RBA09
Pallav Jain- RBA 106
Ritesh Shinde- RBA 111
Pranav Khadke- RBA 108
Sankalp Agarwal- RBA 112
Karanveer- RBA 101
01. INTRODUCTION
https://builtin.com/data-science/gradient-descent
MATHEMATICS OF GRADIENT DESCENT:
Consider the function f(x) = x², whose slope (gradient) is the derivative f'(x) = 2x.
At point A, to the left of the minimum, the function decreases as x increases; since x is negative there, the slope 2x is negative. At point B, to the right of the minimum, the function increases as x increases; since x is positive there, the slope 2x is positive.
The sign of the slope therefore tells us in which direction to move: stepping opposite to the gradient always leads towards the global minimum.
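To make this concrete, here is a minimal Python sketch, assuming the function from the slide is f(x) = x² with derivative 2x; the starting points chosen for A and B and the step size are illustrative assumptions, not values from the slides.

# The sign of the gradient of f(x) = x^2 decides the direction of each step.
def grad(x):
    return 2 * x                        # derivative of f(x) = x^2

step_size = 0.1                         # illustrative step size

for label, x in (("A (left of minimum)", -3.0), ("B (right of minimum)", 3.0)):
    g = grad(x)                         # negative at A, positive at B
    x_next = x - step_size * g          # move opposite to the sign of the gradient
    print(f"Point {label}: gradient = {g:+.1f}, x moves from {x:+.1f} to {x_next:+.2f}")

# Repeating this update from either side drives x towards the global minimum at x = 0.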
GRADIENT DESCENT LEARNING RATE:
• The size of the steps gradient descent takes in the direction of the local minimum is determined by the learning rate, which controls how fast or slow we move towards the optimal weights.
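As a rough sketch of this effect (the quadratic loss and the three learning-rate values below are assumptions chosen for illustration, not values from the slides), running the same update x ← x − learning_rate · f'(x) with different learning rates shows slow convergence, fast convergence, and divergence:

# Effect of the learning rate on gradient descent applied to f(x) = x^2.
def grad(x):
    return 2 * x                      # derivative of f(x) = x^2

for lr in (0.01, 0.1, 1.1):           # small, moderate, and overly large learning rates
    x = 5.0
    for _ in range(100):
        x -= lr * grad(x)             # gradient descent update step
    print(f"learning rate {lr}: after 100 steps x = {x:.4g}")

# A small rate creeps towards the minimum, a moderate rate converges quickly,
# and an overly large rate overshoots on every step and diverges.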
Simplicity
Advantage: Easy to understand, implement, and apply to various machine learning problems.
Limitation: May require additional techniques for stability in complex scenarios.
Customer Churn Prediction: Gradient Descent optimizes models predicting which customers are likely to leave, enabling retention strategies.
Supply Chain Optimization: Fine-tunes machine learning models for routing, demand forecasting, and reducing operational costs.
COMPARISON WITH SIMILAR ALGORITHMS
Algorithm: Gradient Descent
Description: Optimizes by iteratively updating parameters using the gradient of the loss function.
Advantages: Simple, easy to implement, efficient for large datasets.
Disadvantages: Sensitive to learning rate, can get stuck in local minima, slow near convergence.
Best Use: General-purpose optimization, regression, neural networks.

Algorithm: RMSProp
Description: Scales the learning rate by averaging squared gradients to stabilize updates.
Advantages: Prevents exploding gradients, effective in non-stationary settings.
Disadvantages: Can converge too fast to a suboptimal solution, needs careful tuning of the decay rate.
Best Use: Training RNNs and deep neural networks.

Algorithm: Momentum
Description: Adds a fraction of the previous update to the current gradient to accelerate convergence.
Advantages: Reduces oscillations, faster convergence in ravine-like regions.
Disadvantages: Requires additional hyperparameter tuning (momentum term).
Best Use: Deep learning and models with complex loss surfaces.
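For a concrete sense of how these variants differ, the sketch below writes the three update rules out on the same toy quadratic loss; the hyperparameter values (learning rate 0.1, momentum 0.9, decay 0.9, eps 1e-8) are common defaults chosen for illustration, not values from the slides.

# Update rules of Gradient Descent, Momentum and RMSProp on the toy loss f(x) = x^2.
def grad(x):
    return 2 * x                              # gradient of f(x) = x^2

lr, steps = 0.1, 100

# Plain gradient descent: x <- x - lr * g
x_gd = 5.0
for _ in range(steps):
    x_gd -= lr * grad(x_gd)

# Momentum: v <- beta * v + g;  x <- x - lr * v
x_m, v, beta = 5.0, 0.0, 0.9
for _ in range(steps):
    v = beta * v + grad(x_m)
    x_m -= lr * v

# RMSProp: s <- rho * s + (1 - rho) * g^2;  x <- x - lr * g / (sqrt(s) + eps)
x_r, s, rho, eps = 5.0, 0.0, 0.9, 1e-8
for _ in range(steps):
    g = grad(x_r)
    s = rho * s + (1 - rho) * g ** 2
    x_r -= lr * g / (s ** 0.5 + eps)

print(f"gradient descent: {x_gd:.6f}, momentum: {x_m:.6f}, rmsprop: {x_r:.6f}")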
• It may get stuck in local minima, saddle points, or oscillate if the learning rate is not
properly tuned.
• Gradient Descent remains one of the most widely used optimization techniques in
machine learning due to its efficiency, scalability, and adaptability. Despite its
limitations, improvements like Momentum, Adam, and RMSprop have enhanced its
performance in deep learning applications.
REFERENCES:
https://towardsdatascience.com/gradient-descent-from-scratch-e8b75fa986cc
https://builtin.com/data-science/gradient-descent
https://youtu.be/xOB10eTjoQ
https://www.javatpoint.com/gradient-descent-in-machine-learning
THANK YOU