Lect 7 - Gradient Descent

Gradient descent is an optimization algorithm used to minimize cost or loss functions in machine learning models. It works by iteratively moving the model's parameters, such as weights in a neural network, in the direction of steepest descent as defined by the negative gradient of the cost function. Each pass of the entire training dataset is called an epoch. The learning rate tunes the step size in each iteration to determine how quickly the parameters are adjusted toward a minimum. The goal is to reduce the error between predicted and expected values on each iteration until convergence is reached.


Gradient Descent

Terminologies…
• Epoch: an epoch is a term used in machine learning that indicates the number
of complete passes of the entire training dataset the learning algorithm has
made. Datasets are usually grouped into batches, especially when the amount of
data is very large.
• In simple words: one full iteration over the entire training dataset, which
may consist of several batch updates (see the sketch below).
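
A minimal sketch of how epochs, batches, and iterations relate; the dataset
size (1,000 samples) and batch size (100) are illustrative assumptions, not
values from the lecture:

n_samples = 1000      # assumed dataset size, for illustration only
batch_size = 100      # assumed batch size
epochs = 5

iterations_per_epoch = n_samples // batch_size    # 10 batch updates per epoch
total_iterations = epochs * iterations_per_epoch  # 50 parameter updates overall
print(iterations_per_epoch, total_iterations)     # -> 10 50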
Loss Functions
Imagine you are at the top of a hill and need to climb down. How do you decide
which way to walk?

Steps:
• Look around to see all the possible paths.
• Reject the ones going up, since those paths would cost more energy and make
the task even more difficult.
• Finally, take the path with the steepest slope downhill.

Deciding to go up the slope costs us energy and time. Deciding to go down
benefits us; therefore, going downhill has a negative cost.
Learning rate

• The learning rate is a tuning parameter in an optimization algorithm that
determines the step size at each iteration while moving toward a minimum of a
loss function. A small illustration follows.
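
A toy sketch of how the learning rate scales each step; the loss f(w) = w² and
the starting point w = 10 are assumptions for illustration, not from the slides:

w = 10.0                      # current parameter value (assumed)
for lr in (0.01, 0.1, 0.5):   # three candidate learning rates
    grad = 2 * w              # gradient of f(w) = w**2
    print(lr, w - lr * grad)  # larger learning rate -> larger step toward 0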
Cost function
• It is a function that measures the performance of a Machine Learning model
for given data. The cost function quantifies the error between predicted
values and expected values and presents it as a single real number.
• When minimized, the returned value is usually called the cost, loss, or error.
• When maximized, the value it yields is called a reward.
Example of cost function
• ŷ - predicted value, computed as ŷ = w·x
• x - vector of data used for prediction or training
• w - weight
Here the cost is the mean absolute error: cost = (1/n) Σ |ŷᵢ - yᵢ|.

Let's assume that x = [0, 1, 2, 3] and the true weight is w = 2, so the
expected outputs are y = [0, 2, 4, 6].

• Take a random parameter w = 5. Then ŷ = [0, 5, 10, 15], the absolute errors
|ŷ - y| are [0, 3, 6, 9], and the cost is 4.5.
• Now let's take a small weight w = 0.5. Then ŷ = [0, 0.5, 1, 1.5], the
absolute errors are [0, 1.5, 3, 4.5], and the cost becomes 2.25.
• The model gives better results with w = 0.5, as its cost value is smaller.
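
The numbers above can be checked with a few lines of NumPy; this sketch simply
recomputes the mean absolute error for the two candidate weights:

import numpy as np

x = np.array([0, 1, 2, 3])
y_true = 2 * x                               # targets from the true weight w = 2
for w in (5, 0.5):
    y_pred = w * x                           # model prediction y_hat = w * x
    cost = np.mean(np.abs(y_pred - y_true))  # mean absolute error
    print(w, cost)                           # -> 5 4.5, then 0.5 2.25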
What are Optimizers?
• Optimizers are algorithms used to reduce the loss of a model, i.e., to
reduce the error rate made by deep learning models.
Definition
• Gradient descent is an optimization algorithm used to minimize some
function by iteratively moving in the direction of steepest descent as
defined by the negative of the gradient.
• In machine learning, we use gradient descent to update the parameters of
our model.
• Parameters → coefficients in Linear Regression
• Parameters → weights in neural networks
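
A minimal standalone sketch of the update rule parameter = parameter -
learning_rate × gradient; the quadratic loss f(w) = (w - 3)² and the learning
rate 0.1 are illustrative assumptions, not from the slides:

w, lr = 0.0, 0.1
for _ in range(50):
    grad = 2 * (w - 3)   # gradient of f(w) = (w - 3)**2
    w = w - lr * grad    # step opposite the gradient, scaled by the learning rate
print(w)                 # converges toward the minimizer w = 3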
Diagrammatic Representation
[Figure not preserved in this extraction]
Gradient descent for simple linear regression, fitting Y = m·X + c:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('C:/MLCourse/Linear Reg dataset.csv')
X = data.iloc[:, 0]
Y = data.iloc[:, 1]
plt.scatter(X, Y)                 # visualize the raw data
plt.show()

m = 0                             # slope, initialized to zero
c = 0                             # intercept, initialized to zero
n = len(X)
epochs = 30
L = .0001                         # learning rate
cost = [None] * epochs
meanx = np.mean(X)

for i in range(epochs):
    Y_pred = m*X + c              # the current predicted value of Y
    error = Y_pred - Y
    meanerror = np.mean(error)
    D_m = 2 * meanx * meanerror   # gradient w.r.t. m, approximated with mean values
    D_c = 2 * meanerror           # gradient w.r.t. c
    m = m - L * D_m               # update m
    c = c - L * D_c               # update c
    cost[i] = (1/(2*n)) * pow(meanerror, 2)
    print("Cost in Iteration : ", i+1, " is ", cost[i])

i = list(range(epochs))           # epoch indices for the cost plot
plt.scatter(i, cost, color="m", marker="o", s=25)
plt.xticks(range(0, epochs, 5))
plt.yticks(range(0, 340, 10))
plt.xlabel("Epochs")
plt.ylabel("Cost")
plt.show()

Sample output (first 15 of the 30 iterations shown):
Cost in Iteration : 1 is 118.82464342205141
Cost in Iteration : 2 is 41.699553633502205
Cost in Iteration : 3 is 14.633772281201987
Cost in Iteration : 4 is 5.1354816183458105
Cost in Iteration : 5 is 1.8022127818843883
Cost in Iteration : 6 is 0.632456924699045
Cost in Iteration : 7 is 0.22195035215627046
Cost in Iteration : 8 is 0.07788982442675346
Cost in Iteration : 9 is 0.02733415239169711
Cost in Iteration : 10 is 0.009592471063728715
Cost in Iteration : 11 is 0.0033663198986343475
Cost in Iteration : 12 is 0.0011813545836785449
Cost in Iteration : 13 is 0.00041457695477623344
Cost in Iteration : 14 is 0.00014548896140593372
Cost in Iteration : 15 is 5.105695733235355e-05
