
GRADIENT DESCENT AND COST FUNCTION
RECAP OF BASIC CONCEPTS
SLOPE

Case of a linear (straight) line:

Delta y = 3 (9 - 6)  {change in y}
Delta x = 1 (3 - 2)  {change in x}
Slope = Delta y / Delta x = 3 / 1 = 3, and it is the same between any two points on the line.
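As a quick check, a minimal Python sketch of the same calculation, assuming the two points implied by the deltas above, (2, 6) and (3, 9) (the points themselves are not labelled on the slide):

```python
# Slope of a straight line from two points: rise over run.
x1, y1 = 2, 6
x2, y2 = 3, 9
delta_y = y2 - y1            # 3, change in y
delta_x = x2 - x1            # 1, change in x
print(delta_y / delta_x)     # 3.0 -- identical between any two points on the line
```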
RECAP OF BASIC CONCEPTS
SLOPE

Case of a non-linear curve:
The slope is not constant; it varies depending on the point under observation.
How do we find the slope at a given point?
RECAP OF BASIC CONCEPTS
SLOPE

Zoom in on the point, and the curve will look like a straight line.
We can then use Delta y and Delta x to calculate the slope at that point.
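A minimal sketch of this "zoom in" idea, assuming an example curve f(x) = x**2 (the slide does not name a specific function): take a tiny Delta x and measure the resulting Delta y.

```python
# Approximate the slope of a curve at a point by zooming in:
# over a tiny Delta x the curve is nearly straight, so rise/run works again.

def f(x):
    return x ** 2            # example curve, chosen for illustration

def slope_at(x, dx=1e-6):
    delta_y = f(x + dx) - f(x)
    delta_x = dx
    return delta_y / delta_x

print(slope_at(3.0))         # ~6.0, matching the exact derivative 2*x at x = 3
```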
RECAP OF BASIC CONCEPTS
SLOPE AND DERIVATIVE
(Figure slides: graphs relating the slope of a curve to its derivative at a point.)
RECAP OF BASIC CONCEPTS
SLOPE AND DERIVATIVE
Geometrically, the derivative of a function can be interpreted as the slope of the graph of the function or, more precisely, as the slope of the tangent line at a point.

Its calculation, in fact, derives from the slope formula for a straight line, except that a limiting process must be used for curves.
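The "limiting process" mentioned here is the standard limit definition of the derivative; stated for reference (this formula is not reproduced on the slide itself):

```latex
f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
```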
RECAP OF BASIC CONCEPTS
SLOPE AND DERIVATIVE; PARTIAL DERIVATIVE
(Figure slides: derivative and partial derivative illustrations.)

RECAP OF BASIC CONCEPTS
PARTIAL DERIVATIVE AND GRADIENT DESCENT

The partial derivative tells us how much a change in each weight affects the cost function.
E.g. how much the price changes with the number of bedrooms, while the other features are held fixed.
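A minimal numerical sketch of this idea, assuming an illustrative two-weight cost function (not the housing example referred to on the slide): nudge one weight at a time while holding the other fixed.

```python
# Estimate partial derivatives by changing one weight at a time.

def cost(m, b):
    # illustrative bowl-shaped cost; the lecture's cost is the MSE of a regression line
    return (m - 2.0) ** 2 + (b + 1.0) ** 2

def partial_wrt_m(m, b, h=1e-6):
    return (cost(m + h, b) - cost(m, b)) / h   # b held fixed

def partial_wrt_b(m, b, h=1e-6):
    return (cost(m, b + h) - cost(m, b)) / h   # m held fixed

print(partial_wrt_m(0.0, 0.0))   # ~ -4.0: effect on the cost of changing m alone
print(partial_wrt_b(0.0, 0.0))   # ~  2.0: effect on the cost of changing b alone
```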
RECAP OF BASIC CONCEPTS
POSSIBLE LINEAR REGRESSION LINES
(Figure slide: candidate regression lines through the same data.)

MEAN SQUARE ERROR
(Figure slides: the mean square error of a candidate line.)
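The mean square error slides above are figure-only. For reference, a minimal Python sketch of the standard MSE of a candidate line y = m*x + b (the names xs, ys, m, b and the data are illustrative, not taken from the slides):

```python
# Mean square error: average squared vertical distance between the data
# points and the candidate line y = m*x + b.

def mse(m, b, xs, ys):
    n = len(xs)
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys)) / n

# Example: the line y = 2x + 3 fits this data exactly, so its MSE is 0.
xs = [1, 2, 3, 4, 5]
ys = [5, 7, 9, 11, 13]
print(mse(2.0, 3.0, xs, ys))   # 0.0
print(mse(1.0, 0.0, xs, ys))   # larger: a worse line has a larger MSE
```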
GRADIENT DESCENT

There are many possible regression lines we can create with different values of slope and intercept.

But we cannot try every permutation and combination of slope and intercept.

The efficient approach is to select the optimal line in the minimum number of iterations.

Gradient descent is the algorithm that finds the best-fit line for a given training dataset in a minimum number of iterations.
GRADIENT DESCENT
Slope and intercept are plotted against MSE.

Different values of m and b create a surface of cost values (bowl-shaped for MSE).

We start from some initial values of m and b, usually 0.

From that starting point the cost is calculated, and it is reduced with every mini-step.

We keep taking small steps until we reach the minimum point.

At the minimum, the error is the lowest (see the sketch below).
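A minimal sketch of the loop just described, assuming MSE as the cost and a tiny made-up dataset (the gradient formulas in the comments are the standard partial derivatives of MSE with respect to m and b, not copied from the slides):

```python
# Gradient descent for a line y = m*x + b, minimizing MSE.
# dJ/dm = -(2/n) * sum(x_i * (y_i - (m*x_i + b)))
# dJ/db = -(2/n) * sum(y_i - (m*x_i + b))

def gradient_descent(xs, ys, learning_rate=0.05, iterations=5000):
    m, b = 0.0, 0.0                          # start from m = 0, b = 0
    n = len(xs)
    for _ in range(iterations):
        errors = [y - (m * x + b) for x, y in zip(xs, ys)]
        dm = -(2.0 / n) * sum(x * e for x, e in zip(xs, errors))
        db = -(2.0 / n) * sum(errors)
        m -= learning_rate * dm              # mini-step against the gradient
        b -= learning_rate * db
    return m, b

# Illustrative data generated from y = 2x + 3:
xs = [1, 2, 3, 4, 5]
ys = [5, 7, 9, 11, 13]
print(gradient_descent(xs, ys))              # approaches (2.0, 3.0)
```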
GRADIENT DESCENT
(Figure slide: two views of the cost surface, one along m and one along b.)
GRADIENT DESCENT
If we take fixed-size steps, we can step over and miss the global minimum.
In that case, gradient descent will never converge.
GRADIENT DESCENT
Varying step sizes help us reach the global minimum.
GRADIENT DESCENT
Varying step sizes can be achieved by calculating the slope at each point: as the curve flattens near the minimum, the steps naturally become smaller.
The partial derivative (slope) also tells us in which direction we need to go.
GRADIENT DESCENT
Learning rate decides the step size.
GRADIENT DESCENT

Update rule:  θ1 := θ1 − α · (d/dθ1) J(θ1)

Here α is the step size (learning rate) and (d/dθ1) J(θ1) is the derivative term.

For simplicity, assume we have to minimize a function of one variable only, J(θ1).
GRADIENT DESCENT

J(θ1); θ1 is a real number.

Initialize θ1. The derivative term is the slope of the line tangent to the function at the point θ1.

Here the tangent line has a positive slope, so the derivative is positive: θ1 minus a positive number moves θ1 towards the left, downhill.
GRADIENT DESCENT

J(θ1); θ1 is a real number.

Here the tangent line has a negative slope, so the derivative is negative: θ1 minus a negative number adds something to θ1 and moves it towards the right, again downhill.
GRADIENT DESCENT

J(θ1); θ1 is a real number.

If α (alpha) is too small, gradient descent takes tiny steps and therefore needs a long time to converge.
GRADIENT DESCENT

J(θ1); θ1 is a real number.

If α is too large, gradient descent can overshoot the minimum and may fail to converge, or even diverge.
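A minimal sketch of these two failure modes, assuming the simple one-variable cost J(θ1) = θ1² with derivative 2·θ1 (chosen for illustration, not taken from the slides):

```python
# Effect of the learning rate alpha on J(theta) = theta**2.

def run(alpha, steps=20, theta=1.0):
    for _ in range(steps):
        theta -= alpha * 2 * theta   # update: theta := theta - alpha * dJ/dtheta
    return theta

print(run(alpha=0.01))   # ~0.67 after 20 steps: converging, but slowly
print(run(alpha=0.45))   # ~1e-20: converges quickly
print(run(alpha=1.10))   # ~38 and growing with more steps: overshoots and diverges
```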
GRADIENT DESCENT

J(θ1); θ1 is a real number.

If θ1 is already at the minimum, the derivative term is zero, so the update leaves θ1 unchanged: the algorithm converges in the first iteration.
