
Lecture 4: Linear Neural Network and Linear Regression: Part 2

Md. Shahriar Hussain
ECE Department, NSU
North South University, CSE465 (Source: Andrew NG Lectures)
Linear Regression Single Variable

Important Equations

Hypothesis: h_θ(x) = θ0 + θ1·x

Parameters: θ0, θ1

Cost Function: J(θ0, θ1) = (1/(2m)) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2

Goal: minimize J(θ0, θ1) over θ0, θ1
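These equations can be sketched in a few lines of Python (the toy dataset and function names below are illustrative, not from the slides):

```python
# Hypothesis h_theta(x) = theta0 + theta1 * x and the MSE cost
# J(theta0, theta1) = (1/2m) * sum of squared errors over the training set.
def hypothesis(theta0, theta1, x):
    return theta0 + theta1 * x

def cost(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((hypothesis(theta0, theta1, x) - y) ** 2
               for x, y in zip(xs, ys)) / (2 * m)

# Toy data lying exactly on y = 2x, so the cost is 0 at theta0=0, theta1=2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(cost(0.0, 2.0, xs, ys))  # 0.0
```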

Cost Function for two parameters

(For fixed θ0, θ1, the hypothesis h_θ(x) is a function of x; the cost J(θ0, θ1) is a function of the parameters.)

[Figure: housing training data, Price ($) in 1000's on the y-axis (0–500) vs. Size in feet^2 (x) on the x-axis (0–3000).]
Cost Function for two parameters

• Previously we plotted our cost function by plotting
  – θ1 vs J(θ1)
• Now we have two parameters
  – The plot becomes a bit more complicated
  – It generates a 3D surface plot whose axes are
    • X = θ1
    • Z = θ0
    • Y = J(θ0, θ1)

Cost Function for two parameters

• We can see that the height (y) of the graph indicates the value of the cost function
• We need to find where y is at a minimum

Gradient descent

• We want to get min J(θ0, θ1)


• Gradient descent
– Used all over machine learning for minimization

• Outline:
  – Start with some initial θ0, θ1
  – Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum

Gradient descent

 Start with initial guesses
  Start at (0, 0) (or any other value)
 Keep changing θ0 and θ1 a little bit to try to reduce J(θ0, θ1)
  Each time you change the parameters, you select the gradient step that reduces J(θ0, θ1) the most
 Repeat
  Do so until you converge to a local minimum
 Gradient descent has an interesting property
  Where you start can determine which minimum you end up in
  Here, one initialization point led to one local minimum
  The other led to a different one

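The steps above can be sketched as a plain-Python loop; the learning rate, iteration count, and toy data are illustrative choices, not values from the slides:

```python
# Gradient descent on J(theta0, theta1) = (1/2m) * sum (theta0 + theta1*x - y)^2.
def gradient_descent(xs, ys, alpha=0.1, iters=1000):
    m = len(xs)
    theta0, theta1 = 0.0, 0.0                 # start at (0, 0)
    for _ in range(iters):
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errs) / m                                   # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errs, xs)) / m        # dJ/dtheta1
        # Simultaneous update: both gradients are computed before
        # either parameter changes.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Data generated from y = 2x + 1, so the minimum is at theta0=1, theta1=2.
t0, t1 = gradient_descent([1.0, 2.0, 3.0], [3.0, 5.0, 7.0])
print(round(t0, 3), round(t1, 3))  # 1.0 2.0
```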
Gradient descent

• One initialization point led to one local minimum; the other led to a different one
Gradient Descent Algorithm
• Gradient descent is used to minimize the MSE by calculating the gradient of the cost function:

  repeat until convergence: θj := θj − α · ∂J(θ0, θ1)/∂θj   (for j = 0 and j = 1)

• Correct (simultaneous update):
    temp0 := θ0 − α · ∂J(θ0, θ1)/∂θ0
    temp1 := θ1 − α · ∂J(θ0, θ1)/∂θ1
    θ0 := temp0
    θ1 := temp1
• Incorrect (sequential update): θ0 is overwritten before θ1's gradient is computed
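The correct-vs-incorrect distinction is about when the parameters are overwritten. A small sketch (the gradient functions, α, and starting point are made up purely for illustration):

```python
# One gradient-descent step written two ways, with illustrative partial
# derivatives (not derived from a particular cost function here).
def grad0(t0, t1):
    return t0 + t1 - 3.0

def grad1(t0, t1):
    return t0 - t1 + 1.0

alpha = 0.5
t0, t1 = 0.0, 0.0

# Correct: evaluate BOTH gradients at the same point, then update.
new_t0 = t0 - alpha * grad0(t0, t1)
new_t1 = t1 - alpha * grad1(t0, t1)

# Incorrect: theta0 is overwritten first, so theta1's gradient is
# evaluated at a point that is neither the old nor the new iterate.
bad_t0 = t0 - alpha * grad0(t0, t1)
bad_t1 = t1 - alpha * grad1(bad_t0, t1)

print(new_t0, new_t1)  # 1.5 -0.5
print(bad_t0, bad_t1)  # 1.5 -1.25
```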
Learning Rate

• Here, α is the learning rate, a hyperparameter
• It controls how big a step we take
  – If α is small, we take tiny steps
  – If α is big, we have an aggressive gradient descent

Learning Rate

• If α is too small, gradient descent can be slow (higher training time)
• If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge

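Both failure modes can be seen on the one-dimensional cost J(θ) = θ², whose gradient is 2θ (the α values below are illustrative):

```python
# For J(theta) = theta^2 the update is theta := theta - alpha * 2 * theta,
# i.e. theta is multiplied by (1 - 2*alpha) at every step.
def run(alpha, steps=20, theta=1.0):
    for _ in range(steps):
        theta -= alpha * 2 * theta
    return theta

print(run(0.01))  # too small: still far from the minimum after 20 steps
print(run(0.4))   # reasonable: essentially 0
print(run(1.1))   # too large: |theta| grows every step -- divergence
```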
Local Minima

• Local minimum: the value of the loss function is minimal at that point within a local region
• Global minimum: the value of the loss function is minimal across the entire domain of the loss function

Local Minima

[Figure: loss curve; the gradient ∂J/∂θ is 0 at local minima and at the global minimum.]
Gradient Descent Calculation

• Differentiating the cost function gives the two partial derivatives used in the update rule:

  ∂J(θ0, θ1)/∂θ0 = (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
  ∂J(θ0, θ1)/∂θ1 = (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)
Linear Regression Single Variable

Linear Regression Multiple Variable

Linear Regression Multiple Variable

Now: h_θ(x) = θ0·x0 + θ1·x1 + θ2·x2 + … + θd·xd, where x0 = 1

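With multiple variables the hypothesis is a dot product of the parameter vector with the feature vector, where a constant feature x0 = 1 multiplies θ0. A pure-Python sketch (the θ values and features are illustrative):

```python
# h_theta(x) = theta0*x0 + theta1*x1 + ... + thetad*xd, with x0 = 1.
def hypothesis(theta, features):
    x = [1.0] + list(features)  # prepend the bias feature x0 = 1
    return sum(t_j * x_j for t_j, x_j in zip(theta, x))

theta = [50.0, 0.1, 20.0]                 # [theta0, theta1, theta2]
print(hypothesis(theta, [2000.0, 3.0]))   # 50 + 0.1*2000 + 20*3, about 310.0
```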
Gradient Descent for Multi Variables

• With d features, the update rule applies to every parameter (simultaneously, for j = 0, 1, …, d):

  θj := θj − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · xj^(i)
Gradient Descent for Multi Variables Vector Format

• Vector format: h_θ(x) = θᵀx = θ0·x0 + θ1·x1 + … + θd·xd, where x0 = 1

Gradient Descent for Multi Variables Vector Format

• Compute them in matrix format
• The gradient vector:

  ∇_θ J(θ) = (1/m) · Xᵀ(Xθ − y)

  where X is the m × (d+1) matrix of training examples
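The gradient ∇J(θ) = (1/m)·Xᵀ(Xθ − y) can be sketched without a matrix library by writing the two products as loops (the data below is illustrative):

```python
# Gradient of J(theta) = (1/2m) * ||X theta - y||^2, i.e. (1/m) * X^T (X theta - y).
def gradient(X, y, theta):
    m, cols = len(X), len(theta)
    # Residuals r = X theta - y (one per training example).
    r = [sum(X[i][j] * theta[j] for j in range(cols)) - y[i] for i in range(m)]
    # grad[j] = (1/m) * sum_i X[i][j] * r[i] (one entry per parameter).
    return [sum(X[i][j] * r[i] for i in range(m)) / m for j in range(cols)]

# y is generated exactly by theta = [1, 2], so the gradient vanishes there.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]   # first column is x0 = 1
y = [3.0, 5.0, 7.0]
print(gradient(X, y, [1.0, 2.0]))  # [0.0, 0.0]
print(gradient(X, y, [0.0, 0.0]))  # nonzero: J still decreases from (0, 0)
```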
Gradient Descent for Multi Variables Vector Format

• Suppose we have d features and m training examples, and write xj^(i) for feature j of example i (with x0^(i) = 1). Then:

  X = [ x0^(1)  x1^(1)  x2^(1)  …  xd^(1) ]
      [ x0^(2)  x1^(2)  x2^(2)  …  xd^(2) ]
      [ x0^(3)  x1^(3)  x2^(3)  …  xd^(3) ]
      [   .       .       .          .    ]
      [ x0^(m)  x1^(m)  x2^(m)  …  xd^(m) ]
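Building the design matrix amounts to prepending the constant feature x0 = 1 to every example (the housing-style feature values below are illustrative):

```python
# Turn m raw examples with d features each into the m x (d+1) matrix X.
def design_matrix(samples):
    return [[1.0] + list(row) for row in samples]

samples = [[2104.0, 5.0], [1416.0, 3.0], [1534.0, 3.0]]  # m=3, d=2
X = design_matrix(samples)
print(X[0])  # [1.0, 2104.0, 5.0]
```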
Gradient Descent for Multi Variables Vector Format
• Dimensionality Matching
  – Suppose we have d features and m training examples
  – X is m × (d+1) and θ is (d+1) × 1, so Xθ and y are both m × 1
  – Xᵀ(Xθ − y) is (d+1) × 1, matching the shape of θ

Gradient Descent for Multi Variables Vector Format

• The gradient descent update rule in matrix format:

  θ := θ − α · (1/m) · Xᵀ(Xθ − y)
Batch Gradient Descent

• This formula involves calculations over the full training set X at each gradient descent step
• This is why the algorithm is called Batch Gradient Descent

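Combining the update rule with the matrix-format gradient gives batch gradient descent, which touches every training example on every step. A pure-Python sketch (α, the iteration count, and the data are illustrative):

```python
# Batch gradient descent: theta := theta - alpha * (1/m) * X^T (X theta - y).
def batch_gd(X, y, alpha=0.1, iters=2000):
    m, cols = len(X), len(X[0])
    theta = [0.0] * cols
    for _ in range(iters):
        # Full pass over the training set: residuals, then the gradient.
        r = [sum(X[i][j] * theta[j] for j in range(cols)) - y[i] for i in range(m)]
        grad = [sum(X[i][j] * r[i] for i in range(m)) / m for j in range(cols)]
        # Simultaneous update of all parameters.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]   # column of ones, then one feature
y = [3.0, 5.0, 7.0]                         # generated by y = 2x + 1
print([round(t, 3) for t in batch_gd(X, y)])  # [1.0, 2.0]
```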
• Reference:
– Andrew Ng Lectures on Machine Learning, Stanford University
