Lecture 3: Linear Neural Network and Linear Regression: Part 1

Md. Shahriar Hussain
ECE Department, NSU
North South University

Source: Andrew NG Lectures

Linear Neural Networks (LNNs)

• The neuron aggregates the weighted input data.

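As a rough sketch in code (the names and numbers here are illustrative, not from the slides), the aggregation step of a linear neuron looks like this:

    import numpy as np

    def linear_neuron(x, w, b):
        # Aggregate the weighted inputs: z = w . x + b (no activation for a linear unit)
        return np.dot(w, x) + b

    # Illustrative inputs and weights
    z = linear_neuron(np.array([1.0, 2.0, 3.0]), np.array([0.5, -0.2, 0.1]), b=0.3)
    print(z)  # 0.5*1 - 0.2*2 + 0.1*3 + 0.3 = 0.7
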
Linear Neural Networks (LNNs)

• There can be two different types of Linear Neural Networks:
  – Regression Problem
  – Classification Problem

Linear Neural Networks (LNNs)

• For Regression:
  – There will be only aggregation.
  – No activation function is needed.

Linear Neural Networks (LNNs)

• For a Regression Problem, we need to:
  – Cast the Linear Regression technique as an LNN model.
• For a Classification Problem, we need to:
  – Cast the Logistic and Softmax Regression techniques as an LNN model.

What is Linear Regression?

• Linear regression is an algorithm that models a linear relationship between an independent variable and a dependent variable in order to predict the outcome of future events.

Linear Regression Example

A line of best fit (regression line) is a straight line that represents the best approximation of a scatter plot of data points.

Linear Regression Example

Estimated/predicted value: ŷ ("y-hat"). Actual/true value: y, also called the ground truth.

Data Set Description

(x, y) = one training example
(x^(i), y^(i)) = the i-th training example

x^(1) = 2104    y^(1) = 460
x^(2) = 1416    y^(2) = 232

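In code, the 1-based notation of the slides maps onto 0-based indexing; a small sketch using the numbers above (plain Python lists, purely for illustration):

    # Training set from the slides: (x^(i), y^(i)) pairs
    x = [2104, 1416, 1534, 852]  # size in feet^2
    y = [460, 232, 315, 178]     # price in $1000's

    # x^(1) = 2104 is x[0]; y^(2) = 232 is y[1]
    print(x[0], y[1])  # 2104 232
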
Hypothesis

Training Set → Learning Algorithm → hypothesis h

New/unseen data (x), e.g. the size of a house, is fed to the hypothesis h, which outputs the estimated price ŷ(x).

Hypothesis

• How do we represent h?

hθ(x) = θ0 + θ1·x

• θ0 and θ1 are parameters/weights that will be trained/determined by the ML model (not hyperparameters):
  – θ0 = intercept/bias/constant
  – θ1 = slope/coefficient/gradient
• Linear regression with one variable is also called univariate linear regression.

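A direct translation of this hypothesis into code might look as follows (the θ values below are placeholders, not trained parameters):

    def h(x, theta0, theta1):
        # Univariate linear regression hypothesis: h_theta(x) = theta0 + theta1 * x
        return theta0 + theta1 * x

    # Illustrative parameters: intercept 50, slope 0.2
    print(h(2104, 50, 0.2))  # 50 + 0.2 * 2104 = 470.8
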
Hypothesis

• The goal is to choose θ0 and θ1 properly so that hθ(x) is close to y.
• A cost function lets us figure out how to fit the best straight line to our data.

Hypothesis

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: hθ(x) = θ0 + θ1·x
θ0, θ1: parameters. How do we choose the θ's?

Cost Function

• We need to choose θ0 and θ1 in a way that minimizes the result of the following function over all m training examples. This function is called the cost function:

J(θ0, θ1) = (1/2m) Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))²

Goal: minimize J(θ0, θ1) over θ0, θ1

Cost Function

Cost Function: J(θ0, θ1) = (1/2m) Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))²

Goal: minimize J(θ0, θ1) over θ0, θ1

• Here the cost function is called the Squared Error cost function.
• It minimizes the squared difference between the predicted house price and the actual house price.
• The 1/m means we take the average over the training examples.
• In 1/2m, the 2 makes the math a bit easier (it cancels when we differentiate) and doesn't change the parameters we determine at all (i.e., half the smallest value is still the smallest value!).

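The squared-error cost translates almost line for line into code; a minimal sketch over the housing data above:

    def J(theta0, theta1, xs, ys):
        # Squared-error cost: J = 1/(2m) * sum_i (h(x^(i)) - y^(i))^2
        m = len(xs)
        return sum((theta0 + theta1 * x - y) ** 2
                   for x, y in zip(xs, ys)) / (2 * m)

    xs = [2104, 1416, 1534, 852]
    ys = [460, 232, 315, 178]
    print(J(0.0, 0.0, xs, ys))  # with both parameters at 0, the cost is large
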
Cost Function Calculation

• For simplification, assume θ0 = 0.
• Find the value of θ1 for which J(θ1) is minimum.

Cost Function Calculation

(Left: the training data with the line hθ(x) = θ1·x. Right: J(θ1) plotted against θ1.)

For θ1 = 1:
J(θ1) = 1/(2·3) · [0² + 0² + 0²] = 0

Cost Function Calculation

For θ1 = 0.5:
J(θ1) = ?

Cost Function Calculation

For θ1 = 0:
J(θ1) = ?

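A quick way to check these values in code, assuming the three-point training set (1, 1), (2, 2), (3, 3) implied by the m = 3 computation above:

    xs, ys = [1, 2, 3], [1, 2, 3]  # assumed training set (m = 3)

    for theta1 in (1.0, 0.5, 0.0):
        # With theta0 fixed at 0, the hypothesis is h(x) = theta1 * x
        cost = sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))
        print(theta1, cost)  # 1.0 -> 0.0, 0.5 -> ~0.583, 0.0 -> ~2.333
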
Cost Function Calculation

• If we compute J(θ1) for a range of values and plot J(θ1) vs. θ1, we get a polynomial (it looks like a quadratic).
• The optimization objective for the learning algorithm is to find the value of θ1 which minimizes J(θ1).
  – So, here θ1 = 1 is the best value for θ1.
  – The line which has the least sum of squared errors is the best-fit line.

Important Equations

Hypothesis: hθ(x) = θ0 + θ1·x

Parameters: θ0, θ1

Cost Function: J(θ0, θ1) = (1/2m) Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))²

Goal: minimize J(θ0, θ1) over θ0, θ1

Cost Function for two parameters

(Left: for fixed θ0, θ1, hθ(x) as a function of x, plotted over the housing data with Price ($) in 1000's against Size in feet². Right: J as a function of the parameters θ0, θ1.)

Cost Function for two parameters

• Previously we plotted our cost function by plotting θ1 vs. J(θ1).
• Now we have two parameters:
  – The plot becomes a bit more complicated.
  – It generates a 3D surface plot where the axes are:
    • X = θ1
    • Z = θ0
    • Y = J(θ0, θ1)

Cost Function for two parameters

• We can see that the height (y) of the graph indicates the value of the cost function.
• We need to find where y is at a minimum.

Cost Function for two parameters

• A contour plot is a graphical technique for representing a 3-dimensional surface by plotting constant-z slices, called contours, in a 2-dimensional format.

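A sketch of how such a contour plot of J(θ0, θ1) could be produced for the housing data (matplotlib assumed available; the grid ranges are guesses, not from the slides):

    import numpy as np
    import matplotlib.pyplot as plt

    xs = np.array([2104, 1416, 1534, 852])
    ys = np.array([460, 232, 315, 178])

    # Evaluate J over a grid of (theta0, theta1) values
    t0, t1 = np.meshgrid(np.linspace(-200, 400, 100), np.linspace(-0.2, 0.6, 100))
    J = sum((t0 + t1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

    plt.contour(t0, t1, J, levels=30)  # constant-J slices drawn as contours
    plt.xlabel("theta0"); plt.ylabel("theta1")
    plt.show()
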
Gradient descent

• We want to find min J(θ0, θ1).
• Gradient descent is used all over machine learning for minimization.

• Outline:
  – Start with some initial θ0, θ1.
  – Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum.

Gradient descent

 Start with initial guesses.
    Start at (0, 0), or any other value.
 Keep changing θ0 and θ1 a little bit to try and reduce J(θ0, θ1).
    Each time you change the parameters, you move in the direction that reduces J(θ0, θ1) the most.
 Repeat.
    Do so until you converge to a local minimum.
 Gradient descent has an interesting property:
    Where you start can determine which minimum you end up in.
    Here, one initialization point led to one local minimum; the other led to a different one.

Gradient descent

• One initialization point led to one local minimum. The other led to a different one.

Gradient Descent Algorithm

• Gradient descent is used to minimize the MSE by calculating the gradient of the cost function:

repeat until convergence {
    θj := θj − α · (∂/∂θj) J(θ0, θ1)    (simultaneously for j = 0 and j = 1)
}

• Correct: a simultaneous update, where both new values are computed from the current θ0 and θ1 before either is assigned. Incorrect: updating θ0 first and then using the new θ0 when computing θ1's update.

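A sketch of one simultaneous update step (the data, starting point, and α are illustrative; the temp variables are what make the update "correct"):

    xs, ys = [1, 2, 3], [1, 2, 3]          # small example data set
    theta0, theta1, alpha = 0.0, 0.0, 0.1  # illustrative start and learning rate
    m = len(xs)

    # Partial derivatives of the squared-error cost at the *current* parameters
    d0 = sum(theta0 + theta1 * x - y for x, y in zip(xs, ys)) / m
    d1 = sum((theta0 + theta1 * x - y) * x for x, y in zip(xs, ys)) / m

    # Correct: both temps are computed from the old (theta0, theta1), then assigned
    temp0 = theta0 - alpha * d0
    temp1 = theta1 - alpha * d1
    theta0, theta1 = temp0, temp1

    # Incorrect would be to overwrite theta0 first and then compute theta1's
    # derivative with the already-updated theta0.
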
Learning Rate

• Here, α is the learning rate, a hyperparameter.
• It controls how big a step we take on each update.
• If α is small, we take tiny steps.
• If α is big, we get an aggressive gradient descent.

Learning Rate

• If α is too small, gradient descent can be slow (higher training time).
• If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.

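Both failure modes are easy to reproduce on the θ0 = 0 example from earlier (the α values are illustrative; the minimum is at θ1 = 1):

    xs, ys = [1, 2, 3], [1, 2, 3]  # assumed example data
    m = len(xs)

    for alpha in (0.01, 0.1, 0.5):  # too small / reasonable / too large here
        theta1 = 0.0
        for _ in range(20):
            grad = sum((theta1 * x - y) * x for x, y in zip(xs, ys)) / m
            theta1 -= alpha * grad
        print(alpha, theta1)
    # alpha = 0.01 is still far from 1 after 20 steps (slow);
    # alpha = 0.1 converges to ~1; alpha = 0.5 overshoots and diverges.
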
Local Minima

• Local minimum: the value of the loss function is minimum at that point within a local region.
• Global minimum: the value of the loss function is minimum globally, across the entire domain of the loss function.

Local Minima

at local minima

Global minima

North South University Source: Andrew NG Lectures CSE465 Md. Shahriar Hussain 53
Gradient Descent Calculation

Plugging the hypothesis into the cost function and differentiating gives the two partial derivatives used in the update rule:

∂/∂θ0 J(θ0, θ1) = (1/m) Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))

∂/∂θ1 J(θ0, θ1) = (1/m) Σ_{i=1}^{m} (hθ(x^(i)) − y^(i)) · x^(i)

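Putting the pieces together, a sketch of a full gradient-descent fit on the housing table (the feature is scaled so that a single α works for both parameters; all constants here are illustrative choices, not from the slides):

    xs = [2104, 1416, 1534, 852]   # size in feet^2
    ys = [460, 232, 315, 178]      # price in $1000's
    m = len(xs)

    # Scale x to roughly [0, 1] so one learning rate suits both parameters
    xmax = max(xs)
    xn = [x / xmax for x in xs]

    theta0, theta1, alpha = 0.0, 0.0, 0.5
    for _ in range(5000):
        d0 = sum(theta0 + theta1 * x - y for x, y in zip(xn, ys)) / m
        d1 = sum((theta0 + theta1 * x - y) * x for x, y in zip(xn, ys)) / m
        theta0, theta1 = theta0 - alpha * d0, theta1 - alpha * d1  # simultaneous

    # Undo the scaling: price ~ theta0 + (theta1 / xmax) * size
    print(theta0, theta1 / xmax)  # roughly -42.8 and 0.23
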
• Reference:
  – Andrew NG, Lectures on Machine Learning, Stanford University
