Lecture03 Linear Regression


UET

Since 2004

ĐẠI HỌC CÔNG NGHỆ, ĐHQGHN

VNU-University of Engineering and Technology

INT3405 - Machine Learning

Lecture 3: Linear Regression
Duc-Trong Le & Viet-Cuong Ta

Hanoi, 09/2023
Outline
● Supervised Learning
● Linear Regression with One Variable
○ Model Representation
○ Cost Functions
○ Gradient Descent
● Linear Regression with Multiple Variables
○ Learning rate
○ Normal Equation

FIT-CS INT3405 - Machine Learning 2

Recap: Random Variables


Supervised Learning
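● Supervised (Inductive) Learning
● Formalization
○ Input: $x \in \mathcal{X}$
○ Output: $y \in \mathcal{Y}$
○ Target function: $f : \mathcal{X} \to \mathcal{Y}$ (unknown)
○ Training Data: $\{(x^{(i)}, y^{(i)})\}_{i=1}^{m}$
○ Hypothesis: $h : \mathcal{X} \to \mathcal{Y}$, an approximation of $f$
○ Hypothesis space: $\mathcal{H}$, the set of candidate hypotheses $h$
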
A Learning Problem

[Figure: Input → Unknown Function → Output.]

The Statistical Learning Framework

Hypothesis Spaces
● Linear models: $h(x) = ax + b$
○ Infinitely many possible hypotheses!
○ Any choice of the coefficients $a$ and $b$ results in a possible hypothesis
● Polynomial models: $h(x) = a_0 + a_1 x + \dots + a_d x^d$
● Any nonlinear model
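To make the idea of a hypothesis space concrete, the sketch below builds individual hypotheses as Python functions. The helper names `linear_hypothesis` and `polynomial_hypothesis` are illustrative assumptions, not part of the lecture; each call with different coefficients picks out one member of the corresponding hypothesis space.

```python
from typing import Callable, Sequence


def linear_hypothesis(a: float, b: float) -> Callable[[float], float]:
    """Return the hypothesis h(x) = a*x + b for one choice of coefficients."""
    return lambda x: a * x + b


def polynomial_hypothesis(coeffs: Sequence[float]) -> Callable[[float], float]:
    """Return h(x) = c0 + c1*x + c2*x^2 + ... for the given coefficients."""
    return lambda x: sum(c * x ** k for k, c in enumerate(coeffs))


# Three different members of the hypothesis spaces (coefficients are arbitrary).
h1 = linear_hypothesis(2.0, 1.0)               # h(x) = 2x + 1
h2 = linear_hypothesis(-0.5, 3.0)              # h(x) = -0.5x + 3
h3 = polynomial_hypothesis([1.0, 0.0, 1.0])    # h(x) = 1 + x^2

print(h1(3.0), h2(3.0), h3(3.0))
```

Learning then amounts to searching this (infinite) set of functions for one that fits the training data well.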

Two Views of Learning
● Learning is the removal of our remaining uncertainty.
○ If we know that x and y are linearly dependent, then we could use the training data to infer the linear function
● Learning requires guessing a good, small hypothesis class.
○ We could start with a very small / simple class, and enlarge it until it contains a hypothesis that fits the data
● But we could be wrong
○ Our prior knowledge might be wrong
○ Our guess of the hypothesis class could be wrong
■ The smaller the hypothesis class, the more likely we are wrong

Two Strategies for Machine Learning
● Develop Languages for Expressing Prior Knowledge
○ Rule grammars and stochastic models
● Develop Flexible Hypothesis Spaces
○ Nested collections of hypotheses, rules, linear models, decision trees, neural networks, etc.
● In either case, the key is to
○ Develop efficient algorithms for finding a hypothesis that best approximates the target function by fitting the data

Key Issues in Machine Learning
● What are good hypothesis spaces?
○ Which spaces have been useful in practical applications and why?
● What algorithms can work with these spaces?
○ Are there general design principles for machine learning algorithms?
● How can we find the best hypothesis in an efficient way?
○ How to find the optimal solution efficiently (“optimization” question)
● How can we optimize accuracy on future data?
○ Known as the “overfitting” problem (i.e., “generalization” theory)
● How can we have confidence in the results?
○ How much training data is required to find an accurate hypothesis? (“statistical” question)
● Are some learning problems computationally intractable? (“computational” question)
● How can we formulate application problems as machine learning problems? (“engineering” question)

Regression with One Variable (1)
[Figure: scatter plot of housing prices (Portland, OR); x-axis: size (feet²), y-axis: price (in 1000s of dollars).]
● Supervised Learning: given the “right answer” for each example in the data.
● Regression Problem: predict real-valued output.

Regression with One Variable (2)

Training set of housing prices (Portland, OR):

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Notation:
m = number of training examples
x’s = “input” variable / features
y’s = “output” variable / “target” variable

Model Representation
[Figure: Training Set → Learning Algorithm → hypothesis h; size of house (x) → h → estimated price (y).]

How do we represent h?
$h_\theta(x) = \theta_0 + \theta_1 x$
Linear regression with one variable (“Univariate Linear Regression”).

How to choose the parameters $\theta_0, \theta_1$?
Formulation: Cost Function (1)
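Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

Parameters: $\theta_0, \theta_1$

Cost Function: mean squared error (MSE)
$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$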
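A minimal sketch of the mean squared error cost for univariate linear regression, evaluated on the four housing examples from the training-set slide. The slide's formula did not survive extraction, so the parameter names `theta0`, `theta1` and the conventional `1/(2m)` scaling (the factor 1/2 only simplifies the derivative) are assumptions here.

```python
from typing import Sequence

# Training set from the housing table: size in feet^2 (x), price in $1000s (y).
SIZES = [2104.0, 1416.0, 1534.0, 852.0]
PRICES = [460.0, 232.0, 315.0, 178.0]


def predict(theta0: float, theta1: float, x: float) -> float:
    """Hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0: float, theta1: float,
         xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared-error cost J = 1/(2m) * sum((h(x_i) - y_i)^2). Scaling is assumed."""
    m = len(xs)
    squared_errors = ((predict(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return sum(squared_errors) / (2 * m)


# With both parameters at zero every prediction is 0, so J is half the mean of y^2.
print(cost(0.0, 0.0, SIZES, PRICES))  # 49541.625
```

A lower value of `cost` means the line fits the training examples more closely; choosing the parameters is the minimization problem stated in the goal.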

Formulation: Cost Function (2)
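Simplified (set $\theta_0 = 0$)
Hypothesis: $h_\theta(x) = \theta_1 x$

Parameters: $\theta_1$

Cost Function: $J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_1} J(\theta_1)$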

Cost Function: Example (1)-(3)

[Figures: left panel, $h_\theta(x)$ (for fixed $\theta_1$, this is a function of $x$); right panel, $J(\theta_1)$ (a function of the parameter $\theta_1$).]
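The two views on these slides can be reproduced numerically: fix a value of the parameter, obtain a line through the origin, and record its cost. The sketch below does this for the simplified one-parameter hypothesis. The three data points are a hypothetical toy set chosen so that the minimum is easy to see, and the `1/(2m)` scaling is an assumption.

```python
from typing import Sequence


def simplified_cost(theta1: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Cost of the one-parameter hypothesis h(x) = theta1 * x (1/(2m) scaling assumed)."""
    m = len(xs)
    return sum((theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical toy data lying exactly on the line y = x.
XS = [1.0, 2.0, 3.0]
YS = [1.0, 2.0, 3.0]

# Each theta1 is one line on the left panel and one point on the right panel.
for theta1 in (0.0, 0.5, 1.0, 1.5):
    print(f"theta1 = {theta1:.1f}  ->  J = {simplified_cost(theta1, XS, YS):.4f}")
```

The cost is zero at `theta1 = 1.0`, where the line passes through every point, and grows symmetrically on either side, which is the bowl shape of $J$ as a function of the parameter.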

Cost Function (1)

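Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$

Parameters: $\theta_0, \theta_1$

Cost Function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$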

Cost Function (2)

[Figure: left panel, $h_\theta(x)$ over the housing data (for fixed $\theta_0, \theta_1$, this is a function of $x$), with axes size in feet² (x) and price ($) in 1000's; right panel, $J(\theta_0, \theta_1)$ (a function of the parameters $\theta_0, \theta_1$).]

Cost Function (3)
● Contour plots

Cost Function (4)-(5)

[Figures: left panel, $h_\theta(x)$ (for fixed $\theta_0, \theta_1$, this is a function of $x$); right panel, $J(\theta_0, \theta_1)$ (a function of the parameters $\theta_0, \theta_1$).]

Gradient Descent for Optimization (1)

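Given some objective function $J(\theta_0, \theta_1)$

Want to optimize: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$

Outline:
• Start with some $\theta_0, \theta_1$
• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum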

Gradient Descent for Optimization (2)


Gradient Descent Algorithm

repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (for $j = 0$ and $j = 1$)
}

$\alpha$: learning rate parameter (rule of thumb: 0.1)
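The update rule can be sketched on a one-dimensional objective before applying it to regression. The objective $J(\theta) = (\theta - 3)^2$ and the function name `gradient_descent` are illustrative assumptions; the learning rate default follows the slide's rule of thumb of 0.1.

```python
from typing import Callable


def gradient_descent(grad: Callable[[float], float], theta: float,
                     alpha: float = 0.1, steps: int = 200) -> float:
    """Repeat theta := theta - alpha * dJ/dtheta for a fixed number of steps."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta


# Illustrative objective J(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta_min = gradient_descent(lambda t: 2.0 * (t - 3.0), theta=0.0)
print(theta_min)  # approaches the minimizer theta = 3
```

Each step moves the parameter against the gradient, so the error to the minimizer shrinks by a constant factor here (0.8 per step with these settings).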

Gradient Descent for Linear Regression (1)
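Gradient descent algorithm:
repeat until convergence {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (for $j = 0$ and $j = 1$)
}

Linear Regression Model:
$h_\theta(x) = \theta_0 + \theta_1 x$
$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$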

Gradient Descent for Linear Regression (2)

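Gradient descent algorithm:
repeat until convergence {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
  $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$
}
Update $\theta_0$ and $\theta_1$ simultaneously.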
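A minimal sketch of gradient descent for univariate linear regression with the simultaneous update. The gradients assume the squared-error cost with `1/(2m)` scaling; the function name `fit_univariate`, the toy data (exactly on the line y = 1 + 2x), and the iteration count are hypothetical choices for illustration.

```python
from typing import Sequence, Tuple


def fit_univariate(xs: Sequence[float], ys: Sequence[float],
                   alpha: float = 0.1, iterations: int = 5000) -> Tuple[float, float]:
    """Fit h(x) = theta0 + theta1 * x by gradient descent on the squared-error cost."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients were computed from the old values.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical toy data generated by y = 1 + 2x.
theta0, theta1 = fit_univariate([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(theta0, theta1)
```

Computing both gradients before assigning either parameter is what "simultaneously" means; updating `theta0` first and then using the new value inside the `theta1` gradient would be a different algorithm. On the raw housing sizes (values in the thousands) this learning rate would be far too large, so the features would need rescaling or a much smaller step.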

Gradient Descent Example (1)-(9)

[Figures: each of the nine slides shows the same pair of panels. Left panel, $h_\theta(x)$ (for fixed $\theta_0, \theta_1$, this is a function of $x$); right panel, $J(\theta_0, \theta_1)$ (a function of the parameters $\theta_0, \theta_1$).]

Batch Gradient Descent

“Batch”: Each step of gradient descent uses all the training examples.

Multivariate Linear Regression (1)
Multiple features (variables).

Size (feet²)    Number of bedrooms    Number of floors    Age of home (years)    Price ($1000)
2104            5                     1                   45                     460
1416            3                     2                   40                     232
1534            3                     2                   30                     315
852             2                     1                   36                     178
…               …                     …                   …                      …

Notation:
$n$ = number of features
$x^{(i)}$ = input (features) of the $i$-th training example.
$x_j^{(i)}$ = value of feature $j$ in the $i$-th training example.

Multivariate Linear Regression (2)
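Hypothesis:

Previously: $h_\theta(x) = \theta_0 + \theta_1 x$

Now: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$

For convenience of notation, define $x_0 = 1$, so that $h_\theta(x) = \theta^T x$.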
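With the convention that a constant feature equal to 1 is prepended to every input, the multivariate hypothesis reduces to a single dot product. The sketch below shows this for the first row of the housing table; the helper names and the parameter values in `theta` are hypothetical and chosen only to make the arithmetic visible.

```python
from typing import List, Sequence


def add_bias(features: Sequence[float]) -> List[float]:
    """Prepend the constant feature x0 = 1 to a feature vector."""
    return [1.0] + list(features)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """Compute h(x) = theta^T x, where x already includes x0 = 1."""
    if len(theta) != len(x):
        raise ValueError("theta and x must have the same length")
    return sum(t * xj for t, xj in zip(theta, x))


# First training example: size, bedrooms, floors, age.
x = add_bias([2104.0, 5.0, 1.0, 45.0])

# Hypothetical parameters (not fitted to the data).
theta = [80.0, 0.1, 0.01, 3.0, -2.0]

print(x)
print(hypothesis(theta, x))
```

The intercept is no longer a special case: it is simply the weight on the constant feature, which keeps the gradient descent update identical for every parameter.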

Gradient Descent for Multivariate LR
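Hypothesis: $h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \dots + \theta_n x_n$

Parameters: $\theta_0, \theta_1, \dots, \theta_n$

Cost function: $J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Gradient descent:
Repeat {
  $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$
} (simultaneously update for every $j = 0, \dots, n$)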
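A minimal sketch of batch gradient descent for several features, written in plain Python so that each sum in the update is visible. It assumes the squared-error cost with `1/(2m)` scaling and a design matrix whose first column is the constant feature 1; the function name and the toy data (generated by y = 1 + 2·x1 + 3·x2) are hypothetical.

```python
from typing import List, Sequence


def batch_gradient_descent(X: Sequence[Sequence[float]], y: Sequence[float],
                           alpha: float, iterations: int) -> List[float]:
    """Minimize the squared-error cost; each row of X starts with x0 = 1."""
    m, n_params = len(X), len(X[0])
    theta = [0.0] * n_params
    for _ in range(iterations):
        # Errors h(x_i) - y_i for all m examples (this is the "batch").
        errors = [sum(t * xj for t, xj in zip(theta, row)) - yi
                  for row, yi in zip(X, y)]
        # Partial derivative of the cost with respect to each theta_j.
        gradient = [sum(e * row[j] for e, row in zip(errors, X)) / m
                    for j in range(n_params)]
        # Simultaneous update of every parameter.
        theta = [t - alpha * g for t, g in zip(theta, gradient)]
    return theta


# Hypothetical toy data: columns are x0 = 1, x1, x2; targets follow y = 1 + 2*x1 + 3*x2.
X = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
y = [1.0, 3.0, 4.0, 6.0]

theta = batch_gradient_descent(X, y, alpha=0.5, iterations=2000)
print(theta)
```

The same loop covers the univariate case when each row has only two entries, which is why the multivariate algorithm is a direct generalization.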

Univariate LR vs Multivariate LR

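Gradient Descent

Previously ($n = 1$):
Repeat {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
  $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$
} (simultaneously update $\theta_0, \theta_1$)

New algorithm ($n \ge 1$):
Repeat {
  $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
} (simultaneously update $\theta_j$ for $j = 0, \dots, n$)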

Convergence and Learning Rate

Example automatic convergence test:
Declare convergence if $J(\theta)$ decreases by less than a small threshold $\varepsilon$ in one iteration.

[Figure: $J(\theta)$ plotted against the number of iterations.]

For sufficiently small $\alpha$, $J(\theta)$ should decrease on every iteration.
But if $\alpha$ is too small, gradient descent can be slow to converge.
If $\alpha$ is too large: $J(\theta)$ may not decrease on every iteration; may not converge.
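The effect of the learning rate and the automatic convergence test can be sketched on a one-dimensional objective. Everything specific here is an assumption for illustration: the objective $J(\theta) = \theta^2$, the starting point, the threshold `epsilon`, the function name, and the three learning rates.

```python
from typing import Tuple


def minimize_quadratic(alpha: float, epsilon: float = 1e-9,
                       max_iterations: int = 10000) -> Tuple[str, int]:
    """Run gradient descent on J(theta) = theta^2 from theta = 1.

    Returns a status string and the number of iterations performed.
    """
    theta = 1.0
    cost = theta ** 2
    for iteration in range(1, max_iterations + 1):
        theta -= alpha * 2.0 * theta          # gradient of theta^2 is 2 * theta
        new_cost = theta ** 2
        if new_cost > cost:
            return "diverging", iteration     # cost went up: alpha is too large
        if cost - new_cost < epsilon:
            return "converged", iteration     # decrease smaller than the threshold
        cost = new_cost
    return "max_iterations", max_iterations


for alpha in (0.001, 0.1, 1.1):
    print(alpha, minimize_quadratic(alpha))
```

The small rate does converge but needs far more iterations than the moderate one, and the large rate increases the cost on the very first step, matching the three regimes described above.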
Learning Rate

[Figure: effect of the learning rate; panel labels: too small, too large (divergence), constant, gradually decreased.]

Normal Equation (1)
Gradient Descent
• Iterative approach
Normal Equation
• Analytical method to solve for the parameters directly
Intuition example: if the parameter $w$ is 1D ($w \in \mathbb{R}$), set the derivative $\frac{d}{dw} J(w)$ to zero.

Solve the equation to find $w$.
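As a worked instance of the 1D intuition, assume a single weight $w$ with hypothesis $h(x) = w x$ (no intercept) and the squared-error cost with $\frac{1}{2m}$ scaling; both are assumptions made for this illustration. Setting the derivative to zero gives the minimizer in closed form:

```latex
\begin{aligned}
J(w) &= \frac{1}{2m} \sum_{i=1}^{m} \left( w\,x^{(i)} - y^{(i)} \right)^2 \\
\frac{dJ}{dw} &= \frac{1}{m} \sum_{i=1}^{m} \left( w\,x^{(i)} - y^{(i)} \right) x^{(i)} = 0 \\
w \sum_{i=1}^{m} \left( x^{(i)} \right)^2 &= \sum_{i=1}^{m} x^{(i)} y^{(i)} \\
w &= \frac{\sum_{i=1}^{m} x^{(i)} y^{(i)}}{\sum_{i=1}^{m} \left( x^{(i)} \right)^2}
\end{aligned}
```

Because $J$ is a convex quadratic in $w$, this stationary point is the global minimum, provided at least one $x^{(i)}$ is nonzero.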

Normal Equation (2)

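
Normal Equation (3)
● Matrix-vector formulation: $J(w) = \frac{1}{2m} \lVert Xw - y \rVert^2$, where $w$ is the parameter vector, each row of $X$ is one training example (with $x_0 = 1$), and $y$ is the vector of targets

● Analytical solution: $w = (X^T X)^{-1} X^T y$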
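A minimal sketch of the normal equation in plain Python. Rather than forming the inverse explicitly, it solves the linear system $(X^T X)\,w = X^T y$ by Gaussian elimination, which gives the same solution whenever $X^T X$ is invertible. The function names are hypothetical, and the data are the size and price columns of the housing table with the constant feature 1 prepended.

```python
from typing import List, Sequence


def solve_linear_system(A: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs are left untouched.
    M = [list(map(float, row)) + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def normal_equation(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Return w solving (X^T X) w = X^T y; each row of X starts with x0 = 1."""
    n = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    return solve_linear_system(XtX, Xty)


# Housing data: x0 = 1 and size in feet^2; targets are prices in $1000s.
X = [[1.0, 2104.0], [1.0, 1416.0], [1.0, 1534.0], [1.0, 852.0]]
y = [460.0, 232.0, 315.0, 178.0]

w = normal_equation(X, y)
print(w)  # [intercept, slope]
```

No learning rate or iteration count is needed, and the raw feature scale causes no trouble here. The cost is building and solving a system whose size grows with the number of features, and the approach fails with a singular matrix when the features are linearly dependent or there are fewer examples than parameters.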

The Pseudo-inverse


Normal Equation: Example
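Examples (with $x_0 = 1$ prepended to every row):

x0    Size (feet²)    Number of bedrooms    Number of floors    Age of home (years)    Price ($1000)
1     2104            5                     1                   45                     460
1     1416            3                     2                   40                     232
1     1534            3                     2                   30                     315
1     852             2                     1                   36                     178

$w = (X^T X)^{-1} X^T y$, where $(X^T X)^{-1}$ is the inverse of matrix $X^T X$.
With only these 4 examples and 5 parameters, $X^T X$ is singular, so the pseudo-inverse has to replace the inverse for this table.
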
Gradient Descent vs Normal Equation
$m$ training examples, $n$ features.

Gradient Descent
• Need to choose $\alpha$.
• Needs many iterations.
• Works well even when $n$ is large.

Normal Equation
• No need to choose $\alpha$.
• Don’t need to iterate.
• Need to compute $(X^T X)^{-1}$.
• Slow if $n$ is very large.

Summary
● Supervised Learning
● Linear Regression with One Variable
○ Model Representation
○ Cost Functions
○ Gradient Descent
● Linear Regression with Multiple Variables
○ Learning rate
○ Normal Equation

Duc-Trong Le