Lecture 15: Regression

George Lan

A. Russell Chandler III Chair Professor


H. Milton Stewart School of Industrial & Systems Engineering
Machine learning for apartment hunting
Suppose you are about to move to Atlanta, and you want to find the most reasonably priced apartment satisfying your needs:
square footage, # of bedrooms, distance to campus, ...

Living area (ft^2)    # bedrooms    Rent ($)
230                   1             600
506                   2             1000
433                   2             1100
109                   1             500
150                   1             ?
270                   1.5           ?

The learning problem
Features:
Living area, distance to campus, # bedrooms, ...
Denoted as x = (x_1, x_2, \ldots, x_n)^T

Target:
Rent, denoted as y

Training set:
X = (x^{(1)}, x^{(2)}, \ldots, x^{(m)})
y = (y^{(1)}, y^{(2)}, \ldots, y^{(m)})^T

[Figures: scatter plots of rent vs. living area, and of rent vs. living area and location]

Linear Regression Model
Assume y is a linear function of x (the features) plus noise \epsilon:

    y = \theta_0 + \theta_1 x_1 + \cdots + \theta_n x_n + \epsilon

where \epsilon is an error term capturing unmodeled effects or random noise.

Let \theta = (\theta_0, \theta_1, \ldots, \theta_n)^T, and augment the data by one dimension:

    x \leftarrow (1, x)^T

Then y = \theta^T x + \epsilon.

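To make the augmentation concrete, here is a minimal NumPy sketch (not from the slides); the array names and the reuse of the apartment table above are purely illustrative.

import numpy as np

# Raw features, one column per training example (d x m):
# row 0 = living area, row 1 = # bedrooms (values from the apartment table above).
X_raw = np.array([[230.0, 506.0, 433.0, 109.0],
                  [1.0,   2.0,   2.0,   1.0]])

# Augment each example with a leading 1 so the intercept theta_0 is absorbed
# into theta: x <- (1, x)^T, giving a (d+1) x m matrix.
X = np.vstack([np.ones(X_raw.shape[1]), X_raw])

# A prediction is then simply theta^T x; for all m examples at once:
theta = np.zeros(X.shape[0])
y_hat = theta @ X
print(X.shape, y_hat.shape)   # (3, 4) (4,)
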
Least mean square method
Given m data points, find \theta that minimizes the mean square error:

    \hat{\theta} = \arg\min_\theta L(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right)^2

Our usual trick: set the gradient to 0 and solve for the parameter:

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right) x^{(i)} = 0

    \Leftrightarrow \; -\frac{2}{m} \sum_{i=1}^{m} y^{(i)} x^{(i)} + \frac{2}{m} \sum_{i=1}^{m} x^{(i)} x^{(i)T} \theta = 0

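A direct NumPy transcription of this loss and gradient (a sketch, assuming the column-per-example convention X of shape (d, m) used on the next slide):

import numpy as np

def mse_loss(theta, X, y):
    # L(theta) = (1/m) * sum_i (y^(i) - theta^T x^(i))^2
    m = X.shape[1]
    residual = y - theta @ X
    return residual @ residual / m

def mse_gradient(theta, X, y):
    # dL/dtheta = -(2/m) * sum_i (y^(i) - theta^T x^(i)) x^(i)
    m = X.shape[1]
    return -(2.0 / m) * (X @ (y - theta @ X))
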
Matrix version of the gradient

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} \sum_{i=1}^{m} y^{(i)} x^{(i)} + \frac{2}{m} \sum_{i=1}^{m} x^{(i)} x^{(i)T} \theta = 0

Equivalent to

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} \left( x^{(1)}, \ldots, x^{(m)} \right) \left( y^{(1)}, \ldots, y^{(m)} \right)^T + \frac{2}{m} \left( x^{(1)}, \ldots, x^{(m)} \right) \left( x^{(1)}, \ldots, x^{(m)} \right)^T \theta = 0

Define X = (x^{(1)}, x^{(2)}, \ldots, x^{(m)}) and y = (y^{(1)}, y^{(2)}, \ldots, y^{(m)})^T; the gradient becomes

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} X y + \frac{2}{m} X X^T \theta = 0

    \Rightarrow \hat{\theta} = (X X^T)^{-1} X y

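A sketch of the closed-form solution (assuming, as above, that X is the (d+1) x m matrix of augmented feature columns and y the length-m target vector); np.linalg.solve is applied to the normal equations rather than forming the inverse explicitly:

import numpy as np

def lms_closed_form(X, y):
    # Solve (X X^T) theta = X y for theta.
    return np.linalg.solve(X @ X.T, X @ y)

# Toy usage with the apartment data from the earlier slide.
X = np.array([[1.0,   1.0,   1.0,   1.0],    # augmented constant feature
              [230.0, 506.0, 433.0, 109.0],  # living area
              [1.0,   2.0,   2.0,   1.0]])   # bedrooms
y = np.array([600.0, 1000.0, 1100.0, 500.0])

theta_hat = lms_closed_form(X, y)
print(theta_hat)       # fitted parameters (theta_0, theta_1, theta_2)
print(theta_hat @ X)   # fitted rents for the training examples
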
Alternative way of obtaining \hat{\theta}
The matrix inversion in \hat{\theta} = (X X^T)^{-1} X y can be very expensive to compute.

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right) x^{(i)}

Gradient descent:

    \hat{\theta}^{(t+1)} \leftarrow \hat{\theta}^{(t)} + \frac{\alpha}{m} \sum_{i=1}^{m} \left( y^{(i)} - \hat{\theta}^{(t)T} x^{(i)} \right) x^{(i)}

Stochastic gradient descent (use one data point at a time):

    \hat{\theta}^{(t+1)} \leftarrow \hat{\theta}^{(t)} + \beta_t \left( y^{(i)} - \hat{\theta}^{(t)T} x^{(i)} \right) x^{(i)}

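A minimal sketch of both update rules (not part of the slides); the step sizes alpha and beta_t, the decay schedule, and the iteration counts below are illustrative choices rather than values prescribed in the lecture, and in practice the step size has to be tuned to the scale of the features.

import numpy as np

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    # Batch gradient descent: use all m points in every update.
    d, m = X.shape
    theta = np.zeros(d)
    for _ in range(num_iters):
        residual = y - theta @ X                  # y^(i) - theta^T x^(i), for all i
        theta = theta + (alpha / m) * (X @ residual)
    return theta

def stochastic_gradient_descent(X, y, beta0=0.1, num_iters=10000, seed=0):
    # SGD: update with one randomly chosen data point at a time,
    # using a decaying step size beta_t = beta0 / (1 + t).
    rng = np.random.default_rng(seed)
    d, m = X.shape
    theta = np.zeros(d)
    for t in range(num_iters):
        i = rng.integers(m)
        beta_t = beta0 / (1.0 + t)
        theta = theta + beta_t * (y[i] - theta @ X[:, i]) * X[:, i]
    return theta
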
A recap:
Stochastic gradient update rule

    \hat{\theta}^{(t+1)} \leftarrow \hat{\theta}^{(t)} + \beta_t \left( y^{(i)} - \hat{\theta}^{(t)T} x^{(i)} \right) x^{(i)}

Pros: on-line, low per-step cost
Cons: noisy updates, may converge slowly

Gradient descent

    \hat{\theta}^{(t+1)} \leftarrow \hat{\theta}^{(t)} + \frac{\alpha}{m} \sum_{i=1}^{m} \left( y^{(i)} - \hat{\theta}^{(t)T} x^{(i)} \right) x^{(i)}

Pros: fast-converging, easy to implement
Cons: needs to read all of the data at every step

Solve the normal equations

    (X X^T) \hat{\theta} = X y

Pros: a single-shot algorithm, the easiest to implement
Cons: needs the inverse (X X^T)^{-1}, which is expensive to compute and can run into numerical issues (e.g., when the matrix is singular)

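When X X^T is ill-conditioned or singular, a least-squares solver is a more robust alternative to forming the inverse; a sketch, reusing the toy apartment data from above:

import numpy as np

X = np.array([[1.0,   1.0,   1.0,   1.0],
              [230.0, 506.0, 433.0, 109.0],
              [1.0,   2.0,   2.0,   1.0]])
y = np.array([600.0, 1000.0, 1100.0, 500.0])

# np.linalg.lstsq solves min_theta ||X^T theta - y||^2 directly (SVD-based),
# sidestepping an explicit inverse of X X^T.
theta_hat, residuals, rank, singular_values = np.linalg.lstsq(X.T, y, rcond=None)
print(theta_hat)
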
Geometric Interpretation of LMS
The predictions on the training data are:

    \hat{y} = X^T \hat{\theta} = X^T (X X^T)^{-1} X y

Look at the residual \hat{y} - y:

    \hat{y} - y = \left( X^T (X X^T)^{-1} X - I \right) y

    X (\hat{y} - y) = X \left( X^T (X X^T)^{-1} X - I \right) y = 0

So \hat{y} is the orthogonal projection of y onto the space spanned by the columns of X^T.

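A quick numerical check of this orthogonality property (a sketch with the same toy data; not part of the slides):

import numpy as np

X = np.array([[1.0,   1.0,   1.0,   1.0],
              [230.0, 506.0, 433.0, 109.0],
              [1.0,   2.0,   2.0,   1.0]])
y = np.array([600.0, 1000.0, 1100.0, 500.0])

theta_hat = np.linalg.solve(X @ X.T, X @ y)
y_hat = X.T @ theta_hat        # predictions on the training data
print(X @ (y_hat - y))         # approximately the zero vector: the residual is
                               # orthogonal to every row of X
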
Probabilistic Interpretation of LMS
Assume y is linear in x plus noise \epsilon:

    y = \theta^T x + \epsilon

Assume \epsilon follows a Gaussian distribution N(0, \sigma^2). Then

    p(y^{(i)} \mid x^{(i)}; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{\left( y^{(i)} - \theta^T x^{(i)} \right)^2}{2\sigma^2} \right)

By the independence assumption, the likelihood is

    L(\theta) = \prod_{i=1}^{m} p(y^{(i)} \mid x^{(i)}; \theta) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} \right)^m \exp\left( -\frac{\sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right)^2}{2\sigma^2} \right)

Probabilistic Interpretation of LMS, cont.
Hence the log-likelihood is:

    \log L(\theta) = m \log \frac{1}{\sqrt{2\pi}\,\sigma} - \frac{1}{2\sigma^2} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right)^2

Do you recognize the last term?

    LMS: \; \frac{1}{m} \sum_{i} \left( y^{(i)} - \theta^T x^{(i)} \right)^2

Thus, under the independence assumption and the Gaussian noise assumption, LMS is equivalent to MLE of \theta!

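Spelling out the step (not stated explicitly on the slide): the first term of the log-likelihood does not depend on \theta and the factor -1/(2\sigma^2) is a negative constant, so

    \hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta} \log L(\theta)
        = \arg\min_{\theta} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right)^2
        = \arg\min_{\theta} \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T x^{(i)} \right)^2
        = \hat{\theta}_{\mathrm{LMS}}.
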
Nonlinear regression
Want to fit a polynomial regression model:

    y = \theta_0 + \theta_1 x + \theta_2 x^2 + \cdots + \theta_n x^n + \epsilon

Let \tilde{x} = (1, x, x^2, \ldots, x^n)^T and \theta = (\theta_0, \theta_1, \theta_2, \ldots, \theta_n)^T; then

    y = \theta^T \tilde{x} + \epsilon

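As a small illustration (not from the slides), np.vander builds exactly this feature vector \tilde{x} = (1, x, x^2, \ldots, x^n)^T:

import numpy as np

x, n = 2.0, 3
x_tilde = np.vander([x], N=n + 1, increasing=True)[0]
print(x_tilde)   # [1. 2. 4. 8.]  i.e. (1, x, x^2, x^3)
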
Least mean square method
Given m data points, find \theta that minimizes the mean square error:

    \hat{\theta} = \arg\min_\theta L(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T \tilde{x}^{(i)} \right)^2

Our usual trick: set the gradient to 0 and solve for the parameter:

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} \sum_{i=1}^{m} \left( y^{(i)} - \theta^T \tilde{x}^{(i)} \right) \tilde{x}^{(i)} = 0

    \Leftrightarrow \; -\frac{2}{m} \sum_{i=1}^{m} y^{(i)} \tilde{x}^{(i)} + \frac{2}{m} \sum_{i=1}^{m} \tilde{x}^{(i)} \tilde{x}^{(i)T} \theta = 0

Matrix version of the gradient
Define \tilde{X} = (\tilde{x}^{(1)}, \tilde{x}^{(2)}, \ldots, \tilde{x}^{(m)}) and y = (y^{(1)}, y^{(2)}, \ldots, y^{(m)})^T; the gradient becomes

    \frac{\partial L(\theta)}{\partial \theta} = -\frac{2}{m} \tilde{X} y + \frac{2}{m} \tilde{X} \tilde{X}^T \theta = 0

    \Rightarrow \hat{\theta} = (\tilde{X} \tilde{X}^T)^{-1} \tilde{X} y

Note that \tilde{x} = (1, x, x^2, \ldots, x^n)^T.

If we choose a different maximal degree n for the polynomial, the solution will be different.

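A sketch of fitting polynomials of several maximal degrees with this same normal-equation recipe; the synthetic data, the degrees tried, and the noise level are illustrative choices, not part of the lecture:

import numpy as np

def fit_polynomial(x, y, degree):
    # Columns of X_tilde are the feature vectors (1, x_i, x_i^2, ..., x_i^degree)^T;
    # solve (X_tilde X_tilde^T) theta = X_tilde y.
    X_tilde = np.vander(x, N=degree + 1, increasing=True).T   # (degree+1, m)
    return np.linalg.solve(X_tilde @ X_tilde.T, X_tilde @ y)

# Synthetic one-dimensional data (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

# Different maximal degrees give different solutions and different fits.
for n in (1, 3, 9):
    theta_hat = fit_polynomial(x, y, n)
    y_hat = np.vander(x, N=n + 1, increasing=True) @ theta_hat
    print(n, np.mean((y - y_hat) ** 2))   # training MSE typically shrinks as the degree grows
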
Example: head acceleration in an accident

[Figure: head-acceleration example data illustrating nonlinear regression]
