
Regression and Classification
with Linear Models

CMPSCI 383
Nov 15, 2011

1
Today's topics

• Learning from Examples: brief review
• Univariate Linear Regression
• Batch gradient descent
• Stochastic gradient descent
• Multivariate Linear Regression
• Regularization
• Linear Classifiers
• Perceptron learning rule
• Logistic Regression

2
Learning from Examples (supervised learning)

3–8
Important issues

• Generalization
• Overfitting
• Cross-validation
• Holdout cross-validation
• K-fold cross-validation
• Leave-one-out cross-validation
• Model selection

9
Recall Notation

Training set: (x_1, y_1), (x_2, y_2), …, (x_N, y_N)

where each y_j was generated by an unknown function y = f(x).

Discover a function h (the hypothesis) that best approximates the true function f.

10
Loss Functions

Suppose the true prediction for input x is f(x) = y, but the hypothesis gives h(x) = ŷ.

L(x, y, ŷ) = Utility(result of using y given input x) − Utility(result of using ŷ given input x)

Simplified version: L(y, ŷ)

Absolute-value loss: L_1(y, ŷ) = |y − ŷ|
Squared-error loss: L_2(y, ŷ) = (y − ŷ)^2
0/1 loss: L_{0/1}(y, ŷ) = 0 if y = ŷ, else 1

Generalization loss: expected loss over all possible examples
Empirical loss: average loss over the available examples

11
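
As a concrete illustration (a minimal sketch, not part of the original slides; the function names are my own), these losses and the empirical loss can be written directly in Python:

def abs_loss(y, y_hat):
    """Absolute-value loss: L1(y, y_hat) = |y - y_hat|."""
    return abs(y - y_hat)

def squared_loss(y, y_hat):
    """Squared-error loss: L2(y, y_hat) = (y - y_hat)^2."""
    return (y - y_hat) ** 2

def zero_one_loss(y, y_hat):
    """0/1 loss: 0 if the prediction is exactly right, else 1."""
    return 0 if y == y_hat else 1

def empirical_loss(loss, h, examples):
    """Average loss of hypothesis h over the available (x, y) examples."""
    return sum(loss(y, h(x)) for x, y in examples) / len(examples)

# Example: evaluate a toy hypothesis on three examples.
examples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
h = lambda x: 2.0 * x
print(empirical_loss(squared_loss, h, examples))
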
Univariate Linear Regression

12
Univariate Linear Regression contd.

Weight vector: w = [w_0, w_1]

h_w(x) = w_1 x + w_0

Find the weight vector that minimizes the empirical loss, e.g., L_2:

Loss(h_w) = ∑_{j=1}^N L_2(y_j, h_w(x_j)) = ∑_{j=1}^N (y_j − h_w(x_j))^2 = ∑_{j=1}^N (y_j − (w_1 x_j + w_0))^2

i.e., find w* such that

w* = argmin_w Loss(h_w)

13


Weight Space

14
Finding w*

Find weights such that:

∂/∂w_0 ∑_{j=1}^N (y_j − (w_1 x_j + w_0))^2 = 0   and   ∂/∂w_1 ∑_{j=1}^N (y_j − (w_1 x_j + w_0))^2 = 0

15
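
Setting both partial derivatives to zero and solving gives the standard closed-form least-squares solution for w_1 and w_0. A minimal Python sketch (my own illustration, not from the slides):

def fit_univariate(xs, ys):
    """Closed-form least-squares fit: returns (w0, w1) minimizing
    sum_j (y_j - (w1*x_j + w0))^2."""
    N = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    w1 = (N * sum_xy - sum_x * sum_y) / (N * sum_xx - sum_x ** 2)
    w0 = (sum_y - w1 * sum_x) / N
    return w0, w1

# Noise-free line y = 3x + 1 should be recovered exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 4.0, 7.0, 10.0]
print(fit_univariate(xs, ys))  # approximately (1.0, 3.0)
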
Gradient Descent

w_i ← w_i − α (∂/∂w_i) Loss(w)

where α is the step size or learning rate.

16
Gradient Descent contd.

For one training example (x, y):

w_0 ← w_0 + α (y − h_w(x))   and   w_1 ← w_1 + α (y − h_w(x)) x

For N training examples:

w_0 ← w_0 + α ∑_j (y_j − h_w(x_j))   and   w_1 ← w_1 + α ∑_j (y_j − h_w(x_j)) x_j

This is batch gradient descent.

Stochastic gradient descent: take a step for one training example at a time.
17
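
A minimal sketch of the two schemes for the univariate case (my own illustration; the learning rate and epoch counts are arbitrary choices):

def batch_gd(xs, ys, alpha=0.01, epochs=1000):
    """Batch gradient descent: each step sums the error over all examples."""
    w0 = w1 = 0.0
    for _ in range(epochs):
        err = [y - (w1 * x + w0) for x, y in zip(xs, ys)]
        w0 += alpha * sum(err)
        w1 += alpha * sum(e * x for e, x in zip(err, xs))
    return w0, w1

def stochastic_gd(xs, ys, alpha=0.01, epochs=1000):
    """Stochastic gradient descent: take a step for one example at a time."""
    w0 = w1 = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = y - (w1 * x + w0)
            w0 += alpha * err
            w1 += alpha * err * x
    return w0, w1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 4.0, 7.0, 10.0]
print(batch_gd(xs, ys))       # both should approach (1.0, 3.0)
print(stochastic_gd(xs, ys))
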
The Multivariate case

h_sw(x_j) = w_0 + w_1 x_{j,1} + … + w_n x_{j,n} = w_0 + ∑_i w_i x_{j,i}

Augmented vectors: add a feature to each x by tacking on a 1: x_{j,0} = 1

Then:

h_sw(x_j) = w ⋅ x_j = w^T x_j = ∑_i w_i x_{j,i}

and the batch gradient descent update becomes:

w_i ← w_i + α ∑_j (y_j − h_w(x_j)) x_{j,i}

18
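
A sketch of the multivariate batch update with augmented inputs, written with numpy (my own illustration; variable names are hypothetical):

import numpy as np

def batch_gd_multivariate(X, y, alpha=0.01, epochs=2000):
    """Batch gradient descent for multivariate linear regression.
    X is the data matrix (one row per example); a column of 1s is
    prepended so that w[0] plays the role of the intercept w_0."""
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])  # augmented: x_{j,0} = 1
    w = np.zeros(Xa.shape[1])
    for _ in range(epochs):
        err = y - Xa @ w            # y_j - h_w(x_j) for every example
        w += alpha * Xa.T @ err     # w_i += alpha * sum_j err_j * x_{j,i}
    return w

# Data generated from y = 1 + 2*x1 + 1*x2, so the fit should approach [1, 2, 1].
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
y = np.array([2.0, 3.0, 4.0, 7.0])
print(batch_gd_multivariate(X, y))
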

The Multivariate case contd.

Or, solving analytically:

Let y be the vector of outputs for the training examples, and X the data matrix, in which each row is an input vector.

Solving y = Xw for w*:

w* = (X^T X)^{−1} X^T y

where (X^T X)^{−1} X^T is the pseudoinverse of the data matrix.

19
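
The same fit computed analytically with numpy (a sketch; numpy.linalg.pinv applies the pseudoinverse in a numerically safer way than forming the inverse of X^T X explicitly):

import numpy as np

X = np.array([[1.0, 0.0, 1.0],   # augmented data matrix: first column is the 1s feature
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0],
              [1.0, 2.0, 2.0]])
y = np.array([2.0, 3.0, 4.0, 7.0])

# Normal-equation form: w* = (X^T X)^{-1} X^T y
w_star = np.linalg.inv(X.T @ X) @ X.T @ y

# Equivalent, but more robust: use the pseudoinverse directly.
w_pinv = np.linalg.pinv(X) @ y

print(w_star)   # approximately [1. 2. 1.]
print(w_pinv)
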
Regularization

Cost(h) = EmpLoss(h) + λ Complexity(h)

Complexity(h_w) = L_q(w) = ∑_i |w_i|^q

20
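
A sketch of the regularized cost for the L1 (q = 1) and L2 (q = 2) penalties (illustrative only; λ is a free parameter, and in practice the intercept w_0 is often left out of the penalty, though it is included here for simplicity):

import numpy as np

def regularized_cost(w, X, y, lam, q):
    """Cost(h_w) = EmpLoss(h_w) + lambda * Complexity(h_w),
    with EmpLoss the average squared error and
    Complexity = L_q(w) = sum_i |w_i|^q."""
    emp_loss = np.mean((y - X @ w) ** 2)
    complexity = np.sum(np.abs(w) ** q)
    return emp_loss + lam * complexity

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
w = np.array([1.0, 2.0])
print(regularized_cost(w, X, y, lam=0.1, q=1))  # L1 penalty
print(regularized_cost(w, X, y, lam=0.1, q=2))  # L2 penalty
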
L1 vs. L2 Regularization

21
Linear Classification: hard thresholds

22
Linear Classification: hard thresholds contd.

• Decision boundary:
  • In the linear case: a linear separator, i.e., a hyperplane
• Linearly separable:
  • Data is linearly separable if the classes can be separated by a linear separator
• Classification hypothesis:

h_w(x) = Threshold(w ⋅ x), where Threshold(z) = 1 if z ≥ 0 and 0 otherwise

23
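
A sketch of the hard-threshold hypothesis on an augmented input, where w[0] plays the role of the bias (my own illustration; weights and inputs are made up):

import numpy as np

def threshold(z):
    """Threshold(z) = 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

def classify(w, x):
    """Hard-threshold hypothesis: h_w(x) = Threshold(w . x)."""
    return threshold(np.dot(w, x))

# The decision boundary w . x = 0 is a hyperplane in input space.
w = np.array([-1.5, 1.0, 1.0])   # bias weight, then one weight per feature
x = np.array([1.0, 0.5, 0.4])    # augmented input: leading 1, then features
print(classify(w, x))            # 0.5 + 0.4 - 1.5 < 0, so class 0
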
Perceptron Learning Rule

For a single sample (x, y):

w_i ← w_i + α (y − h_w(x)) x_i

• If the output is correct, i.e., y = h_w(x), then the weights don't change.
• If y = 1 but h_w(x) = 0, then w_i is increased when x_i is positive and decreased when x_i is negative.
• If y = 0 but h_w(x) = 1, then w_i is decreased when x_i is positive and increased when x_i is negative.

Perceptron Convergence Theorem: for any data set that's linearly separable and any training procedure that continues to present each training example, the learning rule is guaranteed to find a solution in a finite number of steps.

24
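
A minimal sketch of training with this rule (my own illustration; logical AND is linearly separable, so the rule converges in a finite number of passes, as the theorem guarantees):

import numpy as np

def threshold(z):
    return 1 if z >= 0 else 0

def train_perceptron(X, y, alpha=0.1, epochs=100):
    """Perceptron learning rule: w_i <- w_i + alpha * (y - h_w(x)) * x_i,
    applied one example at a time. Rows of X are augmented inputs (leading 1)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_j, y_j in zip(X, y):
            err = y_j - threshold(np.dot(w, x_j))
            w += alpha * err * x_j   # no change when the prediction is correct
    return w

# Logical AND, which is linearly separable.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])
w = train_perceptron(X, y)
print([threshold(np.dot(w, x)) for x in X])   # [0, 0, 0, 1]
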
Perceptron Performance

25
Linear Classification with Logistic Regression

An important function!
26
Logistic Regression

h_w(x) = Logistic(w ⋅ x) = 1 / (1 + e^{−w⋅x})

For a single sample (x, y) and the L_2 loss function:

w_i ← w_i + α (y − h_w(x)) h_w(x) (1 − h_w(x)) x_i

where h_w(x) (1 − h_w(x)) is the derivative of the logistic function.

27
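
A sketch of stochastic updates with this rule on a small separable data set (my own illustration; the data, learning rate, and epoch count are arbitrary):

import numpy as np

def logistic(z):
    """Logistic (sigmoid) function: 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, alpha=0.5, epochs=2000):
    """Stochastic updates for logistic regression with L2 loss:
    w_i <- w_i + alpha * (y - h_w(x)) * h_w(x) * (1 - h_w(x)) * x_i."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_j, y_j in zip(X, y):
            h = logistic(np.dot(w, x_j))
            w += alpha * (y_j - h) * h * (1.0 - h) * x_j
    return w

# One-dimensional separable data: negative x -> class 0, positive x -> class 1.
X = np.array([[1, -2], [1, -1], [1, 1], [1, 2]], dtype=float)   # augmented inputs
y = np.array([0, 0, 1, 1])
w = train_logistic(X, y)
print((logistic(X @ w) >= 0.5).astype(int))   # [0 0 1 1]
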
Logistic Regression Performance

separable case

28
Summary

• Learning from Examples: brief review
• Loss functions
• Generalization
• Overfitting
• Cross-validation
• Regularization
• Univariate Linear Regression
• Batch gradient descent
• Stochastic gradient descent
• Multivariate Linear Regression
• Regularization
• Linear Classifiers
• Perceptron learning rule
• Logistic Regression

29
Next Class

• Artificial Neural Networks, Nonparametric Models, & Support Vector Machines
• Secs. 18.7 – 18.9

30
