CS 60050

Machine Learning

Evaluation and Error analysis


Validation and Regularization

Some slides taken from course materials of Andrew Ng


How to evaluate a model?
• Regression
– Some measure of how close the predicted values (from the model) are to the actual values

• Classification
– Whether the predicted classes match the actual classes
Evaluation metrics for Regression
• Mean Squared Error (MSE)
– For every data point, compute error (distance between
predicted value and actual value)
– Sum squares of these errors, and take average
– More popular variant: RMSE (square root of MSE)
• R2 or R-squared
– A naïve Simple Average Model (SAM): for every point,
predict the average of all points
– R2: 1 – (error of model / error of SAM)
– Best possible R2 is 1; can be negative for a really bad model
R2 or R-squared
• Dataset has N instances (xi, yi), i = 1..N
• Predicted values: fi, i = 1..N
• Mean of actual values: ȳ = (1/N) Σi yi

• Residual sum of squares: SSres = Σi (yi − fi)²

• Total sum of squares (proportional to variance): SStot = Σi (yi − ȳ)²

• R2 = 1 − SSres / SStot
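
A minimal NumPy sketch of these regression metrics (the array names y_true and y_pred are illustrative, not from the slides):

    import numpy as np

    def regression_metrics(y_true, y_pred):
        # Mean Squared Error: average of squared residuals
        mse = np.mean((y_true - y_pred) ** 2)
        rmse = np.sqrt(mse)                               # same units as y
        ss_res = np.sum((y_true - y_pred) ** 2)           # residual sum of squares
        ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)  # total sum of squares
        r2 = 1.0 - ss_res / ss_tot                        # best is 1; can be negative
        return mse, rmse, r2

    y_true = np.array([3.0, 5.0, 7.5, 10.0])
    y_pred = np.array([2.8, 5.4, 7.0, 9.5])
    print(regression_metrics(y_true, y_pred))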
Evaluation metrics for classification
• Let y = actual class, h = predicted class for an example

• Accuracy: Out of all examples, for what fraction is h = y?

• But accuracy is often not sufficient to indicate performance in practice
Skewed classes
• Often the class of interest is a rare class (y=1)
– Spam emails / social network accounts
– Cancerous cells
– Fraudulent credit card transactions
• Precision: Out of all examples for which model
predicted h=1, for what fraction is y=1?
• Recall: Of all examples for which y=1, for what
fraction did model correctly predict h=1?
Precision vs. Recall: tradeoff
• Predict y=1 if the hypothesis output h(x) > some threshold
• Predict y=1 only if highly confident: high precision, lower recall
• Avoid missing too many cases with y=1: high recall, lower precision

• F-score: harmonic mean of Precision and Recall
  F1 = 2 · Precision · Recall / (Precision + Recall)
Confusion Matrix

                     h = +1              h = -1
   y = +1        True positive       False negative
   y = -1        False positive      True negative

Precision: (True positive) / (True positive + False positive)

Recall: (True positive) / (True positive + False negative)
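
A short Python sketch (NumPy only, labels coded as +1/-1 as above; function and variable names are illustrative) that tallies these counts and derives Precision, Recall and the F-score:

    import numpy as np

    def precision_recall_f1(y_true, y_pred):
        # Confusion-matrix counts for the positive class (+1)
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == -1))
        fn = np.sum((y_pred == -1) & (y_true == 1))
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    y_true = np.array([1, -1, 1, 1, -1, -1, 1])
    y_pred = np.array([1, -1, -1, 1, 1, -1, 1])
    print(precision_recall_f1(y_true, y_pred))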


Another format of confusion matrix

• Two types of errors:
– False positive/accept: hypothesis +1, true label -1
– False negative/reject: hypothesis -1, true label +1
Two types of errors

• How do we penalize the two types of errors?

• Which is more important: higher Precision or higher Recall?

• Depends on the specific application

Example: Fingerprint verification

• Input fingerprint, classify as known identity or intruder

• Application 1: Supermarket verifies customers for giving a discount
– A false accept costs little, but a false reject annoys a genuine customer: favour Recall

• Application 2: For entering into RAW, GoI
– A false accept is very costly, a false reject is only an inconvenience: favour Precision
On what data to measure
precision, recall, error rate, ..?
• Option 1: training set
• Option 2: some other set of examples that was unknown at the time of training (test set)

• Motivation for ML: learn a model that performs well on (generalizes well to) unknown examples
• Option 2 gives better guarantees for generalization of a learnt model
Error Analysis

Bias and Variance


Example: Linear regression (housing prices)
(Figure: housing Price vs. Size)
Fitting a linear function

Fitting a quadratic function

Fitting a higher order function


Bias vs. variance in linear regression
(Figure: Price vs. Size, fitted with models of increasing complexity)

High bias            "Just right"            High variance
(underfitting)                               (overfitting)
Overfitting

If we have too many features, the learned hypothesis may fit the training set very well but fail to generalize to new examples.


Bias vs. variance in logistic regression
Example: Logistic regression
Sources of noise and error
• While learning a target function using a training set
• Two sources of noise
– Some training points may not come exactly from
the target function: stochastic noise
– The target function may be too complex to capture
using the chosen hypothesis set: deterministic noise

• Generalization error: Model tries to fit the noise in the training data, which gets extrapolated to the test set
Ways to handle noise
• Validation
– Check performance on data other than training
data, and tune model accordingly

• Regularization
– Constrain the model so that the noise cannot be learnt too well
Validation
Validation

• Divide given data into train set and test set
– E.g., 80% train and 20% test
– Better to select randomly
• Learn parameters using training set
• Check performance (validate the model) on test set, using measures such as accuracy, misclassification rate, etc.
• Trade-off: more data for training vs. validation
An example: model selection
• Which order polynomial will best fit a given dataset?
Polynomials available: h1, h2, …, h10
• As if an extra parameter - the degree of the polynomial - is to be learned
• Approach
– Divide into train and test set
– Train each hypothesis on the train set, measure error on the test set
– Select the hypothesis with minimum test set error
An example: model selection
• Problem with the previous approach
– The test set error we computed is not a true
estimate of generalization error
– Since our extra parameter (order of polynomial) is
fit to the test set
An example: model selection
• Approach 2
– Divide data into train set (60%), validation set
(20%) and test set (20%)
– Select that hypothesis which gives lowest error on
validation set
– Use test set to estimate generalization error

• Note: Test set not at all seen during training
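
A minimal Python sketch of this train/validation/test workflow for choosing the polynomial degree (NumPy only; the 60/20/20 split and the degrees h1..h10 follow the slide, while the synthetic data and function names are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 200)
    y = np.sin(3 * x) + 0.1 * rng.standard_normal(200)    # noisy target

    # 60% train, 20% validation, 20% test (random split)
    idx = rng.permutation(len(x))
    tr, va, te = idx[:120], idx[120:160], idx[160:]

    def fit_poly(deg):
        return np.polyfit(x[tr], y[tr], deg)               # fit on the train set only

    def mse(coeffs, ids):
        return np.mean((np.polyval(coeffs, x[ids]) - y[ids]) ** 2)

    # Pick the degree (h1..h10) with the lowest validation error
    models = {d: fit_poly(d) for d in range(1, 11)}
    best_d = min(models, key=lambda d: mse(models[d], va))

    # The test set is used only once, to estimate generalization error
    print("chosen degree:", best_d, "test MSE:", mse(models[best_d], te))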


Popular methods of evaluating a
classifier

• Holdout method
– Split data into train and test set (usually 2/3 for train and 1/3 for test). Learn the model using the train set and measure performance over the test set

– Usually used when there is sufficiently large data, since the train and test sets each get only a part of it
Popular methods of evaluating a
classifier

• Repeated Holdout method
– Repeat the Holdout method multiple times with different subsets used for train/test
– In each iteration, a certain portion of data is randomly selected for training, rest for testing
– The error rates on the different iterations are averaged to yield an overall error rate
– More reliable than simple Holdout
Popular methods of evaluating a
classifier
• k-fold cross-validation
– First step: data is split into k subsets of equal size
– Second step: each subset in turn is used for testing and the remainder for training
– Performance measures averaged over all folds

• Popular choice for k: 10 or 5

• Advantage: all available data points are used to both train and test the model
k-fold cross validation (shown for k=3)

              Fold 1    Fold 2    Fold 3
   Split 1:   train     train     test
   Split 2:   train     test      train
   Split 3:   test      train     train
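
A minimal k-fold cross-validation sketch in plain NumPy (k=3 as in the diagram; the nearest-mean classifier and the synthetic data are purely illustrative):

    import numpy as np

    def kfold_indices(n, k, seed=0):
        # Shuffle the indices once, then cut them into k roughly equal folds
        idx = np.random.default_rng(seed).permutation(n)
        return np.array_split(idx, k)

    def nearest_mean_accuracy(X_tr, y_tr, X_te, y_te):
        # Toy classifier: assign each test point to the class with the closer mean
        means = {c: X_tr[y_tr == c].mean(axis=0) for c in np.unique(y_tr)}
        preds = [min(means, key=lambda c: np.linalg.norm(p - means[c])) for p in X_te]
        return np.mean(np.array(preds) == y_te)

    # Tiny synthetic dataset: two Gaussian blobs
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (30, 2)), rng.normal(3, 1, (30, 2))])
    y = np.array([0] * 30 + [1] * 30)

    folds = kfold_indices(len(X), k=3)
    scores = []
    for i in range(3):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(3) if j != i])
        scores.append(nearest_mean_accuracy(X[train_idx], y[train_idx],
                                            X[test_idx], y[test_idx]))
    print("per-fold accuracy:", scores, "average:", np.mean(scores))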


Regularization
Addressing overfitting: Two ways

1. Reduce number of features
― Manually select which features to keep
― Problem: loss of some information (discarded features)

2. Regularization
― Keep all the features, but reduce magnitude/values of the parameters
― Works well when we have a lot of features, each of which contributes a bit to the prediction
Intuition of regularization

(Figures: Price vs. Size of house, a quadratic fit vs. a higher-order fit)

Suppose we penalize θ3 and θ4 and make them really small, by adding
  + K·θ3² + K·θ4²   (K a large constant)
to the usual least-squares cost. The learnt θ3 and θ4 are then driven close to 0, and the higher-order model behaves almost like the quadratic one.
Regularization for linear regression

J(θ) = (1/2m) [ Σi (hθ(x(i)) − y(i))² + λ Σj θj² ],   j = 1..n

By convention, regularization is not applied on θ0 (makes little difference to the solution)

λ: Regularization parameter

Smaller values of the parameters lead to more generalizable models, less overfitting
Regularization for linear regression
In regularized linear regression, we choose θ to minimize

J(θ) = (1/2m) [ Σi (hθ(x(i)) − y(i))² + λ Σj θj² ]

λ: Regularization parameter
- Controls the trade-off between our two goals
- (1) fitting the training data well
- (2) keeping the values of the parameters small

- What if λ is too large? Underfitting

L1 and L2 regularization

• What we are discussing is called L2 regularization or "ridge" regularization
– adds squared magnitude of parameters as penalty term

• Look up L1 or "Lasso" regularization
– adds absolute value of magnitude of parameters as penalty term
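
If scikit-learn is available, the two penalties correspond to its Ridge (L2) and Lasso (L1) estimators; a brief sketch (here alpha plays the role of λ, and the synthetic data is illustrative):

    import numpy as np
    from sklearn.linear_model import Ridge, Lasso

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 20))
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

    ridge = Ridge(alpha=1.0).fit(X, y)   # L2: shrinks all coefficients towards 0
    lasso = Lasso(alpha=0.1).fit(X, y)   # L1: drives many coefficients exactly to 0

    print("nonzero ridge coefficients:", np.sum(ridge.coef_ != 0))
    print("nonzero lasso coefficients:", np.sum(lasso.coef_ != 0))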
Regularized linear regression
Gradient Descent for ordinary linear regression
Repeat {
  θj := θj − α (1/m) Σi (hθ(x(i)) − y(i)) xj(i)        (for j = 0, 1, …, n)
}

Regularized linear regression
Gradient Descent for Regularized Linear Regression
Repeat {
  θ0 := θ0 − α (1/m) Σi (hθ(x(i)) − y(i)) x0(i)
  θj := θj − α [ (1/m) Σi (hθ(x(i)) − y(i)) xj(i) + (λ/m) θj ]   (for j = 1, …, n)
}
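
A minimal NumPy sketch of this regularized update rule (the learning rate α, λ and the toy data are illustrative; θ0 is left unpenalized):

    import numpy as np

    def regularized_linreg_gd(X, y, lam=0.5, alpha=0.1, iters=500):
        # X: shape (m, n+1) with a leading column of ones
        m, d = X.shape
        theta = np.zeros(d)
        for _ in range(iters):
            grad = X.T @ (X @ theta - y) / m        # ordinary least-squares gradient
            grad[1:] += (lam / m) * theta[1:]       # regularization term, skipping theta_0
            theta -= alpha * grad
        return theta

    X = np.column_stack([np.ones(5), np.array([1.0, 2.0, 3.0, 4.0, 5.0])])
    y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
    print(regularized_linreg_gd(X, y))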
Regularized logistic regression

Gradient descent for ordinary logistic regression, where hθ(x) = 1 / (1 + e^(−θᵀx)):
Repeat {
  θj := θj − α (1/m) Σi (hθ(x(i)) − y(i)) xj(i)        (for j = 0, 1, …, n)
}

Gradient Descent for Regularized Logistic Regression
Repeat {
  θ0 := θ0 − α (1/m) Σi (hθ(x(i)) − y(i)) x0(i)
  θj := θj − α [ (1/m) Σi (hθ(x(i)) − y(i)) xj(i) + (λ/m) θj ]   (for j = 1, …, n)
}
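
The corresponding sketch for regularized logistic regression (same structure; only the hypothesis changes to the sigmoid, and labels are coded 0/1 to match this gradient form):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def regularized_logreg_gd(X, y, lam=0.1, alpha=0.1, iters=2000):
        # X: shape (m, n+1) with a leading column of ones; y in {0, 1}
        m, d = X.shape
        theta = np.zeros(d)
        for _ in range(iters):
            grad = X.T @ (sigmoid(X @ theta) - y) / m
            grad[1:] += (lam / m) * theta[1:]       # do not regularize theta_0
            theta -= alpha * grad
        return theta

    X = np.column_stack([np.ones(6), np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])])
    y = np.array([0, 0, 0, 1, 1, 1])
    print(regularized_logreg_gd(X, y))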
Bias vs. Variance
A closer look

Example: Linear regression
(Figure: Price vs. Size, fitted with models of increasing complexity)

High bias            "Just right"            High variance
(underfit)                                   (overfit)
Example: Logistic regression
Analysing bias vs. variance

• Suppose your model is not performing as well as expected. Is it a bias problem or a variance problem?

(Figure: training error and validation/test error plotted against degree of polynomial d)

Bias (underfit): both training error and validation/test error are high

Variance (overfit): low training error, high validation/test error
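
A tiny illustrative helper encoding this diagnosis rule (the error thresholds are arbitrary placeholders, not from the slides):

    def diagnose(train_error, val_error, acceptable_error=0.10, gap_tol=0.05):
        # High bias: both errors are high
        if train_error > acceptable_error and val_error > acceptable_error:
            return "high bias (underfitting)"
        # High variance: low training error but a much higher validation error
        if train_error <= acceptable_error and val_error - train_error > gap_tol:
            return "high variance (overfitting)"
        return "looks reasonable"

    print(diagnose(train_error=0.02, val_error=0.25))   # -> high variance (overfitting)
    print(diagnose(train_error=0.30, val_error=0.32))   # -> high bias (underfitting)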
Bias vs. Variance

• Bias and variance both contribute to the error of a classifier
• Variance is error due to randomness in how the training data was selected (the variance of an estimate refers to how much the estimate will vary from sample to sample)
• Bias is error due to something systematic, not random
Will more training data help?
• A learnt model is not performing as well as expected. Will having more training data help?

• Note that there can be substantial cost for getting more training data.

• If model is suffering from high bias, getting more training data will not (by itself) help much.
• If model is suffering from high variance, getting more training data is likely to help
Practical approach
• Divide data into training set and validation set
• Start with simple algorithm, train on different
amounts of training data, test performance on
validation set
• Plot learning curves to decide if more training data or more features are likely to help
• Error analysis: Manually examine the examples (in
validation set) where algorithm made errors. Any
systematic trend in what type of examples it is
making errors on?
Learning curves
• How do training error (in-sample error) and test or
validation error (out-of-sample error) generally vary
with number of training points?
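
A minimal NumPy sketch of such learning curves: fit on increasingly large training subsets and record both errors (the quadratic target, noise level and split sizes are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-2, 2, 300)
    y = 1.0 + 2.0 * x - 0.5 * x**2 + 0.3 * rng.standard_normal(300)

    # Fixed validation set; training subsets of growing size
    x_tr, y_tr, x_va, y_va = x[:200], y[:200], x[200:], y[200:]

    def mse(coeffs, xs, ys):
        return np.mean((np.polyval(coeffs, xs) - ys) ** 2)

    for m in [5, 10, 20, 50, 100, 200]:
        c = np.polyfit(x_tr[:m], y_tr[:m], deg=2)   # fit on the first m training points
        print(f"m={m:3d}  train error={mse(c, x_tr[:m], y_tr[:m]):.3f}  "
              f"validation error={mse(c, x_va, y_va):.3f}")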
