
Structuring Machine Learning Projects

30 July 2020 12:25

Orthogonalization:
Knowing what parameter/hyperparameter to tune in the model, in order to achieve what effect. If we want some change in the model, we know what parameter to tune to bring that change.

Example: For supervised learning, for every step we have the following knobs to tune the model:
1. Fit the training set well on the cost function --> (Bigger network, different optimization algorithm, etc.)
2. Fit the dev set well on the cost function --> (Regularization, bigger training set, etc.)
3. Fit the test set well on the cost function --> (Bigger dev set, etc.)
4. Performs well in the real world --> (Change the dev set, or the cost function, etc.)

SINGLE NUMBER EVALUATION METRIC:

Set up the goal of your project. How will you measure the success of the project?

Precision = True positives / (True positives + False positives)

Amongst all the examples classified as positive, how many did it classify correctly?
Example - A classifier has 95% precision. If it classifies something as a cat, then there is a 95% chance that it is actually a cat.

Recall = True positives / (True positives + False negatives)

Amongst all the positives in the data, how many did it identify correctly?
Example - A classifier has 98% recall. If we give the classifier 100 examples, it should identify about 98 of them correctly.

But both of them are required to evaluate a classifier. We want both of them. The standard way to combine precision and recall is the F1 score.

F1 score = 2 / (1/P + 1/R)   (Harmonic Mean of precision P and recall R)

The advantage is that even if one of them is low, the F1 score will be low. We want the classifier with the highest F1 score.

Another example:
If we have 2 classifiers that work differently for different geographies, how can we decide which one works better overall? Just take the average of each classifier's score over all the geographies and compare this average for the 2 classifiers.
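As a quick illustration (a minimal sketch, not part of the original notes; the counts below are made up), all three metrics can be computed directly from true positive, false positive, and false negative counts:

def precision_recall_f1(tp, fp, fn):
    # Precision: of everything predicted positive, how much was right?
    precision = tp / (tp + fp)
    # Recall: of all actual positives, how much did we find?
    recall = tp / (tp + fn)
    # F1: harmonic mean of precision and recall
    f1 = 2 / (1 / precision + 1 / recall)
    return precision, recall, f1

# Hypothetical counts (tp, fp, fn) for two classifiers evaluated on the same dev set
for name, counts in {"A": (90, 10, 15), "B": (85, 5, 20)}.items():
    p, r, f1 = precision_recall_f1(*counts)
    print(f"Classifier {name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")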

Satisficing and Optimizing metric:

Suppose we have the following example:
[Table comparing several classifiers on two metrics: accuracy and running time.]

Here, we can't choose just 1 of the metrics (both are important), and also, we can't combine both of them (different units).

So, we can form a metric as the following:
MAXIMIZE the accuracy
Subject to Running time ≤ 100 ms

Using this metric, we get that B is the best choice.

Here, Accuracy is the optimizing metric (which we want to achieve), and Running time is the satisficing metric (a minimum condition which we just want to satisfy).
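A small sketch of how such a combined rule could be applied in code; the classifier names and the (accuracy, running time) numbers are hypothetical stand-ins for the table above. Keep only the classifiers that satisfy the running-time constraint, then pick the one with the highest accuracy among them.

# Hypothetical (accuracy, running time in ms) measurements for three classifiers
classifiers = {"A": (0.90, 80), "B": (0.92, 95), "C": (0.95, 1500)}

# Satisficing metric: running time must be <= 100 ms
feasible = {name: (acc, ms) for name, (acc, ms) in classifiers.items() if ms <= 100}

# Optimizing metric: among the feasible ones, maximize accuracy
best = max(feasible, key=lambda name: feasible[name][0])
print("Best choice:", best)   # -> B under these made-up numbers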

Dev and test set should come from the same distribution.
• For small datasets, say ≤ 500,000 examples:
  ○ 70% training, 30% dev, or
  ○ 60% training, 20% dev, 20% test
• But for very large datasets, say 1,000,000 examples:
  ○ 98% train, 1% dev, 1% test
Set your test set to be big enough to give high confidence on the overall performance of your system.
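A minimal sketch of the 98/1/1 split, using placeholder indices in place of real examples: shuffle once, then carve out train, dev, and test so that dev and test come from the same shuffled pool.

import random

n_examples = 1_000_000
indices = list(range(n_examples))
random.seed(0)
random.shuffle(indices)

# 98% train, 1% dev, 1% test for a very large dataset
n_train = int(0.98 * n_examples)
n_dev = int(0.01 * n_examples)

train_idx = indices[:n_train]
dev_idx = indices[n_train:n_train + n_dev]
test_idx = indices[n_train + n_dev:]
print(len(train_idx), len(dev_idx), len(test_idx))  # 980000 10000 10000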

Bayes' optimal error:

Maximum possible accuracy, minimum possible error. No system can surpass this performance; it is even better than human performance.

Avoidable bias = Err(classifier on training data) - Err(Bayes' classifier)

For example, suppose the Bayes' error / human error = 7.5%
and the training error of our classifier is 8%.
The avoidable bias is only (8 - 7.5) = 0.5%, because we can't get below 7.5% anyway.

Improving your model performance:

Avoidable bias / Variance
Look at the differences between:
1. Human Level Error
2. Training Error
3. Dev Error
This will give you the avoidable bias / variance trade-off, and you can improve accordingly (a short sketch of this diagnosis follows at the end of this section).
For avoiding avoidable bias: train a bigger model, train longer, use better optimization algorithms, change the NN architecture, perform a hyperparameter search.
For avoiding variance: get more data, perform regularization, change the NN architecture, do a hyperparameter search.
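A minimal sketch of that diagnosis, assuming the three error numbers are already measured (the example values below are made up): compare human-level, training, and dev error to decide whether avoidable bias or variance deserves attention first.

def diagnose(human_err, train_err, dev_err):
    # Avoidable bias: gap between training error and human-level (proxy for Bayes') error
    avoidable_bias = train_err - human_err
    # Variance: gap between dev error and training error
    variance = dev_err - train_err
    if avoidable_bias >= variance:
        focus = "avoidable bias (bigger model, train longer, better optimization, ...)"
    else:
        focus = "variance (more data, regularization, ...)"
    return avoidable_bias, variance, focus

# Made-up example: human-level 7.5%, training 8%, dev 10%
print(diagnose(0.075, 0.08, 0.10))
# avoidable bias ~0.5%, variance ~2% -> focus on variance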

Error Analysis
Sometimes, through manual error analysis, we get a lot of insights on what to do next.
• Whenever starting a project on which a lot of literature is not available, just do a quick and dirty implementation to get some idea about bias/variance and errors. Then start building upon that system.
• If you are building upon some problem on which a lot of literature is already available, then you can build a more complex system right from the start.
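One common way to do this manual analysis (the error categories below are only an assumed example, not from the notes) is to hand-label a batch of misclassified dev examples with the reason each one was wrong, then count which reasons dominate.

from collections import Counter

# Hypothetical labels assigned by hand while looking at misclassified dev examples
error_reasons = [
    "dog confused as cat", "blurry image", "blurry image", "mislabeled",
    "dog confused as cat", "blurry image", "great cat (lion/panther)", "blurry image",
]

counts = Counter(error_reasons)
total = len(error_reasons)
for reason, n in counts.most_common():
    print(f"{reason}: {n}/{total} = {100 * n / total:.0f}% of errors")
# Fixing the largest category gives the biggest possible error reduction.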

Training and Testing on different distributions

If we have 200k examples of high quality cat images,
and we have 10k examples of low quality images,
and we want to build a classifier for low quality images, what can we do?
1. 1st option: Take all the available images, randomly shuffle them, and divide them into Train, Dev & Test sets.
2. 2nd option: Take all 200k high quality images and 5k of the low quality images into the Training set. Take 2.5k low quality images into the Dev set & the remaining 2.5k low quality images into the Test set.
The 2nd option works better, because the dev and test sets then reflect the distribution we actually care about.
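A minimal sketch of the 2nd option, with two index lists standing in for the high and low quality images (names and sizes follow the example above):

import random

high_quality = [("hq", i) for i in range(200_000)]   # e.g. web-crawled, high quality
low_quality = [("lq", i) for i in range(10_000)]     # e.g. from the target application

random.seed(0)
random.shuffle(low_quality)

# Option 2: all high quality + half of the low quality images go to training;
# dev and test contain ONLY the low quality distribution we care about.
train = high_quality + low_quality[:5_000]
dev = low_quality[5_000:7_500]
test = low_quality[7_500:]
print(len(train), len(dev), len(test))  # 205000 2500 2500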

How to find whether your error is because of a difference between the Training data & Dev data, or because of High Variance?

Tr err = 9%
Dev err = 15%
Is it high variance? Or is it because of a high difference between the Train data and the Dev data?

To find this out, pull out some data from the training set (and remove it from the training set before training). We will call this data the train-dev set: it comes from the same distribution as the training data, but, like the dev set, it is not used for training.
Train the ML model on the remaining training data alone.
Find out the train-dev set error and the dev set error.
Consider the following cases:

                  Case 1          Case 2          Case 3      Case 4
Human Level err   ~0%             ~0%             ~0%         ~0%
Tr err            1%              1%              10%         10%
Tr-dev err        1.5%            9%              11%         11%
Dev err           10%             10%             12%         20%
Comments          Data mismatch   High variance   High bias   High bias + data mismatch
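A rough sketch of reading those gaps off the four error numbers; the 2% threshold is an arbitrary assumption, chosen only so the code reproduces the table's diagnoses.

def diagnose_gaps(human_err, train_err, train_dev_err, dev_err, gap=0.02):
    problems = []
    if train_err - human_err > gap:
        problems.append("high (avoidable) bias")
    if train_dev_err - train_err > gap:
        problems.append("high variance")
    if dev_err - train_dev_err > gap:
        problems.append("data mismatch between train and dev distributions")
    return problems or ["looks fine"]

# The four cases from the table above (errors as fractions)
print(diagnose_gaps(0.0, 0.01, 0.015, 0.10))  # -> data mismatch
print(diagnose_gaps(0.0, 0.01, 0.09, 0.10))   # -> high variance
print(diagnose_gaps(0.0, 0.10, 0.11, 0.12))   # -> high (avoidable) bias
print(diagnose_gaps(0.0, 0.10, 0.11, 0.20))   # -> high bias + data mismatch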

What to do in case of a high difference between the training & dev set data (data mismatch)?
1. Collect more training data similar to the dev set data.
2. Artificial data synthesis (for example, data augmentation such as appending car background noise to pure speech for the rear-view mirror problem; a sketch follows below). Just keep in mind that you should not synthesize data from only a tiny subset of the space of possible examples, or the model may overfit to that subset.
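A rough sketch of that kind of synthesis, assuming the clean speech and the car noise are already available as NumPy arrays (the signals below are random placeholders, and the mixing level is made up):

import numpy as np

def synthesize_noisy_speech(clean, noise, noise_scale=0.3, rng=None):
    """Mix background noise into clean speech at a chosen level."""
    rng = rng or np.random.default_rng(0)
    # Pick a random offset so we don't reuse the exact same noise segment every time,
    # which would effectively synthesize from a tiny subset of possible examples.
    start = rng.integers(0, len(noise) - len(clean))
    segment = noise[start:start + len(clean)]
    return clean + noise_scale * segment

# Placeholder signals standing in for real audio (1 s of speech, 60 s of car noise at 16 kHz)
rng = np.random.default_rng(0)
clean_speech = rng.standard_normal(16_000)
car_noise = rng.standard_normal(16_000 * 60)
noisy = synthesize_noisy_speech(clean_speech, car_noise, rng=rng)
print(noisy.shape)  # (16000,)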

Transfer Learning:
Learning to recognize cats, and then using that network to read X-ray scans.
Using a model trained on some data (pre-training), and then training that model further on some other data (fine-tuning).
Particularly useful when you have a relatively small dataset for the new task.
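A minimal PyTorch-style sketch of the idea; the network here is randomly initialized and only stands in for a genuinely pre-trained model. Freeze the shared layers and re-train a new output head on the small new dataset.

import torch
import torch.nn as nn

# Stand-in for a network pre-trained on a large source task (e.g. cats)
pretrained = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),            # original output head for the source task
)

# Transfer: freeze the shared layers, replace the head for the new task (e.g. 2 classes)
for param in pretrained.parameters():
    param.requires_grad = False
pretrained[-1] = nn.Linear(64, 2)   # new head, trainable by default

optimizer = torch.optim.Adam(
    (p for p in pretrained.parameters() if p.requires_grad), lr=1e-3
)
loss_fn = nn.CrossEntropyLoss()

# One fine-tuning step on a made-up batch from the new (small) dataset
x = torch.randn(16, 1, 28, 28)
y = torch.randint(0, 2, (16,))
loss = loss_fn(pretrained(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))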

Multi-task learning:
Training a single network to perform multiple tasks at the same time, so that what is learned for each task helps the final model on all of them.
Example: training one model to detect cars, stop signs, pedestrians, etc., and using all of these tasks together to train your final model.
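A minimal PyTorch-style sketch under the usual multi-task setup (one shared network with one binary output per task; the three task names just mirror the example above): a single forward pass produces all task predictions and one combined loss trains the shared model.

import torch
import torch.nn as nn

tasks = ["car", "stop_sign", "pedestrian"]

# One shared network with one logit output per task
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32, 64), nn.ReLU(),
    nn.Linear(64, len(tasks)),      # one logit per task
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Made-up batch: each image can contain any subset of the objects
x = torch.randn(8, 1, 32, 32)
y = torch.randint(0, 2, (8, len(tasks))).float()   # multi-label targets

logits = model(x)                   # shape (8, 3): all tasks from one shared model
loss = loss_fn(logits, y)           # combined loss over all tasks
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))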

End-to-end deep learning


For speech recognition, the traditional pipeline requires:
Audio --> Features --> Phonemes --> Words --> Transcript
What end-to-end DL does is:
Audio --------------------------------------------------> Transcript
It requires a larger and deeper NN, and a lot more data.
