Lecture 3
• Multi-class classification
• Multi-output classification
Model Selection and Training
• Train a classification model for detecting ‘5’. The target output for a training instance corresponding to an image of ‘5’ is +1; otherwise the target output is 0
• Perform cross validation as in the regression problem and try out multiple classification models to achieve acceptable performance (see the sketch below)
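In scikit-learn, that workflow might look like the following sketch; the MNIST source (fetch_openml), the SGDClassifier choice, and the 60,000-image training split are illustrative assumptions, not prescribed by the slide.

```python
from sklearn.datasets import fetch_openml
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

# Load MNIST: 70,000 images, each flattened to 784 pixel features.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, y_train = X[:60000], y[:60000]

# Binary target: positive for images of '5', negative for everything else.
y_train_5 = (y_train == "5")

# Cross-validate one candidate model; swap in other classifiers and compare.
sgd_clf = SGDClassifier(random_state=42)
print(cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy"))
```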
Multiclass Classification
                      PREDICTED CLASS
                      Class=Yes    Class=No
ACTUAL    Class=Yes   a (TP)       b (FN)
CLASS     Class=No    c (FP)       d (TN)

a: TP (true positive)     b: FN (false negative)
c: FP (false positive)    d: TN (true negative)
Metrics for Performance Evaluation…
Accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + TN + FP + FN)
Limitation of Accuracy
• Accuracy can be misleading on imbalanced data: e.g., with 9,990 examples of class 0 and 10 of class 1, a model that predicts everything as class 0 reaches 99.9% accuracy while detecting no class-1 instance
Precision (p) = a / (a + c)

Recall (r) = a / (a + b)

F-measure (F) = 2rp / (r + p) = 2a / (2a + b + c)
Weighted Accuracy = (w1·a + w4·d) / (w1·a + w2·b + w3·c + w4·d)
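As a quick sanity check of these formulas, here is a small Python helper; the example counts (a = 40, b = 10, c = 5, d = 45) are made up purely for illustration.

```python
# Compute the metrics above from the four confusion-matrix counts.
def classification_metrics(tp, fn, fp, tn):
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)          # p = a / (a + c)
    recall    = tp / (tp + fn)          # r = a / (a + b)
    f_measure = 2 * recall * precision / (recall + precision)
    return accuracy, precision, recall, f_measure

# Hypothetical counts: a = TP = 40, b = FN = 10, c = FP = 5, d = TN = 45.
print(classification_metrics(tp=40, fn=10, fp=5, tn=45))
```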
Methods for Performance Evaluation
How to obtain a reliable estimate of performance?
• Performance of a model may depend on other factors besides the learning
algorithm:
• Class distribution
• Cost of misclassification
• Size of training and test sets
Learning Curve
• Shows how accuracy changes with varying training-set size
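In scikit-learn this can be sketched with the learning_curve helper; the model choice and the synthetic dataset below are placeholder assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=500, random_state=0)  # toy stand-in data

# Evaluate the model at increasing training-set sizes, with 5-fold CV each time.
sizes, train_scores, test_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
)
print(sizes, test_scores.mean(axis=1))  # accuracy typically rises, then plateaus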
• Holdout
• Reserve 2/3 for training and 1/3 for testing
• Random subsampling: repeated holdout
• Cross validation (see the sketch after this list)
• Partition data into k disjoint subsets
• k-fold: train on k−1 partitions, test on the remaining one
• Leave-one-out: k = n
• Stratified sampling
• Oversampling vs. undersampling
• Bootstrap
• Sampling with replacement
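A sketch of how these schemes map onto scikit-learn utilities, using a toy random dataset as a stand-in.

```python
import numpy as np
from sklearn.model_selection import train_test_split, KFold, StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))        # toy features
y = rng.integers(0, 2, size=100)     # toy binary labels

# Holdout: reserve 2/3 for training, 1/3 for testing.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=1/3, random_state=42)

# k-fold cross validation: k disjoint partitions, train on k-1, test on one.
kf = KFold(n_splits=5, shuffle=True, random_state=42)

# Stratified k-fold: every fold preserves the class distribution.
skf = StratifiedKFold(n_splits=5)
for train_idx, test_idx in skf.split(X, y):
    pass  # fit on X[train_idx], evaluate on X[test_idx]

# Bootstrap: sample n instances with replacement.
boot_idx = rng.integers(0, len(X), size=len(X))
```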
ROC (Receiver Operating Characteristic)
• Example (from the accompanying score-distribution figure): at threshold t, half of the positive class is above the threshold (TP = 0.5, FN = 0.5) and 12% of the negative class is above it (FP = 0.12, TN = 0.88)
ROC Curve
(TPR, FPR):
• (0,0): declare everything to be negative class
• (1,1): declare everything to be positive class
• (1,0): ideal
• Diagonal line: random guessing
• Below diagonal line: prediction is opposite of the true class
Using ROC for Model Comparison
• No model consistently outperforms the other
• M1 is better for small FPR
• M2 is better for large FPR
ROC Curve:

Ten instances sorted by classifier score P(+); at each threshold, an instance is predicted positive when its score >= the threshold.

Class          +     -     +     -     -     -     +     -     +     +
Score P(+)     0.25  0.43  0.53  0.76  0.85  0.85  0.85  0.87  0.93  0.95

Threshold >=   0.25  0.43  0.53  0.76  0.85  0.85  0.85  0.87  0.93  0.95  1.00
TP             5     4     4     3     3     3     3     2     2     1     0
FP             5     5     4     4     3     2     1     1     0     0     0
TN             0     0     1     1     2     3     4     4     5     5     5
FN             0     1     1     2     2     2     2     3     3     4     5
TPR = TP/5     1     0.8   0.8   0.6   0.6   0.6   0.6   0.4   0.4   0.2   0
FPR = FP/5     1     1     0.8   0.8   0.6   0.4   0.2   0.2   0     0     0
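The same threshold sweep can be checked with scikit-learn; the label and score arrays below are transcribed from the table.

```python
from sklearn.metrics import roc_curve

# The ten instances from the table: true classes and classifier scores.
y_true  = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
y_score = [0.25, 0.43, 0.53, 0.76, 0.85, 0.85, 0.85, 0.87, 0.93, 0.95]

# Sweep the threshold and recover the (FPR, TPR) pairs on the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold >= {th:.2f}: TPR = {t:.1f}, FPR = {f:.1f}")
```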
Hyperparameters
• Gradient descent (see the sketch after this list)
• e.g., learning rate, how long to run
• Mini-batch
• Batch size
• Regularization constant
• Many others
• Will be discussed in upcoming sessions
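A minimal numpy sketch of mini-batch gradient descent for L2-regularized linear regression, only to show where each of these knobs enters the training loop; the function name and all default values are illustrative assumptions.

```python
import numpy as np

def minibatch_gd(X, y, lr=0.01, n_epochs=50, batch_size=32, reg=0.1):
    """Mini-batch gradient descent for L2-regularized linear regression."""
    rng = np.random.default_rng(42)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):                        # how long to run
        idx = rng.permutation(len(X))                # reshuffle each epoch
        for start in range(0, len(X), batch_size):   # one mini-batch at a time
            b = idx[start:start + batch_size]
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b) + reg * w
            w -= lr * grad                           # learning-rate step
    return w
```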
Hyperparameter Optimization
• Manual search: just fiddle with the parameters until you get the results you want
• Random search: just like grid search, but with randomly chosen points instead of points on a grid
• In scikit-learn: RandomizedSearchCV
• Problem: with a finite sample, random search is not guaranteed to get anywhere near the optimal parameters
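A sketch of random search with scikit-learn's RandomizedSearchCV; the estimator and the parameter range are illustrative assumptions.

```python
from scipy.stats import loguniform
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV

# Sample 20 random points from a log-uniform range of the regularization
# constant, instead of evaluating a fixed grid.
param_dist = {"alpha": loguniform(1e-6, 1e-1)}
search = RandomizedSearchCV(
    SGDClassifier(random_state=42),
    param_dist, n_iter=20, cv=3, random_state=42,
)
# search.fit(X_train, y_train_5); search.best_params_
```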
An Alternative: Bayesian Optimization
• Main benefit: choose the hyperparameters to test not at random, but in a way that
gives the most information about the model
• This lets it learn faster than grid search
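One possible realization uses the third-party scikit-optimize package (`pip install scikit-optimize`), whose BayesSearchCV mirrors the scikit-learn search API; the estimator and search space below are assumptions.

```python
from skopt import BayesSearchCV
from skopt.space import Real
from sklearn.linear_model import SGDClassifier

# Each iteration fits a surrogate model to past results and picks the
# most informative hyperparameter point to evaluate next.
opt = BayesSearchCV(
    SGDClassifier(random_state=42),
    {"alpha": Real(1e-6, 1e-1, prior="log-uniform")},
    n_iter=20, cv=3,
)
# opt.fit(X_train, y_train_5); opt.best_params_
```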
Effect of Bayesian Optimization
• Partition part of the available data to create a validation dataset that we don’t use for training.
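A sketch of such a three-way split; the 60/20/20 proportions and the toy data are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))      # toy stand-in data
y = rng.integers(0, 2, size=100)

# First carve off a final test set, then split a validation set from the rest.
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25, random_state=42)
# Tune hyperparameters against (X_val, y_val); touch (X_test, y_test) only once.
```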
• What is MLOps
• DevOps vs MLOps
• Level 0 MLOps
• Continuous Training
• Level 1 MLOps
• Continuous Integration, Delivery
• Frameworks
What is MLOps?
Apply DevOps principles to ML systems
• An engineering culture and practice that aims at unifying ML system
development (Dev) and ML system operation (Ops).
Level 1: ML Pipeline Automation
• Perform continuous training (CT) by automating the ML pipeline
• Achieve continuous delivery of the model prediction service
• Add automated data and model validation steps to the pipeline
• Need pipeline triggers and metadata management
Data and Model Validation
• Model validation: required after retraining the model on new data. Evaluate and validate the model before promoting it to production. This offline model validation step consists of:
• Producing evaluation metrics with the trained model on test data to assess model quality
• Comparing the new model’s evaluation metrics against the current production model, a baseline model, or other models that satisfy business requirements
• Ensuring consistency of model performance on various data segments
• Testing the model for deployment, including infrastructure compatibility and consistency with the prediction service API
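A hypothetical sketch of the promotion gate described above; every name here (candidate, production, the F1 criterion, margin) is an assumed placeholder, not a prescribed MLOps API.

```python
from sklearn.metrics import f1_score

def validate_for_promotion(candidate, production, X_test, y_test, margin=0.0):
    """Promote the retrained model only if it beats production on held-out data."""
    cand_f1 = f1_score(y_test, candidate.predict(X_test))
    prod_f1 = f1_score(y_test, production.predict(X_test))
    # Promote only when the candidate is at least as good as production.
    return cand_f1 >= prod_f1 + margin
```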
Level 2: CI/CD Pipeline Automation
Stages of CI/CD Automation Pipeline
• The pipeline and its components are built, tested, and packaged when new code is committed or pushed to the source code repository.
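For instance, CI might run unit tests like the following pytest-style check of a single pipeline component on every commit; the component under test and the tolerances are assumptions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def test_scaler_outputs_zero_mean_unit_variance():
    # One isolated pipeline component: the feature scaler.
    X = np.random.default_rng(0).normal(5.0, 3.0, size=(100, 4))
    Xs = StandardScaler().fit_transform(X)
    assert np.allclose(Xs.mean(axis=0), 0.0, atol=1e-8)
    assert np.allclose(Xs.std(axis=0), 1.0, atol=1e-8)
```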