
Model Selection and Training

Dr. Sugata Ghosal


CSIS Off-Campus Faculty
BITS Pilani
In This Segment

• Model Selection and Training


• For regression problem
• For classification problem

• Multi-class classification
• Multi-output classification
Model Selection and Training

• Select based on training data


• If a prediction label/output is available, use a regression or classification model
• Regression if the output is real-valued
• Classification if the output is discrete (binary/integer)
• Else, an unsupervised model is used.
• For the house price prediction problem, use a regression model, since median
house prices are available along with the training data (predictors)
• Example: Linear Regression

• A better model may be needed to improve prediction accuracy


Regression Model Selection and Training
• Decision-tree-based regression produces low error on the training data

• but its cross-validation error is not satisfactory

• Better accuracy can be obtained with a random-forest-based regressor (see the sketch below)
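A minimal sketch of this model comparison with scikit-learn and 5-fold cross-validation. The California housing dataset stands in for the course's house-price data, and the specific models and settings are illustrative assumptions, not the lecture's exact setup.

import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = fetch_california_housing(return_X_y=True)

models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(random_state=42),
    "forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

for name, model in models.items():
    # cross_val_score returns negative MSE, so flip the sign and take the root
    scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=5)
    rmse = np.sqrt(-scores)
    print(f"{name}: mean RMSE = {rmse.mean():.3f} (std {rmse.std():.3f})")

Comparing the forest's cross-validation RMSE with the tree's illustrates the improvement the slide refers to.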
Classification Model
MNIST Dataset Classification
• A set of 70,000 small images of handwritten
digits
• Each image 28x28 pixel with intensity 0 (black) –
255 (white)
• Input data represented as 70000 x 784 matrix
• Each image is labeled with the digit it represents.
Classification Model Training
Detect a ‘5’
• Segment the dataset into 60,000 training images and 10,000 test images

• Shuffle the training dataset

• Train a classification model for detecting ‘5’. The target output for a training
instance corresponding to an image of ‘5’ is 1; otherwise the target output is 0

• Perform cross-validation as in the regression problem and try out multiple
classification models to achieve acceptable performance (a sketch follows)
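A hedged sketch of this workflow with scikit-learn: the choice of SGDClassifier below is illustrative (the slides do not prescribe a specific classifier); everything else follows the split, shuffle, and binarize steps above.

import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

mnist = fetch_openml("mnist_784", version=1, as_frame=False)
X, y = mnist.data, mnist.target.astype(np.uint8)

# 60,000 training images / 10,000 test images, then shuffle the training set
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
shuffle_idx = np.random.permutation(60000)
X_train, y_train = X_train[shuffle_idx], y_train[shuffle_idx]

# Binary target: 1 for images of '5', 0 otherwise
y_train_5 = (y_train == 5).astype(int)

clf = SGDClassifier(random_state=42)
print(cross_val_score(clf, X_train, y_train_5, cv=3, scoring="accuracy"))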
Multiclass Classification

• Multiclass classifiers (aka multinomial classifiers) can distinguish between more


than two classes.
• Some algorithms (such as Random Forest classifiers or naive Bayes classifiers)
are capable of handling multiple classes directly.
• Many (such as Support Vector Machine classifiers or Linear classifiers) are strictly
binary
• One-versus-all (OvA) or One-versus-rest strategy using multiple binary classifiers.
• e.g., for MNIST classification, train 10 binary classifiers, one for each digit (a 0-detector,
a 1-detector, a 2-detector, and so on).
• get the decision score from each classifier for that image and select the class whose
classifier outputs the highest score.
• One-versus-one (OvO) strategy
• train a binary classifier for every pair of digits: one to distinguish 0s and 1s, another to
distinguish 0s and 2s, another for 1s and 2s, and so on.
• If there are N classes, you need to train N × (N – 1) / 2 classifiers.
• Run an image through all 45 classifiers and see which class wins the most duels.
• Main advantage of OvO is each classifier only needs to be trained on the part of
the training set for the two classes that it must distinguish
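A hedged sketch of both strategies using scikit-learn's OneVsRestClassifier and OneVsOneClassifier wrappers around a binary LinearSVC; the smaller load_digits dataset and the choice of base classifier are illustrative assumptions.

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)          # 10 digit classes

ovr = OneVsRestClassifier(LinearSVC(max_iter=10000))  # trains 10 binary classifiers
ovo = OneVsOneClassifier(LinearSVC(max_iter=10000))   # trains 10 * 9 / 2 = 45 classifiers

print("OvR accuracy:", cross_val_score(ovr, X, y, cv=3).mean())
print("OvO accuracy:", cross_val_score(ovo, X, y, cv=3).mean())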
Multi Label / Output Classification

• In multilabel classification, the classifier should output multiple binary labels for
each instance.
• e.g., if the classifier has been trained to recognize three faces, A, B, and C, then
when it is shown a picture of A and C, it should output [1, 0, 1]

• In multioutput-multiclass or simply multioutput classification each label can


be multiclass, i.e., it can have more than two possible values.
• e.g., in a system to remove noise from images, the input is a noisy digit image and the
output is a clean digit image, represented as an array of pixel intensities
• the classifier’s output is multilabel: one label per pixel
• each label can have multiple values (pixel intensity ranges from 0 to 255)
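A minimal multilabel sketch with scikit-learn: the two labels below ("large digit" and "odd digit") are invented purely for illustration, and k-nearest neighbours is used because it supports multilabel targets natively.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)

# Two binary labels per instance: [digit >= 7, digit is odd]
y_multilabel = np.c_[y >= 7, y % 2 == 1]

knn = KNeighborsClassifier()
knn.fit(X, y_multilabel)
print(knn.predict(X[:1]))   # e.g. [[False False]] for the first image (a '0')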
Thank You!
In our next segment: Model Evaluation
Model Evaluation

• Metrics for Performance Evaluation


• How to evaluate the performance of a model?

• Methods for Performance Evaluation


Metrics for Performance Evaluation
Focus on the predictive capability of a model
• Confusion Matrix

                        PREDICTED CLASS
                        Class=Yes    Class=No
ACTUAL     Class=Yes    a (TP)       b (FN)
CLASS      Class=No     c (FP)       d (TN)

a: TP (true positive), b: FN (false negative), c: FP (false positive), d: TN (true negative)
Metrics for Performance Evaluation…

                        PREDICTED CLASS
                        Class=Yes    Class=No
ACTUAL     Class=Yes    a (TP)       b (FN)
CLASS      Class=No     c (FP)       d (TN)

• Most widely-used metric:

  Accuracy = (a + d) / (a + b + c + d) = (TP + TN) / (TP + TN + FP + FN)
Limitation of Accuracy

• Consider a 2-class problem


• Number of Class 0 examples = 9990
• Number of Class 1 examples = 10

• If model predicts everything to be class 0, accuracy is 9990/10000 = 99.9 %


• Accuracy is misleading because the model does not detect any class 1 example
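A small worked example of this pitfall, assuming scikit-learn: a classifier that always predicts class 0 on the 9990/10 split above scores 99.9% accuracy while detecting nothing.

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

y_true = np.array([0] * 9990 + [1] * 10)
X = np.zeros((10000, 1))                      # features are irrelevant here

clf = DummyClassifier(strategy="constant", constant=0).fit(X, y_true)
y_pred = clf.predict(X)

print(accuracy_score(y_true, y_pred))   # 0.999
print(recall_score(y_true, y_pred))     # 0.0 -- no class 1 example is detected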
Measures for Imbalanced Classes

Precision (p) = a / (a + c)

Recall (r) = a / (a + b)

F-measure (F) = 2rp / (r + p) = 2a / (2a + b + c)

Weighted Accuracy = (w1·a + w4·d) / (w1·a + w2·b + w3·c + w4·d)
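A short sketch that computes these measures from a confusion matrix, assuming scikit-learn; the toy labels are made up for illustration.

from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()   # d, c, b, a

p = tp / (tp + fp)          # precision = a / (a + c)
r = tp / (tp + fn)          # recall    = a / (a + b)
f = 2 * r * p / (r + p)     # F-measure = 2a / (2a + b + c)

print(p, r, f)
print(precision_score(y_true, y_pred),
      recall_score(y_true, y_pred),
      f1_score(y_true, y_pred))   # matches the hand-computed values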
Methods for Performance Evaluation
How to obtain a reliable estimate of performance?
• Performance of a model may depend on other factors besides the learning
algorithm:
• Class distribution
• Cost of misclassification
• Size of training and test sets
Learning Curve

 Learning curve shows how accuracy changes with varying sample size
 Requires a sampling schedule for creating the learning curve:
 Arithmetic sampling (Langley, et al.)
 Geometric sampling (Provost et al.)
 Effect of small sample size:
- Bias in the estimate
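A minimal learning-curve sketch with scikit-learn's learning_curve utility, using a geometric sampling schedule; the dataset and estimator are illustrative choices.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.naive_bayes import GaussianNB

X, y = load_digits(return_X_y=True)

train_sizes, train_scores, val_scores = learning_curve(
    GaussianNB(), X, y, cv=5,
    train_sizes=np.geomspace(0.1, 1.0, num=5))   # geometric sampling schedule

for n, score in zip(train_sizes, val_scores.mean(axis=1)):
    print(f"{n:5d} training samples -> CV accuracy {score:.3f}")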
Methods of Estimation

• Holdout
• Reserve 2/3 for training and 1/3 for testing
• Random subsampling
• Repeated holdout
• Cross validation
• Partition data into k disjoint subsets
• k-fold: train on k-1 partitions, test on the remaining one
• Leave-one-out: k=n
• Stratified sampling
• oversampling vs undersampling
• Bootstrap
• Sampling with replacement
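A hedged sketch of three of these estimation methods (holdout, k-fold cross-validation, bootstrap) with scikit-learn; the dataset and model are illustrative.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.utils import resample

X, y = load_digits(return_X_y=True)
model = LogisticRegression(max_iter=5000)

# Holdout: reserve 2/3 for training and 1/3 for testing
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=1/3, random_state=0)
print("holdout:", model.fit(X_tr, y_tr).score(X_te, y_te))

# k-fold cross-validation (k = 10)
print("10-fold CV:", cross_val_score(model, X, y, cv=10).mean())

# Bootstrap: sample n instances with replacement, test on the out-of-bag rest
idx = resample(np.arange(len(X)), replace=True, random_state=0)
oob = np.setdiff1d(np.arange(len(X)), idx)
print("bootstrap:", model.fit(X[idx], y[idx]).score(X[oob], y[oob]))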
ROC (Receiver Operating Characteristic)

• Developed in 1950s for signal detection theory to analyze noisy signals


• Characterize the trade-off between positive hits and false alarms
• ROC curve plots the TP rate (on the y-axis) against the FP rate (on the x-axis)
• Performance of each classifier is represented as a point on the ROC curve
• changing the algorithm’s threshold, the sample distribution, or the cost matrix changes
the location of the point
ROC Curve

• 1-dimensional data set containing 2 classes (positive and


negative)
• any point located at x > t is classified as positive

At threshold t:
TPR = 0.5, FNR = 0.5, FPR = 0.12, TNR = 0.88
ROC Curve

(TPR, FPR):
• (0,0): declare everything
to be negative class
• (1,1): declare everything
to be positive class
• (1,0): ideal

• Diagonal line:
• Random guessing
• Below diagonal line:
• prediction is opposite of the
true class
Using ROC for Model Comparison

 No model consistently outperforms the other
 M1 is better for small FPR
 M2 is better for large FPR

 Area Under the ROC curve


 Ideal:
 Area = 1
 Random guess:
 Area = 0.5
How to Construct an ROC curve

• Use a classifier that produces a posterior probability P(+|A) for each test instance A
• Sort the instances according to P(+|A) in decreasing order
• Apply a threshold at each unique value of P(+|A)
• Count the number of TP, FP, TN, FN at each threshold
• TP rate, TPR = TP / (TP + FN)
• FP rate, FPR = FP / (FP + TN)

Instance   P(+|A)   True Class
    1       0.95        +
    2       0.93        +
    3       0.87        -
    4       0.85        -
    5       0.85        -
    6       0.85        +
    7       0.76        -
    8       0.53        +
    9       0.43        -
   10       0.25        +
How to construct an ROC curve

ROC Curve:

Class          +     -     +     -     -     -     +     -     +     +
Threshold >=  0.25  0.43  0.53  0.76  0.85  0.85  0.85  0.87  0.93  0.95  1.00
TP             5     4     4     3     3     3     3     2     2     1     0
FP             5     5     4     4     3     2     1     1     0     0     0
TN             0     0     1     1     2     3     4     4     5     5     5
FN             0     1     1     2     2     2     2     3     3     4     5
TPR            1     0.8   0.8   0.6   0.6   0.6   0.6   0.4   0.4   0.2   0
FPR            1     1     0.8   0.8   0.6   0.4   0.2   0.2   0     0     0
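A hedged sketch that reproduces this computation with scikit-learn's roc_curve (drop_intermediate=False keeps every threshold); the scores and labels are copied from the example above.

from sklearn.metrics import roc_auc_score, roc_curve

scores = [0.95, 0.93, 0.87, 0.85, 0.85, 0.85, 0.76, 0.53, 0.43, 0.25]
labels = [1, 1, 0, 0, 0, 1, 0, 1, 0, 1]     # '+' = 1, '-' = 0

fpr, tpr, thresholds = roc_curve(labels, scores, drop_intermediate=False)
print("thresholds:", thresholds)
print("TPR:", tpr)
print("FPR:", fpr)
print("AUC:", roc_auc_score(labels, scores))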


Thank You!
In our next segment: Hyperparameter Optimization
Hyperparameters
Machine Learning systems are configured by many parameters that are not learned from the data

• Gradient Descent
• e.g., Learning rate, how long to run
• Mini-batch
• Batch size
• Regularization constant
• Many Others
• will be discussed in upcoming sessions
Hyperparameter Optimization

• Also called metaparameter optimization


• Also called tuning

• How to find best values of hyperparameters?


Tuning By Hand

• Just fiddle with the parameters until you get the results you want

• Probably the most common type of hyperparameter optimization

• Upsides: the results are generally pretty good…

• Downsides: lots of effort, and no theoretical guarantees


Grid Search

• Define some grid of parameters you want to try


• Try all the parameter values in the grid
• By running the whole system for each setting of parameters
• Then choose the setting with the best result
• Essentially a brute force method
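A minimal grid-search sketch with scikit-learn's GridSearchCV; the model and the grid values are illustrative assumptions.

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_digits(return_X_y=True)

param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [None, 10, 20],
}

search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)    # brute force: trains 3 x 3 x 5 = 45 models
print(search.best_params_, search.best_score_)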
Downsides of Grid Search

• As the number of parameters increases, the cost of grid search increases


exponentially!
• Need some way to choose the grid properly
• Sometimes this can be as hard as the original hyperparameter
optimization problem
• Can’t take advantage of any insight you have about the system!
Making Grid Search Fast

• Early stopping to the rescue


• Run all the grid points for one epoch, then discard the half that performed worst, then
run for another epoch, discard half again, and continue (successive halving; see the sketch after this list).

• Can take advantage of parallelism


• Run all the different parameter settings independently on different servers in a
cluster.
• An embarrassingly parallel task.
• Downside: doesn’t reduce the energy cost.
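The "discard half each round" idea is what scikit-learn implements as successive halving; a hedged sketch with the (experimental) HalvingGridSearchCV follows, with an illustrative model and grid.

from sklearn.experimental import enable_halving_search_cv  # noqa: F401 (required import)
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import HalvingGridSearchCV

X, y = load_digits(return_X_y=True)
param_grid = {"max_depth": [3, 5, 10, None], "n_estimators": [25, 50, 100]}

search = HalvingGridSearchCV(
    RandomForestClassifier(random_state=0), param_grid,
    factor=2,               # keep the best half of the candidates each round
    resource="n_samples",   # give the survivors more training data each round
    cv=3, random_state=0)
search.fit(X, y)
print(search.best_params_)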
One Variant: Random Search

• This is just grid search, but with randomly chosen points instead of points
on a grid.
• scikit-learn implements this as RandomizedSearchCV

• This mitigates the curse of dimensionality


• Don’t need to increase the number of grid points exponentially as the number
of dimensions increases.

• Problem: with random search, not necessarily going to get anywhere near the
optimal parameters in a finite sample.
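A minimal random-search sketch (note the scikit-learn class is named RandomizedSearchCV); the sampling distributions and the model are illustrative.

from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = load_digits(return_X_y=True)

param_distributions = {
    "n_estimators": randint(20, 200),   # sampled at random, not enumerated on a grid
    "max_depth": randint(3, 30),
}

search = RandomizedSearchCV(RandomForestClassifier(random_state=0),
                            param_distributions, n_iter=20, cv=3, random_state=0)
search.fit(X, y)
print(search.best_params_, search.best_score_)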
An Alternative: Bayesian Optimization

• Statistical approach for minimizing noisy black-box functions.

• Idea: learn a statistical model of the function from hyperparameter values to


the loss function
• Then choose parameters to minimize the loss

• Main benefit: choose the hyperparameters to test not at random, but in a way that
gives the most information about the model
• This lets it learn faster than grid search
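A heavily hedged sketch of this idea using gp_minimize from scikit-optimize (an external package, not part of scikit-learn); the objective function and search space are illustrative only.

from skopt import gp_minimize
from skopt.space import Integer
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)

def objective(params):
    n_estimators, max_depth = params
    clf = RandomForestClassifier(n_estimators=n_estimators,
                                 max_depth=max_depth, random_state=0)
    return -cross_val_score(clf, X, y, cv=3).mean()   # minimize negative CV accuracy

space = [Integer(20, 200, name="n_estimators"),
         Integer(3, 30, name="max_depth")]

# A Gaussian-process surrogate model picks each next point to evaluate
result = gp_minimize(objective, space, n_calls=20, random_state=0)
print(result.x, -result.fun)   # best hyperparameters and best CV accuracy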
Effect of Bayesian Optimization

• Downside: it’s a pretty heavyweight method


• The updates are not as simple-to-implement as grid search

• Upside: empirically it has been demonstrated to get better results in fewer


experiments
• Compared with grid search and random search

• Pretty widely used method


• Lots of research opportunities here.
Cross-Validation

• Partition off part of the available data to create a validation dataset that we don’t
use for training.

• Then use that set to evaluate the hyperparameters.

• Typically, multiple rounds of cross-validation are performed using different


partitions
• Can get a very good sense of how good the hyperparameters are
• But at a significant computational cost!
Thank You!
In our next segment: Machine Learning Pipeline
Machine Learning Pipeline

Dr. Sugata Ghosal


[email protected]
In This Segment

• What is MLOps
• DevOps vs MLOps

• Level 0 MLOps: manual process

• Level 1 MLOps: Continuous Training

• Level 2 MLOps: Continuous Integration, Delivery

• Frameworks
What is MLOps?
Apply DevOps principles to ML systems
• An engineering culture and practice that aims at unifying ML system
development (Dev) and ML system operation (Ops).

• Automation and monitoring at all steps of ML system construction, including


integration, testing, releasing, deployment and infrastructure management.

• Given relevant training data for their use case, data scientists can implement and

train an ML model with good predictive performance on an offline validation
(holdout) dataset.

• However, the real challenge is building an integrated ML system and

continuously operating it in production.
Ecosystem of ML System Components
Only a small fraction of a real-world ML system is composed of ML code
DevOps Vs. MLOps

• DevOps for developing and operating large-scale software systems provides


benefits such as
• shortening the development cycles
• increasing deployment velocity, and
• dependable releases.
• Two key concepts
• Continuous Integration (CI)
• Continuous Delivery (CD)
• An ML system is a software system, so similar practices apply to reliably build
and operate at scale.
• However, ML systems differ from other software systems
• Team skills: focus on exploratory data analysis, model development, and
experimentation.
• Development: ML is experimental in nature.
• The challenge is tracking what worked and what did not, maintaining reproducibility, and
maximizing code reusability.
• Testing: Additional testing needed for data validation, trained model quality
evaluation, and model validation.
DevOps Vs. MLOps

• Deployment: deploying an ML system involves a multi-step pipeline to automatically
retrain and deploy the model.
• this adds complexity
• steps that data scientists perform manually before deployment to train and validate
new models must be automated.
• Production: ML models can have reduced performance due to constantly
evolving data profiles.
• Need to track summary statistics of the data and
• monitor the online performance of the model to send notifications or roll back
when values are suboptimal
• ML and other software systems are similar in CI of source control, unit /
integration testing, and CD of the software module / package.
• However, in ML,
• CI is also about testing and validating data, data schemas, and models.
• CD is a system (an ML training pipeline) that automatically deploys another
service (model prediction service).
• Continuous training (CT) is a new property, unique to ML systems, that is
concerned with automatically retraining the model in production and
serving the models.
MLOps Level 0: Manual ML Steps
• Manual, script-driven, and interactive process.
• Disconnection between ML and operations, possibly leading to training-
serving skew
• Infrequent release iterations. No CI, CD, active performance monitoring
• Deploy trained Model as a prediction service
• Deployment process is concerned only with deploying the trained model as a
prediction service, e.g., a microservice with a REST API
MLOps Level 1

• Perform continuous training (CT) by automating the ML pipeline

• Achieves continuous delivery of the model prediction service

• Adds automated data and model validation steps to the pipeline

• Needs pipeline triggers and metadata management
Data and Model Validation

• Data validation: Required prior to model training to decide whether to retrain


the model or stop the execution of the pipeline based on following
• Data values skews: significant changes in the statistical properties of data,
triggering retraining
• Data schema skews: downstream pipeline steps, including data processing and
model training, receive data that doesn't comply with the expected schema.
• stop the pipeline to release a fix or an update that handles these changes in the schema.
• Schema skews include receiving unexpected features, receiving features with
unexpected values, or not receiving all the expected features

• Model validation: Required after retraining the model with the new data.
Evaluate and validate the model before promoting to production. This offline
model validation step consists of
• Producing evaluation metrics using the trained model on test data to assess the
model quality.
• Comparing those evaluation metrics against the current production model, a baseline
model, or other business-requirement models.
• Ensuring the consistency of model performance on various data segments
• Testing the model for deployment, including infrastructure compatibility and consistency with the prediction service API
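A hedged sketch of simple data-validation checks (schema skew and value skew) using pandas; the expected schema, the drift threshold, and the training_stats layout are illustrative assumptions, not a production recipe.

import pandas as pd

EXPECTED_COLUMNS = {"median_income": "float64", "households": "int64"}

def validate_batch(df: pd.DataFrame, training_stats: pd.DataFrame) -> bool:
    # Schema skew: missing / unexpected features or wrong dtypes -> stop the pipeline
    for col, dtype in EXPECTED_COLUMNS.items():
        if col not in df.columns or str(df[col].dtype) != dtype:
            raise ValueError(f"schema skew on column {col!r}: stop the pipeline")

    # Data value skew: compare batch statistics with training-time statistics
    for col in EXPECTED_COLUMNS:
        drift = abs(df[col].mean() - training_stats.loc["mean", col])
        if drift > 3 * training_stats.loc["std", col]:
            return True    # significant drift -> trigger retraining
    return False

Here training_stats is assumed to be the summary statistics (e.g., from DataFrame.describe()) saved at training time.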
Level 2: CI/CD Pipeline Automation
Stages of CI/CD Automation Pipeline

1) Development and experimentation: iteratively try new ML algorithms and


modeling. The output is the source code of the ML pipeline steps that are
then pushed to a source repository.
2) Pipeline continuous integration: build source code and run various tests.
The outputs of this stage are pipeline components (packages,
executables, and artifacts).
3) Pipeline continuous delivery: deploy artifacts produced by the CI stage to
the target environment.
4) Automated training: automatically executed in production based on a
schedule or trigger. The output is a trained model pushed to the model
registry.
5) Model continuous delivery: serve the trained model as a prediction
service for the predictions.
6) Monitoring: collect statistics on the model performance based on live
data. The output is a trigger to execute the pipeline or to execute a new
experiment cycle.
Stages of the CI/CD automated ML
pipeline
Continuous Integration

• Pipeline and its components are built, tested, and packaged when
• new code is committed or
• pushed to the source code repository.

• Besides building packages, container images, and executables, CI process


can include
• Unit testing feature engineering logic.
• Unit testing the different methods implemented in your model.
• For example, a function that accepts a categorical data column and encodes the
column as a one-hot feature.
• Testing for training convergence
• Testing for NaN values due to dividing by zero or manipulating small or large
values.
• Testing that each component in the pipeline produces the expected artifacts.
• Testing integration between pipeline components.
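A hedged sketch of what two such CI unit tests might look like with pytest; the one_hot_encode feature function is a hypothetical example, not code from the lecture.

import numpy as np
import pandas as pd

def one_hot_encode(column: pd.Series) -> pd.DataFrame:
    return pd.get_dummies(column)

def test_one_hot_encoding_shape():
    col = pd.Series(["red", "green", "blue", "green"])
    encoded = one_hot_encode(col)
    assert encoded.shape == (4, 3)              # one column per category
    assert (encoded.sum(axis=1) == 1).all()     # exactly one hot bit per row

def test_no_nan_values_in_features():
    features = np.array([1.0, 0.5, 2.0]) / np.array([2.0, 1.0, 4.0])
    assert not np.isnan(features).any()         # guard against divide-by-zero artifacts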
Continuous Delivery

• Continuously delivers new pipeline implementations to the target
environment
• which in turn deliver prediction services for the newly trained model.
• For rapid and reliable continuous delivery of pipelines and models, consider
• Verifying the compatibility of the model with the target infrastructure
• e.g., required packages are installed in the serving environment
• Availability of memory, compute, and accelerator resources.
• Testing the prediction service by calling the service API for the updated model
• Testing prediction service performance, such as throughput, latency.
• Validating the data either for retraining or batch prediction.
• Verifying that models meet the predictive performance targets prior to deployment.
• Automated deployment to a test environment, triggered by new code to the
development branch.
• Semi-automated deployment to a pre-production environment, triggered by code
merging
• Manual deployment to the production environment from pre-production.
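A hedged smoke-test sketch for a deployed prediction service; the endpoint URL, payload, and latency budget are hypothetical placeholders for whatever the real serving environment exposes.

import time
import requests

ENDPOINT = "http://localhost:8080/predict"   # hypothetical REST endpoint

def smoke_test(payload, max_latency_s=0.5):
    start = time.time()
    response = requests.post(ENDPOINT, json=payload, timeout=5)
    latency = time.time() - start

    assert response.status_code == 200, "prediction service not reachable"
    assert "prediction" in response.json(), "unexpected response schema"
    assert latency < max_latency_s, f"latency {latency:.3f}s exceeds target"

smoke_test({"features": [8.3252, 41.0, 6.98, 1.02]})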
Frameworks
Cloud Vendors are providing MLOps framework
• https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
• Kubeflow and Cloud Build

• Amazon AWS MLOps

• Microsoft Azure MLOps


Thank You!
In our next segment: Linear Regression
