MLOps
MLOps Principles
As machine learning and AI propagate in software products and services, we
need to establish best practices and tools to test, deploy, manage, and monitor
ML models in real-world production. In short, with MLOps we strive to
avoid “technical debt” in machine learning applications.
SIG MLOps defines “an optimal MLOps experience [as] one where Machine
Learning assets are treated consistently with all other software assets within a
CI/CD environment. Machine Learning models can be deployed alongside the
services that wrap them and the services that consume them as part of a
unified release process.” By codifying these practices, we hope to accelerate
the adoption of ML/AI in software systems and fast delivery of intelligent
software. In the following, we describe a set of important concepts in MLOps
such as Iterative-Incremental Development, Automation, Continuous
Deployment, Versioning, Testing, Reproducibility, and Monitoring.
The complete MLOps process includes three broad phases of “Designing the
ML-powered application”, “ML Experimentation and Development”, and “ML
Operations”.
The first phase is devoted to business understanding, data understanding, and designing the ML-powered software. In this stage, we identify our potential users, design the machine learning solution that solves their problem, and assess the further development of the project. Mostly, we act within two categories of problems: either increasing the productivity of the user or increasing the interactivity of our application.
Initially, we define ML use-cases and prioritize them. The best practice for ML
projects is to work on one ML use case at a time. Furthermore,
the design phase aims to inspect the available data that will be needed to train
our model and to specify the functional and non-functional requirements of our
ML model. We should use these requirements to design the architecture of the ML application, establish the serving strategy, and create a test suite for the future ML model.
The follow-up phase, “ML Experimentation and Development”, is devoted to verifying the applicability of ML for our problem by implementing a proof of concept for the ML model. Here, we iterate over different steps, such as identifying or refining the suitable ML algorithm for our problem, data engineering, and model engineering. The primary goal in this phase is to deliver a stable, high-quality ML model that we will run in production.
The main focus of the “ML Operations” phase is to deliver the previously developed ML model into production by using established DevOps practices such as testing, versioning, continuous delivery, and monitoring.
All three phases are interconnected and influence each other. For example, a design decision made during the design stage will propagate into the experimentation phase and finally influence the deployment options during the final operations phase.
Automation
The level of automation of the Data, ML Model, and Code pipelines determines the maturity of the ML process. With increased maturity, the velocity for training new models also increases. The objective of an MLOps team is to automate the deployment of ML models into the core software system or as a service component. This means automating the complete ML workflow without any manual intervention. Triggers for automated model training and deployment can be calendar events, messaging, monitoring events, as well as changes to the data, the model training code, or the application code.
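As an illustration of a data-change trigger, the following minimal sketch polls a data directory and starts the ML workflow when its contents change; the `DATA_DIR` location and the `train_and_deploy()` entry point are hypothetical, and a real setup would typically rely on a workflow orchestrator or a message queue instead of polling.

```python
import hashlib
import time
from pathlib import Path

DATA_DIR = Path("data/raw")  # hypothetical location of the training data


def data_fingerprint(data_dir: Path) -> str:
    """Hash file names, sizes, and modification times to detect data changes."""
    if not data_dir.is_dir():
        return ""
    digest = hashlib.sha256()
    for path in sorted(data_dir.rglob("*")):
        if path.is_file():
            stat = path.stat()
            digest.update(f"{path.name}:{stat.st_size}:{stat.st_mtime_ns}".encode())
    return digest.hexdigest()


def train_and_deploy() -> None:
    """Placeholder for the automated ML workflow (train, validate, deploy)."""
    print("New data detected: triggering the ML pipeline ...")


if __name__ == "__main__":
    last_seen = data_fingerprint(DATA_DIR)
    while True:
        current = data_fingerprint(DATA_DIR)
        if current != last_seen:  # a data change acts as the trigger
            train_and_deploy()
            last_seen = current
        time.sleep(3600)  # poll once per hour; a scheduler or message queue is more typical
```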
Automated testing helps to discover problems quickly and at an early stage. This enables fast fixing of errors and learning from mistakes.
To adopt MLOps, we see three levels of automation, starting from the initial level with manual model training and deployment, up to running both ML and CI/CD pipelines automatically:
1. Manual process. At the initial level, model training and deployment are performed entirely manually.
2. ML pipeline automation. The next level includes the automatic execution of model training. Here we introduce continuous training of the model: whenever new data is available, the process of model retraining is triggered. This level of automation also includes data and model validation steps.
3. CI/CD pipeline automation. At the final level, both the ML pipeline and the CI/CD pipeline run automatically, so that pipeline components and the resulting models are built, tested, and deployed without manual intervention.
The following picture shows the automated ML pipeline with CI/CD routines:
| MLOps Stage | Output |
| --- | --- |
| Development & Experimentation (ML algorithms, new ML models) | Source code for pipelines: data extraction, validation, preparation, model training, model evaluation, model testing |
| Model Continuous Delivery (model serving for prediction) | Deployed model prediction service (e.g., model exposed as REST API) |
| Monitoring (collecting data about the model performance on live data) | Trigger to execute the pipeline or to start a new experiment cycle |
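To make the “model exposed as REST API” output concrete, here is a minimal serving sketch, assuming FastAPI and a scikit-learn model serialized to `model.joblib`; both the path and the feature schema are illustrative assumptions.

```python
# Minimal model-serving sketch: expose a trained model as a REST API.
from fastapi import FastAPI
from pydantic import BaseModel
import joblib

app = FastAPI(title="ML prediction service")
model = joblib.load("model.joblib")  # hypothetical path to the trained model artifact


class PredictionRequest(BaseModel):
    features: list[float]  # flat feature vector; adapt to the real schema


@app.post("/predict")
def predict(request: PredictionRequest) -> dict:
    prediction = model.predict([request.features])
    return {"prediction": prediction.tolist()}
```

Assuming the file is saved as `serving.py`, the service could then be started with `uvicorn serving:app`.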
After analyzing the MLOps Stages, we might notice that the MLOps setup
requires several components to be installed or prepared. The following table
lists those components:
| MLOps Setup Component | Description |
| --- | --- |
| ML Pipeline Orchestrator | Automating the steps of the ML experiments |
Continuous X
To understand model deployment, we first specify the “ML assets”: the ML model, its parameters and hyperparameters, training scripts, and training and testing data. We are interested in the identity, components, versioning, and dependencies of these ML artifacts. The target destination for an ML artifact may be a (micro-)service or some infrastructure component. A deployment service provides orchestration, logging, monitoring, and notification to ensure that the ML models, code, and data artifacts are stable.
MLOps is an ML engineering culture that includes the following practices:
- Continuous Integration (CI) extends testing and validating code and components by adding testing and validating data and models.
- Continuous Delivery (CD) concerns the delivery of an ML training pipeline that automatically deploys another service (the model prediction service).
- Continuous Training (CT) is unique to ML systems and is concerned with automatically retraining and serving the models.
- Continuous Monitoring (CM) concerns monitoring production data and model performance metrics, which are bound to business metrics.
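As a sketch of how CI can be extended with data and model validation, the following pytest checks could run on every commit; the file paths, column names, and accuracy threshold are illustrative assumptions.

```python
# test_ci_checks.py -- illustrative CI checks for data and model (run with `pytest`).
import joblib
import pandas as pd
from sklearn.metrics import accuracy_score

EXPECTED_COLUMNS = {"age", "income", "label"}  # assumed schema
MIN_ACCURACY = 0.80                            # assumed release threshold


def test_data_matches_schema():
    df = pd.read_csv("data/validation.csv")    # hypothetical held-out data
    assert EXPECTED_COLUMNS.issubset(df.columns)
    assert df["label"].isin([0, 1]).all()
    assert not df[["age", "income"]].isnull().any().any()


def test_model_meets_quality_bar():
    df = pd.read_csv("data/validation.csv")
    model = joblib.load("model.joblib")        # hypothetical model artifact
    predictions = model.predict(df[["age", "income"]])
    assert accuracy_score(df["label"], predictions) >= MIN_ACCURACY
```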
Versioning
The goal of versioning is to treat ML training scripts, ML models, and data sets for model training as first-class citizens in DevOps processes by tracking ML models and data sets with version control systems. The common reasons why an ML model and data change (according to SIG MLOps) are the following:
- Models may be retrained based upon new training data or new training approaches.
- Models may degrade over time.
- Models may be subject to attack and require revision.
- Corporate or government compliance may require audit or investigation of both the ML model and the data.
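As a minimal illustration of treating data sets as versioned, first-class citizens, the sketch below records a content hash of the training data next to each trained model; a dedicated tool such as DVC would normally take care of this, and all paths are hypothetical.

```python
# Sketch: record which dataset version a model was trained on by hashing the data file.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Compute a content hash that identifies this exact version of the file."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_training_run(dataset: Path, model_path: Path, registry: Path) -> None:
    """Append a line linking the model artifact to the dataset version it was trained on."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "dataset": str(dataset),
        "dataset_sha256": file_sha256(dataset),
        "model": str(model_path),
    }
    with registry.open("a") as log:
        log.write(json.dumps(entry) + "\n")


# Example (hypothetical paths):
# record_training_run(Path("data/train.csv"), Path("model.joblib"), Path("training_runs.jsonl"))
```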
Experiments Tracking
Machine learning development is a highly iterative and research-centric process. In contrast to the traditional software development process, in ML development multiple experiments on model training can be executed in parallel before deciding which model will be promoted to production.
One way to track multiple experiments is to use different (Git) branches, each dedicated to a separate experiment. The output of each branch is a trained model. Depending on the selected metric, the trained ML models are compared with each other and the appropriate model is selected. Such low-friction
branching is fully supported by the tool DVC, which is an extension of Git and
an open-source version control system for machine learning projects. Another
popular tool for ML experiments tracking is the Weights and Biases
(wandb) library, which automatically tracks the hyperparameters and metrics of
the experiments.
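A minimal experiment-tracking sketch with wandb might look as follows; the project name, hyperparameters, and the `train()` helper are illustrative assumptions.

```python
# Sketch of experiment tracking with Weights & Biases (wandb).
import wandb


def train(learning_rate: float, epochs: int) -> float:
    """Placeholder training routine that returns a validation accuracy."""
    return 0.9  # stand-in metric


run = wandb.init(project="mlops-demo", config={"learning_rate": 0.01, "epochs": 10})
accuracy = train(run.config.learning_rate, run.config.epochs)
wandb.log({"val_accuracy": accuracy})  # logged metrics can be compared across runs
wandb.finish()
```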
Testing
Figure source: “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction” by E. Breck et al., 2017.
Action: Use subsets of features (leaving one of the k features out at a time) and train a set of different models.
Features and data pipelines should be policy-compliant (e.g. GDPR). These
requirements should be programmatically checked in both development
and production environments.
Feature creation code should be tested by unit tests (to capture bugs in
features).
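A unit test for feature creation code could look like the following sketch, where `bucketize_age` is a hypothetical feature function used only for illustration.

```python
# test_features.py -- unit tests for feature creation code (run with `pytest`).
import pytest


def bucketize_age(age: int) -> str:
    """Example feature: map a raw age to a categorical bucket."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


def test_bucketize_age_boundaries():
    assert bucketize_age(0) == "minor"
    assert bucketize_age(18) == "adult"
    assert bucketize_age(65) == "senior"


def test_bucketize_age_rejects_invalid_input():
    with pytest.raises(ValueError):
        bucketize_age(-1)
```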
Model staleness test. The model is defined as stale if the trained model
does not include up-to-date data and/or does not satisfy the business
impact requirements. Stale models can affect the quality of prediction in
intelligent software.
Action: Run an A/B experiment with older models. Include a range of model ages to produce an Age vs. Prediction Quality curve that helps determine how often the ML model should be retrained.
Action: Use an additional test set, which is disjoint from the training and
validation sets. Use this test set only for a final evaluation.
Action: Collect more data that includes potentially under-represented
categories.
ML infrastructure test
Training the ML models should be reproducible, which means that training
the ML model on the same data should produce identical ML models.
Action: Write unit tests that randomly generate input data and train the model for a single optimization step (e.g., one gradient descent update); see the sketch after this list.
Action: Crash tests for model training. The ML model should restore
from a checkpoint after a mid-training crash.
Action: Create a fully automated test that regularly triggers the entire
ML pipeline. The test should validate that the data and code
successfully finish each stage of training and the resulting ML model
performs as expected.
All integration tests should be run before the ML model reaches the
production environment.
Testing that the model in the training environment gives the same score as
the model in the serving environment.
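A reproducibility unit test in the spirit of the first action above might look like this sketch; the tiny scikit-learn model stands in for the real training code.

```python
# test_reproducibility.py -- train for a single optimization step twice with the same
# seed and randomly generated input data, then check that the resulting weights match.
import numpy as np
from sklearn.linear_model import SGDClassifier


def train_single_step(seed: int) -> np.ndarray:
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(32, 4))             # randomly generated input data
    y = rng.integers(0, 2, size=32)
    model = SGDClassifier(random_state=seed, tol=None)
    model.partial_fit(X, y, classes=[0, 1])  # one pass of gradient-descent updates
    return model.coef_.copy()


def test_single_training_step_is_reproducible():
    np.testing.assert_array_equal(train_single_step(seed=42), train_single_step(seed=42))
```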
Monitoring
Once the ML model has been deployed, it needs to be monitored to assure that it performs as expected. The following checklist of model monitoring activities in production is adapted from “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction” by E. Breck et al., 2017:
Monitor dependency changes throughout the complete pipeline and notify about them: data version changes, changes in the source system, and dependency upgrades.
Monitor data invariants in training and serving inputs: Alert if data does not
match the schema, which has been specified in the training step.
Monitor whether training and serving features compute the same value.
Since the generation of training and serving features might take place in physically separate locations, we must carefully test that these different code paths are logically identical.
Action: (1) Log a sample of the serving traffic. (2) Compute distribution
statistics (min, max, avg, values, % of missing values, etc.) on the
training features and the sampled serving features and ensure that they
match.
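A sketch of such a distribution check might look as follows; the statistics chosen, the drift tolerance, and the file names are illustrative assumptions.

```python
# Sketch: compare distribution statistics of training features against a logged
# sample of serving features and report features that drifted.
import pandas as pd


def feature_stats(df: pd.DataFrame) -> pd.DataFrame:
    """Per-feature summary statistics (numeric features assumed)."""
    return pd.DataFrame({
        "min": df.min(),
        "max": df.max(),
        "mean": df.mean(),
        "missing_ratio": df.isnull().mean(),
    })


def check_training_serving_skew(train: pd.DataFrame, serving: pd.DataFrame,
                                mean_tolerance: float = 0.1) -> list[str]:
    """Return the names of features whose mean drifted beyond the relative tolerance."""
    train_stats, serving_stats = feature_stats(train), feature_stats(serving)
    drift = (train_stats["mean"] - serving_stats["mean"]).abs()
    scale = train_stats["mean"].abs().clip(lower=1e-9)
    return list(drift[drift / scale > mean_tolerance].index)


# Example (hypothetical files):
# skewed = check_training_serving_skew(pd.read_csv("train_features.csv"),
#                                      pd.read_csv("serving_sample.csv"))
# if skewed:
#     print("Alert: training/serving skew detected for", skewed)
```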
Monitor degradation of the predictive quality of the ML model on served data. Both dramatic and slow-leak regressions in prediction quality should trigger a notification.
The picture below shows that model monitoring can be implemented by tracking the precision, recall, and F1-score of the model's predictions over time. A decrease in precision, recall, and F1-score triggers model retraining, which leads to model recovery.
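A monitoring sketch along these lines could compute precision, recall, and F1-score on labelled serving data and flag when quality drops; the threshold and the shape of the daily batches are illustrative assumptions.

```python
# Sketch: track prediction quality over time and trigger retraining on degradation.
from sklearn.metrics import precision_score, recall_score, f1_score

F1_THRESHOLD = 0.75  # assumed minimum acceptable quality


def evaluate_window(y_true, y_pred) -> dict:
    """Quality metrics for one window (e.g., one day) of labelled serving data."""
    return {
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
    }


def monitor(daily_batches) -> None:
    """daily_batches yields (date, y_true, y_pred) tuples of labelled serving data."""
    for date, y_true, y_pred in daily_batches:
        metrics = evaluate_window(y_true, y_pred)
        print(date, metrics)
        if metrics["f1"] < F1_THRESHOLD:
            print(f"{date}: F1 dropped below {F1_THRESHOLD}, triggering model retraining")
            # trigger_retraining()  # hypothetical hook into the training pipeline
```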
The “ML Test Score” measures the overall readiness of the ML system for
production. The final ML Test Score is computed as follows:
For each test, half a point is awarded for executing the test manually, with
the results documented and distributed.
A full point is awarded if there is a system in place to run that test automatically on a repeated basis.
Sum the score of each of the four sections individually: Data Tests, Model
Tests, ML Infrastructure Tests, and Monitoring.
The final ML Test Score is computed by taking the minimum of the scores
aggregated for each of the sections: Data Tests, Model Tests, ML
Infrastructure Tests, and Monitoring.
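The scoring rule above can be expressed in a few lines; the example test names and their manual/automated status are placeholders.

```python
# Sketch of the ML Test Score aggregation: 0.5 points for a manually executed and
# documented test, 1.0 for an automated one; the final score is the minimum of the
# four section sums.
SECTION_TESTS = {
    "data_tests": {"feature_distributions": 1.0, "privacy_compliance": 0.5},
    "model_tests": {"staleness": 0.5, "hyperparameter_tuning": 1.0},
    "ml_infrastructure_tests": {"training_reproducibility": 1.0, "pipeline_integration": 0.5},
    "monitoring": {"training_serving_skew": 0.5, "prediction_quality": 0.5},
}


def ml_test_score(sections: dict) -> float:
    section_scores = {name: sum(tests.values()) for name, tests in sections.items()}
    return min(section_scores.values())


print(ml_test_score(SECTION_TESTS))  # -> 1.0 (monitoring is the weakest section)
```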
After computing the ML Test Score, we can reason about the readiness of the ML system for production. The following table provides the interpretation ranges:

| Points | Description |
| --- | --- |
| 0 | More of a research project than a productionized system. |
| (0, 1] | Not totally untested, but it is worth considering the possibility of serious holes in reliability. |
| (1, 2] | There has been a first pass at basic productionization, but additional investment may be needed. |
| (2, 3] | Reasonably tested, but it is possible that more of those tests and procedures may be automated. |
| (3, 5] | Strong levels of automated testing and monitoring, appropriate for mission-critical systems. |
| > 5 | Exceptional levels of automated testing and monitoring. |
Reproducibility
Reproducibility in a machine learning workflow means that every phase of data processing, ML model training, and ML model deployment should produce identical results given the same input.
| Phase | Challenge | How to ensure reproducibility |
| --- | --- | --- |
| Collecting Data | Generation of the training data can't be reproduced (e.g., due to constant database changes or random data loading) | 1) Always back up your data. 2) Save a snapshot of the data set (e.g., on cloud storage). 3) Design data sources with timestamps so that a view of the data at any point can be retrieved. 4) Data versioning. |
Additionally, Gene Kim et al., recommend to “use a loosely coupled
architecture. This affects the extent to which a team can test and deploy their
applications on demand, without requiring orchestration with other services.
Having a loosely coupled architecture allows your teams to work
independently, without relying on other teams for support and services, which
in turn enables them to work quickly and deliver value to the organization.”
PyScaffold
PyScaffold is a project generator for bootstrapping a standardized Python project structure; it can be used to set up ML projects that follow the project structure conventions summarized below.
… the deployment process, which might range between *manual deployment* and a *fully automated CI/CD pipeline*.

| Metric | Question | MLOps interpretation |
| --- | --- | --- |
| Change Failure Rate | What percentage of changes to production or released to users result in degraded service (e.g., lead to service impairment or service outage) and subsequently require remediation (e.g., require a hotfix, rollback, fix forward, patch)? | The ML Model Change Failure Rate can be expressed as the difference between the performance metrics of the currently deployed ML model and those of the previous model, such as precision, recall, F1-score, accuracy, AUC, ROC, false positives, etc. The ML Model Change Failure Rate is also related to A/B testing. |
In ML-based systems, a build can be triggered by a combination of code change, data change, or model change. The following table summarizes the MLOps principles for building ML-based software:
| MLOps Principles | Data | ML Model | Code |
| --- | --- | --- | --- |
| Versioning | 1) Data preparation pipelines 2) Feature store 3) Datasets 4) Metadata | 1) ML model training pipeline 2) ML model (object) 3) Hyperparameters 4) Experiment tracking | 1) Application code 2) Configurations |
| Reproducibility | 1) Backup data 2) Data versioning 3) Extract metadata 4) Versioning of feature engineering | 1) Hyperparameter tuning is identical between dev and prod 2) The order of features is the same 3) Ensemble learning: the combination of ML models is the same 4) The model pseudo-code is documented | 1) Versions of all dependencies in dev and prod are identical 2) Same technical stack for dev and production environments 3) Reproducing results by providing container images or virtual machines |
| Deployment | 1) Feature store is used in dev and prod environments | 1) Containerization of the ML stack 2) REST API 3) On-premise, cloud, or edge | 1) On-premise, cloud, or edge |
| Monitoring | 1) Data distribution changes (training vs. serving data) 2) Training vs. serving features | 1) ML model decay 2) Numerical stability 3) Computational performance of the ML model | 1) Predictive quality of the application on serving data |
Along with the MLOps principles, following this set of best practices should help reduce the “technical debt” of the ML project:
| MLOps Best Practices | Data | ML Model | Code |
| --- | --- | --- | --- |
| Documentation | 1) Data sources 2) Decisions on how/where to get data 3) Labelling methods | 1) Model selection criteria 2) Design of experiments 3) Model pseudo-code | 1) Deployment process 2) How to run locally |
| Project Structure | 1) Data folder for raw and processed data 2) A folder for the data engineering pipeline 3) Test folder for data engineering methods | 1) A folder that contains the trained model 2) A folder for notebooks 3) A folder for feature engineering 4) A folder for ML model engineering | 1) A folder for bash/shell scripts 2) A folder for tests 3) A folder for deployment files (e.g., Docker files) |