Segmentation Dataset
Bob: Y = f(X)
Let’s share our model with users, i.e., let’s put it into production!
What Has to Go Right?
Concept and data drift are among the main challenges of production ML systems!
MLOps is about maintaining the trained model's performance in production.
The performance may degrade due to factors outside of our control,
so we ought to monitor it and, if needed, roll out a new model to users.
MLOps = ML Model + Software
[Figure: in a real-world ML system, your ML code is only a small box surrounded by much larger supporting components: Configuration, Data Collection, Feature Extraction, Data Verification, Machine Resource Management, Analysis Tools, Process Management Tools, Serving Infrastructure, and Monitoring.]
D. Sculley et al., Hidden Technical Debt in Machine Learning Systems, NIPS 2015
MLOps Pipeline
[Diagram: various notebooks and scripts are consolidated into a curated dataset.]
https://sites.google.com/princeton.edu/rep-workshop/
Keeping Track of Data Processing
                     Model-driven ML       Data-driven ML
Fixed component      Dataset               Model architecture
Variable component   Model architecture    Dataset
Objective            High accuracy         Fairness, low bias
Explainability       Limited               Possible

https://datacentricai.org
https://spectrum.ieee.org/andrew-ng-data-centric-ai
Modelling
Training challenges:
- Rare events
- Analyzing results
- Hyperparameter tuning

[Diagram: an iterative train/validate loop driving hyperparameter tuning.]

With this approach, the model eventually sees the entire dataset.
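A minimal sketch of one such train/validate loop, k-fold cross-validation, assuming scikit-learn; the data, model, and fold count are placeholders:

```python
# Each sample is held out for validation exactly once, so across the
# folds the tuning loop eventually uses the entire dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X = np.random.rand(200, 4)           # placeholder features
y = np.random.randint(0, 2, 200)     # placeholder binary labels

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[val_idx], y[val_idx]))

print(f"mean validation accuracy: {np.mean(scores):.3f}")
```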
Selecting Data for Training
Dataset split: Training 75%, Validation 15%, Test 10%.

[Diagram: train → validate (hyperparameter tuning) → test (final check).]

Splitting the dataset in three allows us to perform a final check with unseen data, as sketched below.
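A minimal sketch of the 75/15/10 split, assuming scikit-learn; X and y are placeholder arrays:

```python
# Split the data into 75% training, 15% validation, and 10% test.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 8)          # placeholder features
y = np.random.randint(0, 2, 1000)    # placeholder labels

# Carve out the 10% test set first, then take 15% of the total
# (0.15 / 0.90 of the remainder) as the validation set.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.15 / 0.90, random_state=0)
```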
Balancing Datasets
Consider a binary classification problem with a dataset composed of 200 entries.
There are 160 negative examples (no failure) and 40 positive ones (failure).
There were 3130 healthy signals (Y=False) and 112 faulty ones (Y=True).
C. Obermair, Extension of Signal Monitoring Applications with Machine Learning, Master Thesis, TU Graz
M. Brice, LHC tunnel Pictures during LS2, https://cds.cern.ch/images/CERN-PHOTO-201904-108-15
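One common mitigation is to re-weight the loss so the rare class counts more. A minimal sketch for the 3130-vs-112 example above, assuming scikit-learn:

```python
# Compute "balanced" class weights: each class is weighted inversely
# to its frequency, so the rare faulty class (~3% of samples) gets a
# much larger weight (~14.5 vs ~0.52 for the healthy class).
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y = np.array([0] * 3130 + [1] * 112)   # 0 = healthy, 1 = faulty
weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1]), y=y)
print(dict(zip([0, 1], weights)))
```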
Rare Events
J.H. Kim et al., Hybrid Integration of Solid-State Quantum Emitters on a Silicon Photonic Chip, Nano Letters 2017
What else can we do?
When one of the values of Y is rare in the population, considerable resources in data collection can be saved by randomly selecting within categories of Y. […] The strategy is to select on Y by collecting observations (randomly or all those available) for which Y = 1 (the "cases") and a random selection of observations for which Y = 0 (the "controls").
We can also collect more data for a particular class (if at all possible); see the sketch below.
G. King and L. Zeng, “Logistic Regression in Rare Events Data,” Political Analysis, p. 28, 2001.
https://en.wikipedia.org/wiki/Cross-validation_(statistics)
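A minimal sketch of the case-control selection quoted above, assuming a hypothetical pandas DataFrame `df` with a binary column `Y`; note that King and Zeng's paper is precisely about correcting the estimates for such artificial class proportions afterwards:

```python
# Keep all rare cases (Y = 1) and draw a random subset of controls
# (Y = 0); the ratio of controls per case is a free design choice.
import pandas as pd

def case_control_sample(df: pd.DataFrame, controls_per_case: int = 4,
                        seed: int = 0) -> pd.DataFrame:
    cases = df[df["Y"] == 1]                      # all available cases
    controls = df[df["Y"] == 0].sample(
        n=controls_per_case * len(cases), random_state=seed)
    # Shuffle the combined sample so cases and controls are interleaved.
    return pd.concat([cases, controls]).sample(frac=1, random_state=seed)
```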
Training Tracking
1. Pen & Paper
2. Spreadsheet
3. Dedicated framework
- Weights & Biases
- Neptune.ai
- TensorFlow
- …
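As an illustration of option 3, a minimal experiment-tracking sketch with Weights & Biases; the project name, config, and metric are placeholders:

```python
# Log hyperparameters and per-epoch metrics to a tracking service.
import wandb

run = wandb.init(project="mlops-demo", config={"lr": 1e-3, "epochs": 10})
for epoch in range(run.config.epochs):
    train_loss = 1.0 / (epoch + 1)   # placeholder for a real training step
    wandb.log({"epoch": epoch, "train_loss": train_loss})
run.finish()
```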
Error Analysis
Such analysis may reveal issues with labelling or rare classes in data.
For unstructured data, a "cockpit" (an interactive dashboard) could help in the analysis.
It is also useful for monitoring certain classes of inputs.
Deployment
Degrees of automation
Modes of deployment
Reproducible environments
Starting with shadow mode, we can collect more training data in production!
C. Obermair, Extension of Signal Monitoring Applications with Machine Learning, Master Thesis, TU Graz
Modes of Deployment
[Diagram: a router sends X% of incoming traffic to the new model version (a canary rollout) and the rest to the current one.]
https://hbr.org/2017/09/the-surprising-power-of-online-experiments
https://en.wikipedia.org/wiki/Blue-winged_parrot
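A toy sketch of such a router in plain Python; the model objects and the 10% canary fraction are illustrative placeholders:

```python
# Send a fraction of the traffic to the new model version and the
# rest to the stable one.
import random

def route(request, stable_model, new_model, canary_fraction=0.10):
    model = new_model if random.random() < canary_fraction else stable_model
    return model.predict(request)
```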
Reproducible Environments
[Diagram: an HTTP server (e.g., KServe) exposes the ML model through a REST API; the data pipeline, config file, and pool of models run on computing infrastructure inside a reproducible computing environment (OS, Python, packages).]
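Client side, querying such a served model reduces to an HTTP request. A minimal sketch assuming KServe's V1 inference protocol; host, port, and model name are placeholders:

```python
# POST one input instance to the model's predict endpoint and read
# back the predictions.
import requests

payload = {"instances": [[5.1, 3.5, 1.4, 0.2]]}   # one placeholder input
resp = requests.post("http://localhost:8080/v1/models/my-model:predict",
                     json=payload)
resp.raise_for_status()
print(resp.json()["predictions"])
```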
• Infrastructure metrics
• Logging errors
• Memory and CPU utilization
• Latency and jitter

For each of the relevant metrics, one should define warning/error thresholds (a minimal sketch follows).
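A sketch of such threshold checks for one metric, latency; the threshold values are illustrative:

```python
# Compare a measured latency against warning/error thresholds and
# emit the corresponding log message.
import logging

logging.basicConfig(level=logging.INFO)

def check_latency(latency_ms: float, warn_ms: float = 200.0,
                  error_ms: float = 500.0) -> None:
    if latency_ms >= error_ms:
        logging.error("latency %.0f ms above error threshold %.0f ms",
                      latency_ms, error_ms)
    elif latency_ms >= warn_ms:
        logging.warning("latency %.0f ms above warning threshold %.0f ms",
                        latency_ms, warn_ms)

check_latency(230.0)   # -> WARNING
```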
Monitoring Matters
C. Obermair, Extension of Signal Monitoring Applications with Machine Learning, Master Thesis, TU Graz
Data Engineering → Modelling → Deployment → Monitoring
MLOps Pipeline with TensorFlow
The pipeline is represented as a DAG (directed acyclic graph) whose stages cover Data Engineering, Modelling, and Deployment.
https://www.tensorflow.org/tfx/guide
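A minimal TFX sketch of such a DAG, following the structure of the TFX guide linked above; the data path, trainer module, and step counts are placeholders:

```python
# Two-component TFX pipeline: data ingestion (data engineering)
# feeding a Trainer (modelling), executed by the local orchestrator.
from tfx import v1 as tfx

example_gen = tfx.components.CsvExampleGen(input_base="data/")
trainer = tfx.components.Trainer(
    module_file="trainer_module.py",          # defines run_fn()
    examples=example_gen.outputs["examples"],
    train_args=tfx.proto.TrainArgs(num_steps=1000),
    eval_args=tfx.proto.EvalArgs(num_steps=100),
)

metadata_config = tfx.orchestration.metadata.sqlite_metadata_connection_config(
    "metadata.db")
pipeline = tfx.dsl.Pipeline(
    pipeline_name="mlops-demo",
    pipeline_root="pipeline/",
    components=[example_gen, trainer],        # edges define the DAG
    metadata_connection_config=metadata_config,
)
tfx.orchestration.LocalDagRunner().run(pipeline)
```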
MLOps Pipeline with Kubeflow
[Diagram: a Kubeflow pipeline covering Data Engineering, Modelling, and Deployment.]
https://ml.cern.ch
https://www.kubeflow.org/docs/started/
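A minimal sketch with the Kubeflow Pipelines SDK (KFP v2); component bodies and names are placeholders, and the compiled YAML could be uploaded to a Kubeflow instance such as ml.cern.ch:

```python
# Two lightweight components chained into a pipeline, then compiled
# to a YAML definition that a Kubeflow cluster can run.
from kfp import compiler, dsl

@dsl.component
def prepare_data() -> str:                  # data engineering step
    return "dataset-v1"

@dsl.component
def train_model(dataset: str) -> str:       # modelling step
    return f"model trained on {dataset}"

@dsl.pipeline(name="mlops-demo")
def mlops_pipeline():
    data_task = prepare_data()
    train_model(dataset=data_task.output)

compiler.Compiler().compile(mlops_pipeline, "mlops_pipeline.yaml")
```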
Conclusion
                   Development ML         Production ML
Objective          High-accuracy model    Efficiency of the overall system
Dataset            Fixed                  Evolving
Code quality       Secondary importance   Critical
Model training     Optimal tuning         Fast turn-arounds
Reproducibility    Secondary importance   Critical
Traceability       Secondary importance   Critical
I do hope the presented MLOps concepts will allow your models to transition from Good to Great!
Resources