CT1 - MLOps
Manaranjan Pradhan
Key Objectives
• Understand key challenges at every step in building ML Systems
• Design, Development, Evaluation, Deployment and Monitoring stages
• Equip you with tools, techniques and best practices to deal with these
challenges
• Design, develop and deploy end-to-end ML systems
• Learn some of the Industry Standard tools and platforms
• Focus is on practical lessons.
Exams and Grading
• Group Assignment: 50% weightage (Coding Scheme: 3N-a)
• Final Exam: 50% weightage (Coding Scheme: 4N)
Projects
• The project is to be completed by a team of up to 5 people.
• The project will be hosted on GitHub. You can showcase this as an accomplishment.
• Plan to write a blog post as well.
• This will be a good learning experience!
Prepare for the class!
• Install and set up a Conda environment
• Set up Google Colab
• Create a folder on your Google Drive where you can store all the code and materials I send you before each class.
• Sign up for Azure
• You should get 100 USD of student credit if you sign up using your student (ISB) email id
Why MLOps and ML Systems Design
Netflix 1 Million Dollar Challenge!
Netflix 1M USD Prize
• https://www.wired.com/2012/04/netflix-prize-costs/
• https://netflixtechblog.com/netflix-recommendations-beyond-the-5-stars-part-1-55838468f429
Netflix Blog
https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf
ML Lifecycle
Use Case Identification and Problem Formulation → Data Collection and Preparation → Exploratory Analysis & Feature Engineering → Building Machine Learning Models → Model Validation and Evaluation → Model Deployment → Model Monitoring
First model development and continuous model updates are iterative steps across these stages.
Lifecycle – Engineering Skills are Leveraged
The same lifecycle stages apply; engineering effort and skills are concentrated as follows:
• Data collection and preparation can take around 60% of the total effort.
• Exploratory analysis and feature engineering need some amount of statistics and business knowledge.
• Building models is being automated; AutoML is an initiative in that direction. You can develop frameworks for selecting models, or reuse or repurpose an existing model (transfer learning).
Machine Learning Algorithms
Machine Learning: the ability to learn without being explicitly programmed.
• Supervised: the features (or factors) and the outcome are known in the historical data.
  • Classification: predicting labels which are categorical, for example customer churn or fraud detection.
  • Regression: predicting labels which are continuous in nature, for example sales volume or stock price prediction.
  • Forecasting: time series forecasting.
  • Recommender Systems: recommending products or services.
• Unsupervised: only the features are known; there is no ground truth.
  • Clustering: grouping related items, for example finding customer groups.
  • Dimensionality Reduction: projecting higher dimensional data onto lower dimensions, for example images.
What exactly is a ML System?
• The learning system is part of development: Continuous Integration (CI) and Continuous Delivery (CD).
• Code covers exploration, preparation, modelling and evaluation; the data is split into train and test sets.
• Several candidate models are trained, for example M1 Logistic Regression, M2 Decision Tree, M3 KNN, M4 Random Forest.
• The best model (say M3) is transferred to the prediction / inference system, which handles incoming requests by applying the transformations plus the algorithm and returning predictions.
• The prediction system only needs the model and its required parameters, e.g. y = B0 + B1*X1 + ....
Serialization/Deserialization Technique:
We take a model and write it to a file in either pickle format or ONNX format (Open Neural Network Exchange), and this file needs to be version controlled. We then load the file into memory in the inference system (deserialization), and when we get an input we run the model and return prediction values.
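A minimal sketch of this serialize/deserialize step using Python's pickle module and a scikit-learn model; the file name and toy data are placeholders for illustration.

```python
import pickle
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Learning system: train a candidate model on toy data
X, y = make_classification(n_samples=500, n_features=5, random_state=42)
model = LogisticRegression().fit(X, y)

# Serialization: write the fitted model to a version-controlled artifact
with open("model_v1.pkl", "wb") as f:
    pickle.dump(model, f)

# Prediction / inference system: deserialize and serve predictions
with open("model_v1.pkl", "rb") as f:
    loaded_model = pickle.load(f)

print(loaded_model.predict(X[:3]))  # predictions for incoming requests
```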
Different Types of ML Systems
• Batch prediction
  • Frequency: periodical (e.g. every 4 hours)
  • Useful for: processing accumulated data when you don't need immediate results (e.g. recommendation systems)
  • Optimized for: high throughput (quickly execute many tasks)
  • Examples: TripAdvisor hotel ranking, Netflix recommendations, customer churn analysis
• Online prediction
  • Frequency: as soon as requests come in
  • Useful for: when the prediction result is needed immediately
  • Optimized for: low latency (the delay per request is minimized)
  • Examples: Google Assistant speech recognition, fraud detection
https://stanford-cs329s.github.io/
MLOps Process Frameworks
Business Understanding
Problem Formulation
Key considerations at problem formulation:
• Accuracy or business metrics to measure
• Cost or risk of model failure
• System constraints
• Interpretability or explainability
• Bias and fairness
• Regulatory requirements
All machine learning projects should be driven by a single metric.
Example: will a customer default on a loan or not?
Cost of Model Failure
From the confusion matrix (TP, FN, FP, TN) for the loan-default example:
Recall = TP / (TP + FN)
Minimize Total Cost = (C1 x FP + C2 x FN)
where,
C1 = cost of each false positive
C2 = cost of each false negative
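A small sketch of this cost calculation; the labels, predictions, and the unit costs C1 and C2 are made-up numbers for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions for the loan-default example
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

c1 = 100   # assumed cost of each false positive
c2 = 500   # assumed cost of each false negative

recall = tp / (tp + fn)
total_cost = c1 * fp + c2 * fn

print(f"Recall: {recall:.2f}, Total cost: {total_cost}")
```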
Cost of Model Failure
Spam filtering example: good mails that end up in the spam box are false positives; spam mails that still appear in the inbox are false negatives.
Precision = TP / (TP + FP)
Minimizing false positives is more important here.
For employee attrition (leave or not leave), false negatives are more important, so use recall.
The risk or cost of model failure
should be captured at the time of
problem formulation.
Business Metrics
The performance of the recommender system is measured by the take-rate:
take-rate = (number of quality plays) / (number of recommendations the user sees)
https://dl.acm.org/doi/pdf/10.1145/2843948
Deployment Constraints
• Latency
• Throughput
The latency requirement can impact the choice of models. The inference / prediction system receives input data or features, the model makes decisions, and a prediction is returned.
Interpretability or Explainability
• Model objective
• Prediction (black box): e.g., an attrition model answering "What is the likelihood of this employee leaving?"
• Inference: answering "What are the important factors influencing the employee's decision to leave?"
• Inference is key to creating strategy
• Black box vs. glass box models
Bias and Fairness
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G
https://fortune.com/2018/10/10/amazon-ai-recruitment-bias-women-sexist/
Problem Formulation
• Inference or Prediction
• Local or Global Inference
• Model Risk Assessment
• Evaluation Metrics
• Model Interpretability or Explainability
• Bias and Fairness Requirements
• Compliance Requirements
• System Constraints
Data Understanding
What do Practitioners Say?
1. Data + Schema
2. Storage Efficiency
3. Read Latency
Data format
Common Data Formats
• Plain-text CSV - a good old friend of a data scientist
• Pickle - Python's way to serialize things
• HDF5 - a file format designed to store and organize large amounts of data
• https://en.wikipedia.org/wiki/Hierarchical_Data_Format
• Feather - a fast, lightweight, and easy-to-use binary file format for storing
data frames
• https://github.com/wesm/feather
• Parquet - Apache Hadoop's columnar storage format
• https://parquet.apache.org/
• Good for reading a subset of the columns. Mostly used as data lake or data
warehouse storage format.
CSV and JSON are semi-structured data
Pros:
• Widely used
• Plain text file – Can open it in any computer, readable by humans
• Can be read from and written to by most data software
Cons:
• Not the most efficient way to store or access
• No formal standard, so there is room for user interpretation on how to handle edge cases
• No Schema attached
Note:
• A great default option for most use cases
• Works well for < 100 MB of data
Parquet or ORC: Data + Schema, columnar storage formats
Pros:
• Very fast
• Naturally understands all dtypes used by pandas, including multi-index DataFrames
• Very common in “big data” systems like Hadoop or Spark
• Supports various compression algorithms
Cons:
• Binary storage format that is not human-readable
Note:
• > 100 MB data
• When many systems are accessing it (BI, ML Platforms, any other analysis tool)
• When many columns are present, but you need to access only a subset of them
Row Oriented Vs. Columnar Format
Pickle
Pros:
• Python-native serialization format. Highly optimized for Python reads and writes
Cons:
• No other language or system can understand this.
Note:
• Use only when you know that only Python systems will read this file
• Can be used as a staging file during pipelines
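A quick sketch (using pandas, with made-up column names and file names) of saving the same DataFrame in the three formats discussed above; Parquet requires the optional pyarrow or fastparquet dependency.

```python
import pandas as pd

# Hypothetical data frame for illustration
df = pd.DataFrame({
    "customer_id": range(1000),
    "region": ["north", "south"] * 500,
    "monthly_spend": [100.0 + i for i in range(1000)],
})

# Row-oriented, human-readable, no schema
df.to_csv("customers.csv", index=False)

# Columnar, schema-aware, compressed (needs pyarrow or fastparquet installed)
df.to_parquet("customers.parquet", index=False)

# Python-only binary serialization, keeps all pandas dtypes
df.to_pickle("customers.pkl")

# Reading back just a subset of columns is cheap with Parquet
subset = pd.read_parquet("customers.parquet", columns=["customer_id", "monthly_spend"])
print(subset.head())
```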
Data Profiling
• Missing values
  • How much is missing? If more than ~20% of a column is missing, consider dropping that column.
  • Imputation options: a default value, mean/median, or model-based imputation.
• Outliers (e.g. extreme outliers)
• Bad data quality
• Data sampling errors
https://careersatdoordash.com/blog/five-common-data-quality-gotchas-in-machine-learning-and-how-
to-detect-them-quickly/
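A minimal profiling sketch in pandas following the notes above; the input file, column handling, and the 20% threshold are illustrative assumptions.

```python
import pandas as pd

df = pd.read_csv("customers.csv")  # hypothetical input file

# How much is missing per column?
missing_ratio = df.isna().mean()
print(missing_ratio.sort_values(ascending=False))

# Drop columns where more than 20% of the values are missing
too_sparse = missing_ratio[missing_ratio > 0.20].index
df = df.drop(columns=too_sparse)

# Simple imputation: median for numeric columns, a default value for the rest
for col in df.columns:
    if pd.api.types.is_numeric_dtype(df[col]):
        df[col] = df[col].fillna(df[col].median())
    else:
        df[col] = df[col].fillna("unknown")
```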
Train-Test Split
Train Test
• Is not appropriate for model search
• Experimenting with multiple models (different algorithms)
• Searching for best model (same algorithm) with optimal
hyperparameters
• Test data is used multiple times for model validation
• Information gets leaked into the modelling process
• The test accuracy of the final model is optimistically biased
Train-Validation-Test Split
Train Val Test
• Is appropriate for model search
• Experimenting with multiple models (different algorithms)
• Searching for best model (same algorithm) with optimal
hyperparameters
• The validation set remains static
• Information gets leaked into the modelling process
• Validation accuracy may be optimistically biased
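A short sketch of creating a train/validation/test split with scikit-learn by splitting twice; the 70/15/15 proportions and toy data are an illustrative choice.

```python
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# First carve out the test set, then split the remainder into train and validation
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.15 / 0.85, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # roughly 70% / 15% / 15%
```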
To find the best model, we use cross-validation (CV).
K-Fold Cross Validation Split
Divide the training data into K folds (5-fold here), hold out one fold, train on the remaining folds (e.g. folds 1-4), and evaluate on the held-out fold (fold 5). Repeat so that each fold serves as the holdout once, and average the accuracies.
The same procedure can be run with a different model (e.g. a decision tree), and the averaged accuracies of the two models compared; the better model is selected.
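A sketch of this model comparison with scikit-learn's cross_val_score, using logistic regression and a decision tree as the two candidates on toy data.

```python
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# 5-fold cross-validation for each candidate model
for name, model in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                    ("Decision Tree", DecisionTreeClassifier(random_state=42))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```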
Class Imbalance
Why it is hard:
● Not enough signal to learn about rare classes
● Statistically speaking, predicting the majority label has a higher chance of being right
● Imbalance often comes with differences in the cost of wrong predictions
Typical examples:
● Fraud detection
● Spam detection
● Disease screening
● Churn prediction
Class imbalance solution: Resampling
• Undersampling: remove samples from the majority class. Can cause loss of information.
• Oversampling: add more examples of the minority class. Can cause overfitting.
Oversampling technique: SMOTE (Synthetic Minority Oversampling Technique)
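A minimal SMOTE sketch using the imbalanced-learn library (an assumption: the imblearn package is installed); SMOTE is applied only to the training split to avoid leakage.

```python
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Imbalanced toy data: roughly 5% positive class
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

print("Before SMOTE:", Counter(y_train))

# Synthesize new minority-class samples on the training data only
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

print("After SMOTE:", Counter(y_res))
```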
Types of data leakage
● Data Leakage
○ Premature featurization: creating features on the entire dataset instead of just the training data
■ E.g. create n-gram counts/vocabulary from train + test sets
○ Oversampling before splits
■ Train splits might contain test samples
○ Time leakage
■ Randomly splitting data instead of temporal split can cause training data to be able to see
the future
○ Group leakage
■ A patient has 3 CT scans: 2 in train, 1 in test.
How to avoid leakage?
● Check for duplication between train and valid/test splits
● Temporal split data (if possible)
● Use only train splits for feature engineering
● Train model on subset of features
○ If performance very high on a subset, either very good set of features or
leakage!
● Monitor model performance as more features are added
○ Sudden increase: either a very good feature or leakage!
● Involve subject matter experts in the process
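A small sketch of "use only train splits for feature engineering": the scaler is fit on the training data and merely applied to the test data, so no test-set statistics leak into training.

```python
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # statistics computed from train only
X_test_scaled = scaler.transform(X_test)        # test data is only transformed, never fit
```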
Keeping track of all things!
In traditional software development, only the code is version controlled. In ML, Data + Pipeline/Code + Model all need to be tracked:
• Data: any change in the samples will affect the performance of the model.
• Pipeline/Code: any change in imputation, scaling and encoding techniques will change the performance of the model.
• Model: any change in hyperparameters will affect the performance of the model.
Model Development
Do we always need to build a ML Model?
Model Baselining
https://eugeneyan.com/writing/first-rule-of-ml/
Google’s Rule for Machine Learning
https://developers.google.com/machine-learning/guides/rules-of-ml
Before building a ML model start with a baseline model
Baselining
• Create a system with if/else rules from heuristics
• Build a simple ML model (e.g., linear regression) first (see the sketch after this list)
• Build a system with regex (hand-crafted regular expressions) for classifying text data
• Benefits of creating a heuristics system
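A sketch of baselining with scikit-learn: a majority-class DummyClassifier and a plain logistic regression set the scores that any more complex model should beat. The toy data is illustrative.

```python
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline 1: always predict the most frequent class
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print("Majority-class baseline accuracy:", baseline.score(X_test, y_test))

# Baseline 2: a simple linear model
simple_model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Logistic regression accuracy:", simple_model.score(X_test, y_test))
```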
When you are forced to build ML systems?
Which model needs to be built?
Which model to select?
Highly accurate but complex models are often called black box models.
Accuracy Vs. Explainability
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G
https://www.bbc.com/news/technology-45809919
• White box Vs. Black box models
• Explainable AI (XAI)
https://fortune.com/2018/10/10/amazon-ai-recruitment-bias-women-sexist/
• Bias and Fairness
Model Development is messy!
• Run a number of experiments to refine your model
• Experiments may involve
• Different Transformations
• Different Models
• Different Hyperparameters
• Easy to lose track of code, hyperparameters, and artifacts
• Fail to reproduce experiments (reproducibility)
Building and deploying a Pipeline
• Preprocessors: one-hot encode (OHE) the categorical variables and scale the numerical features
• Define the preprocessor, define the model, and generate the pipeline that chains them (see the sketch after this list)
• Train the pipeline, test it, then deploy it to the prediction system
• Experiment tracking tools: MLflow (for local use), Neptune.ai, Weights and Biases (cloud based, widely used): https://wandb.ai/site/
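A minimal sketch of such a pipeline with scikit-learn; the column names and toy data are hypothetical placeholders.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression

# Hypothetical training data
df = pd.DataFrame({
    "region": ["north", "south", "east", "west"] * 25,
    "monthly_spend": [float(i) for i in range(100)],
    "churn": [0, 1] * 50,
})
X, y = df[["region", "monthly_spend"]], df["churn"]

# Preprocessor: OHE the categorical column, scale the numerical column
preprocessor = ColumnTransformer([
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["region"]),
    ("numerical", StandardScaler(), ["monthly_spend"]),
])

# Pipeline = preprocessor + model, trained and deployed as one artifact
pipeline = Pipeline([
    ("preprocess", preprocessor),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X, y)
print(pipeline.predict(X.head()))
```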
Experiment Tracking
• Manually track the results of all model runs in a spreadsheet
• Use experiment tracking tools
• Weights and Biases
• MLFlow
• Neptune.ai
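A small sketch of experiment tracking with MLflow's Python API; the experiment name, parameters, and metric values are illustrative placeholders.

```python
import mlflow

mlflow.set_experiment("churn-model-experiments")  # hypothetical experiment name

with mlflow.start_run(run_name="logreg-baseline"):
    # Log the hyperparameters used for this run
    mlflow.log_param("model_type", "logistic_regression")
    mlflow.log_param("C", 1.0)

    # ... train and evaluate the model here ...

    # Log the resulting metrics so runs can be compared later
    mlflow.log_metric("val_accuracy", 0.87)   # placeholder value
    mlflow.log_metric("val_auc", 0.91)        # placeholder value
```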
AutoML
• Finding the right model can be time consuming
• Time to market can be critical
• Unavailability of expertise in enterprises
• Benefits of using AutoML:
• Improve efficiency by automatically running repetitive tasks. This allows data
scientists to focus more on problems (like data) instead of models.
• Automated ML pipelines also help avoid potential errors caused by manual
work.
• AutoML is a big step towards the democratization of machine learning and
allows everyone to use ML features.
AutoML Frameworks
AutoML Frameworks
• Two types of frameworks
• Searches possible models from traditional ML algorithms
• Linear, SVM, KNN, Bagging, Boosting, Naïve Bayes etc.
• Does hyperparameter tuning
• Works with mostly structured data
• Neural Network Search
• Searches for neural network architectures
• Number of neurons and layers
• Works with structured and unstructured data
Extra:
Ensembling Techniques:
1. Bagging
2. Boosting
3. Stacking: build several different models, take the outcome from each model, pass those outcomes through a meta model, and get the final prediction. This approach is popular nowadays.
AutoML outputs a leaderboard that lists all the models it has tried and ranks them.
Leaderboards
AutoML Frameworks
Popular
• https://isg.beel.org/blog/2020/04/09/list-of-automl-tools-and-software-libraries/
• https://medium.com/swlh/8-automl-libraries-to-automate-machine-learning-pipeline-3da0af08f636
Searching parameters (see the H2O AutoML sketch after this list)
• Max run time – limits the time to experiment
• Max models – limits the number of experiments
• Stopping metric and tolerance: e.g. MSE, AUC, R_square <= 0.85
• Sorting metric – for leaderboard creation
• Exclude algos, e.g. ["GLM", "DeepLearning"]
• Include algos, e.g. ["GBM", "XGBoost", "DRF"]
• Preprocessing
  • for example scaling, various encodings (OHE, Target etc.) – not many frameworks support this.
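A hedged sketch of these search parameters using H2O AutoML; it assumes the h2o package is installed, a hypothetical file train.csv, and a target column named "churn".

```python
import h2o
from h2o.automl import H2OAutoML

h2o.init()

# Hypothetical training data loaded into an H2OFrame
train = h2o.import_file("train.csv")
target = "churn"                              # assumed target column
features = [c for c in train.columns if c != target]
train[target] = train[target].asfactor()      # treat as a classification target

aml = H2OAutoML(
    max_runtime_secs=600,                     # max run time
    max_models=20,                            # max models
    stopping_metric="AUC",                    # stopping metric
    sort_metric="AUC",                        # sorting metric for the leaderboard
    exclude_algos=["GLM", "DeepLearning"],    # excluded algorithms
    seed=42,
)
aml.train(x=features, y=target, training_frame=train)

# Leaderboard: all models tried, ranked by the sort metric
print(aml.leaderboard.head())
```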
How to use AutoML?
• Should be used as a guidance tool
• You may not want to take the suggested model directly to production
• Gives guidance on
• What models can be used
• What feature engineering can be used (though this is not a replacement of
actual feature engineering based on domain knowledge)
• Can be an indicator of what accuracy can be expected