
MLflow Workshop Part 2

This document provides an overview of MLflow, focusing on its components such as Tracking, Projects, and Models, which facilitate the machine learning lifecycle. It emphasizes the importance of reproducibility in machine learning and outlines how MLflow helps package and manage data science code and models. Additionally, it includes examples and resources for further learning about MLflow functionalities and usage.


Platform for the Complete Machine Learning Lifecycle

Jules S. Damji
@2twitme

San Francisco | May 13, 2020: Part 2 of 3 Series


Outline – Introduction to MLflow: Understanding MLflow Projects and Models (Part 2)

§ Review & Recap of Part 1: MLflow Tracking
▪ https://youtu.be/x3cxvsUFVZA
§ MLflow Components
▪ MLflow Projects & Models
▪ Concepts and Motivations
▪ MLflow on Databricks Community Edition (DCE)
▪ Explore the MLflow UI
▪ Tutorials
§ Q&A

https://dbricks.co/mlflow-part-2
https://github.com/dmatrix/mlflow-workshop-project-expamle-1

Machine Learning Development is Complex
Traditional Software vs. Machine Learning

Traditional Software:
§ Goal: Meet a functional specification
§ Quality depends only on code
§ Typically pick one software stack with fewer libraries and tools

Machine Learning:
§ Goal: Optimize a metric (e.g., accuracy); constantly experiment to improve it
§ Quality depends on input data and tuning parameters
§ Compare and combine many libraries and models
Machine Learning Lifecycle

[Diagram: Raw Data → Data Prep → Training (model tuning, λ/θ) → Deploy, with model exchange and governance spanning the cycle, and "Scale" required at every stage]
MLflow Components

§ Tracking: Record and query experiments: code, data, config, and results
§ Projects: Package data science code in a format that enables reproducible runs on any platform
§ Models: Deploy machine learning models in diverse serving environments
§ Model Registry (new): Store, annotate, and manage models in a central repository

mlflow.org | github.com/mlflow/mlflow | twitter.com/MLflow | databricks.com
Model Development with MLflow is Simple!

data = load_text(file)
ngrams = extract_ngrams(data, N=n)
model = train_model(ngrams, learning_rate=lr)
score = compute_accuracy(model)

with mlflow.start_run() as run:
    mlflow.log_param("data_file", file)
    mlflow.log_param("n", n)
    mlflow.log_param("learn_rate", lr)
    mlflow.log_metric("score", score)
    mlflow.sklearn.log_model(model, "model")

$ mlflow ui

Track parameters, metrics, output files, and code version; search using the UI or API.
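Conceptually, a tracked run is just a record of parameters, metrics, and artifacts keyed by a run ID. The following pure-Python sketch (no MLflow dependency; all names are hypothetical stand-ins, not MLflow's implementation) mimics what `start_run`, `log_param`, and `log_metric` accumulate:

```python
import uuid
from contextlib import contextmanager

# Hypothetical in-memory stand-in for an MLflow tracking store.
RUNS = {}

@contextmanager
def start_run():
    run_id = uuid.uuid4().hex
    run = {"run_id": run_id, "params": {}, "metrics": {}}
    RUNS[run_id] = run
    yield run

def log_param(run, key, value):
    run["params"][key] = str(value)    # MLflow stores params as strings

def log_metric(run, key, value):
    run["metrics"][key] = float(value)  # metrics are numeric

with start_run() as run:
    log_param(run, "n", 2)
    log_param(run, "learn_rate", 0.01)
    log_metric(run, "score", 0.93)

print(run["params"])  # {'n': '2', 'learn_rate': '0.01'}
```

The real tracking server persists the same shape of data to a backend store, which is what the UI queries.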
MLflow Tracking

[Diagram: Notebooks, local apps, and cloud/Spark jobs log to a Tracking Server through the Python, Java, R, or REST API; the server stores parameters, metrics, and metadata in a backend store and artifacts/models in an artifact store, all browsable through the UI or API]

$ export MLFLOW_TRACKING_URI=<URI>
mlflow.set_tracking_uri(URI)
MLflow Projects: Motivation

Challenge: ML results are difficult to reproduce across a diverse set of tools and a diverse set of environments.

Projects: Package data science code in a format that enables reproducible runs on any platform.
MLflow Projects

[Diagram: A Project Spec bundles code, config, dependencies, and data, and supports both local and remote execution]
1. Example MLflow Project File

my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py

# MLproject
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      training_data: path
      lambda: {type: float, default: 0.1}
    command: python main.py {training_data} {lambda}

$ mlflow run git://<my_project>.git -P lambda=0.2
mlflow.run("git://<my_project>", parameters={..})
...
mlflow run . -e main -P lambda=0.2
2. Example conda.yaml

my_project/
├── MLproject
├── conda.yaml
├── main.py
└── model.py

# conda.yaml
name: mlflow-env
channels:
  - defaults
dependencies:
  - python=3.7.3
  - scikit-learn=0.20.3
  - pip:
    - mlflow
    - cloudpickle==0.8.0
MLflow Projects

Packaging format for reproducible ML runs
• Any code folder or GitHub repository
• MLproject file with project configuration

Defines dependencies for reproducibility
• Conda (+ R, Docker, …) dependencies can be specified in MLproject
• Reproducible in (almost) any environment

Execution API for running projects
§ CLI / Python / R / Java
§ Supports local and remote execution; the project URI is a directory path or Git URL containing an MLproject file
▪ mlflow run --help (CLI)
▪ mlflow run https://github.com/dmatrix/jsd-mlflow-examples.git#keras/imdbclassifier (CLI)
▪ mlflow.run(<project_uri>, parameters={}) or mlflow.projects.run(<project_uri>, parameters={}) (API)
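Under the hood, an entry-point command such as `python main.py {training_data} {lambda}` is a template resolved against the declared parameters, their defaults, and any `-P` overrides. A minimal illustrative sketch of that resolution (hypothetical helper, not MLflow's source):

```python
# Entry-point definition, mirroring the MLproject example above.
entry_point = {
    "parameters": {
        "training_data": {"type": "path"},              # required (no default)
        "lambda": {"type": "float", "default": 0.1},    # optional
    },
    "command": "python main.py {training_data} {lambda}",
}

def resolve_command(entry_point, user_params):
    """Fill the command template from -P overrides, then defaults."""
    values = {}
    for name, spec in entry_point["parameters"].items():
        if name in user_params:
            values[name] = user_params[name]
        elif "default" in spec:
            values[name] = spec["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
    return entry_point["command"].format(**values)

cmd = resolve_command(entry_point, {"training_data": "data.csv", "lambda": 0.2})
print(cmd)  # python main.py data.csv 0.2
```

This is why `mlflow run git://<my_project>.git -P lambda=0.2` works without specifying `training_data`-style defaults on the command line when the MLproject file provides them.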
Anatomy of MLflow Project Execution

1. $ mlflow run https://github.com/mlflow-project-example-1
2. Fetch the GitHub project into a /var/folders/xxx directory
3. Create a conda env (mlflow-<run_id>) and activate it
4. Install packages and dependencies from conda.yaml
5. In the activated conda environment, execute your entry point:
   python train.py args, …, args
How to Build an MLflow Project

1. Create an MLproject file; populate it with entry points, parameter types, and default parameters
2. Create a conda.yaml file; populate it with dependencies (or copy from your MLflow UI: artifacts -> Model -> conda.yaml)
3. Create a GitHub repository; populate or upload MLproject, conda.yaml, data, src files, etc.
4. Test it:
   • mlflow run git://URI -P arg.. -P args
   • mlflow.run(URI, parameters={})
   Then share it …
MLflow Project: Create a Multi-Step Workflow

https://github.com/mlflow/mlflow/tree/master/examples/multistep_workflow
MLflow Models: Motivation

[Diagram: Without a standard model format, supporting every ML framework in every serving tool requires an N×M combination of inference code for batch & stream scoring]
MLflow Models

[Diagram: A standard model format with multiple "flavors" sits between ML frameworks and serving tools, so inference code and batch & stream scoring target one format instead of N×M combinations]
Example MLflow Model

mlflow.tensorflow.log_model(...)

my_model/
├── MLmodel
└── estimator/
    ├── saved_model.pb
    └── variables/
        ...

# MLmodel
run_id: 769915006efd4c4bbd662461
time_created: 2018-06-28T12:34
flavors:
  tensorflow:                         # usable by tools that understand the TensorFlow model format
    saved_model_dir: estimator
    signature_def_key: predict
  python_function:                    # usable by any tool that can run Python (Docker, Spark, etc.!)
    loader_module: mlflow.tensorflow
Model Flavor Example: Keras

mlflow.keras.log_model(…)   # train a model, then log it

Flavor 1: pyfunc
model = mlflow.pyfunc.load_model(…)
model.predict(pandas_input_dataframe)

Flavor 2: Keras
model = mlflow.keras.load_model(…)
model.predict(keras.Input(…))
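The flavor mechanism is essentially a dispatch table: the MLmodel file maps each flavor name to loader configuration, and a tool loads the model through whichever flavor it supports. A minimal pure-Python sketch of the idea (hypothetical names, toy "model", no MLflow dependency):

```python
# Illustrative sketch of flavor dispatch (not MLflow's implementation).
mlmodel = {
    "flavors": {
        "keras": {"loader": "keras_loader"},
        "python_function": {"loader": "pyfunc_loader"},
    }
}

def pyfunc_loader(path):
    # Generic flavor: return a predict(rows) callable.
    # The "model" here is a toy that sums each input row.
    return lambda rows: [sum(r) for r in rows]

LOADERS = {"pyfunc_loader": pyfunc_loader}

def load_model(mlmodel, flavor, path="my_model/"):
    cfg = mlmodel["flavors"][flavor]
    return LOADERS[cfg["loader"]](path)

predict = load_model(mlmodel, "python_function")
print(predict([[1, 2], [3, 4]]))  # [3, 7]
```

A serving tool that only knows `python_function` and a Keras-aware tool can thus load the very same saved model, each through its own flavor entry.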
Model Flavors Example

model = mlflow.pyfunc.load_model(model_uri)

model.predict(pandas_input_dataframe)
MLflow Models

Packaging format for ML models
• Any directory with an MLmodel file

Defines dependencies for reproducibility
• Conda environment can be specified in the MLmodel configuration

Model creation and loading utilities
• mlflow.<model_flavor>.save_model(…) or log_model(…)
• mlflow.<model_flavor>.load_model(…)

Deployment APIs
• CLI / Python / R / Java
• mlflow models [OPTIONS] COMMAND [ARGS]...
• mlflow models serve [OPTIONS] [ARGS] …
• mlflow models predict [OPTIONS] [ARGS] ...
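A model started with `mlflow models serve` is scored over HTTP by POSTing JSON to its scoring endpoint. The exact JSON schema varies with the MLflow version; the sketch below builds one commonly documented shape (the `dataframe_split` wrapper), with hypothetical column names and URL:

```python
import json

# Build a scoring request body for a served model's /invocations endpoint.
# Column names, values, and the URL below are illustrative assumptions.
payload = {
    "dataframe_split": {
        "columns": ["sepal_len", "sepal_wid"],
        "data": [[5.1, 3.5], [6.2, 2.9]],
    }
}
body = json.dumps(payload)

# A client would POST this with Content-Type: application/json, e.g.:
#   curl -X POST http://127.0.0.1:5000/invocations \
#        -H "Content-Type: application/json" -d "$body"
decoded = json.loads(body)
print(decoded["dataframe_split"]["data"][0])  # [5.1, 3.5]
```

Because the request targets the generic `python_function` flavor, the same payload shape works regardless of which framework trained the underlying model.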
MLflow Projects & Models Tutorials

Tutorials: https://github.com/dmatrix/mlflow-workshop-part-2

MLflow Project Keras Example:
https://github.com/dmatrix/mlflow-workshop-project-expamle-1
Learning More About MLflow

§ pip install mlflow to get started
§ Find docs & examples at mlflow.org
§ Peruse code on the MLflow GitHub
§ Join the Slack channel
§ More MLflow tutorials
Thank you! 😊
Q&A
[email protected]
@2twitme
https://www.linkedin.com/in/dmatrix/
