
TensorFlow Extended Part 2 - Model Build - Analysis - and - Serving

The document provides an overview of TensorFlow Extended (TFX) for building machine learning pipelines. It discusses the goals and components of TFX including data validation, transformation, training models with TensorFlow Estimator, model analysis and serving models. It also covers TensorFlow and building machine learning models with Estimator.

Uploaded by

ku.madan05
Copyright
© © All Rights Reserved

TensorFlow Extended Part 2
Model Build, Analysis & Serving
Armen Donigian
Who am I?
● Computer Science Undergrad degree @UCLA
● Computer Science Grad degree @USC
● 15+ years experience as Software & Data Engineer
● Computer Science Instructor
● Mentor @Udacity Deep Learning Nanodegree
● Real-time wagering algorithms @GamePlayerNetwork
● Differential GPS corrections @Jet Propulsion Laboratory, landing sequence for
Mars Curiosity
● Years of experience in design, implementation & productionalization of
machine learning models for several FinTech underwriting businesses
● Currently, head of Personalization & Discovery
● Available for Consulting ([email protected])
Goals, Breadth vs Depth...
Goal: Provide context of the requirements, tools & methodologies involved with
developing a production grade machine learning pipeline.

Slides will provide you with breadth.

Notebooks will provide you with depth (i.e. implementation details).

(Figure: Generalist vs. Specialist)
Lesson Roadmap

● Overview of TFX: What problems it can help you solve (30 mins)
a. What is TFX & Why Should You Care?
b. What can you leverage? TFX Ecosystem
c. Which problems can TFX help you solve?
d. TFX Components

10 min Break

● TensorFlow Estimator Overview (35 mins)
a. What is TensorFlow & Why Should You Care?
b. What is TF Estimator?
c. How to train a model using TF Estimator?
d. Dataset Overview
e. TF Estimator notebook demo

10 min Break

● TensorFlow Model Analysis Overview (40 mins)
a. What is it & why should you care?
b. TFMA API Overview
c. TFMA Usage
d. TFMA notebook demo

10 min Break

● TensorFlow Serving (45 mins)
a. What is it & why should you care?
b. TF Serving Intro
c. TF Serving w/ Docker notebook demo
i. CPU / GPU / TPU
d. TF Model Server REST API
TensorFlow Extended
Overview
TensorFlow Extended (TFX)
TFX is…

● A general purpose machine learning platform implemented @Google
● A set of components that can be glued together into one platform, simplifying the development of
end to end ML pipelines.
● An open source solution to reduce the time to production from months to
weeks while minimizing custom, fragile solutions filled with tech debt.
● Used by Google to create & deploy their machine learning models.
Why Should You Care?
What you first think? VS... Real World ML Use Cases

Takeaway: Doing machine learning in the real world is HARD!

Building custom solutions is expensive, duplicative, fragile & leads to tech debt.

Hidden Technical Debt in Machine Learning Systems

What Can I Leverage: TFX Ecosystem
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization

Shared Configuration Framework & Job Orchestration

Tuner

Data Ingestion, Data Analysis, Data Transformation, Data Validation, Trainer / Estimator, Model Evaluation & Validation, Serving, Logging

Shared Utilities for Garbage Collection, Data Access Controls

Pipeline Storage

Machine Learning Platform Overview


Legend: Open Source / Not Public Yet

Link to TFX paper


Train / Serving Data Flows
References: A Data Science Workflow, Glossary of ML terms, Diagram Reference

Training flow (all sessions in a day, large size, high throughput):
1. Input Data for Training
2. Prepare Data
3. Training Data
4. Build Model
5. Evaluate Model
6. Save Model

Serving flow (current session, small size, low latency):
7. Load Model
8. Serve Model
9. Serving Data
10. Prepare Data
11. Serving Data
TFX Pipeline
TF Data Validation TF Transform TF Estimator TF Model Analysis TF Serving

Build Model Have a Model

source
Architecture Overview
TFX pipelines can be orchestrated using Apache Airflow and Kubeflow Pipelines.
For this workshop, we will be running in interactive mode.

source
What is a TFX Component?
● TFX pipelines are a series of components
● Components are organized into DAGs
● Executor is where you insert your work
● Driver feeds data to Executor
● Publisher writes to ml.metadata
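
The driver / executor / publisher split can be sketched in plain Python. This is only an illustration of the pattern, not TFX code: `MetadataStore`, `Component` and the URIs below are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataStore:
    """Hypothetical in-memory stand-in for the ML Metadata store."""
    records: List[Dict[str, str]] = field(default_factory=list)


class Component:
    """Toy component that runs driver -> executor -> publisher in order."""

    def __init__(self, name: str, executor: Callable[[str], str]):
        self.name = name
        self.executor = executor  # the user-supplied work

    def run(self, input_uri: str, store: MetadataStore) -> str:
        # Driver: decide what the executor should read (trivial here).
        resolved_input = input_uri
        # Executor: do the actual work and return an output location.
        output_uri = self.executor(resolved_input)
        # Publisher: record what happened in the metadata store.
        store.records.append(
            {"component": self.name, "input": resolved_input, "output": output_uri}
        )
        return output_uri


store = MetadataStore()
stats = Component("StatisticsGen", lambda uri: uri + "/stats")
result = stats.run("/data/examples", store)
print(result)
print(store.records)
```

Note that only locations are recorded, which mirrors the idea that the metadata store tracks where data objects live rather than the data itself.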

source
What is a TFX Component?
As data flows through pipelines…

● Components read data coming from the
Metadata store and ...
● Write data to components further in the
pipeline...
● Except when at start and end
● Orchestrators (like Kubeflow or Airflow)
help you manage triggering of tasks and
monitor components
● ML Metadata store is an RDBMS
containing...
○ Trained & re-trained Models
○ Data we trained with
○ Evaluation results
○ Location of data objects (not data)
○ Execution history records for every
component
○ Data provenance of intermediate
outputs

source
TFX Pipeline
TF Data Validation TF Transform TF Estimator TF Model Analysis TF Serving

Build Model Have a Model

Components API (docs)

● ExampleGen ingests and splits the input dataset.
● StatisticsGen calculates statistics for the dataset.
● SchemaGen examines the statistics and creates a data schema.
● ExampleValidator looks for anomalies and missing values in the dataset.
● Transform performs feature engineering on the dataset.
● Trainer trains the model using TensorFlow Estimators.
● Evaluator performs deep analysis of the training results.
● ModelValidator ensures that the model is "good enough" to be pushed to production.
● Pusher deploys the model to a serving infrastructure.
● TensorFlow Serving for serving.
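
Since components are organized into a DAG, one way to picture the run order is a topological sort. The sketch below uses only the Python standard library (not the TFX API). The dependency edges are assumptions based on the inputs listed in these slides, and the Transform inputs in particular are a guess.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each component maps to the set of components it depends on (assumed edges).
dependencies = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform", "SchemaGen"},
    "Evaluator": {"ExampleGen", "Trainer"},
    "ModelValidator": {"ExampleGen", "Trainer"},
    "Pusher": {"ModelValidator"},
}

# static_order() yields every component after all of its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Several valid orders exist (for example, Evaluator and ModelValidator could run in either order), which is what lets an orchestrator schedule independent components in parallel.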
Installation Notes
Python 3.x support is now available for Apache Beam

source
TensorFlow
Model Build
Dataset
Bucket Features Dense Float Features Vocab Features Categorical Features

pickup_hour trip_distance

pickup_month passenger_count

pickup_day_of_week tip_amount

dropoff_month

dropoff_hour

dropoff_day_of_week

Transformations: bucketize, scale_to_z_score

Target to predict: fare_amount

New York Yellow Cab dataset available via BigQuery public datasets
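
To make the two transformations concrete, here is a rough standard-library sketch of what bucketize and scale_to_z_score compute. It is not the tf.Transform implementation: the quantile boundaries come from `statistics.quantiles`, the population standard deviation is assumed, and the sample values are made up.

```python
import bisect
import statistics


def scale_to_z_score(values):
    """Shift to mean 0 and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def bucketize(values, num_buckets):
    """Assign each value to a quantile bucket index in [0, num_buckets)."""
    boundaries = statistics.quantiles(values, n=num_buckets)  # num_buckets - 1 cut points
    return [bisect.bisect_right(boundaries, v) for v in values]


trip_distance = [0.5, 1.2, 2.0, 3.5, 10.0]  # made-up sample values
pickup_hour = [0, 3, 7, 8, 12, 17, 18, 23]  # made-up sample values

scaled = scale_to_z_score(trip_distance)
buckets = bucketize(pickup_hour, num_buckets=4)
print(scaled)
print(buckets)
```

In a real pipeline these statistics are computed over the full training dataset and then reused unchanged at serving time, which is the main reason to express them in a transform graph rather than ad hoc code.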
What is TF & Why Should You Care?
TensorFlow is an open source, high performance library which uses directed graphs (see
overview)

● Dataflow Model (link to paper)
○ Nodes represent math operations, Edges represent arrays of data
○ tf.math.add is represented as...
■ a single node w/ 2 input edges (matrices to be added)
■ 1 output edge (result of addition)
● Flexible
○ Works w/ image, audio, text and numerical data
● Parallelism
○ The dataflow graph represents dependencies between operations, so TF can figure out which
ops can execute in parallel
● Distributed Execution
○ TF partitions your program across multiple devices (CPUs, GPUs, TPUs
attached to different machines)
○ TF takes care of networking between machines
● Compilation
○ Benefit from compiler optimizations for dataflow graph using XLA
● Portability
○ Train model in Python, export SavedModel, serve in C++
source
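
The dataflow idea (nodes are operations, edges carry data) can be illustrated without TensorFlow. The toy evaluator below is only a sketch under simplifying assumptions: `Node` and `evaluate` are hypothetical names, and scalars stand in for the matrices a real graph would carry.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Node:
    op: Callable[..., float]      # the math operation at this node
    inputs: Tuple[str, ...] = ()  # names of the nodes feeding its input edges


def evaluate(graph: Dict[str, Node], name: str, cache: Dict[str, float]) -> float:
    """Evaluate a node after evaluating everything it depends on."""
    if name not in cache:
        node = graph[name]
        args = [evaluate(graph, dep, cache) for dep in node.inputs]
        cache[name] = node.op(*args)
    return cache[name]


graph = {
    "a": Node(op=lambda: 2.0),
    "b": Node(op=lambda: 3.0),
    "c": Node(op=lambda: 4.0),
    # Like tf.math.add: one node, two input edges, one output edge.
    "add": Node(op=lambda x, y: x + y, inputs=("a", "b")),
    "mul": Node(op=lambda x, y: x * y, inputs=("add", "c")),
}

result = evaluate(graph, "mul", cache={})
print(result)
```

Because "a", "b" and "c" have no edges between them, nothing forces an order among them. That independence is the information a runtime can use to execute ops in parallel or place them on different devices.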
TensorFlow FeatureColumns Part 1

source
TensorFlow FeatureColumns Part 2

source
TensorFlow FeatureColumns Part 3

source
What is TF Estimator & Why Should You Care?
TF Estimator is a high level OOP API which makes it easier to train models
(see overview)

● TF Estimator is compatible with the scikit-learn API
● Train models using CPU / GPU / TPUs
● Quicker model (graph) development
● Load large amounts of data
● Model checkpointing & recovery from failures
● Train / Evaluate / Monitor
● Distributed Training
● Save summaries for TensorBoard
● Hyper-parameter tuning using ML Engine
● Serving predictions from a trained model
● Easily create Estimators from Keras models
● How to create custom estimators
● Need to implement preprocessing_fn & _build_estimator methods!
source
Trainer TFX Pipeline Component
Notes:

● To train the model
○ Input: Transform graph, schema from SchemaGen, Code (model training code)
○ Output: Two different SavedModels (one for production inference, the other for evaluation)

source
preprocessing_fn(...)

source
_build_estimator(...)

source
TF Model Build Knowledge Check

Q: Which of the following is a supported feature column method in TF Estimator?

a) tf.feature_column.numeric_column()
b) tf.feature_column.categorical_column_with_vocabulary_list()
c) tf.feature_column.categorical_column_with_identity()
d) tf.feature_column.indicator_column()
e) All of the above
TensorFlow
Model Analysis
What is TF Model Analysis & Why Should You Care?
TF Model Analysis is a library for evaluating TF models.

Benefits include…

● Allows you to evaluate models on large amounts of data
● You can choose which metric & what slice/segment of your data to evaluate
model predictions on
○ This helps you find slices of data for a given feature where the model performs poorly
○ Great model debugging tool
● Track performance over time
○ Trends of different models over time
○ As you get new data
● User friendly visualization tool

Diagram: Trained Model (TF Estimator) + Input Data (EvalSet & Slice) → Evaluator (TFMA) → Results
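
The value of slicing is easiest to see with numbers. The sketch below computes one metric overall and per slice in plain Python. It does not use the TFMA API, the rows are made up, and `mean_absolute_error_by_slice` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up evaluation rows: a label, a model prediction, and a feature to slice on.
rows = [
    {"pickup_hour": 8, "fare_amount": 10.0, "prediction": 9.0},
    {"pickup_hour": 8, "fare_amount": 12.0, "prediction": 13.0},
    {"pickup_hour": 17, "fare_amount": 20.0, "prediction": 12.0},
    {"pickup_hour": 17, "fare_amount": 30.0, "prediction": 22.0},
]


def mean_absolute_error_by_slice(rows, slice_key):
    """Group absolute errors by the value of slice_key and average each group."""
    errors = defaultdict(list)
    for row in rows:
        errors[row[slice_key]].append(abs(row["fare_amount"] - row["prediction"]))
    return {key: sum(errs) / len(errs) for key, errs in errors.items()}


overall_mae = sum(abs(r["fare_amount"] - r["prediction"]) for r in rows) / len(rows)
sliced_mae = mean_absolute_error_by_slice(rows, "pickup_hour")
print(overall_mae)
print(sliced_mae)
```

Here the overall error hides the fact that the hour-17 slice is far worse than the hour-8 slice, which is the kind of problem sliced evaluation is meant to surface.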
Evaluator TFX Pipeline Component
Notes:

● To evaluate overall and individual data slices


○ Input: ExampleGen, Trainer
○ Output: Evaluation Metrics
○ Helps identify individual points where model performs poorly

source
ModelValidator TFX Pipeline Component
Notes:

● Is the new model better or worse than what we have in production?


○ Input: ExampleGen, Trainer
○ Output: Validation Outcome

source
Pusher TFX Pipeline Component
Notes:

● If model validation passes, push to production


○ Input: Model Validator
○ Output: Deployment options
■ TF Lite
■ TF JS
■ TF Serving

source
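
The validate-then-push gate described on the ModelValidator and Pusher slides can be sketched as follows. This is a hypothetical illustration, not TFX code: the metric name, the "lower is better" rule and the `push_fn` callback are all assumptions.

```python
def validate_model(candidate_metrics, production_metrics, metric="mean_absolute_error"):
    """Return True when the candidate's error is no worse than production's."""
    return candidate_metrics[metric] <= production_metrics[metric]


def maybe_push(candidate_metrics, production_metrics, push_fn):
    """Call push_fn only if validation passes; report whether a push happened."""
    if validate_model(candidate_metrics, production_metrics):
        push_fn()
        return True
    return False


pushed_models = []  # stands in for a serving target such as TF Serving

better = maybe_push(
    {"mean_absolute_error": 3.9},
    {"mean_absolute_error": 4.5},
    push_fn=lambda: pushed_models.append("candidate-1"),
)
worse = maybe_push(
    {"mean_absolute_error": 5.2},
    {"mean_absolute_error": 4.5},
    push_fn=lambda: pushed_models.append("candidate-2"),
)
print(better, worse, pushed_models)
```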
TensorFlow Model Analysis API
Feature Engineering @ Scale Transformations

● tfma.ExtractEvaluateAndWriteResults(): Evaluate & persist results.
● tfma.load_eval_result(): Creates an EvalResult object for use with the visualization functions.
● tfma.multiple_data_analysis(): Run model analysis for a single model on multiple data sets.
● tfma.multiple_model_analysis(): Run model analysis for multiple models on the same data set.
● tfma.run_model_analysis(): Runs TensorFlow model analysis.



Define Feature Slices for TFMA
Number of rows per hour of day!
Note distribution of mean prediction
Note average loss anomaly!
Feature Cross Analysis
TF Model Analysis Knowledge Check

Q: TFMA is only useful if you’re building a model using TensorFlow?

a) True
b) False
TensorFlow
Serving
What is TF Serving & Why Should You Care?
Requirements of a Model Serving System...

1. Low latency
a. Isolation of load & serve threads
2. Efficient
a. Dynamic request batching
3. Scale Horizontally
4. Reliable & Robust
5. Support loading/hosting multiple model versions dynamically
a. Serve one model, while sending canary requests to new model
b. Built in A/B testing
6. Deployment roll forward / backward
7. Serves over 1,500 models @Google, 100 predictions/sec

Dockerfile(s) maintained by Google


● Dockerfile, VM w/ TensorFlow Serving
● Dockerfile.gpu, VM w/ TensorFlow Serving (GPU support to be used with nvidia-docker)
● Dockerfile.devel, VM w/ all dependencies needed to build TensorFlow Serving
● Dockerfile.devel-gpu, VM w/ all dependencies needed to build TensorFlow Serving w/ GPU
support.
Test Drive it Yourself...
TF Serving Out of the Box (w/ Docker)
SavedModel Artifacts

After training, we have a trained SavedModel (universal format)

● Learned variable weights
● Graph
● Embeddings & Vocabs
● Inferred Schema
● Transformed features
TF Serving
(w/ Docker)
TF Serving Inference
TF Serving ModelServer REST API
First, you'll need to install TF Model Server...
apt-get install tensorflow-model-server

POST http://host:port/<URI>:<VERB>
URI - /v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]
VERBS - classify | regress | predict

Classify Format:
POST http://host:port/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]:classify

Classify Example:
POST http://host:port/v1/models/iris/versions/1:classify

Predict Format:
POST http://host:port/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]:predict

Predict Example:
POST http://host:port/v1/models/mnist/versions/1:predict
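
A predict call can be assembled from Python with only the standard library. This is a minimal sketch under assumptions not stated on the slide: the server listens for REST on localhost:8501, the request body uses the `{"instances": [...]}` row format, and `build_predict_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_predict_request(host, port, model_name, instances, version=None):
    """Build a POST request for the :predict verb of the ModelServer REST API."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        path += f"/versions/{version}"  # optional, pins a specific model version
    url = f"http://{host}:{port}{path}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request("localhost", 8501, "mnist", [[0.0, 1.0]], version=1)
print(request.full_url)

# To actually send it (requires a running model server):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The shape of each instance must match the serving signature of the exported model, so the `[[0.0, 1.0]]` payload here is only a placeholder.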

End-2-End example
Next Steps:
● Work through TFMA & TFServing Notebooks
