Dokumen - Tips - Mmds 2014 Talk Distributing ML Algorithms From Gpus To The Cloud

This talk covers distributing large-scale machine learning algorithms, from GPUs to the cloud. It opens with Netflix's interest in high-quality recommendations and in improving rating-prediction accuracy, then outlines the scale of Netflix's data and the types of models in use. It describes three levels at which algorithms can be distributed or parallelized: by population subset, by hyperparameter combination, and by subset of the training data. Finally, it walks through an example of training artificial neural networks over GPUs and AWS, addressing data distribution, computation distribution, and latency requirements.

Distributing Large-scale ML Algorithms: from GPUs to the Cloud

MMDS 2014
June, 2014

Xavier Amatriain
Director - Algorithms Engineering @xamat
Outline
■ Introduction
■ Emmy-winning Algorithms
■ Distributing ML Algorithms in Practice
■ An example: ANN over GPUs & AWS Cloud
What we were interested in:
■ High quality recommendations
Proxy question:
■ Accuracy in predicted rating
■ Improve by 10% = $1million!
Data size:
■ 100M ratings (back then “almost massive”)
2006 2014
Netflix Scale
▪ > 44M members
▪ > 40 countries
▪ > 1000 device types
▪ > 5B hours in Q3 2013
▪ Plays: > 50M/day
▪ Searches: > 3M/day
▪ Ratings: > 5M/day
▪ Log 100B events/day
▪ 31.62% of peak US downstream traffic
Smart Models
■ Regression models (Logistic, Linear, Elastic nets)
■ GBDT/RF
■ SVD & other MF models
■ Factorization Machines
■ Restricted Boltzmann Machines
■ Markov Chains & other graphical models
■ Clustering (from k-means to modern non-parametric models)
■ Deep ANN
■ LDA
■ Association Rules
■ …
“Emmy Winning” Netflix Algorithms
Rating Prediction
2007 Progress Prize
▪ Top 2 algorithms
▪ MF/SVD - Prize RMSE: 0.8914
▪ RBM - Prize RMSE: 0.8990
▪ Linear blend - Prize RMSE: 0.88
▪ Currently in use as part of Netflix's rating prediction component
▪ Limitations
▪ Designed for 100M ratings, we have 5B ratings
▪ Not adaptable as users add ratings
▪ Performance issues
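The "linear blend" bullet above can be illustrated with a small sketch. It is not the Progress Prize code: the two "models" are synthetic noisy copies of made-up ratings, and the helper names (`rmse`, `fit_blend_weight`, `blend`) are hypothetical. It only shows why a least-squares blend of two predictors cannot score a worse RMSE than either one on the data it was fitted on.

```python
import math
import random

def rmse(preds, truth):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truth)) / len(truth))

def fit_blend_weight(a, b, truth):
    """Least-squares weight w for the blend w*a + (1 - w)*b."""
    # Minimise sum((b + w*(a - b) - t)^2) over w; one parameter, closed form.
    num = sum((ai - bi) * (ti - bi) for ai, bi, ti in zip(a, b, truth))
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return num / den if den else 0.5

def blend(a, b, w):
    return [w * ai + (1 - w) * bi for ai, bi in zip(a, b)]

# Synthetic stand-ins (assumption): two predictors with independent errors.
random.seed(0)
truth = [random.uniform(1, 5) for _ in range(2000)]
model_a = [t + random.gauss(0, 0.9) for t in truth]
model_b = [t + random.gauss(0, 0.9) for t in truth]

w = fit_blend_weight(model_a, model_b, truth)
blended = blend(model_a, model_b, w)
print(rmse(model_a, truth), rmse(model_b, truth), rmse(blended, truth))
```

In a real setting the weight would be fitted on held-out data rather than on the ratings used for evaluation; fitting and scoring on the same set is a simplification made here for brevity.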
Ranking
Page composition
Similarity
Search Recommendations
Postplay
Gamification
Distributing ML algorithms in practice
1. Do I need all that data?

2. At what level should I distribute/parallelize?

3. What latency can I afford?

Do I need all that data?
Really?

Anand Rajaraman: Former Stanford Prof. & Senior VP at Walmart
Sometimes, it’s not about more data
[Banko and Brill, 2001]
Norvig: “Google does not have better Algorithms, only more Data”
Many features/low-bias models
Sometimes, it’s not about more data
At what level should I parallelize?
The three levels of Distribution/Parallelization

1. For each subset of the population (e.g. region)
2. For each combination of the hyperparameters
3. For each subset of the training data

Each level has different requirements

Level 1 Distribution
■ We may have subsets of the population for which we need to train an independently optimized model.
■ Training can be fully distributed, requiring no coordination or data communication
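A minimal sketch of Level 1, assuming a toy one-feature least-squares model and two made-up regions ("us", "eu"). The function names are hypothetical, and a thread pool stands in for what would in practice be separate machines. The point is only that each task reads its own region's data and nothing else.

```python
from concurrent.futures import ThreadPoolExecutor

def train_region_model(examples):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def train_all_regions(data_by_region, max_workers=4):
    # Each task touches only its own region's data: no shared state,
    # no messages between tasks, so the work is embarrassingly parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            region: pool.submit(train_region_model, examples)
            for region, examples in data_by_region.items()
        }
        return {region: f.result() for region, f in futures.items()}

# Made-up data: each region follows a different linear relationship.
data_by_region = {
    "us": [(x, 2.0 * x + 1.0) for x in range(10)],
    "eu": [(x, -0.5 * x + 4.0) for x in range(10)],
}
models = train_all_regions(data_by_region)
print(models)
```

Because the tasks never communicate, the same structure carries over unchanged when the pool is replaced by one job per machine or per cloud region.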
Level 2 Distribution
■ For a given subset of the population we need to find the “optimal” model
■ Train several models with different hyperparameter values
■ Worst-case: grid search
■ Can do much better than this (e.g. Bayesian Optimization with Gaussian Process Priors)
■ This process *does* require coordination
■ Need to decide on next “step”
■ Need to gather final optimal result
■ Requires data distribution, not sharing
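The coordination pattern for Level 2 can be sketched as follows. This is not Bayesian optimization: a simple "zooming" grid search is swapped in to keep the example short. The toy ridge model, the data, and the function names are assumptions. What it shows is the loop the slide describes: workers evaluate candidates independently on their own copy of the data, and a coordinator gathers the results and decides the next step.

```python
from concurrent.futures import ThreadPoolExecutor

def validation_loss(lam, train, valid):
    """Train 1-D ridge regression (no intercept) with penalty lam; return validation MSE."""
    w = sum(x * y for x, y in train) / (sum(x * x for x, _ in train) + lam)
    return sum((w * x - y) ** 2 for x, y in valid) / len(valid)

def coordinate_search(train, valid, low, high, rounds=3, per_round=5):
    """Coordinator: propose candidates, gather losses, narrow the range, repeat."""
    best_lam, best_loss = None, float("inf")
    with ThreadPoolExecutor(max_workers=per_round) as pool:
        for _ in range(rounds):
            step = (high - low) / (per_round - 1)
            candidates = [low + i * step for i in range(per_round)]
            # Workers run independently; each needs the data but shares nothing.
            losses = list(pool.map(lambda lam: validation_loss(lam, train, valid), candidates))
            # Gather results and keep the best candidate seen so far.
            for lam, loss in zip(candidates, losses):
                if loss < best_loss:
                    best_lam, best_loss = lam, loss
            # Decide the next "step": zoom in around the current best value.
            low, high = max(0.0, best_lam - step), best_lam + step
    return best_lam, best_loss

# Made-up data: noisy training targets, clean validation targets (y = 2x).
train = [(1, 2.2), (2, 4.1), (3, 6.3)]
valid = [(1, 2.0), (2, 4.0), (3, 6.0)]
best_lam, best_loss = coordinate_search(train, valid, low=0.0, high=4.0)
print(best_lam, best_loss)
```

A model-based optimizer would replace only the "decide the next step" lines; the propose, evaluate, and gather structure stays the same.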
Level 3 Distribution
■ For each combination of hyperparameters, model training may still be expensive
■ Process requires coordination and data sharing/communication
■ Can distribute computation over machines splitting examples or parameters (e.g. ADMM)
■ Or parallelize on a single multicore machine (e.g. Hogwild)
■ Or… use GPUs
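To make "splitting examples" concrete, here is a minimal sketch of synchronous data-parallel gradient descent. It is neither ADMM nor Hogwild, which the slide names; it is the simplest scheme with the same communication shape. The toy linear model, the shard layout, and the function names are assumptions, and threads stand in for machines. Each worker computes a gradient on its shard, and a coordinator sums the gradients and updates the shared parameters.

```python
from concurrent.futures import ThreadPoolExecutor

def shard_gradient(params, shard):
    """Sum of squared-error gradients for y ~ w*x + b over one shard."""
    w, b = params
    gw = gb = 0.0
    for x, y in shard:
        err = w * x + b - y
        gw += 2 * err * x
        gb += 2 * err
    return gw, gb, len(shard)

def data_parallel_gd(shards, lr=0.5, steps=300):
    params = (0.0, 0.0)
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        for _ in range(steps):
            # Broadcast current parameters; each worker sees only its own shard.
            results = list(pool.map(lambda s: shard_gradient(params, s), shards))
            # Communication step: aggregate the partial gradients.
            n = sum(r[2] for r in results)
            gw = sum(r[0] for r in results) / n
            gb = sum(r[1] for r in results) / n
            params = (params[0] - lr * gw, params[1] - lr * gb)
    return params

# Made-up data following y = 3x + 1, split round-robin into 4 shards.
data = [(i / 20, 3.0 * (i / 20) + 1.0) for i in range(20)]
shards = [data[i::4] for i in range(4)]
w_fit, b_fit = data_parallel_gd(shards)
print(w_fit, b_fit)
```

Every step requires a round of communication, which is why this level has the heaviest coordination requirements of the three. Hogwild-style schemes drop the synchronization, and a GPU implementation keeps the same arithmetic but runs the per-example work on device threads.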
ANN Training over GPUs and AWS
■ Level 1 distribution: machines over different AWS regions
■ Level 2 distribution: machines in AWS and same AWS region
■ Use coordination tools
■ Spearmint or similar for parameter optimization
■ Condor, StarCluster, Mesos… for distributed cluster coordination
■ Level 3 parallelization: highly optimized parallel CUDA code on GPUs
What latency can I afford?
3 shades of latency
▪ Blueprint for multiple algorithm services
▪ Ranking
▪ Row selection
▪ Ratings
▪ Search
▪ …
▪ Multi-layered Machine Learning
Matrix Factorization Example
Xavier Amatriain (@xamat)

Thanks!
(and yes, we are hiring)
