Large-Scale Deep Learning with TensorFlow
Jeff Dean
Google Brain team
g.co/brain
In collaboration with many other people at Google
What is the Google Brain Team?
g.co/brain
We Disseminate Our Work in Many Ways
● By publishing our work
○ See papers at research.google.com/pubs/BrainTeam.html
Document 1
… car parking available for a small fee.
… parts of our floor model inventory for sale.
Document 2
Selling all kinds of automobile and pickup truck parts,
engines, and transmissions.
Example Needs of the Future
● Which of these eye images shows symptoms of diabetic
retinopathy?
● Find me all rooftops in North America
● Describe this video in Spanish
● Find me all documents relevant to reinforcement learning for
robotics and summarize them in German
● Find a free time for everyone in the Smart Calendar project
to meet and set up a videoconference
● Robot, please fetch me a cup of tea from the snack kitchen
Growing Use of Deep Learning at Google
[Chart: growth in the # of directories containing model description files]
Across many products/areas:
Android
Apps
drug discovery
Gmail
Image understanding
Maps
Natural language
understanding
Photos
Robotics research
Speech
Translation
YouTube
… many others ...
Important Property of Neural Networks
Results get better with
more data +
bigger models +
more computation
● Core in C++
○ Very low overhead
● Different front ends for specifying/driving the computation
○ Python and C++ today, easy to add more
Computation is a dataflow graph with tensors
[Graph diagram: examples and labels feed MatMul and Xent (cross-entropy) ops]
Example TensorFlow fragment
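A minimal sketch, using assumed shapes and the 2016-era TensorFlow API, of a fragment like the one on this slide: a linear model with a cross-entropy (Xent) loss and a gradient-descent update, all built as one dataflow graph:

import tensorflow as tf

examples = tf.placeholder(tf.float32, [None, 784])   # input batch (assumed shape)
labels = tf.placeholder(tf.float32, [None, 10])      # one-hot targets

weights = tf.Variable(tf.truncated_normal([784, 10]))
biases = tf.Variable(tf.zeros([10]))

logits = tf.matmul(examples, weights) + biases       # the MatMul op
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits, labels))  # the Xent op

learning_rate = 0.01
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)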
Symbolic Differentiation
[Graph diagram: TensorFlow adds automatically derived gradient ops to the graph; they update the biases using the learning rate]
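The same gradients can also be requested explicitly; a one-line sketch against the assumed fragment above:

# Graphs of ops computing d(loss)/d(weights) and d(loss)/d(biases) are
# added to the dataflow graph automatically.
grad_w, grad_b = tf.gradients(loss, [weights, biases])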
Assign Devices to Ops
● TensorFlow inserts Send/Recv Ops to transport tensors across devices
● Recv ops pull data from Send ops
[Graph diagram: the ops (Add, Mul, Assign, Sub, the biases, and the learning rate) are split between GPU 0 and the CPU, with a Send/Recv pair inserted at every edge that crosses the device boundary]
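A minimal sketch, reusing the assumed fragment above, of pinning ops to devices; TensorFlow then inserts the Send/Recv pairs wherever a tensor crosses the boundary:

with tf.device("/gpu:0"):
    logits = tf.matmul(examples, weights) + biases   # forward math on the GPU
with tf.device("/cpu:0"):
    loss = tf.reduce_mean(                           # loss (and its Recv) on the CPU
        tf.nn.softmax_cross_entropy_with_logits(logits, labels))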
Data Parallelism
Many model replicas, each processing its own shard of the data, train against shared parameter servers:
● each model replica fetches the current parameters p from the parameter servers,
● computes an update ∆p on its own mini-batch of data,
● and sends ∆p back; the parameter servers apply p’ = p + ∆p,
● then the cycle repeats from p’ (p’’ = p’ + ∆p’, and so on).
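A minimal sketch of one replica’s loop in this scheme; parameter_server, data_shard, and compute_update are hypothetical stand-ins rather than TensorFlow API:

# Hypothetical sketch of asynchronous data parallelism: every replica
# runs this loop independently, so updates from different replicas
# interleave on the shared parameter servers.
def replica_loop(parameter_server, data_shard):
    for batch in data_shard:
        p = parameter_server.get_params()       # fetch current parameters p
        delta_p = compute_update(p, batch)      # local gradient step yields ∆p
        parameter_server.apply_update(delta_p)  # servers apply p’ = p + ∆p

With synchronous updates, the servers would instead wait for every replica’s ∆p and apply them together.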
Distributed training mechanisms
Graph structure and low-level graph primitives (queues) allow us to play with
synchronous vs. asynchronous update algorithms.
Cross-process communication is the same!
● Communication across machines over the network is abstracted identically to cross-device communication.
[Graph diagram: the same Send/Recv mechanism, now spanning /job:worker/cpu:0 and /job:ps/gpu:0 across the network]
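A minimal sketch, assuming the distributed TensorFlow runtime and made-up host addresses, of placing ops in different processes; the same Send/Recv mechanism then runs over the network:

import tensorflow as tf

# Hypothetical two-process cluster: one parameter-server job, one worker job.
cluster = tf.train.ClusterSpec({
    "ps":     ["ps0.example.com:2222"],
    "worker": ["worker0.example.com:2222"],
})
# (Each process would start a tf.train.Server(cluster, ...) for its role.)

with tf.device("/job:ps/task:0"):
    biases = tf.Variable(tf.zeros([10]))        # lives in the ps process

with tf.device("/job:worker/task:0"):
    update = biases.assign_add(tf.ones([10]))   # Send/Recv cross the network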
Image Model Training Time
[Chart: hours to target accuracy for 1 GPU, 10 GPUs, and 50 GPUs; 2.6 hours vs. 79.3 hours (30.5X)]
Sync converges faster (time to accuracy)
See the Google Cloud Platform blog post “Google supercharges machine learning tasks with TPU custom chip” by Norm Jouppi, May 2016
Long Short-Term Memory (LSTMs):
Make Your Memory Cells Differentiable
[Hochreiter & Schmidhuber, 1997] Sigmoids
W R
WRITE? READ?
X M Y X M Y
FORGET?
F
Example: LSTM [Hochreiter et al, 1997][Gers et al, 1999]
Enables long-term dependencies to flow
Example: LSTM
for i in range(20):
  m, c = LSTMCell(x[i], mprev, cprev)
  mprev = m
  cprev = c
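A minimal sketch, with assumed dimensions, 2016-era TF ops, and made-up gate parameters W and b, of what an LSTMCell like the one above computes:

import tensorflow as tf

DIM = 2000  # assumed per-layer state size; matches the 2000-dim layers later in the deck

# One weight matrix and bias for all four gates, applied to [x, mprev].
W = tf.Variable(tf.truncated_normal([2 * DIM, 4 * DIM], stddev=0.1))
b = tf.Variable(tf.zeros([4 * DIM]))

def LSTMCell(x, mprev, cprev):
    gates = tf.matmul(tf.concat(1, [x, mprev]), W) + b
    i, f, o, g = tf.split(1, 4, gates)  # input, forget, output gates + update
    c = tf.sigmoid(f) * cprev + tf.sigmoid(i) * tf.tanh(g)  # differentiable write/forget
    m = tf.sigmoid(o) * tf.tanh(c)                          # differentiable read
    return m, c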
Example: Deep LSTM
for i in range(20):
  for d in range(4):  # d is depth
    input = x[i] if d == 0 else m[d-1]
    m[d], c[d] = LSTMCell(input, mprev[d], cprev[d])
    mprev[d] = m[d]
    cprev[d] = c[d]
Example: Deep LSTM
for i in range(20):
  for d in range(4):  # d is depth
    with tf.device("/gpu:%d" % d):
      input = x[i] if d == 0 else m[d-1]
      m[d], c[d] = LSTMCell(input, mprev[d], cprev[d])
      mprev[d] = m[d]
      cprev[d] = c[d]
[Animation: the input sequence A B C D flows through the depth-4 LSTM, with successive layers pinned to different GPUs (GPU1, GPU3, ..., GPU6) so that timesteps pipeline across devices; 2000 x 4 = 8k dims per sentence]
What are some ways that deep learning is having a significant impact at Google?
Speech Recognition
[Diagram: Acoustic Input → Deep Recurrent Neural Network → Text Output: “How cold is it outside?”]
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov,
Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
[Diagram: Your Photo → Deep Convolutional Neural Network → Automatic Tag: “ocean”]
Launched in 2015
Third most important search ranking signal (of 100s)
Bloomberg, Oct 2015: “Google Turning Its Lucrative Web Search Over to AI Machines”
Sequence-to-Sequence Model
[Sutskever & Vinyals & Le, NIPS 2014]
[Diagram: a Deep LSTM reads the input sequence A B C D, then emits the target sequence X Y Z Q one symbol at a time, feeding each emitted symbol back in after the __ start marker]
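A minimal pseudocode sketch of the encode-then-decode loop, reusing the LSTMCell sketch above; embed, most_probable_token, GO, and EOS are hypothetical stand-ins:

# Encode: read the input sequence, keeping only the final state.
m, c = m0, c0
for x in input_sequence:
    m, c = LSTMCell(embed(x), m, c)

# Decode: emit tokens until EOS, feeding each output back in as input.
token, outputs = GO, []
while token != EOS:
    m, c = LSTMCell(embed(token), m, c)
    token = most_probable_token(m)   # hypothetical softmax + argmax over vocab
    outputs.append(token)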
Sequence-to-Sequence Model: Machine Translation
[Sutskever & Vinyals & Le, NIPS 2014]
[Animation: the model reads the input sentence, then generates the target sentence word by word: “How”, “How tall”, “How tall are”, “How tall are you?”]
Sequence-to-Sequence Model: Machine Translation
[Sutskever & Vinyals & Le, NIPS 2014]
At inference time: beam search over possible output sequences to choose the most probable one, given the input sentence.
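A minimal sketch of beam search for this step; next_log_probs is a hypothetical stand-in for the decoder’s per-step output distribution:

import heapq

def beam_search(next_log_probs, beam_size=4, max_len=20, eos="</s>"):
    # Each hypothesis is (cumulative log probability, tokens so far).
    beam = [(0.0, [])]
    for _ in range(max_len):
        candidates = []
        for score, seq in beam:
            if seq and seq[-1] == eos:
                candidates.append((score, seq))    # finished: carry forward
                continue
            for token, lp in next_log_probs(seq):  # hypothetical decoder call
                candidates.append((score + lp, seq + [token]))
        # Keep only the beam_size most probable hypotheses.
        beam = heapq.nlargest(beam_size, candidates, key=lambda h: h[0])
    return max(beam, key=lambda h: h[0])[1]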
Smart Reply
April 1, 2009: April Fool’s Day joke
Smart Reply - Nov 2015 (Google Research Blog)
[Diagram: Incoming Email → Small Feed-Forward Neural Network → Activate Smart Reply? (yes/no); if yes, a Deep Recurrent Neural Network produces the Generated Replies]
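A minimal sketch, with made-up feature and layer sizes, of the small feed-forward yes/no network that gates the expensive reply-generating model:

import tensorflow as tf

email = tf.placeholder(tf.float32, [None, 256])  # assumed embedding of the incoming email

W1 = tf.Variable(tf.truncated_normal([256, 64], stddev=0.1))
b1 = tf.Variable(tf.zeros([64]))
W2 = tf.Variable(tf.truncated_normal([64, 1], stddev=0.1))
b2 = tf.Variable(tf.zeros([1]))

hidden = tf.nn.relu(tf.matmul(email, W1) + b1)
p_activate = tf.sigmoid(tf.matmul(hidden, W2) + b2)  # P(activate Smart Reply)
# Only when p_activate is high does the deep recurrent network run to
# produce the suggested replies.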
Image Captioning
[Vinyals et al., CVPR 2015]
[Diagram: image features W and a start token __ seed a recurrent decoder that emits the caption “A young girl asleep” word by word]
Image Captions Research
Human: A young girl asleep on the sofa cuddling a stuffed bear.
Use Cloud-based APIs
cloud.google.com/translate
cloud.google.com/speech
cloud.google.com/vision
cloud.google.com/text
Google Cloud Vision API
https://cloud.google.com/vision/
Google Cloud ML
Scaled service for training and inference with TensorFlow
A Few TensorFlow Community Examples
(From more than 2100 results for ‘tensorflow’ on GitHub)
● DQN: github.com/nivwusquorum/tensorflow-deepq
● NeuralArt: github.com/woodrush/neural-art-tf
● Char RNN: github.com/sherjilozair/char-rnn-tensorflow
● Keras ported to TensorFlow: github.com/fchollet/keras
● Show and Tell: github.com/jazzsaxmafia/show_and_tell.tensorflow
● Mandarin translation: github.com/jikexueyuanwiki/tensorflow-zh
...
What Does the Future Hold?
Deep learning usage will continue to grow and accelerate.
g.co/brain (We’re hiring! Also check out Brain Residency program at g.co/brainresidency)
www.tensorflow.org
research.google.com/people/jeff
research.google.com/pubs/BrainTeam.html
Questions?