Lec2 - Intro to TensorFlow
TensorFlow!
Agenda
Why TensorFlow
Linear Regression
Managing experiments
Why TensorFlow?
● Flexibility + Scalability
● Popularity
import tensorflow as tf
Graphs and Sessions
Data Flow Graphs
What’s a tensor?
An n-dimensional array:
0-d tensor: a scalar (number)
1-d tensor: a vector
2-d tensor: a matrix
and so on
Data Flow Graphs
TensorFlow separates the definition of computations from their execution: first assemble a graph, then use a session to execute operations in the graph.
Data Flow Graphs
Why x, y? TF automatically names the nodes when you don't explicitly name them; here x = 3, y = 5.
Data Flow Graphs
a = tf.add(3, 5)
Nodes: operators, variables, and constants
Edges: tensors
Data Flow Graphs
import tensorflow as tf
a = tf.add(3, 5)
print(a)  # >> Tensor("Add:0", shape=(), dtype=int32)  -- not 8!
How to get the value of a?
Create a session, then within the session evaluate the graph to fetch the value of a.
The session will look at the graph, thinking: hmm, how can I get the value of a?
Then it computes all the nodes that lead to a.
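A minimal sketch of fetching the value of a with a session (the with-block form closes the session automatically):

import tensorflow as tf
a = tf.add(3, 5)

sess = tf.Session()
print(sess.run(a))    # >> 8
sess.close()

# or, equivalently:
with tf.Session() as sess:
    print(sess.run(a))    # >> 8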
tf.Session()
A Session object encapsulates the environment in which Operation objects are executed and Tensor objects are evaluated.
tf.Session()
Session will also allocate memory to store the current values of variables.
More graph
Visualized by TensorBoard
x = 2
y = 3
op1 = tf.add(x, y)
op2 = tf.multiply(x, y)
op3 = tf.pow(op2, op1)
with tf.Session() as sess:
    op3 = sess.run(op3)
Subgraphs
x = 2
y = 3
add_op = tf.add(x, y)
mul_op = tf.multiply(x, y)
useless = tf.multiply(x, add_op)
pow_op = tf.pow(add_op, mul_op)
with tf.Session() as sess:
    z = sess.run(pow_op)
# Because we only fetch pow_op, the useless op is never computed.
Subgraphs
Possible to break graphs into several chunks and run them in parallel
across multiple CPUs, GPUs, TPUs, or other devices
Example: AlexNet
# Creates a graph with ops pinned to a specific device.
with tf.device('/gpu:2'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name='b')
    c = tf.multiply(a, b)
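To check where ops actually run, you can ask the session to log device placement; a small sketch, assuming a standard TF 1.x setup (allow_soft_placement falls back to an available device if '/gpu:2' does not exist):

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True,
                                        allow_soft_placement=True))
print(sess.run(c))
sess.close()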
Why graphs
1. Save computation: only run the subgraphs that lead to the values you want to fetch.
2. Break computation into small, differentiable pieces to facilitate automatic differentiation.
3. Facilitate distributed computation: spread the work across multiple CPUs, GPUs, TPUs, or other devices.
4. Many common machine learning models are already taught and visualized as directed graphs.
Your first TensorFlow program
import tensorflow as tf
a = tf.constant(2, name='a')
b = tf.constant(3, name='b')
x = tf.add(a, b, name='add')
with tf.Session() as sess:
    print(sess.run(x))    # >> 5
Visualize it with TensorBoard
import tensorflow as tf
a = tf.constant(2, name='a')
b = tf.constant(3, name='b')
x = tf.add(a, b, name='add')
# Create the summary writer after graph definition and before running your session
writer = tf.summary.FileWriter('./graphs', tf.get_default_graph())
with tf.Session() as sess:
    print(sess.run(x))
writer.close()
Run it
Go to terminal, run:
$ python [yourprogram].py
$ tensorboard --logdir="./graphs" --port 6006    # 6006 or any port you want
Then open http://localhost:6006/ in your browser.
Constants, Sequences,
Variables, Ops
Constants
import tensorflow as tf
a = tf.constant([2, 2], name='a')
b = tf.constant([[0, 1], [2, 3]], name='b')
x = tf.multiply(a, b, name='mul')   # broadcasting, similar to NumPy
with tf.Session() as sess:
    print(sess.run(x))
# >> [[0 2]
#     [4 6]]
Tensors filled with a specific value
tf.zeros([2, 3], tf.int32) ==> [[0, 0, 0], [0, 0, 0]]
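Related TF 1.x ops fill tensors with other values, for example:

tf.fill([2, 3], 8) ==> [[8, 8, 8], [8, 8, 8]]
tf.ones([2, 3], tf.int32) ==> [[1, 1, 1], [1, 1, 1]]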
Constants as sequences
tf.lin_space(start, stop, num, name=None)
tf.lin_space(10.0, 13.0, 4) ==> [10. 11. 12. 13.]
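tf.range is the other common sequence op; note that, unlike their NumPy counterparts, TF sequences are not iterable:

tf.range(start, limit=None, delta=1, dtype=None, name='range')
tf.range(3, 18, 3) ==> [3 6 9 12 15]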
Randomly Generated Constants
tf.random_normal
tf.truncated_normal
tf.random_uniform
tf.random_shuffle
tf.random_crop
tf.multinomial
tf.random_gamma
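These ops draw new values on every run. For reproducible results you can set the graph-level seed, e.g.:

tf.set_random_seed(1234)   # seed value is illustrative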
What’s wrong with constants?
Not trainable
Constants are stored in graph definition
my_const = tf.constant([1.0, 2.0], name="my_const")
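You can see this by printing the graph definition; a small sketch (the constant's value shows up inside the serialized graph proto):

import tensorflow as tf
my_const = tf.constant([1.0, 2.0], name="my_const")
with tf.Session() as sess:
    print(sess.graph.as_graph_def())   # the values [1.0, 2.0] appear in the node definition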
Constants are stored in graph definition
This makes loading graphs expensive when constants are big. Only use constants for primitive types; use variables or readers for data that requires more memory.
Variables
# create variables with tf.Variable
s = tf.Variable(2, name="scalar")
m = tf.Variable([[0, 1], [2, 3]], name="matrix")
W = tf.Variable(tf.zeros([784,10]))
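A sketch of the same variables created with tf.get_variable, the style used later in this lecture (the names are illustrative):

s = tf.get_variable('scalar', initializer=tf.constant(2))
m = tf.get_variable('matrix', initializer=tf.constant([[0, 1], [2, 3]]))
W = tf.get_variable('big_matrix', shape=(784, 10), initializer=tf.zeros_initializer())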
You have to initialize your variables
The easiest way is initializing all variables at once:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
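You can also initialize only a subset of variables, or a single one; a sketch where a, b, W are assumed to be existing variables:

with tf.Session() as sess:
    sess.run(tf.variables_initializer([a, b]))   # only a and b
    sess.run(W.initializer)                      # only W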
Eval() a variable
# W is a random 700 x 10 variable object
W = tf.Variable(tf.truncated_normal([700, 10]))
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W)          # >> prints the Variable object, not its value
    print(W.eval())   # prints the actual values, similar to sess.run(W)
tf.Variable.assign()
W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())   # >> ????
tf.Variable.assign()
W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())   # >> 10
Ugh, why?
tf.Variable.assign()
W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())   # >> 10
W.assign(100) only creates an assign op; the op has to be run in a session to take effect.
tf.Variable.assign()
W = tf.Variable(10)
W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    print(W.eval())   # >> 10
--------
W = tf.Variable(10)
assign_op = W.assign(100)
with tf.Session() as sess:
    sess.run(W.initializer)
    sess.run(assign_op)
    print(W.eval())   # >> 100
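Related methods update a variable and return the new value when run; a small self-contained sketch:

my_var = tf.Variable(10)
with tf.Session() as sess:
    sess.run(my_var.initializer)
    print(sess.run(my_var.assign_add(10)))   # >> 20
    print(sess.run(my_var.assign_sub(2)))    # >> 18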
Placeholder
A quick reminder: a TF program often has two phases:
1. Assemble a graph
2. Use a session to execute operations in the graph
Placeholders
⇒ Assemble the graph first without knowing the values needed for computation
Placeholders
⇒ Assemble the graph first without knowing the values needed for computation
Analogy:
Define the function f(x, y) = 2 * x + y without knowing the values of x or y.
x, y are placeholders for the actual values.
Why placeholders?
We, or our clients, can later supply their own data when they
need to execute the computation.
Placeholders
tf.placeholder(dtype, shape=None, name=None)
Supply the values to placeholders using a dictionary (feed_dict)
Placeholders
# >> [6, 7, 8]
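A minimal sketch that produces the output above, assuming a 3-element float placeholder added to a constant:

a = tf.placeholder(tf.float32, shape=[3])   # placeholder for a vector of 3 elements
b = tf.constant([5, 5, 5], tf.float32)
c = a + b                                   # short for tf.add(a, b)
with tf.Session() as sess:
    # feed [1, 2, 3] into placeholder a via feed_dict
    print(sess.run(c, feed_dict={a: [1, 2, 3]}))   # >> [6. 7. 8.]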
Placeholders are valid ops
# >> [6, 7, 8]
What if you want to feed multiple data points in?
You have to do it one at a time:
with tf.Session() as sess:
    for a_value in list_of_values_for_a:
        print(sess.run(c, {a: a_value}))
Linear Regression
in TensorFlow
Model the linear relationship between:
● dependent variable Y
● explanatory variables X
Want: find the parameters w and b of the linear model of Y given X.
Model
Inference: Y_predicted = w * X + b
Loss: squared error (Y - Y_predicted)^2
Phase 1: Assemble our graph
Step 2: Create placeholders for inputs and labels
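A sketch of this step (placeholder names are illustrative):

X = tf.placeholder(tf.float32, name='X')
Y = tf.placeholder(tf.float32, name='Y')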
Step 3: Create weight and bias
tf.get_variable(
    name,
    shape=None,
    dtype=None,
    initializer=None,
)
No need to specify shape if using a constant initializer.
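A sketch of this step with constant initializers, so no shape is needed (variable names are illustrative):

w = tf.get_variable('weights', initializer=tf.constant(0.0))
b = tf.get_variable('bias', initializer=tf.constant(0.0))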
Step 4: Inference
Y_predicted = w * X + b
Step 5: Specify loss function
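One common choice is squared error; a sketch (other losses, e.g. Huber loss, also work):

loss = tf.square(Y - Y_predicted, name='loss')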
Step 6: Create optimizer
opt = tf.train.GradientDescentOptimizer(learning_rate=0.001)
optimizer = opt.minimize(loss)
Phase 2: Train our model
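A minimal training-loop sketch, assuming `data` is a list of (x, y) pairs and the graph pieces from Phase 1:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())   # initialize w and b
    for i in range(100):                           # e.g. 100 epochs
        for x, y in data:
            # running the optimizer updates w and b to reduce the loss
            sess.run(optimizer, feed_dict={X: x, Y: y})
    w_out, b_out = sess.run([w, b])                # fetch the trained parameters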
Write log files using a FileWriter
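A sketch (the log directory name is illustrative):

writer = tf.summary.FileWriter('./graphs/linear_reg', tf.get_default_graph())
# ... run your session ...
writer.close()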
See it on TensorBoard
Optimizers
Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)
Optimizer
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)
Session looks at all trainable variables that loss depends on and updates them
Optimizer
Session looks at all trainable variables that the optimizer depends on and updates them
Trainable variables
tf.Variable(initial_value=None, trainable=True,...)
List of optimizers in TF
tf.train.GradientDescentOptimizer
tf.train.AdagradOptimizer
tf.train.MomentumOptimizer
tf.train.AdamOptimizer
tf.train.FtrlOptimizer
tf.train.RMSPropOptimizer
...
Usually Adam works better out of the box than SGD.
Name scope
Name scope
with tf.name_scope(name_of_that_scope):
    # declare op_1
    # declare op_2
    # ...
Name scope
with tf.name_scope('data'):
    iterator = dataset.make_initializable_iterator()
    center_words, target_words = iterator.get_next()

with tf.name_scope('embed'):
    embed_matrix = tf.get_variable('embed_matrix',
                                   shape=[VOCAB_SIZE, EMBED_SIZE], ...)
    embed = tf.nn.embedding_lookup(embed_matrix, center_words)

with tf.name_scope('loss'):
    nce_weight = tf.get_variable('nce_weight', shape=[VOCAB_SIZE, EMBED_SIZE], ...)
    nce_bias = tf.get_variable('nce_bias', initializer=tf.zeros([VOCAB_SIZE]))
    loss = tf.reduce_mean(tf.nn.nce_loss(weights=nce_weight, biases=nce_bias, ...))

with tf.name_scope('optimizer'):
    optimizer = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss)
TensorBoard
Variable scope
tf.name_scope() vs tf.variable_scope()
Variable sharing: The problem
def two_hidden_layers(x):
    w1 = tf.Variable(tf.random_normal([100, 50]), name='h1_weights')
    b1 = tf.Variable(tf.zeros([50]), name='h1_biases')
    h1 = tf.matmul(x, w1) + b1
Variable sharing: The problem
Each call to two_hidden_layers() creates a new set of variables, so two sets of variables are created.
tf.get_variable()
tf.get_variable(<name>, <shape>, <initializer>)
tf.get_variable()
def two_hidden_layers(x):
    assert x.shape.as_list() == [200, 100]
    w1 = tf.get_variable("h1_weights", [100, 50], initializer=tf.random_normal_initializer())
    b1 = tf.get_variable("h1_biases", [50], initializer=tf.constant_initializer(0.0))
    h1 = tf.matmul(x, w1) + b1
    assert h1.shape.as_list() == [200, 50]
    w2 = tf.get_variable("h2_weights", [50, 10], initializer=tf.random_normal_initializer())
    b2 = tf.get_variable("h2_biases", [10], initializer=tf.constant_initializer(0.0))
    logits = tf.matmul(h1, w2) + b2
    return logits

logits1 = two_hidden_layers(x1)
logits2 = two_hidden_layers(x2)
tf.get_variable()
def two_hidden_layers(x):
    assert x.shape.as_list() == [200, 100]
    w1 = tf.get_variable("h1_weights", [100, 50], initializer=tf.random_normal_initializer())
    b1 = tf.get_variable("h1_biases", [50], initializer=tf.constant_initializer(0.0))
    h1 = tf.matmul(x, w1) + b1
    assert h1.shape.as_list() == [200, 50]
    w2 = tf.get_variable("h2_weights", [50, 10], initializer=tf.random_normal_initializer())
    b2 = tf.get_variable("h2_biases", [10], initializer=tf.constant_initializer(0.0))
    logits = tf.matmul(h1, w2) + b2
    return logits

logits1 = two_hidden_layers(x1)
logits2 = two_hidden_layers(x2)

ValueError: Variable h1_weights already exists, disallowed.
Did you mean to set reuse=True in VarScope?
tf.variable_scope()
def two_hidden_layers(x):
    assert x.shape.as_list() == [200, 100]
    w1 = tf.get_variable("h1_weights", [100, 50], initializer=tf.random_normal_initializer())
    b1 = tf.get_variable("h1_biases", [50], initializer=tf.constant_initializer(0.0))
    h1 = tf.matmul(x, w1) + b1
    assert h1.shape.as_list() == [200, 50]
    w2 = tf.get_variable("h2_weights", [50, 10], initializer=tf.random_normal_initializer())
    b2 = tf.get_variable("h2_biases", [10], initializer=tf.constant_initializer(0.0))
    logits = tf.matmul(h1, w2) + b2
    return logits

# Put your variables within a scope and reuse all variables within that scope
with tf.variable_scope('two_layers') as scope:
    logits1 = two_hidden_layers(x1)
    scope.reuse_variables()
    logits2 = two_hidden_layers(x2)
tf.variable_scope()
Only one set of variables, all within the variable scope 'two_layers'
tf.variable_scope()
tf.variable_scope implicitly creates a name scope
Reusable code?
def two_hidden_layers(x):
    assert x.shape.as_list() == [200, 100]
    w1 = tf.get_variable("h1_weights", [100, 50], initializer=tf.random_normal_initializer())
    b1 = tf.get_variable("h1_biases", [50], initializer=tf.constant_initializer(0.0))
    h1 = tf.matmul(x, w1) + b1
    assert h1.shape.as_list() == [200, 50]
    w2 = tf.get_variable("h2_weights", [50, 10], initializer=tf.random_normal_initializer())
    b2 = tf.get_variable("h2_biases", [10], initializer=tf.constant_initializer(0.0))
    logits = tf.matmul(h1, w2) + b2
    return logits

with tf.variable_scope('two_layers') as scope:
    logits1 = two_hidden_layers(x1)
    scope.reuse_variables()
    logits2 = two_hidden_layers(x2)
Layer ‘em up
def fully_connected(x, output_dim, scope):
    # reuse=tf.AUTO_REUSE: fetch variables if they already exist, else create them
    with tf.variable_scope(scope, reuse=tf.AUTO_REUSE) as scope:
        w = tf.get_variable("weights", [x.shape[1], output_dim], initializer=tf.random_normal_initializer())
        b = tf.get_variable("biases", [output_dim], initializer=tf.constant_initializer(0.0))
        return tf.matmul(x, w) + b

def two_hidden_layers(x):
    h1 = fully_connected(x, 50, 'h1')
    h2 = fully_connected(h1, 10, 'h2')
Manage Experiments
tf.train.Saver
saves graph’s variables in binary files
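The typical pattern, as a sketch (the checkpoint path and step counter are illustrative):

saver = tf.train.Saver()   # by default, saves all variables
with tf.Session() as sess:
    # ... train ...
    saver.save(sess, 'checkpoints/model-name', global_step=step)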
Saves sessions, not graphs!
Save parameters after 1000 steps
# define model
model = SkipGramModel(params)
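A sketch of saving every 1000 steps inside the training loop (`index` is the assumed step counter):

saver = tf.train.Saver()
with tf.Session() as sess:
    # ... training loop, with `index` counting steps ...
    if (index + 1) % 1000 == 0:
        saver.save(sess, 'checkpoints/skip-gram', index)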
Specify the step at which the model is saved
# define model
model = SkipGramModel(params)
Global step
Very common in TensorFlow programs.
Global step
global_step = tf.Variable(0,
                          dtype=tf.int32,
                          trainable=False,
                          name='global_step')
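Pass it to minimize() so the optimizer increments it for you; a sketch (loss and learning rate are assumed):

optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss,
                                                             global_step=global_step)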
Restore variables
saver.restore(sess, 'checkpoints/name_of_the_checkpoint')
Restore the latest checkpoint
# check if there is a checkpoint
ckpt = tf.train.get_checkpoint_state(os.path.dirname('checkpoints/checkpoint'))
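Then restore only if a checkpoint exists; a sketch (requires `import os` and an existing `saver`):

if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)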
tf.summary
Why matplotlib when you can summarize?
Step 1: create summaries
with tf.name_scope("summaries"):
    tf.summary.scalar("loss", self.loss)
    tf.summary.scalar("accuracy", self.accuracy)
    tf.summary.histogram("histogram loss", self.loss)
    summary_op = tf.summary.merge_all()
Step 2: run them
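summary_op is an op like any other, so it has to be executed with sess.run(); a sketch assuming `loss`, `optimizer`, and `feed_dict` from your model:

loss_batch, _, summary = sess.run([loss, optimizer, summary_op], feed_dict=feed_dict)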
Step 3: write summaries to file
writer.add_summary(summary, global_step=step)
Putting it together
tf.summary.scalar("loss", self.loss)
tf.summary.histogram("histogram loss", self.loss)
summary_op = tf.summary.merge_all()

# later, inside the training loop:
if (index + 1) % 1000 == 0:
    saver.save(sess, 'checkpoints/skip-gram', index)
See summaries on TensorBoard
Scalar loss
Histogram loss
Toggle run to compare experiments