Big Data Analytics


Lars Schmidt-Thieme

Information Systems and Machine Learning Lab (ISMLL)


Institute of Computer Science
University of Hildesheim, Germany

C. Distributed Computing Environments /


3. Computational Graphs (TensorFlow)

Lars Schmidt-Thieme, Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim, Germany
C. Distributed Computing Environments / 3. Computational Graphs (TensorFlow) 1 / 31

Syllabus

Tue.  9.4.   (1)  0. Introduction
                  A. Parallel Computing
Tue. 16.4.   (2)  A.1 Threads
Tue. 23.4.   (3)  A.2 Message Passing Interface (MPI)
Tue. 30.4.   (4)  A.3 Graphics Processing Units (GPUs)
                  B. Distributed Storage
Tue.  7.5.   (5)  B.1 Distributed File Systems
Tue. 14.5.   (6)  B.2 Partitioning of Relational Databases
Tue. 21.5.   (7)  B.3 NoSQL Databases
                  C. Distributed Computing Environments
Tue. 28.5.   (8)  C.1 Map-Reduce
Tue.  4.6.   (9)  C.2 Resilient Distributed Datasets (Spark)
Tue. 11.6.        (Pentecost break)
Tue. 18.6.  (10)  C.3 Computational Graphs (TensorFlow)
                  D. Distributed Machine Learning Algorithms
Tue. 25.6.  (11)  D.1 Distributed Stochastic Gradient Descent
Tue.  2.7.  (12)  D.2 Distributed Matrix Factorization
Tue.  9.7.  (13)  Questions and Answers

Outline

1. The Computational Graph

2. Variables

3. Example: Linear Regression

4. Automatic Gradients

5. Large Data I: Feeding

6. Large Data II: Reader Nodes

7. Debugging

1. The Computational Graph

TensorFlow
- computational framework
  - multi-device, distributed
  - core in C/C++, standard interface in Python
    - several further language bindings, e.g., R, Java
- open source
  - developed by Google
  - initially released Nov. 2015
  - 2nd generation framework
    - the 1st generation framework was called DistBelief
- alternative: PyTorch
  - developed by Facebook, since 2016, open source


Tensors
- tensor = multidimensional array
- rank = number of dimensions
- shape = vector of sizes, one size for each dimension

  rank  common name             shape
  0     scalar                  ()
  1     vector                  (size)
  2     matrix                  (numrows, numcols)
  ≥3    tensor of higher order  (numdim_1, numdim_2, ..., numdim_r)

- example:

  \[
  A = \begin{pmatrix}
         1.0 & -3.0 &  2.3 & 1.7 \\
         5.6 &  0.0 & -1.3 & 3.4 \\
        -7.7 & -3.3 & -2.1 & 5.2
      \end{pmatrix},
  \quad \operatorname{shape}(A) = (3, 4), \quad \operatorname{rank}(A) = 2
  \]
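
The definitions of rank and shape can be checked on nested Python lists. The following is a minimal sketch in plain Python, not TensorFlow code; the helper names `shape` and `rank` are hypothetical, and the sketch assumes a regular (non-ragged) nesting.

```python
def shape(t):
    """Return the shape of a regularly nested list as a tuple of sizes."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0] if t else None  # descend into the first element
    return tuple(dims)


def rank(t):
    """Rank = number of dimensions = length of the shape."""
    return len(shape(t))


A = [[1.0, -3.0, 2.3, 1.7],
     [5.6, 0.0, -1.3, 3.4],
     [-7.7, -3.3, -2.1, 5.2]]

print(shape(A), rank(A))      # (3, 4) 2
print(shape(3.0), rank(3.0))  # () 0
```

A scalar has the empty shape `()` and therefore rank 0, which matches the first row of the table.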

Computational Graphs
- TensorFlow organizes a computation as a directed graph.
- Each node represents a tensor
  - or a list of tensors.
- Tensors can be:
  - stored tensors
    - immutable, value provided at creation time: tf.constant
    - immutable, value provided when running the graph: tf.placeholder
    - mutable: tf.Variable
  - computed tensors (operations):
    - having one or more input tensors
    - having one or more output tensors
      - output index: port
- Edges represent dependencies.
  - There is an edge x → y if y is computed and x is one of its inputs.

Sessions

- A session represents the state of an ongoing computation on a computational graph.
- Create a session with the default constructor tf.Session().
- Compute the value of a tensor node with run.


Two Phases

1. Construct the computational graph
   - create tensor nodes
   - possibly referencing other tensor nodes as inputs

2. Compute the values of nodes of the computational graph (running)
   - usually specify target tensor(s)
   - computes all intermediate tensors required for the target tensor(s)
   - yields the value of the target tensor(s)
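
The two phases can be illustrated without TensorFlow. The following is a toy sketch in plain Python under stated assumptions: the class `Node` and the functions `constant`, `add`, `multiply` and `run` are hypothetical stand-ins and do not reproduce TensorFlow's implementation. It only shows that constructing the graph computes nothing, and that running a target evaluates just the nodes the target depends on.

```python
class Node:
    """A node of a toy computational graph (hypothetical, not TensorFlow)."""

    def __init__(self, op=None, inputs=(), value=None, name=""):
        self.op = op          # function computing the value; None for constants
        self.inputs = inputs  # edges: the nodes this node depends on
        self.value = value    # stored value (constants only)
        self.name = name


def constant(value, name=""):
    return Node(value=value, name=name)


def add(x, y, name=""):
    return Node(op=lambda a, b: a + b, inputs=(x, y), name=name)


def multiply(x, y, name=""):
    return Node(op=lambda a, b: a * b, inputs=(x, y), name=name)


def run(target, evaluated=None):
    """Phase 2: compute the target and only the nodes it depends on."""
    if evaluated is None:
        evaluated = {}
    if target not in evaluated:
        if target.op is None:
            evaluated[target] = target.value
        else:
            args = [run(i, evaluated) for i in target.inputs]
            evaluated[target] = target.op(*args)
    return evaluated[target]


# Phase 1: construct the graph; nothing is computed yet.
a = constant(3.0, "a")
b = constant(4.0, "b")
x = add(a, b, "x")
unused = multiply(x, x, "unused")  # not needed for x, so never evaluated below

# Phase 2: run the graph for the target node x.
cache = {}
print(run(x, cache))                  # 7.0
print(sorted(n.name for n in cache))  # ['a', 'b', 'x']
```

The dictionary `cache` plays the role of the per-run state: it records which nodes were evaluated, and the node `unused` is not among them.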


Hello TensorFlow: Add two Constants

    import tensorflow as tf

    a = tf.constant(3.0)
    b = tf.constant(4.0)
    x = tf.add(a, b)

    # printing a node shows its description, not its value
    print(a)
    print(b)
    print(x)

    # the value is computed only when the node is run in a session
    sess = tf.Session()
    x_val = sess.run(x)
    print(x_val)

Output:

    Tensor("Const:0", shape=(), dtype=float32)
    Tensor("Const_1:0", shape=(), dtype=float32)
    Tensor("Add:0", shape=(), dtype=float32)
    7.0


Hello TensorFlow: Add two Constants

    import tensorflow as tf

    a = tf.constant([3.0, -2.7, 1.2])
    b = tf.constant([4.0, 5.1, -1.7])
    x = tf.add(a, b)

    print(a)
    print(b)
    print(x)

    sess = tf.Session()
    x_val = sess.run(x)
    print(x_val)

Output:

    Tensor("Const_2:0", shape=(3,), dtype=float32)
    Tensor("Const_3:0", shape=(3,), dtype=float32)
    Tensor("Add_1:0", shape=(3,), dtype=float32)
    [ 7.0 2.4 -0.5 ]


Tensor Types
- Different element types are represented by tf.DType:
dtype description
tf.float16 16-bit half-precision floating-point
tf.float32 32-bit single-precision floating-point
tf.float64 64-bit double-precision floating-point
tf.bfloat16 16-bit truncated floating-point
tf.complex64 64-bit single-precision complex
tf.complex128 128-bit double-precision complex
tf.int8 8-bit signed integer
tf.uint8 8-bit unsigned integer
tf.uint16 16-bit unsigned integer
tf.int16 16-bit signed integer
tf.int32 32-bit signed integer
tf.int64 64-bit signed integer
tf.bool Boolean
tf.string String
tf.qint8 Quantized 8-bit signed integer
tf.quint8 Quantized 8-bit unsigned integer
tf.qint16 Quantized 16-bit signed integer
tf.quint16 Quantized 16-bit unsigned integer
tf.qint32 Quantized 32-bit signed integer
tf.resource Handle to a mutable resource

- If omitted, the dtype is inferred from the values:

    a = tf.constant(4.0)
    b = tf.constant(4)
    print(a)
    print(b)

  Output:

    Tensor("Const:0", shape=(), dtype=float32)
    Tensor("Const_1:0", shape=(), dtype=int32)

Operations: Overloaded Operators

    import tensorflow as tf

    a = tf.constant(3.0)
    b = tf.constant(4.0)
    x = a + b   # creates the same kind of node as tf.add(a, b)

    print(x)

    sess = tf.Session()
    x_val = sess.run(x)
    print(x_val)

Output:

    Tensor("add_1:0", shape=(), dtype=float32)
    7.0

    operator  operation node
    +         tf.add
    -         tf.subtract
    *         tf.multiply
    /         tf.divide

2. Variables

v = tf.Variable(initial_value=None, ..., name=None, ..., dtype=None)

- has an immutable element type
- has a mutable shape (set_shape)
- has mutable element values
  - set by tf.assign, tf.assign_add (operations)
  - separate values in each session
- has to be initialized before first use:
  - run the v.initializer operation, or
  - run the initializers of all variables:

        init = tf.global_variables_initializer()
        sess.run(init)


Variables

    import tensorflow as tf

    x = tf.Variable(3.0)
    sess = tf.Session()
    sess.run(x.initializer)
    x_val = sess.run(x)
    print(x_val)

    # each run of the assign_add node increments x by one
    x_plus_one = tf.assign_add(x, 1.0)

    for t in range(5):
        x_val = sess.run(x_plus_one)
        print(t, x_val)

Output:

    3.0
    0 4.0
    1 5.0
    2 6.0
    3 7.0
    4 8.0


Initializing from Other Variables

    import tensorflow as tf

    x = tf.Variable(3.0)
    y = tf.Variable(tf.multiply(tf.constant(2.0),
                                x.initialized_value()))
    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    y_val = sess.run(y)
    print(y_val)

Output:

    6.0

- v.initialized_value() ensures that the variable v has been initialized before its value is used.
- Do not use

        y = tf.Variable(tf.multiply(tf.constant(2.0), x))

  as x may be selected to be initialized after y.

3. Example: Linear Regression

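Example: Linear Regression

\begin{align*}
\hat{y} &:= \beta_0 + X\beta && \text{prediction} \\
r &:= y - \hat{y} && \text{residual} \\
\ell &:= \frac{1}{2} \sum_{n=1}^{N} r_n^2 && \text{loss/error} \\
-\frac{\partial \ell}{\partial \beta} &= X^T r && \text{negative gradient w.r.t. } \beta \\
-\frac{\partial \ell}{\partial \beta_0} &= \sum_{n=1}^{N} r_n && \text{negative gradient w.r.t. } \beta_0 \\
\beta^{\text{next}} &:= \beta - \eta \frac{\partial \ell}{\partial \beta} && \text{update of } \beta \\
\beta_0^{\text{next}} &:= \beta_0 - \eta \frac{\partial \ell}{\partial \beta_0} && \text{update of } \beta_0
\end{align*}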
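
The update equations can be traced step by step in plain Python before turning to TensorFlow. The following is a minimal sketch, not TensorFlow code; it assumes the toy data and the step size 0.01 used in the TensorFlow example of this section, and all variable names are chosen for illustration only.

```python
Xs = [[2.0, 1.0], [1.0, 2.0], [4.0, 3.0], [3.0, 4.0]]
ys = [1.0, 1.0, -1.0, -1.0]
eta = 0.01

beta = [0.0, 0.0]
beta_0 = 0.0
losses = []

for t in range(100):
    # prediction and residuals: r = y - (beta_0 + X beta)
    yhat = [beta_0 + sum(x_m * b_m for x_m, b_m in zip(x, beta)) for x in Xs]
    r = [y - yh for y, yh in zip(ys, yhat)]
    # loss: l = 1/2 * sum_n r_n^2
    losses.append(0.5 * sum(r_n ** 2 for r_n in r))
    # negative gradients: X^T r and sum_n r_n
    neg_grad_beta = [sum(x[m] * r_n for x, r_n in zip(Xs, r)) for m in range(2)]
    neg_grad_beta_0 = sum(r)
    # updates: subtracting eta * gradient equals adding eta * negative gradient
    beta = [b_m + eta * g_m for b_m, g_m in zip(beta, neg_grad_beta)]
    beta_0 = beta_0 + eta * neg_grad_beta_0

# the initial loss is 2.0; the loss after 100 steps is smaller
print(losses[0], losses[-1])
```

With all parameters at zero, every residual equals its target, so the initial loss is 1/2 · 4 = 2.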

Example: Linear Regression

    import tensorflow as tf

    Xs_data = [[2,1], [1,2], [4,3], [3,4]]
    Ys_data = [[+1], [+1], [-1], [-1]]
    eta_data = 0.01

    Xs = tf.constant(Xs_data, dtype=tf.float32)
    Ys = tf.constant(Ys_data, dtype=tf.float32)
    eta = tf.constant(eta_data)

    beta = tf.Variable([[0], [0]], dtype=tf.float32)
    beta_0 = tf.Variable(0, dtype=tf.float32)

    # prediction, residuals and error (sum of squared residuals, without the factor 1/2)
    Yhats = tf.add(beta_0, tf.matmul(Xs, beta))
    residua = tf.subtract(Ys, Yhats)
    error = tf.reduce_sum(tf.square(residua))

    # negative gradients X^T r and sum(r), applied as gradient descent updates
    neg_grad_beta = tf.matmul(Xs, residua, adjoint_a=True)
    beta_update = tf.assign_add(beta, tf.multiply(eta, neg_grad_beta))
    beta_0_update = tf.assign_add(beta_0, tf.multiply(eta, tf.reduce_sum(residua)))

    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)

    for t in range(100):
        error_val, beta_val, beta_0_val = sess.run([error, beta_update, beta_0_update])
        print(t, error_val, beta_0_val, beta_val[0,0], beta_val[1,0])


[Figure: computational graph of the linear regression example]

4. Automatic Gradients

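tf.gradients(ys, xs, ...)

- creates operations that compute gradients of ys = (y_1, ..., y_N) w.r.t. xs = (x_1, ..., x_M), built from the partial derivatives

  \[
  \left( \frac{\partial y_n}{\partial x_m} \right)_{n=1,\ldots,N,\; m=1,\ldots,M}
  \]

- returns one tensor per x_m, namely the sum \sum_{n=1}^{N} \partial y_n / \partial x_m over all y_n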


Example: Linear Regression w. Automatic Gradients

    import tensorflow as tf

    Xs_data = [[2,1], [1,2], [4,3], [3,4]]
    Ys_data = [[+1], [+1], [-1], [-1]]
    eta_data = 0.01

    Xs = tf.constant(Xs_data, dtype=tf.float32)
    Ys = tf.constant(Ys_data, dtype=tf.float32)
    eta = tf.constant(eta_data)

    beta = tf.Variable([[0], [0]], dtype=tf.float32)
    beta_0 = tf.Variable(0, dtype=tf.float32)

    Yhats = tf.add(beta_0, tf.matmul(Xs, beta))
    error = tf.reduce_sum(tf.square(tf.subtract(Ys, Yhats)))

    # gradients of the error w.r.t. both parameters, created automatically
    grads = tf.gradients(error, [beta, beta_0])
    beta_update = tf.assign_sub(beta, tf.multiply(eta, grads[0]))
    beta_0_update = tf.assign_sub(beta_0, tf.multiply(eta, grads[1]))

    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)

    for t in range(100):
        error_val, beta_val, beta_0_val = sess.run([error, beta_update, beta_0_update])
        print(t, error_val, beta_0_val, beta_val[0,0], beta_val[1,0])


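How do Automatic Gradients Work?

To compute $\frac{\partial y}{\partial x}$:
- find all paths $p^1, \ldots, p^K \in G^*$ in the graph $G$ from $x$ to $y$
- use the chain rule:

  \[
  \frac{\partial y}{\partial x} = \sum_{k=1}^{K} \prod_{l=|p^k|}^{2} \frac{\partial p^k_l}{\partial p^k_{l-1}}
  \]

- each operation $p^k_l =: o$ has to provide its gradient $\frac{\partial o}{\partial i}$ for each of its inputs $i$.
  - then $\frac{\partial p^k_l}{\partial p^k_{l-1}} = \frac{\partial o}{\partial i}$ for $i = p^k_{l-1}$.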
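
The sum over paths can be made concrete with a toy graph in plain Python. This is a sketch under stated assumptions, not TensorFlow's implementation: the class `Node` and the functions `variable`, `add`, `mul` and `grad` are hypothetical. Each operation stores the local gradient for each of its inputs, and the recursion in `grad` multiplies local gradients along each path from x to y and sums over all paths.

```python
class Node:
    """Toy graph node holding its value and the local gradients w.r.t. its inputs."""

    def __init__(self, value, inputs=(), local_grads=()):
        self.value = value
        self.inputs = inputs            # input nodes i of this operation o
        self.local_grads = local_grads  # d o / d i, one entry per input


def variable(value):
    return Node(value)


def add(a, b):
    # d(a + b)/da = 1, d(a + b)/db = 1
    return Node(a.value + b.value, (a, b), (1.0, 1.0))


def mul(a, b):
    # d(a * b)/da = b, d(a * b)/db = a
    return Node(a.value * b.value, (a, b), (b.value, a.value))


def grad(y, x):
    """Sum over all paths from x to y of the product of local gradients."""
    if y is x:
        return 1.0
    return sum(g * grad(i, x) for i, g in zip(y.inputs, y.local_grads))


# y = (x*x + x) * (x*x) = x^4 + x^3, hence dy/dx = 4 x^3 + 3 x^2
x = variable(2.0)
a = mul(x, x)
b = add(a, x)
y = mul(b, a)

print(y.value)     # 24.0
print(grad(y, x))  # 44.0
```

This recursion revisits shared subpaths, so its cost can grow with the number of paths. It is meant only to mirror the formula, not to suggest how an efficient implementation proceeds.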


[Figure: computational graph of the linear regression example with automatic gradients]

5. Large Data I: Feeding

- Representing large data as a whole as a single constant is not useful:
  - e.g., if its size exceeds GPU memory, it cannot be deployed to the GPU at all.
- Better: break the data into smaller pieces
  - e.g., single instances or minibatches
  - batch GD → SGD
- Build a graph for a single instance / minibatch.
- Create placeholder nodes for the instance / minibatch.
- Placeholders are filled via the feed_dict parameter of run.
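
Breaking the data into minibatches can be sketched in plain Python as follows. The generator name `minibatches` is hypothetical and the data are the toy instances used in the examples of this lecture. In a TensorFlow program each batch would be passed to the placeholders via the feed_dict parameter of run, which the sketch only indicates by a comment.

```python
def minibatches(instances, batch_size):
    """Yield consecutive slices of at most batch_size instances."""
    for start in range(0, len(instances), batch_size):
        yield instances[start:start + batch_size]


data = [([2, 1], +1), ([1, 2], +1), ([4, 3], -1), ([3, 4], -1)]

for epoch in range(2):
    for batch in minibatches(data, batch_size=2):
        xs = [x for x, _ in batch]
        ys = [y for _, y in batch]
        # here the batch would be fed to the placeholder nodes of the graph
        print(epoch, xs, ys)
```

With batch_size=1 the loop processes single instances, which corresponds to plain SGD.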


Placeholder Nodes and Feeding

    import tensorflow as tf

    a = tf.placeholder(shape=(), dtype=tf.float32)
    b = tf.placeholder(shape=(), dtype=tf.float32)
    x = a + b

    sess = tf.Session()
    print(sess.run(x, {a: 3, b: 7}))
    print(sess.run(x, {a: -2, b: 4}))

Output:

    10.0
    2.0


Example: Feeding SGD

    import tensorflow as tf

    Xs_data = [[2,1], [1,2], [4,3], [3,4]]
    Ys_data = [+1, +1, -1, -1]
    eta_data = 0.01

    # placeholders for a single instance
    X = tf.placeholder(shape=(2,), dtype=tf.float32)
    Y = tf.placeholder(shape=(), dtype=tf.float32)
    eta = tf.constant(eta_data)

    beta = tf.Variable([0, 0], dtype=tf.float32)
    beta_0 = tf.Variable(0, dtype=tf.float32)

    Yhat = tf.add(beta_0, tf.reduce_sum(tf.multiply(X, beta)))
    error = tf.reduce_sum(tf.square(tf.subtract(Y, Yhat)))

    grads = tf.gradients(error, [beta, beta_0])
    beta_update = tf.assign_sub(beta, tf.multiply(eta, grads[0]))
    beta_0_update = tf.assign_sub(beta_0, tf.multiply(eta, grads[1]))

    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)

    for t in range(100):
        error_epoch = 0
        # one update per instance, fed through the placeholders
        for X_data, Y_data in zip(Xs_data, Ys_data):
            error_val, beta_val, beta_0_val = sess.run(
                [error, beta_update, beta_0_update],
                {X: X_data, Y: Y_data})
            error_epoch += error_val
        print(t, error_epoch, beta_0_val, beta_val[0], beta_val[1])

6. Large Data II: Reader Nodes

Reader Node

File lr-data.csv:

    X1, X2, Y
    2, 1, +1
    1, 2, +1
    4, 3, -1
    3, 4, -1

Code:

    import tensorflow as tf

    data_files = ['lr-data.csv']; eta_data = 0.01

    filename_queue = tf.train.string_input_producer(data_files)
    reader = tf.TextLineReader(skip_header_lines=1)
    _, line = reader.read(filename_queue)

    sess = tf.Session()
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)

    for t in range(6):
        line_val = sess.run(line)
        print(t, line_val)

    coord.request_stop()
    coord.join(threads)

Output:

    0 b'2, 1, +1'
    1 b'1, 2, +1'
    2 b'4, 3, -1'
    3 b'3, 4, -1'
    4 b'2, 1, +1'
    5 b'1, 2, +1'
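
The observable behavior of the reader node (skipping the header line and starting over once the file is exhausted) can be imitated in plain Python. This is only a sketch of that behavior under stated assumptions: the function `read_lines_forever` is hypothetical, and the queues and threads that TensorFlow uses for reading are not modeled.

```python
import itertools
import os
import tempfile


def read_lines_forever(filenames, skip_header_lines=0):
    """Yield the data lines of the given files, restarting after the last file."""
    for filename in itertools.cycle(filenames):
        with open(filename) as f:
            for i, line in enumerate(f):
                if i >= skip_header_lines:
                    yield line.rstrip("\n")


# Write the toy file lr-data.csv to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "lr-data.csv")
with open(path, "w") as f:
    f.write("X1, X2, Y\n2, 1, +1\n1, 2, +1\n4, 3, -1\n3, 4, -1\n")

lines = read_lines_forever([path], skip_header_lines=1)
first_six = list(itertools.islice(lines, 6))
for t, line in enumerate(first_six):
    print(t, line)
```

As in the TensorFlow output, the fifth and sixth lines read are again the first two data lines of the file.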


Example: SGD Reading On The Fly

    import tensorflow as tf

    data_files = ['lr-data.csv']; eta_data = 0.01

    # read one line per run and decode it into an instance (X, Y)
    filename_queue = tf.train.string_input_producer(data_files)
    reader = tf.TextLineReader(skip_header_lines=1)
    key, value = reader.read(filename_queue)
    X1, X2, Y = tf.decode_csv(value, record_defaults=[[0.0],[0.0],[0.0]])
    X = tf.stack([X1, X2])
    eta = tf.constant(eta_data)

    beta = tf.Variable([0, 0], dtype=tf.float32)
    beta_0 = tf.Variable(0, dtype=tf.float32)
    Yhat = tf.add(beta_0, tf.reduce_sum(tf.multiply(X, beta)))
    error = tf.reduce_sum(tf.square(tf.subtract(Y, Yhat)))
    grads = tf.gradients(error, [beta, beta_0])
    beta_update = tf.assign_sub(beta, tf.multiply(eta, grads[0]))
    beta_0_update = tf.assign_sub(beta_0, tf.multiply(eta, grads[1]))

    init = tf.global_variables_initializer()
    sess = tf.Session()
    sess.run(init)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord, sess=sess)

    error_epoch = 0
    for t in range(400):
        error_val, beta_val, beta_0_val = sess.run([error, beta_update, beta_0_update])
        error_epoch += error_val
        if t % 10 == 0:
            print(t, error_epoch, beta_0_val, beta_val[0], beta_val[1])
            error_epoch = 0

    coord.request_stop()
    coord.join(threads)

7. Debugging

Debugging: Visualize Computational Graph

1. Create a summary.FileWriter for the session and graph before running the session:

       import tensorflow as tf

       a = tf.constant(3.0, name='a')
       b = tf.constant(4.0, name='b')
       x = tf.add(a, b, name='x')

       print(a)

       sess = tf.Session()
       log = tf.summary.FileWriter('logs/add-two-constants.log', sess.graph)
       x_val = sess.run(x)
       log.close()
       print(x_val)

2. Run tensorboard on the logdir:

       > tensorboard --logdir logs/add-two-constants.log

3. Open localhost:6006 in your browser.

[Figure: visualization of the computational graph]

Summary (1/3)

- TensorFlow represents computations as graphs.
  - nodes represent (lists of) tensors
    - stored:
      - immutable: constant, placeholder
      - mutable: variable
    - computed: operation
  - edges represent dependencies
    - x → y: y is computed and x is one of its inputs
- Two phases:
  - graph construction
  - executing (parts of) the graph (running)


Summary (2/3)

- Nodes can be distributed over different devices.
  - cores of a CPU, GPUs, different compute nodes
  - automatic placement based on cost heuristics
    - eligible: sufficient memory available
    - expected runtime
      - based on cost heuristics
      - possibly also based on past runs
    - expected time for data movement between devices
- Operations can be assembled from dozens of elementary operations.
  - elementary math: add, subtract, multiply, divide
  - elementwise functions: log, exp, etc.
  - matrix operations: matrix product, inversion, etc.
  - structural tensor operations: slicing, stacking, etc.


Summary (3/3)

- Gradients can be computed automatically.
  - simply using the chain rule
  - and explicit gradients for all elementary operations
  - gradients add nodes to the graph
- Medium-sized data should be broken into parts and fed into a placeholder for parts.
  - e.g., SGD: single instances or minibatches
  - medium-sized data:
    - too large for the GPU
    - can still be read on a single data node
- Large data must be read by reader nodes as part of the graph execution.
  - large data: must be read on different data nodes in a distributed fashion

Further Readings

- TensorFlow white paper:
  - Abadi et al. [2016]
  - not yet fully complete: evaluation section is missing


References I
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S Corrado, Andy
Davis, Jeffrey Dean, Matthieu Devin, et al. Tensorflow: Large-scale machine learning on heterogeneous
distributed systems. arXiv preprint arXiv:1603.04467, 2016.
