Introduction to Tensors

import tensorflow as tf
import numpy as np
Tensors are multi-dimensional arrays with a
uniform type (called a dtype). You can see all supported dtypes at tf.dtypes.
If you're familiar with NumPy, tensors are (kind of) like np.arrays.
All tensors are immutable like Python numbers and strings: you can never update the contents of a tensor, only create a new one.
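As a small illustrative sketch (not from the original notebook), operations return new tensors and leave their inputs unchanged:

t = tf.constant([1, 2, 3])
u = t + 1          # builds a brand-new tensor
print(t.numpy())   # [1 2 3] -- the original is untouched
print(u.numpy())   # [2 3 4]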
Basics
Here is a "scalar" or "rank-0" tensor. A scalar contains a single value, and no "axes".
rank_0_tensor = tf.constant(4)
print(rank_0_tensor)
A "vector" or "rank-1" tensor is like a list of values. A vector has one axis:
print(rank_1_tensor)
A "matrix" or "rank-2" tensor has two axes:

# If you want to be specific, you can set the dtype (see below) at creation time
rank_2_tensor = tf.constant([[1, 2],
                             [3, 4],
                             [5, 6]], dtype=tf.float16)
print(rank_2_tensor)

tf.Tensor(
[[1. 2.]
 [3. 4.]
 [5. 6.]], shape=(3, 2), dtype=float16)
Tensors may have more axes; here is a tensor with three axes:
rank_3_tensor = tf.constant([
  [[0, 1, 2, 3, 4],
   [5, 6, 7, 8, 9]],
  [[10, 11, 12, 13, 14],
   [15, 16, 17, 18, 19]],
  [[20, 21, 22, 23, 24],
   [25, 26, 27, 28, 29]],])
print(rank_3_tensor)

tf.Tensor(
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]]

 [[10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]]], shape=(3, 2, 5), dtype=int32)
There are many ways you might visualize a tensor with more than two axes.
You can convert a tensor to a NumPy array either using np.array or the tensor.numpy method:

np.array(rank_2_tensor)

array([[1., 2.],
       [3., 4.],
       [5., 6.]], dtype=float16)

rank_2_tensor.numpy()

array([[1., 2.],
       [3., 4.],
       [5., 6.]], dtype=float16)
Tensors often contain floats and ints, but they have many other types, including complex numbers and strings.
The base tf.Tensor class requires tensors to be "rectangular"---that is, along each axis, every element is the same size. However, there are
specialized types of tensors that can handle different shapes, such as ragged tensors and sparse tensors (both covered later in this guide).
You can do basic math on tensors, including addition, element-wise multiplication, and matrix multiplication.
a = tf.constant([[1, 2],
                 [3, 4]])
b = tf.constant([[1, 1],
                 [1, 1]])  # Could have also said `tf.ones([2, 2], dtype=tf.int32)`

print(tf.add(a, b), "\n")       # element-wise addition
print(tf.multiply(a, b), "\n")  # element-wise multiplication
print(tf.matmul(a, b), "\n")    # matrix multiplication

tf.Tensor(
[[2 3]
 [4 5]], shape=(2, 2), dtype=int32)

tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32)

tf.Tensor(
[[3 3]
 [7 7]], shape=(2, 2), dtype=int32)

The Python operators a + b, a * b, and a @ b produce the same results.
Tensors are used in all kinds of operations (or "Ops"):

c = tf.constant([[4.0, 5.0],
                 [10.0, 1.0]])

print(tf.reduce_max(c))   # Find the largest value
print(tf.math.argmax(c))  # Find the index of the largest value
print(tf.nn.softmax(c))   # Compute the softmax

tf.Tensor(
[[2.6894143e-01 7.3105854e-01]
 [9.9987662e-01 1.2339458e-04]], shape=(2, 2), dtype=float32)
Note: Typically, anywhere a TensorFlow function expects a Tensor as input, the function will also accept anything that can be converted to
a Tensor using tf.convert_to_tensor. See below for an example.
tf.convert_to_tensor([1,2,3])
tf.reduce_max([1,2,3])
tf.reduce_max(np.array([1,2,3]))
About shapes

Tensors have shapes. Some vocabulary:

Shape: The length (number of elements) of each of the axes of a tensor.
Rank: Number of tensor axes. A scalar has rank 0, a vector has rank 1, a matrix is rank 2.
Axis or Dimension: A particular dimension of a tensor.
Size: The total number of items in the tensor, the product of the shape vector's elements.
Note: Although you may see reference to a "tensor of two dimensions", a rank-2 tensor does not usually describe a 2D space.
Tensors and tf.TensorShape objects have convenient properties for accessing these:

rank_4_tensor = tf.zeros([3, 2, 4, 5])
print("Type of every element:", rank_4_tensor.dtype)
print("Number of axes:", rank_4_tensor.ndim)
print("Shape of tensor:", rank_4_tensor.shape)
print("Total number of elements (3*2*4*5):", tf.size(rank_4_tensor).numpy())
But note that the Tensor.ndim and Tensor.shape attributes don't return Tensor objects. If you need a Tensor use
the tf.rank or tf.shape function. This difference is subtle, but it can be important when building graphs (later).
tf.rank(rank_4_tensor)
tf.shape(rank_4_tensor)
While axes are often referred to by their indices, you should always keep track of the meaning of each. Often axes are ordered from global to
local: The batch axis first, followed by spatial dimensions, and features for each location last. This way feature vectors are contiguous regions
of memory.
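For instance (an illustrative example, not from the original text), a batch of images is typically stored with shape [batch, height, width, channels]:

# A hypothetical batch of 32 RGB images, 224 pixels on a side:
images = tf.zeros([32, 224, 224, 3])
print("Batch axis:", images.shape[0])      # 32 examples
print("Spatial axes:", images.shape[1:3])  # (224, 224)
print("Feature axis:", images.shape[-1])   # 3 channels per location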
Single-axis indexing
TensorFlow follows standard Python indexing rules, similar to indexing a list or a string in Python, and the basic rules for NumPy indexing:

indexes start at 0
negative indices count backwards from the end
colons, :, are used for slices: start:stop:step
rank_1_tensor = tf.constant([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])
print(rank_1_tensor.numpy())

[ 0  1  1  2  3  5  8 13 21 34]
print("First:", rank_1_tensor[0].numpy())
print("Second:", rank_1_tensor[1].numpy())
print("Last:", rank_1_tensor[-1].numpy())
First: 0
Second: 1
Last: 34
print("Everything:", rank_1_tensor[:].numpy())
print("Reversed:", rank_1_tensor[::-1].numpy())
Everything: [ 0 1 1 2 3 5 8 13 21 34]
Before 4: [0 1 1 2]
From 2, before 7: [1 2 3 5 8]
Reversed: [34 21 13 8 5 3 2 1 1 0]
Multi-axis indexing
The exact same rules as in the single-axis case apply to each axis independently.
print(rank_2_tensor.numpy())
[[1. 2.]
[3. 4.]
[5. 6.]]
print(rank_2_tensor[1, 1].numpy())
4.0
print("Skip the first row:")
print(rank_2_tensor[1:, :].numpy())

[[3. 4.]
 [5. 6.]]
Here is an example with a 3-axis tensor, selecting the last feature across all locations in each example in the batch:

print(rank_3_tensor[:, :, 4])

tf.Tensor(
[[ 4  9]
 [14 19]
 [24 29]], shape=(3, 2), dtype=int32)
Read the tensor slicing guide to learn how you can apply indexing to manipulate individual elements in your tensors.
Manipulating Shapes

Reshaping a tensor is of great utility.

x = tf.constant([[1], [2], [3]])

# Shape returns a `TensorShape` object that shows the size along each axis
print(x.shape)
(3, 1)
# You can convert this object into a Python list, too
print(x.shape.as_list())
[3, 1]
You can reshape a tensor into a new shape. The tf.reshape operation is fast and cheap as the underlying data does not need to be duplicated.

# Note that you're passing in a list for the new shape
reshaped = tf.reshape(x, [1, 3])

print(x.shape)
print(reshaped.shape)
(3, 1)
(1, 3)
The data maintains its layout in memory and a new tensor is created, with the requested shape, pointing to the same data. TensorFlow uses
C-style "row-major" memory ordering, where incrementing the rightmost index corresponds to a single step in memory.
print(rank_3_tensor)

tf.Tensor(
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]]

 [[10 11 12 13 14]
  [15 16 17 18 19]]

 [[20 21 22 23 24]
  [25 26 27 28 29]]], shape=(3, 2, 5), dtype=int32)
If you flatten a tensor you can see what order it is laid out in memory.
print(tf.reshape(rank_3_tensor, [-1]))

tf.Tensor(
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29], shape=(30,), dtype=int32)
Typically the only reasonable use of tf.reshape is to combine or split adjacent axes (or add/remove 1s).
For this 3x2x5 tensor, reshaping to (3x2)x5 or 3x(2x5) are both reasonable things to do, as the slices do not mix:
print(tf.reshape(rank_3_tensor, [3*2, 5]), "\n")
print(tf.reshape(rank_3_tensor, [3, -1]))

tf.Tensor(
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]
 [25 26 27 28 29]], shape=(6, 5), dtype=int32)

tf.Tensor(
[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]], shape=(3, 10), dtype=int32)
Reshaping will "work" for any new shape with the same total number of elements, but it will not do anything useful if you do not respect the
order of the axes.
Swapping axes in tf.reshape does not work; you need tf.transpose for that.

# Bad examples: don't do this

# You can't reorder axes with reshape.
print(tf.reshape(rank_3_tensor, [2, 3, 5]), "\n")

# This is a mess
print(tf.reshape(rank_3_tensor, [5, 6]), "\n")

# This doesn't work at all
try:
  tf.reshape(rank_3_tensor, [7, -1])
except Exception as e:
  print(f"{type(e).__name__}: {e}")

tf.Tensor(
[[[ 0  1  2  3  4]
  [ 5  6  7  8  9]
  [10 11 12 13 14]]

 [[15 16 17 18 19]
  [20 21 22 23 24]
  [25 26 27 28 29]]], shape=(2, 3, 5), dtype=int32)

tf.Tensor(
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]], shape=(5, 6), dtype=int32)
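As a hedged aside (this snippet is not part of the original notebook output), tf.transpose is the op that actually reorders the data:

# Swap the first two axes with tf.transpose; `perm` gives the new axis order.
transposed = tf.transpose(rank_3_tensor, perm=[1, 0, 2])
print(transposed.shape)  # (2, 3, 5), with the data genuinely reordered in memory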
You may run across not-fully-specified shapes. Either the shape contains a None (an axis-length is unknown) or the whole shape is None (the
rank of the tensor is unknown).
Except for tf.RaggedTensor, such shapes will only occur in the context of TensorFlow's symbolic, graph-building APIs, such as tf.function and the Keras functional API.
More on DTypes
To inspect a tf.Tensor's data type use the Tensor.dtype property.
When creating a tf.Tensor from a Python object you may optionally specify the datatype.
If you don't, TensorFlow chooses a datatype that can represent your data. TensorFlow converts Python integers to tf.int32 and Python
floating point numbers to tf.float32. Otherwise TensorFlow uses the same rules NumPy uses when converting to arrays.
You can cast from type to type:

the_f64_tensor = tf.constant([2.2, 3.3, 4.4], dtype=tf.float64)
the_f16_tensor = tf.cast(the_f64_tensor, dtype=tf.float16)
# Now, cast to an uint8 and lose the decimal precision
the_u8_tensor = tf.cast(the_f16_tensor, dtype=tf.uint8)
print(the_u8_tensor)

tf.Tensor([2 3 4], shape=(3,), dtype=uint8)
Broadcasting
Broadcasting is a concept borrowed from the equivalent feature in NumPy. In short, under certain conditions, smaller tensors are "stretched"
automatically to fit larger tensors when running combined operations on them.
The simplest and most common case is when you attempt to multiply or add a tensor to a scalar. In that case, the scalar is broadcast to be the
same shape as the other argument.
x = tf.constant([1, 2, 3])
y = tf.constant(2)
z = tf.constant([2, 2, 2])
print(tf.multiply(x, 2))
print(x * y)
print(x * z)
Likewise, axes with length 1 can be stretched out to match the other arguments. Both arguments can be stretched in the same computation.
In this case a 3x1 matrix is element-wise multiplied by a 1x4 matrix to produce a 3x4 matrix. Note how the leading 1 is optional: The shape of
y is [4].
x = tf.reshape(x,[3,1])
y = tf.range(1, 5)
print(x, "\n")
print(y, "\n")
print(tf.multiply(x, y))
tf.Tensor(
[[1]
 [2]
 [3]], shape=(3, 1), dtype=int32)

tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)

tf.Tensor(
[[ 1  2  3  4]
 [ 2  4  6  8]
 [ 3  6  9 12]], shape=(3, 4), dtype=int32)
Here is the same operation without broadcasting:

x_stretch = tf.constant([[1, 1, 1, 1],
                         [2, 2, 2, 2],
                         [3, 3, 3, 3]])

y_stretch = tf.constant([[1, 2, 3, 4],
                         [1, 2, 3, 4],
                         [1, 2, 3, 4]])

print(x_stretch * y_stretch)  # Again, operator overloading

tf.Tensor(
[[ 1  2  3  4]
 [ 2  4  6  8]
 [ 3  6  9 12]], shape=(3, 4), dtype=int32)
Most of the time, broadcasting is both time and space efficient, as the broadcast operation never materializes the expanded tensors in
memory.
You can see what broadcasting looks like using tf.broadcast_to:

print(tf.broadcast_to(tf.constant([1, 2, 3]), [3, 3]))

tf.Tensor(
[[1 2 3]
 [1 2 3]
 [1 2 3]], shape=(3, 3), dtype=int32)

Unlike a mathematical op, for example, broadcast_to does nothing special to save memory. Here, you are materializing the tensor.
It can get even more complicated. This section of Jake VanderPlas's book Python Data Science Handbook shows more broadcasting tricks
(again in NumPy).
tf.convert_to_tensor
Most ops, like tf.matmul and tf.reshape, take arguments of class tf.Tensor. However, you'll notice in the above case, Python objects shaped
like tensors are accepted.
Most, but not all, ops call convert_to_tensor on non-tensor arguments. There is a registry of conversions, and most object classes like
NumPy's ndarray, TensorShape, Python lists, and tf.Variable will all convert automatically.
See tf.register_tensor_conversion_function for more details, and if you have your own type you'd like to automatically convert to a tensor.
Ragged Tensors
A tensor with variable numbers of elements along some axis is called "ragged". Use tf.ragged.RaggedTensor for ragged data.
ragged_list = [
[0, 1, 2, 3],
[4, 5],
[6, 7, 8],
[9]]
try:
tensor = tf.constant(ragged_list)
except Exception as e:
print(f"{type(e).__name__}: {e}")
ragged_tensor = tf.ragged.constant(ragged_list)
print(ragged_tensor)
The shape of a tf.RaggedTensor will contain some axes with unknown lengths:
print(ragged_tensor.shape)
(4, None)
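As a hedged aside (not part of the original guide), if you need a rectangular tensor you can pad a ragged tensor out to its bounding shape:

# Pads the short rows with zeros to produce a regular (4, 4) tensor.
print(ragged_tensor.to_tensor())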
String tensors
tf.string is a dtype, which is to say you can represent data as strings (variable-length byte arrays) in tensors.
The strings are atomic and cannot be indexed the way Python strings are. The length of the string is not one of the axes of the tensor.
See tf.strings for functions to manipulate them.
# Tensors can be strings, too; here is a scalar string.
scalar_string_tensor = tf.constant("Gray wolf")
print(scalar_string_tensor)

# If you have three string tensors of different lengths, this is OK.
tensor_of_strings = tf.constant(["Gray wolf",
                                 "Quick brown fox",
                                 "Lazy dog"])
# Note that the shape is (3,). The string length is not included.
print(tensor_of_strings)
In the above printout the b prefix indicates that tf.string dtype is not a unicode string, but a byte-string. See the Unicode Tutorial for more
about working with unicode text in TensorFlow.
tf.constant("🥳👍")
Some basic functions with strings can be found in tf.strings, including tf.strings.split.
print(tf.strings.split(tensor_of_strings))
And tf.strings.to_number converts string tensors to numbers:

text = tf.constant("1 10 100")
print(tf.strings.to_number(tf.strings.split(text, " ")))

tf.Tensor([  1.  10. 100.], shape=(3,), dtype=float32)
Although you can't use tf.cast to turn a string tensor into numbers, you can convert it into bytes, and then into numbers.
byte_strings = tf.strings.bytes_split(tf.constant("Duck"))
byte_ints = tf.io.decode_raw(tf.constant("Duck"), tf.uint8)
print("Byte strings:", byte_strings)
print("Bytes:", byte_ints)

# Or split it up as unicode characters and inspect their bytes
unicode_char_bytes = tf.strings.unicode_split(tf.constant("アヒル 🦆"), "UTF-8")
print("Unicode chars:", unicode_char_bytes)

Unicode chars: tf.Tensor([b'\xe3\x82\xa2' b'\xe3\x83\x92' b'\xe3\x83\xab' b' ' b'\xf0\x9f\xa6\x86'], shape=(5,), dtype=string)
The tf.string dtype is used for all raw bytes data in TensorFlow. The tf.io module contains functions for converting data to and from bytes,
including decoding images and parsing CSV.
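As a brief, hedged sketch (not from the original notebook) of what CSV parsing with tf.io looks like:

# Parse CSV rows into three int32 columns; `record_defaults` fixes the column types.
fields = tf.io.decode_csv(tf.constant(["1,2,3", "4,5,6"]), record_defaults=[0, 0, 0])
print(fields)  # a list of three column tensors: [1 4], [2 5], [3 6]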
Sparse tensors
Sometimes, your data is sparse, like a very wide embedding space. TensorFlow supports tf.sparse.SparseTensor and related operations to
store sparse data efficiently.
sparse_tensor = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                                       values=[1, 2],
                                       dense_shape=[3, 4])
print(sparse_tensor, "\n")
print(tf.sparse.to_dense(sparse_tensor))
SparseTensor(indices=tf.Tensor(
[[0 0]
[1 2]], shape=(2, 2), dtype=int64), values=tf.Tensor([1 2], shape=(2,), dtype=int32), dense_shape=tf.Tensor([3 4], shape=(2,), dtype=int64))
tf.Tensor(
[[1 0 0 0]
[0 0 2 0]
[0 0 0 0]], shape=(3, 4), dtype=int32)
Introduction to Variables
Variables are created and tracked via the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it.
Specific ops allow you to read and modify the values of this tensor. Higher level libraries like tf.keras use tf.Variable to store model
parameters.
Setup
This notebook discusses variable placement. If you want to see on what device your variables are placed, uncomment this line.
import tensorflow as tf
# tf.debugging.set_log_device_placement(True)
Create a variable
To create a variable, provide an initial value. The tf.Variable will have the same dtype as the initialization value.
my_tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])
my_variable = tf.Variable(my_tensor)
A variable looks and acts like a tensor, and, in fact, is a data structure backed by a tf.Tensor. Like tensors, they have a dtype and a shape, and
can be exported to NumPy.
print("Shape:", my_variable.shape)
print("As NumPy:", my_variable.numpy())

Shape: (2, 2)
As NumPy: [[1. 2.]
 [3. 4.]]
Most tensor operations work on variables as expected, although variables cannot be reshaped.

print("Viewed as a tensor:", tf.convert_to_tensor(my_variable))
print("Index of highest value:", tf.math.argmax(my_variable))

# This creates a new tensor; it does not reshape the variable.
print("Copying and reshaping:", tf.reshape(my_variable, [1, 4]))
As noted above, variables are backed by tensors. You can reassign the tensor using tf.Variable.assign. Calling assign does not (usually)
allocate a new tensor; instead, the existing tensor's memory is reused.
a = tf.Variable([2.0, 3.0])
a.assign([1, 2])
# Not allowed as it resizes the variable:
try:
  a.assign([1.0, 2.0, 3.0])
except Exception as e:
  print(f"{type(e).__name__}: {e}")
ValueError: Cannot assign value to variable ' Variable:0': Shape mismatch.The variable shape (2,), and the assigned value shape (3,) are
incompatible.
If you use a variable like a tensor in operations, you will usually operate on the backing tensor.
Creating new variables from existing variables duplicates the backing tensors. Two variables will not share the same memory.
a = tf.Variable([2.0, 3.0])
# Create b based on the value of a
b = tf.Variable(a)
a.assign([5, 6])

# a and b are different
print(a.numpy())
print(b.numpy())

# There are other versions of assign
print(a.assign_add([2, 3]).numpy())  # [7. 9.]
print(a.assign_sub([7, 9]).numpy())  # [0. 0.]

[5. 6.]
[2. 3.]
[7. 9.]
[0. 0.]
In Python-based TensorFlow, tf.Variable instances have the same lifecycle as other Python objects. When there are no references to a variable
it is automatically deallocated.
Variables can also be named which can help you track and debug them. You can give two variables the same name.
# Create a and b; they will have the same name but will be backed by
# different tensors.
a = tf.Variable(my_tensor, name="Mark")
b = tf.Variable(my_tensor + 1, name="Mark")
print(a == b)
tf.Tensor(
[[False False]
 [False False]], shape=(2, 2), dtype=bool)
Although variables are important for differentiation, some variables will not need to be differentiated. You can turn off gradients for a
variable by setting trainable to false at creation. An example of a variable that would not need gradients is a training step counter.
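A minimal sketch of such a counter (a non-trainable variable can still be updated and read, but the gradient tape will not watch it by default):

step_counter = tf.Variable(1, trainable=False)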
For better performance, TensorFlow will attempt to place tensors and variables on the fastest device compatible with its dtype. This means
most variables are placed on a GPU if one is available.
However, you can override this. In this snippet, place a float tensor and a variable on the CPU, even if a GPU is available. By turning on device
placement logging (see Setup), you can see where the variable is placed.
Note: Although manual placement works, using distribution strategies can be a more convenient and scalable way to optimize your
computation.
If you run this notebook on different backends with and without a GPU you will see different logging. Note that logging device placement
must be turned on at the start of the session.
with tf.device('CPU:0'):
  # Create some tensors
  a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
  c = tf.matmul(a, b)

print(c)

tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
It's possible to set the location of a variable or tensor on one device and do the computation on another device. This will introduce delay, as
data needs to be copied between the devices.
You might do this, however, if you had multiple GPU workers but only want one copy of the variables.
with tf.device('CPU:0'):
  a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.Variable([[1.0, 2.0, 3.0]])

with tf.device('GPU:0'):
  # Element-wise multiply
  k = a * b

print(k)

tf.Tensor(
[[ 1.  4.  9.]
 [ 4. 10. 18.]], shape=(2, 3), dtype=float32)
Note: Because tf.config.set_soft_device_placement is turned on by default, even if you run this code on a device without a GPU, it will still
run. The multiplication step will happen on the CPU.
Next steps
To understand how variables are typically used, see our guide on automatic differentiation.
Introduction to Gradients and Automatic Differentiation
In this guide, you will explore ways to compute gradients with TensorFlow, especially in eager execution.
Setup
import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
Computing gradients
To differentiate automatically, TensorFlow needs to remember what operations happen in what order during the forward pass. Then, during
the backward pass, TensorFlow traverses this list of operations in reverse order to compute gradients.
Gradient tapes
TensorFlow provides the tf.GradientTape API for automatic differentiation; that is, computing the gradient of a computation with respect to
some inputs, usually tf.Variables. TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape".
TensorFlow then uses that tape to compute the gradients of a "recorded" computation using reverse mode differentiation.
x = tf.Variable(3.0)

with tf.GradientTape() as tape:
  y = x**2
Once you've recorded some operations, use GradientTape.gradient(target, sources) to calculate the gradient of some target (often a loss)
relative to some source (often the model's variables):
# dy = 2x * dx
dy_dx = tape.gradient(y, x)
dy_dx.numpy()
6.0
The above example uses scalars, but tf.GradientTape works as easily on any tensor:
w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = [[1., 2., 3.]]

with tf.GradientTape(persistent=True) as tape:
  y = x @ w + b
  loss = tf.reduce_mean(y**2)

[dl_dw, dl_db] = tape.gradient(loss, [w, b])
To get the gradient of loss with respect to both variables, you can pass both as sources to the gradient method. The tape is flexible about how
sources are passed and will accept any nested combination of lists or dictionaries and return the gradient structured the same way
(see tf.nest).
The gradient with respect to each source has the shape of the source:
print(w.shape)
print(dl_dw.shape)
(3, 2)
(3, 2)
Here is the gradient calculation again, this time passing a dictionary of variables:
my_vars = {
    'w': w,
    'b': b
}

grad = tape.gradient(loss, my_vars)
grad['b']
In most cases, you will want to calculate gradients with respect to a model's trainable variables. Since all subclasses of tf.Module aggregate
their variables in the Module.trainable_variables property, you can calculate these gradients in a few lines of code:
layer = tf.keras.layers.Dense(2, activation='relu')
x = tf.constant([[1., 2., 3.]])

with tf.GradientTape() as tape:
  # Forward pass
  y = layer(x)
  loss = tf.reduce_mean(y**2)

# Calculate gradients with respect to every trainable variable
grad = tape.gradient(loss, layer.trainable_variables)
Controlling what the tape watches

The default behavior is to record all operations after accessing a trainable tf.Variable. The reasons for this are:
The tape needs to know which operations to record in the forward pass to calculate the gradients in the backwards pass.
The tape holds references to intermediate outputs, so you don't want to record unnecessary operations.
The most common use case involves calculating the gradient of a loss with respect to all a model's trainable variables.
For example, the following fails to calculate a gradient because the tf.Tensor is not "watched" by default, and the tf.Variable is not trainable:
# A trainable variable
x0 = tf.Variable(3.0, name='x0')
# Not trainable
x1 = tf.Variable(3.0, name='x1', trainable=False)
# Not a variable: a variable + a tensor returns a tensor.
x2 = tf.Variable(2.0, name='x2') + 1.0
# Not a variable
x3 = tf.constant(3.0, name='x3')

with tf.GradientTape() as tape:
  y = (x0**2) + (x1**2) + (x2**2)

grad = tape.gradient(y, [x0, x1, x2, x3])

for g in grad:
  print(g)

tf.Tensor(6.0, shape=(), dtype=float32)
None
None
None
You can list the variables being watched by the tape using the GradientTape.watched_variables method:

[var.name for var in tape.watched_variables()]

['x0:0']
tf.GradientTape provides hooks that give the user control over what is or is not watched.
To record gradients with respect to a tf.Tensor, you need to call GradientTape.watch(x):

x = tf.constant(3.0)

with tf.GradientTape() as tape:
  tape.watch(x)
  y = x**2
# dy = 2x * dx
dy_dx = tape.gradient(y, x)
print(dy_dx.numpy())
6.0
Conversely, to disable the default behavior of watching all tf.Variables, set watch_accessed_variables=False when creating the gradient tape.
This calculation uses two variables, but only connects the gradient for one of the variables:
x0 = tf.Variable(0.0)
x1 = tf.Variable(10.0)

with tf.GradientTape(watch_accessed_variables=False) as tape:
  tape.watch(x1)
  y0 = tf.math.sin(x0)
  y1 = tf.nn.softplus(x1)
  y = y0 + y1
  ys = tf.reduce_sum(y)
Since GradientTape.watch was not called on x0, no gradient is computed with respect to it:
# dys/dx1 = exp(x1) / (1 + exp(x1)) = sigmoid(x1)
grad = tape.gradient(ys, {'x0': x0, 'x1': x1})

print('dy/dx0:', grad['x0'])
print('dy/dx1:', grad['x1'].numpy())
dy/dx0: None
dy/dx1: 0.9999546
Intermediate results
You can also request gradients of the output with respect to intermediate values computed inside the tf.GradientTape context.
x = tf.constant(3.0)

with tf.GradientTape() as tape:
  tape.watch(x)
  y = x * x
  z = y * y
# Use the tape to compute the gradient of z with respect to the
# intermediate value y.
# dz_dy = 2 * y and y = x ** 2 = 9
print(tape.gradient(z, y).numpy())
18.0
By default, the resources held by a GradientTape are released as soon as the GradientTape.gradient method is called. To compute multiple
gradients over the same computation, create a gradient tape with persistent=True. This allows multiple calls to the gradient method as
resources are released when the tape object is garbage collected. For example:
x = tf.constant([1, 3.0])

with tf.GradientTape(persistent=True) as tape:
  tape.watch(x)
  y = x * x
  z = y * y

print(tape.gradient(z, x).numpy())  # [4.0, 108.0] (4 * x**3 at x = [1.0, 3.0])
print(tape.gradient(y, x).numpy())  # [2.0, 6.0] (2 * x at x = [1.0, 3.0])

[  4. 108.]
[2. 6.]
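Since resources are only released when the tape object is garbage collected, you may want to drop the reference explicitly once you are done with a persistent tape (a small follow-up to the example above):

del tape   # Drop the reference to the tape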
Notes on performance
There is a tiny overhead associated with doing operations inside a gradient tape context. For most eager execution this will not be a
noticeable cost, but you should still only use the tape context around the areas where it is required.
Gradient tapes use memory to store intermediate results, including inputs and outputs, for use during the backwards pass.
For efficiency, some ops (like ReLU) don't need to keep their intermediate results and they are pruned during the forward pass. However, if
you use persistent=True on your tape, nothing is discarded and your peak memory usage will be higher.
Gradients of non-scalar targets

A gradient is fundamentally an operation on a scalar.

x = tf.Variable(2.0)
with tf.GradientTape(persistent=True) as tape:
  y0 = x**2
  y1 = 1 / x
print(tape.gradient(y0, x).numpy())
print(tape.gradient(y1, x).numpy())
4.0
-0.25
Thus, if you ask for the gradient of multiple targets, the result for each source is the gradient of the sum of the targets (or, equivalently, the sum of the gradients of each target):

x = tf.Variable(2.0)
with tf.GradientTape() as tape:
  y0 = x**2
  y1 = 1 / x

print(tape.gradient({'y0': y0, 'y1': y1}, x).numpy())

3.75
Similarly, if the target(s) are not scalar the gradient of the sum is calculated:
x = tf.Variable(2.)

with tf.GradientTape() as tape:
  y = x * [3., 4.]

print(tape.gradient(y, x).numpy())
7.0
This makes it simple to take the gradient of the sum of a collection of losses, or the gradient of the sum of an element-wise loss calculation.
In some cases you can skip the Jacobian. For an element-wise calculation, the gradient of the sum gives the derivative of each element with
respect to its input-element, since each element is independent:
x = tf.linspace(-10.0, 10.0, 200+1)

with tf.GradientTape() as tape:
  tape.watch(x)
  y = tf.nn.sigmoid(x)

dy_dx = tape.gradient(y, x)

plt.plot(x, y, label='y')
plt.plot(x, dy_dx, label='dy/dx')
plt.legend()
_ = plt.xlabel('x')
Control flow
Because a gradient tape records operations as they are executed, Python control flow is naturally handled (for
example, if and while statements).
Here a different variable is used on each branch of an if. The gradient only connects to the variable that was used:
x = tf.constant(1.0)

v0 = tf.Variable(2.0)
v1 = tf.Variable(2.0)

with tf.GradientTape(persistent=True) as tape:
  tape.watch(x)
  if x > 0.0:
    result = v0
  else:
    result = v1**2

dv0, dv1 = tape.gradient(result, [v0, v1])

print(dv0)
print(dv1)

tf.Tensor(1.0, shape=(), dtype=float32)
None
Just remember that the control statements themselves are not differentiable, so they are invisible to gradient-based optimizers.
Depending on the value of x in the above example, the tape either records result = v0 or result = v1**2. The gradient with respect to x is
always None.
dx = tape.gradient(result, x)
print(dx)
None
Getting a gradient of None

When a target is not connected to a source, the gradient is None.

x = tf.Variable(2.)
y = tf.Variable(3.)

with tf.GradientTape() as tape:
  z = y * y

print(tape.gradient(z, x))

None
Here z is obviously not connected to x, but there are several less-obvious ways that a gradient can be disconnected.
In the section on "controlling what the tape watches" you saw that the tape will automatically watch a tf.Variable but not a tf.Tensor.
One common error is to inadvertently replace a tf.Variable with a tf.Tensor, instead of using Variable.assign to update the tf.Variable. Here is
an example:
x = tf.Variable(2.0)

for epoch in range(2):
  with tf.GradientTape() as tape:
    y = x + 1

  print(type(x).__name__, ":", tape.gradient(y, x))
  x = x + 1   # This should be `x.assign_add(1)`

ResourceVariable : tf.Tensor(1.0, shape=(), dtype=float32)
EagerTensor : None
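A quick hedged sketch of the fix: update the variable in place with assign_add, so x stays a tf.Variable and keeps receiving gradients.

x = tf.Variable(2.0)

for epoch in range(2):
  with tf.GradientTape() as tape:
    y = x + 1

  print(type(x).__name__, ":", tape.gradient(y, x))
  x.assign_add(1.0)   # x remains a ResourceVariable, so the gradient stays connected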
The tape can't record the gradient path if the calculation exits TensorFlow. For example:
x = tf.Variable([[1.0, 2.0],
                 [3.0, 4.0]], dtype=tf.float32)

with tf.GradientTape() as tape:
  x2 = x**2

  # This step is calculated with NumPy
  y = np.mean(x2, axis=0)

  # Like most ops, reduce_mean will cast the NumPy array to a constant tensor
  # using `tf.convert_to_tensor`.
  y = tf.reduce_mean(y, axis=0)

print(tape.gradient(y, x))
None
Integers and strings are not differentiable. If a calculation path uses these data types there will be no gradient.
Nobody expects strings to be differentiable, but it's easy to accidentally create an int constant or variable if you don't specify the dtype.
x = tf.constant(10)
with tf.GradientTape() as g:
  g.watch(x)
  y = x * x

print(g.gradient(y, x))
WARNING:tensorflow:The dtype of the watched tensor must be floating (e.g. tf.float32), got tf.int32
None
TensorFlow doesn't automatically cast between types, so, in practice, you'll often get a type error instead of a missing gradient.
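A small hedged sketch of the fix: give the constant a floating-point dtype (or just write 10.0) and the gradient flows:

x = tf.constant(10.0)   # note the decimal point: float32, not int32

with tf.GradientTape() as g:
  g.watch(x)
  y = x * x

print(g.gradient(y, x))   # tf.Tensor(20.0, shape=(), dtype=float32)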
State stops gradients. When you read from a stateful object, the tape can only observe the current state, not the history that led to it.
A tf.Tensor is immutable. You can't change a tensor once it's created. It has a value, but no state. All the operations discussed so far are also
stateless: the output of a tf.matmul only depends on its inputs.
A tf.Variable has internal state—its value. When you use the variable, the state is read. It's normal to calculate a gradient with respect to a
variable, but the variable's state blocks gradient calculations from going farther back. For example:
x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

with tf.GradientTape() as tape:
  # Update x1 = x1 + x0.
  x1.assign_add(x0)
  # The tape starts recording from x1.
  y = x1**2   # y = (x1 + x0)**2

# This doesn't work.
print(tape.gradient(y, x0))   # dy/dx0 = 2*(x1 + x0)

None
Similarly, tf.data.Dataset iterators and tf.queues are stateful, and will stop all gradients on tensors that pass through them.
No gradient registered
Some tf.Operations are registered as being non-differentiable and will return None. Others have no gradient registered.
The tf.raw_ops page shows which low-level ops have gradients registered.
If you attempt to take a gradient through a float op that has no gradient registered the tape will throw an error instead of silently
returning None. This way you know something has gone wrong.
For example, the tf.image.adjust_contrast function wraps raw_ops.AdjustContrastv2, which could have a gradient but the gradient is not
implemented:
image = tf.Variable([[[0.5, 0.0, 0.0]]])
delta = tf.Variable(0.1)

with tf.GradientTape() as tape:
  new_image = tf.image.adjust_contrast(image, delta)

try:
  print(tape.gradient(new_image, [image, delta]))
  assert False   # This should not happen.
except LookupError as e:
  print(f'{type(e).__name__}: {e}')
If you need to differentiate through this op, you'll either need to implement the gradient and register it (using tf.RegisterGradient) or re-
implement the function using other ops.
In some cases it would be convenient to get 0 instead of None for unconnected gradients. You can decide what to return when you have
unconnected gradients using the unconnected_gradients argument:
x = tf.Variable([2., 2.])
y = tf.Variable(3.)

with tf.GradientTape() as tape:
  z = y**2

print(tape.gradient(z, x, unconnected_gradients=tf.UnconnectedGradients.ZERO))
tf.Tensor([0. 0.], shape=(2,), dtype=float32)
Introduction to Tensor Slicing
In NLP applications, you can use tensor slicing to perform word masking while training. For example, you can generate training data from a list of
sentences by choosing a word index to mask in each sentence, taking the word out as a label, and then replacing the chosen word with a mask
token.
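As a purely illustrative sketch of that idea (the sentence data, [MASK] token, and mask position below are hypothetical, not taken from this guide):

sentences = tf.constant([["the", "quick", "brown", "fox"],
                         ["a", "lazy", "dog", "slept"]])
mask_index = 2   # word position to mask in every sentence

labels = sentences[:, mask_index]                       # the words that were removed
mask = tf.fill([tf.shape(sentences)[0], 1], "[MASK]")   # a column of mask tokens
masked = tf.concat([sentences[:, :mask_index],
                    mask,
                    sentences[:, mask_index + 1:]], axis=1)
print(labels)   # [b'brown' b'dog']
print(masked)   # the original sentences with position 2 replaced by [MASK]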
In this guide, you will learn how to use the TensorFlow APIs to:

extract slices from a tensor
insert data at specific indices in a tensor
This guide assumes familiarity with tensor indexing. Read the indexing sections of the Tensor and TensorFlow NumPy guides before getting
started with this guide.
Setup
import tensorflow as tf
import numpy as np
Extract tensor slices

Perform NumPy-like tensor slicing using tf.slice:

t1 = tf.constant([0, 1, 2, 3, 4, 5, 6, 7])
print(tf.slice(t1,
begin=[1],
size=[3]))
Alternatively, you can use a more Pythonic syntax. Note that tensor slices are evenly spaced over a start-stop range.
print(t1[1:4])

tf.Tensor([1 2 3], shape=(3,), dtype=int32)

print(t1[-3:])

tf.Tensor([5 6 7], shape=(3,), dtype=int32)
For 2-dimensional tensors, you can use something like:

t2 = tf.constant([[0, 1, 2, 3, 4],
                  [5, 6, 7, 8, 9],
                  [10, 11, 12, 13, 14],
                  [15, 16, 17, 18, 19]])

print(t2[:-1, 1:3])

tf.Tensor(
[[ 1  2]
 [ 6  7]
 [11 12]], shape=(3, 2), dtype=int32)
You can use tf.slice on higher dimensional tensors as well.

t3 = tf.constant([[[1, 3, 5, 7],
                   [9, 11, 13, 15]],
                  [[17, 19, 21, 23],
                   [25, 27, 29, 31]]
                  ])

print(tf.slice(t3,
               begin=[1, 1, 0],
               size=[1, 1, 2]))

tf.Tensor([[[25 27]]], shape=(1, 1, 2), dtype=int32)
You can also use tf.strided_slice to extract slices of tensors by 'striding' over the tensor dimensions.
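A short hedged sketch of what that looks like (this snippet is not an output from the original guide):

# Take every second element of t1 between indices 1 (inclusive) and 7 (exclusive).
print(tf.strided_slice(t1, begin=[1], end=[7], strides=[2]))
# tf.Tensor([1 3 5], shape=(3,), dtype=int32)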
Use tf.gather to extract specific indices from a single axis of a tensor.

print(tf.gather(t1,
                indices=[0, 3, 6]))

# This is similar to doing
t1[::3]

tf.Tensor([0 3 6], shape=(3,), dtype=int32)
tf.gather does not require indices to be evenly spaced.

alphabet = tf.constant(list('abcdefghijklmnopqrstuvwxyz'))

print(tf.gather(alphabet,
                indices=[2, 0, 19, 18]))

tf.Tensor([b'c' b'a' b't' b's'], shape=(4,), dtype=string)
To extract slices from multiple axes of a tensor, use tf.gather_nd. This is useful when you want to gather the elements of a matrix as opposed to
just its rows or columns.
t4 = tf.constant([[0, 5],
[1, 6],
[2, 7],
[3, 8],
[4, 9]])
print(tf.gather_nd(t4,
                   indices=[[2], [3], [0]]))

tf.Tensor(
[[2 7]
 [3 8]
 [0 5]], shape=(3, 2), dtype=int32)
t5 = np.reshape(np.arange(18), [2, 3, 3])

# Return a list of two matrices
print(tf.gather_nd(t5,
                   indices=[[[0, 0], [0, 2]], [[1, 0], [1, 1]]]))

tf.Tensor(
[[[ 0  1  2]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]]], shape=(2, 2, 3), dtype=int64)

# Return one matrix
print(tf.gather_nd(t5,
                   indices=[[0, 0], [0, 2], [1, 0], [1, 1]]))

tf.Tensor(
[[ 0  1  2]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]], shape=(4, 3), dtype=int64)
Use tf.scatter_nd to insert data at specific slices/indices of a tensor. Note that the tensor into which you insert values is zero-initialized.
t6 = tf.constant([10])
indices = tf.constant([[1], [3], [5], [7], [9]])
data = tf.constant([2, 4, 6, 8, 10])

print(tf.scatter_nd(indices=indices,
                    updates=data,
                    shape=t6))

tf.Tensor([ 0  2  0  4  0  6  0  8  0 10], shape=(10,), dtype=int32)
Methods like tf.scatter_nd which require zero-initialized tensors are similar to sparse tensor initializers. You can
use tf.gather_nd and tf.scatter_nd to mimic the behavior of sparse tensor ops.
Consider an example where you construct a sparse tensor using these two methods in conjunction.
# Gather values from one tensor by specifying indices
new_indices = tf.constant([[0, 2], [2, 1], [3, 3]])
t7 = tf.gather_nd(t2, indices=new_indices)

# Add these values into a new tensor
t8 = tf.scatter_nd(indices=new_indices, updates=t7, shape=tf.constant([4, 5]))

print(t8)

tf.Tensor(
[[ 0  0  2  0  0]
 [ 0  0  0  0  0]
 [ 0 11  0  0  0]
 [ 0  0  0 18  0]], shape=(4, 5), dtype=int32)

This is similar to:

t9 = tf.SparseTensor(indices=[[0, 2], [2, 1], [3, 3]],
                     values=[2, 11, 18],
                     dense_shape=[4, 5])
print(t9)
SparseTensor(indices=tf.Tensor(
[[0 2]
[2 1]
[3 3]], shape=(3, 2), dtype=int64), values=tf.Tensor([ 2 11 18], shape=(3,), dtype=int32), dense_shape=tf.Tensor([4 5], shape=(2,), dtype=int64))
# Convert the sparse tensor into a dense tensor
t10 = tf.sparse.to_dense(t9)

print(t10)

tf.Tensor(
[[ 0  0  2  0  0]
 [ 0  0  0  0  0]
 [ 0 11  0  0  0]
 [ 0  0  0 18  0]], shape=(4, 5), dtype=int32)
Use tf.tensor_scatter_nd_add to add values to a tensor with pre-existing values.

t11 = tf.constant([[2, 7, 0],
                   [9, 0, 1],
                   [0, 3, 8]])

# Convert the tensor into a magic square by inserting numbers at appropriate indices
t12 = tf.tensor_scatter_nd_add(t11,
                               indices=[[0, 2], [1, 1], [2, 0]],
                               updates=[6, 5, 4])

print(t12)

tf.Tensor(
[[2 7 6]
 [9 5 1]
 [4 3 8]], shape=(3, 3), dtype=int32)
Similarly, use tf.tensor_scatter_nd_sub to subtract values from a tensor with pre-existing values.
t13 = tf.tensor_scatter_nd_sub(t11,
                               indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2], [2, 1], [2, 2]],
                               updates=[1, 7, 9, -1, 1, 3, 7])

print(t13)

tf.Tensor(
[[1 0 0]
 [0 1 0]
 [0 0 1]], shape=(3, 3), dtype=int32)
Use tf.tensor_scatter_nd_min to copy element-wise minimum values from one tensor to another.
t14 = tf.constant([[-2, -7, 0],
                   [-9, 0, 1],
                   [0, -3, -8]])

t15 = tf.tensor_scatter_nd_min(t14,
                               indices=[[0, 2], [1, 1], [2, 0]],
                               updates=[-6, -5, -4])

print(t15)

tf.Tensor(
[[-2 -7 -6]
 [-9 -5  1]
 [-4 -3 -8]], shape=(3, 3), dtype=int32)
Similarly, use tf.tensor_scatter_nd_max to copy element-wise maximum values from one tensor to another.
t16 = tf.tensor_scatter_nd_max(t14,
                               indices=[[0, 2], [1, 1], [2, 0]],
                               updates=[6, 5, 4])

print(t16)

tf.Tensor(
[[-2 -7  6]
 [-9  5  1]
 [ 4 -3 -8]], shape=(3, 3), dtype=int32)
In this guide, you learned how to use the tensor slicing ops available with TensorFlow to exert finer control over the elements in your tensors.
Check out the slicing ops available with TensorFlow NumPy such
as tf.experimental.numpy.take_along_axis and tf.experimental.numpy.take.
Also check out the Tensor guide and the Variable guide.