
Introduction to Tensors

import tensorflow as tf
import numpy as np

Tensors are multi-dimensional arrays with a
uniform type (called a dtype). You can see all supported dtypes at tf.dtypes.

If you're familiar with NumPy, tensors are (kind of) like np.arrays.

All tensors are immutable like Python numbers and strings: you can never update the contents of a tensor, only create a new one.
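
As a quick illustration (a minimal sketch, not part of the original text; tf.Variable is covered in the Variables section later in this document), item assignment on a tensor fails, while a variable can be updated in place:

# Tensors do not support item assignment; variables do (via assign).
t = tf.constant([1, 2, 3])
try:
  t[0] = 7                       # immutable: raises TypeError
except TypeError as e:
  print("Tensors are immutable:", e)

v = tf.Variable([1, 2, 3])
v[0].assign(7)                   # in-place update of a variable
print(v.numpy())                 # [7 2 3]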

Basics

First, create some basic tensors.

Here is a "scalar" or "rank-0" tensor. A scalar contains a single value, and no "axes".

# This will be an int32 tensor by default; see "dtypes" below.

rank_0_tensor = tf.constant(4)

print(rank_0_tensor)

tf.Tensor(4, shape=(), dtype=int32)

A "vector" or "rank-1" tensor is like a list of values. A vector has one axis:

# Let's make this a float tensor.

rank_1_tensor = tf.constant([2.0, 3.0, 4.0])

print(rank_1_tensor)

tf.Tensor([2. 3. 4.], shape=(3,), dtype=float32)


A "matrix" or "rank-2" tensor has two axes:

# If you want to be specific, you can set the dtype (see below) at creation time

rank_2_tensor = tf.constant([[1, 2],
                             [3, 4],
                             [5, 6]], dtype=tf.float16)

print(rank_2_tensor)

tf.Tensor(

[[1. 2.]

[3. 4.]

[5. 6.]], shape=(3, 2), dtype=float16)

A scalar, shape: []. A vector, shape: [3]. A matrix, shape: [3, 2].

Tensors may have more axes; here is a tensor with three axes:

# There can be an arbitrary number of

# axes (sometimes called "dimensions")


rank_3_tensor = tf.constant([
    [[0, 1, 2, 3, 4],
     [5, 6, 7, 8, 9]],
    [[10, 11, 12, 13, 14],
     [15, 16, 17, 18, 19]],
    [[20, 21, 22, 23, 24],
     [25, 26, 27, 28, 29]],])

print(rank_3_tensor)

tf.Tensor(

[[[ 0 1 2 3 4]

[ 5 6 7 8 9]]

[[10 11 12 13 14]

[15 16 17 18 19]]

[[20 21 22 23 24]

[25 26 27 28 29]]], shape=(3, 2, 5), dtype=int32)

There are many ways you might visualize a tensor with more than two axes.

A 3-axis tensor, shape: [3, 2, 5]


You can convert a tensor to a NumPy array either using np.array or the tensor.numpy method:

np.array(rank_2_tensor)

array([[1., 2.],

[3., 4.],

[5., 6.]], dtype=float16)

rank_2_tensor.numpy()

array([[1., 2.],

[3., 4.],

[5., 6.]], dtype=float16)

Tensors often contain floats and ints, but have many other types, including:

- complex numbers
- strings

The base tf.Tensor class requires tensors to be "rectangular"---that is, along each axis, every element is the same size. However, there are
specialized types of tensors that can handle different shapes:

- Ragged tensors (see RaggedTensor below)
- Sparse tensors (see SparseTensor below)

You can do basic math on tensors, including addition, element-wise multiplication, and matrix multiplication.

a = tf.constant([[1, 2],
                 [3, 4]])

b = tf.constant([[1, 1],
                 [1, 1]])  # Could have also said `tf.ones([2,2], dtype=tf.int32)`

print(tf.add(a, b), "\n")

print(tf.multiply(a, b), "\n")

print(tf.matmul(a, b), "\n")

tf.Tensor(

[[2 3]

[4 5]], shape=(2, 2), dtype=int32)

tf.Tensor(

[[1 2]

[3 4]], shape=(2, 2), dtype=int32)


tf.Tensor(

[[3 3]

[7 7]], shape=(2, 2), dtype=int32)

print(a + b, "\n") # element-wise addition

print(a * b, "\n") # element-wise multiplication

print(a @ b, "\n") # matrix multiplication

tf.Tensor(

[[2 3]

[4 5]], shape=(2, 2), dtype=int32)

tf.Tensor(

[[1 2]

[3 4]], shape=(2, 2), dtype=int32)

tf.Tensor(

[[3 3]

[7 7]], shape=(2, 2), dtype=int32)

Tensors are used in all kinds of operations (or "Ops").

c = tf.constant([[4.0, 5.0], [10.0, 1.0]])


# Find the largest value

print(tf.reduce_max(c))

# Find the index of the largest value

print(tf.math.argmax(c))

# Compute the softmax

print(tf.nn.softmax(c))

tf.Tensor(10.0, shape=(), dtype=float32)

tf.Tensor([1 0], shape=(2,), dtype=int64)

tf.Tensor(

[[2.6894143e-01 7.3105854e-01]

[9.9987662e-01 1.2339458e-04]], shape=(2, 2), dtype=float32)

Note: Typically, anywhere a TensorFlow function expects a Tensor as input, the function will also accept anything that can be converted to
a Tensor using tf.convert_to_tensor. See below for an example.

tf.convert_to_tensor([1,2,3])

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([1, 2, 3], dtype=int32)>

tf.reduce_max([1,2,3])

<tf.Tensor: shape=(), dtype=int32, numpy=3>

tf.reduce_max(np.array([1,2,3]))

<tf.Tensor: shape=(), dtype=int64, numpy=3>

About shapes

Tensors have shapes. Some vocabulary:


- Shape: The length (number of elements) of each of the axes of a tensor.
- Rank: Number of tensor axes. A scalar has rank 0, a vector has rank 1, a matrix is rank 2.
- Axis or Dimension: A particular dimension of a tensor.
- Size: The total number of items in the tensor, the product of the shape vector's elements.

Note: Although you may see reference to a "tensor of two dimensions", a rank-2 tensor does not usually describe a 2D space.

Tensors and tf.TensorShape objects have convenient properties for accessing these:

rank_4_tensor = tf.zeros([3, 2, 4, 5])

A rank-4 tensor, shape: [3, 2, 4, 5]


print("Type of every element:", rank_4_tensor.dtype)

print("Number of axes:", rank_4_tensor.ndim)

print("Shape of tensor:", rank_4_tensor.shape)

print("Elements along axis 0 of tensor:", rank_4_tensor.shape[0])

print("Elements along the last axis of tensor:", rank_4_tensor.shape[-1])

print("Total number of elements (3*2*4*5): ", tf.size(rank_4_tensor).numpy())

Type of every element: <dtype: 'float32'>


Number of axes: 4

Shape of tensor: (3, 2, 4, 5)

Elements along axis 0 of tensor: 3

Elements along the last axis of tensor: 5

Total number of elements (3*2*4*5): 120

But note that the Tensor.ndim and Tensor.shape attributes don't return Tensor objects. If you need a Tensor use
the tf.rank or tf.shape function. This difference is subtle, but it can be important when building graphs (later).

tf.rank(rank_4_tensor)

<tf.Tensor: shape=(), dtype=int32, numpy=4>

tf.shape(rank_4_tensor)

<tf.Tensor: shape=(4,), dtype=int32, numpy=array([3, 2, 4, 5], dtype=int32)>

While axes are often referred to by their indices, you should always keep track of the meaning of each. Often axes are ordered from global to
local: The batch axis first, followed by spatial dimensions, and features for each location last. This way feature vectors are contiguous regions
of memory.

Typical axis order
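
For instance (an illustrative sketch with assumed sizes, not from the original text), a batch of RGB images is usually stored with the batch axis first and the channel (feature) axis last:

# Assumed example: 32 images, 28 pixels high, 28 wide, 3 color channels.
# Axis order runs global to local: (batch, height, width, features).
images = tf.zeros([32, 28, 28, 3])
print("Batch size:", images.shape[0])            # 32
print("Feature axis length:", images.shape[-1])  # 3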


Indexing

Single-axis indexing

TensorFlow follows standard Python indexing rules, similar to indexing a list or a string in Python, and the basic rules for NumPy indexing.

- indexes start at 0
- negative indices count backwards from the end
- colons, :, are used for slices: start:stop:step

rank_1_tensor = tf.constant([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])

print(rank_1_tensor.numpy())

[ 0 1 1 2 3 5 8 13 21 34]

Indexing with a scalar removes the axis:

print("First:", rank_1_tensor[0].numpy())

print("Second:", rank_1_tensor[1].numpy())

print("Last:", rank_1_tensor[-1].numpy())

First: 0
Second: 1

Last: 34

Indexing with a : slice keeps the axis:

print("Everything:", rank_1_tensor[:].numpy())

print("Before 4:", rank_1_tensor[:4].numpy())

print("From 4 to the end:", rank_1_tensor[4:].numpy())

print("From 2, before 7:", rank_1_tensor[2:7].numpy())

print("Every other item:", rank_1_tensor[::2].numpy())

print("Reversed:", rank_1_tensor[::-1].numpy())

Everything: [ 0 1 1 2 3 5 8 13 21 34]

Before 4: [0 1 1 2]

From 4 to the end: [ 3 5 8 13 21 34]

From 2, before 7: [1 2 3 5 8]

Every other item: [ 0 1 3 8 21]

Reversed: [34 21 13 8 5 3 2 1 1 0]

Multi-axis indexing

Higher rank tensors are indexed by passing multiple indices.

The exact same rules as in the single-axis case apply to each axis independently.

print(rank_2_tensor.numpy())

[[1. 2.]

[3. 4.]
[5. 6.]]

Passing an integer for each index, the result is a scalar.

# Pull out a single value from a 2-rank tensor

print(rank_2_tensor[1, 1].numpy())

4.0

You can index using any combination of integers and slices:

# Get row and column tensors

print("Second row:", rank_2_tensor[1, :].numpy())

print("Second column:", rank_2_tensor[:, 1].numpy())

print("Last row:", rank_2_tensor[-1, :].numpy())

print("First item in last column:", rank_2_tensor[0, -1].numpy())

print("Skip the first row:")

print(rank_2_tensor[1:, :].numpy(), "\n")

Second row: [3. 4.]

Second column: [2. 4. 6.]

Last row: [5. 6.]

First item in last column: 2.0

Skip the first row:

[[3. 4.]

[5. 6.]]

Here is an example with a 3-axis tensor:


print(rank_3_tensor[:, :, 4])

tf.Tensor(

[[ 4 9]

[14 19]

[24 29]], shape=(3, 2), dtype=int32)

Selecting the last feature across all locations in each example in the batch

Read the tensor slicing guide to learn how you can apply indexing to manipulate individual elements in your tensors.

Manipulating Shapes

Reshaping a tensor is of great utility.

# Shape returns a `TensorShape` object that shows the size along each axis

x = tf.constant([[1], [2], [3]])

print(x.shape)

(3, 1)
# You can convert this object into a Python list, too

print(x.shape.as_list())

[3, 1]

You can reshape a tensor into a new shape. The tf.reshape operation is fast and cheap as the underlying data does not need to be duplicated.

# You can reshape a tensor to a new shape.

# Note that you're passing in a list

reshaped = tf.reshape(x, [1, 3])

print(x.shape)

print(reshaped.shape)

(3, 1)

(1, 3)

The data maintains its layout in memory and a new tensor is created, with the requested shape, pointing to the same data. TensorFlow uses
C-style "row-major" memory ordering, where incrementing the rightmost index corresponds to a single step in memory.

print(rank_3_tensor)

tf.Tensor(

[[[ 0 1 2 3 4]

[ 5 6 7 8 9]]

[[10 11 12 13 14]

[15 16 17 18 19]]
[[20 21 22 23 24]

[25 26 27 28 29]]], shape=(3, 2, 5), dtype=int32)

If you flatten a tensor you can see what order it is laid out in memory.

# A `-1` passed in the `shape` argument says "Whatever fits".

print(tf.reshape(rank_3_tensor, [-1]))

tf.Tensor(

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

24 25 26 27 28 29], shape=(30,), dtype=int32)

Typically the only reasonable use of tf.reshape is to combine or split adjacent axes (or add/remove 1s).

For this 3x2x5 tensor, reshaping to (3x2)x5 or 3x(2x5) are both reasonable things to do, as the slices do not mix:

print(tf.reshape(rank_3_tensor, [3*2, 5]), "\n")

print(tf.reshape(rank_3_tensor, [3, -1]))

tf.Tensor(

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]

[20 21 22 23 24]

[25 26 27 28 29]], shape=(6, 5), dtype=int32)

tf.Tensor(
[[ 0 1 2 3 4 5 6 7 8 9]

[10 11 12 13 14 15 16 17 18 19]

[20 21 22 23 24 25 26 27 28 29]], shape=(3, 10), dtype=int32)

Some good reshapes.

Reshaping will "work" for any new shape with the same total number of elements, but it will not do anything useful if you do not respect the
order of the axes.

Swapping axes in tf.reshape does not work; you need tf.transpose for that.

# Bad examples: don't do this

# You can't reorder axes with reshape.

print(tf.reshape(rank_3_tensor, [2, 3, 5]), "\n")


# This is a mess

print(tf.reshape(rank_3_tensor, [5, 6]), "\n")

# This doesn't work at all

try:
  tf.reshape(rank_3_tensor, [7, -1])
except Exception as e:
  print(f"{type(e).__name__}: {e}")

tf.Tensor(

[[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]]

[[15 16 17 18 19]

[20 21 22 23 24]

[25 26 27 28 29]]], shape=(2, 3, 5), dtype=int32)

tf.Tensor(

[[ 0 1 2 3 4 5]

[ 6 7 8 9 10 11]
[12 13 14 15 16 17]

[18 19 20 21 22 23]

[24 25 26 27 28 29]], shape=(5, 6), dtype=int32)

InvalidArgumentError: {{function_node __wrapped__Reshape_device_/job:localhost/replica:0/task:0/device:GPU:0}} Input to reshape is a tensor with 30 values, but the requested shape requires a multiple of 7 [Op:Reshape]

Some bad reshapes.
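
If what you actually want is to reorder axes, a short sketch of the tf.transpose alternative mentioned above (not part of the original example flow) looks like this:

# Swap the first two axes; unlike reshape, this reorders the data.
transposed = tf.transpose(rank_3_tensor, perm=[1, 0, 2])
print(transposed.shape)   # (2, 3, 5)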

You may run across not-fully-specified shapes. Either the shape contains a None (an axis-length is unknown) or the whole shape is None (the
rank of the tensor is unknown).

Except for tf.RaggedTensor, such shapes will only occur in the context of TensorFlow's symbolic, graph-building APIs:

- tf.function
- The keras functional API.
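
To make the list above concrete, here is a brief sketch (the function name and signature are made up for illustration) of a None axis showing up inside tf.function:

# The first axis is left unspecified, so the traced shape contains None.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 3], dtype=tf.float32)])
def show_shape(t):
  print("Symbolic shape during tracing:", t.shape)   # (None, 3)
  return t

show_shape(tf.zeros([2, 3]))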

More on DTypes

To inspect a tf.Tensor's data type use the Tensor.dtype property.

When creating a tf.Tensor from a Python object you may optionally specify the datatype.

If you don't, TensorFlow chooses a datatype that can represent your data. TensorFlow converts Python integers to tf.int32 and Python
floating point numbers to tf.float32. Otherwise TensorFlow uses the same rules NumPy uses when converting to arrays.

You can cast from type to type.

the_f64_tensor = tf.constant([2.2, 3.3, 4.4], dtype=tf.float64)

the_f16_tensor = tf.cast(the_f64_tensor, dtype=tf.float16)

# Now, cast to a uint8 and lose the decimal precision

the_u8_tensor = tf.cast(the_f16_tensor, dtype=tf.uint8)

print(the_u8_tensor)

tf.Tensor([2 3 4], shape=(3,), dtype=uint8)

Broadcasting

Broadcasting is a concept borrowed from the equivalent feature in NumPy. In short, under certain conditions, smaller tensors are "stretched"
automatically to fit larger tensors when running combined operations on them.

The simplest and most common case is when you attempt to multiply or add a tensor to a scalar. In that case, the scalar is broadcast to be the
same shape as the other argument.

x = tf.constant([1, 2, 3])

y = tf.constant(2)

z = tf.constant([2, 2, 2])

# All of these are the same computation

print(tf.multiply(x, 2))
print(x * y)

print(x * z)

tf.Tensor([2 4 6], shape=(3,), dtype=int32)

tf.Tensor([2 4 6], shape=(3,), dtype=int32)

tf.Tensor([2 4 6], shape=(3,), dtype=int32)

Likewise, axes with length 1 can be stretched out to match the other arguments. Both arguments can be stretched in the same computation.

In this case a 3x1 matrix is element-wise multiplied by a 1x4 matrix to produce a 3x4 matrix. Note how the leading 1 is optional: The shape of
y is [4].

# These are the same computations

x = tf.reshape(x,[3,1])

y = tf.range(1, 5)

print(x, "\n")

print(y, "\n")

print(tf.multiply(x, y))

tf.Tensor(

[[1]

[2]

[3]], shape=(3, 1), dtype=int32)

tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)


tf.Tensor(

[[ 1 2 3 4]

[ 2 4 6 8]

[ 3 6 9 12]], shape=(3, 4), dtype=int32)

A broadcasted add: a [3, 1] times a [1, 4] gives a [3,4]

Here is the same operation without broadcasting:

x_stretch = tf.constant([[1, 1, 1, 1],
                         [2, 2, 2, 2],
                         [3, 3, 3, 3]])

y_stretch = tf.constant([[1, 2, 3, 4],
                         [1, 2, 3, 4],
                         [1, 2, 3, 4]])

print(x_stretch * y_stretch)  # Again, operator overloading

tf.Tensor(

[[ 1 2 3 4]

[ 2 4 6 8]

[ 3 6 9 12]], shape=(3, 4), dtype=int32)

Most of the time, broadcasting is both time and space efficient, as the broadcast operation never materializes the expanded tensors in
memory.

You can see what broadcasting looks like using tf.broadcast_to.

print(tf.broadcast_to(tf.constant([1, 2, 3]), [3, 3]))

tf.Tensor(

[[1 2 3]

[1 2 3]

[1 2 3]], shape=(3, 3), dtype=int32)

Unlike a mathematical op, for example, broadcast_to does nothing special to save memory. Here, you are materializing the tensor.

It can get even more complicated. This section of Jake VanderPlas's book Python Data Science Handbook shows more broadcasting tricks
(again in NumPy).

tf.convert_to_tensor

Most ops, like tf.matmul and tf.reshape take arguments of class tf.Tensor. However, you'll notice in the above case, Python objects shaped
like tensors are accepted.

Most, but not all, ops call convert_to_tensor on non-tensor arguments. There is a registry of conversions, and most object classes like
NumPy's ndarray, TensorShape, Python lists, and tf.Variable will all convert automatically.
See tf.register_tensor_conversion_function for more details, and if you have your own type you'd like to automatically convert to a tensor.
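
As a hedged illustration of that registry (the MyNumber class and converter below are invented for the example), you can register a conversion for your own type:

# Hypothetical custom type made convertible to a tensor.
class MyNumber:
  def __init__(self, value):
    self.value = value

def _my_number_to_tensor(x, dtype=None, name=None, as_ref=False):
  return tf.constant(x.value, dtype=dtype)

tf.register_tensor_conversion_function(MyNumber, _my_number_to_tensor)

print(tf.add(MyNumber(3.0), 1.0))   # MyNumber is converted before the op runs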

Ragged Tensors

A tensor with variable numbers of elements along some axis is called "ragged". Use tf.ragged.RaggedTensor for ragged data.

For example, this cannot be represented as a regular tensor:

A tf.RaggedTensor, shape: [4, None]

ragged_list = [
    [0, 1, 2, 3],
    [4, 5],
    [6, 7, 8],
    [9]]

try:
  tensor = tf.constant(ragged_list)
except Exception as e:
  print(f"{type(e).__name__}: {e}")

ValueError: Can't convert non-rectangular Python sequence to Tensor.

Instead create a tf.RaggedTensor using tf.ragged.constant:

ragged_tensor = tf.ragged.constant(ragged_list)

print(ragged_tensor)

<tf.RaggedTensor [[0, 1, 2, 3], [4, 5], [6, 7, 8], [9]]>

The shape of a tf.RaggedTensor will contain some axes with unknown lengths:

print(ragged_tensor.shape)

(4, None)
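
A couple of small follow-ups you might try (a sketch, assuming the ragged_tensor defined above): per-row lengths are available directly, and many reductions understand the ragged axis.

print(ragged_tensor.row_lengths())            # [4 2 3 1]
print(tf.reduce_sum(ragged_tensor, axis=1))   # [ 6  9 21  9]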

String tensors

tf.string is a dtype, which is to say you can represent data as strings (variable-length byte arrays) in tensors.

The strings are atomic and cannot be indexed the way Python strings are. The length of the string is not one of the axes of the tensor.
See tf.strings for functions to manipulate them.

Here is a scalar string tensor:

# Tensors can be strings, too; here is a scalar string.

scalar_string_tensor = tf.constant("Gray wolf")

print(scalar_string_tensor)

tf.Tensor(b'Gray wolf', shape=(), dtype=string)

And a vector of strings:

A vector of strings, shape: [3,]


# If you have three string tensors of different lengths, this is OK.

tensor_of_strings = tf.constant(["Gray wolf",
                                 "Quick brown fox",
                                 "Lazy dog"])

# Note that the shape is (3,). The string length is not included.

print(tensor_of_strings)

tf.Tensor([b'Gray wolf' b'Quick brown fox' b'Lazy dog'], shape=(3,), dtype=string)

In the above printout the b prefix indicates that tf.string dtype is not a unicode string, but a byte-string. See the Unicode Tutorial for more
about working with unicode text in TensorFlow.

If you pass unicode characters they are utf-8 encoded.

tf.constant("🥳👍")

<tf.Tensor: shape=(), dtype=string, numpy=b'\xf0\x9f\xa5\xb3\xf0\x9f\x91\x8d'>

Some basic functions with strings can be found in tf.strings, including tf.strings.split.

# You can use split to split a string into a set of tensors

print(tf.strings.split(scalar_string_tensor, sep=" "))

tf.Tensor([b'Gray' b'wolf'], shape=(2,), dtype=string)

# ...but it turns into a `RaggedTensor` if you split up a tensor of strings,


# as each string might be split into a different number of parts.

print(tf.strings.split(tensor_of_strings))

<tf.RaggedTensor [[b'Gray', b'wolf'], [b'Quick', b'brown', b'fox'], [b'Lazy', b'dog']]>

Three strings split, shape: [3, None]

And tf.strings.to_number:

text = tf.constant("1 10 100")

print(tf.strings.to_number(tf.strings.split(text, " ")))

tf.Tensor([ 1. 10. 100.], shape=(3,), dtype=float32)

Although you can't use tf.cast to turn a string tensor into numbers, you can convert it into bytes, and then into numbers.

byte_strings = tf.strings.bytes_split(tf.constant("Duck"))

byte_ints = tf.io.decode_raw(tf.constant("Duck"), tf.uint8)

print("Byte strings:", byte_strings)

print("Bytes:", byte_ints)

Byte strings: tf.Tensor([b'D' b'u' b'c' b'k'], shape=(4,), dtype=string)


Bytes: tf.Tensor([ 68 117 99 107], shape=(4,), dtype=uint8)

# Or split it up as unicode and then decode it

unicode_bytes = tf.constant("アヒル 🦆")

unicode_char_bytes = tf.strings.unicode_split(unicode_bytes, "UTF-8")

unicode_values = tf.strings.unicode_decode(unicode_bytes, "UTF-8")

print("\nUnicode bytes:", unicode_bytes)

print("\nUnicode chars:", unicode_char_bytes)

print("\nUnicode values:", unicode_values)

Unicode bytes: tf.Tensor(b'\xe3\x82\xa2\xe3\x83\x92\xe3\x83\xab \xf0\x9f\xa6\x86', shape=(), dtype=string)

Unicode chars: tf.Tensor([b'\xe3\x82\xa2' b'\xe3\x83\x92' b'\xe3\x83\xab' b' ' b'\xf0\x9f\xa6\x86'], shape=(5,), dtype=string)

Unicode values: tf.Tensor([ 12450 12498 12523 32 129414], shape=(5,), dtype=int32)

The tf.string dtype is used for all raw bytes data in TensorFlow. The tf.io module contains functions for converting data to and from bytes,
including decoding images and parsing csv.
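
For example (a small sketch, not from the original guide), a tensor can be round-tripped through a tf.string of raw bytes with tf.io.serialize_tensor and tf.io.parse_tensor:

# Serialize a tensor to bytes, then parse it back.
serialized = tf.io.serialize_tensor(tf.constant([1, 2, 3]))
print(serialized.dtype)                                    # tf.string
print(tf.io.parse_tensor(serialized, out_type=tf.int32))   # [1 2 3]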

Sparse tensors

Sometimes, your data is sparse, like a very wide embedding space. TensorFlow supports tf.sparse.SparseTensor and related operations to
store sparse data efficiently.

A tf.SparseTensor, shape: [3, 4]


# Sparse tensors store values by index in a memory-efficient manner

sparse_tensor = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                                       values=[1, 2],
                                       dense_shape=[3, 4])

print(sparse_tensor, "\n")

# You can convert sparse tensors to dense

print(tf.sparse.to_dense(sparse_tensor))

SparseTensor(indices=tf.Tensor(

[[0 0]

[1 2]], shape=(2, 2), dtype=int64), values=tf.Tensor([1 2], shape=(2,), dtype=int32), dense_shape=tf.Tensor([3 4], shape=(2,), dtype=int64))

tf.Tensor(

[[1 0 0 0]

[0 0 2 0]
[0 0 0 0]], shape=(3, 4), dtype=int32)
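
Sparse tensors also have dedicated ops; as a small follow-on sketch (using float values so the matmul kernel applies), you can multiply a sparse matrix by a dense one without densifying it first:

sp = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                            values=[1.0, 2.0],
                            dense_shape=[3, 4])
dense = tf.ones([4, 2])
print(tf.sparse.sparse_dense_matmul(sp, dense))   # shape (3, 2); only stored values contribute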

Introduction to Variables

A TensorFlow variable is the recommended way to represent shared, persistent state your program manipulates. This guide covers how to create, update, and manage instances of tf.Variable in TensorFlow.

Variables are created and tracked via the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it.
Specific ops allow you to read and modify the values of this tensor. Higher level libraries like tf.keras use tf.Variable to store model
parameters.

Setup

This notebook discusses variable placement. If you want to see on what device your variables are placed, uncomment this line.

import tensorflow as tf

# Uncomment to see where your variables get placed (see below)

# tf.debugging.set_log_device_placement(True)

2024-08-15 03:11:13.352524: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting


to register factory for plugin cuFFT when one has already been registered

2024-08-15 03:11:13.373928: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory:


Attempting to register factory for plugin cuDNN when one has already been registered
2024-08-15 03:11:13.380240: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory:
Attempting to register factory for plugin cuBLAS when one has already been registered

Create a variable

To create a variable, provide an initial value. The tf.Variable will have the same dtype as the initialization value.

my_tensor = tf.constant([[1.0, 2.0], [3.0, 4.0]])

my_variable = tf.Variable(my_tensor)

# Variables can be all kinds of types, just like tensors

bool_variable = tf.Variable([False, False, False, True])

complex_variable = tf.Variable([5 + 4j, 6 + 1j])

A variable looks and acts like a tensor, and, in fact, is a data structure backed by a tf.Tensor. Like tensors, they have a dtype and a shape, and
can be exported to NumPy.

print("Shape: ", my_variable.shape)

print("DType: ", my_variable.dtype)

print("As NumPy: ", my_variable.numpy())

Shape: (2, 2)

DType: <dtype: 'float32'>

As NumPy: [[1. 2.]

[3. 4.]]

Most tensor operations work on variables as expected, although variables cannot be reshaped.

print("A variable:", my_variable)


print("\nViewed as a tensor:", tf.convert_to_tensor(my_variable))

print("\nIndex of highest value:", tf.math.argmax(my_variable))

# This creates a new tensor; it does not reshape the variable.

print("\nCopying and reshaping: ", tf.reshape(my_variable, [1,4]))

A variable: <tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=

array([[1., 2.],

[3., 4.]], dtype=float32)>

Viewed as a tensor: tf.Tensor(

[[1. 2.]

[3. 4.]], shape=(2, 2), dtype=float32)

Index of highest value: tf.Tensor([1 1], shape=(2,), dtype=int64)

Copying and reshaping: tf.Tensor([[1. 2. 3. 4.]], shape=(1, 4), dtype=float32)

As noted above, variables are backed by tensors. You can reassign the tensor using tf.Variable.assign. Calling assign does not (usually)
allocate a new tensor; instead, the existing tensor's memory is reused.

a = tf.Variable([2.0, 3.0])

# This will keep the same dtype, float32

a.assign([1, 2])
# Not allowed as it resizes the variable:

try:
  a.assign([1.0, 2.0, 3.0])
except Exception as e:
  print(f"{type(e).__name__}: {e}")

ValueError: Cannot assign value to variable ' Variable:0': Shape mismatch.The variable shape (2,), and the assigned value shape (3,) are
incompatible.

If you use a variable like a tensor in operations, you will usually operate on the backing tensor.

Creating new variables from existing variables duplicates the backing tensors. Two variables will not share the same memory.

a = tf.Variable([2.0, 3.0])

# Create b based on the value of a

b = tf.Variable(a)

a.assign([5, 6])

# a and b are different

print(a.numpy())

print(b.numpy())

# There are other versions of assign

print(a.assign_add([2,3]).numpy()) # [7. 9.]

print(a.assign_sub([7,9]).numpy()) # [0. 0.]


[5. 6.]

[2. 3.]

[7. 9.]

[0. 0.]

Lifecycles, naming, and watching

In Python-based TensorFlow, tf.Variable instances have the same lifecycle as other Python objects. When there are no references to a variable it is automatically deallocated.

Variables can also be named, which can help you track and debug them. You can give two variables the same name.

# Create a and b; they will have the same name but will be backed by

# different tensors.

a = tf.Variable(my_tensor, name="Mark")

# A new variable with the same name, but different value

# Note that the scalar add is broadcast

b = tf.Variable(my_tensor + 1, name="Mark")

# These are elementwise-unequal, despite having the same name

print(a == b)

tf.Tensor(

[[False False]

[False False]], shape=(2, 2), dtype=bool)


Variable names are preserved when saving and loading models. By default, variables in models will acquire unique variable names
automatically, so you don't need to assign them yourself unless you want to.
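
A compact sketch of saving and restoring a variable with tf.train.Checkpoint (the ./ckpt/example prefix is just an assumed path for illustration):

v = tf.Variable(1.0, name="my_value")
ckpt = tf.train.Checkpoint(value=v)
save_path = ckpt.save("./ckpt/example")   # assumed checkpoint prefix

v.assign(99.0)            # change the variable...
ckpt.restore(save_path)   # ...then bring back the saved value
print(v.numpy())          # 1.0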

Although variables are important for differentiation, some variables will not need to be differentiated. You can turn off gradients for a
variable by setting trainable to false at creation. An example of a variable that would not need gradients is a training step counter.

step_counter = tf.Variable(1, trainable=False)

Placing variables and tensors

For better performance, TensorFlow will attempt to place tensors and variables on the fastest device compatible with its dtype. This means
most variables are placed on a GPU if one is available.

However, you can override this. In this snippet, place a float tensor and a variable on the CPU, even if a GPU is available. By turning on device
placement logging (see Setup), you can see where the variable is placed.

Note: Although manual placement works, using distribution strategies can be a more convenient and scalable way to optimize your
computation.

If you run this notebook on different backends with and without a GPU you will see different logging. Note that logging device placement
must be turned on at the start of the session.

with tf.device('CPU:0'):
  # Create some tensors
  a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
  c = tf.matmul(a, b)

  print(c)

tf.Tensor(
[[22. 28.]

[49. 64.]], shape=(2, 2), dtype=float32)

It's possible to set the location of a variable or tensor on one device and do the computation on another device. This will introduce delay, as
data needs to be copied between the devices.

You might do this, however, if you had multiple GPU workers but only want one copy of the variables.

with tf.device('CPU:0'):
  a = tf.Variable([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.Variable([[1.0, 2.0, 3.0]])

with tf.device('GPU:0'):
  # Element-wise multiply
  k = a * b

print(k)

tf.Tensor(

[[ 1. 4. 9.]

[ 4. 10. 18.]], shape=(2, 3), dtype=float32)

Note: Because tf.config.set_soft_device_placement is turned on by default, even if you run this code on a device without a GPU, it will still
run. The multiplication step will happen on the CPU.

For more on distributed training, refer to the guide.

Next steps

To understand how variables are typically used, see our guide on automatic differentiation.

Introduction to gradients and automatic differentiation

Automatic Differentiation and Gradients

Automatic differentiation is useful for implementing machine learning algorithms such as backpropagation for training neural networks.

In this guide, you will explore ways to compute gradients with TensorFlow, especially in eager execution.

Setup

import numpy as np

import matplotlib.pyplot as plt

import tensorflow as tf

2024-08-15 01:30:07.003169: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting


to register factory for plugin cuFFT when one has already been registered

2024-08-15 01:30:07.023862: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory:


Attempting to register factory for plugin cuDNN when one has already been registered

2024-08-15 01:30:07.029954: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory:


Attempting to register factory for plugin cuBLAS when one has already been registered

Computing gradients

To differentiate automatically, TensorFlow needs to remember what operations happen in what order during the forward pass. Then, during
the backward pass, TensorFlow traverses this list of operations in reverse order to compute gradients.

Gradient tapes

TensorFlow provides the tf.GradientTape API for automatic differentiation; that is, computing the gradient of a computation with respect to
some inputs, usually tf.Variables. TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape".
TensorFlow then uses that tape to compute the gradients of a "recorded" computation using reverse mode differentiation.

Here is a simple example:

x = tf.Variable(3.0)

with tf.GradientTape() as tape:
  y = x**2

WARNING: All log messages before absl::InitializeLog() is called are written to STDERR

Once you've recorded some operations, use GradientTape.gradient(target, sources) to calculate the gradient of some target (often a loss)
relative to some source (often the model's variables):

# dy = 2x * dx

dy_dx = tape.gradient(y, x)

dy_dx.numpy()

6.0

The above example uses scalars, but tf.GradientTape works as easily on any tensor:

w = tf.Variable(tf.random.normal((3, 2)), name='w')

b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')

x = [[1., 2., 3.]]

with tf.GradientTape(persistent=True) as tape:
  y = x @ w + b
  loss = tf.reduce_mean(y**2)

To get the gradient of loss with respect to both variables, you can pass both as sources to the gradient method. The tape is flexible about how
sources are passed and will accept any nested combination of lists or dictionaries and return the gradient structured the same way
(see tf.nest).

[dl_dw, dl_db] = tape.gradient(loss, [w, b])

The gradient with respect to each source has the shape of the source:

print(w.shape)

print(dl_dw.shape)

(3, 2)

(3, 2)

Here is the gradient calculation again, this time passing a dictionary of variables:

my_vars = {
    'w': w,
    'b': b
}

grad = tape.gradient(loss, my_vars)

grad['b']

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([7.5046124, 2.0025406], dtype=float32)>

Gradients with respect to a model


It's common to collect tf.Variables into a tf.Module or one of its subclasses (layers.Layer, keras.Model) for checkpointing and exporting.

In most cases, you will want to calculate gradients with respect to a model's trainable variables. Since all subclasses of tf.Module aggregate
their variables in the Module.trainable_variables property, you can calculate these gradients in a few lines of code:

layer = tf.keras.layers.Dense(2, activation='relu')

x = tf.constant([[1., 2., 3.]])

with tf.GradientTape() as tape:
  # Forward pass
  y = layer(x)
  loss = tf.reduce_mean(y**2)

# Calculate gradients with respect to every trainable variable
grad = tape.gradient(loss, layer.trainable_variables)

for var, g in zip(layer.trainable_variables, grad):
  print(f'{var.name}, shape: {g.shape}')

kernel, shape: (3, 2)

bias, shape: (2,)

Controlling what the tape watches

The default behavior is to record all operations after accessing a trainable tf.Variable. The reasons for this are:

- The tape needs to know which operations to record in the forward pass to calculate the gradients in the backwards pass.
- The tape holds references to intermediate outputs, so you don't want to record unnecessary operations.
- The most common use case involves calculating the gradient of a loss with respect to all a model's trainable variables.

For example, the following fails to calculate a gradient because the tf.Tensor is not "watched" by default, and the tf.Variable is not trainable:

# A trainable variable

x0 = tf.Variable(3.0, name='x0')

# Not trainable

x1 = tf.Variable(3.0, name='x1', trainable=False)

# Not a Variable: A variable + tensor returns a tensor.

x2 = tf.Variable(2.0, name='x2') + 1.0

# Not a variable

x3 = tf.constant(3.0, name='x3')

with tf.GradientTape() as tape:
  y = (x0**2) + (x1**2) + (x2**2)

grad = tape.gradient(y, [x0, x1, x2, x3])

for g in grad:
  print(g)

tf.Tensor(6.0, shape=(), dtype=float32)

None

None
None

You can list the variables being watched by the tape using the GradientTape.watched_variables method:

[var.name for var in tape.watched_variables()]

['x0:0']

tf.GradientTape provides hooks that give the user control over what is or is not watched.

To record gradients with respect to a tf.Tensor, you need to call GradientTape.watch(x):

x = tf.constant(3.0)

with tf.GradientTape() as tape:
  tape.watch(x)
  y = x**2

# dy = 2x * dx

dy_dx = tape.gradient(y, x)

print(dy_dx.numpy())

6.0

Conversely, to disable the default behavior of watching all tf.Variables, set watch_accessed_variables=False when creating the gradient tape.
This calculation uses two variables, but only connects the gradient for one of the variables:

x0 = tf.Variable(0.0)

x1 = tf.Variable(10.0)

with tf.GradientTape(watch_accessed_variables=False) as tape:
  tape.watch(x1)
  y0 = tf.math.sin(x0)
  y1 = tf.nn.softplus(x1)
  y = y0 + y1
  ys = tf.reduce_sum(y)

Since GradientTape.watch was not called on x0, no gradient is computed with respect to it:

# dys/dx1 = exp(x1) / (1 + exp(x1)) = sigmoid(x1)

grad = tape.gradient(ys, {'x0': x0, 'x1': x1})

print('dy/dx0:', grad['x0'])

print('dy/dx1:', grad['x1'].numpy())

dy/dx0: None

dy/dx1: 0.9999546

Intermediate results

You can also request gradients of the output with respect to intermediate values computed inside the tf.GradientTape context.

x = tf.constant(3.0)

with tf.GradientTape() as tape:
  tape.watch(x)
  y = x * x
  z = y * y

# Use the tape to compute the gradient of z with respect to the

# intermediate value y.

# dz_dy = 2 * y and y = x ** 2 = 9

print(tape.gradient(z, y).numpy())

18.0

By default, the resources held by a GradientTape are released as soon as the GradientTape.gradient method is called. To compute multiple
gradients over the same computation, create a gradient tape with persistent=True. This allows multiple calls to the gradient method as
resources are released when the tape object is garbage collected. For example:

x = tf.constant([1, 3.0])

with tf.GradientTape(persistent=True) as tape:
  tape.watch(x)
  y = x * x
  z = y * y

print(tape.gradient(z, x).numpy()) # [4.0, 108.0] (4 * x**3 at x = [1.0, 3.0])

print(tape.gradient(y, x).numpy()) # [2.0, 6.0] (2 * x at x = [1.0, 3.0])

[ 4. 108.]

[2. 6.]

del tape # Drop the reference to the tape

Notes on performance

- There is a tiny overhead associated with doing operations inside a gradient tape context. For most eager execution this will not be a noticeable cost, but you should still only use tape context around the areas where it is required.

- Gradient tapes use memory to store intermediate results, including inputs and outputs, for use during the backwards pass.

For efficiency, some ops (like ReLU) don't need to keep their intermediate results and they are pruned during the forward pass. However, if
you use persistent=True on your tape, nothing is discarded and your peak memory usage will be higher.

Gradients of non-scalar targets

A gradient is fundamentally an operation on a scalar.

x = tf.Variable(2.0)

with tf.GradientTape(persistent=True) as tape:
  y0 = x**2
  y1 = 1 / x

print(tape.gradient(y0, x).numpy())

print(tape.gradient(y1, x).numpy())

4.0

-0.25

Thus, if you ask for the gradient of multiple targets, the result for each source is:

- The gradient of the sum of the targets, or equivalently
- The sum of the gradients of each target.

x = tf.Variable(2.0)

with tf.GradientTape() as tape:
  y0 = x**2
  y1 = 1 / x

print(tape.gradient({'y0': y0, 'y1': y1}, x).numpy())

3.75

Similarly, if the target(s) are not scalar the gradient of the sum is calculated:

x = tf.Variable(2.)

with tf.GradientTape() as tape:
  y = x * [3., 4.]

print(tape.gradient(y, x).numpy())

7.0

This makes it simple to take the gradient of the sum of a collection of losses, or the gradient of the sum of an element-wise loss calculation.

If you need a separate gradient for each item, refer to Jacobians.
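
As a quick hedged sketch of that Jacobian path, GradientTape.jacobian returns the per-element derivatives instead of their sum:

x = tf.Variable([1.0, 2.0])

with tf.GradientTape() as tape:
  y = x * x

# d(y_i)/d(x_j): a (2, 2) matrix, diagonal here because y is element-wise.
print(tape.jacobian(y, x).numpy())   # [[2. 0.], [0. 4.]]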

In some cases you can skip the Jacobian. For an element-wise calculation, the gradient of the sum gives the derivative of each element with
respect to its input-element, since each element is independent:

x = tf.linspace(-10.0, 10.0, 200+1)

with tf.GradientTape() as tape:
  tape.watch(x)
  y = tf.nn.sigmoid(x)

dy_dx = tape.gradient(y, x)

plt.plot(x, y, label='y')

plt.plot(x, dy_dx, label='dy/dx')

plt.legend()

_ = plt.xlabel('x')

Control flow

Because a gradient tape records operations as they are executed, Python control flow is naturally handled (for
example, if and while statements).

Here a different variable is used on each branch of an if. The gradient only connects to the variable that was used:

x = tf.constant(1.0)
v0 = tf.Variable(2.0)

v1 = tf.Variable(2.0)

with tf.GradientTape(persistent=True) as tape:
  tape.watch(x)
  if x > 0.0:
    result = v0
  else:
    result = v1**2

dv0, dv1 = tape.gradient(result, [v0, v1])

print(dv0)

print(dv1)

tf.Tensor(1.0, shape=(), dtype=float32)

None

Just remember that the control statements themselves are not differentiable, so they are invisible to gradient-based optimizers.

Depending on the value of x in the above example, the tape either records result = v0 or result = v1**2. The gradient with respect to x is
always None.

dx = tape.gradient(result, x)
print(dx)

None

Cases where gradient returns None

When a target is not connected to a source, gradient will return None.

x = tf.Variable(2.)

y = tf.Variable(3.)

with tf.GradientTape() as tape:
  z = y * y

print(tape.gradient(z, x))

None

Here z is obviously not connected to x, but there are several less-obvious ways that a gradient can be disconnected.

1. Replaced a variable with a tensor

In the section on "controlling what the tape watches" you saw that the tape will automatically watch a tf.Variable but not a tf.Tensor.

One common error is to inadvertently replace a tf.Variable with a tf.Tensor, instead of using Variable.assign to update the tf.Variable. Here is
an example:

x = tf.Variable(2.0)

for epoch in range(2):
  with tf.GradientTape() as tape:
    y = x + 1

  print(type(x).__name__, ":", tape.gradient(y, x))
  x = x + 1  # This should be `x.assign_add(1)`

ResourceVariable : tf.Tensor(1.0, shape=(), dtype=float32)

EagerTensor : None

2. Did calculations outside of TensorFlow

The tape can't record the gradient path if the calculation exits TensorFlow. For example:

x = tf.Variable([[1.0, 2.0],
                 [3.0, 4.0]], dtype=tf.float32)

with tf.GradientTape() as tape:
  x2 = x**2

  # This step is calculated with NumPy
  y = np.mean(x2, axis=0)

  # Like most ops, reduce_mean will cast the NumPy array to a constant tensor
  # using `tf.convert_to_tensor`.
  y = tf.reduce_mean(y, axis=0)

print(tape.gradient(y, x))
None

3. Took gradients through an integer or string

Integers and strings are not differentiable. If a calculation path uses these data types there will be no gradient.

Nobody expects strings to be differentiable, but it's easy to accidentally create an int constant or variable if you don't specify the dtype.

x = tf.constant(10)

with tf.GradientTape() as g:
  g.watch(x)
  y = x * x

print(g.gradient(y, x))

WARNING:tensorflow:The dtype of the watched tensor must be floating (e.g. tf.float32), got tf.int32

None

TensorFlow doesn't automatically cast between types, so, in practice, you'll often get a type error instead of a missing gradient.
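
For instance (a small sketch; the exact error type and message vary by version), mixing float32 and float64 raises an error rather than silently dropping the gradient:

x32 = tf.constant(1.0)                     # float32 by default
x64 = tf.constant(1.0, dtype=tf.float64)

try:
  y = x32 * x64                            # no implicit cast between float types
except Exception as e:
  print(f"{type(e).__name__}: {e}")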

4. Took gradients through a stateful object

State stops gradients. When you read from a stateful object, the tape can only observe the current state, not the history that led to it.

A tf.Tensor is immutable. You can't change a tensor once it's created. It has a value, but no state. All the operations discussed so far are also
stateless: the output of a tf.matmul only depends on its inputs.

A tf.Variable has internal state—its value. When you use the variable, the state is read. It's normal to calculate a gradient with respect to a
variable, but the variable's state blocks gradient calculations from going farther back. For example:

x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

with tf.GradientTape() as tape:
  # Update x1 = x1 + x0.
  x1.assign_add(x0)

  # The tape starts recording from x1.
  y = x1**2   # y = (x1 + x0)**2

# This doesn't work.
print(tape.gradient(y, x0))   # dy/dx0 = 2*(x1 + x0)

None

Similarly, tf.data.Dataset iterators and tf.queues are stateful, and will stop all gradients on tensors that pass through them.

5. No gradient registered

Some tf.Operations are registered as being non-differentiable and will return None. Others have no gradient registered.

The tf.raw_ops page shows which low-level ops have gradients registered.

If you attempt to take a gradient through a float op that has no gradient registered the tape will throw an error instead of silently
returning None. This way you know something has gone wrong.

For example, the tf.image.adjust_contrast function wraps raw_ops.AdjustContrastv2, which could have a gradient but the gradient is not
implemented:

image = tf.Variable([[[0.5, 0.0, 0.0]]])

delta = tf.Variable(0.1)

with tf.GradientTape() as tape:
  new_image = tf.image.adjust_contrast(image, delta)

try:
  print(tape.gradient(new_image, [image, delta]))
  assert False   # This should not happen.
except LookupError as e:
  print(f'{type(e).__name__}: {e}')

LookupError: gradient registry has no entry for: AdjustContrastv2

If you need to differentiate through this op, you'll either need to implement the gradient and register it (using tf.RegisterGradient) or re-
implement the function using other ops.
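
One way to take the "re-implement using other ops" route is to wrap the computation with tf.custom_gradient and supply your own backward function. This is a hedged sketch of that alternative (the pass-through gradient here is a modeling choice, not the true gradient of clipping), not the tf.RegisterGradient registration path:

@tf.custom_gradient
def clip_but_pass_gradient(x):
  # Forward: clip values; backward: pretend the op was the identity.
  y = tf.clip_by_value(x, 0.0, 1.0)
  def grad(upstream):
    return upstream
  return y, grad

x = tf.Variable([-1.0, 0.5, 2.0])
with tf.GradientTape() as tape:
  y = tf.reduce_sum(clip_but_pass_gradient(x))
print(tape.gradient(y, x).numpy())   # [1. 1. 1.], even where clipping was active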

Zeros instead of None

In some cases it would be convenient to get 0 instead of None for unconnected gradients. You can decide what to return when you have
unconnected gradients using the unconnected_gradients argument:

x = tf.Variable([2., 2.])

y = tf.Variable(3.)

with tf.GradientTape() as tape:
  z = y**2

print(tape.gradient(z, x, unconnected_gradients=tf.UnconnectedGradients.ZERO))

tf.Tensor([0. 0.], shape=(2,), dtype=float32)

Introduction to tensor slicing

When working on ML applications such as object detection and NLP, it is sometimes necessary to work with sub-sections (slices) of tensors. For example, your model architecture might include routing, where one layer controls which training example gets routed to the next layer. In this case, you could use tensor slicing ops to split the tensors up and put them back together in the right order.

In NLP applications, you can use tensor slicing to perform word masking while training. For example, you can generate training data from a list of
sentences by choosing a word index to mask in each sentence, taking the word out as a label, and then replacing the chosen word with a mask
token.
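
A compact sketch of that masking idea (the sentence, mask index, and mask token id below are all made up for illustration):

sentence = tf.constant([7, 42, 13, 99, 5])   # hypothetical word ids
mask_index = 2
mask_token = 0                               # assumed id of the mask token

label = sentence[mask_index]                 # the word taken out as the label
masked = tf.tensor_scatter_nd_update(sentence, [[mask_index]], [mask_token])

print(label.numpy())    # 13
print(masked.numpy())   # [ 7 42  0 99  5]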

In this guide, you will learn how to use the TensorFlow APIs to:

- Extract slices from a tensor
- Insert data at specific indices in a tensor

This guide assumes familiarity with tensor indexing. Read the indexing sections of the Tensor and TensorFlow NumPy guides before getting
started with this guide.

Setup

import tensorflow as tf

import numpy as np

2024-08-15 02:59:22.387959: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to


register factory for plugin cuFFT when one has already been registered
2024-08-15 02:59:22.409014: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting
to register factory for plugin cuDNN when one has already been registered

2024-08-15 02:59:22.415391: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting


to register factory for plugin cuBLAS when one has already been registered

Extract tensor slices

Perform NumPy-like tensor slicing using tf.slice.

t1 = tf.constant([0, 1, 2, 3, 4, 5, 6, 7])

print(tf.slice(t1,

begin=[1],

size=[3]))

tf.Tensor([1 2 3], shape=(3,), dtype=int32)


Alternatively, you can use a more Pythonic syntax. Note that tensor slices are evenly spaced over a start-stop range.

print(t1[1:4])

tf.Tensor([1 2 3], shape=(3,), dtype=int32)

print(t1[-3:])
tf.Tensor([5 6 7], shape=(3,), dtype=int32)

For 2-dimensional tensors, you can use something like:

t2 = tf.constant([[0, 1, 2, 3, 4],

[5, 6, 7, 8, 9],

[10, 11, 12, 13, 14],

[15, 16, 17, 18, 19]])

print(t2[:-1, 1:3])

tf.Tensor(

[[ 1 2]

[ 6 7]

[11 12]], shape=(3, 2), dtype=int32)


You can use tf.slice on higher dimensional tensors as well.

t3 = tf.constant([[[1, 3, 5, 7],

[9, 11, 13, 15]],

[[17, 19, 21, 23],


[25, 27, 29, 31]]

])

print(tf.slice(t3,

begin=[1, 1, 0],

size=[1, 1, 2]))

tf.Tensor([[[25 27]]], shape=(1, 1, 2), dtype=int32)

You can also use tf.strided_slice to extract slices of tensors by 'striding' over the tensor dimensions.
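For example, here is a small sketch reusing t1 from above that takes every other element between indices 1 and 7:

print(tf.strided_slice(t1, begin=[1], end=[7], strides=[2]))
# Expected: tf.Tensor([1 3 5], shape=(3,), dtype=int32)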

Use tf.gather to extract specific indices from a single axis of a tensor.

print(tf.gather(t1,

indices=[0, 3, 6]))

# This is similar to doing

t1[::3]

tf.Tensor([0 3 6], shape=(3,), dtype=int32)

<tf.Tensor: shape=(3,), dtype=int32, numpy=array([0, 3, 6], dtype=int32)>


tf.gather does not require indices to be evenly spaced.

alphabet = tf.constant(list('abcdefghijklmnopqrstuvwxyz'))

print(tf.gather(alphabet,

indices=[2, 0, 19, 18]))

tf.Tensor([b'c' b'a' b't' b's'], shape=(4,), dtype=string)
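tf.gather also accepts an axis argument. As a further small sketch, you can gather whole columns of the rank-2 tensor t2 defined above:

print(tf.gather(t2, indices=[0, 3], axis=1))
# Expected:
# tf.Tensor(
# [[ 0  3]
#  [ 5  8]
#  [10 13]
#  [15 18]], shape=(4, 2), dtype=int32)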

To extract slices from multiple axes of a tensor, use tf.gather_nd. This is useful when you want to gather the elements of a matrix as opposed to
just its rows or columns.

t4 = tf.constant([[0, 5],

[1, 6],
[2, 7],

[3, 8],

[4, 9]])

print(tf.gather_nd(t4,

indices=[[2], [3], [0]]))

tf.Tensor(

[[2 7]

[3 8]

[0 5]], shape=(3, 2), dtype=int32)


t5 = np.reshape(np.arange(18), [2, 3, 3])

print(tf.gather_nd(t5,

indices=[[0, 0, 0], [1, 2, 1]]))

tf.Tensor([ 0 16], shape=(2,), dtype=int64)

# Return a list of two matrices


print(tf.gather_nd(t5,

indices=[[[0, 0], [0, 2]], [[1, 0], [1, 2]]]))

tf.Tensor(

[[[ 0 1 2]

[ 6 7 8]]

[[ 9 10 11]

[15 16 17]]], shape=(2, 2, 3), dtype=int64)

# Return one matrix

print(tf.gather_nd(t5,

indices=[[0, 0], [0, 2], [1, 0], [1, 2]]))

tf.Tensor(

[[ 0 1 2]

[ 6 7 8]

[ 9 10 11]

[15 16 17]], shape=(4, 3), dtype=int64)

Insert data into tensors

Use tf.scatter_nd to insert data at specific slices/indices of a tensor. Note that the tensor into which you insert values is zero-initialized.

t6 = tf.constant([10])
indices = tf.constant([[1], [3], [5], [7], [9]])

data = tf.constant([2, 4, 6, 8, 10])

print(tf.scatter_nd(indices=indices,

updates=data,

shape=t6))

tf.Tensor([ 0 2 0 4 0 6 0 8 0 10], shape=(10,), dtype=int32)

Methods like tf.scatter_nd which require zero-initialized tensors are similar to sparse tensor initializers. You can
use tf.gather_nd and tf.scatter_nd to mimic the behavior of sparse tensor ops.

Consider an example where you construct a sparse tensor using these two methods in conjunction.

# Gather values from one tensor by specifying indices

new_indices = tf.constant([[0, 2], [2, 1], [3, 3]])

t7 = tf.gather_nd(t2, indices=new_indices)
# Add these values into a new tensor

t8 = tf.scatter_nd(indices=new_indices, updates=t7, shape=tf.constant([4, 5]))

print(t8)

tf.Tensor(

[[ 0 0 2 0 0]

[ 0 0 0 0 0]

[ 0 11 0 0 0]

[ 0 0 0 18 0]], shape=(4, 5), dtype=int32)

This is similar to:

t9 = tf.SparseTensor(indices=[[0, 2], [2, 1], [3, 3]],


values=[2, 11, 18],

dense_shape=[4, 5])

print(t9)

SparseTensor(indices=tf.Tensor(

[[0 2]

[2 1]

[3 3]], shape=(3, 2), dtype=int64), values=tf.Tensor([ 2 11 18], shape=(3,), dtype=int32), dense_shape=tf.Tensor([4 5], shape=(2,), dtype=int64))

# Convert the sparse tensor into a dense tensor

t10 = tf.sparse.to_dense(t9)

print(t10)

tf.Tensor(

[[ 0 0 2 0 0]

[ 0 0 0 0 0]

[ 0 11 0 0 0]

[ 0 0 0 18 0]], shape=(4, 5), dtype=int32)

To insert data into a tensor with pre-existing values, use tf.tensor_scatter_nd_add.

t11 = tf.constant([[2, 7, 0],

[9, 0, 1],
[0, 3, 8]])

# Convert the tensor into a magic square by inserting numbers at appropriate indices

t12 = tf.tensor_scatter_nd_add(t11,

indices=[[0, 2], [1, 1], [2, 0]],

updates=[6, 5, 4])

print(t12)

tf.Tensor(

[[2 7 6]

[9 5 1]

[4 3 8]], shape=(3, 3), dtype=int32)

Similarly, use tf.tensor_scatter_nd_sub to subtract values from a tensor with pre-existing values.

# Convert the tensor into an identity matrix

t13 = tf.tensor_scatter_nd_sub(t11,

indices=[[0, 0], [0, 1], [1, 0], [1, 1], [1, 2], [2, 1], [2, 2]],

updates=[1, 7, 9, -1, 1, 3, 7])

print(t13)
tf.Tensor(

[[1 0 0]

[0 1 0]

[0 0 1]], shape=(3, 3), dtype=int32)

Use tf.tensor_scatter_nd_min to copy element-wise minimum values from one tensor to another.

t14 = tf.constant([[-2, -7, 0],

[-9, 0, 1],

[0, -3, -8]])

t15 = tf.tensor_scatter_nd_min(t14,

indices=[[0, 2], [1, 1], [2, 0]],

updates=[-6, -5, -4])

print(t15)

tf.Tensor(

[[-2 -7 -6]

[-9 -5 1]

[-4 -3 -8]], shape=(3, 3), dtype=int32)

Similarly, use tf.tensor_scatter_nd_max to copy element-wise maximum values from one tensor to another.

t16 = tf.tensor_scatter_nd_max(t14,

indices=[[0, 2], [1, 1], [2, 0]],


updates=[6, 5, 4])

print(t16)

tf.Tensor(

[[-2 -7 6]

[-9 5 1]

[ 4 -3 -8]], shape=(3, 3), dtype=int32)

Further reading and resources

In this guide, you learned how to use the tensor slicing ops available with TensorFlow to exert finer control over the elements in your tensors.

 Check out the slicing ops available with TensorFlow NumPy, such as tf.experimental.numpy.take_along_axis and tf.experimental.numpy.take (a brief sketch follows this list).

 Also check out the Tensor guide and the Variable guide.
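As a quick taste of that NumPy-style API, here is a minimal sketch using tf.experimental.numpy.take, which gathers values by index much like tf.gather; see the TensorFlow NumPy guide for the full details:

import tensorflow.experimental.numpy as tnp

print(tnp.take(t1, [0, 3, 6]))   # comparable to tf.gather(t1, indices=[0, 3, 6])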
