
Ragged tensors
API Documentation: tf.RaggedTensor, tf.ragged
Setup

!pip install --pre -U tensorflow
import math
import tensorflow as tf
Overview

Your data comes in many shapes; your tensors should too. Ragged tensors are the TensorFlow equivalent of nested variable-length lists. They make it easy to store and process data with non-uniform shapes, including:

- Variable-length features, such as the set of actors in a movie.
- Batches of variable-length sequential inputs, such as sentences or video clips.
- Hierarchical inputs, such as text documents that are subdivided into sections, paragraphs, sentences, and words.
- Individual fields in structured inputs, such as protocol buffers.

What you can do with a ragged tensor

Ragged tensors are supported by more than a hundred TensorFlow operations, including math operations (such as tf.add and tf.reduce_mean), array operations (such as tf.concat and tf.tile), string manipulation ops (such as tf.strings.substr), control flow operations (such as tf.while_loop and tf.map_fn), and many others:
digits = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
words = tf.ragged.constant([["So", "long"], ["thanks", "for", "all", "the", "fish"]])
print(tf.add(digits, 3))
print(tf.reduce_mean(digits, axis=1))
print(tf.concat([digits, [[5, 3]]], axis=0))
print(tf.tile(digits, [1, 2]))
print(tf.strings.substr(words, 0, 2))
print(tf.map_fn(tf.math.square, digits))

There are also a number of methods and operations that are specific to ragged tensors, including factory methods, conversion methods, and value-mapping operations. For a list of supported ops, see the tf.ragged package documentation.

Ragged tensors are supported by many TensorFlow APIs, including Keras, Datasets, tf.function, SavedModels, and tf.Example. For more information, check the section on TensorFlow APIs below.

As with normal tensors, you can use Python-style indexing to access specific slices of a ragged tensor. For more information, refer to the section on Indexing below.

print(digits[0])       # First row

tf.Tensor([3 1 4 1], shape=(4,), dtype=int32)

print(digits[:, :2])   # First two values in each row.

<tf.RaggedTensor [[3, 1], [], [5, 9], [6], []]>

print(digits[:, -2:])  # Last two values in each row.

<tf.RaggedTensor [[4, 1], [], [9, 2], [6], []]>

And just like normal tensors, you can use Python arithmetic and comparison operators to perform elementwise operations. For more information, check the section on Overloaded operators below.

print(digits + 3)

<tf.RaggedTensor [[6, 4, 7, 4], [], [8, 12, 5], [9], []]>

print(digits + tf.ragged.constant([[1, 2, 3, 4], [], [5, 6, 7], [8], []]))

<tf.RaggedTensor [[4, 3, 7, 5], [], [10, 15, 9], [14], []]>
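Comparison operators are overloaded as well. As a minimal sketch (reusing the digits tensor from above), comparing against a scalar broadcasts elementwise and returns a ragged tensor of booleans:

# Elementwise comparison against a scalar; shape is preserved.
print(digits < 5)

<tf.RaggedTensor [[True, True, True, True], [], [False, False, True], [False], []]>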

If you need to perform an elementwise transformation to the values of a RaggedTensor, you can use tf.ragged.map_flat_values, which takes a function plus one or more arguments, and applies the function to transform the RaggedTensor's values.

times_two_plus_one = lambda x: x * 2 + 1
print(tf.ragged.map_flat_values(times_two_plus_one, digits))

<tf.RaggedTensor [[7, 3, 9, 3], [], [11, 19, 5], [13], []]>

Ragged tensors can be converted to nested Python lists and NumPy arrays:

digits.to_list()

[[3, 1, 4, 1], [], [5, 9, 2], [6], []]

digits.numpy()

array([array([3, 1, 4, 1], dtype=int32), array([], dtype=int32),
       array([5, 9, 2], dtype=int32), array([6], dtype=int32),
       array([], dtype=int32)], dtype=object)
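Conversion to and from dense tensors is also available. As a minimal sketch (reusing the digits tensor from above), tf.RaggedTensor.to_tensor pads each row with a default value to produce a regular tf.Tensor, and tf.RaggedTensor.from_tensor strips a padding value to recover the ragged tensor:

# Pad each row with -1 to get a regular dense tensor.
print(digits.to_tensor(default_value=-1))

tf.Tensor(
[[ 3  1  4  1]
 [-1 -1 -1 -1]
 [ 5  9  2 -1]
 [ 6 -1 -1 -1]
 [-1 -1 -1 -1]], shape=(5, 4), dtype=int32)

# Convert back, treating trailing -1 values as padding to be stripped.
print(tf.RaggedTensor.from_tensor(digits.to_tensor(default_value=-1), padding=-1))

<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9, 2], [6], []]>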

Constructing a ragged tensor

The simplest way to construct a ragged tensor is using tf.ragged.constant, which builds the RaggedTensor corresponding to a given nested Python list or NumPy array:

sentences = tf.ragged.constant([
    ["Let's", "build", "some", "ragged", "tensors", "!"],
    ["We", "can", "use", "tf.ragged.constant", "."]])
print(sentences)

<tf.RaggedTensor [[b"Let's", b'build', b'some', b'ragged', b'tensors', b'!'],
 [b'We', b'can', b'use', b'tf.ragged.constant', b'.']]>

paragraphs = tf.ragged.constant([
    [['I', 'have', 'a', 'cat'], ['His', 'name', 'is', 'Mat']],
    [['Do', 'you', 'want', 'to', 'come', 'visit'], ["I'm", 'free', 'tomorrow']],
])
print(paragraphs)

<tf.RaggedTensor [[[b'I', b'have', b'a', b'cat'], [b'His', b'name', b'is', b'Mat']],
 [[b'Do', b'you', b'want', b'to', b'come', b'visit'],
  [b"I'm", b'free', b'tomorrow']]]>

Ragged tensors can also be constructed by pairing flat values tensors with row-partitioning tensors indicating how those values should be divided into rows, using factory classmethods such as tf.RaggedTensor.from_value_rowids, tf.RaggedTensor.from_row_lengths, and tf.RaggedTensor.from_row_splits.

tf.RaggedTensor.from_value_rowids

If you know which row each value belongs to, then you can build a RaggedTensor using a value_rowids row-partitioning tensor:

print(tf.RaggedTensor.from_value_rowids(
    values=[3, 1, 4, 1, 5, 9, 2],
    value_rowids=[0, 0, 0, 0, 2, 2, 3]))

<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9], [2]]>

tf.RaggedTensor.from_row_lengths

If you know how long each row is, then you can use a row_lengths row-partitioning tensor:

print(tf.RaggedTensor.from_row_lengths(
    values=[3, 1, 4, 1, 5, 9, 2],
    row_lengths=[4, 0, 2, 1]))

<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9], [2]]>

tf.RaggedTensor.from_row_splits

If you know the index where each row starts and ends, then you can use a row_splits row-partitioning tensor:

print(tf.RaggedTensor.from_row_splits(
    values=[3, 1, 4, 1, 5, 9, 2],
    row_splits=[0, 4, 4, 6, 7]))

<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9], [2]]>

See the tf.RaggedTensor class documentation for a full list of factory methods.

Note: By default, these factory methods add assertions that the row partition tensor is well-formed and consistent with the number of values. The validate=False parameter can be used to skip these checks if you can guarantee that the inputs are well-formed and consistent.
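For example, here is a minimal sketch of skipping validation; the row_splits value is the same one used above, so it is known to be consistent with the values:

# Skip well-formedness checks; safe here because the splits are known valid.
rt = tf.RaggedTensor.from_row_splits(
    values=[3, 1, 4, 1, 5, 9, 2],
    row_splits=[0, 4, 4, 6, 7],
    validate=False)
print(rt)

<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9], [2]]>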

What you can store in a ragged tensor

As with normal Tensors, the values in a RaggedTensor must all have the same type; and the values must all be at the same nesting depth (the rank of the tensor):

print(tf.ragged.constant([["Hi"], ["How", "are", "you"]]))  # ok: type=string, rank=2

<tf.RaggedTensor [[b'Hi'], [b'How', b'are', b'you']]>

print(tf.ragged.constant([[[1, 2], [3]], [[4, 5]]]))        # ok: type=int32, rank=3

<tf.RaggedTensor [[[1, 2], [3]], [[4, 5]]]>

try:
  tf.ragged.constant([["one", "two"], [3, 4]])              # bad: multiple types
except ValueError as exception:
  print(exception)

Can't convert Python sequence with mixed types to Tensor.

try:
  tf.ragged.constant(["A", ["B", "C"]])                     # bad: multiple nesting depths
except ValueError as exception:
  print(exception)

all scalar values must have the same nesting depth

Example use case

The following example demonstrates how RaggedTensors can be used to construct and combine unigram and bigram embeddings for a batch of variable-length queries, using special markers for the beginning and end of each sentence. For more details on the ops used in this example, check the tf.ragged package documentation.

queries = tf.ragged.constant([['Who', 'is', 'Dan', 'Smith'],
                              ['Pause'],
                              ['Will', 'it', 'rain', 'later', 'today']])

# Create an embedding table.
num_buckets = 1024
embedding_size = 4
embedding_table = tf.Variable(
    tf.random.truncated_normal([num_buckets, embedding_size],
                               stddev=1.0 / math.sqrt(embedding_size)))

# Look up the embedding for each word.
word_buckets = tf.strings.to_hash_bucket_fast(queries, num_buckets)
word_embeddings = tf.nn.embedding_lookup(embedding_table, word_buckets)

# Add markers to the beginning and end of each sentence.
marker = tf.fill([queries.nrows(), 1], '#')
padded = tf.concat([marker, queries, marker], axis=1)

# Build word bigrams and look up embeddings.
bigrams = tf.strings.join([padded[:, :-1], padded[:, 1:]], separator='+')
bigram_buckets = tf.strings.to_hash_bucket_fast(bigrams, num_buckets)
bigram_embeddings = tf.nn.embedding_lookup(embedding_table, bigram_buckets)

# Find the average embedding for each sentence.
all_embeddings = tf.concat([word_embeddings, bigram_embeddings], axis=1)
avg_embedding = tf.reduce_mean(all_embeddings, axis=1)
print(avg_embedding)

tf.Tensor(
[[-0.07189021  0.23444025 -0.04020268 ...]
 [ 0.10560822  0.00976487 -0.17885399 ...]
 [-0.23479678 -0.15996003  0.07078557 ...]], shape=(3, 4), dtype=float32)

Ragged and uniform dimensions

A ragged dimension is a dimension whose slices may have different lengths. For example, the inner (column) dimension of rt = [[3, 1, 4, 1], [], [5, 9, 2], [6], []] is ragged, since the column slices (rt[0, :], ..., rt[4, :]) have different lengths. Dimensions whose slices all have the same length are called uniform dimensions.
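To make this concrete, the following short sketch uses tf.RaggedTensor.row_lengths (a documented RaggedTensor method) to report the length of each row slice of rt:

rt = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6], []])
# Each row has a different length, so the inner dimension is ragged.
print(rt.row_lengths())

tf.Tensor([4 0 3 1 0], shape=(5,), dtype=int64)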

The outermost dimension of a ragged tensor is always uniform, since it consists of a single slice (and, therefore, there is no possibility for differing slice lengths). The remaining dimensions may be either ragged or uniform. For example, you may store the word embeddings for each word in a batch of sentences using a ragged tensor with shape [num_sentences, (num_words), embedding_size], where the parentheses around (num_words) indicate that the dimension is ragged.

Ragged tensors may have multiple ragged dimensions. For example, you could store a batch of structured text documents using a tensor with shape [num_documents, (num_paragraphs), (num_sentences), (num_words)] (where again parentheses are used to indicate ragged dimensions).

As with tf.Tensor, the rank of a ragged tensor is its total number of dimensions (including both ragged and uniform dimensions). A potentially ragged tensor is a value that might be either a tf.Tensor or a tf.RaggedTensor.

When describing the shape of a RaggedTensor, ragged dimensions are conventionally indicated by enclosing them in parentheses. For example, as you saw above, the shape of a 3D RaggedTensor that stores word embeddings for each word in a batch of sentences can be written as [num_sentences, (num_words), embedding_size].

The RaggedTensor.shape attribute returns a tf.TensorShape for a ragged tensor where ragged dimensions have size None:

tf.ragged.constant([["Hi"], ["How", "are", "you"]]).shape

TensorShape([2, None])

The method tf.RaggedTensor.bounding_shape can be used to find a tight bounding shape for a given RaggedTensor:

print(tf.ragged.constant([["Hi"], ["How", "are", "you"]]).bounding_shape())

tf.Tensor([2 3], shape=(2,), dtype=int64)
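Ragged and uniform dimensions can also be mixed below the outermost dimension. As a small sketch (using tf.ragged.constant's ragged_rank argument, which controls how many partitioned dimensions are ragged), a uniform inner dimension shows up as a fixed size rather than None:

# ragged_rank=1 makes only the second dimension ragged; the innermost
# dimension is uniform with size 2.
rt = tf.ragged.constant([[[1, 2], [3, 4]], [[5, 6]]], ragged_rank=1)
print(rt.shape)

(2, None, 2)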

Ragged vs sparse

A ragged tensor should not be thought of as a type of sparse tensor. In particular, sparse tensors are efficient encodings for tf.Tensor that model the same data in a compact format; but ragged tensors are an extension to tf.Tensor that models an expanded class of data. This difference is crucial when defining operations:

- Applying an op to a sparse or dense tensor should always give the same result.
- Applying an op to a ragged or sparse tensor may give different results.

As an illustrative example, consider how array operations such as concat, stack, and tile are defined for ragged vs. sparse tensors. Concatenating ragged tensors joins each row to form a single row with the combined length:

ragged_x = tf.ragged.constant([["John"], ["a", "big", "dog"], ["my", "cat"]])
ragged_y = tf.ragged.constant([["fell", "asleep"], ["barked"], ["is", "fuzzy"]])
print(tf.concat([ragged_x, ragged_y], axis=1))

<tf.RaggedTensor [[b'John', b'fell', b'asleep'], [b'a', b'big', b'dog', b'barked'],
 [b'my', b'cat', b'is', b'fuzzy']]>
However, concatenating sparse tensors is equivalent to concatenating the corresponding dense tensors, as illustrated by the following example (where Ø indicates missing values):

sparse_x = ragged_x.to_sparse()
sparse_y = ragged_y.to_sparse()
sparse_result = tf.sparse.concat(sp_inputs=[sparse_x, sparse_y], axis=1)
print(tf.sparse.to_dense(sparse_result, ''))

tf.Tensor(
[[b'John' b'' b'' b'fell' b'asleep']
 [b'a' b'big' b'dog' b'barked' b'']
 [b'my' b'cat' b'' b'is' b'fuzzy']], shape=(3, 5), dtype=string)

For another example of why this distinction is important, consider the definition of "the mean value of each row" for an op such as tf.reduce_mean. For a ragged tensor, the mean value for a row is the sum of the row's values divided by the row's width. But for a sparse tensor, the mean value for a row is the sum of the row's values divided by the sparse tensor's overall width (which is greater than or equal to the width of the longest row).
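The following minimal sketch (using a small hypothetical ragged tensor and its zero-padded dense counterpart) illustrates the difference:

rt = tf.ragged.constant([[1.0, 2.0], [3.0, 4.0, 5.0, 6.0]])

# Ragged: each row's sum is divided by that row's own width.
print(tf.reduce_mean(rt, axis=1))

tf.Tensor([1.5 4.5], shape=(2,), dtype=float32)

# Dense (padded with zeros): every row's sum is divided by the overall width (4).
print(tf.reduce_mean(rt.to_tensor(), axis=1))

tf.Tensor([0.75 4.5 ], shape=(2,), dtype=float32)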

TensorFlow APIs

Keras

tf.keras is TensorFlow's high-level API for building and training deep learning models. It doesn't have ragged support, but it does support masked tensors, so the easiest way to use a ragged tensor in a Keras model is to convert the ragged tensor to a dense tensor with .to_tensor() and then use Keras's built-in masking:

# Task: predict whether each sentence is a question or not.
sentences = tf.constant(
    ['What makes you think she is a witch?',
     'She turned me into a newt.',
     'A newt?',
     'Well, I got better.'])
is_question = tf.constant([True, False, True, False])

# Preprocess the input strings.
hash_buckets = 1000
words = tf.strings.split(sentences, ' ')
hashed_words = tf.strings.to_hash_bucket_fast(words, hash_buckets)
hashed_words.to_list()

[[940, 203, 668, 387, 790, 320, 939, 185],
 [315, 515, 791, 181, 939, 787],
 [564, 205],
 [820, 180, 993, 739]]

hashed_words.to_tensor()

<tf.Tensor: shape=(4, 8), dtype=int64, numpy=
array([[940, 203, 668, 387, 790, 320, 939, 185],
       [315, 515, 791, 181, 939, 787,   0,   0],
       [564, 205,   0,   0,   0,   0,   0,   0],
       [820, 180, 993, 739,   0,   0,   0,   0]])>


# Build the Keras model.
keras_model = tf.keras.Sequential([
    tf.keras.layers.Embedding(hash_buckets, 16),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(32),
    tf.keras.layers.Activation(tf.nn.relu),
    tf.keras.layers.Dense(1)
])

keras_model.compile(loss='binary_crossentropy', optimizer='rmsprop')
keras_model.fit(hashed_words.to_tensor(), is_question, epochs=5)

Epoch 1/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
Epoch 2/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
Epoch 3/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 44ms/step
Epoch 4/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 44ms/step
print(keras_model.predict(hashed_words.to_tensor()))

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 407ms/step
[[-0.00517703]
 [-0.00227403]
 [-0.00706224]
 [-0.00354813]]

tf.Example

tf.Example is a standard protobuf encoding for TensorFlow data. Data encoded with tf.Examples often includes variable-length features. For example, the following code defines a batch of four tf.Example messages with different feature lengths:

import google.protobuf.text_format as pbtext

def build_tf_example(s):
  return pbtext.Merge(s, tf.train.Example()).SerializeToString()

example_batch = [
  build_tf_example(r'''
    features {
      feature {key: "colors" value {bytes_list {value: ["red", "blue"]} } }
      feature {key: "lengths" value {int64_list {value: [7]} } } }'''),
  build_tf_example(r'''
    features {
      feature {key: "colors" value {bytes_list {value: ["orange"]} } }
      feature {key: "lengths" value {int64_list {value: []} } } }'''),
  build_tf_example(r'''
    features {
      feature {key: "colors" value {bytes_list {value: ["black", "yellow"]} } }
      feature {key: "lengths" value {int64_list {value: [1, 3]} } } }'''),
  build_tf_example(r'''
    features {
      feature {key: "colors" value {bytes_list {value: ["green"]} } }
      feature {key: "lengths" value {int64_list {value: [3, 5, 2]} } } }''')]

You can parse this encoded data using tf.io.parse_example, which takes a tensor of serialized strings and a feature specification dictionary, and returns a dictionary mapping feature names to tensors. To read the variable-length features into ragged tensors, use tf.io.RaggedFeature in the feature specification dictionary:

feature_specification = {
    'colors': tf.io.RaggedFeature(tf.string),
    'lengths': tf.io.RaggedFeature(tf.int64),
}
feature_tensors = tf.io.parse_example(example_batch, feature_specification)
for name, value in feature_tensors.items():
  print("{}={}".format(name, value))

colors=<tf.RaggedTensor [[b'red', b'blue'], [b'orange'], [b'black', b'yellow'], [b'green']]>
lengths=<tf.RaggedTensor [[7], [], [1, 3], [3, 5, 2]]>

tf.io.RaggedFeature can also be used to read features with multiple ragged dimensions. For details, refer to the API documentation.
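As a rough sketch (the feature keys and partition layout here are hypothetical, not from this guide), a feature with two ragged dimensions can be described by pairing a flat value key with a partition read from another feature:

# Hypothetical spec: 'words' holds the flat values, and 'sentence_lengths'
# holds a row-lengths partition that groups the words into sentences.
nested_feature_specification = {
    'words': tf.io.RaggedFeature(
        tf.string,
        value_key='words',
        partitions=[tf.io.RaggedFeature.RowLengths('sentence_lengths')]),
}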

Datasets
