tf.nn.ctc_greedy_decoder
Performs greedy (best path) decoding on the logits given in inputs.
tf.nn.ctc_greedy_decoder(
    inputs, sequence_length, merge_repeated=True, blank_index=None
)
Given a tensor as inputs, the blank_index parameter defines the class index of the blank symbol.
For example:
If blank_index is equal to 1:
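import tensorflow as tf

inf = float("inf")
# Time-major logits: [max_time=3, batch_size=2, num_classes=3].
logits = tf.constant([[[   0., -inf, -inf],
                       [ -2.3, -inf, -0.1]],
                      [[ -inf, -0.5, -inf],
                       [ -inf, -inf, -0.1]],
                      [[ -inf, -inf, -inf],
                       [ -0.1, -inf, -2.3]]])
# One length per batch entry: 2 time steps for the first, 3 for the second.
seq_lens = tf.constant([2, 3])
outputs = tf.nn.ctc_greedy_decoder(
    logits,
    seq_lens,
    blank_index=1)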
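To make the best-path rule concrete, here is a minimal pure-Python sketch of greedy decoding applied to the same logits. It is an illustration of the behavior described on this page, not TensorFlow's implementation: the function name greedy_decode is hypothetical, and resolving ties in favor of the lowest class index is an assumption of the sketch.

```python
import math

def greedy_decode(logits, seq_lens, blank_index, merge_repeated=True):
    """Best-path decode of time-major logits[t][b][c] (illustrative sketch)."""
    decoded, neg_sum_logits = [], []
    for b, length in enumerate(seq_lens):
        path, total, prev = [], 0.0, None
        for t in range(length):  # steps beyond the sequence length are ignored
            row = logits[t][b]
            # Assumption: on ties the lowest class index wins (max keeps the first).
            best = max(range(len(row)), key=row.__getitem__)
            total += row[best]  # the blank's logit is summed like any other
            is_repeat = merge_repeated and best == prev
            if best != blank_index and not is_repeat:
                path.append(best)
            prev = best
        decoded.append(path)
        neg_sum_logits.append(-total)
    return decoded, neg_sum_logits

inf = math.inf
logits = [[[0.0, -inf, -inf], [-2.3, -inf, -0.1]],
          [[-inf, -0.5, -inf], [-inf, -inf, -0.1]],
          [[-inf, -inf, -inf], [-0.1, -inf, -2.3]]]
decoded, neg_sum_logits = greedy_decode(logits, [2, 3], blank_index=1)
print(decoded)
print(neg_sum_logits)
```

In this sketch the first batch entry follows the path 0, 1 over its two time steps and decodes to [0], because class 1 is the blank. The second follows 2, 2, 0 and decodes to [2, 0] once the repeated 2 is merged. The summed scores come out near 0.5 and 0.3, and the first includes the blank's logit of -0.5.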
Notes:
- Unlike ctc_beam_search_decoder, ctc_greedy_decoder considers blanks as regular elements when computing the probability of a sequence.
- Default blank_index is (num_classes - 1), unless overridden.
If merge_repeated is True, merge repeated classes in output. This means that if consecutive logits' maximum indices are the same, only the first of these is emitted. The sequence A B B * B * B (where '*' is the blank label) becomes
- A B B B if merge_repeated=True.
- A B B B B if merge_repeated=False.
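The two results above can be reproduced with a short sketch that works on symbols instead of logits. The helper name collapse is hypothetical and the code only mirrors the rule as stated here: a label is dropped when it is the blank, or when merge_repeated is set and it equals the label immediately before it.

```python
def collapse(path, blank="*", merge_repeated=True):
    """Collapse a best path of labels into the emitted sequence (sketch)."""
    out, prev = [], None
    for label in path:
        # Emit unless the label is blank or repeats the previous step's label.
        if label != blank and not (merge_repeated and label == prev):
            out.append(label)
        prev = label  # blanks also update prev, so "B * B" keeps both Bs
    return out

path = "A B B * B * B".split()
merged = collapse(path, merge_repeated=True)
unmerged = collapse(path, merge_repeated=False)
print(" ".join(merged))    # A B B B
print(" ".join(unmerged))  # A B B B B
```

A blank between two identical labels separates them, which is why only the adjacent pair B B is merged.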
Args |
inputs | 3-D float Tensor sized [max_time, batch_size, num_classes]. The logits.
sequence_length | 1-D int32 vector containing sequence lengths, having size [batch_size].
merge_repeated | Boolean. Default: True.
blank_index | (Optional). Default: num_classes - 1. Defines the class index to use for the blank label. Negative values count back from num_classes; for example, -1 selects num_classes - 1 for the blank symbol, which corresponds to the default.
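The blank_index rule can be sketched as a small helper. The name resolve_blank_index is hypothetical, and both the treatment of None as -1 and the range check are assumptions of this sketch based on the description above, not a statement about how TensorFlow validates the argument.

```python
def resolve_blank_index(blank_index, num_classes):
    """Map a user-supplied blank_index to a concrete class index (sketch)."""
    if blank_index is None:
        blank_index = -1  # assumption: the default behaves like -1
    if blank_index < 0:
        blank_index += num_classes  # negative values count back from num_classes
    if not 0 <= blank_index < num_classes:
        # Assumption of this sketch: out-of-range values are rejected.
        raise ValueError("blank_index is out of range for num_classes")
    return blank_index

print(resolve_blank_index(None, 3))  # 2
print(resolve_blank_index(-1, 3))    # 2
print(resolve_blank_index(1, 3))     # 1
```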
Returns |
A tuple (decoded, neg_sum_logits) where |
decoded | A single-element list. decoded[0] is a SparseTensor containing the decoded outputs such that:
- decoded[0].indices: Indices matrix (total_decoded_outputs, 2). The rows store [batch, time].
- decoded[0].values: Values vector, size (total_decoded_outputs). The vector stores the decoded classes.
- decoded[0].dense_shape: Shape vector, size (2). The shape values are [batch_size, max_decoded_length].
neg_sum_logits | A float matrix (batch_size x 1) containing, for the sequence found, the negative of the sum of the greatest logit at each time frame.
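As a rough illustration of how the three sparse components relate to per-example decoded sequences, the sketch below builds them from plain Python lists. The helper to_sparse_components is hypothetical, and it reads the time column of indices as the position within each decoded sequence.

```python
def to_sparse_components(sequences):
    """Build (indices, values, dense_shape) from decoded label lists (sketch)."""
    indices, values = [], []
    for b, seq in enumerate(sequences):
        for t, label in enumerate(seq):
            indices.append([b, t])  # row layout: [batch, time]
            values.append(label)    # the decoded class at that position
    max_decoded_length = max((len(seq) for seq in sequences), default=0)
    dense_shape = [len(sequences), max_decoded_length]
    return indices, values, dense_shape

# Two batch entries that decoded to [0] and [2, 0].
indices, values, dense_shape = to_sparse_components([[0], [2, 0]])
print(indices)      # [[0, 0], [1, 0], [1, 1]]
print(values)       # [0, 2, 0]
print(dense_shape)  # [2, 2]
```

Here total_decoded_outputs is 3, so indices has three rows and values has three entries, while dense_shape records the batch size and the longest decoded sequence.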
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates. Some content is licensed under the numpy license.
Last updated 2024-04-26 UTC.