tf.keras.utils.pad_sequences
Pads sequences to the same length.
tf.keras.utils.pad_sequences(
    sequences,
    maxlen=None,
    dtype='int32',
    padding='pre',
    truncating='pre',
    value=0.0
)
This function transforms a list (of length num_samples) of sequences (lists of integers) into a 2D NumPy array of shape (num_samples, num_timesteps). num_timesteps is either the maxlen argument, if provided, or the length of the longest sequence in the list.

Sequences that are shorter than num_timesteps are padded with value until they are num_timesteps long.

Sequences longer than num_timesteps are truncated so that they fit the desired length.

The position where padding or truncation happens is determined by the arguments padding and truncating, respectively. By default both happen at the beginning of the sequence: padding values are prepended, and excess values are removed from the front.
>>> sequence = [[1], [2, 3], [4, 5, 6]]
>>> keras.utils.pad_sequences(sequence)
array([[0, 0, 1],
       [0, 2, 3],
       [4, 5, 6]], dtype=int32)

>>> keras.utils.pad_sequences(sequence, value=-1)
array([[-1, -1,  1],
       [-1,  2,  3],
       [ 4,  5,  6]], dtype=int32)

>>> keras.utils.pad_sequences(sequence, padding='post')
array([[1, 0, 0],
       [2, 3, 0],
       [4, 5, 6]], dtype=int32)

>>> keras.utils.pad_sequences(sequence, maxlen=2)
array([[0, 1],
       [2, 3],
       [5, 6]], dtype=int32)
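
The padding and truncation rules can also be sketched in plain NumPy. The function below is a simplified illustration for flat (1D) sequences, not the Keras source; the name pad_sequences_sketch is hypothetical and exists only for this example.

```python
import numpy as np


def pad_sequences_sketch(sequences, maxlen=None, dtype="int32",
                         padding="pre", truncating="pre", value=0):
    """Illustrative re-implementation for flat (1D) sequences only."""
    if padding not in ("pre", "post") or truncating not in ("pre", "post"):
        raise ValueError("padding and truncating must be 'pre' or 'post'")
    if maxlen is None:
        # Default target length: the longest sequence in the list.
        maxlen = max((len(s) for s in sequences), default=0)

    # Start from an array filled entirely with the padding value.
    out = np.full((len(sequences), maxlen), value, dtype=dtype)

    for i, seq in enumerate(sequences):
        if len(seq) == 0 or maxlen == 0:
            continue  # nothing to copy; the row stays all padding
        # Truncate: 'pre' drops values from the front, 'post' from the end.
        kept = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
        # Pad: 'pre' right-aligns the kept values, 'post' left-aligns them.
        if padding == "pre":
            out[i, maxlen - len(kept):] = kept
        else:
            out[i, :len(kept)] = kept
    return out


sequence = [[1], [2, 3], [4, 5, 6]]
print(pad_sequences_sketch(sequence))
print(pad_sequences_sketch(sequence, maxlen=2))
print(pad_sequences_sketch(sequence, maxlen=2, truncating="post"))
```

Filling the output with the padding value first and then copying each truncated sequence into a slice keeps the two decisions (where to cut, where to align) independent of each other.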
Args

| Argument | Description |
|---|---|
| sequences | List of sequences (each sequence is a list of integers). |
| maxlen | Optional int, maximum length of all sequences. If not provided, sequences are padded to the length of the longest individual sequence. |
| dtype | Optional, defaults to "int32". Type of the output sequences. To pad sequences with variable-length strings, you can use object. |
| padding | String, "pre" or "post" (optional, defaults to "pre"): pad either before or after each sequence. |
| truncating | String, "pre" or "post" (optional, defaults to "pre"): remove values from sequences longer than maxlen, either at the beginning or at the end of the sequences. |
| value | Float or string, padding value. Optional, defaults to 0. |
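
A few of the less common argument combinations may be easier to see in code. The snippet below is a small sketch assuming TensorFlow is installed and exposes this function as tf.keras.utils.pad_sequences; the "<pad>" token and the sample data are made up for illustration.

```python
import tensorflow as tf

sequence = [[1], [2, 3], [4, 5, 6]]

# truncating="post" drops values from the end instead of the beginning.
post_truncated = tf.keras.utils.pad_sequences(
    sequence, maxlen=2, truncating="post"
)
print(post_truncated)

# Float data needs a float dtype; with the default "int32" the values
# would be cast to integers.
floats = tf.keras.utils.pad_sequences(
    [[0.5], [1.5, 2.5]], dtype="float32", padding="post"
)
print(floats)

# String tokens: dtype=object together with a string padding value.
# "<pad>" is an arbitrary placeholder chosen for this example.
tokens = tf.keras.utils.pad_sequences(
    [["a"], ["b", "c"]], dtype=object, value="<pad>"
)
print(tokens)
```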
Returns

NumPy array with shape (len(sequences), maxlen). If maxlen is not provided, the second dimension is the length of the longest sequence.
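
As a quick check of the returned shape, the sketch below compares the result with and without maxlen. It assumes TensorFlow is installed; the sample data is illustrative only.

```python
import tensorflow as tf

sequence = [[1], [2, 3], [4, 5, 6]]

# Without maxlen, the second dimension is the longest sequence length.
default_shape = tf.keras.utils.pad_sequences(sequence).shape

# With maxlen, the second dimension is exactly maxlen, even when it
# exceeds the length of every input sequence.
fixed_shape = tf.keras.utils.pad_sequences(sequence, maxlen=5).shape

print(default_shape, fixed_shape)
```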
Last updated 2024-06-07 UTC.