tfds.deprecated.text.ByteTextEncoder
Byte-encodes text.
Inherits From: TextEncoder
tfds.deprecated.text.ByteTextEncoder(
    additional_tokens=None
)
Args
additional_tokens | list<str>, list of additional tokens. These are assigned vocab ids [1, 1+len(additional_tokens)), i.e. 1 through len(additional_tokens). Useful for things like "end-of-string" tokens (e.g. "<EOS>").
Attributes
additional_tokens | The additional tokens passed to the constructor.
vocab_size | Size of the vocabulary. Encode produces ints in [1, vocab_size).
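The id layout implied by the argument and attribute descriptions can be sketched in plain Python. This is an illustration only and does not call the library: it assumes id 0 is reserved for padding, that additional tokens take the first ids starting at 1, and that the 256 possible byte values follow them. The names `id_layout` and `NUM_BYTES` are hypothetical, and `"<EOS>"` and `"<UNK>"` are example tokens.

```python
NUM_BYTES = 256  # one id per possible byte value (assumption of this sketch)


def id_layout(additional_tokens):
    """Return (token_ids, byte_ids, vocab_size) for the assumed id layout."""
    n = len(additional_tokens)
    # Assumption: id 0 is padding, so additional tokens start at id 1.
    token_ids = {tok: i + 1 for i, tok in enumerate(additional_tokens)}
    # Byte ids come right after the additional tokens.
    vocab_size = n + NUM_BYTES + 1
    byte_ids = range(n + 1, vocab_size)
    return token_ids, byte_ids, vocab_size


token_ids, byte_ids, vocab_size = id_layout(["<EOS>", "<UNK>"])
print(token_ids)                   # {'<EOS>': 1, '<UNK>': 2}
print(byte_ids[0], byte_ids[-1])   # 3 258
print(vocab_size)                  # 259
```

Under these assumptions every id that encoding can emit lies in [1, vocab_size), which matches the description of `vocab_size`.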
Methods
decode
decode(
    ids
)
Decodes a list of integers into text.
encode
encode(
    s
)
Encodes text into a list of integers.
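A minimal sketch of what byte-level encoding and decoding look like, written in plain Python rather than against the library. It assumes UTF-8 bytes, no additional tokens, and a shift of 1 so that id 0 stays free for padding; `encode_bytes` and `decode_bytes` are hypothetical helpers, and dropping every 0 id on decode is a simplification of this sketch.

```python
def encode_bytes(s):
    """UTF-8 encode `s`, then shift each byte by 1 so id 0 stays unused."""
    return [b + 1 for b in s.encode("utf-8")]


def decode_bytes(ids):
    """Drop padding ids (0), undo the shift, and decode the bytes as UTF-8."""
    raw = bytes(i - 1 for i in ids if i > 0)
    return raw.decode("utf-8", errors="replace")


ids = encode_bytes("hé")
print(ids)                         # [105, 196, 170]
print(decode_bytes(ids + [0, 0]))  # hé
```

Note that one character can map to several ids: "é" occupies two bytes in UTF-8, so the two-character string yields three ids.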
load_from_file
@classmethod
load_from_file(
    filename_prefix
)
Load from file. Inverse of save_to_file.
save_to_file
save_to_file(
    filename_prefix
)
Store to file. Inverse of load_from_file.
Last updated 2024-04-26 UTC.