tfm.core.input_reader.InputReader
Input reader that returns a tf.data.Dataset instance.
tfm.core.input_reader.InputReader(
params: tfm.core.config_definitions.DataConfig,
dataset_fn=tf.data.TFRecordDataset,
decoder_fn: Optional[Callable[..., Any]] = None,
combine_fn: Optional[Callable[..., Any]] = None,
sample_fn: Optional[Callable[..., Any]] = None,
parser_fn: Optional[Callable[..., Any]] = None,
filter_fn: Optional[Callable[..., tf.Tensor]] = None,
transform_and_batch_fn: Optional[Callable[[tf.data.Dataset, Optional[tf.distribute.InputContext]],
tf.data.Dataset]] = None,
postprocess_fn: Optional[Callable[..., Any]] = None
)
Args
| params | A config_definitions.DataConfig object. |
| dataset_fn | A callable that builds a tf.data.Dataset from the input files, for example tf.data.TFRecordDataset. |
| decoder_fn | An optional callable that takes a serialized data string and decodes it into a dictionary of raw tensors. |
| combine_fn | An optional callable that takes a dictionary of tf.data.Dataset objects as input and outputs a combined dataset. It is executed after decoder_fn and before sample_fn. |
| sample_fn | An optional callable that takes a tf.data.Dataset object as input and outputs the transformed dataset. It performs sampling on the decoded raw tensor dictionaries and is executed before parser_fn. |
| parser_fn | An optional callable that takes the decoded raw tensor dictionary and parses it into a dictionary of tensors that the model can consume. It is executed after decoder_fn. |
| filter_fn | An optional callable that maps a dataset element to a boolean. It is executed after parser_fn. |
| transform_and_batch_fn | An optional callable that takes a tf.data.Dataset object and an optional tf.distribute.InputContext as input and returns a tf.data.Dataset object. It is executed after parser_fn to transform and batch the dataset. If None, the dataset is batched to the per-replica batch size after parser_fn is executed. |
| postprocess_fn | An optional callable that processes batched tensors. It is executed after batching. |
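The execution order spelled out in the argument descriptions above can be sketched in plain Python. This is only an illustrative stand-in, not the library's implementation: `run_stages` and its `batch_size` parameter are hypothetical names, lists replace `tf.data.Dataset` objects, and `combine_fn` is left out because the sketch assumes a single input dataset.

```python
from typing import Any, Callable, Iterable, List, Optional


def run_stages(
    records: Iterable[Any],
    decoder_fn: Optional[Callable[[Any], Any]] = None,
    sample_fn: Optional[Callable[[List[Any]], List[Any]]] = None,
    parser_fn: Optional[Callable[[Any], Any]] = None,
    filter_fn: Optional[Callable[[Any], bool]] = None,
    batch_size: int = 2,
    postprocess_fn: Optional[Callable[[List[Any]], Any]] = None,
) -> List[Any]:
    """Hypothetical helper: applies the hooks in the documented order."""
    items = list(records)
    if decoder_fn is not None:
        # Decode each serialized record.
        items = [decoder_fn(record) for record in items]
    if sample_fn is not None:
        # Sampling sees the whole decoded collection, before parsing.
        items = sample_fn(items)
    if parser_fn is not None:
        # Parse each decoded example into model-ready features.
        items = [parser_fn(example) for example in items]
    if filter_fn is not None:
        # Keep only the parsed examples for which filter_fn is true.
        items = [example for example in items if filter_fn(example)]
    # Default batching step (stands in for the transform_and_batch_fn=None case).
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if postprocess_fn is not None:
        # Post-processing operates on whole batches.
        batches = [postprocess_fn(batch) for batch in batches]
    return batches


batches = run_stages(
    records=["1", "2", "3", "4", "5", "6"],
    decoder_fn=int,
    sample_fn=lambda examples: examples[:5],
    parser_fn=lambda value: {"value": value * 10},
    filter_fn=lambda example: example["value"] != 30,
    batch_size=2,
    postprocess_fn=lambda batch: [example["value"] for example in batch],
)
print(batches)
```

In the real class each hook operates on tensors or on a tf.data.Dataset rather than on Python lists, so the sketch conveys only the ordering of the stages.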
Methods
get_files
get_files(
input_path
)
Gets matched files. Can be overridden by subclasses.
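As a rough illustration of the override pattern mentioned above, the sketch below uses a hypothetical stand-in base class rather than the real InputReader. `ReaderStandIn`, `EvenShardReader`, and the in-memory file list are all assumptions made for the example; only the method name `get_files` and its `input_path` argument come from this page.

```python
import fnmatch
from typing import List, Sequence


class ReaderStandIn:
    """Hypothetical stand-in for a reader class; not tfm's InputReader."""

    def __init__(self, available_files: Sequence[str]):
        # An in-memory file list replaces a real file system for the sketch.
        self._available_files = list(available_files)

    def get_files(self, input_path: str) -> List[str]:
        # Return the known files that match the glob-style pattern.
        return sorted(fnmatch.filter(self._available_files, input_path))


class EvenShardReader(ReaderStandIn):
    """Hypothetical subclass that keeps every second matched file."""

    def get_files(self, input_path: str) -> List[str]:
        matched = super().get_files(input_path)
        return matched[::2]


files = [
    "data/train-00000.tfrecord",
    "data/train-00001.tfrecord",
    "data/train-00002.tfrecord",
    "data/eval-00000.tfrecord",
]
base_matches = ReaderStandIn(files).get_files("data/train-*.tfrecord")
even_matches = EvenShardReader(files).get_files("data/train-*.tfrecord")
print(base_matches)
print(even_matches)
```

A subclass of the real class would follow the same shape: call the parent's `get_files`, then adjust the returned list.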
read
read(
input_context: Optional[tf.distribute.InputContext] = None,
dataset: Optional[tf.data.Dataset] = None
) -> tf.data.Dataset
Generates a tf.data.Dataset object.
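When an `input_context` is supplied and no `transform_and_batch_fn` is set, the dataset is batched to the per-replica batch size. The arithmetic behind that term can be sketched as follows. This is a plain-Python illustration under stated assumptions: `per_replica_batch_size` is a hypothetical helper, `global_batch_size` and `num_replicas_in_sync` are assumed input names, and the divisibility check is an assumption about how an uneven split would be handled.

```python
from typing import Optional


def per_replica_batch_size(
    global_batch_size: int,
    num_replicas_in_sync: Optional[int] = None,
) -> int:
    """Hypothetical helper: splits a global batch evenly across replicas."""
    if num_replicas_in_sync is None:
        # No distribution context: the whole batch goes to a single replica.
        return global_batch_size
    if global_batch_size % num_replicas_in_sync != 0:
        # Assumption: an uneven split is treated as an error.
        raise ValueError(
            f"Global batch size {global_batch_size} is not divisible by "
            f"{num_replicas_in_sync} replicas."
        )
    return global_batch_size // num_replicas_in_sync


print(per_replica_batch_size(64, 8))
print(per_replica_batch_size(64))
```

With a global batch of 64 examples and 8 replicas in sync, each replica would receive batches of 8 examples.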
Class Variables
| static_randnum | 1021534207 |
Last updated 2024-03-15 UTC.