1. make_accumulators_ptransform: a beam.PTransform which maps data to a
   more compact mergeable representation (accumulator). Mergeable here
   means that the representations produced from the parts of a partition
   of the dataset can be combined into a representation of the entire dataset.
2. merge_accumulators_ptransform: a beam.PTransform which operates on a
   collection of accumulators, i.e. the results of both the
   make_accumulators_ptransform and merge_accumulators_ptransform stages,
   and produces a single reduced accumulator. This operation must be
   associative and commutative in order to have reliably reproducible results.
3. extract_output_ptransform: a beam.PTransform which operates on the result
   of the merge_accumulators_ptransform stage and produces the outputs of
   the analyzer. These outputs must be consistent with the output_dtypes and
   output_shapes provided to ptransform_analyzer.
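
The three stages can be illustrated with a minimal sketch of a mean analyzer. This is plain Python rather than Beam: the functions make_accumulator, merge_accumulators and extract_output are hypothetical stand-ins for the three beam.PTransforms, and the (sum, count) accumulator is an assumption chosen for illustration.

```python
from typing import Iterable, Tuple

# Assumed accumulator layout for this sketch: (running sum, element count).
Accumulator = Tuple[float, int]


def make_accumulator(partition: Iterable[float]) -> Accumulator:
    """Stand-in for the make stage: compress raw data into an accumulator."""
    values = list(partition)
    return (float(sum(values)), len(values))


def merge_accumulators(accumulators: Iterable[Accumulator]) -> Accumulator:
    """Stand-in for the merge stage: reduce many accumulators to one.

    Addition is associative and commutative, so the grouping and order of
    the inputs do not change the result.
    """
    total, count = 0.0, 0
    for partial_sum, partial_count in accumulators:
        total += partial_sum
        count += partial_count
    return (total, count)


def extract_output(accumulator: Accumulator) -> float:
    """Stand-in for the extract stage: turn the final accumulator into a mean."""
    total, count = accumulator
    return total / count if count else float("nan")


partitions = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
accumulators = [make_accumulator(p) for p in partitions]
merged = merge_accumulators(accumulators)
mean = extract_output(merged)

# The merge stage also accepts its own outputs as inputs.
remerged = merge_accumulators(
    [merge_accumulators(accumulators[:2]), accumulators[2]]
)
```

Keeping the count alongside the sum is what makes the accumulator mergeable: a mean on its own could not be combined correctly across partitions of different sizes.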
This container also holds a cache_coder (PTransformAnalyzerCacheCoder),
which encodes the outputs and decodes the inputs of the
merge_accumulators_ptransform stage.
In many cases, SimpleJsonPTransformAnalyzerCacheCoder is sufficient.
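
As a rough illustration of what a JSON-based coder for accumulators might do, the sketch below round-trips an accumulator through bytes. JsonAccumulatorCoder and its encode_cache / decode_cache method names are hypothetical and written for this example only; the sketch does not subclass or reproduce the library's coder classes.

```python
import json
from typing import Any


class JsonAccumulatorCoder:
    """Hypothetical coder: serializes JSON-compatible accumulators to bytes."""

    def encode_cache(self, accumulator: Any) -> bytes:
        # Serialize to a JSON string, then to UTF-8 bytes for storage.
        return json.dumps(accumulator).encode("utf-8")

    def decode_cache(self, encoded: bytes) -> Any:
        # Reverse of encode_cache: bytes -> JSON string -> Python object.
        return json.loads(encoded.decode("utf-8"))


coder = JsonAccumulatorCoder()
encoded = coder.encode_cache([21.0, 6])
decoded = coder.decode_cache(encoded)

# JSON has no tuple type, so a tuple accumulator comes back as a list.
tuple_round_trip = coder.decode_cache(coder.encode_cache((1, 2)))
```

A JSON approach like this only works when the accumulator is built from JSON-compatible values (numbers, strings, lists, dicts); other accumulator types would need a different encoding.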
To ensure the correctness of this analyzer, the following must hold for a
dataset split into parts D1, ..., Dn:
merge(make({D1, ..., Dn})) == merge({make(D1), ..., make(Dn)})
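
The condition can be checked on small inputs with a sketch like the one below. It reads make({D1, ..., Dn}) as make applied to the whole dataset, which is an interpretation of the notation above. All function names are hypothetical, and the functions are plain Python stand-ins, not Beam transforms. The first pair (min/max) satisfies the condition; the second pair (averaging per-part means) does not.

```python
from typing import Iterable, List, Tuple

MinMax = Tuple[int, int]


def make_min_max(data: Iterable[int]) -> MinMax:
    values = list(data)
    return (min(values), max(values))


def merge_min_max(accs: Iterable[MinMax]) -> MinMax:
    accs = list(accs)
    return (min(a[0] for a in accs), max(a[1] for a in accs))


def make_mean_only(data: Iterable[float]) -> float:
    values = list(data)
    return sum(values) / len(values)


def merge_mean_only(accs: Iterable[float]) -> float:
    # Averages the per-part means and ignores part sizes, which is why it fails.
    accs = list(accs)
    return sum(accs) / len(accs)


def flatten(parts: List[list]) -> list:
    return [x for part in parts for x in part]


# Mergeable: both sides of the condition agree.
int_parts = [[4, 9], [1, 7, 3], [8]]
min_max_whole = merge_min_max([make_min_max(flatten(int_parts))])
min_max_split = merge_min_max([make_min_max(p) for p in int_parts])

# Not mergeable: the two sides differ when parts have different sizes.
float_parts = [[1.0, 2.0], [3.0]]
mean_whole = merge_mean_only([make_mean_only(flatten(float_parts))])
mean_split = merge_mean_only([make_mean_only(p) for p in float_parts])
```

A check like this on a few hand-picked partitions does not prove the condition, but a mismatch such as the mean-only case is enough to show that an analyzer cannot safely be cached in this form.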
tft.experimental.CacheablePTransformAnalyzer

View source on GitHub: https://github.com/tensorflow/transform/blob/v1.16.0/tensorflow_transform/experimental/analyzers.py#L72-L105

A PTransformAnalyzer which enables analyzer cache.

    tft.experimental.CacheablePTransformAnalyzer(
        make_accumulators_ptransform,
        merge_accumulators_ptransform,
        extract_output_ptransform,
        cache_coder
    )

Warning: This should only be used if the analyzer can correctly be separated
into make_accumulators, merge_accumulators and extract_output stages.

Attributes
make_accumulators_ptransform: A namedtuple alias for field number 0
merge_accumulators_ptransform: A namedtuple alias for field number 1
extract_output_ptransform: A namedtuple alias for field number 2
cache_coder: A namedtuple alias for field number 3

Last updated 2024-11-01 UTC.