tfp.substrates.numpy.stats.quantiles
Compute quantiles of x along axis.
tfp.substrates.numpy.stats.quantiles(
x,
num_quantiles,
axis=None,
interpolation=None,
keepdims=False,
validate_args=False,
name=None
)
The quantiles of a distribution are cut points dividing the range into
intervals with equal probabilities.
Given a vector x of samples, this function estimates the cut points by
returning num_quantiles + 1 cut points, (c0, ..., cn), where n = num_quantiles,
such that, roughly speaking, an equal number of sample points lie in each of
the num_quantiles intervals [c0, c1), [c1, c2), ..., [c_{n-1}, cn]. That is,
- About a 1 / n fraction of the data lies in [c_{k-1}, c_k), k = 1, ..., n.
- About a k / n fraction of the data lies below c_k.
- c0 is the sample minimum and cn is the maximum.
The exact number of data points in each interval depends on the size of
x (e.g. whether the size is divisible by n) and the interpolation kwarg.
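The "roughly equal" claim can be checked by counting samples per interval. The sketch below is plain Python, not part of the library; `interval_counts` is a hypothetical helper, and the cut points are the ones this page's Examples report for interpolation='nearest' with num_quantiles=4.

```python
def interval_counts(samples, cut_points):
    """Count samples in [c0, c1), ..., [c_{n-1}, cn]; only the last interval is closed."""
    n = len(cut_points) - 1
    counts = [0] * n
    for s in samples:
        for k in range(n):
            lo, hi = cut_points[k], cut_points[k + 1]
            is_last = (k == n - 1)
            if lo <= s < hi or (is_last and s == hi):
                counts[k] += 1
                break
    return counts


x = [float(i) for i in range(11)]   # 0., 1., ..., 10.
cuts = [0., 2., 5., 8., 10.]        # quartile cut points taken from the Examples
print(interval_counts(x, cuts))     # [2, 3, 3, 3]
```

With 11 samples and 4 intervals the counts cannot all be equal, which is the divisibility effect described above.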
Args |
x
|
Numeric N-D Tensor with N > 0. If axis is not None,
x must have a statically known number of dimensions.
|
num_quantiles
|
Scalar integer Tensor. The number of intervals the
returned num_quantiles + 1 cut points divide the range into.
|
axis
|
Optional 0-D or 1-D integer Tensor with constant values. The
axis or axes that index independent samples over which to compute the
quantiles. If None (the default), every dimension is treated as a sample
dimension and the result has shape [num_quantiles + 1].
|
interpolation
|
One of {'nearest', 'linear', 'lower', 'higher', 'midpoint'}.
Default value: 'nearest'. This specifies the interpolation method to
use when the fractions k / n lie between two data points i < j:
- linear: i + (j - i) * fraction, where fraction is the fractional part
of the index surrounded by i and j.
- lower: i.
- higher: j.
- nearest: i or j, whichever is nearest.
- midpoint: (i + j) / 2.
linear and midpoint interpolation do not work with integer dtypes.
|
keepdims
|
Python bool. If True, the last dimension is kept with size 1.
If False, the last dimension is removed from the output shape.
|
validate_args
|
Whether to add runtime checks of argument validity. If
False, and arguments are incorrect, correct behavior is not guaranteed.
|
name
|
A Python string name to give this Op. Default is 'percentile'.
|
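The interpolation rules listed under the interpolation argument can be sketched in plain Python. This is an illustration, not the library's implementation: `quantile_cut_points` is a hypothetical name, and two details are assumptions chosen to agree with the Examples on this page: the k-th cut point sits at fractional index (k / n) * (len(x) - 1) of the sorted samples, and 'nearest' resolves exact ties by rounding half to even.

```python
import math


def quantile_cut_points(samples, num_quantiles, interpolation="nearest"):
    """Illustrative 1-D quantile cut points (assumed indexing, see lead-in)."""
    xs = sorted(samples)
    last = len(xs) - 1
    cuts = []
    for k in range(num_quantiles + 1):
        pos = k / num_quantiles * last          # fractional index into xs
        lo, hi = math.floor(pos), math.ceil(pos)
        i, j = xs[lo], xs[hi]                   # neighbouring data points, i <= j
        fraction = pos - lo
        if interpolation == "linear":
            cuts.append(i + (j - i) * fraction)
        elif interpolation == "lower":
            cuts.append(i)
        elif interpolation == "higher":
            cuts.append(j)
        elif interpolation == "midpoint":
            cuts.append((i + j) / 2)
        elif interpolation == "nearest":
            cuts.append(xs[round(pos)])         # Python's round() is half-to-even
        else:
            raise ValueError(f"Unknown interpolation: {interpolation!r}")
    return cuts


x = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
print(quantile_cut_points(x, 4, "nearest"))   # [0.0, 2.0, 5.0, 8.0, 10.0]
print(quantile_cut_points(x, 4, "linear"))    # [0.0, 2.5, 5.0, 7.5, 10.0]
print(quantile_cut_points(x, 4, "lower"))     # [0.0, 2.0, 5.0, 7.0, 10.0]
```

The three printed results agree with the quartile outputs listed in the Examples section.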
Returns |
cut_points
|
A rank(x) + 1 - len(axis) dimensional Tensor with same
dtype as x and shape [num_quantiles + 1, ...] where the trailing shape
is that of x without the dimensions in axis (unless keepdims is True).
|
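The shape rule in the Returns entry can be written out as a small calculation. The helper below is a hypothetical sketch that covers only keepdims=False; it follows the rule as stated here and does not call the library.

```python
def quantiles_output_shape(x_shape, num_quantiles, axis=None):
    """Expected result shape for keepdims=False, per the Returns description."""
    rank = len(x_shape)
    if axis is None:
        reduced = set(range(rank))              # every dimension is a sample dimension
    else:
        axes = [axis] if isinstance(axis, int) else list(axis)
        reduced = {a % rank for a in axes}      # normalise negative axes
    trailing = [d for i, d in enumerate(x_shape) if i not in reduced]
    return [num_quantiles + 1] + trailing


print(quantiles_output_shape([100, 3], 10, axis=0))         # [11, 3]
print(quantiles_output_shape([100, 3], 10))                 # [11]
print(quantiles_output_shape([5, 6, 7], 4, axis=[0, 2]))    # [5, 6]
```

In the last call the result has rank 3 + 1 - 2 = 2, matching rank(x) + 1 - len(axis).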
Raises |
ValueError
|
If argument 'interpolation' is not an allowed type.
|
ValueError
|
If the interpolation type is not compatible with dtype.
|
Examples
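# Assumed import for the NumPy substrate, so that tfp.stats resolves to it.
import tensorflow_probability.substrates.numpy as tfp
# Get quartiles of x with various interpolation choices.
x = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]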
tfp.stats.quantiles(x, num_quantiles=4, interpolation='nearest')
==> [ 0., 2., 5., 8., 10.]
tfp.stats.quantiles(x, num_quantiles=4, interpolation='linear')
==> [ 0. , 2.5, 5. , 7.5, 10. ]
tfp.stats.quantiles(x, num_quantiles=4, interpolation='lower')
==> [ 0., 2., 5., 7., 10.]
# Get deciles of columns of an R x C data set.
# axis=0 reduces over rows; the default axis=None would reduce over every dimension.
data = load_my_columnar_data(...)
tfp.stats.quantiles(data, num_quantiles=10, axis=0)
==> Shape [11, C] Tensor
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2023-11-21 UTC.