# tfp.substrates.numpy.monte_carlo.expectation

Computes the Monte-Carlo approximation of `E_p[f(X)]`.

[View source on GitHub](https://github.com/tensorflow/probability/blob/v0.23.0/tensorflow_probability/substrates/numpy/monte_carlo/expectation.py#L25-L192)

Main alias: `tfp.experimental.substrates.numpy.monte_carlo.expectation`

    tfp.substrates.numpy.monte_carlo.expectation(
        f,
        samples,
        log_prob=None,
        use_reparameterization=True,
        axis=0,
        keepdims=False,
        name=None
    )

This function computes the Monte-Carlo approximation of an expectation, i.e.,

    E_p[f(X)] approx= m**-1 sum_j^m f(x_j), x_j ~iid p(X)

where:

- `x_j = samples[j, ...]`,
- `log(p(samples)) = log_prob(samples)` and
- `m = prod(shape(samples)[axis])`.

Tricks: Reparameterization and Score-Gradient

When p is "reparameterized", i.e., a diffeomorphic transformation of a
parameterless distribution (e.g.,
`Normal(Y; m, s) <=> Y = sX + m, X ~ Normal(0, 1)`), we can swap gradient and
expectation, i.e.,

    grad[ Avg{ s_i : i=1...n } ] = Avg{ grad[s_i] : i=1...n }

where `S_n = Avg{s_i}` and `s_i = f(x_i), x_i ~ p`.
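A minimal NumPy sketch of this swap (illustrative only, not the TFP API): with `f(y) = y**2` and `Y = s*X + m`, we have `E_p[f(Y)] = m**2 + s**2`, so the pathwise gradient estimate with respect to `m` should come out near `2*m`:

```python
import numpy as np

rng = np.random.default_rng(0)
m, s = 1.5, 2.0
x = rng.standard_normal(100_000)  # X ~ Normal(0, 1): parameter-free noise
y = s * x + m                     # reparameterized samples: Y ~ Normal(m, s)

# Swapping gradient and expectation:
#   d/dm Avg{ f(s*x_i + m) } = Avg{ f'(y_i) * dy_i/dm } = Avg{ 2*y_i * 1 }
pathwise_grad_m = np.mean(2.0 * y)
print(pathwise_grad_m)  # close to d/dm (m**2 + s**2) = 2*m = 3.0
```

Because each `y_i` is a differentiable function of `m`, the gradient flows through the samples; this is exactly the property `use_reparameterization=True` relies on.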
However, if p is not reparameterized, TensorFlow's gradient will be incorrect
since the chain rule stops at samples of non-reparameterized distributions.
(The non-differentiated result, `approx_expectation`, is the same regardless
of `use_reparameterization`.) In this circumstance, using the Score-Gradient
trick results in an unbiased gradient, i.e.,

    grad[ E_p[f(X)] ]
    = grad[ int dx p(x) f(x) ]
    = int dx grad[ p(x) f(x) ]
    = int dx [ p'(x) f(x) + p(x) f'(x) ]
    = int dx p(x) [ p'(x) / p(x) f(x) + f'(x) ]
    = int dx p(x) grad[ f(x) p(x) / stop_grad[p(x)] ]
    = E_p[ grad[ f(x) p(x) / stop_grad[p(x)] ] ]

Unless p is a non-reparameterized distribution, it is usually preferable to
set `use_reparameterization=True`.

**Warning:** users are responsible for verifying that `p` is a
"reparameterized" distribution.

#### Example Use:

    # Monte-Carlo approximation of a reparameterized distribution, e.g., Normal.

    num_draws = int(1e5)
    p = tfp.distributions.Normal(loc=0., scale=1.)
    q = tfp.distributions.Normal(loc=1., scale=2.)
    exact_kl_normal_normal = tfp.distributions.kl_divergence(p, q)
    # ==> 0.44314718
    approx_kl_normal_normal = tfp.monte_carlo.expectation(
        f=lambda x: p.log_prob(x) - q.log_prob(x),
        samples=p.sample(num_draws, seed=42),
        log_prob=p.log_prob,
        use_reparameterization=(p.reparameterization_type
                                == tfp.distributions.FULLY_REPARAMETERIZED))
    # ==> 0.44632751
    # Relative Error: <1%

    # Monte-Carlo approximation of a non-reparameterized distribution,
    # e.g., Bernoulli.

    num_draws = int(1e5)
    p = tfp.distributions.Bernoulli(probs=0.4)
    q = tfp.distributions.Bernoulli(probs=0.8)
    exact_kl_bernoulli_bernoulli = tfp.distributions.kl_divergence(p, q)
    # ==> 0.38190854
    approx_kl_bernoulli_bernoulli = tfp.monte_carlo.expectation(
        f=lambda x: p.log_prob(x) - q.log_prob(x),
        samples=p.sample(num_draws, seed=42),
        log_prob=p.log_prob,
        use_reparameterization=(p.reparameterization_type
                                == tfp.distributions.FULLY_REPARAMETERIZED))
    # ==> 0.38336259
    # Relative Error: <1%

    # For comparing the gradients, see `expectation_test.py`.

**Note:** The above example is for illustration only. To compute an
approximate KL-divergence, the following is preferred:

    approx_kl_p_q = bf.monte_carlo_variational_loss(
        p_log_prob=q.log_prob,
        q=p,
        discrepancy_fn=bf.kl_reverse,
        num_draws=num_draws)
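The Score-Gradient trick can be checked with plain NumPy (illustrative only, not the TFP API): for a non-reparameterized `X ~ Bernoulli(theta)` and `f(x) = x`, `E_p[f(X)] = theta`, so the exact gradient with respect to `theta` is `1`, and the estimator `Avg{ f(x_i) * d/dtheta log p(x_i) }` should land near it:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.4
x = (rng.random(200_000) < theta).astype(float)  # X ~ Bernoulli(theta)

# Score function: d/dtheta log p(x) = x/theta - (1 - x)/(1 - theta)
score = x / theta - (1.0 - x) / (1.0 - theta)

# Score-gradient (REINFORCE) estimate of d/dtheta E_p[f(X)] with f(x) = x;
# unbiased for the exact value, 1.0.
score_grad_theta = np.mean(x * score)
print(score_grad_theta)  # close to 1.0
```

This is the estimator the function effectively falls back to when `use_reparameterization=False`, which is why `log_prob` is required in that case.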
Args

f
Python callable which can return `f(samples)`.
samples
`Tensor` or nested structure (list, dict, etc.) of `Tensor`s,
representing samples used to form the Monte-Carlo approximation of
`E_p[f(X)]`. A batch of samples should be indexed by `axis` dimensions.
log_prob
Python callable which can return `log_prob(samples)`. Must
correspond to the natural-logarithm of the pdf/pmf of each sample. Only
required/used if `use_reparameterization=False`.
Default value: `None`.
use_reparameterization
Python `bool` indicating that the approximation
should use the fact that the gradient of samples is unbiased. Whether
`True` or `False`, this arg only affects the gradient of the resulting
`approx_expectation`.
Default value: `True`.
axis
The dimensions to average. If `None`, averages all
dimensions.
Default value: `0` (the left-most dimension).
keepdims
If `True`, retains averaged dimensions using size `1`.
Default value: `False`.
name
A `name_scope` for operations created by this function.
Default value: `None` (which implies "expectation").
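The `axis` and `keepdims` arguments behave like a NumPy mean over the sample dimensions; a short sketch of the resulting shapes (plain NumPy as a stand-in, not the TFP API):

```python
import numpy as np

# 4 Monte-Carlo draws of a batch of 3 values; axis 0 indexes the draws.
samples = np.arange(12, dtype=float).reshape(4, 3)
fx = samples ** 2  # f applied elementwise

est = np.mean(fx, axis=0)                      # shape (3,): one estimate per batch member
est_keep = np.mean(fx, axis=0, keepdims=True)  # shape (1, 3): draw axis kept with size 1
est_all = np.mean(fx, axis=None)               # scalar: averages all dimensions
```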
Returns

approx_expectation
`Tensor` corresponding to the Monte-Carlo approximation
of `E_p[f(X)]`.

Raises

ValueError
if `f` is not a Python callable.
ValueError
if `use_reparameterization=False` and `log_prob` is not a Python
callable.
Last updated 2023-11-21 UTC.