0% found this document useful (0 votes)

94 views96 pages

Deep Gen Models Tutorial

Uploaded by

vizard

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

94 views96 pages

Deep Gen Models Tutorial

Uploaded by

vizard

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 96

UAI 2017 Australia

Tutorial on

Deep Generative Models

Shakir Mohamed and Danilo Rezende

@shakir_za @deepspiker
Abstract

This tutorial will be a review of recent advances in deep generative models. Generative models
have a long history at UAI and recent methods have combined the generality of probabilistic
reasoning with the scalability of deep learning to develop learning algorithms that have been
applied to a wide variety of problems giving state-of-the-art results in image generation,
text-to-speech synthesis, and image captioning, amongst many others. Advances in deep
generative models are at the forefront of deep learning research because of the promise they
offer for allowing data-efficient learning, and for model-based reinforcement learning. At the
end of this tutorial, audience member will have a full understanding of the latest advances in
generative modelling covering three of the active types of models: Markov models, latent
variable models and implicit models, and how these models can be scaled to high-dimensional
data. The tutorial will expose many questions that remain in this area, and for which there
remains a great deal of opportunity from members of the UAI community.
Beyond Classification

Move beyond associating Understand and imagine

inputs to outputs how the world evolves

Recognise objects in the

Detect surprising events in
world and their factors of
the world
variation

Establish concepts as useful

Anticipate and generate
for reasoning and
rich plans for the future
decision making
What is a Generative Model?
A model that allows Models that allow for Approaches for
us to learn a (conditional) density unsupervised learning
simulator of data estimation of data

Characteristics are:
- Probabilistic models of data that allow
for uncertainty to be captured.
- Data distribution p(x) is targeted.
- High-dimensional outputs.
Why Generative Models?
Why Generative Models
Generative models have a
role in many problems.
Drug Design and Response Prediction
Proposing candidate molecules and for improving prediction through semi-supervised learning.

Gómez-Bombarelli, et al. 2016

Locating Celestial Bodies
Generative models for applications in astronomy and high-energy physics.

Regier et al., 2015

Image super-resolution
Photo-realistic single image super-resolution

Ledig et al., 2016

Text-to-speech Synthesis
Generating audio conditioned on text

Oord et al., 2016

Image and Content Generation
Generating images and video content.

DRAW Pixel RNN ALI

Gregor et al., 2015, Oord et al., 2016, Dumoulin et al., 2016
Communication and Compression
Hierarchical compression of images and other data.

Original images

Compression rate: 0.2bits/dimension

JPEG

JPEG-2000

RVAE v1

RVAE v2

Gregor et al., 2016

One-shot Generalisation
Rapid generalisation of novel concepts

Rezende et al., 2016

Visual Concept Learning
Understanding the factors of variation and invariances.

Higgins et al., 2017

Future Simulation
Simulate future trajectories of environments based on actions for planning

Atari simulation Robot arm simulation

Chiappa et al, 2017, Kalchbrenner et al., 2017

Scene Understanding
Understanding the components of scenes and their interactions

Wu et al., 2017
Probabilistic Deep Learning
Two Streams of Machine Learning
Deep Learning Probabilistic Reasoning

+ Rich non-linear models for - Mainly conjugate and linear models.

classification and sequence - Potentially intractable inference,
prediction. computationally expensive or long
+ Scalable learning using stochastic simulation time.
approximation and conceptually
simple. + Unified framework for model
+ Easily composable with other building, inference, prediction and
gradient-based methods. decision making.
+ Explicit accounting for uncertainty
- Only point estimates. and variability of outcomes.
- Hard to score models, do selection + Robust to overfitting; tools for model
and complexity penalisation. selection and composition.
Complementary strengths, making it natural to combine them
Thinking about Machine Learning

3. Algorithms

1. Models 2. Learning Principles

Types of Generative Models

Fully-observed models
Model observed data directly without
introducing any new unobserved local
variables.

Latent Variable Models

Introduce an unobserved random variable for every
observed data point to explain hidden causes.
● Prescribed models: Use observer likelihoods and
assume observation noise.
1. Models ● Implicit models: Likelihood-free models.
Spectrum of Fully-observed Models
Building Generative Models

Equivalent ways of representing the same DAG

Fully-observed Models + Can directly encode how observed
points are related.
+ Any data type can be used
+ For directed graphical models:
Parameter learning simple
+ Log-likelihood is directly
computable, no approximation
needed.
+ Easy to scale-up to large models,
many optimisation tools available.

- Order sensitive.
- For undirected models, parameter
learning difficult: Need to compute
normalising constants.
- Generation can be slow: iterate
through elements sequentially, or
using a Markov chain.
All conditional probabilities described by deep networks.
Spectrum of Latent Variable Models
Building Generative Models
Building Generative Models
Graphical Models + Computational Graphs (aka NNets)
Latent Variable Models + Easy sampling.
+ Easy way to include
hierarchy and depth.
- Inversion process to + Easy to encode structure
determine latents + Avoids order dependency
corresponding to a input is assumptions:
difficult in general marginalisation induces
- Difficult to compute dependencies.
marginalised likelihood + Provide compression and
requiring approximations. representation.
- Not easy to specify rich + Scoring, model comparison
approximations for latent and selection possible using
posterior distribution. the marginalised likelihood.

Introduce an unobserved local

random variables that represents
hidden causes.
Choice of Learning Principles
For a given model, there are many competing inference methods.

● Exact methods (conjugacy, enumeration)

● Numerical integration (Quadrature)
● Generalised method of moments
● Maximum likelihood (ML)
● Maximum a posteriori (MAP)
● Laplace approximation
● Integrated nested Laplace approximations (INLA)
● Expectation Maximisation (EM)
● Monte Carlo methods (MCMC, SMC, ABC)
● Contrastive estimation (NCE)
● Cavity Methods (EP)
● Variational methods
2. Learning Principles
Combining Models and Inference
3. Algorithms

A given model and learning principle can be implemented in many ways.

Convolutional neural network Implicit Generative Model

+ penalised maximum likelihood + Two-sample testing

● Optimisation methods ● Method-of-moments

(SGD, Adagrad) ● Approximate Bayesian Computation
● Regularisation (L1, L2, (ABC)
batchnorm, dropout) ● Generative adversarial network (GAN)

Latent variable model Restricted Boltzmann Machine

+ variational inference + maximum likelihood
● VEM algorithm ● Contrastive Divergence
● Expectation propagation ● Persistent CD
● Approximate message passing ● Parallel Tempering
● Variational auto-encoders (VAE) ● Natural gradients
Inference Questions?

Objective Quantity of Interest

Prediction

Planning

Parameter estimation

Experimental Design

Hypothesis testing
Approximate Inference
Latent Variable Models
Methods for Approximate Inference

● Laplace approximations

● Importance sampling

● Variational approximations

● Perturbative corrections

● Other methods: MCMC, Langevin, HMC, Adaptive MCMC

Laplace Approximation

Other names
Saddle-point approximation,
Delta-method
Importance Sampling

Importance weights

Monte-Carlo

Pointwise Free-energy

Important property
Importance sampling provides a bound in expectation

x
Variational Inference
Variational Inference

Reconstruction Regularizer
Perturbative Corrections
Design Choices

Choice of Model
Computation graphs, Renderers, simulators and environments

Variational Optimisation Approximate Posteriors

- Variational EM - Mean-field
- Stochastic VEM - Structured approx
- Monte Carlo gradient - Aux. variable methods
estimators
Variational EM Algorithm
Fixed-point iterations between variational and model parameters

E M
Amortised Inference

Introduce a parametric family of conditional densities

Rezende et al., 2015

Variational Auto-encoders
Simplest instantiation of a VAE

Deep Latent Gaussian Model p(x,z)

Gaussian Recognition Model q(z)

We then optimise the free-energy wrt model and variational parameters

Kingma and Welling, 2014, Rezende et al., 2014
Richer VAES
DRAW: Recurrent/Dependent Priors Recurrent/Dependent Inference Networks

AIR: Structured Priors Volumetric and Semi-supervised Learning

Sequence data
Summary so far
Applications of Probabilistic Types of
Generative Models Deep Learning Generative Models

Variational Principles Amortised Inference

END OF FIRST HALF
Stochastic Optimisation
Classical Inference Approach

E M

Compute expectations then M-step gradients

Stochastic Inference Approach

In general, we won’t know the expectations.

Gradient is of the parameters of the distribution w.r.t. which the expectation is taken.
Stochastic Gradient Estimators

Score-function estimator: Typical problem areas:

● Generative models and inference
Differentiate the density q(z|x)
● Reinforcement learning and control
● Operations research and inventory control
● Monte Carlo simulation
Pathwise gradient estimator:
● Finance and asset pricing
Differentiate the function f(z)
● Sensitivity estimation

Fu, 2006
Score Function Estimators

Gradient reweighted by the value of the function

Other names: When to use:

● Likelihood-ratio trick ● Function is not differentiable.
● Radon-Nikodym derivative ● Distribution q is easy to sample
● REINFORCE and policy from.
gradients ● Density q is known and
● Automated inference differentiable.
● Black-box inference
Reparameterisation

Find an invertible function g(.) that expresses z as a

transformation of a base distribution .

Kingma and Welling, 2014, Rezende et al., 2014

Pathwise Derivative Estimator

Other names:
When to use
● Reparameterisation trick
● Function f is differentiable
● Stochastic backpropagation
● Perturbation analysis ● Density q can be described using a simpler
● Affine-independent inference base distribution: inverse CDF, location-scale
● Doubly stochastic estimation transform, or other co-ordinate transform.
● Hierarchical non-centred ● Easy to sample from base distribution.
parameterisations.
Gaussian Stochastic Gradients

First-order Gradient

Second-order Gradient

We can develop low-variance estimators by exploiting knowledge

of the distributions involved when we know them

Rezende et al., 2014

Beyond the Mean Field
Mean Field Approximations

Key part of variational inference is choice of approximate posterior distribution q.

Mean-Field Posterior Approximations

Deep Latent
Gaussian Model

Mean-field or fully-factorised posterior is usually not sufficient

Real-world Posterior Distributions

Deep Latent
Gaussian Model

Complex dependencies · Non-Gaussian distributions · Multiple modes

Richer Families of Posteriors
Two high-level goals:
● Build richer approximate posterior distributions.
● Maintain computational efficiency and scalability.

Same as the problem of specifying a model of the data itself

Structured Approximations
Families of Approximate Posteriors
Covariance Models
Normalising Flows

Auxiliary Variable Models

Normalising Flows
Exploit the rule for change of variables:
● Begin with an initial distribution
● Apply a sequence of K invertible transforms

Distribution flows through a sequence of invertible transforms

Rezende and Mohamed, 2015
Normalising Flows
Normalising Flows
Choice of Transformation

Begin with a fully-factorised

Gaussian and improve by
change of variables.

Triangular Jacobians allow for

computational efficiency.

Linear time computation of the determinant and its gradient.

Rezende and Mohamed, 2015; Dinh et al, 2016, Kingma et al, 2016
Normalising Flows on Non-Euclidean Manifolds

Gemici et al., 2016

Normalising Flows on non-Euclidean Manifolds
Learning in
Implicit Generative Models
Learning by Comparison
For some models, we only have access to an
unnormalised probability, partial knowledge of the
distribution, or a simulator of data.

We compare the estimated

distribution q(x) to the true
distribution p*(x) using samples.
Mohamed and Lakshminarayanan, 2017.
Learning by Comparison

Comparison

Use a hypothesis test or comparison to

build an auxiliary model to indicate how
data simulated from the model differs
from observed data.

Estimation

Adjust model parameters to better match

the data distribution using the comparison.
Density Ratios and Classification

Density Bayes’
Ratio Rule

Real Data Simulated Data

Combine data

Assign labels

Equivalence

Sugiyama et al, 2012

Density Ratios and Classification

Conditional

Bayes’ substitution

Class probability

Computing a density ratio is equivalent to class probability estimation.

Unsupervised-as-Supervised Learning

Scoring Function

Bernoulli Loss

Alternating
optimisation

Other names and places:

● Use when we have differentiable
● Unsupervised and supervised learning
simulators and models
● Continuously updating inference
● Can form the loss using any proper
● Classifier ABC
scoring rule.
● Generative Adversarial Networks
Friedman et al. 2001
Generative Adversarial Networks

Alternating optimisation

Comparison loss

(Alt) Generative loss

Goodfellow et al. 2014
Integral Probability Metrics

f sometimes referred to as a
test function, witness function or a critic.

Many choices of f available: classifiers or

functions in specified spaces.

Wasserstein Total
Variation

Max Mean Discrepancy Cramer

Generative Models and RL
Probabilistic Policy Learning

Policy gradient update:

● Uniform prior on actions
● Score-function gradient estimator (aka Reinforce)

Other algorithms: Other names and instantiations:

● Relative entropy policy search ● Planning-as-inference
● Generative adversarial imitation learning ● Variational MDPs
● Reinforced variational inference ● Path-integral control
The Future
Applications of Probabilistic Types of Rich Distributions
Generative Models Deep Learning Generative Models

Stochastic Optimisation
Variational Principles Amortised Inference

Learning by Comparison
Challenges
● Scalability to large images, videos, multiple data modalities.
● Evaluation of generative models.
● Robust conditional models.
● Discrete latent variables.
● Support-coverage in models, mode-collapse.
● Calibration.
● Parameter uncertainty.
● Principles of likelihood-free inference.
UAI 2017 Australia

Tutorial on

Deep Generative Models

Shakir Mohamed and Danilo Rezende

@shakir_za @deepspiker
References: Applications
● Frey, Brendan J., and Geoffrey E. Hinton. "Variational learning in nonlinear Gaussian belief networks." Neural Computation 11, no. 1
(1999): 193-213.
● Eslami, S. M., Heess, N., Weber, T., Tassa, Y., Kavukcuoglu, K., and Hinton, G. E. Attend, Infer, Repeat: Fast Scene Understanding
with Generative Models. NIPS (2016).
● Rezende, Danilo Jimenez, Shakir Mohamed, Ivo Danihelka, Karol Gregor, and Daan Wierstra. "One-Shot Generalization in Deep
Generative Models." ICML (2016).
● Kingma, Diederik P., Shakir Mohamed, Danilo Jimenez Rezende, and Max Welling. "Semi-supervised learning with deep
generative models." In Advances in Neural Information Processing Systems, pp. 3581-3589. 2014.
● Higgins, Irina, Loic Matthey, Xavier Glorot, Arka Pal, Benigno Uria, Charles Blundell, Shakir Mohamed, and Alexander Lerchner.
"Early Visual Concept Learning with Unsupervised Deep Learning." arXiv preprint arXiv:1606.05579 (2016).
● Bellemare, Marc G., Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, and Remi Munos. "Unifying Count-Based
Exploration and Intrinsic Motivation." arXiv preprint arXiv:1606.01868 (2016).
● Odena, Augustus. "Semi-Supervised Learning with Generative Adversarial Networks." arXiv preprint arXiv:1606.01583 (2016).
● Springenberg, Jost Tobias. "Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks."
arXiv preprint arXiv:1511.06390 (2015).
● Alexander (Sasha) Vezhnevets, Mnih, Volodymyr, John Agapiou, Simon Osindero, Alex Graves, Oriol Vinyals, and Koray
Kavukcuoglu. "Strategic Attentive Writer for Learning Macro-Actions." arXiv preprint arXiv:1606.04695 (2016).
● Gregor, Karol, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. "DRAW: A recurrent neural network for
image generation." arXiv preprint arXiv:1502.04623 (2015).
References: Applications
● Gómez-Bombarelli R, Duvenaud D, Hernández-Lobato JM, Aguilera-Iparraguirre J, Hirzel TD, Adams RP, Aspuru-Guzik A.
Automatic chemical design using a data-driven continuous representation of molecules. arXiv preprint arXiv:1610.02415. 2016.
● Rampasek L, Goldenberg A. Dr. VAE: Drug Response Variational Autoencoder. arXiv preprint arXiv:1706.08203. 2017 Jun 26.
● Regier J, Miller A, McAuliffe J, Adams R, Hoffman M, Lang D, Schlegel D, Prabhat M. Celeste: Variational inference for a
generative model of astronomical images. In International Conference on Machine Learning 2015 Jun 1 (pp. 2095-2103).
● Ledig C, Theis L, Huszár F, Caballero J, Cunningham A, Acosta A, Aitken A, Tejani A, Totz J, Wang Z, Shi W. Photo-realistic single
image super-resolution using a generative adversarial network. arXiv preprint arXiv:1609.04802. 2016 Sep 15.
● Oord AV, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A, Kavukcuoglu K. Wavenet: A generative
model for raw audio. arXiv preprint arXiv:1609.03499. 2016 Sep 12.
● Dumoulin V, Belghazi I, Poole B, Lamb A, Arjovsky M, Mastropietro O, Courville A. Adversarially learned inference. arXiv preprint
arXiv:1606.00704. 2016 Jun 2.
● Gregor K, Besse F, Rezende DJ, Danihelka I, Wierstra D. Towards conceptual compression. In Advances In Neural Information
Processing Systems 2016 (pp. 3549-3557).
● Higgins I, Matthey L, Glorot X, Pal A, Uria B, Blundell C, Mohamed S, Lerchner A. Early visual concept learning with unsupervised
deep learning. arXiv preprint arXiv:1606.05579. 2016 Jun 17.
● Chiappa S, Racaniere S, Wierstra D, Mohamed S. Recurrent Environment Simulators. arXiv preprint arXiv:1704.02254. 2017 Apr 7.
● Kalchbrenner N, Oord AV, Simonyan K, Danihelka I, Vinyals O, Graves A, Kavukcuoglu K. Video pixel networks. arXiv preprint
arXiv:1610.00527. 2016 Oct 3.
● Wu J, Tenenbaum JB, Kohli P. Neural Scene De-rendering., CVPR 2017
References: Fully-observed Models
● Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint
arXiv:1601.06759 (2016).
● Larochelle, Hugo, and Iain Murray. "The Neural Autoregressive Distribution Estimator." In AISTATS, vol. 1, p. 2. 2011.
● Uria, Benigno, Iain Murray, and Hugo Larochelle. "A Deep and Tractable Density Estimator." In ICML, pp. 467-475. 2014.
● Veness, Joel, Kee Siong Ng, Marcus Hutter, and Michael Bowling. "Context tree switching." In 2012 Data Compression
Conference, pp. 327-336. IEEE, 2012.
● Rue, Havard, and Leonhard Held. Gaussian Markov random fields: theory and applications. CRC Press, 2005.
● Wainwright, Martin J., and Michael I. Jordan. "Graphical models, exponential families, and variational inference." Foundations and
Trends® in Machine Learning 1, no. 1-2 (2008): 1-305.
References: Latent Variable Models
● Hyvärinen, A., Karhunen, J., & Oja, E. (2004). Independent component analysis (Vol. 46). John Wiley & Sons.
● Gregor, Karol, Ivo Danihelka, Andriy Mnih, Charles Blundell, and Daan Wierstra. "Deep autoregressive networks." arXiv preprint
arXiv:1310.8499 (2013).
● Ghahramani, Zoubin, and Thomas L. Griffiths. "Infinite latent feature models and the Indian buffet process." In Advances in neural
information processing systems, pp. 475-482. 2005.
● Teh, Yee Whye, Michael I. Jordan, Matthew J. Beal, and David M. Blei. "Hierarchical dirichlet processes." Journal of the american
statistical association (2012).
● Adams, Ryan Prescott, Hanna M. Wallach, and Zoubin Ghahramani. "Learning the Structure of Deep Sparse Graphical Models." In
AISTATS, pp. 1-8. 2010.
● Lawrence, Neil D. "Gaussian process latent variable models for visualisation of high dimensional data." Advances in neural
information processing systems 16.3 (2004): 329-336.
● Damianou, Andreas C., and Neil D. Lawrence. "Deep Gaussian Processes." In AISTATS, pp. 207-215. 2013.
● Mattos, César Lincoln C., Zhenwen Dai, Andreas Damianou, Jeremy Forth, Guilherme A. Barreto, and Neil D. Lawrence.
"Recurrent Gaussian Processes." arXiv preprint arXiv:1511.06644 (2015).
● Salakhutdinov, Ruslan, Andriy Mnih, and Geoffrey Hinton. "Restricted Boltzmann machines for collaborative filtering." In
Proceedings of the 24th international conference on Machine learning, pp. 791-798. ACM, 2007.
● Saul, Lawrence K., Tommi Jaakkola, and Michael I. Jordan. "Mean field theory for sigmoid belief networks." Journal of artificial
intelligence research 4, no. 1 (1996): 61-76.
● Frey, Brendan J., and Geoffrey E. Hinton. "Variational learning in nonlinear Gaussian belief networks." Neural Computation 11, no. 1
(1999): 193-213.
● Durk Kingma and Max Welling. "Auto-encoding Variational Bayes." ICLR (2014).
● Burda Y, Grosse R, Salakhutdinov R. Importance weighted autoencoders. arXiv preprint arXiv:1509.00519. 2015 Sep 1.
References: Latent Variable Models (cont)
● Ranganath, Rajesh, Sean Gerrish, and David M. Blei. "Black Box Variational Inference." In AISTATS, pp. 814-822. 2014.
● Mnih, Andriy, and Karol Gregor. "Neural variational inference and learning in belief networks." arXiv preprint arXiv:1402.0030
(2014).
● Lázaro-Gredilla, Miguel. "Doubly stochastic variational Bayes for non-conjugate inference." (2014).
● Wingate, David, and Theophane Weber. "Automated variational inference in probabilistic programming." arXiv preprint
arXiv:1301.1299 (2013).
● Paisley, John, David Blei, and Michael Jordan. "Variational Bayesian inference with stochastic search." arXiv preprint
arXiv:1206.6430 (2012).
● Barber D, de van Laar P. Variational cumulant expansions for intractable distributions. Journal of Artificial Intelligence Research.
1999;10:435-55.
References: Stochastic Gradients
● Pierre L’Ecuyer, Note: On the interchange of derivative and expectation for likelihood ratio derivative estimators, Management
Science, 1995
● Peter W Glynn, Likelihood ratio gradient estimation for stochastic systems, Communications of the ACM, 1990
● Michael C Fu, Gradient estimation, Handbooks in operations research and management science, 2006
● Ronald J Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine learning, 1992
● Paul Glasserman, Monte Carlo methods in financial engineering, 2003
● Omiros Papaspiliopoulos, Gareth O Roberts, Martin Skold, A general framework for the parametrization of hierarchical models,
Statistical Science, 2007
● Michael C Fu, Gradient estimation, Handbooks in operations research and management science, 2006
● Rajesh Ranganath, Sean Gerrish, and David M. Blei. "Black Box Variational Inference." In AISTATS, pp. 814-822. 2014.
● Andriy Mnih, and Karol Gregor. "Neural variational inference and learning in belief networks." arXiv preprint arXiv:1402.0030 (2014).
● Michalis Titsias and Miguel Lázaro-Gredilla. "Doubly stochastic variational Bayes for non-conjugate inference." (2014).
● David Wingate and Theophane Weber. "Automated variational inference in probabilistic programming." arXiv preprint arXiv:1301.1299
(2013).
● John Paisley, David Blei, and Michael Jordan. "Variational Bayesian inference with stochastic search." arXiv preprint arXiv:1206.6430
(2012).
● Durk Kingma and Max Welling. "Auto-encoding Variational Bayes." ICLR (2014).
● Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra. "Stochastic Backpropagation and Approximate Inference in Deep
Generative Models." ICML (2014).
● Papaspiliopoulos O, Roberts GO, Sköld M. A general framework for the parametrization of hierarchical models. Statistical Science.
2007 Feb 1:59-73.
● Fan K, Wang Z, Beck J, Kwok J, Heller KA. Fast second order stochastic backpropagation for variational inference. InAdvances in
Neural Information Processing Systems 2015 (pp. 1387-1395).
References: Amortised Inference
● Dayan, Peter, Geoffrey E. Hinton, Radford M. Neal, and Richard S. Zemel. "The helmholtz machine." Neural computation 7, no. 5
(1995): 889-904.
● Gershman, Samuel J., and Noah D. Goodman. "Amortized inference in probabilistic reasoning." In Proceedings of the 36th Annual
Conference of the Cognitive Science Society. 2014.
● Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra. "Stochastic Backpropagation and Approximate Inference in Deep
Generative Models." ICML (2014).
● Durk Kingma and Max Welling. "Auto-encoding Variational Bayes." ICLR (2014). \
● Heess, Nicolas, Daniel Tarlow, and John Winn. "Learning to pass expectation propagation messages." In Advances in Neural
Information Processing Systems, pp. 3219-3227. 2013.
● Jitkrittum, Wittawat, Arthur Gretton, Nicolas Heess, S. M. Eslami, Balaji Lakshminarayanan, Dino Sejdinovic, and ZoltÃ ˛an SzabÃ¸s.
"Kernel-based just-in-time learning for passing expectation propagation messages." arXiv preprint arXiv:1503.02551 (2015).
● Korattikara, Anoop, Vivek Rathod, Kevin Murphy, and Max Welling. "Bayesian dark knowledge." arXiv preprint arXiv:1506.04416
(2015).
References: Structured Mean Field
● Jaakkola, T. S., and Jordan, M. I. (1998). Improving the mean field approximation via the use of mixture distributions. In Learning in
graphical models (pp. 163-173). Springer Netherlands.
● Saul, L.K. and Jordan, M.I., 1996. Exploiting tractable substructures in intractable networks. Advances in neural information
processing systems, pp.486-492.
● Gregor, Karol, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, and Daan Wierstra. "DRAW: A recurrent neural network for
image generation." ICML (2015).
● Gershman, S., Hoffman, M. and Blei, D., 2012. Nonparametric variational inference. arXiv preprint arXiv:1206.4665.
● Felix V. Agakov, and David Barber. "An auxiliary variational method." NIPS (2004). Rajesh Ranganath, Dustin Tran, and David M.
Blei. "Hierarchical Variational Models." ICML (2016). Lars Maaløe et al. "Auxiliary Deep Generative Models." ICML (2016). Tim
Salimans, Durk Kingma, Max Welling. "Markov chain Monte Carlo and variational inference: Bridging the gap. In International
Conference on Machine Learning." ICML (2015).
● Maaløe L, Sønderby CK, Sønderby SK, Winther O. Auxiliary deep generative models. arXiv preprint arXiv:1602.05473. 2016 Feb
17.
References: Normalising Flows
● Tabak, E. G., and Cristina V. Turner. "A family of nonparametric density estimation algorithms." Communications on Pure and
Applied Mathematics 66, no. 2 (2013): 145-164.
● Rezende, Danilo Jimenez, and Shakir Mohamed. "Variational inference with normalizing flows." ICML (2015).
● Kingma, D.P., Salimans, T. and Welling, M., 2016. Improving variational inference with inverse autoregressive flow. arXiv preprint
arXiv:1606.04934.
● Dinh, L., Sohl-Dickstein, J. and Bengio, S., 2016. Density estimation using Real NVP. arXiv preprint arXiv:1605.08803.
References: Other Variational Objectives
● Yuri Burda, Roger Grosse, Ruslan Salakhutidinov. "Importance weighted autoencoders." ICLR (2015).
● Yingzhen Li, Richard E. Turner. "Rényi divergence variational inference." NIPS (2016).
● Guillaume and Balaji Lakshminarayanan. "Approximate Inference with the Variational Holder Bound." ArXiv (2015).
● José Miguel Hernández-Lobato, Yingzhen Li, Daniel Hernández-Lobato, Thang Bui, and Richard E. Turner. Black-box
α-divergence Minimization. ICML (2016).
● Rajesh Ranganath, Jaan Altosaar, Dustin Tran, David M. Blei. Operator Variational Inference. NIPS (2016).
References: Discrete Latent Variable Models
● Radford Neal. "Learning stochastic feedforward networks." Tech. Rep. CRG-TR-90-7: Department of Computer Science,
University of Toronto (1990).
● Lawrence K. Saul, Tommi Jaakkola, and Michael I. Jordan. "Mean field theory for sigmoid belief networks." Journal of artificial
intelligence research 4, no. 1 (1996): 61-76.
● Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, and Daan Wierstra. "Deep autoregressive networks." ICML (2014).
● Rajesh Ranganath, Linpeng Tang, Laurent Charlin, and David M. Blei. "Deep Exponential Families." AISTATS (2015).
● Rajesh Ranganath, Dustin Tran, and David M. Blei. "Hierarchical Variational Models." ICML (2016).
References: Implicit Generative Models
● Borgwardt, Karsten M., and Zoubin Ghahramani. "Bayesian two-sample tests." arXiv preprint arXiv:0906.4032 (2009).
● Gutmann, Michael, and Aapo Hyvärinen. "Noise-contrastive estimation: A new estimation principle for unnormalized statistical
models." AISTATS. Vol. 1. No. 2. 2010.
● Tsuboi, Yuta, Hisashi Kashima, Shohei Hido, Steffen Bickel, and Masashi Sugiyama. "Direct Density Ratio Estimation for
Large-scale Covariate Shift Adaptation." Information and Media Technologies 4, no. 2 (2009): 529-546.
● Sugiyama, Masashi, Taiji Suzuki, and Takafumi Kanamori. Density ratio estimation in machine learning. Cambridge University
Press, 2012.
● Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua
Bengio. "Generative adversarial nets." In Advances in Neural Information Processing Systems, pp. 2672-2680. 2014.
● Verrelst, Herman, Johan Suykens, Joos Vandewalle, and Bart De Moor. "Bayesian learning and the Fokker-Planck machine." In
Proceedings of the International Workshop on Advanced Black-box Techniques for Nonlinear Modeling, Leuven, Belgium, pp.
55-61. 1998.
● Devroye, Luc. "Random variate generation in one line of code." In Proceedings of the 28th conference on Winter simulation, pp.
265-272. IEEE Computer Society, 1996.
● Mohamed S, Lakshminarayanan B. Learning in implicit generative models. arXiv preprint arXiv:1610.03483. 2016 Oct 11.
● Gutmann MU, Dutta R, Kaski S, Corander J. Likelihood-free inference via classification. Statistics and Computing. 2017 Mar 13:1-5.
● Beaumont MA, Zhang W, Balding DJ. Approximate Bayesian computation in population genetics. Genetics. 2002 Dec
1;162(4):2025-35.
● Arjovsky M, Chintala S, Bottou L. Wasserstein gan. ICML 2017.
● Nowozin S, Cseke B, Tomioka R. f-gan: Training generative neural samplers using variational divergence minimization. In
Advances in Neural Information Processing Systems 2016 (pp. 271-279).
● Bellemare MG, Danihelka I, Dabney W, Mohamed S, Lakshminarayanan B, Hoyer S, Munos R. The Cramer Distance as a Solution
to Biased Wasserstein Gradients. arXiv preprint arXiv:1705.10743. 2017 May 30.
● Dumoulin V, Belghazi I, Poole B, Lamb A, Arjovsky M, Mastropietro O, Courville A. Adversarially learned inference.
● Friedman J, Hastie T, Tibshirani R. The elements of statistical learning. New York: Springer series in statistics; 2001.
References: Prob. Reinforcement Learning
● Kappen HJ. Path integrals and symmetry breaking for optimal control theory. Journal of statistical mechanics: theory and
experiment. 2005 Nov 30;2005(11):P11011.
● Rawlik K, Toussaint M, Vijayakumar S. Approximate inference and stochastic optimal control. arXiv preprint arXiv:1009.3958. 2010
Sep 20.
● Toussaint M. Robot trajectory optimization using approximate inference. InProceedings of the 26th annual international
conference on machine learning 2009 Jun 14 (pp. 1049-1056). ACM.
● Weber T, Heess N, Eslami A, Schulman J, Wingate D, Silver D. Reinforced variational inference. In Advances in Neural Information
Processing Systems (NIPS) Workshops 2015.
● Rajeswaran A, Lowrey K, Todorov E, Kakade S. Towards Generalization and Simplicity in Continuous Control.
● Peters J, Mülling K, Altun Y. Relative Entropy Policy Search. In AAAI 2010 Jul 11 (pp. 1607-1612).
● Furmston T, Barber D. Variational methods for reinforcement learning. In Proceedings of the Thirteenth International Conference
on Artificial Intelligence and Statistics 2010 Mar 31 (pp. 241-248).
● Ho J, Ermon S. Generative adversarial imitation learning. In Advances in Neural Information Processing Systems 2016 (pp.
4565-4573

Machine Learning Basics
No ratings yet
Machine Learning Basics
151 pages
1 Nature of Economics
100% (1)
1 Nature of Economics
99 pages
Concept Paper For Seabin Project
100% (2)
Concept Paper For Seabin Project
5 pages
Aai Ia1 Que Ans
No ratings yet
Aai Ia1 Que Ans
17 pages
Deep Learning
No ratings yet
Deep Learning
800 pages
Developing Research Questions
100% (1)
Developing Research Questions
13 pages
Conceptual Framework of The Study
100% (4)
Conceptual Framework of The Study
2 pages
Budgeting and Spending Behaviors of Accountancy
50% (2)
Budgeting and Spending Behaviors of Accountancy
9 pages
Lec 19
No ratings yet
Lec 19
111 pages
Different Types of Curve Fitting
No ratings yet
Different Types of Curve Fitting
13 pages
WWW - Shiksha - Com - Sarkari Exams - Teaching - Articles - Ugc Net Economics Question Paper 2023 Blogid 128473
No ratings yet
WWW - Shiksha - Com - Sarkari Exams - Teaching - Articles - Ugc Net Economics Question Paper 2023 Blogid 128473
15 pages
Library and Information Science Research: DR - Murugaih
No ratings yet
Library and Information Science Research: DR - Murugaih
10 pages
Deep Generative Models
No ratings yet
Deep Generative Models
55 pages
(Ebook) Generative Deep Learning, 2nd Edition (Third Early Release) by David Foster ISBN 9781098134174, 1098134176
No ratings yet
(Ebook) Generative Deep Learning, 2nd Edition (Third Early Release) by David Foster ISBN 9781098134174, 1098134176
81 pages
15000+ ChatGPT Prompts, (Crafti - Pro) - Tareas
95% (21)
15000+ ChatGPT Prompts, (Crafti - Pro) - Tareas
367 pages
Chapter 6
No ratings yet
Chapter 6
53 pages
The Best ChatGPT
100% (39)
The Best ChatGPT
8 pages
Data Analysis Oforphanagehomedonation
No ratings yet
Data Analysis Oforphanagehomedonation
27 pages
Prompt Engineering Bible Join and Master The AI Revolution Profit Online With GPT-4 Plugins For Effortless Money Making (Robert E. Miller) (Z-Library)
100% (5)
Prompt Engineering Bible Join and Master The AI Revolution Profit Online With GPT-4 Plugins For Effortless Money Making (Robert E. Miller) (Z-Library)
209 pages
Prompt Engineer 101
97% (30)
Prompt Engineer 101
45 pages
Data Mining: Practical Machine Learning Tools and Techniques
No ratings yet
Data Mining: Practical Machine Learning Tools and Techniques
123 pages
What Is Quantitative Research
No ratings yet
What Is Quantitative Research
3 pages
Introduction To VAE
No ratings yet
Introduction To VAE
5 pages
Major Minor Courses
No ratings yet
Major Minor Courses
13 pages
Baiq Maziza Adawiach Mandala 1710105081
No ratings yet
Baiq Maziza Adawiach Mandala 1710105081
17 pages
Unit II
No ratings yet
Unit II
27 pages
Mathematics Paper 3 Statistics and Probability HL
No ratings yet
Mathematics Paper 3 Statistics and Probability HL
4 pages
226 ChatGPT Prompts A-Z ChatGPT Prompt Engineering BootCamp
93% (15)
226 ChatGPT Prompts A-Z ChatGPT Prompt Engineering BootCamp
120 pages
513C3B
No ratings yet
513C3B
4 pages
Week1 UDL CM20315 01 Intro
No ratings yet
Week1 UDL CM20315 01 Intro
49 pages
L11 - UCLxDeepMind DL2020
No ratings yet
L11 - UCLxDeepMind DL2020
68 pages
Top 100 Applications of Generative AI 1683282083
100% (15)
Top 100 Applications of Generative AI 1683282083
119 pages
Harrisson A. How To Make Money Online With ChatGPT... 2023
95% (19)
Harrisson A. How To Make Money Online With ChatGPT... 2023
194 pages
200 ChatGPT Prompts
88% (56)
200 ChatGPT Prompts
14 pages
Lecture # 1-2 Introduction To Gen AI
No ratings yet
Lecture # 1-2 Introduction To Gen AI
41 pages
Unit 4 Short Notes
No ratings yet
Unit 4 Short Notes
27 pages
Lec1 Intro
No ratings yet
Lec1 Intro
51 pages
Case Processing Summary
No ratings yet
Case Processing Summary
4 pages
Unit 1 DLT
No ratings yet
Unit 1 DLT
10 pages
Deep Learning UNIT 5
No ratings yet
Deep Learning UNIT 5
182 pages
Hassan I 2010
No ratings yet
Hassan I 2010
12 pages
Qualitative Document Analysis in Political Science: University of Manitoba, Department of Political Studies
No ratings yet
Qualitative Document Analysis in Political Science: University of Manitoba, Department of Political Studies
15 pages
Btrblocks - Data Lake Compression
No ratings yet
Btrblocks - Data Lake Compression
14 pages
The Art of Asking ChatGPT For High-Quality Answers A Complete Guide To Prompt Engineering Techniques (Ibrahim John) (Z-Library)
97% (29)
The Art of Asking ChatGPT For High-Quality Answers A Complete Guide To Prompt Engineering Techniques (Ibrahim John) (Z-Library)
52 pages
Chi Square Test
100% (1)
Chi Square Test
52 pages
Student Handbook: Ppe Prelims 2 0 1 8 - 2 0 1 9
No ratings yet
Student Handbook: Ppe Prelims 2 0 1 8 - 2 0 1 9
52 pages
ChatGPT Advanced Tutorial
91% (34)
ChatGPT Advanced Tutorial
57 pages
Module in Practical Research 2: Your Lesson For Today!
No ratings yet
Module in Practical Research 2: Your Lesson For Today!
20 pages
Inherent Stochasticity
No ratings yet
Inherent Stochasticity
12 pages
Unit 3
No ratings yet
Unit 3
38 pages
Lec15 Generative Models
No ratings yet
Lec15 Generative Models
51 pages
Deep Learning
No ratings yet
Deep Learning
15 pages
Cheatsheets For Deep Learning 1650192034
No ratings yet
Cheatsheets For Deep Learning 1650192034
95 pages
Lect-Gen Ai-2
No ratings yet
Lect-Gen Ai-2
22 pages
Lect-Gen Ai-2
No ratings yet
Lect-Gen Ai-2
22 pages
Module 1
No ratings yet
Module 1
64 pages
Advanced Machine Learning: Course Overview
No ratings yet
Advanced Machine Learning: Course Overview
26 pages
BSCRIM Mathematics in The Modern World
No ratings yet
BSCRIM Mathematics in The Modern World
7 pages
Swt549 Data Mining and Business Intelligence TH 1.10 Ac26
No ratings yet
Swt549 Data Mining and Business Intelligence TH 1.10 Ac26
2 pages
A Gentle Introduction To Generative Adversarial Networks (GANs)
No ratings yet
A Gentle Introduction To Generative Adversarial Networks (GANs)
15 pages
DL Asmt-2
No ratings yet
DL Asmt-2
17 pages
DLT Unit-1
No ratings yet
DLT Unit-1
28 pages
An Introduction To Variational Autoencoders: Foundations and Trends in Machine Learning
No ratings yet
An Introduction To Variational Autoencoders: Foundations and Trends in Machine Learning
89 pages
MODULE 2 Deep Learning
No ratings yet
MODULE 2 Deep Learning
26 pages
Applied Generative AI For Beginners Practical Knowledge 1703207445
93% (14)
Applied Generative AI For Beginners Practical Knowledge 1703207445
221 pages
CSGL
No ratings yet
CSGL
11 pages
Intro To Vae
No ratings yet
Intro To Vae
89 pages
AI60201 Module3
No ratings yet
AI60201 Module3
61 pages
Quadrant Data Efficient Machine Learning Screen
No ratings yet
Quadrant Data Efficient Machine Learning Screen
6 pages
Inbound 9065663253201989159
No ratings yet
Inbound 9065663253201989159
9 pages
Generative Learning Algorithims 1233
No ratings yet
Generative Learning Algorithims 1233
33 pages
Chap 003
No ratings yet
Chap 003
40 pages
GAPE Module 1
No ratings yet
GAPE Module 1
29 pages
Generative
No ratings yet
Generative
19 pages
Gen AI Notes Part 1
No ratings yet
Gen AI Notes Part 1
15 pages
Lec 12
No ratings yet
Lec 12
15 pages
101 Best Funnel Prompts PDF
100% (24)
101 Best Funnel Prompts PDF
57 pages
Introduction To Gen Ai
No ratings yet
Introduction To Gen Ai
13 pages
Unlocking The Potential of ChatGPT
100% (18)
Unlocking The Potential of ChatGPT
45 pages
Generative Ai Fundamentals v1
100% (16)
Generative Ai Fundamentals v1
80 pages
Deep Learning
No ratings yet
Deep Learning
78 pages
MuskanSharma - III IT
No ratings yet
MuskanSharma - III IT
10 pages
CSD411 Week14 AutoRBM
No ratings yet
CSD411 Week14 AutoRBM
18 pages
Main
No ratings yet
Main
17 pages
Generative Models
No ratings yet
Generative Models
10 pages
Deep Unsupervised Learning
No ratings yet
Deep Unsupervised Learning
90 pages
Module 1 Presentation
No ratings yet
Module 1 Presentation
48 pages
Deep Learning U5
No ratings yet
Deep Learning U5
5 pages
Table of Content
No ratings yet
Table of Content
9 pages
Week 12 Chats
No ratings yet
Week 12 Chats
4 pages
DL (1-10)
No ratings yet
DL (1-10)
10 pages
Tutorial 1,2
No ratings yet
Tutorial 1,2
12 pages
Unit 2.1
No ratings yet
Unit 2.1
37 pages
nlfynx7RfS0IZ9YGOtls - Some Core Concepts
No ratings yet
nlfynx7RfS0IZ9YGOtls - Some Core Concepts
6 pages
Awesome ChatGPT Prompts PDF
100% (13)
Awesome ChatGPT Prompts PDF
103 pages
Class Notes Astronomy 3 of 5
No ratings yet
Class Notes Astronomy 3 of 5
2 pages
Deep Learning: Nicholas G. Polson Vadim O. Sokolov
No ratings yet
Deep Learning: Nicholas G. Polson Vadim O. Sokolov
18 pages
Deep
No ratings yet
Deep
15 pages
Oussidi 2018
No ratings yet
Oussidi 2018
8 pages
(EARLY RELEASE) Quick Start Guide To Large Language Models Strategies and Best Practices For Using ChatGPT and Other LLMs (Sinan Ozdemir) (Z-Library)
100% (14)
(EARLY RELEASE) Quick Start Guide To Large Language Models Strategies and Best Practices For Using ChatGPT and Other LLMs (Sinan Ozdemir) (Z-Library)
132 pages
Social Media
No ratings yet
Social Media
10 pages
Full Course of Machine Learning
100% (16)
Full Course of Machine Learning
660 pages
A Selective Overview of Deep Learning: Jianqing Fan Cong Ma Yiqiao Zhong April 16, 2019
No ratings yet
A Selective Overview of Deep Learning: Jianqing Fan Cong Ma Yiqiao Zhong April 16, 2019
37 pages
Course Description: Cc3780@cumc - Columbia.edu
No ratings yet
Course Description: Cc3780@cumc - Columbia.edu
4 pages
70 AI Tools To Boost Productivity
83% (24)
70 AI Tools To Boost Productivity
72 pages
ChatGPT Cheat Sheet
100% (35)
ChatGPT Cheat Sheet
4 pages
45 ChatGPT Use Cases For Product Managers 1674466304
100% (18)
45 ChatGPT Use Cases For Product Managers 1674466304
100 pages
900+ Best ChatGPT Prompts
50% (10)
900+ Best ChatGPT Prompts
73 pages
Chat GPT
92% (75)
Chat GPT
34 pages
ChatGPT-Guide 1
90% (10)
ChatGPT-Guide 1
42 pages
Q3M4-Semi Detailed Lesson Plan
No ratings yet
Q3M4-Semi Detailed Lesson Plan
3 pages
Preferred Types of Marketing Strategies and Types of Senior High School Consumers in Central Colleges of The Philippines
No ratings yet
Preferred Types of Marketing Strategies and Types of Senior High School Consumers in Central Colleges of The Philippines
76 pages
Advanced ChatGPT Prompt Engineering
100% (4)
Advanced ChatGPT Prompt Engineering
7 pages
150 ChatGPT Prompts PDF
90% (10)
150 ChatGPT Prompts PDF
10 pages
ChatGPT Cheat Sheet - DataCamp PDF
91% (11)
ChatGPT Cheat Sheet - DataCamp PDF
78 pages
Train ChatGPT
79% (14)
Train ChatGPT
67 pages
The Art of ChatGPT Prompting - A Guide To Crafting Clear and Effective Prompts December 2022
94% (34)
The Art of ChatGPT Prompting - A Guide To Crafting Clear and Effective Prompts December 2022
31 pages
The ChatGPT Prompt Book - LifeArchitect - Ai - Rev 1
100% (6)
The ChatGPT Prompt Book - LifeArchitect - Ai - Rev 1
45 pages
THE Chat GPT Guide
90% (10)
THE Chat GPT Guide
23 pages
AI-Driven Time Series Forecasting: Complexity-Conscious Prediction and Decision-Making
From Everand
AI-Driven Time Series Forecasting: Complexity-Conscious Prediction and Decision-Making
Raghurami Reddy Etukuru Ph.D.
No ratings yet

Deep Gen Models Tutorial

Uploaded by

Deep Gen Models Tutorial

Uploaded by

UAI 2017 Australia

Deep Generative Models

Move beyond associating Understand and imagine

Recognise objects in the

Establish concepts as useful

Gómez-Bombarelli, et al. 2016

Regier et al., 2015

Ledig et al., 2016

Oord et al., 2016

DRAW Pixel RNN ALI

Compression rate: 0.2bits/dimension

Gregor et al., 2016

Rezende et al., 2016

Higgins et al., 2017

Atari simulation Robot arm simulation

Chiappa et al, 2017, Kalchbrenner et al., 2017

+ Rich non-linear models for - Mainly conjugate and linear models.

1. Models 2. Learning Principles

Latent Variable Models

Equivalent ways of representing the same DAG

Introduce an unobserved local

● Exact methods (conjugacy, enumeration)

A given model and learning principle can be implemented in many ways.

Convolutional neural network Implicit Generative Model

● Optimisation methods ● Method-of-moments

Latent variable model Restricted Boltzmann Machine

Objective Quantity of Interest

● Other methods: MCMC, Langevin, HMC, Adaptive MCMC

Variational Optimisation Approximate Posteriors

Introduce a parametric family of conditional densities

Rezende et al., 2015

Deep Latent Gaussian Model p(x,z)

Gaussian Recognition Model q(z)

We then optimise the free-energy wrt model and variational parameters

AIR: Structured Priors Volumetric and Semi-supervised Learning

Variational Principles Amortised Inference

Compute expectations then M-step gradients

In general, we won’t know the expectations.

Score-function estimator: Typical problem areas:

Gradient reweighted by the value of the function

Other names: When to use:

Find an invertible function g(.) that expresses z as a

Kingma and Welling, 2014, Rezende et al., 2014

We can develop low-variance estimators by exploiting knowledge

Rezende et al., 2014

Key part of variational inference is choice of approximate posterior distribution q.

Mean-field or fully-factorised posterior is usually not sufficient

Complex dependencies · Non-Gaussian distributions · Multiple modes

Same as the problem of specifying a model of the data itself

Auxiliary Variable Models

Distribution flows through a sequence of invertible transforms

Begin with a fully-factorised

Triangular Jacobians allow for

Linear time computation of the determinant and its gradient.

Gemici et al., 2016

We compare the estimated

Use a hypothesis test or comparison to

Adjust model parameters to better match

Real Data Simulated Data

Sugiyama et al, 2012

Computing a density ratio is equivalent to class probability estimation.

Other names and places:

(Alt) Generative loss

Many choices of f available: classifiers or

Max Mean Discrepancy Cramer

Policy gradient update:

Other algorithms: Other names and instantiations:

Deep Generative Models

You might also like