
DOI:10.1145/3422622

Generative Adversarial Networks


By Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu,
David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio

Abstract
Generative adversarial networks are a kind of artificial intelligence algorithm designed to solve the generative modeling problem. The goal of a generative model is to study a collection of training examples and learn the probability distribution that generated them. Generative Adversarial Networks (GANs) are then able to generate more examples from the estimated probability distribution. Generative models based on deep learning are common, but GANs are among the most successful generative models (especially in terms of their ability to generate realistic high-resolution images). GANs have been successfully applied to a wide variety of tasks (mostly in research settings) but continue to present unique challenges and research opportunities because they are based on game theory while most other approaches to generative modeling are based on optimization.

The original version of this paper is entitled “Generative Adversarial Networks” and was published in Advances in Neural Information Processing Systems 27 (NIPS 2014).

Communications of the ACM (Research Highlights), November 2020, Vol. 63, No. 11, pp. 139–144.

1. INTRODUCTION
Most current approaches to developing artificial intelligence are based primarily on machine learning. The most widely used and successful form of machine learning to date is supervised learning. Supervised learning algorithms are given a dataset of pairs of example inputs and example outputs. They learn to associate each input with the corresponding output, and thus learn a mapping from input to output examples. The input examples are typically complicated data objects like images, natural language sentences, or audio waveforms, while the output examples are often relatively simple. The most common kind of supervised learning is classification, where the output is just an integer code identifying a specific category (a photo might be recognized as coming from category 0 containing cats, or category 1 containing dogs, etc.).

Supervised learning is often able to achieve greater than human accuracy after the training process is complete, and thus has been integrated into many products and services. Unfortunately, the learning process itself still falls far short of human abilities. Supervised learning by definition relies on a human supervisor to provide an output example for each input example. Worse, existing approaches to supervised learning often require millions of training examples to exceed human performance, when a human might be able to learn to perform the task acceptably from a very small number of examples.

In order to reduce both the amount of human supervision required for learning and the number of examples required for learning, many researchers today study unsupervised learning, often using generative models. In this overview paper, we describe one particular approach to unsupervised learning via generative modeling called generative adversarial networks. We briefly review applications of GANs and identify core research problems related to convergence in games necessary to make GANs a reliable technology.

2. GENERATIVE MODELING
The goal of supervised learning is relatively straightforward to specify, and all supervised learning algorithms have essentially the same goal: learn to accurately associate new input examples with the correct outputs. For instance, an object recognition algorithm may associate a photo of a dog with some kind of DOG category identifier.

Unsupervised learning is a less clearly defined branch of machine learning, with many different unsupervised learning algorithms pursuing many different goals. Broadly speaking, the goal of unsupervised learning is to learn something useful by examining a dataset containing unlabeled input examples. Clustering and dimensionality reduction are common examples of unsupervised learning.

Another approach to unsupervised learning is generative modeling. In generative modeling, training examples x are drawn from an unknown distribution p_data(x). The goal of a generative modeling algorithm is to learn a p_model(x) that approximates p_data(x) as closely as possible.

A straightforward way to learn an approximation of p_data is to explicitly write a function p_model(x; θ) controlled by parameters θ and search for the value of the parameters that makes p_data and p_model as similar as possible. In particular, the most popular approach to generative modeling is probably maximum likelihood estimation, consisting of minimizing the Kullback-Leibler divergence between p_data and p_model. The common approach of estimating the mean parameter of a Gaussian distribution by taking the mean of a set of observations is one example of maximum likelihood estimation. This approach based on explicit density functions is illustrated in Figure 1.

Figure 1. Many approaches to generative modeling are based on density estimation: observing several training examples of a random variable x and inferring a density function p(x) that generates the training data. This approach is illustrated here, with several data points on a real number line used to fit a Gaussian density function that explains the observed samples. In contrast to this common approach, GANs are implicit models that infer the probability distribution p(x) without necessarily representing the density function explicitly. [Plot axis label: p(x).]

Explicit density modeling has worked well for traditional statistics, using simple functional forms of probability distributions, usually applied to small numbers of variables. More recently, with the rise of machine learning in general and deep learning in particular, researchers have become interested in learning models that make use of relatively complicated functional forms. When a deep neural network is used to generate data, the corresponding density function may be computationally intractable.

Traditionally, there have been two dominant approaches to confronting this intractability problem: (1) carefully design the model to have a tractable density function (e.g., Frey [11]) and (2) design a learning algorithm based on a computationally tractable approximation of an intractable density function (e.g., Kingma and Welling [15]). Both approaches have proved difficult, and for many applications, such as generating realistic high-resolution images, researchers remain unsatisfied with the results so far. This motivates further research to improve these two paths, but also suggests that a third path could be useful.

Besides taking a point x as input and returning an estimate of the probability of generating that point, a generative model can be useful if it is able to generate a sample from the distribution p_model. This is illustrated in Figure 2. Many models that represent a density function can also generate samples from that density function. In some cases, generating samples is very expensive or only approximate methods of generating samples are tractable.

Figure 2. The goal of many generative models, as illustrated here, is to study a collection of training examples, then learn to generate more examples that come from the same probability distribution. GANs learn to do this without using an explicit representation of the density function. One advantage of the GAN framework is that it may be applied to models for which the density function is computationally intractable. The samples shown here are all samples from the ImageNet dataset [8], including the ones labeled “model samples.” We use actual ImageNet data to illustrate the goal that a hypothetical perfect model would attain. [Panel labels: Training data; Learned model; Generated samples.]

Some generative models avoid the entire issue of designing a tractable density function and learn only a tractable sample generation process. These are called implicit generative models. GANs fall into this category. Prior to the introduction of GANs, the state-of-the-art deep implicit generative model was the generative stochastic network [4], which is capable of approximately generating samples via an incremental process based on Markov chains. GANs were introduced in order to create a deep implicit generative model that was able to generate true samples from the model distribution in a single generation step, without need for the incremental generation process or approximate nature of sampling Markov chains.

Today, the most popular approaches to generative modeling are probably GANs, variational autoencoders [15], and fully-visible belief nets (e.g., Frey [11] and ref. [26]). None of these approaches relies on Markov chains, so the reason for the interest in GANs today is not that they succeeded at their original goal of generative modeling without Markov chains, but rather that they have succeeded in generating high-quality images and have proven useful for several tasks other than straightforward generation, as described in Section 5.

3. GENERATIVE ADVERSARIAL NETWORKS
Generative adversarial networks are based on a game, in the sense of game theory, between two machine learning models, typically implemented using neural networks.

One network called the generator defines p_model(x) implicitly. The generator is not necessarily able to evaluate the density function p_model. For some variants of GANs, evaluation of the density function is possible (any tractable density model for which sampling is tractable and differentiable could be trained as a GAN generator, as done by Danihelka et al. [6]), but this is not required. Instead, the generator is able to draw samples from the distribution p_model. The generator is defined by a prior distribution p(z) over a vector z that serves as input to the generator function G(z; θ^(G)), where θ^(G) is a set of learnable parameters defining the generator’s strategy in the game. The input vector z can be thought of as a source of randomness in an otherwise deterministic system, analogous to the seed of a pseudorandom number generator. The prior distribution p(z) is typically a relatively unstructured distribution, such as a high-dimensional Gaussian distribution or a uniform distribution over a hypercube. Samples z from this distribution are then just noise. The main role of the generator is to learn the function G(z) that transforms such unstructured noise z into realistic samples.

The other player in this game is the discriminator. The discriminator examines samples x and returns some estimate D(x; θ^(D)) of whether x is real (drawn from the training distribution) or fake (drawn from p_model by running the generator). In the original formulation of GANs, this estimate consists of a probability that the input is real rather than fake, assuming that the real distribution and fake distribution are sampled equally often. Other formulations (e.g., Arjovsky et al. [1]) exist, but generally speaking, at the level of verbal, intuitive descriptions, the discriminator tries to predict whether the input was real or fake.

Each player incurs a cost: J^(G)(θ^(G), θ^(D)) for the generator and J^(D)(θ^(G), θ^(D)) for the discriminator. Each player attempts to minimize its own cost. Roughly speaking, the discriminator’s cost encourages it to correctly classify data as real or fake, while the generator’s cost encourages it to generate samples that the discriminator incorrectly classifies as real. Many different specific formulations of these costs are possible, and so far most popular formulations seem to perform roughly the same [18]. In the original version of GANs, J^(D) was defined to be the negative log-likelihood that the discriminator assigns to the real-vs-fake labels given the input to the discriminator. In other words, the discriminator is trained just like a regular binary classifier. The original work on GANs offered two versions of the cost for the generator. One version, today called minimax GAN (M-GAN), defined a cost J^(G) = −J^(D), yielding a minimax game that is straightforward to analyze theoretically. M-GAN defines the cost for the generator by flipping the sign of the discriminator’s cost; another approach is the non-saturating GAN (NS-GAN), for which the generator’s cost is defined by flipping the discriminator’s labels. In other words, the generator is trained to minimize the negative log-likelihood that the discriminator assigns to the wrong labels. The latter helps to avoid gradient saturation while training the model.

We can think of GANs as a bit like counterfeiters and police: the counterfeiters make fake money while the police try to arrest counterfeiters and continue to allow the spending of legitimate money. Competition between counterfeiters and police leads to more and more realistic counterfeit money until eventually the counterfeiters produce perfect fakes and the police cannot tell the difference between real and fake money. One complication to this analogy is that the generator learns via the discriminator’s gradient, as if the counterfeiters have a mole among the police reporting the specific methods that the police use to detect fakes.

This process is illustrated in Figure 3. Figure 4 shows a cartoon giving some intuition for how the process works.

Figure 3. Training GANs involves training both a generator network and a discriminator network. The process involves both real data drawn from a dataset and fake data created continuously by the generator throughout the training process. The discriminator is trained much like any other classifier defined by a deep neural network. As shown on the left, the discriminator is shown data from the training set. In this case, the discriminator is trained to assign data to the “real” class. As shown on the right, the training process also involves fake data. The fake data is constructed by first sampling a random vector z from a prior distribution over latent variables of the model. The generator is then used to produce a sample x = G(z). The function G is simply a function represented by a neural network that transforms the random, unstructured z vector into structured data, intended to be statistically indistinguishable from the training data. The discriminator then classifies this fake data. The discriminator is trained to assign this data to the “fake” class. The backpropagation algorithm makes it possible to use the derivatives of the discriminator’s output with respect to the discriminator’s input to train the generator. The generator is trained to fool the discriminator, in other words, to make the discriminator assign its input to the “real” class. The training process for the discriminator is thus much the same as for any other binary classifier, with the exception that the data for the “fake” class comes from a distribution that changes constantly as the generator learns rather than from a fixed distribution. The learning process for the generator is somewhat unusual, because it is not given specific targets for its output, but rather simply given a reward for producing outputs that fool its (constantly changing) opponent. [Diagram, left: random index into dataset, Dataset, Real data, Discriminator. Right: random latent variable, Generator, Fake data, Discriminator.]

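The cost definitions in Section 3 can be written out concretely. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: the discriminator's output is a sigmoid applied to a scalar logit, real and fake terms are each weighted by 1/2 (mirroring the equal-sampling assumption), and all function names are hypothetical. It also gives one way to see the gradient saturation that NS-GAN is said to avoid, by comparing the derivative of each generator cost with respect to the discriminator's logit on a fake sample.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def discriminator_cost(d_real, d_fake):
    """J^(D): negative log-likelihood of the correct real-vs-fake labels.

    d_real / d_fake are lists of discriminator outputs in (0, 1) on real
    and generated samples. The 1/2 weights are an assumption (see lead-in).
    """
    real_term = -sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def mgan_generator_cost(d_real, d_fake):
    """M-GAN: J^(G) = -J^(D), the discriminator's cost with its sign flipped."""
    return -discriminator_cost(d_real, d_fake)


def nsgan_generator_cost(d_fake):
    """NS-GAN: negative log-likelihood of the *wrong* label ("real") on fakes."""
    return -0.5 * sum(math.log(d) for d in d_fake) / len(d_fake)


def mgan_grad_wrt_logit(a):
    """d/da of 0.5 * log(1 - sigmoid(a)), the part of -J^(D) the generator affects."""
    return -0.5 * sigmoid(a)


def nsgan_grad_wrt_logit(a):
    """d/da of -0.5 * log(sigmoid(a))."""
    return -0.5 * (1.0 - sigmoid(a))


if __name__ == "__main__":
    # A discriminator that confidently rejects a fake sample (very negative logit).
    logit = -6.0
    print("M-GAN  gradient magnitude:", abs(mgan_grad_wrt_logit(logit)))
    print("NS-GAN gradient magnitude:", abs(nsgan_grad_wrt_logit(logit)))
```

When the discriminator confidently rejects a fake sample, the M-GAN derivative is proportional to D(G(z)) and shrinks toward zero, while the NS-GAN derivative is proportional to 1 − D(G(z)) and stays close to its maximum magnitude. This is only a one-sample view of the effect under the assumptions above, not an analysis of full training dynamics.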


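The training process described for Figure 3 can be sketched on a toy problem. Everything specific in the code below is an assumption made for illustration and does not come from the paper: the data are 1-D Gaussian samples, the generator is the affine map G(z) = a·z + b, the discriminator is a logistic classifier D(x) = sigmoid(w·x + c), the generator uses the NS-GAN cost, gradients are derived by hand, and both players are updated with plain gradient steps computed from the same parameter values. The sketch makes no claim about convergence.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def softplus(s):
    # log(1 + exp(s)), computed stably; note -log(sigmoid(s)) == softplus(-s).
    return max(s, 0.0) + math.log1p(math.exp(-abs(s)))


def mean(values):
    return sum(values) / len(values)


def costs(params, x_real, z):
    """Return (J_D, J_G) for the toy GAN; params = [w, c, a, b]."""
    w, c, a, b = params
    s_real = [w * x + c for x in x_real]
    s_fake = [w * (a * zi + b) + c for zi in z]
    j_d = 0.5 * mean([softplus(-s) for s in s_real]) \
        + 0.5 * mean([softplus(s) for s in s_fake])
    j_g = 0.5 * mean([softplus(-s) for s in s_fake])  # NS-GAN generator cost
    return j_d, j_g


def gradients(params, x_real, z):
    """Gradient of J_D w.r.t. (w, c) and of J_G w.r.t. (a, b)."""
    w, c, a, b = params
    x_fake = [a * zi + b for zi in z]
    d_real = [sigmoid(w * x + c) for x in x_real]
    d_fake = [sigmoid(w * x + c) for x in x_fake]
    grad_w = 0.5 * mean([(d - 1.0) * x for d, x in zip(d_real, x_real)]) \
           + 0.5 * mean([d * x for d, x in zip(d_fake, x_fake)])
    grad_c = 0.5 * mean([d - 1.0 for d in d_real]) + 0.5 * mean(d_fake)
    # The generator learns through the derivative of the discriminator's
    # output with respect to the discriminator's input (here, the factor w).
    dj_dx = [0.5 * (d - 1.0) * w for d in d_fake]
    grad_a = mean([g * zi for g, zi in zip(dj_dx, z)])
    grad_b = mean(dj_dx)
    return [grad_w, grad_c, grad_a, grad_b]


def train(steps=200, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    params = [0.1, 0.0, 1.0, 0.0]  # arbitrary starting point [w, c, a, b]
    for _ in range(steps):
        x_real = [rng.gauss(3.0, 1.0) for _ in range(batch)]  # "real" data
        z = [rng.gauss(0.0, 1.0) for _ in range(batch)]       # prior p(z)
        grads = gradients(params, x_real, z)
        # Both players step at once, each on its own cost and parameters.
        params = [p - lr * g for p, g in zip(params, grads)]
    return params


if __name__ == "__main__":
    w, c, a, b = train()
    print("discriminator (w, c):", w, c)
    print("generator     (a, b):", a, b)
```

A linear discriminator is used only to keep the hand-written gradients short; in practice both players are typically neural networks trained with automatic differentiation.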

Figure 4. An illustration of the basic intuition behind the GAN training process, illustrated by fitting a 1-D Gaussian distribution. In this example, we can understand the goal of the generator as learning a simple scaling of the inverse cumulative distribution function of the data generating distribution. GANs are trained by simultaneously updating the discriminator function (D, blue, dashed line) so that it discriminates samples from the data generating distribution p_x (black, dotted line) from those of the generative distribution p_model (green, solid line). The lower horizontal line is the domain from which z is sampled, in this case uniformly. The horizontal line above is part of the domain of x. The upward arrows show how the mapping x = G(z) imposes the non-uniform distribution p_model on transformed samples. G contracts in regions of high density and expands in regions of low density of p_model. (a) Consider a pair of adversarial networks at initialization: p_model is initialized to a unit Gaussian for this example while D is defined by a randomly initialized deep neural network. (b) Suppose that D were trained to convergence while G were held fixed. In practice, both are trained simultaneously, but for the purpose of building intuition, we see that if G were fixed, D would converge to D*(x) = p_data(x) / (p_data(x) + p_model(x)). (c) Now suppose that we gradually train both G and D for a while. The samples x generated by G flow in the direction of increasing D in order to arrive at regions that are more likely to be classified as data. Meanwhile the estimate of D is updated in response to this update in G. (d) At the Nash equilibrium, neither player can improve its payoff because p_model = p_data. The discriminator is unable to differentiate between the two distributions, that is, D(x) = 1/2. This constant function shows that all points are equally likely to have come from either distribution. In practice, G and D are typically optimized with simultaneous gradient steps, and it is not necessary for D to be optimal at every step as shown in this intuitive cartoon. See Fedus et al. [10] and Nagarajan and Kolter [24] for more realistic discussions of the GAN equilibration process. [Panels (a) to (d), each with an upper x axis and a lower z axis.]

The situation is not straightforward to model as an optimization problem because each player’s cost is a function of the other player’s parameters, but each player may control only its own parameters. It is possible to reduce the situation to optimization, where the goal is to minimize

J^(G)(θ^(G), argmin_{θ^(D)} J^(D)(θ^(G), θ^(D))),

as demonstrated by Metz et al. [22], but the argmin operation is difficult to work with in this way. The most popular approach is to regard this situation as a game between two players. Much of the game theory literature is concerned with games that have discrete and finite action spaces, convex losses, or other properties simplifying them. GANs require use of game theory in settings that are not yet well-explored, where the costs are non-convex and the actions and policies are continuous and high-dimensional (regardless of whether we consider an action to be choosing a specific parameter vector θ^(G) or whether we consider the action to be generating a sample x). The goal of a machine learning algorithm in this context is to find a local Nash equilibrium [28]: a point that is a local minimum of each player’s cost with respect to that player’s parameters. With local moves, no player can reduce its cost further, assuming the other player’s parameters do not change.

The most common training algorithm is simply to use a gradient-based optimizer to repeatedly take simultaneous steps on both players, incrementally minimizing each player’s cost with respect to that player’s parameters.

At the end of the training process, GANs are often able to produce realistic samples, even for very complicated datasets containing high-resolution images. An example is shown in Figure 5.

Figure 5. This image is a sample from a Progressive GAN [14] depicting a person who does not exist but was “imagined” by a GAN after training on photos of celebrities.

At a high level, one reason that the GAN framework is successful may be that it involves very little approximation. Many other approaches to generative modeling must approximate an intractable density function. GANs do not involve any approximation to their true underlying task. The only real error is the statistical error (sampling of a finite amount of training data rather than measuring the true underlying data-generating distribution) and failure of the learning algorithm to converge to exactly the optimal parameters. Many generative modeling strategies would introduce these sources of error and also further sources of approximation error, based on Markov chains, optimization of bounds on the true cost rather than the cost itself, etc.

It is difficult to give much further specific guidance regarding the details of GANs because GANs are such an active research area and most specific advice quickly becomes out of date. Figure 6 shows how quickly the capabilities of GANs have progressed in the years since their introduction.

Figure 6. An illustration of progress in GAN capabilities over the course of approximately three years following the introduction of GANs. GANs have rapidly become more capable, due to changes in GAN algorithms, improvements to the underlying deep learning algorithms, and improvements to underlying deep learning software and hardware infrastructure. This rapid progress means that it is infeasible for any single document to summarize the state-of-the-art GAN capabilities or any specific set of best practices; both continue to evolve rapidly enough that any comprehensive survey quickly becomes out of date. Figure reproduced with permission from Brundage et al. [5]. The individual results, for the panels labeled 2014, 2015, 2016, and 2017, are from Goodfellow et al. [13], Radford et al. [27], Liu and Tuzel [17], and Karras et al. [14], respectively.

4. CONVERGENCE OF GANS
The central theoretical results presented in the original GAN paper [13] were that:

1. in the space of density functions p_model and discriminator functions D, there is only one local Nash equilibrium, where p_model = p_data.
2. if it were possible to optimize directly over such density functions, then the algorithm that consists of optimizing D to convergence in the inner loop, then making a small gradient step on p_model in the outer loop, converges to this Nash equilibrium.

However, the theoretical model of local moves directly in density function space may not be very relevant to GANs as they are trained in practice: using local moves in parameter space of the generator function, among the set of functions representable by neural networks with a finite number of parameters, with each parameter represented with a finite number of bits.

In many different theoretical models, it is interesting to study whether a Nash equilibrium exists [2], whether any spurious Nash equilibria exist [32], whether the learning algorithm converges to a Nash equilibrium [24], and if it does so, how quickly [21].

In many cases of practical interest, these theoretical questions are open, and the best learning algorithms empirically often seem to fail to converge. Theoretical work to answer these questions is ongoing, as is work to design better costs, models, and training algorithms with better convergence properties.

5. OTHER GAN TOPICS
This article is focused on a summary of the core design considerations and algorithmic properties of GANs.

Many other topics of potential interest cannot be considered here due to space constraints. This article discussed using GANs to approximate a distribution p(x); they have also been extended to the conditional setting [23, 25], where they generate samples corresponding to some input by drawing samples from the conditional distribution p(x | y). GANs are related to moment matching [16] and optimal transport [1]. A quirk of GANs that is made especially clear through their connection to MMD and optimal transport is that they may be used to train generative models for which p_model has support only on a thin manifold and may actually assign zero likelihood to the training data. GANs struggle to generate discrete data because the back-propagation algorithm needs to propagate gradients from the discriminator through the output of the generator, but this problem is being gradually resolved [9]. Like most generative models, GANs can be used to fill in gaps in missing data [34]. GANs have proven very effective for learning to classify data using very few labeled training examples [29]. Evaluating the performance of generative models including GANs is a difficult research area in its own right [29, 31, 32, 33]. GANs can be seen as a way for machine learning to learn its own cost function, rather than minimizing a hand-designed cost function. GANs can be seen as a way of supervising machine learning by asking it to produce any output that the machine learning algorithm itself recognizes as acceptable, rather than by asking it to produce a specific example output. GANs are thus great for learning in situations where there are many possible correct answers, such as predicting the many possible futures that can happen in video generation [19]. GANs and GAN-like models can be used to learn to transform data from one domain into data from another domain, even without any labeled pairs of examples from those domains (e.g., Zhu et al. [35]). For example, after studying a collection of photos of zebras and a collection of photos of horses, GANs can turn a photo of a horse into a photo of a zebra [35]. GANs have been used in science to simulate experiments that would be costly to run even in traditional software simulators [7]. GANs can be used to create fake data to train other machine learning models, either when real data would be hard to acquire [30] or when there would be privacy concerns associated with real data [3]. GAN-like models called domain-adversarial networks can be used for domain adaptation [12]. GANs can be used for a variety of interactive digital media effects where the end goal is to produce compelling imagery [35]. GANs can even be used to solve variational inference problems used in other approaches to generative modeling [20]. GANs can learn useful embedding vectors and discover concepts such as the gender of human faces without supervision [27].

6. CONCLUSION
GANs are a kind of generative model based on game theory. They have had great practical success in terms of generating realistic data, especially images. It is currently still difficult to train them. For GANs to become a more reliable technology, it will be necessary to design models, costs, or training algorithms for which it is possible to find good Nash equilibria consistently and quickly.

References
1. Arjovsky, M., Chintala, S., Bottou, L. Wasserstein GAN. arXiv preprint arXiv:1701.07875 (2017).
2. Arora, S., Ge, R., Liang, Y., Ma, T., Zhang, Y. Generalization and equilibrium in generative adversarial nets (GANs). arXiv preprint arXiv:1703.00573 (2017).
3. Beaulieu-Jones, B.K., Wu, Z.S., Williams, C., Greene, C.S. Privacy-preserving generative deep neural networks support clinical data sharing. bioRxiv (2017), 159756.
4. Bengio, Y., Thibodeau-Laufer, E., Alain, G., Yosinski, J. Deep generative stochastic networks trainable by backprop. In ICML’2014 (2014).
5. Brundage, M., Avin, S., Clark, J., Toner, H., Eckersley, P., Garfinkel, B., Dafoe, A., Scharre, P., Zeitzoff, T., Filar, B., Anderson, H., Roff, H., Allen, G.C., Steinhardt, J., Flynn, C., hÉigeartaigh, S.Ó., Beard, S., Belfield, H., Farquhar, S., Lyle, C., Crootof, R., Evans, O., Page, M., Bryson, J., Yampolskiy, R., Amodei, D. The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation. ArXiv e-prints (Feb. 2018).
6. Danihelka, I., Lakshminarayanan, B., Uria, B., Wierstra, D., Dayan, P. Comparison of maximum likelihood and GAN-based training of real NVPs. arXiv preprint arXiv:1705.05263 (2017).
7. de Oliveira, L., Paganini, M., Nachman, B. Learning particle physics by example: location-aware generative adversarial networks for physics synthesis. Computing and Software for Big Science 1, 1 (2017), 4.
8. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L. ImageNet: A Large-Scale Hierarchical Image Database. In CVPR09 (2009).
9. Fedus, W., Goodfellow, I., Dai, A.M. MaskGAN: Better text generation via filling in the _____. In International Conference on Learning Representations (2018).
10. Fedus, W., Rosca, M., Lakshminarayanan, B., Dai, A.M., Mohamed, S., Goodfellow, I. Many paths to equilibrium: GANs do not need to decrease a divergence at every step. In International Conference on Learning Representations (2018).
11. Frey, B.J. Graphical Models for Machine Learning and Digital Communication. MIT Press, Boston, 1998.
12. Ganin, Y., Lempitsky, V. Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning (2015), 1180–1189.
13. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y. Generative adversarial nets. Z. Ghahramani, M. Welling, C. Cortes, N.D. Lawrence, K.Q. Weinberger, eds. Advances in Neural Information Processing Systems 27, Curran Associates, Inc., Boston, 2014, 2672–2680.
14. Karras, T., Aila, T., Laine, S., Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. CoRR, abs/1710.10196 (2017).
15. Kingma, D.P., Welling, M. Auto-encoding variational Bayes. In Proceedings of the International Conference on Learning Representations (ICLR) (2014).
16. Li, Y., Swersky, K., Zemel, R.S. Generative moment matching networks. CoRR, abs/1502.02761 (2015).
17. Liu, M.-Y., Tuzel, O. Coupled generative adversarial networks. D.D. Lee, M. Sugiyama, U.V. Luxburg, I. Guyon, R. Garnett, eds. Advances in Neural Information Processing Systems 29, Curran Associates, Inc., Boston, 2016, 469–477.
18. Lucic, M., Kurach, K., Michalski, M., Gelly, S., Bousquet, O. Are GANs created equal? A large-scale study. arXiv preprint arXiv:1711.10337 (2017).
19. Mathieu, M., Couprie, C., LeCun, Y. Deep multi-scale video prediction beyond mean square error. arXiv preprint arXiv:1511.05440 (2015).
20. Mescheder, L., Nowozin, S., Geiger, A. Adversarial variational Bayes: Unifying variational autoencoders and generative adversarial networks. arXiv preprint arXiv:1701.04722 (2017).
21. Mescheder, L., Nowozin, S., Geiger, A. The numerics of GANs. In Advances in Neural Information Processing Systems (2017), 1823–1833.
22. Metz, L., Poole, B., Pfau, D., Sohl-Dickstein, J. Unrolled generative adversarial networks. arXiv preprint arXiv:1611.02163 (2016).
23. Mirza, M., Osindero, S. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014).
24. Nagarajan, V., Kolter, J.Z. Gradient descent GAN optimization is locally stable. I. Guyon, U.V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett, eds. Advances in Neural Information Processing Systems 30, Curran Associates, Inc., Boston, 2017, 5585–5595.
25. Odena, A., Olah, C., Shlens, J. Conditional image synthesis with auxiliary classifier GANs. arXiv preprint arXiv:1610.09585 (2016).
26. Oord, A. v. d., Li, Y., Babuschkin, I., Simonyan, K., Vinyals, O., Kavukcuoglu, K., Driessche, G. v. d., Lockhart, E., Cobo, L.C., Stimberg, F., et al. Parallel WaveNet: Fast high-fidelity speech synthesis. arXiv preprint arXiv:1711.10433 (2017).
27. Radford, A., Metz, L., Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015).
28. Ratliff, L.J., Burden, S.A., Sastry, S.S. Characterization and computation of local Nash equilibria in continuous games. In Communication, Control, and Computing (Allerton), 2013 51st Annual Allerton Conference on. IEEE (2013), 917–924.
29. Salimans, T., Goodfellow, I., Zaremba, W., Cheung, V., Radford, A., Chen, X. Improved techniques for training GANs. In Advances in Neural Information Processing Systems (2016), 2234–2242.
30. Shrivastava, A., Pfister, T., Tuzel, O., Susskind, J., Wang, W., Webb, R. Learning from simulated and unsupervised images through adversarial training.
31. Theis, L., van den Oord, A., Bethge, M. A note on the evaluation of generative models. arXiv:1511.01844 (Nov 2015).
32. Unterthiner, T., Nessler, B., Klambauer, G., Heusel, M., Ramsauer, H., Hochreiter, S. Coulomb GANs: Provably optimal Nash equilibria via potential fields. arXiv preprint arXiv:1708.08819 (2017).
33. Wu, Y., Burda, Y., Salakhutdinov, R., Grosse, R. On the quantitative analysis of decoder-based generative models. arXiv preprint arXiv:1611.04273 (2016).
34. Yeh, R., Chen, C., Lim, T.Y., Hasegawa-Johnson, M., Do, M.N. Semantic image inpainting with perceptual and contextual losses. arXiv preprint arXiv:1607.07539 (2016).
35. Zhu, J.-Y., Park, T., Isola, P., Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv preprint arXiv:1703.10593 (2017).

Ian Goodfellow, written while at Google Brain.
Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, Université de Montréal.

Final submitted 5/9/2018.

Copyright held by authors/owners. Publication rights licensed to ACM.

