
Generative Adversarial Networks

Today’s class
• Unsupervised Learning
• Generative Models
• Autoencoders (AE)
• Generative Adversarial Networks (GAN)
• GANs: Recent Trends
Supervised vs Unsupervised Learning
Supervised Learning

Data: (x, y)
x is data, y is label

Goal: Learn a function to map x -> y

Examples: Classification, regression, object detection,
semantic segmentation, image captioning, etc.

Credit: cs231n, Stanford
Supervised vs Unsupervised Learning
Unsupervised Learning

Data: x
Just data, no labels!

Goal: Learn some underlying hidden structure of the data

Examples: Clustering (K-Means), dimensionality reduction (PCA),
feature learning, density estimation, generative adversarial
networks (distribution learning), etc.

Credit: cs231n, Stanford
Autoencoders
Unsupervised approach for learning a lower-dimensional feature
representation z from unlabeled training data x

z is usually smaller than x (dimensionality reduction)

Originally: Linear + nonlinearity (sigmoid)
Later: Deep, fully-connected
Later: ReLU CNN

Q: Why dimensionality reduction?
A: We want the features to capture meaningful factors of
variation in the data

Credit: cs231n, Stanford


Autoencoders
How to learn this feature representation?
Train such that the features can be used to reconstruct the
original data: “autoencoding” = encoding the data itself

Encoder: 4-layer conv (input data -> features z)
Decoder: 4-layer upconv (features z -> reconstructed data)
L2 loss between the input data and the reconstructed data

Doesn’t use labels!

Credit: cs231n, Stanford
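
A minimal PyTorch sketch of this setup (the 4-layer conv/upconv structure follows the slide, but channel counts, image size, and hyperparameters are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Minimal convolutional autoencoder: 4 strided conv layers down,
# 4 transposed-conv ("upconv") layers back up.
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(  # 64x64x3 input -> 4x4x256 features z
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(  # z -> reconstructed 64x64x3 image
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(8, 3, 64, 64)               # stand-in batch of images
loss = nn.functional.mse_loss(model(x), x)  # L2 reconstruction loss; no labels
opt.zero_grad(); loss.backward(); opt.step()
```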


Autoencoders
After training, throw away the decoder and keep only the encoder

Credit: cs231n, Stanford


Autoencoders
The encoder can be used to initialize a supervised model:
attach a classifier on top of the encoder, add a supervised
loss function (softmax, etc.), then fine-tune the encoder
jointly with the classifier and train for the final task
(sometimes with small data)

Credit: cs231n, Stanford
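
A sketch of that transfer step, continuing the autoencoder sketch above (the classifier head and num_classes are illustrative assumptions):

```python
# Continues the autoencoder sketch above: reuse the trained encoder,
# attach a classifier head, and fine-tune everything jointly.
num_classes = 10
classifier = nn.Sequential(
    model.encoder,                    # pretrained encoder as initialization
    nn.Flatten(),                     # 4x4x256 feature map -> 4096 vector
    nn.Linear(256 * 4 * 4, num_classes),
)
opt = torch.optim.Adam(classifier.parameters(), lr=1e-4)
labels = torch.randint(0, num_classes, (8,))                # stand-in labels
loss = nn.functional.cross_entropy(classifier(x), labels)   # softmax loss
opt.zero_grad(); loss.backward(); opt.step()
```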


Generative Adversarial Networks
Sample from a simple distribution, e.g. random noise.
Learn a transformation to the training distribution.

Q: What can we use to represent this complex transformation?
A: A neural network!

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014. Credit: cs231n, Stanford
Training GANs: Two-player game
Generator network: tries to fool the discriminator by generating real-looking images
Discriminator network: tries to distinguish between real and fake images

• Discriminator (θd) wants to maximize the objective such that D(x) is close to 1 (real)
  and D(G(z)) is close to 0 (fake)
• Generator (θg) wants to minimize the objective such that D(G(z)) is close to 1
  (the discriminator is fooled into thinking the generated G(z) is real)

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014. Fake and real images copyright Emily Denton et al. 2015. Credit: cs231n, Stanford
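
The minimax objective the two bullets describe, written out (standard form from Goodfellow et al., 2014; the equation was a figure on the original slide):

```latex
\min_{\theta_g} \max_{\theta_d} \;
\mathbb{E}_{x \sim p_{\mathrm{data}}} \left[ \log D_{\theta_d}(x) \right]
+ \mathbb{E}_{z \sim p(z)} \left[ \log \left( 1 - D_{\theta_d}\!\left(G_{\theta_g}(z)\right) \right) \right]
```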
Training GANs: Two-player game
In practice, optimizing this generator objective does not work well:
when a sample is obviously fake, log(1 - D(G(z))) is nearly flat, so
the generator gets little gradient exactly where it needs it most.

Instead of minimizing the likelihood of the discriminator being correct,
maximize the likelihood of the discriminator being wrong.
Same objective of fooling the discriminator, but now a higher
gradient signal for bad samples => works much better!
Standard in practice.

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014. Credit: cs231n, Stanford
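
Concretely, the generator update switches from the minimax term to the non-saturating form (standard notation; these equations were figures on the original slides):

```latex
\text{instead of}\quad
\min_{\theta_g} \; \mathbb{E}_{z \sim p(z)} \left[ \log \left( 1 - D_{\theta_d}\!\left(G_{\theta_g}(z)\right) \right) \right],
\qquad \text{use}\quad
\max_{\theta_g} \; \mathbb{E}_{z \sim p(z)} \left[ \log D_{\theta_d}\!\left(G_{\theta_g}(z)\right) \right]
```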
Training GANs: Two-player game
GAN training algorithm: alternate between k steps of updating
the discriminator and one step of updating the generator.

Some find k = 1 more stable, others use k > 1; there is no
best rule.

Recent work (e.g. Wasserstein GAN) alleviates this problem:
better stability!

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014. Credit: cs231n, Stanford
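
A minimal PyTorch sketch of this alternating loop with the non-saturating generator loss (the networks G and D, batch shapes, and hyperparameters are assumptions for illustration):

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # log loss on raw discriminator scores

def gan_step(G, D, opt_g, opt_d, real, z_dim=100, k=1):
    """One GAN iteration: k discriminator updates, then one generator update.
    Assumes D maps a batch of images to (batch, 1) logits and G maps
    (batch, z_dim) noise to images."""
    b = real.size(0)
    for _ in range(k):
        # Discriminator: push D(x) -> 1 (real) and D(G(z)) -> 0 (fake)
        fake = G(torch.randn(b, z_dim)).detach()   # no gradient into G here
        d_loss = bce(D(real), torch.ones(b, 1)) + \
                 bce(D(fake), torch.zeros(b, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator (non-saturating loss): push D(G(z)) -> 1, i.e. fool D
    g_loss = bce(D(G(torch.randn(b, z_dim))), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```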
Training GANs: Two-player game
After training, use the generator network to generate new images

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014. Fake and real images copyright Emily Denton et al. 2015. Credit: cs231n, Stanford
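
Sampling is then just a forward pass through the trained generator (continuing the sketch above, with the same assumed shapes):

```python
G.eval()
with torch.no_grad():
    samples = G(torch.randn(64, 100))  # 64 new images from random noise
```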
Generative Adversarial Nets
Generated samples [MNIST database, Toronto Face Database (TFD)],
with the nearest neighbor from the training set shown for comparison

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014


Generative Adversarial Nets
Generated samples [CIFAR-10 database], from two models: a fully
connected model, and a convolutional discriminator with a
“deconvolutional” generator. Nearest neighbor from the training
set shown for comparison.

Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014
DCGAN
Deep Convolutional Generative Adversarial Nets
❖ Generator is an upsampling network with fractionally-strided convolutions.
❖ Discriminator is a convolutional network.

Radford et al., “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016
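
A sketch of a DCGAN-style generator (channel widths follow the paper’s 64x64 configuration, but treat the exact sizes as assumptions):

```python
import torch
import torch.nn as nn

def dcgan_generator(z_dim=100, ngf=64):
    """DCGAN-style generator: project z to 4x4, then repeatedly double the
    spatial size with fractionally-strided (transposed) convolutions."""
    return nn.Sequential(
        nn.ConvTranspose2d(z_dim, ngf * 8, 4, 1, 0, bias=False),   # 1x1 -> 4x4
        nn.BatchNorm2d(ngf * 8), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False), # -> 8x8
        nn.BatchNorm2d(ngf * 4), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False), # -> 16x16
        nn.BatchNorm2d(ngf * 2), nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),     # -> 32x32
        nn.BatchNorm2d(ngf), nn.ReLU(True),
        nn.ConvTranspose2d(ngf, 3, 4, 2, 1, bias=False),           # -> 64x64
        nn.Tanh(),                                 # image pixels in [-1, 1]
    )

G = dcgan_generator()
imgs = G(torch.randn(16, 100, 1, 1))  # noise enters as a 1x1 spatial map
```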
DCGAN
Deep Convolutional Generative Adversarial Nets

Generated bedrooms after one training pass through the LSUN dataset.
Amazing!
Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016
Image-to-Image Translation

• Conditional GAN [1]
• Cycle-Consistent Adversarial Network [2]
• Dual GAN [3]

1. Mirza, Mehdi, and Simon Osindero. “Conditional Generative Adversarial Nets.” arXiv preprint arXiv:1411.1784 (2014).
2. Zhu, Jun-Yan, et al. “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks.” CVPR 2017.
3. Yi, Zili, et al. “DualGAN: Unsupervised Dual Learning for Image-to-Image Translation.” CVPR 2017.

Slide Credit: Kishan Babu, PhD Student, IIIT Sri City


Image-to-Image Translation with Conditional
Adversarial Networks

Slide Credit: Kishan Babu, PhD Student, IIIT Sri City. Isola et al. 2017
Unpaired Image-to-Image Translation Using Cycle-Consistent
Adversarial Networks

Slide Credit: Kishan Babu, PhD Student, IIIT Sri City. Jun-Yan Zhu et al. 2017
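
The core idea CycleGAN adds, in equation form (standard notation from Zhu et al. 2017: G: X -> Y and F: Y -> X are the two generators):

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F) =
\mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \lVert F(G(x)) - x \rVert_1 \right]
+ \mathbb{E}_{y \sim p_{\mathrm{data}}(y)} \left[ \lVert G(F(y)) - y \rVert_1 \right]
```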
“The GAN Zoo”

And many more ...

https://github.com/hindupuravinash/the-gan-zoo
GANs: Things to Remember
Take a game-theoretic approach: learn to generate from the training
distribution through a 2-player game

Pros:
- Beautiful, state-of-the-art samples!

Cons:
- Trickier / more unstable to train
- Can’t solve inference queries such as p(x), p(z|x)

Active areas of research:
- Better loss functions, more stable training (Wasserstein GAN, LSGAN, many others)
- Conditional GANs, GANs for all kinds of applications