CISC 867 Deep Learning: 15. Generative Adversarial Networks
Synthetic Data
Uses of Synthetic Data
[Figure: input image and estimated hand pose]
Hand pose:
• Hand shape
• 3D hand orientation
Realistic Scenes in Games and Movies
Generative Adversarial Networks
Generator and Discriminator
How It (Hopefully) Works
Problems With Convergence
Needed Math
Expectation
Kullback–Leibler Divergence
Jensen–Shannon Divergence
$$D_{JS}(p \,\|\, q) = \frac{1}{2} D_{KL}\!\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2} D_{KL}\!\left(q \,\Big\|\, \frac{p+q}{2}\right)$$
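These divergences are easy to check numerically on discrete distributions; a minimal NumPy sketch (function names `kl` and `jsd` are illustrative, not from the lecture):

```python
import numpy as np

def kl(p, q):
    # D_KL(p || q) = sum_x p(x) * log(p(x) / q(x))
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

def jsd(p, q):
    # D_JS(p || q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.1, 0.4, 0.5]
q = [0.3, 0.3, 0.4]
print(kl(p, q), kl(q, p))    # KL is asymmetric: the two values differ
print(jsd(p, q), jsd(q, p))  # JSD is symmetric: the two values agree
```

Note that JSD is also bounded (by log 2 when using natural logs), unlike KL.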
JS divergence is symmetric
Generative Adversarial Networks
Setup: Assume we have data x_i drawn from distribution p_data(x). We want to sample from p_data.
[Diagram: sample z from p_z → Generator Network G → generated sample]
Train generator network G to convert z into fake data x sampled from p_G.
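The generator step can be written as a tiny MLP; a minimal NumPy sketch (all layer sizes, weights, and names are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generator: a one-hidden-layer MLP mapping latent z to a sample x.
# The dimensions below are arbitrary choices for this sketch.
Z_DIM, H, X_DIM = 8, 32, 2
W1 = rng.normal(0, 0.1, (Z_DIM, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.1, (H, X_DIM)); b2 = np.zeros(X_DIM)

def G(z):
    h = np.tanh(z @ W1 + b1)        # hidden layer
    return h @ W2 + b2              # fake sample x ~ p_G

z = rng.normal(size=(5, Z_DIM))     # z sampled from the prior p_z = N(0, I)
x_fake = G(z)
print(x_fake.shape)                 # five generated 2-D samples
```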
Generative Adversarial Networks:
Training Objective
Jointly train generator G and discriminator D with a minimax game
Discriminator wants D(x) = 1 for real data.
Discriminator wants D(x) = 0 for fake data.
$$\min_G \max_D \; E_{x \sim p_{data}}[\log D(x)] + E_{z \sim p(z)}[\log(1 - D(G(z)))]$$
At the start of training, the generator is very bad and the discriminator can easily tell real from fake, so D(G(z)) is close to 0.
Problem: Vanishing gradients for G.
Solution: Right now G is trained to minimize log(1 - D(G(z))). Instead, train G to minimize -log(D(G(z))). Then G gets strong gradients at the start of training!
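The two generator losses can be compared numerically near D(G(z)) = 0; a minimal NumPy sketch:

```python
import numpy as np

# Gradient magnitudes of the two generator losses w.r.t. D(G(z)),
# evaluated where a fresh generator starts: D(G(z)) close to 0.
D = np.array([1e-3, 1e-2, 0.1, 0.5])

grad_saturating = np.abs(-1.0 / (1.0 - D))   # |d/dD log(1 - D)|
grad_nonsaturating = np.abs(-1.0 / D)        # |d/dD (-log D)|

for d, gs, gn in zip(D, grad_saturating, grad_nonsaturating):
    print(f"D={d}: |d/dD log(1-D)| = {gs:.3f}, |d/dD -log(D)| = {gn:.1f}")
# Near D = 0 the original loss gives gradients of magnitude about 1,
# while the non-saturating loss gives gradients of magnitude 1/D.
```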
Generative Adversarial Networks: Optimality
Jointly train generator G and discriminator D with a minimax game
For fixed x, the inner maximization is over f(y) = a log y + b log(1 - y), with a = p_data(x), b = p_G(x):
$$f'(y) = \frac{a}{y} - \frac{b}{1 - y}$$
Setting f'(y) = 0 gives y = a / (a + b), so the optimal discriminator is
$$D_G^*(x) = \frac{p_{data}(x)}{p_{data}(x) + p_G(x)}$$
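The claim that y = a/(a + b) maximizes a log y + b log(1 - y) can be checked numerically; a small NumPy sketch (the values of a and b are arbitrary):

```python
import numpy as np

# f(y) = a*log(y) + b*log(1-y) is the pointwise discriminator objective,
# with a = p_data(x) and b = p_G(x). Its maximizer should be y* = a / (a + b).
a, b = 0.7, 0.2                          # arbitrary positive densities for this check
y = np.linspace(1e-4, 1 - 1e-4, 100_000)
f = a * np.log(y) + b * np.log(1 - y)
y_star = y[np.argmax(f)]
print(y_star, a / (a + b))               # numeric argmax matches the analytic optimum
```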
Substituting the optimal discriminator into the objective (definition of expectation: rewrite the expectation over z as an expectation over x ~ p_G):
$$\min_G \; E_{x \sim p_{data}}\!\left[\log \frac{p_{data}(x)}{p_{data}(x) + p_G(x)}\right] + E_{x \sim p_G}\!\left[\log \frac{p_G(x)}{p_{data}(x) + p_G(x)}\right]$$
(Multiply by a constant, 2/2:)
$$= \min_G \; E_{x \sim p_{data}}\!\left[\log \frac{2\,p_{data}(x)}{2\,(p_{data}(x) + p_G(x))}\right] + E_{x \sim p_G}\!\left[\log \frac{2\,p_G(x)}{2\,(p_{data}(x) + p_G(x))}\right]$$
$$= \min_G \; E_{x \sim p_{data}}\!\left[\log \frac{2 \cdot p_{data}(x)}{p_{data}(x) + p_G(x)}\right] + E_{x \sim p_G}\!\left[\log \frac{2 \cdot p_G(x)}{p_{data}(x) + p_G(x)}\right] - \log 4$$
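The value of the objective at the optimal discriminator equals 2 JSD(p_data, p_G) - log 4, which can be verified numerically on discrete distributions; a minimal NumPy sketch (function names illustrative):

```python
import numpy as np

def value_at_optimal_D(p_data, p_g):
    # E_{x~p_data}[log D*] + E_{x~p_G}[log(1 - D*)], D* = p_data / (p_data + p_g)
    d_star = p_data / (p_data + p_g)
    return float(np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1 - d_star)))

def jsd(p, q):
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p_data = np.array([0.2, 0.5, 0.3])
p_g = np.array([0.4, 0.4, 0.2])
print(value_at_optimal_D(p_data, p_g))       # equals 2*JSD(p_data, p_g) - log 4
print(2 * jsd(p_data, p_g) - np.log(4))
print(value_at_optimal_D(p_data, p_data))    # -log 4 when p_G = p_data
```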
Kullback-Leibler Divergence:
$$KL(p, q) = E_{x \sim p}\!\left[\log \frac{p(x)}{q(x)}\right]$$
Jensen-Shannon Divergence:
$$JSD(p, q) = \frac{1}{2} KL\!\left(p, \frac{p+q}{2}\right) + \frac{1}{2} KL\!\left(q, \frac{p+q}{2}\right)$$
The minimized expression is therefore 2 JSD(p_data, p_G) - log 4. Since JSD(p, q) >= 0 with equality iff p = q, the global optimum is p_G = p_data, where the objective value is -log 4.
Generative Adversarial Networks: Results
Generated samples
Generative Adversarial Networks: DC-GAN
Generator
Radford et al, “Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks”, ICLR 2016
Samples from the model look much better!
Generative Adversarial Networks: Interpolation
Interpolating between points in latent z space.
Generative Adversarial Networks: Vector Math
[Figure: samples from the model: smiling woman, neutral woman, neutral man]
Average z vectors, then do arithmetic in latent space.
[Figure: man with glasses - man w/o glasses + woman w/o glasses = woman with glasses]
Radford et al, ICLR 2016
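The z-vector arithmetic can be sketched directly; the latent codes below are random stand-ins for illustration (a real experiment would use codes whose decoded images have the named attributes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative latent codes: several z vectors per attribute group,
# e.g. images showing "man with glasses", "man w/o glasses", "woman w/o glasses".
z_man_glasses = rng.normal(size=(5, 16))
z_man_no_glasses = rng.normal(size=(5, 16))
z_woman_no_glasses = rng.normal(size=(5, 16))

# Average the z vectors within each group, then do arithmetic on the averages:
# "man with glasses" - "man w/o glasses" + "woman w/o glasses"
z_result = (z_man_glasses.mean(axis=0)
            - z_man_no_glasses.mean(axis=0)
            + z_woman_no_glasses.mean(axis=0))
print(z_result.shape)   # decoding G(z_result) should yield "woman with glasses"
```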
2017 to present: Explosion of GANs
https://github.com/hindupuravinash/the-gan-zoo
GAN Improvements: Improved Loss Functions
• Wasserstein GAN (WGAN)
• WGAN with Gradient Penalty (WGAN-GP)
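The Wasserstein objective can be sketched as plain loss functions; a minimal NumPy sketch (the critic scores are illustrative, and the Lipschitz constraint / gradient penalty that WGAN and WGAN-GP require is omitted):

```python
import numpy as np

# WGAN replaces the log-loss with a Wasserstein estimate: the critic f maximizes
# E[f(x_real)] - E[f(x_fake)], i.e. minimizes the negative of that difference,
# while the generator maximizes E[f(x_fake)].
def critic_loss(scores_real, scores_fake):
    return -(np.mean(scores_real) - np.mean(scores_fake))

def generator_loss(scores_fake):
    return -np.mean(scores_fake)

scores_real = np.array([1.2, 0.8, 1.0])    # illustrative critic outputs on real data
scores_fake = np.array([-0.5, -0.9, -0.2]) # illustrative critic outputs on fakes
print(critic_loss(scores_real, scores_fake))  # more negative = better separation
print(generator_loss(scores_fake))
```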