6 GAN
1. GAN – The idea behind a GAN is to sample from a simple, tractable distribution (for example, z ∼ N(0, 1)) and then learn a complex transformation from this simple distribution to the training distribution (real-world data).
Diagram:
```
Simple Distribution (z) ---> Complex Transformation ---> Generated Sample (e.g., images)
```
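As a rough illustration of this idea, here is a minimal sketch assuming PyTorch; the noise dimension of 100 and the layer sizes are arbitrary illustrative choices, not values from the notes:

```python
import torch
import torch.nn as nn

# Sample from a simple, tractable distribution: z ~ N(0, 1)
z = torch.randn(16, 100)  # 16 noise vectors of dimension 100

# Stand-in for the learned "complex transformation" (an untrained MLP here)
transform = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784))

samples = transform(z)    # after training, these would resemble real data (e.g., flattened 28x28 images)
print(samples.shape)      # torch.Size([16, 784])
```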
2. Parts of a GAN:
1. Generator (G)
2. Discriminator (D)
The generator's job is to produce images that look so natural that the discriminator gets
confused and thinks that these generated images are real (i.e., from the true data distribution).
The discriminator's task is to get better and better at distinguishing between true images and
fake (generated) images.
mathematica
` Real or Fake?
[Discriminator]
/ \
/ \
[G(z)] [Real Images]
|
Generator
|
Random Noise z ~ N(0,1)
`
5. Formal Definitions:
Let:
Gϕ be the generator (with parameters ϕ).
Dθ be the discriminator (with parameters θ).
6. Generator:
The generator is a neural network that takes a noise vector z ∼ N(0, 1) as input and produces an output Gϕ(z).
7. Discriminator:
The discriminator is a neural network that takes either a real image x or a generated image Gϕ(z) as input and outputs a score for how likely the input is to be real.
8. Output of Discriminator:
Given a generated image Gϕ(z), the discriminator assigns a score Dθ(Gϕ(z)).
This score is between 0 and 1, representing the probability that the image is real (close to 1) or fake
(close to 0).
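A minimal sketch of these two networks in PyTorch (the MLP architecture, the flattened 28x28 image size of 784, and the noise dimension of 100 are illustrative assumptions, not part of the notes):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """G_phi: maps a noise vector z ~ N(0, 1) to a flattened image."""
    def __init__(self, z_dim=100, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),   # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """D_theta: maps an image (real or generated) to a score in (0, 1)."""
    def __init__(self, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),      # probability that the input is real
        )

    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
z = torch.randn(16, 100)
scores = D(G(z))          # Dθ(Gϕ(z)): one score in (0, 1) per generated image
print(scores.shape)       # torch.Size([16, 1])
```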
The generator is trained so that the discriminator classifies its samples as real, which gives the (sample-based) generator objective:

$$\min_{\phi} \; \frac{1}{N} \sum_{i=1}^{N} \log\left(1 - D_\theta\left(G_\phi(z_i)\right)\right)$$

where z_1, …, z_N are noise samples drawn from p(z) and N is the number of samples.
Equivalently, in expectation form:

$$\min_{\phi} \; \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D_\theta(G_\phi(z))\right)\right]$$

Combining both players gives the full GAN minimax objective:

$$\min_{\phi} \max_{\theta} \; \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D_\theta(x)\right] + \mathbb{E}_{z \sim p(z)}\left[\log\left(1 - D_\theta(G_\phi(z))\right)\right]$$

where:
x is a real sample drawn from the data distribution p_data(x).
z is a noise sample drawn from the prior p(z).
ϕ are the generator's parameters and θ are the discriminator's parameters.
Meaning:
The discriminator Dθ wants to maximize this expression (get better at telling real from fake), while the generator Gϕ wants to minimize it (fool the discriminator).
Thus, training alternates between updating the discriminator and updating the generator.
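One way these alternating updates could look in code (a sketch assuming the networks above and externally created optimizers such as torch.optim.Adam; it follows the saturating generator loss exactly as written, although in practice the non-saturating form −log Dθ(Gϕ(z)) is often preferred):

```python
import torch

eps = 1e-8  # numerical safety inside the logarithms

def train_step(G, D, opt_G, opt_D, real_batch, z_dim=100):
    """One alternating GAN update: first the discriminator, then the generator."""
    batch_size = real_batch.size(0)

    # --- Discriminator step: maximize log D(x) + log(1 - D(G(z))) ---
    z = torch.randn(batch_size, z_dim)
    fake = G(z).detach()                      # do not backpropagate into G here
    d_loss = -(torch.log(D(real_batch) + eps).mean()
               + torch.log(1 - D(fake) + eps).mean())
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # --- Generator step: minimize log(1 - D(G(z))) ---
    z = torch.randn(batch_size, z_dim)
    g_loss = torch.log(1 - D(G(z)) + eps).mean()
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

    return d_loss.item(), g_loss.item()
```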
Challenges in Training GANs:
1. Oscillating Loss:
The loss values of the generator and discriminator can fluctuate wildly during training, making training unstable and hard to converge.
2. Mode Collapse:
The generator produces only a limited variety of outputs (e.g., very similar images) instead of generating diverse samples.
3. Generator/Discriminator Imbalance:
If the discriminator is too weak or the generator is too good, the generator won't receive useful feedback to improve.
4. Hyperparameters:
GANs are very sensitive to learning rate, batch size, optimizer settings, etc.
Finding the right combination of hyperparameters can be challenging and time-consuming.
Ways to Improve GAN Training:
Batch Normalization:
Helps stabilize training by normalizing the outputs of intermediate layers.
Alternative Loss Functions:
Using different losses, such as the Wasserstein loss (used in WGAN), to provide more stable gradients; see the sketch after this list.
Regularization:
Adding small penalties in the loss function (like L2 regularization) to avoid overfitting and
improve stability.
Hyperparameter Tuning:
Carefully experimenting with learning rates, optimizers (like Adam), etc.
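As a sketch of the Wasserstein-loss alternative mentioned above (assuming a critic network, i.e. a discriminator without a final sigmoid; the clipping constant 0.01 comes from the original WGAN paper, and the helper names here are my own):

```python
import torch

def wgan_losses(critic, G, real_batch, z_dim=100):
    """Wasserstein critic and generator losses (WGAN)."""
    z = torch.randn(real_batch.size(0), z_dim)
    fake = G(z)
    # Critic: maximize E[critic(real)] - E[critic(fake)]  -> minimize the negative
    critic_loss = -(critic(real_batch).mean() - critic(fake.detach()).mean())
    # Generator: maximize E[critic(fake)]                  -> minimize the negative
    gen_loss = -critic(fake).mean()
    return critic_loss, gen_loss

def clip_critic_weights(critic, c=0.01):
    """Original WGAN way of (crudely) enforcing the Lipschitz constraint."""
    for p in critic.parameters():
        p.data.clamp_(-c, c)
```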