

Generative Adversarial Network (GAN) Architecture


1. What is GAN?

The idea behind a GAN is to sample from a simple, tractable distribution (for example, z ∼ N(0, 1)) and then learn a complex transformation that maps this simple distribution to the training
distribution (real-world data).

Diagram:

Simple Distribution (z) ---> Complex Transformation ---> Generated Sample (e.g., images)
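A minimal sketch of this idea (assuming PyTorch; the 100-dimensional noise and the layer sizes are illustrative assumptions, not from the notes):

```python
import torch
import torch.nn as nn

z = torch.randn(16, 100)             # 16 noise vectors, each z ~ N(0, 1), 100-dim
transform = nn.Sequential(           # stands in for the learned "complex transformation"
    nn.Linear(100, 256), nn.ReLU(),
    nn.Linear(256, 784), nn.Tanh(),  # e.g. 28x28 images flattened to 784 values
)
fake_images = transform(z)           # shape (16, 784): samples in data space
```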

2. Parts of a GAN:

There are two parts:

1. Generator (G)

2. Discriminator (D)

3. Job of the Generator:

The generator's job is to produce images that look so natural that the discriminator gets
confused and thinks that these generated images are real (i.e., from the true data distribution).

4. Job of the Discriminator:

The discriminator's task is to get better and better at distinguishing between true images and
fake (generated) images.

Diagram for basic architecture:

                Real or Fake?
              [Discriminator]
               /           \
              /             \
          [G(z)]       [Real Images]
             |
         Generator
             |
    Random Noise z ~ N(0,1)

5. Formal Definitions:
Let:
Gϕ be the generator (with parameters ϕ).

Dθ be the discriminator (with parameters θ).


6. Generator:

The generator is a neural network that takes a noise vector z ∼ N(0, 1) as input and produces an output Gϕ(z).

7. Discriminator:

The discriminator is a neural network that takes either a real image x or a generated image Gϕ(z) as input and classifies it as real or fake.

8. Output of Discriminator:

Given a generated image Gϕ(z), the discriminator assigns it a score Dθ(Gϕ(z)).

This score is between 0 and 1, representing the probability that the image is real (close to 1) or fake
(close to 0).
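To make Gϕ and Dθ concrete, here is a minimal sketch of both networks (assuming PyTorch; the fully connected layers, their sizes, and the 784-dimensional images are illustrative assumptions, not from the notes):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),   # output G_phi(z) in data space
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),      # score in (0, 1): probability "real"
        )

    def forward(self, x):
        return self.net(x)

G, D = Generator(), Discriminator()
z = torch.randn(8, 100)
score = D(G(z))     # D_theta(G_phi(z)): shape (8, 1), values between 0 and 1
```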

9. Objective for Generator:


For a given z, the generator wants to:

Maximize log(Dθ(Gϕ(z)))
(i.e., make the discriminator believe the fake image is real)

Or equivalently, minimize log(1 − Dθ(Gϕ(z))).

10. Objective for Single z:

If z is discrete and drawn from a uniform distribution:

min_ϕ (1/N) ∑_{i=1}^{N} log(1 − Dθ(Gϕ(z_i)))

where:

N = number of noise samples.

z_i = the i-th noise sample.

11. Objective for Continuous z:


If z is continuous and not uniform:

min_ϕ E_{z∼p(z)}[log(1 − Dθ(Gϕ(z)))]

where:

E denotes the expectation (the average over all possible z).

p(z) is the probability distribution of z (typically N(0, 1)).
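In practice this expectation is estimated with a finite batch of noise samples, which is exactly the Monte Carlo average from point 10. A minimal sketch, assuming the G and D modules from the earlier sketch:

```python
import torch

N = 64
z = torch.randn(N, 100)                      # z_i ~ N(0, 1), i = 1..N
gen_loss = torch.log(1.0 - D(G(z))).mean()   # (1/N) * sum_i log(1 - D_theta(G_phi(z_i)))
```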

12. Objective for Discriminator:


The discriminator tries to assign high scores to real images and low scores to fake images.
It tries to maximize the objective:

max_θ E_{x∼p_data}[log Dθ(x)] + E_{z∼p(z)}[log(1 − Dθ(Gϕ(z)))]

where:

p_data is the real data distribution.


x is a real sample.
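Maximizing this objective over θ is the same as minimizing a binary cross-entropy loss with label 1 for real images and label 0 for generated ones. A sketch of that equivalence, again assuming the G and D modules above (x_real is a hypothetical batch of real images):

```python
import torch
import torch.nn.functional as F

x_real = torch.rand(64, 784)        # placeholder for a batch of real images
z = torch.randn(64, 100)

d_real = D(x_real)                  # D_theta(x)
d_fake = D(G(z).detach())           # D_theta(G_phi(z)), generator held fixed

# Minimizing this BCE loss maximizes E[log D(x)] + E[log(1 - D(G(z)))]
disc_loss = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
             + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
```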

13. Combined Objective (Minimax Game):


When combining generator and discriminator objectives, we get a minimax problem:

min_ϕ max_θ E_{x∼p_data}[log Dθ(x)] + E_{z∼p(z)}[log(1 − Dθ(Gϕ(z)))]

Meaning:

Discriminator Dθ wants to maximize the expression (do better at distinguishing real and fake).

Generator Gϕ wants to minimize the expression (fool the discriminator).


14. How to Solve This:


Step 1: Gradient ascent on the discriminator (maximize the objective with respect to θ):

max_θ E_{x∼p_data}[log Dθ(x)] + E_{z∼p(z)}[log(1 − Dθ(Gϕ(z)))]

Step 2: Gradient descent on the generator (minimize the objective with respect to ϕ):

min_ϕ E_{z∼p(z)}[log(1 − Dθ(Gϕ(z)))]

Thus, training alternates between updating the discriminator and the generator.
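A minimal sketch of this alternating scheme (assuming PyTorch and the G and D modules from the earlier sketch; real_loader is a hypothetical DataLoader of real image batches, and the small eps is added only for numerical safety inside the logs):

```python
import torch

eps = 1e-8
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)

for x_real in real_loader:
    z = torch.randn(x_real.size(0), 100)

    # Step 1: gradient ascent on theta (minimize the negated objective = ascend it)
    d_obj = (torch.log(D(x_real) + eps).mean()
             + torch.log(1 - D(G(z).detach()) + eps).mean())
    opt_D.zero_grad()
    (-d_obj).backward()
    opt_D.step()

    # Step 2: gradient descent on phi -> minimize E[log(1 - D(G(z)))]
    g_loss = torch.log(1 - D(G(z)) + eps).mean()
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
```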

6.3 GAN Challenges:

1. Oscillation Loss:
The loss values of the generator and discriminator can fluctuate wildly during training, making training unstable and hard to converge.

2. Mode Collapse:
The generator produces only a limited variety of outputs (e.g., very similar images) instead of diverse samples, which leads to poor diversity in the generated data.

3. Uninformative Loss:
Sometimes the loss signal becomes meaningless. If the discriminator is too weak or the generator is too good, the model does not receive useful feedback to improve.

4. Hyperparameters:
GANs are very sensitive to the learning rate, batch size, optimizer settings, etc. Finding the right combination of hyperparameters can be challenging and time-consuming.

6.4 Tackling GAN Challenges:

Data Augmentation:
Using techniques such as flipping and rotating images to artificially expand the dataset.

Batch Normalization:
Helps stabilize training by normalizing the outputs of intermediate layers.

Alternative Loss Functions:
Using different losses, such as the Wasserstein loss (used in WGAN), to provide more stable gradients.

Regularization:
Adding small penalties to the loss function (like L2 regularization) to avoid overfitting and improve stability.

Hyperparameter Tuning:
Carefully experimenting with learning rates, optimizers (like Adam), etc.

Using WGAN (Wasserstein GAN):
Instead of the classical GAN loss, WGANs use a different distance metric that leads to smoother training and less mode collapse (see the sketch below).
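A minimal sketch of the WGAN losses (assuming PyTorch and the G module from the earlier sketch; the critic architecture, placeholder data, and the 0.01 clipping range follow the original WGAN recipe but are illustrative here):

```python
import torch
import torch.nn as nn

critic = nn.Sequential(                  # like D, but no final sigmoid: unbounded score
    nn.Linear(784, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

z = torch.randn(64, 100)
x_real = torch.rand(64, 784)             # placeholder for a batch of real images

critic_loss = -(critic(x_real).mean() - critic(G(z).detach()).mean())  # widen the score gap
gen_loss = -critic(G(z)).mean()                                        # raise scores of fakes

for p in critic.parameters():            # weight clipping keeps the critic roughly 1-Lipschitz
    p.data.clamp_(-0.01, 0.01)
```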


