The document provides an overview of Generative Adversarial Networks (GANs), detailing their structure, training techniques, and various applications such as image generation and style transfer. It discusses the roles of the generator and discriminator, different types of GANs like Vanilla GAN, Progressive GAN, and Conditional GAN, as well as challenges faced in training them. Additionally, it highlights advancements in style transfer using architectures like CycleGAN and pix2pix GAN.
Gen AI Unit 3
Unit 3: Generative Adversarial Networks
Generative Adversarial Networks – Vanilla GAN – Progressive GAN – Style transfer and image transformation – Image Generation with GANs – Style Transfer with GANs

Generative Adversarial Networks
• Generative Adversarial Networks (GANs) were developed in 2014 by Ian Goodfellow and his teammates.
• A GAN is an approach to generative modeling that generates a new set of data, based on training data, that looks like the training data.
• GANs have two main blocks (two neural networks) which compete with each other and are thereby able to capture, copy, and analyze the variations in a dataset.
To understand the term GAN, let's break it into three parts:
• Generative – to learn a generative model, which describes how data is generated in terms of a probabilistic model.
• Adversarial – the training of the model is done in an adversarial setting.
• Networks – deep neural networks are used for training.

Generator
• The generator network takes random input (typically noise) and generates samples, such as images, text, or audio, that resemble the training data it was trained on.
• The goal of the generator is to produce samples that are indistinguishable from real data.

Discriminator
• The discriminator network, on the other hand, tries to distinguish between real and generated samples.
• It is trained with real samples from the training data and generated samples from the generator.
• The discriminator's objective is to correctly classify real data as real and generated data as fake.

Adversarial training
• The training process is an adversarial game between the generator and the discriminator.
• The generator aims to produce samples that fool the discriminator, while the discriminator tries to improve its ability to distinguish between real and generated data. This adversarial training pushes both networks to improve over time.
• As training progresses, the generator becomes more adept at producing realistic samples, while the discriminator becomes more skilled at differentiating between real and generated data.
• Ideally, this process converges to a point where the generator produces high-quality samples that are difficult for the discriminator to distinguish from real data.

Components of Generative Adversarial Networks (GANs)
1) Discriminator – a supervised approach: a simple classifier that predicts whether data is real or fake. It is trained on real data and provides feedback to the generator.
2) Generator – an unsupervised learning approach: a neural network with hidden layers, activations, and a loss function, which generates fake data modeled on the original (real) data. Its aim is to generate fakes that, guided by the discriminator's feedback, the discriminator can no longer recognize as fake. Once the generator consistently fools the discriminator, training stops and we can say that a generalized GAN model has been created. A minimal sketch of the two networks follows.
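The following is a minimal PyTorch sketch of these two blocks (not from the original text; the layer widths, the 100-dimensional noise input, and the 784-dimensional flattened-image output are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Minimal illustrative generator: maps a random noise vector to a
# flattened 28x28 image (784 values). All sizes are assumptions.
class Generator(nn.Module):
    def __init__(self, noise_dim=100, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),  # outputs scaled to [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

# Minimal illustrative discriminator: a binary classifier that scores
# an input image as real (close to 1) or fake (close to 0).
class Discriminator(nn.Module):
    def __init__(self, img_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)
```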
Training Techniques of GAN
Define the Problem
• Begin by defining the problem you aim to solve with GANs, whether it is generating audio, poems, text, images, or another type of content.
Select GAN Architecture
• Choose the specific GAN architecture that suits the problem statement and requirements.
Train Discriminator on Real Dataset
• Initially, train the discriminator on real dataset samples (without noise), classifying both real and fake data, and update the discriminator weights based on its performance and misclassifications.
Train Generator
• Provide random noise inputs to the generator to produce fake outputs. During generator training, the discriminator is kept idle. Over many epochs, the generator learns through backpropagation to transform noise into meaningful data.
Train Discriminator on Fake Data
• The generated fake samples are passed to the discriminator for classification. The discriminator provides feedback to the generator, helping it improve its output.
Train Generator with Discriminator Feedback
• The generator is trained based on the feedback from the discriminator, aiming to create more convincing fake samples. This iterative process continues until the generator's output successfully deceives the discriminator. A minimal sketch of this alternating loop is shown below.
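Below is a minimal, illustrative PyTorch version of this alternating loop, reusing the Generator and Discriminator sketched earlier. The optimizer settings, epoch count, and the `real_loader` data source are assumptions, not part of the original text:

```python
import torch
import torch.nn as nn

# Assumes the illustrative Generator and Discriminator classes sketched
# earlier, plus a `real_loader` that yields batches of flattened real
# images scaled to [-1, 1]. All hyperparameters below are assumptions.
G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()
num_epochs = 50

for epoch in range(num_epochs):
    for real in real_loader:
        b = real.size(0)
        ones, zeros = torch.ones(b, 1), torch.zeros(b, 1)

        # Step 1: train the discriminator (real -> 1, fake -> 0).
        # detach() keeps the generator idle during this step.
        fake = G(torch.randn(b, 100)).detach()
        loss_d = bce(D(real), ones) + bce(D(fake), zeros)
        opt_d.zero_grad()
        loss_d.backward()
        opt_d.step()

        # Step 2: train the generator using the discriminator's
        # feedback: it is rewarded when D labels its fakes as real.
        fake = G(torch.randn(b, 100))
        loss_g = bce(D(fake), ones)
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()
```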
Applications of GAN
• Generate new data from available data – generating new samples from available samples that are not identical to the real ones.
• Generate realistic pictures of people who have never existed.
• GANs are not limited to images; they can generate text, articles, songs, poems, etc.
• Generate music with a cloned voice – given a voice sample, GANs can generate a similar voice clone.
• Text-to-image generation (e.g., ObjGAN, an object-driven GAN).
• Creation of anime characters in game development and animation production.
• Image-to-image translation – translating one image into another without changing the background of the source image. For example, a GAN can replace a dog with a cat.
• Low resolution to high resolution – given a low-resolution image or video, a GAN can produce a high-resolution version of the same.
• Prediction of the next frame in a video – trained on small frames of video, GANs can generate or predict the next frame.

Challenges of GAN
• The problem of stability between generator and discriminator: the discriminator should not be too strict, but lenient enough for the generator to learn.
• The problem of determining the positioning of objects: for example, where a picture should contain 3 horses, the generator may create 1 horse with 6 eyes.
• The problem of understanding global objects: GANs do not understand the global or holistic structure of a scene, which is similar to the problem of perspective. A GAN may therefore generate an image that is unrealistic or physically impossible.
• The problem of understanding perspective: GANs trained on 2-D images do not understand 3-D structure, so they fail to create images with consistent 3-D geometry.

Vanilla GAN
• Vanilla GANs are the simplest type of GANs. They consist of two neural networks, a generator and a discriminator, which are trained in a two-player minimax game.
• The generator learns to produce fake data that resembles real data, while the discriminator learns to distinguish between real and fake data.

Progressive GAN
• Training starts with a very small image; blocks of layers are then added incrementally, so that the output size of the generator and the input size of the discriminator grow until the desired image size is obtained.
• This approach has proven very effective at generating high-quality synthetic images that are highly realistic.
• It involves 4 major steps:
1) Progressive growing (of model and layers)
2) Minibatch std on the discriminator
3) Normalization with PixelNorm
4) Equalized learning rate
• During the training process, new blocks of convolutional layers are systematically added to both the generator model and the discriminator model.
• This incremental addition of convolutional layers allows the models to learn coarse-level detail effectively at the beginning and later learn ever finer detail, on both the generator and discriminator side.
• Adding a new block of layers involves a skip connection (as shown in the figure): the new block is connected either to the output of the generator or to the input of the discriminator, and its output is blended with the existing output or input layer using a weighting that controls the influence of the new block.
• For example, a generator that outputs a 16×16 image and a discriminator that takes a 16×16 pixel image can be grown to the size of 32×32. A minimal sketch of this weighted fade-in follows.
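The following sketch illustrates the fade-in weighting described above (the function name `fade_in` and the tensor shapes are illustrative assumptions; the blending weight grows from 0 to 1 as the new block is phased in):

```python
import torch
import torch.nn.functional as F

def fade_in(alpha, upsampled_old, new_block_out):
    """Weighted blend of the new block's output with the upsampled output
    of the existing stack. alpha ramps from 0.0 to 1.0 over training,
    gradually handing control to the new, higher-resolution block."""
    return alpha * new_block_out + (1.0 - alpha) * upsampled_old

# Illustrative shapes for growing a generator from 16x16 to 32x32 RGB:
old = F.interpolate(torch.randn(1, 3, 16, 16), scale_factor=2)  # 16 -> 32
new = torch.randn(1, 3, 32, 32)  # output of the newly added block
blended = fade_in(alpha=0.3, upsampled_old=old, new_block_out=new)
```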
Style transfer and image transformation

Image Generation with GANs
• The taxonomy of generative models
• The discriminator model
• The generator model
• Training GANs

The generator is learning to generate good enough fake samples, while the discriminator is working hard to discriminate between real and fake. More formally, this is termed the minimax game, where the value function V(G, D) is described as follows:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

• We can better understand the value function V(G, D) by separating out the objective function for each of the players. The following equations describe the individual objective functions:

J^{(D)} = -\tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] - \tfrac{1}{2}\,\mathbb{E}_{z}[\log(1 - D(G(z)))], \qquad J^{(G)} = -J^{(D)}

where J^{(D)} is the discriminator objective function in the classical sense, J^{(G)} is the generator objective, equal to the negative of the discriminator's, and p_data is the distribution of the training data.
• The objective functions help us to understand the aim of each of the players. If we assume both probability densities are non-zero everywhere, we can get the optimal value of D(x) as:

D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_{\text{model}}(x)}

• The simplest yet widely used way of training a GAN is as follows. Repeat the following steps N times, where N is the total number of iterations:
1. Repeat k times:
   • Sample a minibatch of size m from the generator: {z1, z2, …, zm} ~ p_model(z)
   • Sample a minibatch of size m from the actual data: {x1, x2, …, xm} ~ p_data(x)
   • Update the discriminator loss, J^{(D)}
2. Set the discriminator as non-trainable
3. Sample a minibatch of size m from the generator: {z1, z2, …, zm} ~ p_model(z)
4. Update the generator loss, J^{(G)}

Deep Convolutional GAN
• Let's start by preparing the discriminator model. CNN-based binary classifiers are simple models. One modification we make here is to use strides greater than 1 to down-sample the input between layers instead of using pooling layers. This helps provide better stability for training the generator model. We also rely on batch normalization and Leaky ReLU for the same purpose (although these were not used in the original GAN paper). Another important aspect of this discriminator (compared to the vanilla GAN discriminator) is the absence of fully connected layers. A minimal sketch follows.
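The sketch below is an illustrative DCGAN-style discriminator; the channel counts and the 64×64 input size are assumptions, while the defining choices (strided convolutions instead of pooling, batch normalization, Leaky ReLU, no fully connected layers) follow the description above:

```python
import torch.nn as nn

# Minimal sketch of a DCGAN-style discriminator for 64x64 RGB inputs.
# Downsampling is done by stride-2 convolutions rather than pooling,
# and there are no fully connected layers.
dcgan_discriminator = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1),    # 64 -> 32
    nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),  # 32 -> 16
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2),
    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1), # 16 -> 8
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2),
    nn.Conv2d(256, 1, kernel_size=8),  # 8 -> 1: single real/fake score
    nn.Flatten(),
    nn.Sigmoid(),
)
```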
Vector arithmetic
• The ability to manipulate the latent vectors by addition, subtraction, and so on to generate meaningful output transformations is a powerful tool.
• The authors of the DCGAN paper showed that the representative z-space of the generator indeed obeys such a rich linear structure.
• Similar to vector arithmetic in the NLP domain, where word2vec generates a vector similar to "Queen" upon performing the manipulation "King" – "Man" + "Woman", DCGAN allows the same in the visual domain.

Conditional GAN
• CGANs work by training the generator model to generate fake samples conditioned on specific characteristics of the required output.
• The discriminator, on the other hand, needs to do some extra work. It needs to learn not only to differentiate between fake and real, but also to mark samples as fake if the generated sample and its conditioning characteristics do not match.
• We denote the conditioning input as y and transform the value function for the GAN minimax game as follows:

\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x \mid y)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z \mid y)))]

where log D(x|y) is the discriminator output for a real sample x conditioned on y and, similarly, log(1 - D(G(z|y))) is the discriminator output for a fake sample, G(z|y), conditioned on y. A minimal sketch of this conditioning follows.
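The following is an illustrative sketch of how the conditioning input y can be injected into the generator; the embedding approach and all sizes (10 classes, 100-dimensional noise, 784-dimensional output) are assumptions, not details from the text:

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Generator computing G(z|y): the label y is embedded and
    concatenated with the noise vector z before the first layer."""
    def __init__(self, noise_dim=100, num_classes=10, img_dim=784):
        super().__init__()
        self.embed = nn.Embedding(num_classes, num_classes)
        self.net = nn.Sequential(
            nn.Linear(noise_dim + num_classes, 256),
            nn.ReLU(),
            nn.Linear(256, img_dim),
            nn.Tanh(),
        )

    def forward(self, z, y):
        return self.net(torch.cat([z, self.embed(y)], dim=1))

z = torch.randn(16, 100)             # a batch of noise vectors
y = torch.randint(0, 10, (16,))      # the conditioning labels
fake = ConditionalGenerator()(z, y)  # one sample per (z, y) pair
```

The discriminator is conditioned in the same way, receiving y alongside the real or generated sample so it can reject samples that do not match their conditioning.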
Wasserstein GAN
• The main difference between typical GANs and W-GANs is that W-GANs treat the discriminator as a critic.
• Hence, instead of simply classifying input images as real or fake, the W-GAN discriminator (or critic) generates a score to inform the generator about the realness or fakeness of the input image.
• The maximum likelihood game we discussed in the initial sections of the chapter explained the task as one where we try to minimize the divergence between p_g and p_data using KL divergence, that is:

\Theta^* = \arg\min_{\Theta} D_{\text{KL}}\left(p_{\text{data}}(x) \parallel p_g(x)\right)

• Mathematically, the Wasserstein distance can be stated as the infimum (or greatest lower bound, denoted as inf) over all transport plans, denoted as W(source, destination), that is:

W(p_{\text{data}}, p_g) = \inf_{\gamma \in \Pi(p_{\text{data}},\, p_g)} \mathbb{E}_{(x, y) \sim \gamma}[\lVert x - y \rVert]

Style Transfer with GANs
• Neural networks are improving at a number of tasks involving analytical and linguistic skills. Creativity is one sphere where humans have had the upper hand. Art is not only subjective and without defined boundaries, it is also difficult to quantify.
• Generative Adversarial Networks (GANs) in particular have been studied and explored in detail for the task of style transfer over the years. One such example uses the CycleGAN architecture to successfully transform photographs into paintings in the styles of famous artists such as Monet and Van Gogh.

Paired style transfer using pix2pix GAN
• In Image Generation with GANs, we discussed a number of innovations related to GAN architectures that led to improved results and better control of the output class. One of those innovations was the conditional GAN. This simple yet powerful addition to the GAN setup enabled us to navigate the latent vector space and control the generator to generate specific outputs.
• Style transfer is an intriguing research area, pushing the boundaries of creativity and deep learning together.
• As the name suggests, this GAN architecture takes a specific type of image as input and transforms it into a different domain. It is called paired style transfer as the training set needs to have matching samples from both source and target domains.
• This generic approach has been shown to effectively synthesize high-quality images from label maps and edge maps, and even to colorize images.

The U-Net generator
• Since CNNs are optimized for computer vision tasks, using them for both the generator and discriminator architectures has a number of advantages.
• The two choices are the vanilla encoder-decoder architecture and the encoder-decoder architecture with skip connections.
• The architecture with skip connections has more in common with the U-Net model than with the plain encoder-decoder setup. Hence, the generator in the pix2pix GAN is termed a U-Net generator.
• A typical encoder (in the encoder-decoder setup) takes an input and passes it through a series of downsampling layers to generate a condensed vector form. This condensed vector is termed the bottleneck features.
• The decoder part then upsamples the bottleneck features to generate the final output. This setup is extremely useful in a number of scenarios, such as language translation and image reconstruction. The bottleneck features condense the overall input into a lower-dimensional space.
• The U-Net architecture uses skip connections to shuttle important features between the input and output.
• In the case of the pix2pix GAN, skip connections are added between every ith downsampling layer and the (n - i)th upsampling layer, where n is the total number of layers in the generator.
• Each skip connection concatenates all channels of the ith layer with those of the (n - i)th layer, the ith layer's channels being appended to the (n - i)th layer's.

Use cases

Unpaired style transfer using CycleGAN
• Paired style transfer is a powerful setup with a number of use cases, some of which we discussed in the previous section. It provides the ability to perform cross-domain transfer given a pair of source and target domain datasets.
• The pix2pix setup also showcased the power of GANs to understand and learn the required loss functions without the need to specify them manually.
• CycleGAN improves upon the paired style transfer architecture by relaxing the constraints on input and output images.
• CycleGAN explores the unpaired style transfer paradigm, where the model tries to learn the stylistic differences between source and target domains without explicit pairing between input and output images.

Overall setup for CycleGAN
• For CycleGAN, the training dataset consists of unpaired samples: a source set {xi} and a target set {yj}, with no specific information regarding which xi matches which yj. In order to reduce the search space and add more constraints to the search for the best possible generator G, the authors introduced a property called cycle consistency: translating an image to the other domain and back should recover the original image. A minimal sketch of this loss follows.

[Figure: CycleGAN generated outputs at different stages of training for the apples-to-oranges experiment]
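The sketch below illustrates the cycle-consistency loss; the L1 penalty and the two generators G: X → Y and F: Y → X follow the CycleGAN formulation, while the function name and the weighting `lambda_cyc` are illustrative assumptions:

```python
import torch
import torch.nn as nn

def cycle_consistency_loss(G, F, real_x, real_y, lambda_cyc=10.0):
    """L_cyc = E[||F(G(x)) - x||_1] + E[||G(F(y)) - y||_1].
    G maps domain X to Y and F maps Y back to X; translating an image
    to the other domain and back should reproduce the original."""
    l1 = nn.L1Loss()
    forward_cycle = l1(F(G(real_x)), real_x)    # x -> Y -> back to X
    backward_cycle = l1(G(F(real_y)), real_y)   # y -> X -> back to Y
    return lambda_cyc * (forward_cycle + backward_cycle)

# Illustrative check with identity "generators": the loss is exactly zero.
identity = nn.Identity()
x, y = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
assert cycle_consistency_loss(identity, identity, x, y).item() == 0.0
```

In the full CycleGAN objective, this term is added to the two adversarial losses, one for each generator-discriminator pair.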