Sem5 PPT
CLASSIFICATION USING GLOW-MODEL
Professor: Prof. Shekhar Verma
Mentor: Jagriti Singh
Team Members
IIT2021001 KSHITIJ
IIT2021235 TEJASWI.N
IIT2021033 SATHWIKA.CH
GAN
• A GAN has two core components: the Generator and the Discriminator.
• In GANs, the Generator learns to create images that look real starting
from simple noise patterns. The Discriminator's task is to identify if
these images are real or created by the Generator.
• The Generator aims to create images that are so convincing that the
Discriminator thinks they are real. Meanwhile, the Discriminator gets
better at telling real images from fake ones.
• This process is a continuous game in which each part's success depends on outsmarting the other (a minimal training-step sketch follows below).
Img-Credits
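To make the adversarial game concrete, here is a minimal training-step sketch in PyTorch. It is illustrative only: the tiny fully connected Generator/Discriminator, the 64-dimensional noise, and the hyperparameters are our assumptions and are not taken from this project.

import torch
import torch.nn as nn

# Toy networks (assumptions, not the project's models): noise -> image, image -> real/fake logit
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_images):                      # real_images: [B, 784]
    B = real_images.size(0)

    # Discriminator: push real images toward "real" (1) and generated images toward "fake" (0)
    fake = G(torch.randn(B, 64)).detach()
    d_loss = bce(D(real_images), torch.ones(B, 1)) + bce(D(fake), torch.zeros(B, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to fool the Discriminator into labeling fakes as real (1)
    fake = G(torch.randn(B, 64))
    g_loss = bce(D(fake), torch.ones(B, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()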
VAE
Img-Credits
NORMALIZING FLOWS
Img-Credits
CHANGE OF VARIABLES IN NORMALIZING FLOWS
Img-Credits
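As general background (not transcribed from this slide's figure): a normalizing flow learns an invertible map f from an image x to a latent code z = f(x) with a simple base density p_Z (e.g., a standard Gaussian), and is trained by maximizing the exact log-likelihood given by the change-of-variables formula

log p_X(x) = log p_Z(f(x)) + log | det( ∂f(x) / ∂x ) |

where the log-determinant term accounts for how much f expands or contracts volume. Glow's layers (ActNorm, invertible 1x1 convolution, affine coupling) are designed so that this term is cheap to compute.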
GLOW-MODEL
Img-Credits
CONT..
• Following the squeeze function is a flow step, consisting of ActNorm for normalization, an invertible 1x1 Convolution for channel-wise information mixing, and an Affine Coupling Layer for expressive, easily invertible transformations.
• The Glow model is built from reversible operations, which means each layer can be applied in both the forward and reverse directions, allowing greater flexibility during training and testing. For instance, during testing the reshaping function can revert a tensor from [4*c, h/2, w/2] back to its original size [c, h, w] (a minimal sketch of this squeeze/unsqueeze reshape follows below).
Img-Credits
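Below is a minimal sketch of the reversible squeeze reshape mentioned above and its inverse, following the [c, h, w] -> [4*c, h/2, w/2] convention from the bullet. The function names and the use of PyTorch are our assumptions.

import torch

def squeeze(x):
    # [B, C, H, W] -> [B, 4*C, H//2, W//2]: fold each 2x2 spatial block into channels
    B, C, H, W = x.shape
    x = x.view(B, C, H // 2, 2, W // 2, 2)
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
    return x.view(B, 4 * C, H // 2, W // 2)

def unsqueeze(x):
    # Inverse reshape: [B, 4*C, H//2, W//2] -> [B, C, H, W]
    B, C4, H2, W2 = x.shape
    C = C4 // 4
    x = x.view(B, C, 2, 2, H2, W2)
    x = x.permute(0, 1, 4, 2, 5, 3).contiguous()
    return x.view(B, C, H2 * 2, W2 * 2)

x = torch.randn(1, 3, 8, 8)
assert torch.equal(unsqueeze(squeeze(x)), x)   # the reshape is exactly invertible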
ARCHITECTURE OF GLOW-MODEL
GENERATED DATA USING GLOW-MODEL
USING MODEL IN PERSPECTIVE OF CLASSIFICATION
• We trained two separate Glow models: one on the '0' digit images and another on the '1' digit
images from the MNIST dataset, enabling us to learn the distinct distributions specific to each
digit.
• After training, each model can map its respective images of '0' or '1' to points in its latent space.
This space is a lower-dimensional area where the model encodes information from each input
image.
• In this latent space, each point is an encoding of an image, capturing the essential features of '0' or '1' as learned by the respective model (a training sketch follows this list).
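A rough sketch of this two-model setup is below. The references list the VincentStimper/normalizing-flows repository, but its API is not reproduced here: make_glow() and log_prob() are placeholders for whatever constructor and log-likelihood method the chosen Glow implementation provides.

import torch
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

mnist = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())

def digit_loader(digit, batch_size=128):
    # Keep only the images of a single digit ('0' or '1')
    idx = [i for i, t in enumerate(mnist.targets) if int(t) == digit]
    return DataLoader(Subset(mnist, idx), batch_size=batch_size, shuffle=True)

def train_glow(model, loader, epochs=10, lr=1e-4):
    # Standard flow training: maximize the exact log-likelihood of the data
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:
            loss = -model.log_prob(x).mean()   # log_prob() is a placeholder name
            opt.zero_grad(); loss.backward(); opt.step()
    return model

# glow_0 = train_glow(make_glow(), digit_loader(0))   # make_glow() is hypothetical
# glow_1 = train_glow(make_glow(), digit_loader(1))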
CONT..
• Passing images of '0' and '1' through their trained models gives us their
corresponding latent representations.
• We calculate the mean of all latent vectors for '0' and '1' separately,
obtaining two distinct vectors. These vectors represent the "average"
points in the latent space for the digits '0' and '1', respectively.
• These mean latent vectors symbolize the typical characteristics of the
digits '0' and '1' as learned from our training datasets for each digit.
• The mean vectors serve as reference points in the latent space, letting us measure how far other images or digits lie from the mean latent vectors of '0' and '1' (see the sketch below).
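A minimal sketch of this averaging step (in the spirit of the GET-LATENT-MEAN slide further below); encode() is a placeholder name for the trained model's forward x -> z mapping.

import torch

@torch.no_grad()
def get_latent_mean(model, loader):
    # Encode every training image of one digit and average the latent vectors
    latents = []
    for x, _ in loader:
        z = model.encode(x)                    # placeholder: forward map x -> z
        latents.append(z.view(z.size(0), -1))  # flatten each latent to a vector
    return torch.cat(latents).mean(dim=0)      # the "average" point for this digit

# mean_0 = get_latent_mean(glow_0, digit_loader(0))
# mean_1 = get_latent_mean(glow_1, digit_loader(1))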
CONT..
• Test images from the MNIST dataset are then passed through both Glow
models. Each model, having been trained on a specific digit, encodes
the test images into latent vectors, capturing the key features as
learned for '0' and '1', respectively.
• We calculate the Euclidean distance between each test image's latent
vector (from both models) and the mean latent vectors of '0' and '1'.
• A test image is then classified based on which mean latent vector—'0'
or '1'—it is closer to, as determined by the calculated distances from
both models.
• This proximity-based approach, using the dual encoding by the separate models, enables us to predict the identity of each digit by comparing the test image's position in the latent space to the mean positions of '0' and '1' (a minimal classification sketch follows below).
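The proximity rule above might look like the following sketch; encode() is again a placeholder for the model's x -> z mapping, and the function name mirrors the 'classify' function described on the next slide.

import torch

@torch.no_grad()
def classify(x, glow_0, glow_1, mean_0, mean_1):
    # Encode the test image with both digit-specific models
    z0 = glow_0.encode(x).view(-1)     # latent from the model trained on '0'
    z1 = glow_1.encode(x).view(-1)     # latent from the model trained on '1'
    # Euclidean distances to the two mean latent vectors
    d0 = torch.norm(z0 - mean_0)
    d1 = torch.norm(z1 - mean_1)
    return 0 if d0 < d1 else 1         # the closer mean wins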
GET-LATENT-MEAN
• We implemented a 'classify' function that assigns a class to each test instance by comparing its latent vectors' distances to the mean latent vectors of '0' and '1'.
• Classification is determined by the shortest Euclidean distance to the mean vectors of '0' and '1', keeping the decision rule simple.
• The function's accuracy is measured by tallying the percentage of correct predictions against the actual labels in the test dataset (an evaluation sketch follows this list).
• Our results showcase the accuracy percentages for both '0' and '1', indicating the model's
capability to effectively distinguish between these two digits.
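A sketch of the evaluation loop, reusing the classify() function from the previous sketch; restricting the test set to '0' and '1' follows the slides, but the code itself is ours.

from torchvision import datasets, transforms

test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())
test_01 = [(x, int(y)) for x, y in test_set if int(y) in (0, 1)]  # keep only '0' and '1'

def evaluate(glow_0, glow_1, mean_0, mean_1):
    preds, labels = [], []
    for x, y in test_01:
        preds.append(classify(x.unsqueeze(0), glow_0, glow_1, mean_0, mean_1))
        labels.append(y)
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return 100.0 * correct / len(labels), preds, labels   # accuracy in percent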
EXTENDED PERFORMANCE METRICS: PRECISION, RECALL, AND F1 SCORE
• We also built a single confusion matrix covering both '0' and '1', and calculated precision, recall, and the F1 score to further evaluate the model's performance (a metrics sketch follows below).
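One way to compute these metrics is with scikit-learn (our choice here; the slides do not say which library was used), applied to the predictions and labels collected by the evaluation sketch above.

from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# acc, preds, labels = evaluate(glow_0, glow_1, mean_0, mean_1)   # from the sketch above
cm = confusion_matrix(labels, preds)        # rows: true 0/1, columns: predicted 0/1
precision = precision_score(labels, preds)  # of images predicted '1', fraction truly '1'
recall = recall_score(labels, preds)        # of true '1' images, fraction predicted '1'
f1 = f1_score(labels, preds)                # harmonic mean of precision and recall
print(cm, precision, recall, f1)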
CONCLUSION AND FURTHER SCOPE
Final-Project-Link
Reference-Links
• https://github.com/VincentStimper/normalizing-flows
• https://lyusungwon.github.io/studies/2018/07/23/glow/
• https://towardsdatascience.com/introduction-to-normalizing-flows-d002af262a4b
• https://medium.com/@sairajreddy/gan-generative-adversarial-nets-e8520157ec62
• https://learnopencv.com/wp-content/uploads/2020/11/vae-diagram-scaled.jpg
• https://ankurdhuriya.medium.com/what-are-normalizing-flows-ce7ccd222ee7
THANK YOU!