
MINI-PROJECT 2023

CLASSIFICATION USING GLOW MODEL

Professor: Prof. Shekhar Verma
Mentor: Jagriti Singh
Team Members

IIT2021001 KSHITIJ
IIT2021235 TEJASWI.N
IIT2021033 SATHWIKA.CH
IIB2021039 JISHWITHA REDDY
IIB2021022 GAYATHRI NARAYANAM


PROBLEM STATEMENT

The objective is to adapt Glow, a generative model typically used for generating new data samples, to the task of classifying digits from the MNIST dataset. Specifically, the aim is to classify two distinct digits, '0' and '1', by training a separate Glow model for each digit and then using these models to classify new images.
GENERATIVE MODELS

• Unlike discriminative models, which learn to separate classes, generative models learn the underlying data distribution and create new instances from the learned patterns.
• This gives a principled way to capture and exploit the structure of complex datasets.
• These models find extensive application in tasks like image generation, data augmentation, and the creation of realistic simulations.
GENERATIVE ADVERSARIAL NETWORKS

• The core of a GAN consists of two parts: the Generator and the Discriminator.
• The Generator learns to create realistic-looking images starting from simple noise patterns; the Discriminator's task is to identify whether a given image is real or created by the Generator.
• The Generator aims to create images so convincing that the Discriminator judges them real, while the Discriminator gets better at telling real images from fake ones.
• This process is a continuous game in which each part's success depends on outsmarting the other; the standard objective for this game is written out below.
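For concreteness (this is standard GAN theory rather than a formula from the slides), with Generator G, Discriminator D, and input noise z, the game is the minimax objective:

```latex
\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] \;+\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```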

VAE

• Variational Autoencoders (VAEs) are a popular unsupervised approach for learning complex data distributions, such as images, using neural networks.
• VAEs aim to learn and sample from the probability distribution of the training data via a low-dimensional latent representation; the usual training objective is sketched below.
• Latent variables in VAEs are inferred rather than directly observed, and they are crucial for generating data that closely resembles the training set.
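As a reference point (standard VAE theory, not spelled out in the slides), VAEs are trained by maximizing the evidence lower bound (ELBO), which trades off reconstruction quality against keeping the approximate posterior close to the prior:

```latex
\mathcal{L}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] \;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```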

NORMALIZING FLOWS

• Normalizing Flows turn simple distributions into complex ones using invertible neural networks, making exact density estimation tractable.
• They apply a series of transformations that keep density calculations manageable, thanks to the change of variables formula.
• Unlike GANs and VAEs, Normalizing Flows can compute likelihoods exactly and invert their transformations, leading to more straightforward and efficient modeling.
• This combination of invertible transformations and exact density estimation is what makes them unique in deep generative modeling.

CHANGE OF VARIABLES IN NORMALIZING FLOWS

• The change of variables formula is crucial in normalizing flows: it allows a simple probability distribution to be transformed into a more complex one that represents the data more accurately, as written out below.
• This transformation is achieved with an invertible function that maps the variables of the original distribution while preserving the overall probability structure.
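Concretely (the slides reference the formula but do not display it), if an invertible map f sends data x to a latent z = f(x) with a simple base density p_Z, the change of variables formula gives the exact data density:

```latex
p_X(x) \;=\; p_Z\big(f(x)\big)\,\left|\det \frac{\partial f(x)}{\partial x}\right|,
\qquad
\log p_X(x) \;=\; \log p_Z\big(f(x)\big) \;+\; \sum_{i=1}^{K} \log\left|\det \frac{\partial f_i}{\partial h_{i-1}}\right|
```

The second form, for a composition of K invertible layers with intermediate activations h_i, is what flow models such as Glow actually optimize.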

GLOW-MODEL

• The Glow model is a neural network architecture characterized by its multi-scale structure. It is built from a series of repeating layers, each referred to as a scale, which are fundamental to its functioning.
• At the heart of each scale is a squeeze function that reshapes the input tensor from [c, h, w] to [4*c, h/2, w/2], increasing the channel depth while halving the spatial dimensions. This transformation, sketched below, is crucial for the model's ability to process images at different scales.
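A minimal sketch of the squeeze operation and its inverse, assuming PyTorch tensors with a leading batch dimension (the function names here are ours, not necessarily those in the project code):

```python
import torch

def squeeze2d(x: torch.Tensor) -> torch.Tensor:
    """Reshape [b, c, h, w] -> [b, 4*c, h/2, w/2] by folding each 2x2 spatial block into channels."""
    b, c, h, w = x.shape
    x = x.view(b, c, h // 2, 2, w // 2, 2)        # split each spatial dim into pairs
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()  # move the 2x2 block dims next to the channels
    return x.view(b, 4 * c, h // 2, w // 2)

def unsqueeze2d(x: torch.Tensor) -> torch.Tensor:
    """Exact inverse: [b, 4*c, h/2, w/2] -> [b, c, h, w]."""
    b, c4, h2, w2 = x.shape
    x = x.view(b, c4 // 4, 2, 2, h2, w2)
    x = x.permute(0, 1, 4, 2, 5, 3).contiguous()  # restore the original pixel layout
    return x.view(b, c4 // 4, h2 * 2, w2 * 2)
```

Because the reshape is a pure permutation of values, unsqueeze2d(squeeze2d(x)) returns x exactly, which is the reversibility the next slide relies on.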

CONT..

• Following the squeeze function is a flow step consisting of ActNorm for normalization, an invertible 1x1 convolution for channel-wise information mixing, and an affine coupling layer for the learned transformation; a sketch of the coupling layer follows below.
• The Glow model is designed around reversible operations: each layer can be applied in both the forward and reverse directions, allowing greater flexibility during training and testing. For instance, during testing the reshaping function can revert a tensor from [4*c, h/2, w/2] back to its original size [c, h, w], demonstrating the model's adaptability.
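A simplified sketch of the affine coupling layer, the part of the flow step that makes inversion cheap (our own PyTorch illustration under stated assumptions, not the exact project code; the channel count must be even):

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Split channels in half; transform one half conditioned on the other.
    Both directions are exact, and the log-determinant is cheap to compute."""
    def __init__(self, channels: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels // 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),  # outputs log-scale and shift
        )

    def forward(self, x: torch.Tensor):
        x_a, x_b = x.chunk(2, dim=1)
        log_s, t = self.net(x_a).chunk(2, dim=1)
        log_s = torch.tanh(log_s)               # bound the scales for stability
        y_b = x_b * log_s.exp() + t             # affine transform of the second half
        log_det = log_s.flatten(1).sum(dim=1)   # log|det J| term for the likelihood
        return torch.cat([x_a, y_b], dim=1), log_det

    def inverse(self, y: torch.Tensor):
        y_a, y_b = y.chunk(2, dim=1)            # the untouched half reproduces log_s, t
        log_s, t = self.net(y_a).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        x_b = (y_b - t) * (-log_s).exp()        # undo the affine transform exactly
        return torch.cat([y_a, x_b], dim=1)
```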

ARCHITECTURE OF GLOW-MODEL
GENERATED DATA USING GLOW-MODEL
USING THE MODEL FOR CLASSIFICATION

• We trained two separate Glow models: one on the '0' digit images and another on the '1' digit
images from the MNIST dataset, enabling us to learn the distinct distributions specific to each
digit.
• After training, each model can map its respective images of '0' or '1' to points in its latent space.
This space is a lower-dimensional area where the model encodes information from each input
image.
• In this latent space, each point represents an encoding of an image, capturing essential features
of '0' or '1' as learned by the respective models.
CONT..

• Passing images of '0' and '1' through their trained models gives us their
corresponding latent representations.
• We calculate the mean of all latent vectors for '0' and '1' separately,
obtaining two distinct vectors. These vectors represent the "average"
points in the latent space for the digits '0' and '1', respectively.
• These mean latent vectors symbolize the typical characteristics of the
digits '0' and '1' as learned from our training datasets for each digit.
• The mean vectors serve as reference points in the latent space,
allowing us to see how other images or different digits differ from the
mean latent vectors of '0' and '1'.
CONT..

• Test images from the MNIST dataset are then passed through both Glow
models. Each model, having been trained on a specific digit, encodes
the test images into latent vectors, capturing the key features as
learned for '0' and '1', respectively.
• We calculate the Euclidean distance between each test image's latent
vector (from both models) and the mean latent vectors of '0' and '1'.
• A test image is then classified based on which mean latent vector—'0'
or '1'—it is closer to, as determined by the calculated distances from
both models.
• This proximity-based approach, using the dual encoding by the separate
models, enables us to accurately predict the identity of each digit. We
do this by comparing the test image's position in the latent space
relative to the mean positions of '0' and '1'.
GET-LATENT-MEAN

• We implemented the get_latent_mean_vector function to calculate a mean latent vector for each class by averaging the individual latent representations; a sketch of this function follows below. This process captures the key features of each class in the latent space.
• These mean vectors act as benchmarks for classifying new data: they let the model identify the class that most closely matches an input by comparing it against these class-specific representations.
• We used get_latent_mean_vector to calculate the mean latent vectors for '0' and '1', giving us insight into how each model uniquely represents its digit.
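The slides describe the function but do not show its code; a minimal sketch of what get_latent_mean_vector could look like, assuming the trained Glow model's forward pass returns the latent tensor for a batch and that the loader yields images of a single digit class (both interfaces are our assumptions):

```python
import torch

@torch.no_grad()
def get_latent_mean_vector(model, loader, device="cpu"):
    """Encode every image of one class and average the flattened latent vectors."""
    total, count = None, 0
    for images, _ in loader:                  # loader contains only one digit class
        z = model(images.to(device))          # assumed: forward pass returns latents
        z = z.flatten(start_dim=1)            # [batch, latent_dim]
        batch_sum = z.sum(dim=0)
        total = batch_sum if total is None else total + batch_sum
        count += z.shape[0]
    return total / count                      # mean latent vector for this class
```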
CLASSIFICATION OF IMAGE USING PRE-CALCULATED MEAN VECTOR

• We implemented the classify function to assign a class to each test instance by comparing the distance of its latent vector to the mean latent vectors of '0' and '1', classifying each digit by closest proximity; a sketch follows below.
• Classification is determined by the shortest Euclidean distance to the mean vectors of '0' and '1', which keeps the decision rule simple.
• The function's accuracy is measured as the percentage of correct predictions against the actual labels in the test dataset.
• Our results report the accuracy for both '0' and '1', indicating the model's ability to distinguish effectively between these two digits.
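A sketch of the classify step as described above: encode the test image with both models and pick the digit whose mean latent vector is nearest in Euclidean distance (model interfaces assumed as in the previous sketch):

```python
import torch

@torch.no_grad()
def classify(image, model_0, model_1, mean_0, mean_1, device="cpu"):
    """Return 0 or 1 depending on which class mean the image's latent is closest to."""
    x = image.unsqueeze(0).to(device)
    z0 = model_0(x).flatten()                   # latent under the '0' model
    z1 = model_1(x).flatten()                   # latent under the '1' model
    d0 = torch.linalg.vector_norm(z0 - mean_0)  # Euclidean distance to the '0' mean
    d1 = torch.linalg.vector_norm(z1 - mean_1)  # Euclidean distance to the '1' mean
    return 0 if d0 < d1 else 1
```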
EXTENDED PERFORMANCE METRICS: PRECISION, RECALL, AND F1 SCORE

• We also built a single confusion matrix covering both '0' and '1', and calculated precision, recall, and the F1 score to further evaluate the model's performance; one way to compute these is sketched below.
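The slides do not say which library was used; a common approach is scikit-learn's metrics module, with the 0/1 labels collected during the test loop:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# Placeholder labels; in practice these come from the test loop described above.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]

print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall:", recall_score(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
```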
CONCLUSION AND FURTHER SCOPE

In conclusion, our exploration of the Glow generative model for classifying the digits '0' and '1' has been both enlightening and rewarding.
A natural next step is to extend the approach to classify more than two classes.
The positive outcomes of our project invite further research and development.
This venture into generative models for classification tasks hints at a promising future, particularly in addressing more complex challenges in image recognition.

Final Project Link
REFERENCE LINKS

• https://github.com/VincentStimper/normalizing-flows
• https://lyusungwon.github.io/studies/2018/07/23/glow/
• https://towardsdatascience.com/introduction-to-normalizing-flows-d002af262a4b
• https://medium.com/@sairajreddy/gan-generative-adversarial-nets-e8520157ec62
• https://learnopencv.com/wp-content/uploads/2020/11/vae-diagram-scaled.jpg
• https://ankurdhuriya.medium.com/what-are-normalizing-flows-ce7ccd222ee7
THANK YOU!
