Batch 17 Major
Classification using GAIN
Under the Guidance of:
Dr. B. Mathura Bai
Contents of this Presentation
• Abstract
• Introduction
• Literature Review
• Project Requirements
• Problems in the existing system
• Flowchart for methodology
• Stages in Methodology
• Conclusion
Abstract
In recent years, deep learning has made significant strides in image
classification, but challenges remain when dealing with incomplete or occluded
images. This project leverages Generative Adversarial Imputation Networks
(GAIN) to address this issue by generating full images from partial or occluded
inputs. The primary objective is to develop a model capable of accurately
reconstructing incomplete facial images and subsequently classifying them
against a predefined dataset of faces. The proposed system first takes a partial or
full image as input and employs GAIN to predict the missing portions, thereby
generating a complete image. This reconstructed image is then passed through a
classification network that compares it against the dataset to identify the closest
match with high accuracy.
Introduction
Generative Adversarial Imputation Networks (GAIN) are a class of machine learning models that
extend the concept of Generative Adversarial Networks (GANs) to handle missing data. While
GANs are known for generating realistic data from random noise, GAIN specifically focuses on
imputing missing parts of data, which makes it well suited to tasks where incomplete information is
present. This ability to intelligently reconstruct missing data has positioned GAIN as a powerful
tool in various fields, such as medical imaging, data recovery, and image restoration.
In image classification tasks, particularly those involving facial recognition, dealing with incomplete
or partially occluded images poses a significant challenge. Traditional classification models often
fail to accurately predict or identify objects when key portions of the image are missing, leading to
errors or reduced accuracy. This is especially problematic in security, surveillance, or real-world
applications where perfect image data is rarely available.
Introduction (contd.)
The use of GAIN in this project aims to address these challenges by generating full images from
partial or occluded inputs. By reconstructing missing regions, GAIN ensures that essential features in
the image are restored before proceeding with classification. This approach improves the model's
ability to classify even incomplete images, enhancing overall accuracy and reliability in face
recognition and related tasks.
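To make the reconstruction step concrete, the sketch below shows how a completed image can be assembled once a generator has produced its estimate: observed pixels are kept, and only the masked pixels are filled in. This is a minimal illustration rather than the project's implementation; the function name, the list-of-lists image format, and the toy `generated` values are assumptions made for the example.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of intensities in [0, 1].
Image = List[List[float]]


def combine_observed_and_generated(observed: Image, mask: Image, generated: Image) -> Image:
    """Keep observed pixels where mask == 1 and use the generator output where mask == 0."""
    return [
        [m * x + (1.0 - m) * g for x, m, g in zip(obs_row, mask_row, gen_row)]
        for obs_row, mask_row, gen_row in zip(observed, mask, generated)
    ]


# Toy 2x3 example: the right-hand column is occluded (mask == 0).
observed = [[0.2, 0.4, 0.0], [0.6, 0.8, 0.0]]
mask = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
generated = [[0.9, 0.9, 0.5], [0.9, 0.9, 0.7]]  # hypothetical generator output

completed = combine_observed_and_generated(observed, mask, generated)
```

Only the occluded column changes in `completed`; the generator's guesses for pixels that were actually observed are discarded.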
Literature Survey
GAIN: Missing Data Imputation using Generative Adversarial Nets
Jinsung Yoon, James Jordon, Mihaela van der Schaar
Year of Publication : 2018
Proposed a generative model for missing data imputation, GAIN. This novel architecture generalizes
the well-known GAN such that it can deal with the unique characteristics of the imputation problem.
Various experiments with real-world datasets show that GAIN significantly outperforms state-of-the-art
imputation techniques. The development of a new, state-of-the-art technique for imputation can have
transformative impacts; most datasets in medicine as well as in other domains have missing data.
Future work will investigate the performance of GAIN in recommender systems, error concealment
as well as in active sensing (Yu et al., 2009). Preliminary results in error concealment using the
MNIST dataset (LeCun & Cortes, 2010) can be found in the Supplementary Materials.
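As a reminder of how GAIN adapts the GAN setup to imputation, the formulation can be summarized as below. This is a condensed summary using the commonly quoted notation, and it should be checked against the paper before use. Here X is the data, M the binary mask (1 = observed), Z the noise, H the hint passed to the discriminator, and α a weighting hyperparameter.

```latex
\begin{aligned}
\tilde{X} &= M \odot X + (1 - M) \odot Z \\
\bar{X}   &= G(\tilde{X}, M) \\
\hat{X}   &= M \odot \tilde{X} + (1 - M) \odot \bar{X} \\
\mathcal{L}_D &= -\,\mathbb{E}\!\left[ M^{\top} \log D(\hat{X}, H)
                 + (1 - M)^{\top} \log\!\big(1 - D(\hat{X}, H)\big) \right] \\
\mathcal{L}_G &= -\,\mathbb{E}\!\left[ (1 - M)^{\top} \log D(\hat{X}, H) \right]
                 + \alpha\, \mathbb{E}\!\left[ \big\lVert M \odot (X - \bar{X}) \big\rVert_2^2 \right]
\end{aligned}
```

The discriminator tries to recover the mask M from the completed sample, while the generator is rewarded both for fooling it on the imputed entries and for reproducing the entries that were actually observed.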
Generative Adversarial Networks Assist Missing Data Imputation: A Comprehensive Evaluation
Reza Shahbazian (Senior Member, IEEE) and Sergio Greco
Department of Informatics, Modeling, Electronics and System Engineering, University of
Calabria, 87036 Arcavacata, Italy. Corresponding author: Reza Shahbazian
Year of Publication : 2024
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan and Andrew Zisserman
Visual Geometry Group, Department of Engineering Science, University of Oxford {karen,az}@robots.ox.ac.uk
Year of Publication : 2015
Paper Title: Generative Adversarial Networks: An Overview
Authors: Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A
Bharath
Year of Publication: 2017
This paper reviews Generative Adversarial Networks (GANs), a framework for unsupervised and
semi-supervised learning. GANs consist of a generator that creates data samples and a
discriminator that distinguishes real from generated samples, trained through adversarial
competition. It covers architectures like Conditional GANs and Adversarial Autoencoders and
addresses training challenges such as instability and mode collapse, with solutions like
Wasserstein GANs and advanced regularization techniques. GANs have revolutionized tasks like
image synthesis, super-resolution, and style transfer. However, training remains unstable due to
vanishing gradients and saddle points, while evaluation methods lack standardization. Despite
these challenges, GANs are a cornerstone for deep learning in applications requiring realistic data
generation and complex data representation, offering significant potential for further
advancements.
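For reference, the adversarial competition between generator and discriminator is usually written as the minimax objective from the original GAN formulation, where G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior:

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_{z}}\big[ \log\big(1 - D(G(z))\big) \big]
```

When D becomes too confident early in training, the second term saturates and the generator receives very small gradients, which is one source of the vanishing-gradient problem mentioned above.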
Paper Title: Generative Adversarial Classification Network with Application to Network Traffic
Classification
Authors: Rozhina Ghanavi, Ben Liang, Ali Tizghadam
Year of Publication: 2023
This paper introduces the Generative Adversarial Classification Network (GACN), a novel approach
for joint data imputation and classification, designed to improve classification accuracy for datasets
with missing values. The GACN architecture integrates three networks: a generator to impute missing
data, a discriminator to validate imputed values, and a classifier to ensure the imputation enhances
classification performance. An extension, the Semi-Supervised GACN (SS-GACN), supports partially
labeled data, enhancing its utility in real-world applications.
Experiments on network traffic datasets demonstrate that GACN and SS-GACN outperform state-of-
the-art methods like GAIN and MICE in classification accuracy and imputation quality. For example,
GACN achieved 90.2% accuracy at 20% missingness, compared to GAIN’s 85.7%. GACN prioritizes
imputing features critical to classification, reducing RMSE for significant attributes. While highly
effective, GACN's performance decreases with high missingness or limited labeled data, a gap
addressed by SS-GACN. The study highlights GACN's potential for reliable data analysis and
classification in scenarios with incomplete data.
PROBLEMS IN THE EXISTING SYSTEM
1. Difficulty in Handling Occlusions and Incomplete Data
2. Bias Towards Dominant Features
3. Limited Generalization in Real-World Scenarios
4. Overfitting to Training Data
5. Lack of Robustness in Partial Image Classification
Problem Definition
Develop a GAIN-powered system to reconstruct and classify occluded or partially visible
images, focusing on real-time scenarios like surveillance and dynamic environments such as
drone footage. The model must handle varying levels of occlusion, adapt to unpredictable
missing data patterns, and ensure accurate classification under constraints like
environmental factors, noise, and computational efficiency.
Proposed System
1. System Overview:
The system will utilize GAIN (Generative Adversarial Imputation Networks) for reconstructing
incomplete or occluded images and a classification model for identifying the reconstructed
images. It will include real-time processing capabilities for surveillance and drone applications,
ensuring adaptability to dynamic environments.
2. Architecture:
A. Input Module
• Accepts image data from cameras or drones.
• Handles occluded, corrupted, or incomplete image frames.
• Preprocessing includes resizing, noise reduction, and normalization.
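The preprocessing steps listed above could look roughly like the following. This is a dependency-free sketch under stated assumptions: images are plain lists of 8-bit grayscale rows, resizing uses nearest-neighbour sampling as a stand-in for a library routine, and noise reduction is left out for brevity. The function names are illustrative and not taken from the project.

```python
from typing import List

Image = List[List[float]]


def normalize(image: List[List[int]]) -> Image:
    """Scale 8-bit pixel values (0-255) into the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize by nearest-neighbour sampling: each output pixel copies one input pixel."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


raw = [[0, 255], [51, 102]]          # toy 2x2 frame
small = normalize(raw)               # values now in [0, 1]
resized = resize_nearest(small, 4, 4)  # upsample to the (assumed) model input size
```

In practice, an image library would handle resizing and denoising; the point here is only the order of operations and the value range handed to the reconstruction module.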
B. Reconstruction Module (GAIN)
D. Classification Module
• Utilizes a Convolutional Neural Network (CNN) or pre-trained models like VGG to classify
reconstructed images.
• Incorporates fine-tuning for occluded or reconstructed images.
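One way to picture the final matching step is as a nearest-neighbour search over feature vectors. The sketch below assumes that a CNN (for example a VGG-style network) has already turned each reconstructed image and each dataset face into a non-zero embedding vector; the embeddings, labels, and function names here are hypothetical and only illustrate the comparison.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_match(query: List[float], gallery: Dict[str, List[float]]) -> str:
    """Return the gallery label whose embedding is most similar to the query."""
    return max(gallery, key=lambda label: cosine_similarity(query, gallery[label]))


# Hypothetical embeddings; a real system would obtain them from the CNN.
gallery = {"person_a": [1.0, 0.0, 0.0], "person_b": [0.0, 1.0, 0.0]}
query = [0.9, 0.1, 0.0]  # embedding of a reconstructed image

best = closest_match(query, gallery)
```

A softmax classification head trained on the dataset is an equally valid design; the similarity search is shown because it extends to new identities without retraining.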
Advantages of Proposed System
1. Dual Functionality for Occlusion Handling
Combines image reconstruction (via GAIN) and classification in a single pipeline, enabling
accurate predictions even with incomplete or occluded data.
2. Real-Time Processing
Optimized for low-latency operation using tools like TensorRT and ONNX, ensuring fast and
reliable performance for critical applications.
3. Enhanced Robustness and Generalization
Excels in noisy or challenging conditions, with superior generalization across diverse datasets,
occlusion levels, and environmental variations.
4. Versatility Across Applications
Adaptable for multiple domains such as surveillance, autonomous systems, healthcare, and
drones, offering a scalable and modular architecture for future enhancements.
System architecture
UML Diagrams - USE CASE DIAGRAM
CLASS DIAGRAM
Sequence Diagram for User
Sequence Diagram for Admin
Activity Diagram (ICS: Image Classification System)
FLOW CHART FOR METHODOLOGY
Work Plan
• Normalization
• Data Augmentation
• Create Missing Regions
• Design the GAIN architecture
• Loss Function
• Classification Model
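The "Create Missing Regions" step can be sketched as generating a binary mask and applying it to a clean training image. The example below hides a single random square block; the block shape, the seeded generator, and the function names are assumptions chosen for illustration, and other occlusion patterns (random pixels, stripes, irregular shapes) would follow the same pattern.

```python
import random
from typing import List

Mask = List[List[int]]
Image = List[List[float]]


def square_occlusion_mask(height: int, width: int, size: int, rng: random.Random) -> Mask:
    """Return a mask with 1 for observed pixels and 0 inside one random size x size block."""
    top = rng.randint(0, height - size)    # randint is inclusive at both ends
    left = rng.randint(0, width - size)
    return [
        [0 if top <= r < top + size and left <= c < left + size else 1
         for c in range(width)]
        for r in range(height)
    ]


def apply_mask(image: Image, mask: Mask) -> Image:
    """Zero out the pixels that the mask marks as missing."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


rng = random.Random(0)                     # seeded so the example is repeatable
mask = square_occlusion_mask(8, 8, 3, rng)
image = [[1.0] * 8 for _ in range(8)]      # toy all-white 8x8 image
occluded = apply_mask(image, mask)
```

Keeping the mask alongside the occluded image matters because the mask is an input to the imputation network and is also needed to score reconstruction quality on the hidden pixels only.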
Requirements
Hardware:
• RAM: 16 GB (32 GB recommended)
• Storage: 50 GB
• GPU: RTX 3060 (4080 recommended)
• CPU: 8 cores
Software:
• OS: Windows 10 or 11
• Jupyter Notebook
• Language: Python
Dataset
CIFAR-10 and CIFAR-100
ImageNet
CONCLUSION
This project demonstrates the powerful application of Generative Adversarial Imputation
Networks (GAIN) in solving the problem of classifying images that are partially occluded. By
leveraging GAIN’s ability to reconstruct missing or corrupted parts of an image, we improve
classification accuracy even under challenging scenarios. Through the use of the CelebA
dataset, we showcased how GAIN can be integrated with a convolutional neural network
(CNN) for face classification, providing a robust system capable of handling occlusions.
The project highlights the importance of handling missing data in image classification tasks
and offers a practical solution to real-world problems such as face recognition with
obstructions like masks or shadows. GAIN not only improves the reconstruction of missing
parts but also significantly boosts the overall classification performance of the model.
While the results are promising, several challenges remain, such as optimizing GAIN for faster
training and improving generalization to diverse occlusion patterns. The potential for real-world
applications, including security systems, biometrics, and image-based diagnostics, makes this
project a stepping stone for further innovations in handling occlusions in image classification
tasks.
QUERIES?