Phase 1 Report


VISVESVARAYA TECHNOLOGICAL UNIVERSITY

Jnana Sangama, Belgaum-590018

A PROJECT PHASE I REPORT (18CSP77) ON

“Facial Recognition On Low Resolution Images”


Submitted in Partial fulfillment of the Requirements for the VII Semester of the Degree of

Bachelor of Engineering in Computer Science & Engineering


By

Muteeba Shoukat (1CR20CS121)

Moksha Sri S (1CR20CS119)

P Varshika Prashanth (1CR20CS133)

Under the Guidance of,


Prof. Paramita Mitra
Assistant Professor, Dept. of CSE

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CMR INSTITUTE OF TECHNOLOGY

#132, AECS LAYOUT, IT PARK ROAD, KUNDALAHALLI, BANGALORE-560037

CMR INSTITUTE OF TECHNOLOGY


#132, AECS LAYOUT, IT PARK ROAD, KUNDALAHALLI, BANGALORE-560037

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

CERTIFICATE
This is to certify that the project work entitled “Facial Recognition On Low Resolution Images” was
carried out by Ms. Muteeba Shoukat, USN 1CR20CS121, Ms. Moksha Sri S, USN 1CR20CS119, and
Ms. P Varshika Prashanth, USN 1CR20CS133, bona fide students of CMR Institute of Technology, in
partial fulfillment of the requirements for the award of Bachelor of Engineering in Computer Science
and Engineering of the Visvesvaraya Technological University, Belgaum, during the year 2023-2024.
It is certified that all corrections/suggestions indicated for Internal Assessment have been
incorporated in the Report deposited in the departmental library.

The project report has been approved as it satisfies the academic requirements in respect of Project
work prescribed for the said Degree.

_____________________________
Signature of Guide
Prof. Paramita Mitra
Assistant Professor
Dept. of CSE, CMRIT

_____________________________
Signature of HOD
Dr. Shreekanth M Prabhu
Professor & HoD
Dept. of CSE, CMRIT

DECLARATION

We, the students of the 7th semester of Computer Science and Engineering, CMR Institute of
Technology, Bangalore, declare that the work entitled “Facial Recognition On Low Resolution
Images” has been successfully completed under the guidance of Prof. Paramita Mitra, Assistant
Professor, Computer Science and Engineering Department, CMR Institute of Technology,
Bangalore. This dissertation work is submitted in partial fulfillment of the requirements for the
award of the Degree of Bachelor of Engineering in Computer Science and Engineering during the
academic year 2023-2024. Further, the matter embodied in the project report has not been
submitted previously by anybody for the award of any degree or diploma to any university.

Place: Bangalore

Date:

Team members:

Muteeba Shoukat (1CR20CS121) __________________

Moksha Sri S (1CR20CS119) __________________

P Varshika Prashanth (1CR20CS133) __________________

ABSTRACT

The project aims to address the challenge of enhancing image resolution through the application
of advanced deep learning techniques. Image resolution enhancement is a critical task in various
domains, including medical imaging, satellite imagery, surveillance, and photography.
Traditional methods often suffer from limitations in handling complex patterns and generating
high-quality results. In this project, we propose a novel approach leveraging SR3 for image
resolution enhancement.

Keywords: Image super-resolution, diffusion models, deep generative models, image-to-image
translation, denoising process, iterative methods, face recognition

ACKNOWLEDGEMENT

We take this opportunity to express our sincere gratitude and respect to CMR Institute
of Technology, Bengaluru, for providing us a platform to pursue our studies and carry out our
final year project.
We have great pleasure in expressing our deep sense of gratitude to Dr. Sanjay Jain,
Principal, CMRIT, Bangalore, for his constant encouragement.
We would like to thank Dr. Shreekanth M Prabhu, HOD, Department of Computer
Science and Engineering, CMRIT, Bangalore, who has been a constant source of support and
encouragement throughout the course of this project.
We consider it a privilege and honor to express our sincere gratitude to our guide,
Prof. Paramita Mitra, Assistant Professor, Department of Computer Science and Engineering,
for her valuable guidance throughout the tenure of this review.
We also extend our thanks to all the faculty of Computer Science and Engineering who
directly or indirectly encouraged us.
Finally, we would like to thank our parents and friends for all the moral support they have
given us during the completion of this work.

TABLE OF CONTENTS

Certificate
Declaration
Abstract
Acknowledgement
Table of contents
List of Figures
List of Tables
List of Abbreviations

1 INTRODUCTION
1.1 Relevance of the Project
1.2 Problem Statement
1.3 Objectives
1.4 Scope of the project
1.5 Software Engineering Methodology
1.6 Tools and Technologies
1.7 Chapter Wise Summary

2 LITERATURE SURVEY
2.1 Overview
2.2 Image Super Resolution Via Iterative Refinement
2.3 Dense Nested Attention Network for Infrared Small Target Detection
2.4 Deep Convolutional Neural Network for Inverse Problems in Imaging
2.5 High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
2.6 Convolutional Sparse Coding for Compressed Sensing CT Reconstruction
2.7 A Variational Auto-Encoder Approach for Image Transmission in Noisy Channel
2.8 A Comparative Study on Variational Autoencoder and Generative Adversarial Networks
2.9 Research Gap / Market Analysis

3 PROBLEM FORMULATION

4 STATUS AND ROADMAP

REFERENCES

LIST OF FIGURES

Fig 1.1 Software Engineering Methodology Model

LIST OF TABLES

Table 2.1 Comparison of different approaches
Table 4.1 Schedule of project

LIST OF ABBREVIATIONS

CT Computed Tomography
CNN Convolutional Neural Network
DDPM Denoising Diffusion Probabilistic Models
GANs Generative Adversarial Networks
PSNR Peak Signal-to-Noise Ratio
RNNs Recurrent Neural Networks
SR3 Super Resolution Via Repeated Refinement
SSI Structural Similarity Index
VAEs Variational Autoencoders

Facial Recognition On Low Resolution Images

CHAPTER 1
INTRODUCTION
In the realm of computer vision and image processing, the demand for high-resolution
images continues to grow across domains such as medical imaging, satellite
observation, and surveillance. This project applies SR3 (Super-Resolution via Repeated
Refinement) to image resolution enhancement, with the aim of improving facial
recognition on low-resolution images.
Super-Resolution (SR) techniques aim to reconstruct high-resolution images from their
low-resolution counterparts, enhancing visual quality and recovering finer details. SR3 is
a diffusion-based generative model: starting from pure noise, it produces a
high-resolution image through a sequence of denoising refinement steps conditioned on
the low-resolution input.

1.1 Relevance of the Project


The SR3 (Super-Resolution via Repeated Refinement) model is highly relevant for image
resolution enhancement. As a conditional denoising diffusion model, SR3 learns the
relationship between low-resolution and high-resolution images by iteratively removing
noise from an image while conditioning on the low-resolution input. Its outputs are
assessed using quantitative metrics such as PSNR and SSI.
It also holds significant relevance in real world applications. Its capacity to enhance
image resolution has practical implications, including improved medical diagnostics,
precise satellite observations, enhanced surveillance systems, sharper media production,
detailed geospatial mapping, augmented capabilities in artificial intelligence and
autonomous vehicles, forensic analysis, and improved resolution in astronomy and
astrophysics. The versatility of SR3 underscores its potential to positively impact multiple
sectors by providing sharper and more detailed visual information in practical and
meaningful ways.
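
As an illustration of the PSNR metric mentioned in this section, the following is a minimal sketch using NumPy. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions made for this example and are not taken from the project code.

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in decibels between two same-shaped images."""
    reference = reference.astype(np.float64)
    estimate = estimate.astype(np.float64)
    mse = np.mean((reference - estimate) ** 2)  # mean squared error per pixel
    if mse == 0.0:
        return float("inf")  # identical images have unbounded PSNR
    return 10.0 * np.log10(max_value ** 2 / mse)
```

A higher PSNR means the super-resolved image is numerically closer to the ground truth. It does not always agree with perceived quality, which is why the objectives also include human evaluation.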

1.2 Problem Statement

To enhance Image Resolution using SR3 modelling.

Dept. of CSE, CMRIT 2023-2024 Page 1


1.3 Objectives
• Develop a State-of-the-Art Super Resolution Technique: Create an innovative
image super-resolution method that leverages diffusion models to significantly
enhance image quality and detail.
• Human Perception Testing: Conduct rigorous human evaluation tests to validate
the perceptual quality and realism of super-resolved images generated by the
diffusion model.
• User-Friendly Implementation: Develop user-friendly interfaces or frameworks
to make the diffusion model accessible and practical for a wider range of
applications.
• Achieve High-Quality Results: Aim to produce super-resolved images that
exhibit superior visual quality, closely resembling high-resolution ground truth
images.

1.4 Scope of the project


The project's scope is comprehensive and involves a multi-faceted approach to improving
image quality and resolution using diffusion models. It encompasses both technical and
practical aspects, with the potential to influence various industries and contribute to the
field of computer vision and image processing.
Image Enhancement: The primary focus of the project is to enhance the quality and
detail of low-resolution images, making them more suitable for various applications,
including visual content creation, medical imaging, surveillance, and remote sensing.
Data Requirements: Consideration of the project's scope should include the need for
large-scale datasets to train and validate the model, taking into account variations in
image content, quality, and resolution.
Industry Impact: Consideration of the practical and commercial scope includes
identifying potential industry applications, collaborations, or licensing opportunities for
the developed technology.

1.5 Software Engineering Methodology


Our project follows an agile development methodology, with cyclic development and
improvement through reviews. The major stages of our software cycle are:

• Requirements Gathering: Ensure all requirements are well documented, including user
stories, functional specifications, and technical requirements.
• Dataset Retrieval: Gather the dataset required for the implementation of the project.
• Training: Train the model on the acquired dataset.
• Testing: Conduct tests after the training process.
• Refinement and Iteration: Address bugs, gather feedback, and make necessary
improvements based on user testing and reviews.
• Comparison: Compare the algorithm we use against currently available technologies
in order to improve the project.
• Monitoring and Optimization: Put monitoring in place, track the collected data, and
improve the system iteratively in response to performance evaluations and new needs.
• Documentation: Keep thorough documentation throughout the development lifecycle
to make future maintenance and additions easier.

[Figure: cycle of stages: Requirements Gathering, Dataset Retrieval, Training, Testing,
Refinement & Iteration, Comparison, Monitoring & Optimization, Documentation]


Fig. 1.1 – Software Engineering Methodology Model

1.6 Tools and Technologies


➢ Deep Learning Frameworks:
• TensorFlow
• Keras (built on top of TensorFlow)
➢ Computer Vision Libraries:
• OpenCV: For image and video processing tasks, including face detection and video
capture.
➢ Data Collection and Annotation Tools:
• Video surveillance hardware or a surveillance camera.
➢ Data Preprocessing:
• Image and video preprocessing libraries for tasks like resizing, normalization, and
augmentation.
➢ Development Environment:
• PyCharm
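
The preprocessing tasks listed above (resizing, normalization, and augmentation) can be sketched as follows. This is a minimal illustration that uses NumPy as a stand-in for the image libraries named in this section. The function names, the average-pooling downsampler, the scale factor of 4, and the [-1, 1] value range are all assumptions made for this example.

```python
import numpy as np

def normalize(image: np.ndarray) -> np.ndarray:
    """Map 8-bit pixel values in [0, 255] to floats in [-1, 1]."""
    return image.astype(np.float32) / 127.5 - 1.0

def downsample(image: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool an (H, W, C) image by an integer factor."""
    h, w, c = image.shape
    h, w = h - h % factor, w - w % factor  # crop so both sides divide evenly
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))

def flip_horizontal(image: np.ndarray) -> np.ndarray:
    """Simple augmentation: mirror the image left to right."""
    return image[:, ::-1]

def make_pair(hr_image: np.ndarray, factor: int = 4):
    """Build a (low-resolution, high-resolution) training pair from one image."""
    hr = normalize(hr_image)
    lr = downsample(hr, factor)
    return lr, hr
```

In practice a library resize with bicubic interpolation would likely replace the average pooling used here; the pairing logic stays the same.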

1.7 Chapter Wise Summary


In Chapter 1, we give a short introduction to the project, covering its scope and relevance.
The objectives of the project are also defined.
In Chapter 2, we survey the papers used for this project. Each source is summarized along
with its advantages and disadvantages, followed by the research gap and market analysis.
In Chapter 3, the problem is formulated and the SR3 approach adopted for the project is
described.
In Chapter 4, the current status and the roadmap of the project are presented.

CHAPTER 2
LITERATURE SURVEY
The literature survey on the SR3 model for image resolution enhancement starts from the
foundational paper, which introduces super-resolution through iterative refinement with a
denoising diffusion model. The surveyed work also explores architectural innovations and
emphasizes the importance of curated datasets and suitable training strategies.
Quantitative metrics such as PSNR and SSI are commonly used for performance
evaluation, and real-world applications in medical imaging, satellite observation, and
other domains have been investigated. Related research covers attention mechanisms,
convolutional and sparse-coding approaches, variational autoencoders, and adversarial
training. Challenges related to computational complexity, generalization, and real-time
deployment remain open directions. The literature reflects a dynamic field with
continuous advances in super-resolution techniques.

2.1 Overview

Sources:

• Google
• IEEEXplore
• Springer
• Elsevier
• Google Scholar

Keywords used for the search: Image super-resolution, diffusion models, deep generative
models, image-to-image translation, denoising process, iterative methods, face
recognition

2.2 Image Super Resolution Via Iterative Refinement [1]


The paper introduces a novel framework that employs iterative refinement techniques to
enhance the resolution of low-resolution images effectively. The key focus of the research
lies in developing a model that refines its predictions through multiple iterations,
progressively improving the visual quality of the reconstructed high-resolution images.

The paper addresses the challenges of image super-resolution with a multi-stage
approach, demonstrating promising results through comprehensive experiments and
evaluations. This iterative refinement strategy contributes to the growing body of research
in the field of image processing and computer vision, offering a valuable perspective on
improving image resolution through successive enhancements.
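
The iterative refinement idea described above can be sketched as a reverse diffusion loop. This is a simplified illustration, not the paper's implementation: the linear noise schedule, the step count, and the function names are assumptions, and `make_oracle` is a hypothetical stand-in for the trained denoising network that exists only to make the example runnable.

```python
import numpy as np

def linear_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Per-step alphas and their cumulative products (assumed linear schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    return alphas, np.cumprod(alphas)

def refine(lr_upsampled, predict_noise, num_steps=50, seed=0):
    """Start from Gaussian noise and denoise step by step, conditioned on the LR image."""
    rng = np.random.default_rng(seed)
    alphas, gammas = linear_schedule(num_steps)
    y = rng.standard_normal(lr_upsampled.shape)  # pure noise at the first step
    for t in reversed(range(num_steps)):
        eps = predict_noise(lr_upsampled, y, gammas[t])
        # Remove the predicted noise component for this step.
        mean = (y - (1.0 - alphas[t]) / np.sqrt(1.0 - gammas[t]) * eps) / np.sqrt(alphas[t])
        # Re-inject a small amount of noise on every step except the last.
        noise = rng.standard_normal(y.shape) if t > 0 else 0.0
        y = mean + np.sqrt(1.0 - alphas[t]) * noise
    return y

def make_oracle(target):
    """Hypothetical denoiser that knows the true image; a trained network replaces it."""
    def predict_noise(lr_upsampled, y, gamma):
        return (y - np.sqrt(gamma) * target) / np.sqrt(1.0 - gamma)
    return predict_noise
```

With the oracle denoiser the loop recovers the target image, which shows that each pass moves the estimate closer to the clean image. It also makes the computational cost noted in the disadvantages concrete: every step requires one full pass through the network.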

Advantages

1. The iterative nature of the approach may enable the model to capture finer details
in the images over successive iterations, resulting in a more accurate and detailed
reconstruction.
2. Iterative refinement methods may enhance the robustness of the super-resolution
model by mitigating noise and artifacts present in low-resolution images through
successive improvements.
3. The model may adapt and learn from its own previous iterations, allowing it to
refine its predictions based on the feedback and information gained during each
iteration.

Disadvantages
1. Iterative refinement approaches can be computationally intensive, requiring
multiple passes through the network for each image. This may lead to increased
computational time and resource requirements.
2. Training models with iterative refinement might be more complex compared to
single-pass models. It may involve additional challenges related to convergence,
stability, and tuning hyperparameters for multiple stages.
3. The interpretation of the learning process and feature extraction in each iteration
may be challenging, making it harder to understand and explain the decisions
made by the model.

2.3 Dense Nested Attention Network for Infrared Small Target Detection [2]
This research paper introduces a novel solution to address the intricacies of infrared small
target detection — the Dense Nested Attention Network. Leveraging the power of deep
learning and attention mechanisms, this proposed network architecture aims to enhance
the accuracy and reliability of small target detection in infrared imagery. The integration
of dense nested attention mechanisms facilitates the model's ability to capture both global
context and intricate local features, enabling it to discern subtle target signatures against
challenging backgrounds.

Advantages

1. The proposed Dense Nested Attention Network may lead to improved accuracy in
detecting small targets in infrared imagery, thanks to the integration of advanced
attention mechanisms that capture both global and local context.
2. The dense nested attention mechanisms can contribute to a more effective
representation of features, allowing the network to discern subtle details of small
targets against complex backgrounds, leading to better discrimination.
3. If the proposed Dense Nested Attention Network is computationally efficient, it
could be advantageous for real-time applications, where quick and accurate small
target detection is crucial.

Disadvantages
1. If the Dense Nested Attention Network has a high computational cost, it might
limit its practicality, especially in real-time applications or scenarios with resource
constraints.
2. The iterative and complex nature of attention mechanisms could potentially lead
to overfitting, where the model memorizes details from the training data but
struggles to generalize well to new, unseen infrared images.
3. The success of the iterative refinement process may be sensitive to the quality of
the initializations. If the model's performance is highly dependent on the initial
estimates, it could be a limitation.

2.4 Deep Convolutional Neural Network for Inverse Problems in Imaging [3]
The research paper focuses on the application of deep convolutional neural networks
(CNNs) to address inverse problems in the field of imaging. Inverse problems involve the
estimation of input parameters or information from observed data, and this is a common
challenge in various imaging applications such as medical imaging, computer vision, and
remote sensing.

Advantages

1. Deep convolutional neural networks are known for their ability to learn complex
mappings, enabling more accurate reconstructions in inverse problems. The paper
may demonstrate improved accuracy in reconstructing images or information from
noisy or incomplete data.
2. A well-designed deep learning model can often generalize well to unseen data. If
the paper proposes a model that performs well across a variety of imaging tasks
and datasets, it would be considered an advantage.
3. Deep learning models can automatically learn relevant features from data,
reducing the need for manually designed algorithms. This can be advantageous in
situations where the underlying mathematical model of the inverse problem is
complex or not well understood.

Disadvantages
1. Deep learning models, especially deep convolutional neural networks, can be
computationally intensive. The paper might face criticism if it does not adequately
address concerns about the computational complexity of the proposed method,
especially in scenarios where computational resources are limited.
2. Deep learning models often require large amounts of labelled training data to
generalize well. If the paper suffers from a lack of diverse and representative
training data, the model's performance might be limited in real-world applications.
3. Deep models are susceptible to overfitting, where the model performs well on the
training data but fails to generalize to new, unseen data. The paper may face
criticism if it does not adequately address or mitigate overfitting issues.

2.5 High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs [4]
The research paper focuses on the application of Conditional Generative Adversarial
Networks to achieve high-resolution image synthesis and semantic manipulation. GANs
are a class of machine learning models that consist of a generator and a discriminator
trained adversarially. The term "conditional" implies that the generation process is
conditioned on specific information, often in the form of additional input data or labels.

Advantages
1. The paper may introduce a novel architecture or training strategy that enhances
the fidelity of generated images, producing results with higher resolution and
visual quality compared to existing methods.
2. Leveraging conditional GANs allows for the generation of images based on
specific conditions or attributes. This capability enables more controlled and
customizable image synthesis, addressing the needs of various applications
requiring specific visual characteristics.
3. If the paper focuses on semantic manipulation, it could provide a method for
precise control over specific features or aspects of the generated images. This
fine-grained control is valuable in applications where users need to modify or
customize certain visual elements.

Disadvantages
1. GANs are susceptible to mode collapse, where the generator produces limited
varieties of samples, failing to capture the full diversity of the target distribution.
If the proposed model is prone to mode collapse, it could limit the range of
generated images.
2. GAN training is known for being sensitive and prone to instability. If the paper
does not address or mitigate training challenges, such as oscillations or divergence
issues, it may hinder the practical applicability of the proposed model.
3. Generating high-resolution images with complex models can be computationally
intensive, requiring substantial resources in terms of memory and processing
power. This could limit the accessibility of the proposed approach, particularly for
users with limited computational resources.

2.6 Convolutional Sparse Coding for Compressed Sensing CT Reconstruction [5]
The research paper focuses on the application of convolutional sparse coding techniques
for reconstructing CT images using compressed sensing principles. Compressed sensing
is a signal processing paradigm that allows for the reconstruction of images from highly
undersampled data, reducing the amount of data acquired during the imaging process.
Sparse coding is a technique that represents signals as a combination of a few basis
functions, and convolutional sparse coding extends this idea by incorporating local spatial
relationships through convolutional operations. The paper explores how convolutional
sparse coding methods can be tailored to the specific challenges of CT image
reconstruction from sparse data.
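
To make the sparse coding idea concrete, the sketch below solves the standard problem of minimizing 0.5 * ||x - D z||^2 + lam * ||z||_1 with the iterative shrinkage-thresholding algorithm (ISTA). It illustrates plain sparse coding with a dictionary matrix, not the convolutional variant or the reconstruction method of the surveyed paper. The function names, the penalty weight `lam`, and the iteration count are assumptions.

```python
import numpy as np

def soft_threshold(values, threshold):
    """Shrink each entry toward zero; entries below the threshold become exactly zero."""
    return np.sign(values) * np.maximum(np.abs(values) - threshold, 0.0)

def ista(dictionary, signal, lam=0.1, num_iters=200):
    """Find a sparse code z such that dictionary @ z approximates the signal."""
    # Step size 1/L, where L is the squared spectral norm of the dictionary.
    step = 1.0 / np.linalg.norm(dictionary, 2) ** 2
    code = np.zeros(dictionary.shape[1])
    for _ in range(num_iters):
        gradient = dictionary.T @ (dictionary @ code - signal)
        code = soft_threshold(code - step * gradient, step * lam)
    return code
```

In the convolutional setting, the product `dictionary @ code` is replaced by a sum of convolutions between small filters and sparse feature maps, which is how local spatial relationships enter the model.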
Advantages
1. Compressed sensing techniques aim to reconstruct images from a reduced set of
acquired data, potentially leading to a lower radiation dose for patients undergoing
CT scans. If the paper successfully demonstrates a reduction in radiation exposure
without compromising image quality, it would be a significant advantage.
2. Convolutional sparse coding methods may capture local spatial relationships more
effectively than traditional sparse coding approaches. This could lead to improved
image quality, reduced artifacts, and better preservation of fine details in the
reconstructed CT images.
3. Convolutional sparse coding, by incorporating local spatial relationships, may
contribute to achieving higher spatial resolution in reconstructed CT images.
Higher resolution is crucial for accurate diagnosis and better visualization of
anatomical structures.

Disadvantages
1. Convolutional sparse coding methods, especially when integrated into complex
algorithms or neural network architectures, can be computationally intensive. This
might pose challenges for real-time applications or environments with limited
computational resources.
2. Convolutional neural networks and sparse coding models often require large
amounts of training data to generalize well. If the proposed method demands
extensive training datasets, it could be a limitation in scenarios where such data
are scarce or difficult to obtain.
3. Convolutional sparse coding models might have numerous hyperparameters that
require careful tuning for optimal performance. Finding the right balance between
model complexity and generalization can be a challenging task.

2.7 A Variational Auto-Encoder Approach for Image Transmission in Noisy Channel [6]
The paper focuses on leveraging variational autoencoders for the transmission of images
over a noisy communication channel. Variational autoencoders are a type of generative
model that aims to learn a probabilistic representation of input data, and they have been
widely used in image processing and compression tasks. The paper discusses how the
variational autoencoder is structured and trained to encode images into a latent space and
decode them back to the original form, emphasizing its ability to handle noisy channel
conditions.
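
The noisy channel between encoder and decoder can be illustrated with a simple simulation. The sketch below assumes an additive white Gaussian noise channel applied to a latent vector at a chosen signal-to-noise ratio. The channel model, the function name `awgn_channel`, and the parameter `snr_db` are assumptions for this example and may differ from the setup used in the paper.

```python
import numpy as np

def awgn_channel(latent, snr_db, rng):
    """Add white Gaussian noise to a latent vector at the given SNR in decibels."""
    signal_power = np.mean(latent ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(latent.shape) * np.sqrt(noise_power)
    return latent + noise

# Example: a latent vector sent through a 10 dB channel.
rng = np.random.default_rng(0)
latent = np.ones(16)
received = awgn_channel(latent, snr_db=10.0, rng=rng)
```

A decoder trained on such corrupted latent vectors has to reconstruct the image from `received` rather than from the clean `latent`, which is the robustness property discussed in this section.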
Advantages
1. VAEs are known for their ability to generate data with inherent noise robustness.
The probabilistic nature of VAEs allows them to handle noise in the transmission
channel more effectively, resulting in improved image reconstruction under noisy
conditions.
2. VAEs encode images into a latent space, which often captures meaningful and
compact representations of the input data. This can lead to efficient transmission
as the information is concentrated in a lower-dimensional space.
3. VAEs are generative models, meaning they can generate new samples from the
learned latent space. This generative capability can be advantageous in scenarios
where reconstructed images need to be generated from partial or degraded data
received in a noisy channel.

Disadvantages
1. Although VAEs can generate samples from the learned latent space, the quality of
generated images may not always match the quality of the input images. This
could be a limitation in scenarios where high-fidelity image reconstruction is
crucial.
2. The latent space representation learned by VAEs might lack interpretability.
Understanding the significance of specific dimensions in the latent space may be
challenging, impacting the model's transparency and explainability.
3. The effectiveness of VAEs in handling noise may depend on the type and
characteristics of the noise. The model may not generalize well to certain types of
noise patterns, limiting its robustness.

2.8 A Comparative Study on Variational Autoencoder and Generative Adversarial Networks [7]
The paper explores and contrasts two prominent generative models: variational
autoencoders (VAEs) and generative adversarial networks (GANs). Both VAEs and
GANs are popular frameworks in the field of deep learning for generating realistic data,
such as images, and their comparative analysis provides valuable insights into their
strengths and weaknesses. The paper highlights the significance of generative models in
various applications, including image synthesis, data augmentation, and generative tasks.

Advantages
1. The paper aids researchers and practitioners in making informed decisions about
selecting the appropriate generative model for their specific tasks. Understanding
the advantages and disadvantages of both VAEs and GANs can guide the choice
based on the requirements of the application.
2. A comparative study offers a comprehensive overview of the strengths and
weaknesses of VAEs and GANs. This can serve as a valuable resource for readers
seeking a deeper understanding of these generative models.
3. The paper may provide insights into the architectural differences between VAEs
and GANs, explaining how each model operates and generates realistic data. This
knowledge can be beneficial for researchers aiming to design or modify
generative models.

Disadvantages
1. Comparative studies can be sensitive to the choice of datasets, hyperparameters,
and evaluation metrics. Small variations in these factors might lead to different
conclusions. It's essential for authors to thoroughly detail their experimental setup
to enhance the study's reproducibility.
2. Findings from a comparative study may be specific to the datasets and tasks
chosen for evaluation. The study's generalizability to different domains or
applications might be limited, and this limitation should be acknowledged.
3. VAEs and GANs can be sensitive to hyperparameter tuning. The study might not
capture the full range of each model's performance if certain hyperparameter
configurations are not explored.

Table 2.1 – Comparison of different approaches

2.9 Research Gap / Market Analysis

Analyzing research gaps and conducting a market analysis for image resolution involves
identifying areas where current research or market offerings fall short of meeting specific
needs or expectations.

Research Gaps-

• Super-Resolution Techniques for Real-time Applications: There is a gap in the
development of super-resolution techniques that can achieve high-quality image
upscaling in real-time or near-real-time scenarios, such as video streaming or live
broadcasts.
• Adversarial Attacks on Image Super-Resolution: The vulnerability of
super-resolution models to adversarial attacks is an emerging area that requires further
investigation. Understanding and mitigating vulnerabilities can be crucial for the
security of systems relying on high-resolution images.
• Cross-Domain Super-Resolution: Research addressing the challenges of
super-resolving images across different domains or modalities, such as medical imaging
or satellite imagery, is relatively limited. Developing models that can generalize
well to diverse domains is a research gap.

Market Analysis Considerations-


• Demand for High-Resolution Imaging Devices: Analyzing the market demand
for high-resolution imaging devices, such as cameras, smartphones, and drones, is
crucial. Understanding consumer preferences and industry needs can guide the
development of technologies that meet market expectations.
• Applications in Medical Imaging: There is a growing market for high-resolution
medical imaging devices and software. Analyzing the specific requirements of
medical professionals and healthcare institutions can help identify opportunities
for advancements in image resolution technologies.
• Entertainment and Gaming Industry: The entertainment and gaming industry
often requires high-resolution graphics for immersive experiences. Investigating
market trends and demands in these industries can guide the development of
technologies that cater to their specific needs.

CHAPTER 3
PROBLEM FORMULATION
Low-resolution images lack the fine details, sharpness, and clarity necessary for
applications such as medical imaging, surveillance, entertainment, and remote sensing.
This limitation arises from the inherent constraints of image sensors, hardware, or
transmission systems, which produce images with reduced visual fidelity. This project
addresses the challenge with diffusion-based probabilistic models: it revolves around the
development of a framework that produces high-resolution images from low-resolution
inputs using the principles of diffusion and denoising.
The goal of this project is to use SR3 (Super-Resolution via Repeated Refinement), a new
approach to conditional image generation, inspired by recent work on Denoising
Diffusion Probabilistic Models (DDPM) and denoising score matching. SR3 works by
learning to transform a standard normal distribution into an empirical data distribution
through a sequence of refinement steps. The key is a U-Net architecture that is trained
with a denoising objective to iteratively remove various levels of noise from an image.
We adapt DDPMs to image-to-image translation by proposing a simple effective
modification to the U-Net architecture. In contrast to GANs, which require inner-loop
maximization, we minimize a well-defined loss function. Unlike autoregressive models,
SR3 uses a constant number of inference steps regardless of output resolution. SR3
models work well across a range of magnification factors and input resolutions.
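
The denoising objective described in this chapter can be sketched as follows. In the SR3 paper, the conditioning is done by concatenating the upsampled low-resolution image with the noisy image along the channel dimension before it enters the U-Net; the sketch follows that idea. The function names, the use of mean squared error, and the way the noise level `gamma` is passed in are assumptions for this example, and no actual network is included.

```python
import numpy as np

def noisy_training_example(hr, lr_upsampled, gamma, rng):
    """Corrupt the HR image at noise level gamma and build the conditioned model input."""
    eps = rng.standard_normal(hr.shape)  # the noise the network must predict
    noisy = np.sqrt(gamma) * hr + np.sqrt(1.0 - gamma) * eps
    # Condition on the low-resolution image by channel-wise concatenation.
    model_input = np.concatenate([lr_upsampled, noisy], axis=-1)
    return model_input, eps

def denoising_loss(predicted_eps, eps):
    """Mean squared error between predicted and true noise (assumed choice of norm)."""
    return float(np.mean((predicted_eps - eps) ** 2))
```

Values of `gamma` close to 1 leave the image almost clean, while values close to 0 leave almost pure noise. Training over the whole range is what lets a single network remove the "various levels of noise" mentioned above.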

CHAPTER 4
STATUS AND ROADMAP
To identify research gaps and user demands, a complete literature study and market
analysis were conducted before beginning the implementation of SR3 for enhancing
low-resolution facial images.
Comprehensive testing on standard datasets is used to measure the average percentage
improvement in facial recognition accuracy compared to traditional upscaling methods.
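
The report does not name the traditional upscaling methods used as a baseline. As one common example, nearest-neighbour upscaling can be sketched as below; the function name and the integer scale factor are assumptions made for this illustration.

```python
import numpy as np

def upscale_nearest(image: np.ndarray, factor: int) -> np.ndarray:
    """Enlarge an image by repeating each pixel `factor` times along height and width."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)
```

Such a baseline adds no new detail to the image, so it gives a reference point against which the gain from a learned super-resolution model can be measured.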
The roadmap outlines important stages including requirement analysis, system design,
testing, deployment, optimization, monitoring, and documentation. Implementation
entails creating the application and incorporating the optimization method. Testing and
optimization phases aim for high accuracy, effective accessibility, and real-time
performance. The deployment phase includes a pilot release, iterative refinements, and
comprehensive documentation.

Table 4.1 – Schedule of project

REFERENCES

[1] Image Super Resolution via Iterative Refinement, Chitwan Saharia, Jonathan Ho,
William Chan, Tim Salimans, David J. Fleet, and Mohammad Norouzi, IEEE, 2023
[2] Dense Nested Attention Network for Infrared Small Target Detection, Boyang Li,
Chao Xiao, Longguang Wang, Yingqian Wang, Zaiping Lin, Miao Li, Wei An, and
Yulan Guo, IEEE, 2023
[3] Deep Convolutional Neural Network for Inverse Problems in Imaging, Kyong Hwan
Jin, Michael T. McCann, Emmanuel Froustey, and Michael Unser, IEEE, 2017
[4] High-Resolution Image Synthesis and Semantic Manipulation with Conditional
GANs, Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan
Catanzaro, NVIDIA Corporation, UC Berkeley, 2018
[5] Convolutional Sparse Coding for Compressed Sensing CT Reconstruction, Peng Bao,
Wenjun Xia, Kang Yang, et al., IEEE, 2019
[6] A variational auto-encoder approach for image transmission in noisy channel, Amir
Hossein Estiri, Ali Banaei, Benyamin Jamialahmadi, Mahdi Jafari Siavoshani, 2021
[7] A comparative study on variational autoencoder and generative adversarial networks,
Mirza Sami, Iftekharul Mobin, 2019
