Proposal FYP
1. Introduction
The proliferation of smart devices equipped with high-quality cameras and advanced image
processing applications, along with the widespread availability of desktop computers, has
resulted in a scenario where individuals can easily amass, store, and manipulate an
unprecedented volume of digital visual data. This is facilitated by the constant
interconnectivity of these devices, which are nearly always linked to both each other and
remote data servers through the Internet.
As a result, sharing videos and images has become commonplace, and they are now
regarded as essential sources of information in a variety of settings. Even professionals
rely heavily on smartphones to document everyday events and data [1]. Various digital
technologies [2], including effective compression methods, fast networks, and specialized
user apps, enable the wide-scale exchange of visual content. These apps include web
platforms such as social networks (e.g., Instagram) and forums (e.g., Reddit), which enable
the quick distribution of user-generated photographs and videos. Another contributing factor
is readily available, user-friendly image editing software, both paid (e.g., Adobe
Photoshop [3]) and open source (e.g., GIMP), as well as smartphone apps that enable quick,
basic image adjustments.
These elements have contributed to the spread of misleading or manipulated images and
videos whose true meaning or context has been significantly altered. Such alterations are
sometimes made maliciously, for example for political or financial gain [4]. As of 2022,
major social network platforms still struggled to filter falsified content well enough to
stop its quick and widespread dissemination, especially when it preys on the most gullible
individuals [5]. The division of legal liability for the potentially damaging effects of
spreading fake news is likewise becoming more complex [6].
These challenges arise because people are easily tricked and often find it difficult to
recognize subtle changes in visual information. The problem is aggravated by the cognitive
effect known as "change blindness", which makes it hard for people to notice changes in
visual material [7, 8]. Carefully designed digital techniques are therefore needed to
handle this issue properly.
Image forgery detection has practical applications across many domains. In forensic
analysis, it is an invaluable tool for identifying manipulated or tampered images, which
makes it essential in legal contexts where the authenticity of digital photographs or videos
used as evidence must be ensured [15]. In journalism and media, it plays a pivotal role in
verifying the authenticity of news photographs and videos, thereby upholding the credibility
and trustworthiness of media outlets [16]. The authentication of historical images and
artistic works also relies heavily on image forgery detection to thwart the creation and
circulation of counterfeit or forged pieces of art [17]. In the medical field, it safeguards
the integrity of medical images used for critical purposes such as diagnosis and
treatment [18]. Likewise, it helps maintain academic and scientific integrity by preventing
the submission of manipulated or fabricated research results, ensuring the credibility of
scholarly work [19]. On social media and online platforms, it helps combat the spread of fake
or manipulated images, maintaining the credibility of user-generated content [20]. Lastly,
it helps safeguard a company's brand reputation by identifying and mitigating the spread of
counterfeit or manipulated images that could harm its standing in the market [21]. Together,
these applications enhance the trustworthiness and authenticity of digital content.
2. Literature Review
Image tampering detection and localization is a challenging task, as forgers are developing
increasingly sophisticated techniques to tamper with images. However, researchers have
also made significant progress in developing new and more effective image forgery detection
methods in recent years. One of the most promising trends in image forgery detection is the
use of deep learning. Deep learning models can learn complex patterns in images, which can
be used to detect forged regions with high accuracy. Another promising trend is the use of
hybrid methods, which combine deep learning with traditional image processing techniques.
Hybrid methods can often achieve better performance than either deep learning or
traditional methods alone.
The work in [9] proposes a deep learning framework for image tampering detection and
localization. The framework has two main components: a feature extraction network and a
tampering detection and localization network. The feature extraction network is a deep
convolutional neural network (CNN) built from a series of convolutional and pooling layers:
the convolutional layers extract features from the input image, and the pooling layers
reduce the size of the resulting feature maps. The tampering detection and localization
network is a fully connected neural network that takes the CNN features as input and outputs
a probability map, which gives, for each pixel, the probability that the pixel has been
tampered with.
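To make the roles of the convolutional and pooling layers concrete, the following is a minimal sketch in plain Python. It is not the network from [9]: the toy image, the hand-picked kernel, and the function names conv2d_valid and max_pool2d are assumptions made only for illustration, and a trained CNN would learn its kernels from data.

```python
from typing import List

Grid = List[List[float]]


def conv2d_valid(image: Grid, kernel: Grid) -> Grid:
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning frameworks, this computes a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap: Grid, size: int = 2) -> Grid:
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(fmap[i + di][j + dj] for di in range(size) for dj in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Toy 6x6 image with a vertical edge between columns 2 and 3 (illustrative only).
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(6)]
# Hand-picked horizontal-gradient kernel; a trained CNN would learn its kernels.
kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]

features = conv2d_valid(image, kernel)  # 6x6 input, 3x3 kernel -> 4x4 feature map
pooled = max_pool2d(features)           # 2x2 pooling -> 2x2 feature map
print(len(features), len(features[0]))
print(pooled)
```

The convolution responds only where the kernel overlaps the edge, and pooling halves each spatial dimension while keeping the strongest response in each window.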
The work in [10] proposes a robust detection and localization technique for copy-move
forgery in digital images. The technique is based on a deep learning model that is trained
on a large dataset of copy-move forged images, and it consists of two main steps:
1. Feature extraction: The first step extracts features from the input image using a
deep learning model. The model is trained on a large dataset of natural images, and
it learns to extract features that are representative of the image content.
2. Tampering detection and localization: The second step detects and localizes
tampered regions in the image. A support vector machine (SVM), trained on a dataset
of copy-move forged images, classifies each patch in the image as either tampered or
non-tampered.
The technique is evaluated on several public datasets, and it is reported to achieve
state-of-the-art results on copy-move forgery detection and localization.
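The two steps above can be sketched as a patch-wise pipeline. This is a toy illustration under stated assumptions, not the method of [10]: raw pixel values stand in for the deep features, the linear decision function uses hypothetical hand-set weights rather than a trained SVM, and the names extract_patches, svm_decision, and classify_patches are invented for this example.

```python
from typing import List, Tuple

Image = List[List[float]]


def extract_patches(image: Image, size: int, stride: int) -> List[Tuple[int, int, List[float]]]:
    """Return (row, col, flattened_patch) for every full size x size patch."""
    patches = []
    for r in range(0, len(image) - size + 1, stride):
        for c in range(0, len(image[0]) - size + 1, stride):
            flat = [image[r + dr][c + dc] for dr in range(size) for dc in range(size)]
            patches.append((r, c, flat))
    return patches


def svm_decision(features: List[float], weights: List[float], bias: float) -> float:
    """Linear SVM decision value w . x + b (positive means 'tampered')."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify_patches(image: Image, size: int, stride: int,
                     weights: List[float], bias: float) -> List[Tuple[int, int, bool]]:
    """Label each patch as tampered (True) or non-tampered (False)."""
    return [
        (r, c, svm_decision(feats, weights, bias) > 0)
        for r, c, feats in extract_patches(image, size, stride)
    ]


# Toy 4x4 image; the top-right 2x2 block plays the role of a suspicious region.
image = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Hypothetical weights and bias: flag patches whose mean value exceeds 0.5.
weights, bias = [0.25, 0.25, 0.25, 0.25], -0.5

labels = classify_patches(image, size=2, stride=2, weights=weights, bias=bias)
print(labels)
```

In a real system the feature vector for each patch would come from the trained feature extractor, and the weights and bias would come from SVM training on labeled forged and authentic patches.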
The work in [11] proposes a deep learning-based image tampering detection and localization
method that uses attention mechanisms. The method has two main components: a feature
extraction network and a tampering detection and localization network. The feature
extraction network is a deep convolutional neural network (CNN) built from a series of
convolutional and pooling layers: the convolutional layers extract features from the image,
and the pooling layers reduce the size of the feature maps. The tampering detection and
localization network is a fully connected neural network that takes the CNN features as
input and outputs a probability map giving, for each pixel, the probability that the pixel
has been tampered with. The attention mechanisms learn to identify and focus on the regions
of the image that are most likely to have been tampered with. The method is evaluated on
several public datasets, and it is reported to achieve state-of-the-art results on both
tampering detection and localization.
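The idea of focusing on the most important regions can be illustrated with attention-weighted pooling: each image location gets a score, the scores are normalized with a softmax, and the location features are averaged using those weights. This is a generic sketch, not the attention design of [11]; the scores and feature vectors below are hypothetical values, and in a trained model the scores would be produced by learned layers.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: List[List[float]],
                   scores: List[float]) -> Tuple[List[float], List[float]]:
    """Weight each location's feature vector by its attention weight and sum."""
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


# Three image locations with 2-dimensional features (hypothetical values).
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
# Hypothetical attention scores: the second location looks most suspicious.
scores = [0.1, 2.0, 0.3]

weights, pooled = attention_pool(features, scores)
print([round(w, 3) for w in weights])
print([round(p, 3) for p in pooled])
```

Because the second location has the highest score, it dominates the pooled representation, which is how attention lets later layers concentrate on likely tampered regions.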
The work in [12] proposes a multi-semantic CRF-based attention model for image forgery
detection and localization. The model has three main components: a feature extraction
network, an attention module, and a conditional random field (CRF) layer. The feature
extraction network is a deep convolutional neural network (CNN) built from a series of
convolutional and pooling layers: the convolutional layers extract features from the image,
and the pooling layers reduce the size of the feature maps. The attention module learns to
identify and focus on the regions of the image that are most likely to have been tampered
with. The CRF layer refines the predictions of the attention module by considering the
spatial and contextual relationships between pixels, which produces more accurate
predictions. The model is evaluated on several public datasets, and it is reported to
achieve state-of-the-art results on both tampering detection and localization.
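How a CRF can use spatial relationships to refine per-pixel predictions can be sketched with a few mean-field updates on a 4-connected grid with a Potts pairwise term. This is a deliberately simplified stand-in and not the multi-semantic CRF of [12]: the function name mean_field_refine, the pairwise weight, the iteration count, and the toy probability map are all assumptions for illustration.

```python
import math
from typing import List

Grid = List[List[float]]


def mean_field_refine(prob_map: Grid, pairwise_weight: float = 1.0,
                      iterations: int = 5) -> Grid:
    """Refine P(tampered) per pixel with mean-field updates on a grid CRF.

    Unary energies come from the input probabilities; the Potts pairwise
    term penalizes a label by the expected number of disagreeing neighbours.
    """
    eps = 1e-6
    h, w = len(prob_map), len(prob_map[0])
    # (energy of label 0, energy of label 1) for every pixel.
    unary = [[(-math.log(max(1.0 - p, eps)), -math.log(max(p, eps))) for p in row]
             for row in prob_map]
    q = [row[:] for row in prob_map]  # current belief that each pixel is tampered
    for _ in range(iterations):
        new_q = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                nbrs = [(i + di, j + dj)
                        for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1))
                        if 0 <= i + di < h and 0 <= j + dj < w]
                e0 = unary[i][j][0] + pairwise_weight * sum(q[a][b] for a, b in nbrs)
                e1 = unary[i][j][1] + pairwise_weight * sum(1.0 - q[a][b] for a, b in nbrs)
                m = min(e0, e1)  # subtract the minimum for numerical stability
                z0, z1 = math.exp(-(e0 - m)), math.exp(-(e1 - m))
                new_q[i][j] = z1 / (z0 + z1)
        q = new_q
    return q


# Toy map: a confidently tampered region with one uncertain pixel in the middle.
prob_map = [
    [0.9, 0.9, 0.9],
    [0.9, 0.4, 0.9],
    [0.9, 0.9, 0.9],
]
refined = mean_field_refine(prob_map)
print(round(refined[1][1], 3))
```

The centre pixel starts below 0.5 but is pulled above it because all of its neighbours are confidently labeled as tampered, which is the kind of contextual smoothing a CRF layer provides.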
In [13], a copy-move forgery detection method based entirely on Deep Learning (DL) is
introduced. Unlike traditional approaches, it requires no preprocessing to compute separate
features. The authors devised a Convolutional Neural Network (CNN) with the following
architecture:
1. Six convolutional layers, each followed by a max pooling layer.
2. A Global Average Pooling (GAP) layer, which functions like a fully connected dense
layer and serves to reduce the network's parameters and mitigate overfitting.
3. A softmax classifier with two classes: authentic or forged.
This architecture was designed to effectively identify copy-move forgeries in images.
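The final stages of such a network, global average pooling followed by a two-class softmax, can be sketched as follows. This is an illustration under assumptions rather than the implementation from [13]: the feature maps, the small linear layer that maps pooled values to two logits, and its weights are hypothetical.

```python
import math
from typing import List

FeatureMap = List[List[float]]


def global_average_pool(feature_maps: List[FeatureMap]) -> List[float]:
    """Collapse each H x W feature map to a single value: its mean."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature_maps: List[FeatureMap],
             class_weights: List[List[float]],
             class_bias: List[float]) -> List[float]:
    """Return [P(authentic), P(forged)] from the last conv feature maps."""
    pooled = global_average_pool(feature_maps)
    logits = [
        sum(w * x for w, x in zip(ws, pooled)) + b
        for ws, b in zip(class_weights, class_bias)
    ]
    return softmax(logits)


# Two hypothetical 2x2 feature maps from the last convolutional block.
feature_maps = [
    [[1.0, 3.0], [5.0, 7.0]],   # mean 4.0
    [[0.0, 0.0], [0.0, 0.0]],   # mean 0.0
]
# Hypothetical weights: one row per class (authentic, forged).
class_weights = [[1.0, 0.0], [0.0, 1.0]]
class_bias = [0.0, 0.0]

probs = classify(feature_maps, class_weights, class_bias)
print([round(p, 3) for p in probs])
```

Global average pooling has no trainable parameters of its own, which is why using it before the classifier keeps the parameter count low compared with flattening the feature maps into a large dense layer.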
Table 1
Method | Detected Attacks | Acc% | CASIA1 Acc% | CASIA2 Acc% | ImageNet Acc% | MICC-F220 Acc% | MICC-F60 Acc% | MICC-F2000 Acc% | Columbia Acc% | CoMoFoD Acc%
In [14], the authors created a system to identify DeepFakes in images. They built
upon the XceptionNet design, initially introduced by Google in another research paper [15].
The standout feature of this model is its use of a specialized layer named SeparableConv,
which factorizes a standard convolution into a depthwise convolution (one spatial filter per
input channel) followed by a pointwise 1x1 convolution that mixes channels. This
factorization reduces the model's weight count.
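The reduction in weight count can be checked with simple arithmetic. The sketch below compares a standard convolution with a depthwise separable one; the layer sizes (3x3 kernel, 128 input channels, 256 output channels) are arbitrary example values rather than figures from [14], and bias terms are ignored.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard convolution: every output channel sees every input channel."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one spatial filter per input channel
    pointwise = in_channels * out_channels               # 1x1 convolution mixing channels
    return depthwise + pointwise


# Example layer sizes chosen only for illustration.
k, c_in, c_out = 3, 128, 256
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
print(standard, separable, round(standard / separable, 1))
```

For these sizes the standard layer has 294,912 weights and the separable layer has 33,920, roughly an 8.7-fold reduction.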
In [15], a new forgery detection approach called Capsule-Forensics was introduced. This
approach stands out by employing Capsule Networks, a specific type of neural network
introduced in [16], as the binary detector, rather than conventional convolutional neural
networks. Capsule Networks were developed to effectively represent hierarchical
relationships among objects in an image. They not only estimate the probability that an
object is present but also infer important details about its position and orientation.
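One common way capsule networks encode both quantities in a single output vector is the "squash" nonlinearity: the vector's length is mapped into the range [0, 1) and read as a presence probability, while its direction is left unchanged to carry pose information. The sketch below shows that function in isolation; whether [15] uses exactly this form is an assumption here, and the input vector is a made-up example.

```python
import math
from typing import List


def squash(vector: List[float]) -> List[float]:
    """Shrink a capsule's output so its length lies in [0, 1).

    The direction of the vector is preserved; only its length changes,
    following length -> length^2 / (1 + length^2).
    """
    norm_sq = sum(x * x for x in vector)
    if norm_sq == 0.0:
        return [0.0 for _ in vector]
    norm = math.sqrt(norm_sq)
    scale = norm_sq / (1.0 + norm_sq) / norm
    return [scale * x for x in vector]


def length(vector: List[float]) -> float:
    """Euclidean length, interpreted as the probability that the entity is present."""
    return math.sqrt(sum(x * x for x in vector))


raw = [3.0, 4.0]       # hypothetical raw capsule output with length 5
out = squash(raw)
print(round(length(out), 4))
```

A long input vector ends up with a length close to 1 (entity very likely present), a short one with a length close to 0, and in both cases the ratio between components stays the same.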
In [24], the researchers introduced a deep learning approach designed to identify
DeepFakes. Their method was built upon a key insight: DeepFakes generation algorithms
often produce noticeable irregularities in the face region due to differences in resolution
between the source and target image or video. Specifically, face images generated by
Generative Adversarial Networks (GANs) are typically set at a fixed low resolution. When
applied to a target video, an affine warping is necessary to align the source face with the
facial landmarks of the target face. However, if the resolutions of the source and target
videos don't match or if the facial landmarks of the target person significantly deviate from
the standard frontal view, these irregularities become increasingly prominent and
detectable.
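The alignment step described above can be illustrated by applying a 2x3 affine matrix to landmark coordinates. This is only a geometric sketch: the matrix values and landmark positions are hypothetical, and the method in [24] warps whole face images, not just a handful of points.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def apply_affine(matrix: Sequence[Sequence[float]], points: List[Point]) -> List[Point]:
    """Map each (x, y) through [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = matrix
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


# Hypothetical transform: upscale a low-resolution face by 2 and shift it
# to its position in the target frame.
matrix = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 5.0],
]
# Hypothetical landmark coordinates in the generated (source) face.
source_landmarks = [(0.0, 0.0), (1.0, 1.0), (3.0, 2.0)]

warped = apply_affine(matrix, source_landmarks)
print(warped)
```

Upscaling by a factor of 2, as in this example, spreads each generated pixel over several target pixels, which is one way the resolution mismatch mentioned above can leave detectable artifacts.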
[Figure: bar chart comparing Rossler et al. [14], Nguyen et al. [15], and Li and Lyu [24]; vertical axis from 0 to 100]
3. Problem Statement
The review of existing literature highlights a crucial issue: no single method excels
across the benchmark datasets CASIA1, CASIA2, ImageNet, MICC-F220, MICC-F60, MICC-F2000,
Columbia, and CoMoFoD. This gap emphasizes the need for a model that performs well on all of
these datasets, ensuring accurate and efficient forgery detection. In addition, there is
currently no easily accessible tool that lets users comprehensively detect image forgery.
To address this gap and to create a user-friendly forgery detection tool, it is vital to
devise a method that performs strongly across all the mentioned datasets while maintaining
high detection accuracy.
4. Proposed Solution
The problem will be tackled with a solution based on the Transformer architecture for image
forgery detection and localization. The Transformer architecture is well known for its
performance on natural language processing tasks, and its attention-based mechanism
facilitates effective feature extraction, allowing the model to discern intricate patterns
and irregularities indicative of image forgeries. The model will be trained on diverse
datasets, including CASIA1, CASIA2, ImageNet, MICC-F220, MICC-F60, MICC-F2000, Columbia, and
CoMoFoD. Once trained, the model will be exposed through APIs and integrated into a
user-friendly tool that lets general users easily detect and localize forgeries in images,
addressing the need for a comprehensive and accessible forgery detection solution.
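The attention-based mechanism at the core of the Transformer is scaled dot-product self-attention. The sketch below applies it to a few image patch embeddings; it is a simplified illustration and not the proposed model: the learned query, key, and value projections are omitted (each token serves as its own query, key, and value), and the embedding values are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Each output token is a weighted average of all tokens, where the weights
    come from softmax(q . k / sqrt(d)).
    """
    d = len(tokens[0])
    outputs = []
    for query in tokens:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in tokens]
        weights = softmax(scores)
        outputs.append([sum(w * value[i] for w, value in zip(weights, tokens))
                        for i in range(d)])
    return outputs


# Hypothetical 2-dimensional embeddings for three image patches.
patch_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
attended = self_attention(patch_embeddings)
print([[round(x, 3) for x in row] for row in attended])
```

Every patch can attend to every other patch in a single step, which is what would let such a model relate a suspicious region to distant parts of the image, for example a copied region and its source.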
References
1. López-García, X., et al., Mobile journalism: Systematic literature review. 2019.
2. Passarella, A., A survey on content-centric technologies for the current Internet: CDN and P2P
solutions. Computer Communications, 2012. 35(1): p. 1-32.
3. Adobe. Adobe Photoshop. [cited 2023; Available from:
https://www.adobe.com/it/products/photoshop.html].
4. Shen, C., et al., Fake images: The effects of source, intermediary, and digital media literacy on
contextual assessment of image credibility online. New media & society, 2019. 21(2): p. 438-
463.
5. Spohr, D., Fake news and ideological polarization: Filter bubbles and selective exposure on
social media. Business information review, 2017. 34(3): p. 150-160.
6. Goldman, E., The complicated story of FOSTA and section 230. First Amend. L. Rev., 2018. 17:
p. 279.
7. Nightingale, S.J., K.A. Wade, and D.G. Watson, Can people identify original and manipulated
photos of real-world scenes? Cognitive research: principles and implications, 2017. 2(1): p. 1-
21.
8. Schetinger, V., et al., Humans are easily fooled by digital images. Computers & Graphics,
2017. 68: p. 142-151.
9. https://www.jetir.org/papers/JETIR2309239.pdf
10. https://www.sciencedirect.com/science/article/pii/S1319157822004323
11. https://www.mdpi.com/2227-7390/10/20/3852
12. https://www.sciencedirect.com/science/article/pii/S0165168421000906
13. Elaskily M, Elnemr H, Sedik A, Dessouky M, El Banby G, Elaskily O, Khalaf AAM, Aslan H,
Faragallah O, El-Samie FA (2020) A novel deep learning framework for copy-move forgery
detection in images. Multimed Tools Appl 79. https://doi.org/10.1007/s11042-020-08751-7
14. Rossler A, Cozzolino D, Verdoliva L, Riess C, Thies J, Nießner M (2019) Faceforensics++:
learning to detect manipulated facial images
15. Nguyen H, Yamagishi J, Echizen I (2019) Use of a capsule network to detect fake images and
videos
16. Hinton GE, Krizhevsky A, Wang SD (2011) Transforming auto-encoders. In: Honkela T, Duch W,
Girolami M, Kaski S (eds) Artificial Neural Networks and Machine Learning – ICANN 2011.
Springer, Berlin, pp 44–51
17. Fridrich, J., Soukal, D., & Lukas, J. (2003). Detection of Copy-Move Forgery in
Digital Images.
18. Farid, H. (2009). Digital Image Forensics: A Primer.
19. Christlein, V., Riess, C., Jordan, J., & Riess, C. (2012). An Evaluation of Popular Copy-
Move Forgery Detection Approaches.
20. Liu, G., Zhang, T., Lu, W., & Ma, S. (2014). A Survey of Passive Image Tampering
Detection.
21. Hussain, M., Muhammad, G., & Bebis, G. (2017). A Survey of Image Forgery Detection
Techniques.
22. Bappy, J. H., Roy-Chowdhury, A. K., & Bunk, J. (2017). Exploiting Spatial Structure for
Localizing Manipulated Image Regions.
23. Johnson, M. K., & Farid, H. (2015). Exposing Digital Forgeries in Interlaced and
Deinterlaced Video.
24. Li Y, Lyu S (2018) Exposing deepfake videos by detecting face warping artifacts