Review
Deepfake Attacks: Generation, Detection, Datasets, Challenges,
and Research Directions
Amal Naitali 1,*, Mohammed Ridouani 1, Fatima Salahdine 2 and Naima Kaabouch 3,*
1 RITM Laboratory, CED Engineering Sciences, Hassan II University, Casablanca 20000, Morocco;
[email protected]
2 Department of Electrical and Computer Engineering, The University of North Carolina at Charlotte,
Charlotte, NC 28223, USA; [email protected]
3 School of Electrical Engineering and Computer Science, University of North Dakota, Grand Forks,
ND 58202, USA
* Correspondence: [email protected] (A.N.); [email protected] (N.K.)
Abstract: Recent years have seen a substantial increase in interest in deepfakes, a fast-developing
field at the nexus of artificial intelligence and multimedia. These artificial media creations, made
possible by deep learning algorithms, allow for the manipulation and creation of digital content that
is extremely realistic and challenging to identify from authentic content. Deepfakes can be used for
entertainment, education, and research; however, they pose a range of significant problems across
various domains, such as misinformation, political manipulation, propaganda, reputational damage,
and fraud. This survey paper provides a general understanding of deepfakes and their creation; it also
presents an overview of state-of-the-art detection techniques, existing datasets curated for deepfake
research, as well as associated challenges and future research trends. By synthesizing existing
knowledge and research, this survey aims to facilitate further advancements in deepfake detection
and mitigation strategies, ultimately fostering a safer and more trustworthy digital environment.
Keywords: deepfake detection; face forgery; deep learning; generative artificial intelligence; vision
transformers
Citation: Naitali, A.; Ridouani, M.; Salahdine, F.; Kaabouch, N. Deepfake Attacks: Generation, Detection, Datasets, Challenges, and Research Directions. Computers 2023, 12, 216. https://doi.org/10.3390/computers12100216
Academic Editors: Aditya Kumar Sahu, Amine Khaldi and Jatindra Kumar Dash
Received: 19 September 2023; Revised: 11 October 2023; Accepted: 20 October 2023; Published: 23 October 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Deepfakes are produced by manipulating existing videos and images to produce realistic-looking but wholly fake content. The rise of advanced artificial intelligence-based tools and software that require no technical expertise has made deepfake creation easier. With the unprecedented advancement the world is currently witnessing in generative artificial intelligence, the research community is in dire need of keeping informed on the most recent developments in deepfake generation and detection technologies so as not to fall behind in this critical arms race.

Deepfakes present a number of serious issues that arise in a variety of fields. These issues could significantly impact people, society [1], and the reliability of digital media [2]. Significant issues include fake news, which can lead to the propagation of deceptive information, manipulation of public opinion, and erosion of trust in media sources. Deepfakes can also be employed as tools for political manipulation, to influence elections, and to destabilize public trust in political institutions [3,4]. In addition, this technology enables malicious actors to create and distribute non-consensual explicit content to harass and cause reputational damage, or to create convincing impersonations of individuals, deceiving others for financial or personal gain [5]. Furthermore, the rise of deepfakes poses a serious issue in the domain of digital forensics, as it contributes to a general crisis of trust and authenticity in digital evidence used in litigation and criminal justice proceedings. All of these impacts show that deepfakes present a serious threat, especially given the current sensitive state of the international political climate, the high stakes of ongoing global conflicts, and the ways deepfakes and fake news can be weaponized in the media war, with potentially catastrophic consequences.
Therefore, deepfake detection techniques need to be constantly improved to keep pace with the fast evolution of generative artificial intelligence. Literature reviews are needed to keep up with this fast-changing field so that researchers and professionals can develop robust countermeasures and lay the groundwork for easier detection and mitigation of deepfakes.
The key contributions of this survey paper are as follows:
• A summary of the state-of-the-art deepfake generation and detection techniques;
• An overview of fundamental deep learning architectures used as backbones in deepfake video detection models;
• A list of existing deepfake datasets contributing to the improvement of the performance, generalization, and robustness of deepfake detection models;
• A discussion of the limitations of existing techniques, challenges, and research directions in the field of deepfake detection and mitigation.
The remainder of this paper is organized as follows. Section 2 provides an outline of
the most recent, existing survey papers related to deepfake technology. Section 3 is devoted
to deepfake manipulation techniques for generating deepfakes. Section 4 describes existing
deepfake detection techniques. Section 5 gives a list of existing datasets used for deepfake
research. In Section 6, we discuss some of the challenges and future research directions of
the deepfake field. Finally, the survey ends with a conclusion.
2. Related Surveys
Multiple surveys of the literature in the area of deepfake detection have been published
in recent years as the topic is advancing rapidly. For instance, the authors of [6] offered a
systematic literature review with a new, interdisciplinary viewpoint on deepfakes. They
provided a meticulous definition of deepfakes and discussed the impact of the creation
and spread of deepfakes. They also suggested future research directions for innovation.
Alternatively, the authors of [7] provided a rich review paper that has an exhaustive
breakdown of deepfake types alongside the technology leveraged in their creation and
detection, as well as open-source datasets and future trends in deepfake technology. In
Ref. [8], the authors focused their systematic review on deepfake detection technology. They covered machine learning and deep learning methods alongside statistical and blockchain-based techniques, assessed how well each method performs when applied to diverse datasets, and offered some recommendations on deepfake detection that may aid
future studies. In Ref. [9], the author presented recent deepfake advancements, covering four face manipulation types along with their generation, detection methods, and future prospects.
In Ref. [10], the authors explored the background and methods of deepfakes before
looking at the development of improved and resilient deep learning techniques to combat
their use. In Ref. [11], the authors provided a survey with an extensive summary of
deepfake manipulation types, the tools and technology used to generate deepfakes, a
technical background of deep learning models and public datasets, and the challenges
faced in the creation and detection of deepfakes. In [12], the authors presented a
detailed review of deepfake manipulation types and their generation processes, as well as
several detection methods and the features leveraged, alongside some issues that demand
serious consideration in future studies. The authors of [13] offered in their survey a technical
background on the architecture used in deepfake creation that deals with two manipulation
types: reenactment and replacement. In addition to detection technologies and prevention
solutions, they mentioned several inadequacies of the available defense options and areas
that need more focus.
In a detailed survey [14], the authors covered several topics of deepfake manipulation,
including audio deepfakes, the technology used in its creation and detection, performance
metrics, and publicly available datasets, in addition to a discussion about the limitations
and future trends in the field of deepfakes. An analysis of several CNN- and RNN-based
deepfake video detection models was described in [15]. In addition, other surveys [16–20]
offered a boiled-down summary of the principal elements in the field of deepfakes, such as
their definition, impact, creation process, and detection methods. Table 1 gives a summary
of topics covered and not covered by the above-mentioned survey papers.
3. Deepfake Generation
In this section, we first present the various types of deepfake manipulations and then provide an overview of deepfake generation techniques.
modifies specific facial attributes in images while preserving the person’s identity. The
model can work with labeled and unlabeled data and has shown promising results in
accurately controlling attribute changes. Another technique is StarGANv2 [43], which
is able to perform multi-domain image-to-image translation, enabling the generation of
images across multiple different domains using a single unified model. Unlike the original
StarGAN [44], which could only perform one-to-one translation between each pair of
domains, StarGANv2 [43] can handle multiple domains simultaneously. An additional
GAN variant is CycleGAN [45], which specializes in style transfer between two domains.
It can be applied to transfer facial features from one individual to another, making it useful
for face-swapping applications. Moreover, there is RSGAN [46], which can encode the
appearances of faces and hair into underlying latent space representations, enabling the
image appearances to be modified by manipulating the representations in the latent spaces.
For a given audio input, LipGAN [47] is intended to produce realistic lip motions and
speech synchronization.
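The latent-space manipulation underlying models such as RSGAN can be illustrated with a short sketch. This is a generic illustration, not any specific model's code: the latent codes are arbitrary vectors, and the pretrained decoder that would render them back into images is assumed rather than shown.

```python
import numpy as np

def latent_path(z_a, z_b, steps=5):
    """Points on the straight line between two latent codes.
    Decoding each point with a pretrained generator (not shown)
    would gradually blend the appearance encoded in z_a into z_b."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([(1.0 - t) * z_a + t * z_b for t in ts])
```

Walking such a path is one simple way latent representations can be edited to morph one face's appearance into another's.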
In addition to the previously mentioned methods, there is a range of open-source tools readily available for digital use, enabling users to create deepfakes with relative ease, such as FaceApp [48], Reface [49], DeepBrain [50], DeepFaceLab [51], and Deepfakes Web [52].
These tools have captured the public’s attention due to their accessibility and ability to
produce convincing deepfakes. It is essential for users to utilize these tools responsibly
and ethically to avoid spreading misinformation or engaging in harmful activities. As
artificial intelligence is developing fast, deepfake generation algorithms are simultaneously
becoming more sophisticated, convincing, and hard to detect.
4. Deepfake Detection
This section will point out the diverse clues and detection models exploited to distinguish fake media from genuine media. Next, it will delve into the various
state-of-the-art deep learning architectures implemented in deepfake detection techniques
and provide a summary of several recent deepfake detection models.
Figure 2. Clues and features employed by deepfake detection models in the identification of deepfake content.
4.1.1. Detection Based on Spatial Artifacts

To effectively use face landmark information, in Ref. [53], Liang et al. described a facial geometry prior module. The model harnesses facial maps and correlation within the frequency domain to study the distinguishing traits of altered and unmanipulated regions by employing a CNN-LSTM network. In order to predict manipulation localization, a decoder is utilized to acquire the mapping from low-resolution feature maps to pixel-level details, and a SoftMax function was implemented for the classification task. A different approach, dubbed forensic symmetry, by Li, G. et al. [54], assessed whether the natural features of a pair of mirrored facial regions are identical or dissimilar. The symmetry attribute extracted from frontal facial images and the resemblance feature obtained from profiles of the face images are obtained by a multi-stream learning structure that uses DRN as its backbone network. The difference between the two symmetrical face patches is then quantified by mapping them into angular hyperspace. A heuristic prediction technique was used to put this model into operation at the video level. As a further step, a multi-margin angular loss function was developed for classification.
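As an illustration of the frequency-domain clues such detectors exploit, the sketch below computes the log-magnitude spectrum of an image patch. This is a generic example, not the module from [53]; frequency-based methods inspect such spectra for the periodic, grid-like peaks that up-sampling layers in generators tend to leave behind.

```python
import numpy as np

def frequency_features(patch):
    """Shifted 2D FFT log-magnitude spectrum of a grayscale patch.
    Frequency-domain detectors look here for grid-like artifacts
    introduced by generator up-sampling operations."""
    spectrum = np.fft.fftshift(np.fft.fft2(patch))
    return np.log1p(np.abs(spectrum))
```

A classifier would then be trained on these spectra (or statistics of them) rather than on raw pixels.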
Hu et al. [55] proposed DeepfakeMAE, which is a detection model that can leverage the commonalities across all facial components. To be more specific, a masked autoencoder is pretrained to learn facial part consistency by randomly masking some facial features and rebuilding the missing sections using the facial parts that are still visible. This is performed given a real face image. Moreover, a model employing two networks, both utilizing pre-trained encoders and decoders, is leveraged to optimize the differentiation between authentic and counterfeit videos. Yang, J. et al. [56] tackled deepfake detection from a different perspective where they simulate the fake image generation process to explore
StyleGAN2 [69]. Next, naive classifiers are trained to differentiate between real images and
those produced by these designs.
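The random-masking step of masked-autoencoder pretraining can be sketched as follows. This is a generic illustration rather than DeepfakeMAE's actual implementation; the patch size and mask ratio are arbitrary choices.

```python
import numpy as np

def mask_patches(image, patch=4, ratio=0.5, seed=0):
    """Zero out a random fraction of non-overlapping patches; an
    autoencoder is then trained to rebuild them from the visible
    regions. Assumes image height and width are multiples of `patch`."""
    rng = np.random.default_rng(seed)
    out = image.copy()
    h, w = image.shape[:2]
    cells = [(i, j) for i in range(0, h, patch) for j in range(0, w, patch)]
    n_mask = int(len(cells) * ratio)
    for idx in rng.choice(len(cells), size=n_mask, replace=False):
        i, j = cells[idx]
        out[i:i + patch, j:j + patch] = 0.0
    return out
```

Training the model to inpaint the masked cells forces it to learn the consistency of facial parts, which manipulated faces tend to violate.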
temporal consistency by rapidly scanning the entire video, and a second pathway improved
by an attention branch to analyze key frames of the video at a lower rate.
Figure 3. Overview of predominant deep learning architectures, networks, and frameworks employed in the development of deepfake detection models.
Furthermore, data augmentation plays a crucial role in training deep learning-based detection models. This technique involves augmenting the training dataset with synthetic or modified samples, which enhances the model's capacity to generalize and recognize diverse variations of deepfake media. Data augmentation enables the model to learn from a wider range of examples and improves its robustness against different types of manipulations. Attention mechanisms have also proven to be valuable additions to deep learning-based detection models. By directing the model's focus toward relevant features and regions of the input data, attention mechanisms enhance the model's discriminative power and improve its overall accuracy. These mechanisms help the model select critical details [92], making it more effective in distinguishing between real and fake media. Collectively, the combination of deep learning-based architectures, meta-learning, data augmentation, and attention mechanisms has significantly advanced the field of deepfake detection. These technologies work in harmony to equip models with the ability to identify and flag manipulated media with unprecedented accuracy.

The Convolutional Neural Network is a powerful deep learning algorithm designed for image recognition and processing tasks. It consists of various levels, encompassing convolutional layers, pooling layers, and fully connected layers. There are different types of CNN models used in deepfake detection, such as ResNet [93], short for Residual Network, an architecture that introduces skip connections to fix the vanishing gradient problem that occurs when the gradient diminishes significantly during backpropagation; these connections stack identity mappings and skip over them, utilizing the layer's prior activations. This technique accelerates initial training by reducing the effective number of layers in the network. The concept underlying this network is different from having the layers learn the underlying mapping directly. Rather than defining the desired mapping as H(x), we let the network adapt and determine the residual, as shown in Figure 4:

F(x) := H(x) − x, which gives H(x) = F(x) + x.

Figure 4. ResNet building block (source: [93]).
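The identity shortcut can be sketched in a few lines. This is a toy fully-connected analogue of the convolutional block, for illustration only; the weight matrices are arbitrary.

```python
import numpy as np

def residual_block(x, W1, W2):
    """Compute relu(F(x) + x) with F(x) = W2 @ relu(W1 @ x).
    The skip connection adds the input back, so the stacked layers
    only have to learn the residual F(x) = H(x) - x."""
    relu = lambda z: np.maximum(z, 0.0)
    return relu(W2 @ relu(W1 @ x) + x)
```

With all weights at zero the block reduces to the identity on non-negative inputs, which is why deep stacks of such blocks remain trainable: gradients flow through the shortcut even when the learned residual is small.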
Inception [89] models help mitigate the computational cost and overfitting issues in CNN architectures by utilizing stacked 1 × 1 convolutions for dimensionality reduction.
Xception [94], developed by researchers at Google, is an advanced version of the Inception
architecture. It offers a novel approach by reinterpreting Inception modules as an interme-
diate step between standard convolution and depthwise separable convolution. While the
conventional convolution operation combines channel-wise and spatial-wise computations
in a single step, depthwise separable convolution divides this process into two distinct
steps. Firstly, it employs depthwise convolution to apply an individual convolutional filter
to each input channel, and subsequently, pointwise convolution is employed to create a
linear combination of the results obtained from the depthwise convolution.
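The saving from this two-step factorization is easy to quantify. The sketch below compares weight counts for an assumed 3 × 3 convolution mapping 128 to 256 channels (bias terms ignored); the channel sizes are illustrative, not taken from Xception itself.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution: every output channel
    mixes all input channels spatially in a single step."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1 x 1
    pointwise convolution to mix the channels."""
    return k * k * c_in + c_in * c_out

standard = conv_params(3, 128, 256)       # 294,912 weights
separable = separable_params(3, 128, 256) # 33,920 weights
```

For this configuration the separable form uses roughly 8.7 times fewer weights, which is where Xception's efficiency comes from.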
An alternative to CNNs would be Capsule Networks [90] that are able to retrieve
spatial information as well as other important details to avoid the information loss seen
during pooling operations. Capsules exhibit equivariance characteristics and consist of a
neural network that handles vectors as inputs and outputs, in contrast to the scalar values
processed by CNNs. This unique attribute of capsules enables them to capture not only the
features of an image, but also its deformations and various viewing conditions. Within a
capsule network, each capsule comprises a cluster of neurons, with each neuron’s output
signifying a distinct attribute of the same feature. This structure offers the advantage of
recognizing the entire entity by first identifying its constituent parts.
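The vector outputs described above are typically passed through a "squash" non-linearity so that a capsule's length can be read as a probability. A minimal sketch, following the commonly used formulation:

```python
import numpy as np

def squash(s, eps=1e-9):
    """Shrink a capsule's output vector to length in [0, 1) while
    preserving its direction, so the vector's length can represent
    the probability that the detected entity is present."""
    norm_sq = np.sum(s * s, axis=-1, keepdims=True)
    norm = np.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)
```

Long vectors are squashed to lengths just below 1 and short ones toward 0, while the direction, which encodes the entity's pose and deformation, is untouched.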
Recurrent Neural Networks are a kind of neural network that handles sequential data
by feeding it in a sequential manner. They are specifically designed to tackle the challenge
of time-series data, where the input is a sequence of data points. In an RNN, the input not
only includes the current data point but also the previous ones. This creates a directed graph
structure between the nodes, following the temporal sequence of the data. Additionally,
each neuron in an RNN has its own internal memory, which retains information from the
computations performed on the previous data points. LSTM, or Long Short-Term Memory,
is a specific type of recurrent neural network that addresses the challenge of long-term
dependencies in sequential data by allowing more accurate predictions based on recent
information. While traditional RNNs struggle as the gap between relevant information
increases, LSTM networks excel at retaining information over extended periods. This
capability makes LSTM particularly effective for processing, predicting, and classifying
time-series data.
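A single LSTM step can be sketched in NumPy to show how the gates retain or discard memory. This is the generic textbook formulation with an assumed gate ordering in the stacked weight matrices, not any particular detector's implementation.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,),
    stacked in the order: input, forget, output, candidate gates."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # how much new information to write
    f = sigmoid(z[H:2 * H])      # how much old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # how much of the cell state to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values to write
    c = f * c_prev + i * g       # updated long-term memory
    h = o * np.tanh(c)           # updated hidden state (output)
    return h, c
```

In video-based detection, such a cell would be applied frame by frame to per-frame CNN features, letting the classifier use temporal inconsistencies as a cue.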
A new model that has emerged as a strong alternative to convolutional neural networks is the vision transformer [91]. ViT models exhibit exceptional performance, matching or surpassing state-of-the-art CNNs in accuracy while requiring roughly four times fewer computational resources to train.
Transformers, which are non-sequential deep learning models, play a significant role in
vision transformers. They utilize the self-attention mechanism, assigning varying degrees
of importance to different segments of the input data. The Swin Transformer [95] is a type of ViT that exhibits versatility in modeling at different scales and maintains linear computational complexity with respect to image size. This advantageous combination of
features enables the Swin Transformer to be well suited for a wide array of vision tasks,
encompassing image classification, object detection, and semantic segmentation, among
others. Another variant of transformers is Video Transformers [96], which are efficient for evaluating videos on a large scale, ensuring optimal utilization of computational resources and reduced wall-clock runtime. This capability enables full video processing during test time,
making VTNs particularly well-suited for handling lengthy videos. Table 2 shows some of
the recent detection techniques.
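The self-attention mechanism underpinning these transformer variants can be sketched as single-head scaled dot-product attention over a sequence of token embeddings. The projection matrices here are assumed inputs; a real model learns them and uses many heads.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention. X is (n, d):
    n tokens (image patches or frames), d embedding dims. Each output
    token is a softmax-weighted mix of all value vectors."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])       # pairwise token affinities
    scores -= scores.max(axis=-1, keepdims=True)  # stabilize the softmax
    A = np.exp(scores)
    A /= A.sum(axis=-1, keepdims=True)            # each row sums to 1
    return A @ V
```

The weight matrix A is exactly the "varying degrees of importance" assigned to different segments of the input: each token attends to every other token in a single step, with no recurrence.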
Table 2. Summary of recent deepfake detection models, employed techniques, feature sets, datasets,
and intra-dataset performance results.
Figure 6. Frequency of usage of different deepfake datasets in the discussed detection models within this survey.
Table 3. Key characteristics of the most prominent and publicly available deepfake datasets.
FaceForensics++ [118] is a well-known dataset used for deepfake detection that was
provided in 2019 as an addition to the FaceForensics dataset, which was made available in
2018 and only included videos with altered facial expressions. Four subsets of the FF++
dataset are available: FaceSwap, Deepfake, Face2Face, and NeuralTextures. It includes 3000
edited videos in addition to 1000 original videos that were pulled from the YouTube-8M
dataset. The dataset can be used to test deepfake detection strategies on both compressed
and uncompressed videos because it is supplied in two different quality levels. The FF++ dataset has limitations when it comes to spotting lip-sync deepfakes, and some videos may have color discrepancies near the modified faces.
DFDC [123], the deepfake detection challenge dataset hosted by Facebook, stands
as the most extensive collection of face swap videos available and openly accessible. It
contains over 100,000 total clips sourced from 3426 paid actors from diverse backgrounds,
including different genders, ages, and ethnic groups.
DeeperForensics-1.0 [126] is a significant dataset available for detecting deepfakes
that contains 50,000 original clips and 10,000 forged ones. These manipulated videos
were generated using a conditional autoencoder called DF-VAE. The dataset includes a
broad range of actor appearances and is designed to represent real-world scenarios more
accurately by including a blend of alterations and disturbances, including compression,
blurriness, noise, and other visual anomalies.
WildDeepfake [127] is a dataset that is widely recognized as a difficult one for deep-
fake detection. It features both authentic and deepfake samples obtained from the internet,
which distinguishes it from other available datasets. While previous datasets have only
included synthesized facial images, this dataset includes a variety of body types. How-
ever, there remains a need for a more comprehensive dataset that can generate full-body
deepfakes to improve the robustness of deepfake detection models.
and Temporal Forgery Localization. It consists of 2.9 million images, 221,247 videos and
15 manipulation methods.
Owing to their predominant focus on a single modality and limited coverage of forgery methods, current datasets for deepfake detection are primarily constrained when it comes to audio-visual deepfakes. DefakeAVMiT [62] is a dataset that includes an ample amount of deepfake visuals paired with corresponding audios, generated by various deepfake methods affecting either modality. Alternatively, LAV-DF [61] consists of content-driven manipulations to help with the detection of content-altering fake segments in videos, a task for which suitable datasets have been lacking. It is important to note that the availability and
creation of datasets are ongoing processes, with new datasets being introduced and existing
ones being expanded or refined over time. The continuous development of diverse and
representative datasets is crucial to ensure the robustness and generalizability of deepfake
detection algorithms, as well as to keep up with the evolving techniques employed by
malicious actors.
approaches have emerged within the field of deepfakes that aim to not only identify these
manipulated media but also provide effective means to mitigate and defend against them.
These multifaceted approaches serve the purpose of not only detecting deepfakes but also
hindering their creation and curbing their rapid dissemination across various platforms.
One prominent avenue of exploration in combating deepfakes involves the incorporation
of adversarial perturbations to obstruct the creation of deepfakes. An alternative method
involves employing digital watermarking, which discreetly embeds data or signatures
within digital content to safeguard its integrity and authenticity. Additionally, blockchain
technology offers a similar solution by generating a digital signature for the content and
storing it on the blockchain, enabling the verification of any alterations or manipulations to
the content.
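The verification idea behind both watermarking and blockchain-based approaches reduces to registering a cryptographic fingerprint of the content at publication time. A minimal sketch, using illustrative byte strings rather than a real media pipeline:

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """SHA-256 digest of a media file's bytes. Registering this value
    (e.g., on a blockchain) lets anyone later check whether the content
    has been altered, since any change produces a different digest."""
    return hashlib.sha256(content).hexdigest()

published = b"bytes of the original video file"  # hypothetical content
registered = fingerprint(published)              # stored at publication time
doctored = b"bytes of a manipulated video file"
```

An authentic copy reproduces the registered digest; any manipulation, however small, does not, which is what makes the stored signature a tamper-evidence mechanism.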
Moreover, increasing public knowledge of the existence and potential risks linked with
deepfakes is essential. Education and media literacy initiatives can educate users on how to
critically evaluate digital content, recognize signs of manipulation, and verify the authentic-
ity of media before sharing or believing its content. By empowering individuals to be more
discerning consumers of information, the impact of deepfakes can be mitigated. Lastly,
governments and policymakers are working to develop regulations and laws that address
the misuse of deepfake technology. These policies and legislative measures aim to prevent
the creation and dissemination of malicious deepfakes, establish liability frameworks for
their creation and distribution, and protect individuals’ rights and privacy.
7. Conclusions
In conclusion, deepfake videos will get harder to detect as AI algorithms become more
sophisticated. This survey paper has provided a comprehensive overview encompassing
the realm of deepfake generation, the spectrum of deep learning architectures employed
in detection, the latest advances in detection techniques, and the pivotal datasets tailored
to advance this field of study, all in order to stay one step ahead in the race with genera-
tive artificial intelligence, curb the spread of false information, safeguard the integrity of
digital content, and stop the damage that deepfakes can cause on a social, political, and
economic level. The survey has also highlighted the importance of continued research and
development in deepfake detection techniques. Despite the issues presented by deepfakes,
this technology nevertheless shows potential for artistic uses in virtual communication,
entertainment, and visual effects. Future work must continue to focus on finding a balance
between utilizing deepfakes’ beneficial potential and reducing their negative effects.
Author Contributions: Conceptualization, A.N., M.R., F.S. and N.K.; methodology, A.N.; formal
analysis, A.N.; investigation, A.N.; writing—original draft preparation, A.N.; writing—review and
editing M.R., N.K. and F.S.; supervision, M.R., F.S. and N.K. All authors have read and agreed to the
published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: No new data were created or analyzed in this study. Data sharing is
not applicable to this article.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Hancock, J.T.; Bailenson, J.N. The Social Impact of Deepfakes. Cyberpsychol. Behav. Soc. Netw. 2021, 24, 149–152. [CrossRef]
[PubMed]
2. Giansiracusa, N. How Algorithms Create and Prevent Fake News: Exploring the Impacts of Social Media, Deepfakes, GPT-3, and More;
Apress: Berkeley, CA, USA, 2021; ISBN 978-1-4842-7154-4.
3. Fallis, D. The Epistemic Threat of Deepfakes. Philos. Technol. 2021, 34, 623–643. [CrossRef] [PubMed]
4. Karnouskos, S. Artificial Intelligence in Digital Media: The Era of Deepfakes. IEEE Trans. Technol. Soc. 2020, 1, 138–147. [CrossRef]
5. Ridouani, M.; Benazzouza, S.; Salahdine, F.; Hayar, A. A Novel Secure Cooperative Cognitive Radio Network Based on Chebyshev
Map. Digit. Signal Process. 2022, 126, 103482. [CrossRef]
6. Whittaker, L.; Mulcahy, R.; Letheren, K.; Kietzmann, J.; Russell-Bennett, R. Mapping the Deepfake Landscape for Innovation: A
Multidisciplinary Systematic Review and Future Research Agenda. Technovation 2023, 125, 102784. [CrossRef]
7. Seow, J.W.; Lim, M.K.; Phan, R.C.W.; Liu, J.K. A Comprehensive Overview of Deepfake: Generation, Detection, Datasets, and
Opportunities. Neurocomputing 2022, 513, 351–371. [CrossRef]
8. Rana, M.S.; Nobi, M.N.; Murali, B.; Sung, A.H. Deepfake Detection: A Systematic Literature Review. IEEE Access 2022, 10,
25494–25513. [CrossRef]
9. Akhtar, Z. Deepfakes Generation and Detection: A Short Survey. J. Imaging 2023, 9, 18. [CrossRef]
10. Ahmed, S.R.; Sonuç, E.; Ahmed, M.R.; Duru, A.D. Analysis Survey on Deepfake Detection and Recognition with Convolutional
Neural Networks. In Proceedings of the 2022 International Congress on Human-Computer Interaction, Optimization and Robotic
Applications (HORA), Ankara, Turkey, 9–11 June 2022; pp. 1–7.
11. Malik, A.; Kuribayashi, M.; Abdullahi, S.M.; Khan, A.N. DeepFake Detection for Human Face Images and Videos: A Survey.
IEEE Access 2022, 10, 18757–18775. [CrossRef]
12. Yu, P.; Xia, Z.; Fei, J.; Lu, Y. A Survey on Deepfake Video Detection. IET Biom. 2021, 10, 607–624. [CrossRef]
13. Mirsky, Y.; Lee, W. The Creation and Detection of Deepfakes: A Survey. ACM Comput. Surv. 2021, 54, 1–41. [CrossRef]
14. Masood, M.; Nawaz, M.; Malik, K.M.; Javed, A.; Irtaza, A. Deepfakes Generation and Detection: State-of-the-Art, Open Challenges,
Countermeasures, and Way forward. Appl. Intell. 2023, 53, 3974–4026. [CrossRef]
15. Das, A.; Viji, K.S.A.; Sebastian, L. A Survey on Deepfake Video Detection Techniques Using Deep Learning. In Proceedings of the
2022 Second International Conference on Next Generation Intelligent Systems (ICNGIS), Kerala, India, 29–31 July 2022; pp. 1–4.
16. Lin, K.; Han, W.; Gu, Z.; Li, S. A Survey of DeepFakes Generation and Detection. In Proceedings of the 2021 IEEE Sixth
International Conference on Data Science in Cyberspace (DSC), Shenzhen, China, 9–11 October 2021; pp. 474–478.
17. Chauhan, R.; Popli, R.; Kansal, I. A Comprehensive Review on Fake Images/Videos Detection Techniques. In Proceedings of
the 2022 10th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions)
(ICRITO), Noida, India, 13–14 October 2022; pp. 1–6.
18. Khichi, M.; Kumar Yadav, R. A Threat of Deepfakes as a Weapon on Digital Platform and Their Detection Methods. In Proceedings
of the 2021 12th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kharagpur,
India, 6–8 July 2021; pp. 1–8.
19. Chaudhary, S.; Saifi, R.; Chauhan, N.; Agarwal, R. A Comparative Analysis of Deep Fake Techniques. In Proceedings of the 2021
3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N), Greater Noida,
India, 17–18 December 2021; pp. 300–303.
20. Younus, M.A.; Hasan, T.M. Abbreviated View of Deepfake Videos Detection Techniques. In Proceedings of the 2020 6th
International Engineering Conference “Sustainable Technology and Development” (IEC), Erbil, Iraq, 26–27 February 2020;
pp. 115–120.
21. Sudhakar, K.N.; Shanthi, M.B. Deepfake: An Endanger to Cyber Security. In Proceedings of the 2023 International Conference on
Sustainable Computing and Smart Systems (ICSCSS), Coimbatore, India, 10–12 July 2023; pp. 1542–1548.
22. Salman, S.; Shamsi, J.A.; Qureshi, R. Deep Fake Generation and Detection: Issues, Challenges, and Solutions. IT Prof. 2023, 25,
52–59. [CrossRef]
23. Khder, M.A.; Shorman, S.; Aldoseri, D.T.; Saeed, M.M. Artificial Intelligence into Multimedia Deepfakes Creation and Detection.
In Proceedings of the 2023 International Conference on IT Innovation and Knowledge Discovery (ITIKD), Manama, Bahrain, 8–9
March 2023; pp. 1–5.
24. Kandari, M.; Tripathi, V.; Pant, B. A Comprehensive Review of Media Forensics and Deepfake Detection Technique. In Proceedings
of the 2023 10th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India,
15–17 March 2023; pp. 392–395.
25. Boutadjine, A.; Harrag, F.; Shaalan, K.; Karboua, S. A Comprehensive Study on Multimedia DeepFakes. In Proceedings of the
2023 International Conference on Advances in Electronics, Control and Communication Systems (ICAECCS), BLIDA, Algeria,
6–7 March 2023; pp. 1–6.
26. Mallet, J.; Dave, R.; Seliya, N.; Vanamala, M. Using Deep Learning to Detecting Deepfakes. In Proceedings of the 2022 9th
International Conference on Soft Computing & Machine Intelligence (ISCMI), Toronto, ON, Canada, 26–27 November 2022;
pp. 1–5.
27. Alanazi, F. Comparative Analysis of Deep Fake Detection Techniques. In Proceedings of the 2022 14th International Conference on
Computational Intelligence and Communication Networks (CICN), Al-Khobar, Saudi Arabia, 4–6 December 2022; pp. 119–124.
28. Xinwei, L.; Jinlin, G.; Junnan, C. An Overview of Face Deep Forgery. In Proceedings of the 2021 International Conference on
Computer Engineering and Application (ICCEA), Nanjing, China, 25–27 June 2021; pp. 366–370.
29. Weerawardana, M.; Fernando, T. Deepfakes Detection Methods: A Literature Survey. In Proceedings of the 2021 10th International
Conference on Information and Automation for Sustainability (ICIAfS), Negambo, Sri Lanka, 11–13 August 2021; pp. 76–81.
30. Swathi, P.; Sk, S. DeepFake Creation and Detection: A Survey. In Proceedings of the 2021 Third International Conference on
Inventive Research in Computing Applications (ICIRCA), Coimbatore, India, 2–4 September 2021; pp. 584–588.
31. Zhang, T.; Deng, L.; Zhang, L.; Dang, X. Deep Learning in Face Synthesis: A Survey on Deepfakes. In Proceedings of the 2020
IEEE 3rd International Conference on Computer and Communication Engineering Technology (CCET), Beijing, China, 14–16
August 2020; pp. 67–70.
32. Shi, Y.; Liu, X.; Wei, Y.; Wu, Z.; Zuo, W. Retrieval-Based Spatially Adaptive Normalization for Semantic Image Synthesis. In
Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA,
18–24 June 2022; pp. 11214–11223.
33. Liu, M.; Ding, Y.; Xia, M.; Liu, X.; Ding, E.; Zuo, W.; Wen, S. STGAN: A Unified Selective Transfer Network for Arbitrary Image
Attribute Editing. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long
Beach, CA, USA, 15–20 June 2019; pp. 3668–3677.
34. Li, L.; Bao, J.; Yang, H.; Chen, D.; Wen, F. FaceShifter: Towards High Fidelity And Occlusion Aware Face Swapping. arXiv 2020,
arXiv:1912.13457.
35. Robust and Real-Time Face Swapping Based on Face Segmentation and CANDIDE-3. Available online: https://www.springerprofessional.de/robust-and-real-time-face-swapping-based-on-face-segmentation-an/15986368 (accessed on 18 July 2023).
36. Ferrara, M.; Franco, A.; Maltoni, D. The Magic Passport. In Proceedings of the IEEE International Joint Conference on Biometrics,
Clearwater, FL, USA, 29 September–2 October 2014; pp. 1–7.
37. Thies, J.; Zollhöfer, M.; Stamminger, M.; Theobalt, C.; Nießner, M. Face2Face: Real-Time Face Capture and Reenactment of RGB
Videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
38. Zhang, J.; Zeng, X.; Wang, M.; Pan, Y.; Liu, L.; Liu, Y.; Ding, Y.; Fan, C. FReeNet: Multi-Identity Face Reenactment. In Proceedings
of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020;
IEEE: Seattle, WA, USA, 2020; pp. 5325–5334.
39. Wang, Y.; Song, L.; Wu, W.; Qian, C.; He, R.; Loy, C.C. Talking Faces: Audio-to-Video Face Generation. In Handbook of Digital Face
Manipulation and Detection: From DeepFakes to Morphing Attacks; Rathgeb, C., Tolosana, R., Vera-Rodriguez, R., Busch, C., Eds.;
Advances in Computer Vision and Pattern Recognition; Springer International Publishing: Cham, Switzerland, 2022; pp. 163–188,
ISBN 978-3-030-87664-7.
40. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-
Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the 2017 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114.
41. He, Z.; Zuo, W.; Kan, M.; Shan, S.; Chen, X. AttGAN: Facial Attribute Editing by Only Changing What You Want. IEEE Trans.
Image Process. 2019, 28, 5464–5478. [CrossRef] [PubMed]
42. Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019.
43. Choi, Y.; Uh, Y.; Yoo, J.; Ha, J.-W. StarGAN v2: Diverse Image Synthesis for Multiple Domains. In Proceedings of the 2020
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; IEEE: Seattle,
WA, USA, 2020; pp. 8185–8194.
44. Choi, Y.; Uh, Y.; Yoo, J.; Ha, J.-W. StarGAN v2: Diverse Image Synthesis for Multiple Domains. Available online: https://arxiv.org/abs/1912.01865v2 (accessed on 8 October 2023).
45. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In
Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
46. Natsume, R.; Yatagawa, T.; Morishima, S. RSGAN: Face Swapping and Editing Using Face and Hair Representation in Latent
Spaces. In Proceedings of the ACM SIGGRAPH 2018 Posters, Vancouver, BC, Canada, 12 August 2018; pp. 1–2.
47. Prajwal, K.R.; Mukhopadhyay, R.; Philip, J.; Jha, A.; Namboodiri, V.; Jawahar, C.V. Towards Automatic Face-to-Face Translation.
In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; Association for Computing
Machinery: New York, NY, USA, 2019; pp. 1428–1436.
48. FaceApp: Face Editor. Available online: https://www.faceapp.com/ (accessed on 5 October 2023).
49. Reface—AI Face Swap App & Video Face Swaps. Available online: https://reface.ai/ (accessed on 5 October 2023).
50. DeepBrain AI—Best AI Video Generator. Available online: https://www.deepbrain.io/ (accessed on 5 October 2023).
51. Perov, I.; Gao, D.; Chervoniy, N.; Liu, K.; Marangonda, S.; Umé, C.; Dpfks, M.; Facenheim, C.S.; RP, L.; Jiang, J.; et al. DeepFaceLab:
Integrated, Flexible and Extensible Face-Swapping Framework. arXiv 2021, arXiv:2005.05535.
52. Make Your Own Deepfakes [Online App]. Available online: https://deepfakesweb.com/ (accessed on 5 October 2023).
53. Liang, P.; Liu, G.; Xiong, Z.; Fan, H.; Zhu, H.; Zhang, X. A Facial Geometry Based Detection Model for Face Manipulation Using
CNN-LSTM Architecture. Inf. Sci. 2023, 633, 370–383. [CrossRef]
54. Li, G.; Zhao, X.; Cao, Y. Forensic Symmetry for DeepFakes. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1095–1110. [CrossRef]
55. Hu, J.; Liao, X.; Gao, D.; Tsutsui, S.; Qin, Z.; Shou, M.Z. DeepfakeMAE: Facial Part Consistency Aware Masked Autoencoder for
Deepfake Video Detection. arXiv 2023, arXiv:2303.01740. [CrossRef]
56. Yang, J.; Xiao, S.; Li, A.; Lu, W.; Gao, X.; Li, Y. MSTA-Net: Forgery Detection by Generating Manipulation Trace Based on
Multi-Scale Self-Texture Attention. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 4854–4866. [CrossRef]
57. Wang, J.; Wu, Z.; Ouyang, W.; Han, X.; Chen, J.; Lim, S.-N.; Jiang, Y.-G. M2TR: Multi-Modal Multi-Scale Transformers for
Deepfake Detection. In Proceedings of the ICMR—International Conference on Multimedia Retrieval, Newark, NJ, USA, 27–30 June 2022;
Association for Computing Machinery, Inc.: New York, NY, USA, 2022; pp. 615–623.
58. Xiao, S.; Yang, J.; Lv, Z. Protecting the Trust and Credibility of Data by Tracking Forgery Trace Based on GANs. Digit. Commun.
Netw. 2022, 8, 877–884. [CrossRef]
59. Li, Y.; Chang, M.-C.; Lyu, S. In Ictu Oculi: Exposing AI Created Fake Videos by Detecting Eye Blinking. In Proceedings of the
International Workshop on Information Forensics and Security, WIFS, Hong Kong, China, 11–13 December 2018; Institute of
Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2019.
60. Hernandez-Ortega, J.; Tolosana, R.; Fierrez, J.; Morales, A. DeepFakesON-Phys: Deepfakes Detection Based on Heart Rate
Estimation. arXiv 2020, arXiv:2010.00400.
61. Cai, Z.; Stefanov, K.; Dhall, A.; Hayat, M. Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and
Multimodal Method for Temporal Forgery Localization. In Proceedings of the International Conference on Digital Image Computing:
Techniques and Applications, DICTA, Sydney, Australia, 30 November–2 December 2022; Institute of Electrical and Electronics
Engineers Inc.: Piscataway, NJ, USA, 2022.
62. Yang, W.; Zhou, X.; Chen, Z.; Guo, B.; Ba, Z.; Xia, Z.; Cao, X.; Ren, K. AVoiD-DF: Audio-Visual Joint Learning for Detecting
Deepfake. IEEE Trans. Inf. Forensics Secur. 2023, 18, 2015–2029. [CrossRef]
63. Ilyas, H.; Javed, A.; Malik, K.M. AVFakeNet: A Unified End-to-End Dense Swin Transformer Deep Learning Model for Audio–
Visual Deepfakes Detection. Appl. Soft Comput. 2023, 136, 110124. [CrossRef]
64. Huang, Y.; Juefei-Xu, F.; Guo, Q.; Liu, Y.; Pu, G. FakeLocator: Robust Localization of GAN-Based Face Manipulations. IEEE Trans.
Inf. Forensics Secur. 2022, 17, 2657–2672. [CrossRef]
65. Chen, H.; Li, Y.; Lin, D.; Li, B.; Wu, J. Watching the BiG Artifacts: Exposing DeepFake Videos via Bi-Granularity Artifacts. Pattern
Recogn. 2023, 135, 109179. [CrossRef]
66. Guarnera, L.; Giudice, O.; Battiato, S. DeepFake Detection by Analyzing Convolutional Traces. In Proceedings of the 2020
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14 June 2020;
pp. 2841–2850.
67. Cho, W.; Choi, S.; Park, D.K.; Shin, I.; Choo, J. Image-to-Image Translation via Group-Wise Deep Whitening-and-Coloring
Transformation. Available online: https://arxiv.org/abs/1812.09912v2 (accessed on 8 October 2023).
68. Choi, Y.; Choi, M.; Kim, M.; Ha, J.-W.; Kim, S.; Choo, J. StarGAN: Unified Generative Adversarial Networks for Multi-Domain
Image-to-Image Translation. Available online: https://arxiv.org/abs/1711.09020v3 (accessed on 8 October 2023).
69. Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and Improving the Image Quality of StyleGAN.
Available online: https://arxiv.org/abs/1912.04958v2 (accessed on 8 October 2023).
70. Agarwal, S.; Hu, L.; Ng, E.; Darrell, T.; Li, H.; Rohrbach, A. Watch Those Words: Video Falsification Detection Using
Word-Conditioned Facial Motion. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision,
WACV, Waikoloa, HI, USA, 2–7 January 2023; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2023;
pp. 4699–4708.
71. Dong, X.; Bao, J.; Chen, D.; Zhang, T.; Zhang, W.; Yu, N.; Chen, D.; Wen, F.; Guo, B. Protecting Celebrities from DeepFake with
Identity Consistency Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
New Orleans, LA, USA, 18–24 June 2022; IEEE Computer Society: Washington, DC, USA, 2022; Volume 2022, pp. 9458–9468.
72. Nirkin, Y.; Wolf, L.; Keller, Y.; Hassner, T. DeepFake Detection Based on Discrepancies Between Faces and Their Context. IEEE
Trans. Pattern Anal. Mach. Intell. 2022, 44, 6111–6121. [CrossRef]
73. Liu, B.; Liu, B.; Ding, M.; Zhu, T.; Yu, X. TI2Net: Temporal Identity Inconsistency Network for Deepfake Detection. In Proceedings
of the IEEE/CVF Winter Conference on Applications of Computer Vision, WACV, Waikoloa, HI, USA, 2–7 January 2023; Institute
of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2023; pp. 4680–4689.
74. Hosler, B.; Salvi, D.; Murray, A.; Antonacci, F.; Bestagini, P.; Tubaro, S.; Stamm, M.C. Do Deepfakes Feel Emotions? A Semantic
Approach to Detecting Deepfakes via Emotional Inconsistencies. In Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; IEEE Computer Society: Washington, DC, USA, 2021;
pp. 1013–1022.
75. Conti, E.; Salvi, D.; Borrelli, C.; Hosler, B.; Bestagini, P.; Antonacci, F.; Sarti, A.; Stamm, M.C.; Tubaro, S. Deepfake Speech Detection
through Emotion Recognition: A Semantic Approach. In Proceedings of the ICASSP 2022-2022 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; Institute of Electrical and Electronics Engineers
Inc.: Piscataway, NJ, USA, 2022; Volume 2022, pp. 8962–8966.
76. Zheng, Y.; Bao, J.; Chen, D.; Zeng, M.; Wen, F. Exploring Temporal Coherence for More General Video Face Forgery Detection.
In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021;
Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2021; pp. 15024–15034.
77. Pei, S.; Wang, Y.; Xiao, B.; Pei, S.; Xu, Y.; Gao, Y.; Zheng, J. A Bidirectional-LSTM Method Based on Temporal Features for Deep
Fake Face Detection in Videos. In Proceedings of the 2nd International Conference on Information Technology and Intelligent
Control (CITIC 2022), Kunming, China, 15–17 July 2022; Nikhath, K., Ed.; SPIE: Washington, DC, USA, 2022; Volume 12346.
78. Gu, Z.; Yao, T.; Chen, Y.; Yi, R.; Ding, S.; Ma, L. Region-Aware Temporal Inconsistency Learning for DeepFake Video Detection. In
Proceedings of the 31st International Joint Conference on Artificial Intelligence, Vienna, Austria, 23–29 July 2022; De Raedt, L., Ed.;
International Joint Conferences on Artificial Intelligence: Vienna, Austria, 2022; pp. 920–926.
79. Ru, Y.; Zhou, W.; Liu, Y.; Sun, J.; Li, Q. Bita-Net: Bi-Temporal Attention Network for Facial Video Forgery Detection. In
Proceedings of the 2021 IEEE International Joint Conference on Biometrics, IJCB, Shenzhen, China, 4–7 August 2021; Institute of
Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2021.
80. Sun, Y.; Zhang, Z.; Echizen, I.; Nguyen, H.H.; Qiu, C.; Sun, L. Face Forgery Detection Based on Facial Region Displacement
Trajectory Series. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACV,
Waikoloa, HI, USA, 3–7 January 2023; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2023; pp. 633–642.
81. Lu, T.; Bao, Y.; Li, L. Deepfake Video Detection Based on Improved CapsNet and Temporal–Spatial Features. Comput. Mater.
Contin. 2023, 75, 715–740. [CrossRef]
82. Waseem, S.; Abu-Bakar, S.R.; Omar, Z.; Ahmed, B.A.; Baloch, S. A Multi-Color Spatio-Temporal Approach for Detecting DeepFake.
In Proceedings of the 2022 12th International Conference on Pattern Recognition Systems, ICPRS, Saint-Etienne, France, 7–10 June
2022; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2022.
83. Matern, F.; Riess, C.; Stamminger, M. Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations. In Proceedings of
the 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), Waikoloa Village, HI, USA, 7–11 January 2019;
pp. 83–92.
84. Ciftci, U.A.; Demir, I.; Yin, L. FakeCatcher: Detection of Synthetic Portrait Videos Using Biological Signals. IEEE Trans. Pattern
Anal. Mach. Intell. 2020, 9, 1. [CrossRef]
85. Benazzouza, S.; Ridouani, M.; Salahdine, F.; Hayar, A. A Novel Prediction Model for Malicious Users Detection and Spectrum
Sensing Based on Stacking and Deep Learning. Sensors 2022, 22, 6477. [CrossRef]
86. Verdoliva, L. Media Forensics and DeepFakes: An Overview. IEEE J. Sel. Top. Signal Process. 2020, 14, 910–932. [CrossRef]
87. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556.
88. Tan, M.; Le, Q.V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; PMLR: Westminster, UK, 2019.
89. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with
Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12
June 2015.
90. Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic Routing between Capsules. arXiv 2017, arXiv:1710.09829.
91. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.;
Gelly, S.; et al. An Image Is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2021, arXiv:2010.11929.
92. Benazzouza, S.; Ridouani, M.; Salahdine, F.; Hayar, A. Chaotic Compressive Spectrum Sensing Based on Chebyshev Map for
Cognitive Radio Networks. Symmetry 2021, 13, 429. [CrossRef]
93. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
94. Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the 2017 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; IEEE: Honolulu, HI, USA, 2017;
pp. 1800–1807.
95. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using
Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17
October 2021.
96. Neimark, D.; Bar, O.; Zohar, M.; Asselmann, D. Video Transformer Network. In Proceedings of the IEEE/CVF International
Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021.
97. Zhao, C.; Wang, C.; Hu, G.; Chen, H.; Liu, C.; Tang, J. ISTVT: Interpretable Spatial-Temporal Video Transformer for Deepfake
Detection. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1335–1348. [CrossRef]
98. Yu, Y.; Zhao, X.; Ni, R.; Yang, S.; Zhao, Y.; Kot, A.C. Augmented Multi-Scale Spatiotemporal Inconsistency Magnifier for
Generalized DeepFake Detection. IEEE Trans Multimed. 2023, 99, 1–13. [CrossRef]
99. Yang, Z.; Liang, J.; Xu, Y.; Zhang, X.; He, R. Masked Relation Learning for DeepFake Detection. IEEE Trans. Inf. Forensics Secur.
2023, 18, 1696–1708. [CrossRef]
100. Shang, Z.; Xie, H.; Yu, L.; Zha, Z.; Zhang, Y. Constructing Spatio-Temporal Graphs for Face Forgery Detection. ACM Trans. Web
2023, 17, 1–25. [CrossRef]
101. Rajalaxmi, R.R.; Sudharsana, P.P.; Rithani, A.M.; Preethika, S.; Dhivakar, P.; Gothai, E. Deepfake Detection Using Inception-
ResNet-V2 Network. In Proceedings of the 2023 7th International Conference on Computing Methodologies and Communication
(ICCMC), Erode, India, 23–25 February 2023; pp. 580–586.
102. Korshunov, P.; Jain, A.; Marcel, S. Custom Attribution Loss for Improving Generalization and Interpretability of Deepfake
Detection. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP), Singapore, 22–27 May 2022; IEEE: Singapore, 2022; pp. 8972–8976.
103. Patel, Y.; Tanwar, S.; Bhattacharya, P.; Gupta, R.; Alsuwian, T.M.; Davison, I.E.; Mazibuko, T.F. An Improved Dense CNN
Architecture for Deepfake Image Detection. IEEE Access 2023, 11, 22081–22095. [CrossRef]
104. Pang, G.; Zhang, B.; Teng, Z.; Qi, Z.; Fan, J. MRE-Net: Multi-Rate Excitation Network for Deepfake Video Detection. IEEE Trans.
Circuits Syst. Video Technol. 2023, 33, 3663–3676. [CrossRef]
105. Mehra, A.; Agarwal, A.; Vatsa, M.; Singh, R. Motion Magnified 3-D Residual-in-Dense Network for DeepFake Detection. IEEE
Trans. Biom. Behav. Iden. Sci. 2023, 5, 39–52. [CrossRef]
106. Lin, H.; Huang, W.; Luo, W.; Lu, W. DeepFake Detection with Multi-Scale Convolution and Vision Transformer. Digit. Signal
Process. Rev. J. 2023, 134, 103895. [CrossRef]
107. Khalid, F.; Akbar, M.H.; Gul, S. SWYNT: Swin Y-Net Transformers for Deepfake Detection. In Proceedings of the 2023 International
Conference on Robotics and Automation in Industry (ICRAI), Peshawar, Pakistan, 3–5 March 2023; pp. 1–6.
108. Zhuang, W.; Chu, Q.; Tan, Z.; Liu, Q.; Yuan, H.; Miao, C.; Luo, Z.; Yu, N. UIA-ViT: Unsupervised Inconsistency-Aware Method
Based on Vision Transformer for Face Forgery Detection. In European Conference on Computer Vision; Avidan, S., Brostow, G.,
Cisse, M., Farinella, G., Hassner, T., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2022; Volume 13665,
pp. 391–407.
109. Yan, Z.; Sun, P.; Lang, Y.; Du, S.; Zhang, S.; Wang, W. Landmark Enhanced Multimodal Graph Learning for Deepfake Video
Detection. arXiv 2022, arXiv:2209.05419. [CrossRef]
110. Saealal, M.S.; Ibrahim, M.Z.; Shapiai, M.I.; Fadilah, N. In-the-Wild Deepfake Detection Using Adaptable CNN Models with
Visual Class Activation Mapping for Improved Accuracy. In Proceedings of the 2023 5th International Conference on Computer
Communication and the Internet (ICCCI), Fujisawa, Japan, 23–25 June 2023; IEEE: Fujisawa, Japan, 2023; pp. 9–14.
111. Xu, Y.; Raja, K.; Pedersen, M. Supervised Contrastive Learning for Generalizable and Explainable DeepFakes Detection. In
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, WACVW, Waikoloa, HI, USA,
4–8 January 2022; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2022; pp. 379–389.
112. Xia, Z.; Qiao, T.; Xu, M.; Wu, X.; Han, L.; Chen, Y. Deepfake Video Detection Based on MesoNet with Preprocessing Module.
Symmetry 2022, 14, 939. [CrossRef]
113. Wu, N.; Jin, X.; Jiang, Q.; Wang, P.; Zhang, Y.; Yao, S.; Zhou, W. Multisemantic Path Neural Network for Deepfake Detection.
Secur. Commun. Netw. 2022, 2022, 4976848. [CrossRef]
114. Wu, H.; Wang, P.; Wang, X.; Xiang, J.; Gong, R. GGViT: Multistream Vision Transformer Network in Face2Face Facial Reenactment
Detection. In Proceedings of the 2022 26th International Conference on Pattern Recognition, Montreal, QC, Canada, 21–25 August
2022; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2022; Volume 2022, pp. 2335–2341.
115. Cozzolino, D.; Pianese, A.; Nießner, M.; Verdoliva, L. Audio-Visual Person-of-Interest DeepFake Detection. In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 943–952.
[CrossRef]
116. Wang, B.; Li, Y.; Wu, X.; Ma, Y.; Song, Z.; Wu, M. Face Forgery Detection Based on the Improved Siamese Network. Secur. Commun.
Netw. 2022, 2022, 5169873. [CrossRef]
117. Saealal, M.S.; Ibrahim, M.Z.; Mulvaney, D.J.; Shapiai, M.I.; Fadilah, N. Using Cascade CNN-LSTM-FCNs to Identify AI-Altered
Video Based on Eye State Sequence. PLoS ONE 2022, 17, e0278989. [CrossRef]
118. Rössler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Niessner, M. FaceForensics++: Learning to Detect Manipulated Facial
Images. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea,
27 October–2 November 2019; pp. 1–11.
119. GitHub—Deepfakes/Faceswap: Deepfakes Software for All. Available online: https://github.com/deepfakes/faceswap (accessed on 10 October 2023).
120. GitHub—MarekKowalski/FaceSwap: 3D Face Swapping Implemented in Python. Available online: https://github.com/MarekKowalski/FaceSwap/ (accessed on 10 October 2023).
121. Thies, J.; Zollhöfer, M.; Nießner, M. Deferred Neural Rendering: Image Synthesis Using Neural Textures. Available online:
https://arxiv.org/abs/1904.12356v1 (accessed on 10 October 2023).
122. Li, Y.; Yang, X.; Sun, P.; Qi, H.; Lyu, S. Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics. In Proceedings
of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020;
pp. 3204–3213.
123. Dolhansky, B.; Bitton, J.; Pflaum, B.; Lu, J.; Howes, R.; Wang, M.; Ferrer, C.C. The DeepFake Detection Challenge (DFDC) Dataset.
arXiv 2020, arXiv:2006.07397.
124. GitHub—Cuihaoleo/Kaggle-Dfdc: 2nd Place Solution for Kaggle Deepfake Detection Challenge. Available online: https://github.com/cuihaoleo/kaggle-dfdc (accessed on 10 October 2023).
125. Nirkin, Y.; Keller, Y.; Hassner, T. FSGAN: Subject Agnostic Face Swapping and Reenactment. In Proceedings of the IEEE/CVF
International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019.
126. Jiang, L.; Li, R.; Wu, W.; Qian, C.; Loy, C.C. DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection. In
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
127. Zi, B.; Chang, M.; Chen, J.; Ma, X.; Jiang, Y.-G. WildDeepfake: A Challenging Real-World Dataset for Deepfake Detection. In
Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 2382–2390.
[CrossRef]
128. Le, T.-N.; Nguyen, H.H.; Yamagishi, J.; Echizen, I. OpenForensics: Large-Scale Challenging Dataset for Multi-Face Forgery
Detection and Segmentation In-the-Wild. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision
(ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 10097–10107.
129. Kwon, P.; You, J.; Nam, G.; Park, S.; Chae, G. KoDF: A Large-Scale Korean DeepFake Detection Dataset. In Proceedings of the 2021
IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 10724–10733.
130. Siarohin, A.; Lathuilière, S.; Tulyakov, S.; Ricci, E.; Sebe, N. First Order Motion Model for Image Animation. In Proceedings of the
Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019.
131. Yi, R.; Ye, Z.; Zhang, J.; Bao, H.; Liu, Y.-J. Audio-Driven Talking Face Video Generation with Learning-Based Personalized Head
Pose. arXiv 2020, arXiv:2002.10137.
132. Prajwal, K.R.; Mukhopadhyay, R.; Namboodiri, V.; Jawahar, C.V. A Lip Sync Expert Is All You Need for Speech to Lip Generation
in the Wild. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020;
pp. 484–492.
133. Khalid, H.; Tariq, S.; Woo, S.S. FakeAVCeleb: A Novel Audio-Video Multimodal Deepfake Dataset. arXiv 2021, arXiv:2108.05080.
134. Jia, Y.; Zhang, Y.; Weiss, R.J.; Wang, Q.; Shen, J.; Ren, F.; Chen, Z.; Nguyen, P.; Pang, R.; Moreno, I.L.; et al. Transfer Learning from
Speaker Verification to Multispeaker Text-to-Speech Synthesis. Available online: https://arxiv.org/abs/1806.04558v4 (accessed
on 10 October 2023).
135. Korshunov, P.; Marcel, S. DeepFakes: A New Threat to Face Recognition? Assessment and Detection. arXiv 2018, arXiv:1812.08685.
136. Yang, X.; Li, Y.; Lyu, S. Exposing Deep Fakes Using Inconsistent Head Poses. In Proceedings of the ICASSP 2019-2019 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 8261–8265.
137. Contributing Data to Deepfake Detection Research—Google Research Blog. Available online: https://blog.research.google/2019/09/contributing-data-to-deepfake-detection.html (accessed on 5 October 2023).
138. Wang, Y.; Chen, X.; Zhu, J.; Chu, W.; Tai, Y.; Wang, C.; Li, J.; Wu, Y.; Huang, F.; Ji, R. HifiFace: 3D Shape and Semantic Prior
Guided High Fidelity Face Swapping. arXiv 2021, arXiv:2106.09965.
139. He, Y.; Gan, B.; Chen, S.; Zhou, Y.; Yin, G.; Song, L.; Sheng, L.; Shao, J.; Liu, Z. ForgeryNet: A Versatile Benchmark for
Comprehensive Forgery Analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition,
Nashville, TN, USA, 21–25 June 2021; IEEE Computer Society: Washington, DC, USA, 2021; pp. 4358–4367.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.