A Review of Deepfake Techniques: Architecture, Detection, and Datasets
ABSTRACT Driven by continuous advancements in artificial intelligence, especially deep learning, the
level of realism associated with deepfake technology continues to improve year after year, which poses
unprecedented challenges to the field of deepfake detection. The boundary between what we as humans can
detect as real or fake becomes ever more blurred as new generations of algorithms such as Dall-E 3 and Stable
Diffusion are released. This paper provides a comprehensive study into the landscape of deepfake detection,
exploring in-depth the key challenges, recognising recent successes, and suggesting promising avenues for
future research. A meta-literature review is conducted to identify the current challenges and future directions,
which form the foundation of this work. They are investigated by analysing state-of-the-art research with
a focus on the key components that are crucial to the design of a deepfake detector, i.e., the architecture,
detection methods and datasets. A major challenge identified by this study is the lack of dataset diversity
leading to unfair attribute representation. This must be addressed by improving standardisation on dataset
ethics and privacy. This is one of the main reasons for the insufficient generalisation capability of current
deepfake detectors as demonstrated by their unsatisfactory performance when faced with unseen data or
data in the wild. This literature review provides deepfake detection researchers and practitioners with the
latest information that will serve as a vital resource for their continued and important activity, now and in
the future.
INDEX TERMS Deepfakes, deepfake detection, generative AI, deep learning, machine learning, artificial
intelligence, datasets, survey.
the integrity and authenticity of the video ensured that it failed its objective. This deepfake was one of many examples where such technology, in the wrong hands, can become a weapon in the digital age that we live in.

Academic research in the field of deepfakes has grown significantly since 2017, as illustrated in Figure 1, which shows statistical data collected from Dimensions [5] using the publication type 'Article' or 'Preprint' and a date range from '2017' to '2024'. Note that a linear trendline was used to extrapolate them until 2025. Unsurprisingly, considering the mass media attention around deepfakes and fake news, the statistics highlight that the volume of published research on deepfake detection is far outpacing research on deepfake creation, illustrating the demand for solutions on this controversial topic.

FIGURE 1. Statistics depicting the number of deepfake, deepfake creation or deepfake detection studies published over the last seven years (2017 to 2023).

Prior to deepfakes, the detection of manipulated imagery often focused on the semantic characteristics within the image, in terms of what can be seen and its overall composition. For example, research by O'Brien and Farid [6] focused on the vanishing point in an image to establish the relationship with common reflections and determine the feasibility of the image containing forged content. Lighting and shadow details within the image are other examples of inconsistencies that may occur. Furthermore, Johnson and Farid [7] report that lighting and cast shadows can be used to assess whether the lighting source is consistent for all the objects within the image and, therefore, with reasonable accuracy, identify whether manipulation has occurred. These techniques have successfully transitioned to the task of deepfake detection, with promising results. For example, Wu et al. [8] explain that subtle clues in the swapped face region can expose inconsistencies that do not match the composition of the image as a whole and are often the unintentional result of the deepfake creation pipeline. Additionally, this technique can be effective against video content, as explained by Zhu et al. [9], where inconsistencies in the inter-frame sequence highlight abnormalities. However, quality and detail improvements in image and video media have led to the need for a more robust approach.

Artefacts uncovered in the spatial and frequency domains have exposed vital information relating to either the pixel formation (spatial domain) that makes up the overall image over time, or the frequency representation, e.g., low- or high-frequency components (frequency domain), which corresponds to the rate at which the pixel information is changing. For example, a disturbance in the surrounding pixel formation between the old and new content can provide valuable statistical information that may expose the boundary of where the manipulation occurred [10]. Additionally, facial blending inconsistencies, which are inherently transferred by the synthesis process during the creation of a deepfake, leave traceable artefacts in the image statistics [11]. The camera-model fingerprinting method NoisePrint has demonstrated success in applying the Photo-Response Non-Uniformity technique to extract and compare noise signatures from images in a manner similar to extracting a person's fingerprint [12]. Furthermore, utilising the spatial and frequency domains as handcrafted features for ML has paved the way forward with novel detection methods capable of inferring information from complex data with limited human interaction. Typically used with a binary classifier and a fully connected layer, the process of selecting the features to be learned can present challenges, particularly when the underlying pipeline for deepfake creation is continuously evolving. Indeed, this can result in poor generalisation on unseen data.
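To make the idea of a frequency-domain handcrafted feature concrete, the following minimal sketch (not taken from any paper reviewed here; the radial cutoff and the feature definition are illustrative assumptions) computes the share of spectral energy in the high-frequency band of a greyscale image, a scalar that could then be fed to a conventional binary classifier:

```python
import numpy as np

def high_frequency_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Share of spectral energy above a radial cutoff (illustrative feature)."""
    spectrum = np.fft.fftshift(np.fft.fft2(gray))   # centre the zero frequency
    energy = np.abs(spectrum) ** 2
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h / 2, xx - w / 2)       # distance from spectrum centre
    high = energy[radius > cutoff * min(h, w)].sum()
    return float(high / energy.sum())
```

Upsampling and blending operations in synthesis pipelines tend to leave statistical traces in exactly such bands, which is why features of this kind were an early staple of ML-based detectors.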
An important milestone in the detection of deepfakes has been made possible through DL, whereby a model is able to learn complex multi-dimensional patterns from complex datasets using artificial neurons that replicate the way in which the human brain works [13]. In addition, this enables a richer representation of features to be learned in a way that would not be achievable through standard ML [14].

Since the take-off of DL around 2012, DL architectures have rapidly evolved, driving significant advancements in deepfake research, which can be observed in Figure 2. This progress began with Convolutional Neural Networks (CNNs), including VGG and CapsNet, which played a foundational role in shaping modern deepfake detection techniques and paved the way for future advancements with their novel approaches. However, as the years have advanced, interest has increasingly shifted towards the development of hybrid architectures. In particular, this can be seen in the number of variant architectures, including Transformer and CapsNet, which have seen great interest from the academic community.

To establish the starting point, ten previously published literature review papers from between 2022 and 2023 were evaluated in Section III to identify the key successes and challenges. Informed by them, four challenge themes, i.e., dataset, architecture and scalability, explainability, and evaluation, were established to further complement this study and guide the reader towards the main themes. Section IV provides the reader with a high-level overview of the common architectures used in deepfake detection, whereas Figure 2 provides a timeline of the main architectural variants and shows where research activity has evolved. A comprehensive breakdown of the techniques used for deepfake detection is presented in Section V, followed by Section VI, which provides the reader with an overview of the datasets used for model training and evaluation. Finally, Section VII reviews the observations and future trends exposed by the papers studied in Section V.
FIGURE 2. Timeline of architectures by class. The diagram highlights some of the main DL architectures and their associated variant architectures over time. The coloured bubbles indicate the number of research papers dedicated to new hybrid variants.

A. AIMS AND OBJECTIVES
The aim of this literature review is to provide the reader with an in-depth evaluation of the latest research on deepfake detection. This is achieved by exploring the various types of architectures and datasets that are used in order to understand how these crucial elements contribute to delivering State-of-the-Art (SOTA) detection. The primary objectives are as follows:
• Present an overview of key findings from recent literature reviews and produce a snapshot view of the observed challenges associated with the key challenge themes from Section I-B.
• Evaluate the architectures used in deepfake detection and identify their strengths and weaknesses.
• Investigate and compare SOTA research papers on deepfake detection.
• Conduct a review of datasets used for deepfake detection and the impact they have on generalisation and bias.
• Identify and compare the key challenges against those observed in previous literature reviews.

B. CHALLENGE THEMES
To guide the categorisation and identification of current and future key challenges, this analysis is based on the most important themes highlighted in previous literature review papers (see Section III). Table 1 provides a breakdown of these challenges along with their descriptions.

II. METHODOLOGY
This section provides an overview of the process used to undertake this review paper and details the research strategy and timeframe used.

A. RESEARCH STRATEGY
In order to cast a wide net over this research domain, deliver a comprehensive review, and not be limited by any specific angle, no specific journal databases were used for the acquisition of research papers. In addition, since this review is focused on deepfake detection methods, material specific to the creation pipeline of deepfakes was not included in the search strategy. Moreover, as the process for selecting detection papers was based on image and video techniques, publications specifically associated with audio or text were excluded.
B. TIMEFRAME
The following timeframe was applied throughout this review paper in order to ensure that a comprehensive review is conducted while providing only the most relevant and recent research.
• Meta-Literature Review (Section III) – due to the fast-moving pace of research in this field, it was decided to prioritise the search for literature review papers published between 2022 and 2023 to provide coverage of the most recent work. Eventually, a subset of ten high-quality papers (see Table 2) was selected for review.
• Architecture (Section IV) – the search for papers on the subject of DL architectures was not limited by a date range; instead, the architecture needed to be associated with the learning of image media. The aim was to uncover the extensive range of research and the various key architectures and hybrid variants (see Figure 2 for a reduced diagram and Figure 5 in Appendix A for a full diagram).
• Deepfake Detection (Section V) – from over one hundred papers, a subset of twenty-nine prominent papers published in 2023 was chosen for analysis. This was based on their novel approach or contribution to this research domain.
• Datasets (Section VI) – the search for papers on the subject of datasets was not limited by a date range; instead, an extensive search was performed to identify as many key and novel datasets as possible (see Figure 3 and Figure 4).

III. META-LITERATURE REVIEW
This section explores ten recently published review papers in the field of deepfake creation and detection, aiming to provide a snapshot view of the approaches taken while understanding the common challenges and future directions. Table 2 provides a breakdown of the ten review papers selected, which cover the period between 2022 and 2023.

A. CURRENT CHALLENGES AND FUTURE DIRECTIONS
Based on the review papers listed in Table 2, current challenges and future directions have been clustered into the following four themes: dataset, architecture and scalability, explainability, and evaluation. The aim is to cover the high-level trends affecting deepfake detection research while identifying ways to mitigate these challenges and support future directions.
1) DATASET
The dataset theme is arguably the most important one, as this fundamentally determines how effective a trained model is at performing the task at hand. Access to good-quality data is therefore crucial. Although this often becomes a challenge for researchers, it can sometimes be overlooked.

A lack of publicly available global datasets, which provide a fair representation of forgery techniques and real media, has been observed as a critical limitation in the development of deepfake detection methods [15], [22], [23]. Researchers will often supplement their work by utilising custom datasets [18], [23] for training and evaluation. However, these generally provide limited insight into how effectively a model performs compared with other datasets and are often unavailable to the general public. A community-led global dataset [23] for AI-synthesised images could provide a global approach to benchmarking and evaluating model performance. Providing a structured approach [18] to the way in which data is used for training and evaluation would allow for improved measurable accuracy while offering a consistent and transparent approach. Existing datasets commonly used by researchers [24] for training and evaluation include FaceForensics++ (FF++) and Celeb-DF. However, these alone do not provide enough of a challenge [23] to evaluate success, as they do not represent current media found in the real world.

The quality and size of the datasets are also observed [17], [19], [21], [22] as a key challenge, especially as some of the detection methods are trained on limited data [17] or data that is specific to a single creation technique [20]. A model's ability to generalise to unseen data becomes a greater challenge when the quality and size of the data are limited [19]; this may render the model ineffective against real-world data. This is particularly important in situations of fairness, whereby the risk of increasing bias towards certain attributes, including ethnicity, gender and age, has genuine consequences that can lead to racial profiling and discrimination if used in real-world applications [19]. The risk of adversarial attacks to avoid detection [21] is also raised as a serious concern, whereby input perturbations are applied to the data (sometimes referred to as data poisoning) to prevent the model from learning true representations of the intended data. Moreover, the use of pre-processing techniques [9] has implications, particularly if the researcher has not fully understood the dataset in question.
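As a concrete illustration of the fairness audit implied above, the sketch below tabulates per-class attribute representation with pandas. The metadata table is hypothetical; most public deepfake datasets do not ship such attribute labels, which is itself part of the problem:

```python
import pandas as pd

# Hypothetical metadata for a labelled deepfake dataset.
meta = pd.DataFrame({
    "label":     ["real", "fake", "fake", "real", "fake"],
    "gender":    ["f", "m", "f", "m", "f"],
    "ethnicity": ["a", "b", "b", "a", "c"],
})

# Per-class share of each attribute value; strong skews here signal the
# kind of representation bias that harms fairness and generalisation.
for attr in ("gender", "ethnicity"):
    print(meta.groupby("label")[attr].value_counts(normalize=True))
```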
The general consensus indicates that many of the published detection methods are constrained to working in environments that do not reflect the real world and are therefore likely to fail to generalise or infer correctly if applied to data in real-world applications. Continuous advances in technology [22] will eventually lead to the creation of full-body deepfakes, which will ultimately present new challenges for managing new and emerging datasets going forward. As observed by Naitali et al. [25], the WildDeepfake [26] dataset provides imagery related to full-body deepfakes, yet its volume and diversity still somewhat limit its effectiveness for AI. Despite the limited amount of available full-body deepfake data, research by Hong et al. [27] highlights the progress being made to generate 3D moving full-body content, which in the future will aid the development of new datasets.

2) ARCHITECTURE
Early implementations of ML architectures exposed weaknesses in their ability to extract local and global features [15], making it difficult for the model to effectively learn from the selected handcrafted features. Indeed, selecting the relevant features [17] is a complex process, which has become more and more challenging as the quality of deepfake media improves. Combining DL approaches with traditional techniques (statistical analysis, for example) [18] has highlighted an overall improvement in model performance, suggesting that selecting handcrafted features is no longer sufficient for learning complex patterns in media content. A shift towards a hybrid approach [19], [21], combining various architectures to overcome known limitations, has prompted a new research direction focused on DL. Although at present there is no architectural framework that provides a sufficiently stable platform, the Xception Network [18] has demonstrated promising results as a DL architecture. Indeed, recent literature demonstrates how DL approaches can outperform non-DL approaches [24]. For example, a multi-modal architecture has been proposed to allow for combining features across audio, imagery and video media [17]. However, there is a lack of research on temporal aggregation [18] in video media, whereby the temporal consistency between frames is evaluated rather than providing a binary classification for the entire video sequence.

A negative trade-off in the success of DL approaches comes in the form of a steep rise in the computational resources required as the models grow in complexity [15]. Fortunately, adopting pre-trained models based on existing architectures is one way of overcoming this challenge [21]. Pre-trained models can be fine-tuned for downstream tasks [21] while reducing the number of trainable parameters and the overall computational effort, as sketched below. Managing a model's complexity to ensure its efficiency through optimisation is essential for the transition from research to capability deployment. However, ensuring optimised inference times [17] while expanding a model's size through new training data needs careful consideration. This is particularly true for DL architectures, where the number of parameters dramatically increases with network depth [15].
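As an illustration of the pre-trained-backbone strategy, the minimal PyTorch sketch below freezes an ImageNet-pre-trained ResNet-50 and trains only a new binary head; the choice of backbone and weights is an assumption for illustration, not a recommendation from the cited reviews:

```python
import torch.nn as nn
from torchvision import models

# Load a pre-trained backbone and freeze it, so only the new binary head
# is trainable -- one common way to realise the parameter savings
# described above (weight-name strings require torchvision >= 0.13).
model = models.resnet50(weights="IMAGENET1K_V2")
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)  # real vs. fake head

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")   # head only, ~4k vs. ~25M total
```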
3) EXPLAINABILITY
The third and most challenging theme is explainability [15], [16], [17], [20], [22], as most of the proposed deepfake detection methods provide only a binary classification (real or fake) as their output. The lack of additional context around the model's decision process can lead to issues in trust and interpretability [22], which ultimately results in ambiguity [15] in understanding and evaluating a model's effectiveness. In particular, many of the reviewed methods fail to provide localised information [22] that would identify where the manipulation likely occurred. This is further compounded by the black-box [20], [22] nature in which a model is trained and tested, which is not adequate [16] for providing confidence to the user that an image or video is a deepfake. By addressing the black-box nature as a main area of research in DL, progress is being made that will eventually support deepfake detection. One way of adding explainability could be through the use of multi-class classification combined with a confidence score [17], as this would give the user the power to make an informed decision.
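A minimal sketch of that suggestion (the four classes are hypothetical) is to expose the softmax probability of the predicted class alongside the decision:

```python
import torch

# Hypothetical 4-way head (e.g., real / face-swap / reenactment / synthesis).
# Reporting the per-class probability alongside the decision provides the
# confidence score discussed above.
logits = torch.tensor([[-0.3, 2.1, 0.4, -1.2]])
probs = torch.softmax(logits, dim=1)
conf, pred = probs.max(dim=1)
print(f"class {pred.item()} with confidence {conf.item():.2f}")
```

Note that raw softmax confidences are known to be poorly calibrated, so in practice such scores would need calibration (e.g., temperature scaling) before being presented to a user.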
4) EVALUATION
The absence of a structured and uniform approach to evaluating deepfake detection methods is observed in [19] and [23], with an emphasis on the creation of a consistent approach to benchmarking against other SOTA techniques. Although the accuracy metric [24] is identified as the most common measurement, it does not take into account the multitude of existing metrics, including precision, recall, and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC), which provide valuable insight into the state of a given model. In addition, reported evaluation metrics are at times over-inflated [23] and do not clearly reflect the way in which the model was trained or evaluated, which in turn leads to inconsistencies in the authors' work. As the measure of success is often based on achieving higher accuracy than other SOTA techniques while benchmarking against common datasets [21], little to no attention is paid to real-world scenarios. The fact that many of the common datasets contain outdated image and video content that does not adequately reflect new and emerging deepfake technology [19] diminishes the models' ability to perform well against data from the wild.
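For reference, all of the metrics named above are available off the shelf; the sketch below (with toy labels and scores) computes them with scikit-learn, where AUC-ROC is the only one that does not depend on a fixed decision threshold:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)

# Toy ground truth (1 = fake) and model scores; in practice these come
# from a held-out, ideally cross-dataset, evaluation split.
y_true  = [0, 0, 1, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
y_pred  = [int(s >= 0.5) for s in y_score]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("AUC-ROC  :", roc_auc_score(y_true, y_score))  # threshold-free
```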
IV. ARCHITECTURE
To put into context the architectures that have been exploited in deepfake detection, the following section provides an overview of the architectures commonly used in the computer vision domain. Figure 2 provides a timeline of the common DL architectures. The coloured bubbles within the figure indicate where increased research has been dedicated to creating new hybrid implementations based on a specific class of architecture.
A. CONVOLUTIONAL NEURAL NETWORK
One of the most prominent contributions to research on Convolutional Neural Networks (CNNs) was made by LeCun et al. [28] in 1989 to overcome the challenge of performing numerical character recognition and classification through ML. This subsequently led to the development of the LeNet-5 architecture in 1998 by LeCun et al. [29], which consisted of a seven-layer network using three convolution layers (with a kernel of 5 × 5) with feature maps of size 6, 16 and 120, two subsampling layers with 2 × 2 receptive fields for average pooling, and finally two fully connected layers. According to LeCun et al. [29], the use of back-propagation in the network can help reduce the number of trainable parameters, since the weights in the feature extractor can be learned through a shared scheme.
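A PyTorch rendering of that layout might look as follows (a sketch of the description above, not the original 1998 implementation; the Tanh activations follow the conventions of the era):

```python
import torch
import torch.nn as nn

class LeNet5(nn.Module):
    """Sketch of the LeNet-5 layout described above: three 5x5 convolutions
    with 6, 16 and 120 feature maps, two 2x2 average-pooling (subsampling)
    layers, and two fully connected layers."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(),
            nn.AvgPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),
            nn.AvgPool2d(2),
            nn.Conv2d(16, 120, kernel_size=5), nn.Tanh(),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

print(LeNet5()(torch.randn(1, 1, 32, 32)).shape)  # torch.Size([1, 10])
```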
1) CAPSULE NETWORK
Proposed in 2011 by Hinton et al. [30], the Capsule Network (CapsNet) was designed to overcome a significant limitation in the way complex spatial relationships are learned by conventional CNNs. According to Hinton et al. [30], CNNs are non-equivariant and therefore do not take into account the precise positional relationships between facial features in both the local and global feature space. In essence, a CNN will learn the arrangement of facial features (eyes, nose, mouth, etc.) as individual components and will not take into account the location of neighbouring facial components. To learn the global feature space, Kwabena Patrick et al. [31] explain how each capsule contains a series of neurons that focus on learning from the same feature space while individually learning distinctive properties of the feature. Further to this, the CapsNet uses two layers of capsules, represented as the lower- and higher-level capsules, whereby the output from each capsule is in the form of a vector. Interest from the research community in modifying and improving the architecture can be seen in Figure 2, where the growth of hybrid variants increased between 2018 and 2021.
2) INCEPTION NETWORK
Szegedy et al. [32] proposed the Inception network in 2014, using a sparsely connected architecture combined with a CNN not only to achieve greater network depth but also to overcome the demand on computational resources. According to them, significant growth in network parameters can occur as a consequence of increasing the number of fully connected layers within the architecture, due to the connectivity between each and every layer across the network. They suggest applying dimensionality reduction through a series of max and average pooling operation aggregations to help reduce the number of expensive computational operations while preventing the transfer of a large number of filters between the layers within the network. The authors claim that a key design motivation is the scalability of the architecture to run on devices with potentially low computational resources, making it more viable for deployment in real-world situations.
3) XCEPTION NETWORK
In 2017, Chollet [33] presented the Xception network as a parameter-efficient adaptation of the Inception-V3 [34] architecture, where stackable depthwise separable convolutions are used as a replacement for the Inception module while achieving notably improved performance. The original hypothesis behind the Inception network was that cross-channel and spatial correlations should not be mapped together in the feature map, as they are sufficiently separate. Xception takes this further, assuming that cross-channel and spatial correlations can be mapped entirely separately in the feature map, as sketched below. Comparative testing on the ImageNet dataset shows only marginal improvement in performance, yet the author highlights that the parameter count is similar to that of the Inception-V3 network and that any improvement is likely attributable to the more efficient use of the parameters rather than the depth of the network.
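The building block itself is compact; a minimal PyTorch sketch of a depthwise separable convolution (channel sizes are illustrative) is:

```python
import torch
import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Depthwise separable convolution in the spirit of Xception: a
    per-channel (depthwise) spatial filter followed by a 1x1 (pointwise)
    projection, mapping spatial and cross-channel correlations separately."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel, padding=kernel // 2,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

print(SeparableConv2d(64, 128)(torch.randn(1, 64, 32, 32)).shape)
```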
4) DENSE CONVOLUTIONAL NETWORK
The Dense Convolutional Network was proposed by Huang et al. [35] in 2017 as a stackable feed-forward network designed as a more efficient and deeper CNN, where dense blocks act as the building blocks of the network. Inspired by the use of skip connections in the Residual Neural Network (ResNet) architecture to overcome the challenge of gradient flow, the authors propose direct connections from any layer to all preceding layers, thereby alleviating the gradient flow problem. Furthermore, a convolution and down-sampling layer using average pooling is positioned between each of the dense blocks. The paper highlights competitive results against other SOTA architectures on the tested evaluation datasets, with the potential to provide improved learning through feature sharing across the network.

B. TRANSFORMER
The Transformer was proposed by Vaswani et al. [36] in 2017 as an alternative to CNN and Recurrent Neural Network architectures and is based on the use of self-attention. According to them, the design incorporates an encoder and decoder structure and is based on a fully connected feed-forward network. Using learned embeddings, the input and output tokens are converted to vectors, with additional positional information embedded.

The Vision Transformer (ViT) was proposed by Dosovitskiy et al. [37] in 2021 and builds on the success of the Transformer for Natural Language Processing. Using self-attention as a key component, the architecture works by feeding an input image as a sequence of fixed-size patches, each flattened into a one-dimensional vector with embedded positional information, before being linearly projected to a Transformer Encoder using Multi-Headed Self-Attention, as sketched below. The classification is then performed using a Multi-Layer Perceptron block. Even though impressive results are observed on the evaluated datasets, the authors highlight the lack of image-related inductive bias when the model is trained on smaller datasets, which can result in overfitting or poor generalisation. However, the authors observe that on larger datasets, the model learns the patterns from the data itself, and therefore inductive bias becomes less essential. Similarly, the ViT has gained increased interest from the research community, as illustrated in Figure 2, where hybrid variants have continuously been developed.
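A minimal sketch of the ViT input pipeline (patch size, image size and embedding width follow common ViT-Base settings but are assumptions here) is:

```python
import torch
import torch.nn as nn

# Split the image into fixed-size patches, flatten each, project linearly,
# and add learned positional embeddings (zero-initialised in this sketch).
image = torch.randn(1, 3, 224, 224)
patch, dim = 16, 768
num_patches = (224 // patch) ** 2                                   # 196

patches = image.unfold(2, patch, patch).unfold(3, patch, patch)     # 1x3x14x14x16x16
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(1, num_patches, -1)
tokens = nn.Linear(3 * patch * patch, dim)(patches)                 # linear projection
tokens = tokens + nn.Parameter(torch.zeros(1, num_patches, dim))    # positional embedding
print(tokens.shape)  # torch.Size([1, 196, 768])
```

The resulting token sequence is what the stack of Multi-Headed Self-Attention encoder layers then consumes.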
C. GENERATIVE AI
The ability to generate realistic content based on generative AI has accelerated over the past several years with the introduction of advanced DL models that are capable of generating text, image, video, and audio content [38]. Dall-E 3 [39], Imagen [40] and Stable Diffusion [41] are just some of the prominent models available, which can produce new samples based on the patterns learnt by the models [42].
1) GENERATIVE ADVERSARIAL NETWORK
The concept of training two simultaneous models that compete against each other was proposed by Goodfellow et al. [43] in 2014 and is known as the Generative Adversarial Network (GAN). The two models, a generator (creator) and a discriminator (detector), are trained against each other, where the aim of the discriminator is to estimate the probability that a given sample came from the training dataset rather than from the generative model. According to them, a key design feature is the implementation of backpropagation in the network for improved gradient flow, which is an alternative to the use of Markov chains. Furthermore, forward propagation can be used to sample from the generative model.
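A toy PyTorch sketch of this adversarial objective on synthetic 2-D data (all shapes and hyperparameters are illustrative; real deepfake generators are deep convolutional networks) is:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))   # generator
D = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))   # discriminator
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for _ in range(100):
    real = torch.randn(32, 2) + 3.0            # stand-in "training data"
    fake = G(torch.randn(32, 8))
    # Discriminator: estimate the probability that a sample is real.
    d_loss = (bce(D(real), torch.ones(32, 1))
              + bce(D(fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()
    # Generator: fool the discriminator into labelling fakes as real.
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```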
2) DIFFUSION MODELS
Diffusion Models (DMs), described by Yang et al. [44], are based on a type of probabilistic generative model that works by introducing noise into the input image as it traverses the network in a forward pass. To generate new content, the model must learn to reconstruct the image by removing the noise, which is known as the diffusion process. According to the authors, the three key types of DMs are Denoising Diffusion Probabilistic Models [45], Score-based Generative Models [46] and Stochastic Differential Equations [47].
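For Denoising Diffusion Probabilistic Models specifically, the forward (noising) process has a closed form; the sketch below (schedule values are common illustrative defaults) draws a noised sample x_t directly from a clean image x_0:

```python
import torch

# Linear noise schedule; alpha_bar[t] controls how much of the clean
# image survives at step t. The network is then trained to predict the
# added noise so it can reverse this process at sampling time.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0: torch.Tensor, t: int) -> torch.Tensor:
    eps = torch.randn_like(x0)
    return alpha_bar[t].sqrt() * x0 + (1 - alpha_bar[t]).sqrt() * eps

x0 = torch.randn(1, 3, 64, 64)   # stand-in for a training image
print(q_sample(x0, t=500).shape)
```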
tion and their spectral relationship within the frequency
V. DEEPFAKE DETECTION domain. The design incorporates both a Regular and Irregular
The aim of this section is to explore the most current literature Local Fourier to increase local information, whilst utilising a
on deepfake detection in order to establish a baseline view Pointwise convolution to overcome the challenge of gradient
of the current SOTA methods available. Table 3 provides loss and explosion. To extract multiple features and capture
a breakdown of twenty-nine novel methods from literature their interactions, the authors incorporate a cross-attention
in 2023 and each represents a unique approach to deepfake mechanism with multiple Binary Cross-Attention blocks.
detection. Although evaluation results demonstrate comparative results
against other SOTA methods, particularly in cross-dataset
A. CONVOLUTIONAL NEURAL NETWORK evaluations, the model underperforms as the level of image
To improve generalisation, the authors in [51] use a ResNet18 compression increases through post-processing operations.
as the base architecture and combine a K-Nearest Neig- To address this, it is proposed [62] to achieve greater robust-
bors algorithm for the feature classification. To enhance the ness by leveraging low and high features that are often
architecture, they implemented Error Level Analysis dur- destroyed during the post-processing operation with a frame-
ing the pre-processing stage and supplied it as the input work called TruFor [62]. Using a modified Noiseprint [12],
to the ResNet18 for feature extraction. Although better they intend to expose essential image information about the
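The Error Level Analysis step used in [51] is straightforward to reproduce; a minimal sketch using Pillow (the re-save quality is an assumption) is:

```python
from io import BytesIO
from PIL import Image, ImageChops

def error_level_analysis(path: str, quality: int = 90) -> Image.Image:
    """Re-save the image as JPEG and return the per-pixel difference."""
    original = Image.open(path).convert("RGB")
    buffer = BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer)
    return ImageChops.difference(original, resaved)
```

Regions edited after the original compression tend to exhibit a different error level than their surroundings, which is what makes the difference map a useful network input.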
Study [53] employs a patch-based approach using the Gram-Net architecture proposed in [78]. According to the authors of [78], Gram-Net incorporates a Gram Block positioned before the down-sampling layer of a ResNet backbone to learn global features. However, it is worth considering that the reported 100% accuracy in the initial evaluation of the first two datasets [53] might require further validation to ensure robustness. In [60], the authors identify a potential weakness in the CNN architecture that could lead to spatial information being lost because of the design of the max pooling layer, whereby the model becomes unable to learn truthful representations from the original image features. Thus, they hypothesise that by adopting a Visual Geometry Group (VGG)-19 network as a feature extractor, spatial information in the lower layers of the network can be retained. To emulate how the human brain works, a CapsNet is used for the classification task, as it uses neurons to learn key relationships between facial components and their orientation. Eventually, their new architecture not only delivers enhanced performance and convergence speed but also reduces the model complexity. In [55], the authors also address CNNs' insufficiency in extracting spatial features, whilst bringing to attention the need for further research into temporal feature extraction. They tackle these shortcomings by implementing pre-processing steps to enhance image quality using Gaussian noise reduction and employing the Lucas-Kanade optical flow algorithm. As in previous work, the authors finalise their design using the VGG with a CapsNet, to which they add a modified Dynamic Routing Algorithm. They report significant performance gains over other SOTA methods.

Methods to extract noise information from the frequency domain have demonstrated significant progress in the field of deepfake detection. Restoration techniques to restore low-quality artefacts from compressed images could prove valuable, especially when access to high-quality deepfakes is often limited. The authors in [73] consider magnitude and phase spectra to capture contour and textural information and their spectral relationship within the frequency domain. The design incorporates both a Regular and an Irregular Local Fourier module to increase local information, whilst utilising a pointwise convolution to overcome the challenge of gradient loss and explosion. To extract multiple features and capture their interactions, the authors incorporate a cross-attention mechanism with multiple Binary Cross-Attention blocks. Although the evaluation demonstrates comparative results against other SOTA methods, particularly in cross-dataset evaluations, the model underperforms as the level of image compression increases through post-processing operations. To address this, it is proposed [62] to achieve greater robustness by leveraging low- and high-level features that are often destroyed during post-processing operations, with a framework called TruFor [62].
Using a modified Noiseprint [12], they intend to expose essential image information about the manipulation history of an image to form a unique noise signature. Then, Contrastive Learning is used to compare the similarity of random patches extracted from the input image to learn anomalies based on their noise signature and, in turn, infer whether the image has been forged. Also observing that image post-processing techniques can result in the contamination of low-level features and, fundamentally, information loss, it is proposed to use more traditional data-centric techniques for feature enhancement, such as sharpening [58]. Unfortunately, this leads only to minor performance gains. To restore low-level contaminated features to their original state, it is suggested [58] to employ an EfficientNet backbone using an adversarial learning strategy and discriminator, which is in contrast to the approach in [55]. Cross-dataset evaluation shows the value of this approach by outperforming other SOTA methods. Still, it is clear that further research in this area should be considered, particularly in the anti-forensic domain.

Research on learning temporal inconsistencies in deepfake video content is another area showing promising results. A ResNet with Contrastive Learning is used in [49], in which discriminative features are learnt through a multi-modal approach (video and audio) using two separate networks. Named 'Person of Interest' (POI), the models are trained on real video content using a ResNet50, delivering significant performance gains through the use of the combined feature embeddings, which are fused using a Multilayer Perceptron. As the study recognises a potential limitation that can result in reduced generalisation when only a single POI video is provided, the authors recommend that multiple videos of the target are used to ensure greater accuracy. Instead of focusing on a multi-modal approach, study [50] uses extracted video keyframes to overcome compression and image quality loss using a Deep Convolutional Transformer model, which utilises Convolutional Pooling and Re-Attention.
A 17-layer CNN with a kernel size of 3 × 3, batch normalisation and a GELU activation function is used to extract local features before they are fed directly to a Pooling Transformer, which uses depthwise separable convolution for learning global representations.
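One stage of such a local feature extractor might be sketched as follows (a generic sketch; channel widths and depth are assumptions, with only the 3 × 3 kernels, batch normalisation and GELU taken from the description of [50]):

```python
import torch.nn as nn

# One local-feature stage: 3x3 convolution, batch normalisation, GELU.
def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.GELU(),
    )

stem = nn.Sequential(conv_block(3, 32), conv_block(32, 64))
```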
To address the limitations of the previous method, study [71] uses a noise-based approach where the extraction of the I-frame, or intra-frame, enables a higher level of image quality to be recovered post-compression. Furthermore, the authors utilise a Siamese network combined with a pre-trained Recursive Information Distillation Network (RIDNet) to extract noise patches from the face and background, using a Euclidean distance to ensure the background region is the furthest away. In addition, a Multi-Head Relative-Interaction module is designed as a replacement for the cosine similarity used by Siamese networks, to enhance the measurement of similarity between the noise patches whilst overcoming limitations that the authors perceive as information loss and performance degradation. Overlapping face and background patches are flagged for future research in the study, as these could result in the model generating false-positive classifications. A Multi-Rate Excitation Network is proposed in study [76], where spatial-temporal inconsistencies are learnt through bipartite groups that measure different sampling rates. The expectation is that sampling at different rates can encourage the network to learn longer-distance temporal inconsistencies. A Momentary Inconsistency Excitation module is used to extract spatial artefacts and to force the network to learn short-distance cross-group temporal inconsistencies, while a Longstanding Inconsistency Excitation module focuses on long-term temporal relationships. The evaluation highlights that spatial and temporal embeddings play an equal role in the effectiveness and robustness of a detection method.
The use of facial features to extract identity information is another research direction achieving promising results. Unique facial characteristics acquired from the local and global feature space can prove valuable, especially in the area of identity leakage, towards improving generalisation. In study [74], the authors devise a novel Implicit Identity Driven framework to measure the distance between the explicit and implicit identities of real and fake faces. This approach relies on an Explicit Identity Contrast (EIC) and an Implicit Identity Exploration (IIE) loss, using a CNN architecture as the backbone. The hypothesis is that, within the feature space, the fake face converges more closely with the implicit identity of the target face than with the explicit identity of the source face during the face-swapping process. Therefore, the EIC loss can be used to separate samples within the feature space by creating discriminative feature embeddings, while the IIE loss helps to refine the implicit identity from the target face. Alternatively, the Spatial Interaction Network, proposed in [64], utilises a Region of Interest layer and a Recursive Feature Eliminator to generate a local feature map using coordinates from four facial regions (nose, mouth, left eye and right eye). It is suggested that the removal of the max pooling layer of an Xception network (backbone) keeps the loss of low-level features to a minimum. A Spatial-Aware Module learns the weighted importance of both the local and global features to compute the similarity between the features. A predicted score is then produced via a Multi-Layer Perceptron. Eventually, it is shown that leveraging local features from the global feature map can lead to a reduction in computational resources whilst achieving competitive results. In contrast to the previous method, the authors of [77] highlight that unintentional implicit identity information may be learnt by binary classifiers, which could result in a model becoming biased and therefore increase the risk of misclassification. The study evaluates several architectures pre-trained using FFHQ [79], and despite the model not being trained on several datasets, implicit identity leakage was discovered on the Celeb-DF [80] and LFW [81] datasets. To overcome this, the authors propose a multi-scale anchor to focus on local regions and in turn limit the model's exposure to global identity information. In [75], low-dimensional embeddings are extracted using Principal Component Analysis, an Autoencoder and Fourier analysis to demonstrate how common features in the spatial domain can be used to distinguish between fake and real images. The expectation is that unique geometric attributes associated with the synthesis process can be used to exploit within-class similarities that relate to how the face is aligned, the pose and other relevant factors. Although alterations to the geometric aspects, such as cropping, could result in performance degradation, the authors believe their approach is less likely to be affected by anti-forensic or laundering attacks. Instead of focusing on features in the spatial domain, the approach taken in [54] uses a Rationale-Augmented CNN to perform facial reconstruction for deepfake detection. The authors observe that substituting the cross-entropy loss of an Inception network for a triplet loss function would enable the model to quantify new facial features during the training phase. Experiments show the triplet loss has the potential both to improve the model's overall performance and to provide computational efficiency when performing similarity matches between each face.
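The substitution itself is a one-line change in most frameworks; a minimal sketch with random stand-in embeddings (real network outputs would replace them) is:

```python
import torch
import torch.nn as nn

# Triplet objective in the spirit of [54]: pull an anchor face towards a
# same-identity positive and push it away from a negative in embedding space.
triplet = nn.TripletMarginLoss(margin=1.0)
anchor, positive, negative = (torch.randn(16, 128) for _ in range(3))
loss = triplet(anchor, positive, negative)
print(loss.item())
```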
Another direction of research has focused on Graph Neural Networks (GNNs), as it is believed that they can learn a richer representation of features, particularly when it comes to interconnected facial landmarks, which could prove crucial in the absence of diverse deepfake datasets. In [57], the authors propose to combine a GNN and a pyramid ResNet structure to enhance model interpretability. The input image is first spliced into patches before being fed to the K-Nearest Neighbours algorithm to construct the graph. There have also been attempts to utilise individual pixels [82] for constructing the graph, or to integrate GNNs with the CapsNet and Long Short-Term Memory network, but these resulted in degraded performance due to model complexity [57]. Also motivated by the ability of GNNs to structure vital information from complex facial features by constructing high-level relationships between interconnecting nodes, it is proposed in [72] to improve the learning of long-term interdependencies of a Graph Convolutional Network (GCN) by combining it with a pre-trained ViT network using Contrastive Learning. In addition, since high-level features are supposed to be less prone to interference from the manipulation process, they are key to providing greater generalisation to unseen data. The study observes a relationship between the graph convolution layers and the receptive field for learning long-term information: performance degradation could result from a reduction in the number of layers. To address the limitations of the previous study, in [59] a GCN architecture is enhanced by using a dynamic graph learning approach, since this enables the model to build the structure of each layer dynamically as the model continuously learns. Identifying the initial optimum number of neighbouring nodes for the construction of the graph is highlighted as a limiting factor in the model's design. In other words, the model could demonstrate signs of under- or over-fitting of its data during the training phase. Despite this, the adaptability of the graph structure to change allows for a more flexible approach.
B. TRANSFORMER
The authors of study [52] recognise a key weakness in traditional CNNs, where limited coverage of global features has an impact on a model's ability to learn the image as a whole. The authors further explain how the convolutional filter applied in the pooling layer results in vital information being removed. To overcome the loss of this information, the authors apply a ViT architecture, explaining how embedded global information can be captured through the use of a Multi-Head Self-Attention layer using positional patch embeddings. To improve overall performance, the authors apply a hybrid approach by combining the EfficientNet architecture with a ViT to improve the representation of spatial information across the local and global feature space. In study [56], low- and high-level semantic information is used to evaluate how discriminative features provide separability between real and fake distributions, trained separately using a pre-trained Xception and a ViT network. Using linear probing, the study concludes that the ViT presents a stronger representation of high-level features and has the potential for more robust generalisation to unseen data. However, the authors observe that fine-tuning the complete backbone of parameters would result in efficiency challenges, while identifying that limited deepfake datasets have the potential to cause the model to overfit during downstream training. To overcome this limitation, the study adopts a strategy of fine-tuning 19%, or 16.92 million, of the overall ViT-Base parameters while freezing the remaining backbone.

Access to datasets and expensive computational resources, observed in [56], were also highlighted as challenges in [61], in which the authors emphasise the importance of developing a lightweight model that is capable of operating under such conditions. In contrast to [56], the authors claim to have further reduced the number of trainable parameters of the ViT-Base from 86M to 5.2M, or 6%. The evaluation of this novel approach, termed Shallow ViT, demonstrates competitive results against SOTA methods. However, only intra-dataset experiments were carried out, and therefore it is not possible to determine how the model would generalise under cross-dataset conditions. Alternatively, a lightweight framework is achieved in study [65], where the authors claim their model uses only 3% of the parameters normally associated with other SOTA methods. The authors fuse feature embeddings from spatial, temporal and spatiotemporal features as a holistic approach to deepfake detection, using a transformer-based architecture. Each respective embedding is processed through a pre-trained model using a combination of 2D and 3D convolutional layers with max pooling, which is then fed independently to a series of Transformer encoder layers using Multi-Head Self-Attention and a Multi-Layer Perceptron. The self-attention token embeddings are further concatenated using a sequence pooling technique before a binary classifier determines whether the content is real or fake [65]. Despite concerns about spatial information being lost [60] through the max pooling layer, the authors [65] believe that their framework can lead to improved performance from the reduction of spatial information whilst maintaining results competitive with other SOTA methods. Study [69] freezes the backbone parameters of a Swin Transformer during the training phase and subsequently fine-tunes the model for downstream classification. The authors hypothesise that during the creation of a deepfake, a loss of depth information can be used as a measurement to calculate the feature distance, whereby real faces will measure closer, and real and fake faces will measure further away. Trained using the source, target and fake faces, a depth map is created before being fed to a triplet loss network where the discriminative features are learnt with respect to their latent feature space. As highlighted in the study, SOTA results are observed; however, further testing is required to demonstrate the effectiveness of the model under more challenging conditions.

On the other hand, study [63] presents a novel detection strategy named Thumbnail Layout (TALL), combined with a Swin Transformer to enhance spatial and temporal awareness with the aid of Self-Attention and a Shifted Window mechanism. The study suggests improved generalisation when TALL is used with a Transformer; however, due to its model-agnostic design, TALL can be adapted to work with other DL architectures. The TALL element employs a dense-sampling approach to extract four consecutive frames at random, which are then resized and presented as a thumbnail layout, as sketched below.
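The layout step is simple to sketch; the following function (an approximation of the idea in [63], not the authors' code) tiles four frames into a single 2 × 2 image that a standard image backbone can then process:

```python
import torch

def thumbnail_layout(clip: torch.Tensor) -> torch.Tensor:
    """Arrange four consecutive frames (4, C, H, W) into a single 2x2
    thumbnail image (C, 2H, 2W); in the original method the sampled
    frames are first resized."""
    f0, f1, f2, f3 = clip
    top = torch.cat([f0, f1], dim=2)        # concatenate along width
    bottom = torch.cat([f2, f3], dim=2)
    return torch.cat([top, bottom], dim=1)  # concatenate along height

print(thumbnail_layout(torch.randn(4, 3, 112, 112)).shape)  # (3, 224, 224)
```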
A novel approach of decomposing the spatial-temporal features through a Transformer using self-attention and a self-subtracting mechanism is observed in study [66]. The authors believe that during the creation process, spatial information is often treated in isolation and is therefore independent of the inter-frame sequence, leading to temporal inconsistencies being introduced. To capture these inconsistencies, the self-subtracting mechanism helps to guide the network to target important temporal features. Based on Layer-wise Relevance Propagation [83], the authors introduce explainability to their model by visualising the discriminative and salient areas.

The authors of study [67] observe that poor generalisation can result from model training on datasets where the discriminative features are too subtle for a standard CNN architecture to interpret. To overcome this limitation, the authors propose a Frequency-Aware Attention Feature Fusion using an Xception network as the backbone. To improve the learning of local and global features, augmentation is applied to each RGB frame using a Discrete Cosine Transform and a series of learnable weights to capture a range of frequency-domain information. The authors subsequently convert the frequency information back to the spatial domain to produce a frequency-aware image. Visualising the model using a Gradient-weighted Class Activation Map (Grad-CAM) confirms that their approach can track regions with manipulation whilst ignoring those that are in fact pristine. In contrast, study [68] proposes an audio-visual multi-modal framework that fuses spatial and temporal feature embeddings to learn inter-frame joint relationships. The audio-visual embeddings are first encoded using a spatial and a temporal encoder and are based on a ViT network, which the authors claim is better at capturing a stronger representation of temporal features over time. A modified decoder, using a Bi-Directional Cross-Attention block, was employed to overcome the challenge of collaboratively learning from multi-modal distributions that are not the same.

C. GENERATIVE AI
Deep neural networks might be unable to discriminate subtle artefacts, particularly as generative models advance, which can result in redundant information being learnt. Based on this idea, an adversarial learning strategy to perform artefact disentanglement, using an encoder and decoder structure, is proposed to provide significant improvement during feature extraction [70]. Here, the features, together with the newly constructed fake image, are re-processed through the encoder and decoder to learn the ground truth. The study highlights that adversarial learning can not only achieve stronger generalisation but also overcome some of the challenges of training based on specific datasets.

VI. DATASET
Determining the right type of data for any research topic is challenging, and AI is no exception. Data plays an essential part in the training and validation of any given model, irrespective of the task at hand. However, determining what type of data is needed, and how much, requires careful consideration. Fortunately, access to and availability of datasets for research in the field of deepfakes have become more straightforward thanks to the tremendous effort from both academia and industry, where rich and diverse data can be acquired with ease. Nonetheless, the ethical approach taken to acquiring individual subjects and the consideration for their privacy should be factored into the thought process when determining a dataset's suitability. As can be seen from Table 3, which summarises the papers reviewed in Section V, there is consistency regarding the datasets used for model training and evaluation. This is reinforced by Figure 3, which highlights the most popular datasets and their release dates. It is particularly remarkable that those published between 2019 and 2020 are still highly influential in benchmarking deepfake detectors, whereas deepfake generation technology has progressed significantly since their release. While their popularity is associated with the need to demonstrate comparative results against other SOTA methods that use the same datasets, as the field evolves, the value placed on a dataset should decrease in comparison to modern and more advanced ones. Indeed, the most relevant benchmarking should reflect the current state of deepfake datasets to show true SOTA performance and prove true generalisation.

Finally, another consideration is understanding the makeup of a given dataset in terms of pre- and post-processing activities and data diversity (age, gender, ethnicity, etc.). Indeed, if this is not adequately considered, their usage is likely to have an adverse effect on a model's ability to generalise. For example, Figure 4 illustrates the ratio between real and fake content, highlighting important variations between datasets. The following section provides an overview of the datasets commonly associated with the training and evaluation of deepfake detection methods. A comprehensive process of uncovering datasets from 2015 onwards can be seen in Figure 3. Datasets typically considered benchmarking datasets are reviewed in Sections VI-A to VI-D. Novel datasets are discussed in Sections VI-E to VI-H. Section VI-I explores the considerations of ethics, privacy and dataset diversity, and the challenges of processing operations applied to datasets.
FIGURE 3. Timeline of datasets by class. The darker coloured boxes indicate the number of papers that reference the use of a particular dataset. The papers associated with this figure are based on the deepfake detection methods from Table 3. The specific details of the private datasets are not known at the time of writing this paper.

FIGURE 4. Timeline showing the ratio of real and fake content for each common dataset. Video datasets (left) and image datasets (right). The papers associated with this figure are based on the deepfake detection methods from Table 3.
A. FACEFORENSICS++
The FaceForensics++ (FF++) [84] dataset was published in 2019 for training and benchmarking deepfake detectors. It is a revised version of the previously released FaceForensics (FF) [85] dataset from 2018. The accompanying paper [84] defines four techniques used to generate the content, based on two graphical techniques (Face2Face and FaceSwap) and two learning-based techniques (DeepFakes and NeuralTextures), containing over 1.8 million images from 1,000 video sources. To simulate real-world data, the content was subjected to post-processing using different compression rates, denoted High-Quality (HQ), Low-Quality (LQ) and the original raw format in its uncompressed state. At the time of its release, the average accuracy was 80.87% (calculated from the average accuracies of raw 95.50%, HQ 80.73% and LQ 66.38%), showing that it was quite challenging for SOTA technology. However, the best method from Table 3 (based on the average of the accuracy scores for the evaluation dataset) has since achieved an accuracy of 94.51%. In addition, this does not take into account the breakdown of raw, HQ and LQ. Furthermore, the generation of the manipulated content was based on techniques that have since been superseded. In addition, the under-representation of gender could result in gender bias if the dataset is used in the training and evaluation of a model. Thus, it can be argued that this dataset presents less of a challenge and may not be suitable as a primary benchmarking dataset to evaluate and detect current deepfakes. Despite this, the dataset provides a valuable reflection of once-considered novel manipulation techniques.
subjects currently represented in deepfake datasets. Based on 403 subjects, the content boasts a collection of fake videos using six synthesis models, together with real video content. Additionally, the setting for the video footage is based on the subject talking directly to the camera, which is assumed to be the setting most at risk of manipulation. Augmentation and other pre-processing techniques were omitted from this dataset to allow researchers to apply their own approaches accordingly. However, post-processing was applied to the synthesised videos to sharpen the quality of the content and remove unwanted noise. As reported in the study, the results from the evaluation demonstrate comparative performance against the other datasets tested. However, the study observes a marked increase in performance when additional datasets such as DFDC [87] and FF++ [84] are combined during model training.
G. GENDER BALANCED DEEPFAKE DATASET
Proposed in 2022, the Gender Balanced Deepfake Dataset (GBDF) [91] is an amalgamation of 10,000 real and fake videos sourced from the FF++ [84], Celeb-DF [80] and DeeperForensics-1.0 [86] datasets. Designed to promote gender fairness, the dataset was manually annotated to remove gender bias in model training and evaluation. In addition, the study removed deepfake videos containing irregular gender swapping. The concluding results, as reported in the study, highlight that improvements in gender bias can be achieved and should be seen not only as an important milestone in gender fairness but also in reducing overall bias across other attributes such as age and race.
H. DEEPFAKE IMAGE-HIGH-QUALITY
The Deepfake Image-High-Quality (DFIM-HQ) dataset [92] became publicly available in 2023 to provide a challenging dataset for benchmarking detection methods. The dataset consists of various scenarios, poses, degradations and illumination conditions, making it more closely aligned with the type of images found in the wild. Furthermore, the dataset is a combination of 70,000 images from the FFHQ [79] dataset and 70,000 StyleGAN2 images [93]. Consequently, the authors recognise the risk of inherent bias transfer from FFHQ, which could result in a trained model becoming discriminatory towards certain attributes (age, race and gender). However, the authors propose adversarial debiasing as a technique to overcome this limitation, which is applied to the model training and not the dataset directly. To assess the effectiveness of adversarial debiasing, the authors utilised metrics from AI Fairness 360 [94], which demonstrated reduced bias. It should be noted that the deepfake techniques used in DFIM-HQ are limited to those generated from StyleGAN2 and do not take into account imagery from newer generations of StyleGAN such as StyleGAN3 [95]. Furthermore, other forms of deepfakes derived from face-swapping techniques are also omitted. In addition, the range of permutations by users who have applied image filters in the FFHQ dataset could result in poor generalisation, as such images do not provide a true reflection of real images.
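To illustrate the type of measurement such fairness toolkits provide, the following is a minimal sketch, in plain Python, of one representative metric in the spirit of AI Fairness 360 [94]: the statistical parity difference between two groups. The function name and toy inputs are hypothetical and are not taken from the DFIM-HQ study.

def parity_difference(preds, groups, group_a, group_b):
    # preds: binary detector outputs (1 = flagged as fake);
    # groups: the protected attribute value of each sample.
    def positive_rate(g):
        members = [p for p, grp in zip(preds, groups) if grp == g]
        return sum(members) / max(1, len(members))
    # A value of 0 means both groups are flagged at the same rate.
    return positive_rate(group_a) - positive_rate(group_b)

gap = parity_difference([1, 0, 1, 1], ["f", "f", "m", "m"], "f", "m")
print(f"statistical parity difference: {gap:+.2f}")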
I. DATASET CONSIDERATIONS
Managing datasets, both in terms of volume and structure, is complex and requires careful planning. A data management process can be used to not only streamline the data but also reduce the risk of data contamination or unexpected changes to the data itself. Each dataset should be analysed to understand how the data is distributed and to avoid the risk of under- or over-representation, which could result in an unbalanced and biased dataset.
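As a minimal sketch of such an analysis, the snippet below counts how each attribute value is represented in a dataset's annotation file; the file name and column names are hypothetical placeholders.

from collections import Counter
import csv

with open("metadata.csv", newline="") as f:
    rows = list(csv.DictReader(f))

for attribute in ("label", "gender", "age_group"):
    counts = Counter(row[attribute] for row in rows)
    total = sum(counts.values())
    print(attribute)
    for value, count in counts.most_common():
        # A strongly skewed ratio signals under- or over-representation.
        print(f"  {value}: {count} ({count / total:.1%})")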
1) ETHICS AND PRIVACY
How deepfake datasets are acquired, stored, and used is important from an ethical and privacy perspective. Unfortunately, the datasets [26] and [80], reviewed in Section VI, are examples of where material has been obtained from the Internet, and the associated papers contain little to no evidence of compliance concerning ethics and subject privacy. Indeed, each subject must not only consent to the usage of their identity but also understand how their identity will be used. Recent news has highlighted a growing trend where high-profile deepfakes are targeting individuals. For example, an audio deepfake of US President Joe Biden was distributed on social media, attempting to disrupt voters ahead of an upcoming election [48], [96]. Similarly, Taylor Swift was subject to a series of deepfake images that were shared online without her consent [48], [97]. However, high-profile individuals are not the only targets. High school principal Eric Eiswert was the subject of a deepfake recording that was used to spread racial slurs. Fortunately, police investigators were able to validate the recording as a deepfake [98]. Thus, a framework was defined to outline best practices for the creation and usage of publicly available datasets, ensuring they are followed in a standardised manner [99].

In addition, the side effects of training DL models with vast volumes of personal identities are becoming a topic of interest, with recent research exposing the risk of data leakage caused by a DL model's ability to memorise its training data [100]. Worryingly, using a Stable Diffusion [41] model to generate a new identity, the authors discovered that the generated individual closely matched one of the identities in the training dataset. Furthermore, it was reported in 2022 that a patient's medical images were found in the LAION-5B [101] dataset without the person's consent [102]. This highlights the serious risk of confidential training data being exposed through adversarial attacks and reverse engineering of trained models.

In addition to the points raised above, the impact of model bias on real-world applications has far-reaching consequences that need to be addressed. For example, model bias is highlighted in several key datasets, including Celeb-DF and DFDC, where trained models are unable to correctly classify images against certain demographic populations. This novel research identifies that by addressing the imbalance of facial attributes in leading datasets, improved generalisation could also be achieved against unseen data. Furthermore, the annotated labels used in the creation of a deepfake dataset are often omitted from the public release, making it difficult to evaluate the fairness of the data and how it is represented across a global demographic [91]. The consequence of eroding trust in deepfake detection due to the inequality of data fairness presents an important challenge moving forward. The severity of misclassification caused by systems put in place to safeguard society has the potential to cause great harm, while allowing threat actors to continue taking advantage.
2) DATASET DIVERSITY
Acquiring adequate data to perform model training while maintaining a fair distribution of key attributes (age, gender, ethnicity, etc.) is a notable challenge. Using augmentation techniques in the spatial domain to increase dataset diversity has been seen as an effective approach to overcoming this limitation. However, the process of generating a fake sample for a given dataset is often dataset-specific in its implementation, which can lead to inconsistencies in the distribution from one dataset to another. To counteract this, the authors suggest that combining within- and cross-domain augmentation in the latent space to learn more intrinsic features can lead to improved generalisation. Alternatively, it has been proposed to composite multiple samples from real and fake sources. The authors claim that this technique can not only overcome forgery-specific bias within the data but also close the gap between dataset distributions.
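For reference, a typical spatial-domain augmentation pipeline of the kind discussed above can be sketched with torchvision; the operations and parameter values are illustrative rather than those used by any of the reviewed studies.

import torchvision.transforms as T

augment = T.Compose([
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),   # random crop and rescale
    T.RandomHorizontalFlip(p=0.5),                # mirror half of the samples
    T.RandomRotation(degrees=10),                 # small in-plane rotation
    T.ColorJitter(brightness=0.2, contrast=0.2),  # mild photometric variation
    T.ToTensor(),
])
# augmented = augment(pil_image)  # applied per sample during training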
The Synthetic-20K dataset attempts to overcome the challenge of demographically under-represented groups of people using techniques that mirror the complex and intricate details associated with the human face. While this is a positive move forward, the value associated with this dataset is somewhat limited by the StyleGAN2 [93] architecture. Opportunely, the ever-improving generative AI technology that has emerged over recent years offers substantial promise for the creation of realistic synthetic imagery. The many Diffusion Models (DMs) [44] already available to the public are a clear indicator of this high level of interest. However, the research community has been slow to respond, with little progress being made to address detection capabilities against DMs. The authors identify that existing detection capabilities are unlikely to generalise to content from DMs due to variations found in the higher-frequency space of the frequency domain. Despite this, some degree of similarity with GANs suggests that models trained with DM data may lead to improved generalisation. Already, the usage of various DMs has contributed to overcoming the limitation of access to domain-specific data with the creation of the DiffusionDB-Face and JourneyDB-Face datasets. The novelty of these datasets includes the comprehensive use of Text-to-Image prompts to generate a rich and diverse collection of imagery. Yet this may lead to additional pre-processing to remove unrealistic imagery. In any case, the presence of model bias through data diversity has the potential to erode trust. As technology continues to push boundaries in quality and realism, how we perceive and differentiate between media in the future will likely be influenced by our cognitive bias.

3) PRE AND POST-PROCESSING
Providing adequate documentation of all pre- and post-processing techniques used in a dataset's creation pipeline is crucial. Understanding the processes used in its creation helps avoid unnecessary operations that could impact model performance. Furthermore, improved transparency can help to promote the reproducibility of results, which can lead to more accurate reporting of model performance. Le et al. [103] address the need for guidance to inform the researcher not only on how the data was prepared (resizing the input vs. cropping) but also on how this may impact the input pipeline of a given deepfake detector. There are many scenarios where pre- and post-processing are used to improve the quality of the data, for example, the implementation of pre-processing operations through data enhancement to remove unwanted information during feature extraction [70]. However, applying these techniques to data that may have already been subject to these operations could have the effect of removing useful features.
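The resizing-versus-cropping trade-off highlighted by Le et al. [103] can be made concrete with a short sketch; the file name is a placeholder and the 224 x 224 target is simply a common detector input size.

from PIL import Image

img = Image.open("sample.png")  # hypothetical input image

# Resizing keeps the whole scene but resamples every pixel, which can
# attenuate the subtle high-frequency artefacts a detector relies on.
resized = img.resize((224, 224), Image.BILINEAR)

# Centre-cropping preserves native pixel statistics but discards any
# context that falls outside the crop window.
w, h = img.size
left, top = (w - 224) // 2, (h - 224) // 2
cropped = img.crop((left, top, left + 224, top + 224))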
VII. OBSERVATIONS AND FUTURE TRENDS
This section provides reflections following the challenge themes previously defined, i.e., dataset, architecture and scalability, explainability, and evaluation.

A. DATASET
Access to rich and diverse datasets is still considered a challenge, as reported in the papers covered in Section V. This is believed to be a contributing factor as to why so many detection methods achieve poor performance when generalising against data from the wild [62]. Indeed, there is a lack of data representing complex and noisy environments, as the background or situational setting of current datasets is generally limited and lacks realistic real-world composition [69]. Pre-processing techniques to enhance image quality before training have been extensively used in this field to encourage improved feature learning and model generalisation. For example, Gaussian blur noise reduction is used to enhance and refine training images to help improve overall generalisation [55]. Alternatively, augmentation techniques that present the data in various compositions, including cropping and rotation, can supplement a dataset when the data lacks diversity. Indeed, two augmentation strategies have been applied to an input image using a contrast loss to improve the feature representation [67]. In addition, new samples can be created through a generator using disentangled artefacts; an Artefacts Cycle Consistency Loss approach is proposed to help reduce the dependency on large datasets and improve overall performance [70]. New and improved architectures are presenting ways to not only optimise a model but also reduce its complexity. A rationale-augmented CNN has also been seen as a possible solution to the challenge of limited data, where a model could not only be used for detection but also for the creation of new facial identities and to extend training and evaluation data [54]. However, there is a risk of the image quality degrading from imperfections in the source training data. Many of the SOTA methods proposed in academic literature focus on model performance through evaluating against large-scale datasets. However, limited coverage is given to the training data and performance of the model itself, which can lead to challenges of experiment repeatability. To overcome this limitation, the TrainFors dataset was curated as a benchmarking dataset for model training and evaluation.
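As an illustration of the Gaussian-blur enhancement step mentioned above, a minimal OpenCV sketch follows; the kernel size and sigma are illustrative values, not the settings used in [55].

import cv2

image = cv2.imread("frame.png")  # hypothetical training frame
# Light Gaussian smoothing to suppress sensor and compression noise
# before the frame enters the training pipeline.
denoised = cv2.GaussianBlur(image, (5, 5), sigmaX=1.0)
cv2.imwrite("frame_denoised.png", denoised)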
Thus, instead of creating additional training data, it has been proposed to mitigate that need by exploiting adversarial learning to reconstruct low-level features to their former state [58]. An alternative approach is to develop lightweight models, which can not only achieve stronger generalisation compared with other SOTA models but can do so with less data [75]. Also promising is the usage of a rich representation of learned features using a GNN architecture, as this can overcome limitations when access to diverse datasets is not available [57]. Finally, the authors in [60] believe their E-Cap Net can generalise to variations in data permutation without requiring additional training data, as the model can represent the input image in its entire state and the relationship between each interconnecting part.
B. ARCHITECTURE AND SCALABILITY
It has been highlighted that, in some instances, too much emphasis is placed on forgery localisation over the direct task of detection [62]. In addition, tailoring the detection method around an architecture intended for image segmentation may contribute to some SOTA methods experiencing poor generalisation and limited robustness. Although improved generalisation can be achieved through the ViT architecture due to its ability to learn high-level features, this poses significant challenges from an increase in model complexity and the requirement for large training datasets [56]. Despite this, the authors conclude that, by applying a fine-tuning strategy while freezing the backbone of the network, significant improvements to the model's efficiency can be achieved.
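A minimal sketch of this freeze-and-fine-tune strategy is given below using a stock torchvision ViT; the choice of backbone and the two-class head are assumptions for illustration and do not reproduce the adapter design of [56].

import torch.nn as nn
import torchvision.models as models

model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)

for param in model.parameters():
    param.requires_grad = False  # freeze the pre-trained backbone

# Replace the classification head with a two-class (real/fake) head;
# only these newly created parameters are updated during fine-tuning.
model.heads = nn.Sequential(nn.Linear(model.hidden_dim, 2))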
Also exploiting a pre-trained network, a lightweight Transformer architecture is designed, where the parameters are frozen while using only 3% of the parameters typically used in SOTA methods [65]. An interesting trend is the usage of architecture-agnostic approaches, where novel detection methods are not constrained to any specific type of image classification model [63], [64]. In other words, this will help to future-proof a model to adapt to new generations of DL architectures. As convergence speed is a crucial element in model deployability, it is proposed to deliver a sparse gradient and compact feature representation by using a Max-Feature-Map (MFM) activation function [60]. Furthermore, improved model efficiency using MFM can also be achieved for other common activation functions, including ReLU and Sigmoid.
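The MFM operation itself is compact enough to sketch directly; the version below follows the classic formulation, in which the channel dimension is split in half and the element-wise maximum is kept, and may differ in detail from the E-Cap Net implementation [60].

import torch
import torch.nn as nn

class MaxFeatureMap(nn.Module):
    # Splits the channels into two halves and keeps the element-wise
    # maximum, acting as a feature selector while halving the width.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = torch.chunk(x, 2, dim=1)
        return torch.max(a, b)

# Usage sketch: a 64-channel convolution followed by MFM yields 32 channels.
layer = nn.Sequential(nn.Conv2d(3, 64, kernel_size=3, padding=1),
                      MaxFeatureMap())
out = layer(torch.randn(1, 3, 224, 224))  # shape: (1, 32, 224, 224)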
Using a Prototype Learning Layer to cluster faces based on similarity matching presents an interesting approach. The model's effectiveness at discriminating real and fake content without having been trained on fake content could lead to improved generalisation, as the model itself is not forgery-domain specific.

Finally, although advancements in generative AI and the increasing range of available Diffusion Models allow the production of realistic imagery that is on par with StyleGAN2 [93] and StyleGAN3 [95], only limited coverage of DMs is observed in the papers reviewed. Despite this, the range of generative AI models capable of creating multi-modal content has surged in recent years. The level of realism and sophistication in generating content using Text-to-Image, Text-to-Audio, etc. requires a shift towards a multi-modal approach to detection. One novel approach utilises contrastive learning to capture features from the audio-visual space using a cross-modal technique. Real video content is fed to a model trainer for downstream learning, where masked embeddings are learnt from alternate modalities. A second model is subsequently trained on the learnt embeddings as a classifier to determine between real and fake videos. Despite improved generalisation, the technique only works with single-person videos. Alternatively, it has been proposed to combine a multi-modal transformer approach to capture spatial and temporal inconsistencies across the audio-visual space. Furthermore, by using dynamic weight fusion, not only is the loss of vital information during the training phase reduced, but common features are also extracted. Evaluation on the DFDC dataset demonstrates significant improvement in performance using this multi-modal approach. It is expected that usage of the underlying architectures from systems such as Dall-E 3 [39], Imagen [40] and Stable Diffusion [41] will open new opportunities in research for the detection of deepfakes.
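A generic sketch of the audio-visual contrastive objective underlying such approaches is given below. It is an InfoNCE-style loss that assumes paired audio and video embeddings for the same clips; it is not the exact formulation of the methods described above.

import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(audio_emb, video_emb, temperature=0.07):
    # Pull together the audio and video embeddings of the same clip
    # (the diagonal of the similarity matrix) and push apart all others.
    audio_emb = F.normalize(audio_emb, dim=1)
    video_emb = F.normalize(video_emb, dim=1)
    logits = audio_emb @ video_emb.t() / temperature  # (N, N) similarities
    targets = torch.arange(audio_emb.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)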
FIGURE 5. Timeline of architectures by class. The diagram highlights some of the main DL architectures and their associated variant architectures over time.
C. EXPLAINABILITY
Earlier research into deepfake detection using ML often focused on solving this challenge as a binary classification problem, for example, labelling the input source as either real or fake. However, the quality and sophistication of deepfakes that are seen in the wild pose a far greater challenge than before. Thus, the ability to understand how a model is reasoning with its input becomes more and more important. This is core to TruFor, where an integrity score is supplemented by an anomaly and confidence map to provide the user with enough evidence to make an informed decision [62]. Similarly, it is proposed to incorporate a graph Transformer relevancy map using the output activation map [72]. There, the relevancy map is designed to provide improved transparency on the subtle details that the model believes are manipulated while ignoring irrelevant information. Based on this concept of output activation maps, Grad-CAM is used as part of model evaluation, as it can provide a useful visual aid to highlight the regions within the source image that the model has deemed important [67].
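A bare-bones sketch of the Grad-CAM computation referenced above is shown next; the two-layer CNN stands in for any detector, and in practice the hooks would be attached to the last convolutional layer of a trained model.

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
model = nn.Sequential(conv, nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 2))

activations, gradients = {}, {}
conv.register_forward_hook(lambda m, i, o: activations.update(a=o))
conv.register_full_backward_hook(lambda m, gi, go: gradients.update(g=go[0]))

image = torch.randn(1, 3, 224, 224)    # placeholder input frame
logits = model(image)
logits[0, logits.argmax()].backward()  # gradient of the predicted class

# Channel weights are the spatially averaged gradients; the heat map is
# the ReLU of the weighted sum of activations, highlighting the regions
# the model deemed important.
weights = gradients["g"].mean(dim=(2, 3), keepdim=True)
cam = torch.relu((weights * activations["a"]).sum(dim=1)).squeeze(0)
cam = cam / (cam.max() + 1e-8)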
The architecture and training datasets need to be factored into the model explainability, as this will influence performance. In [104], the authors establish a list of factors that influence a model's decision by measuring the attributes of gender, race and affective state (smiling vs. non-smiling) from samples obtained from the StyleGAN2 [93] algorithm and the FFHQ [79] dataset. Tested on various architectures, it was observed that the ViT exhibited a stronger bias towards females with lighter skin tones using data from the StyleGAN2 dataset, compared with males with darker skin tones from the FFHQ dataset [104]. As reported by [105], rapid growth in the field of Explainable Artificial Intelligence (XAI) has led to a surge in research dedicated to improving AI transparency. Simply put, addressing concerns over accountability and trustworthiness, while ensuring ethical considerations are accounted for, will become a necessity in deployable AI in the future. Local Interpretable Model-Agnostic Explanations and Shapley Additive Explanations [106] are two examples of existing techniques designed to provide model interpretability. Incorporating XAI into the design of deepfake detectors would not only enhance model transparency but also aid the process of fine-tuning and optimisation.
D. EVALUATION
An important study describes the necessity of using evaluation metrics to not only understand how a model is making predictions but also to observe if under- or over-fitting is occurring based on the training data used [51]. Indeed, there are a number of metrics available for DL that can help identify issues with the model or can be used in further fine-tuning. Although the widely accepted metrics include accuracy, AUC, precision, recall and F1-score, the review in Section V highlights that the majority of the studies only evaluate their models using accuracy and AUC. Furthermore, fewer than twenty percent consider precision, recall and F1-score as part of their evaluation. Still, accuracy provides a valuable measurement for calculating the number of true positive and true negative matches, which is important for determining how well the model is classifying the given dataset [60]. In addition, AUC can be used to determine how well the model is able to discriminate between the classes that it was trained with. Less common metrics may also be considered; they include Probability of Detection [49], True Detection [67] and the Equal Error Rate Metric (EERM) [56], [59], [60], [74]. EERM is of particular interest as it measures the rate at which the model is likely to misclassify [60]. Using a threshold value, the optimised position is when the False Acceptance Rate and False Rejection Rate are equal. Finally, although over seventy percent of the papers apply evaluation to both inter- and intra-dataset comparison, some studies limit their evaluation to inter-dataset only [54]. One should also note that a multi-modal comparison is performed in one study [49].
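As a worked illustration of these metrics, including the equal-error-rate computation, consider the following sketch with purely illustrative toy labels and scores (1 = fake, 0 = real).

import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, roc_curve)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_score = np.array([0.1, 0.4, 0.8, 0.7, 0.3, 0.2, 0.9, 0.6])
y_pred = (y_score >= 0.5).astype(int)  # hard decisions at a 0.5 threshold

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))

# Equal Error Rate: sweep thresholds and find where the false acceptance
# rate (FPR) and the false rejection rate (1 - TPR) are closest to equal.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
eer_index = np.argmin(np.abs(fpr - (1 - tpr)))
print("EER      :", (fpr[eer_index] + (1 - tpr[eer_index])) / 2)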
VIII. CONCLUSION
This paper presents a comprehensive review of literature relating to the field of deepfake detection. The aim is to provide the reader with the latest research on the architectures, detection methods and datasets currently used in the field while recognising and analysing their strengths and weaknesses. To conclude, Table 4 presents an overview of the original challenge themes defined in Table 1.

APPENDIX
See Figure 5.
REFERENCES
[1] I. Sample. (Jan. 13, 2020). What Are Deepfakes and How Can You Spot Them. The Guardian. Accessed: Sep. 22, 2022. [Online]. Available: https://www.theguardian.com/technology/2020/jan/13/what-are-deepfakes-and-how-can-you-spot-them
[2] Facelab. Accessed: Oct. 18, 2022. [Online]. Available: https://facelab.mobi/
[3] FaceApp: Face Editor. Accessed: Oct. 18, 2022. [Online]. Available: https://faceapp.com/
[4] M. Boháček and H. Farid, ‘‘Protecting president Zelenskyy against deep fakes,’’ 2022, arXiv:2206.12043.
[5] Publications—Dimensions. Accessed: Nov. 10, 2023. [Online]. Available: https://app.dimensions.ai/discover/publication
[6] J. F. O’Brien and H. Farid, ‘‘Exposing photo manipulation with inconsistent reflections,’’ ACM Trans. Graph., vol. 31, no. 1, pp. 1–11, Feb. 2012, doi: 10.1145/2077341.2077345.
[7] M. K. Johnson and H. Farid, ‘‘Exposing digital forgeries in complex lighting environments,’’ IEEE Trans. Inf. Forensics Security, vol. 2, no. 3, pp. 450–461, Sep. 2007, doi: 10.1109/TIFS.2007.903848.
[8] W. Wu, W. Zhou, W. Zhang, H. Fang, and N. Yu, ‘‘Capturing the lighting inconsistency for Deepfake detection,’’ in Artificial Intelligence and Security (Lecture Notes in Computer Science), X. Sun, X. Zhang, Z. Xia, and E. Bertino, Eds., Cham, Switzerland: Springer, 2022, pp. 637–647, doi: 10.1007/978-3-031-06788-4_52.
[9] C. Zhu, B. Zhang, Q. Yin, C. Yin, and W. Lu, ‘‘Deepfake detection via inter-frame inconsistency recomposition and enhancement,’’ Pattern Recognit., vol. 147, Mar. 2024, Art. no. 110077, doi: 10.1016/j.patcog.2023.110077.
[10] J. H. Bappy, A. K. Roy-Chowdhury, J. Bunk, L. Nataraj, and B. S. Manjunath, ‘‘Exploiting spatial structure for localizing manipulated image regions,’’ in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Oct. 2017, pp. 4980–4989, doi: 10.1109/ICCV.2017.532.
[11] L. Li, J. Bao, T. Zhang, H. Yang, D. Chen, F. Wen, and B. Guo, ‘‘Face X-ray for more general face forgery detection,’’ 2019, arXiv:1912.13458.
[12] D. Cozzolino and L. Verdoliva, ‘‘Noiseprint: A CNN-based camera model fingerprint,’’ 2018, arXiv:1808.08396.
[13] M. M. Taye, ‘‘Understanding of machine learning with deep learning: Architectures, workflow, applications and future directions,’’ Computers, vol. 12, no. 5, p. 91, Apr. 2023, doi: 10.3390/computers12050091.
[14] A. M. Almars, ‘‘Deepfakes detection techniques using deep learning: A survey,’’ J. Comput. Commun., vol. 9, no. 5, pp. 20–35, May 2021, doi: 10.4236/jcc.2021.95003.
[15] S. Tyagi and D. Yadav, ‘‘A detailed analysis of image and video forgery detection techniques,’’ Vis. Comput., vol. 39, no. 3, pp. 813–833, Mar. 2023, doi: 10.1007/s00371-021-02347-4.
[16] K. Patil, S. Kale, J. Dhokey, and A. Gulhane, ‘‘Deepfake detection using biological features: A survey,’’ 2023, arXiv:2301.05819.
[17] M. Masood, M. Nawaz, K. M. Malik, A. Javed, A. Irtaza, and H. Malik, ‘‘Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward,’’ Appl. Intell., vol. 53, no. 4, pp. 3974–4026, Feb. 2023, doi: 10.1007/s10489-022-03766-z.
[18] M. Zanardelli, F. Guerrini, R. Leonardi, and N. Adami, ‘‘Image forgery detection: A survey of recent deep-learning approaches,’’ Multimedia Tools Appl., vol. 82, no. 12, pp. 17521–17566, May 2023, doi: 10.1007/s11042-022-13797-w.
[19] L. Stroebel, M. Llewellyn, T. Hartley, T. S. Ip, and M. Ahmed, ‘‘A systematic literature review on the effectiveness of deepfake detection techniques,’’ J. Cyber Secur. Technol., vol. 7, no. 2, pp. 83–113, Apr. 2023, doi: 10.1080/23742917.2023.2192888.
[20] T. T. Nguyen, Q. V. H. Nguyen, D. T. Nguyen, D. T. Nguyen, T. Huynh-The, S. Nahavandi, T. T. Nguyen, Q.-V. Pham, and C. M. Nguyen, ‘‘Deep learning for deepfakes creation and detection: A survey,’’ Comput. Vis. Image Understand., vol. 223, Oct. 2022, Art. no. 103525, doi: 10.1016/j.cviu.2022.103525.
[21] J. W. Seow, M. K. Lim, R. C. W. Phan, and J. K. Liu, ‘‘A comprehensive overview of deepfake: Generation, detection, datasets, and opportunities,’’ Neurocomputing, vol. 513, pp. 351–371, Nov. 2022, doi: 10.1016/j.neucom.2022.09.135.
[22] D. Dagar and D. K. Vishwakarma, ‘‘A literature review and perspectives in deepfakes: Generation, detection, and applications,’’ Int. J. Multimedia Inf. Retr., vol. 11, no. 3, pp. 219–289, Sep. 2022, doi: 10.1007/s13735-022-00241-w.
[23] F. Juefei-Xu, R. Wang, Y. Huang, Q. Guo, L. Ma, and Y. Liu, ‘‘Countering malicious DeepFakes: Survey, battleground, and horizon,’’ Int. J. Comput. Vis., vol. 130, no. 7, pp. 1678–1734, Jul. 2022, doi: 10.1007/s11263-022-01606-8.
[24] M. S. Rana, M. N. Nobi, B. Murali, and A. H. Sung, ‘‘Deepfake detection: A systematic literature review,’’ IEEE Access, vol. 10, pp. 25494–25513, 2022, doi: 10.1109/ACCESS.2022.3154404.
[25] A. Naitali, M. Ridouani, F. Salahdine, and N. Kaabouch, ‘‘Deepfake attacks: Generation, detection, datasets, challenges, and research directions,’’ Computers, vol. 12, no. 10, p. 216, Oct. 2023, doi: 10.3390/computers12100216.
[26] B. Zi, M. Chang, J. Chen, X. Ma, and Y.-G. Jiang, ‘‘WildDeepfake: A challenging real-world dataset for deepfake detection,’’ 2021, arXiv:2101.01456.
[27] F. Hong, Z. Chen, Y. Lan, L. Pan, and Z. Liu, ‘‘EVA3D: Compositional 3D human generation from 2D image collections,’’ 2022, arXiv:2210.04888.
[28] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, ‘‘Backpropagation applied to handwritten zip code recognition,’’ Neural Comput., vol. 1, no. 4, pp. 541–551, Dec. 1989, doi: 10.1162/neco.1989.1.4.541.
[29] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, ‘‘Gradient-based learning applied to document recognition,’’ Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[30] G. E. Hinton, A. Krizhevsky, and S. D. Wang, ‘‘Transforming auto-encoders,’’ in Artificial Neural Networks and Machine Learning—ICANN 2011 (Lecture Notes in Computer Science), T. Honkela, W. Duch, M. Girolami, and S. Kaski, Eds., Berlin, Germany: Springer, 2011, pp. 44–51, doi: 10.1007/978-3-642-21735-7_6.
[31] M. K. Patrick, A. F. Adekoya, A. A. Mighty, and B. Y. Edward, ‘‘Capsule networks—A survey,’’ J. King Saud Univ. Comput. Inf. Sci., vol. 34, no. 1, pp. 1295–1310, Jan. 2022, doi: 10.1016/j.jksuci.2019.09.014.
[32] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, ‘‘Going deeper with convolutions,’’ 2014, arXiv:1409.4842.
[33] F. Chollet, ‘‘Xception: Deep learning with depthwise separable convolutions,’’ 2016, arXiv:1610.02357.
[34] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, ‘‘Rethinking the inception architecture for computer vision,’’ 2015, arXiv:1512.00567.
[35] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger, ‘‘Densely connected convolutional networks,’’ in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017. Accessed: Aug. 2, 2023. [Online]. Available: https://openaccess.thecvf.com/content_cvpr_2017/html/Huang_Densely_Connected_Convolutional_CVPR_2017_paper.html
[36] A. Vaswani, ‘‘Attention is all you need,’’ 2017, arXiv:1706.03762.
[37] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, ‘‘An image is worth 16x16 words: Transformers for image recognition at scale,’’ 2020, arXiv:2010.11929.
[38] R. Gozalo-Brizuela and E. C. Garrido-Merchán, ‘‘A survey of generative AI applications,’’ 2023, arXiv:2306.02781.
[39] J. Betker et al., ‘‘Improving image generation with better captions,’’ Comput. Sci., vol. 2, no. 3, p. 8, 2023. [Online]. Available: https://cdn.openai.com/papers/dall-e-3.pdf
[40] C. Saharia, W. Chan, S. Saxena, L. Li, J. Whang, E. Denton, S. K. S. Ghasemipour, B. K. Ayan, S. S. Mahdavi, R. G. Lopes, T. Salimans, J. Ho, D. J. Fleet, and M. Norouzi, ‘‘Photorealistic text-to-image diffusion models with deep language understanding,’’ 2022, arXiv:2205.11487.
[41] Stability AI. Stable Diffusion 2.0 Release. Accessed: Jan. 14, 2024. [Online]. Available: https://stability.ai/news/stable-diffusion-v2-release
[42] S. Feuerriegel, J. Hartmann, C. Janiesch, and P. Zschech, ‘‘Generative AI,’’ Bus. Inf. Syst. Eng., vol. 66, no. 1, pp. 111–126, Sep. 2023, doi: 10.1007/s12599-023-00834-7.
[43] I. Goodfellow, ‘‘Generative adversarial nets,’’ in Proc. Adv. Neural Inf. Process. Syst., Red Hook, NY, USA: Curran Associates, 2014, pp. 1–11. Accessed: Nov. 17, 2023. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2014/hash/5ca3e9b122f61f8f06494c97b1afccf3-Abstract.html
[44] L. Yang, Z. Zhang, Y. Song, S. Hong, R. Xu, Y. Zhao, W. Zhang, B. Cui, and M.-H. Yang, ‘‘Diffusion models: A comprehensive survey of methods and applications,’’ 2022, arXiv:2209.00796.
[45] J. Ho, A. Jain, and P. Abbeel, ‘‘Denoising diffusion probabilistic models,’’ 2020, arXiv:2006.11239.
[46] Y. Song and S. Ermon, ‘‘Improved techniques for training score-based generative models,’’ 2020, arXiv:2006.09011.
[47] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, ‘‘Score-based generative modeling through stochastic differential equations,’’ 2020, arXiv:2011.13456.
[48] A. Birrer and N. Just, ‘‘What we know and don’t know about deepfakes: An investigation into the state of the research and regulatory landscape,’’ New Media Soc., May 2024, doi: 10.1177/14614448241253138.
[49] D. Cozzolino, A. Pianese, M. Nießner, and L. Verdoliva, ‘‘Audio-visual person-of-interest DeepFake detection,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2023, pp. 943–952, doi: 10.1109/CVPRW59228.2023.00101.
[50] T. Wang, H. Cheng, K. P. Chow, and L. Nie, ‘‘Deep convolutional pooling transformer for deepfake detection,’’ ACM Trans. Multimedia Comput., Commun., Appl., vol. 19, no. 6, pp. 1–20, May 2023, doi: 10.1145/3588574.
[51] R. Rafique, R. Gantassi, R. Amin, J. Frnda, A. Mustapha, and A. H. Alshehri, ‘‘Deep fake detection and classification using error-level analysis and deep learning,’’ Sci. Rep., vol. 13, no. 1, p. 7422, May 2023, doi: 10.1038/s41598-023-34629-3.
[52] Y.-J. Heo, W.-H. Yeo, and B.-G. Kim, ‘‘DeepFake detection algorithm based on improved vision transformer,’’ Appl. Intell., vol. 53, no. 7, pp. 7512–7527, Apr. 2023, doi: 10.1007/s10489-022-03867-9.
[53] M. Soleimani, A. Nazari, and M. E. Moghaddam, ‘‘Deepfake detection of occluded images using a patch-based approach,’’ Multimedia Syst., vol. 29, no. 5, pp. 2669–2687, Oct. 2023, doi: 10.1007/s00530-023-01140-8.
[54] S. R. A. Ahmed and E. Sonuç, ‘‘Deepfake detection using rationale-augmented convolutional neural network,’’ Appl. Nanoscience, vol. 13, no. 2, pp. 1485–1493, Feb. 2023, doi: 10.1007/s13204-021-02072-3.
[55] T. Lu, Y. Bao, and L. Li, ‘‘Deepfake video detection based on improved CapsNet and temporal–spatial features,’’ Comput., Mater. Continua, vol. 75, no. 1, pp. 715–740, 2023, doi: 10.32604/cmc.2023.034963.
[56] R. Shao, T. Wu, L. Nie, and Z. Liu, ‘‘DeepFake-adapter: Dual-level adapter for DeepFake detection,’’ 2023, arXiv:2306.00863.
[57] F. Khalid, A. Javed, Q.-U. Ain, H. Ilyas, and A. Irtaza, ‘‘DFGNN: An interpretable and generalized graph neural network for deepfakes detection,’’ Expert Syst. Appl., vol. 222, Jul. 2023, Art. no. 119843, doi: 10.1016/j.eswa.2023.119843.
[58] J. Ke and L. Wang, ‘‘DF-UDetector: An effective method towards robust deepfake detection via feature restoration,’’ Neural Netw., vol. 160, pp. 216–226, Mar. 2023, doi: 10.1016/j.neunet.2023.01.001.
[59] Y. Wang, K. Yu, C. Chen, X. Hu, and S. Peng, ‘‘Dynamic graph learning with content-guided spatial-frequency relation reasoning for deepfake detection,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 7278–7287. [Online]. Available: https://openaccess.thecvf.com/content/CVPR2023/html/Wang_Dynamic_Graph_Learning_With_Content-Guided_Spatial-Frequency_Relation_Reasoning_for_Deepfake_CVPR_2023_paper.html
[60] H. Ilyas, A. Javed, K. M. Malik, and A. Irtaza, ‘‘E-cap net: An efficient-capsule network for shallow and deepfakes forgery detection,’’ Multimedia Syst., vol. 29, no. 4, pp. 2165–2180, Aug. 2023, doi: 10.1007/s00530-023-01092-z.
[61] S. Usmani, S. Kumar, and D. Sadhya, ‘‘Efficient deepfake detection using shallow vision transformer,’’ Multimedia Tools Appl., vol. 83, no. 4, pp. 12339–12362, Jun. 2023, doi: 10.1007/s11042-023-15910-z.
[62] F. Guillaro, D. Cozzolino, A. Sud, N. Dufour, and L. Verdoliva, ‘‘TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 20606–20615, doi: 10.1109/CVPR52729.2023.01974.
[63] Y. Xu, J. Liang, G. Jia, Z. Yang, Y. Zhang, and R. He, ‘‘TALL: Thumbnail layout for deepfake video detection,’’ in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2023, pp. 22658–22668. [Online]. Available: https://openaccess.thecvf.com/content/ICCV2023/html/Xu_TALL_Thumbnail_Layout_for_Deepfake_Video_Detection_ICCV_2023_paper.html
[64] J. Wang, X. Du, Y. Cheng, Y. Sun, and J. Tang, ‘‘Si-Net: Spatial interaction network for deepfake detection,’’ Multimedia Syst., vol. 29, no. 5, pp. 3139–3150, Jul. 2023, doi: 10.1007/s00530-023-01114-w.
[65] M. A. Raza, K. M. Malik, and I. Ul Haq, ‘‘HolisticDFD: Infusing spatiotemporal transformer embeddings for deepfake detection,’’ Inf. Sci., vol. 645, Oct. 2023, Art. no. 119352, doi: 10.1016/j.ins.2023.119352.
[66] C. Zhao, C. Wang, G. Hu, H. Chen, C. Liu, and J. Tang, ‘‘ISTVT: Interpretable spatial-temporal video transformer for deepfake detection,’’ IEEE Trans. Inf. Forensics Security, vol. 18, pp. 1335–1348, 2023, doi: 10.1109/TIFS.2023.3239223.
[67] C. Tian, Z. Luo, G. Shi, and S. Li, ‘‘Frequency-aware attentional feature fusion for deepfake detection,’’ in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Jun. 2023, pp. 1–5, doi: 10.1109/ICASSP49357.2023.10094654.
[68] W. Yang, X. Zhou, Z. Chen, B. Guo, Z. Ba, Z. Xia, X. Cao, and K. Ren, ‘‘AVoiD-DF: Audio-visual joint learning for detecting deepfake,’’ IEEE Trans. Inf. Forensics Security, vol. 18, pp. 2015–2029, 2023, doi: 10.1109/TIFS.2023.3262148.
[69] B. Liang, Z. Wang, B. Huang, Q. Zou, Q. Wang, and J. Liang, ‘‘Depth map guided triplet network for deepfake face detection,’’ Neural Netw., vol. 159, pp. 34–42, Feb. 2023, doi: 10.1016/j.neunet.2022.11.031.
[70] X. Li, R. Ni, P. Yang, Z. Fu, and Y. Zhao, ‘‘Artifacts-disentangled adversarial learning for deepfake detection,’’ IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 4, pp. 1658–1670, Apr. 2023, doi: 10.1109/TCSVT.2022.3217950.
[71] T. Wang and K. P. Chow, ‘‘Noise based deepfake detection via multi-head relative-interaction,’’ in Proc. AAAI Conf. Artif. Intell., vol. 37, no. 12, Jun. 2023, pp. 14548–14556, doi: 10.1609/aaai.v37i12.26701.
[72] A. Khormali and J.-S. Yuan, ‘‘Self-supervised graph transformer for deepfake detection,’’ 2023, arXiv:2307.15019.
[73] G. Yang, A. Wei, X. Fang, and J. Zhang, ‘‘FDS_2D: Rethinking magnitude-phase features for DeepFake detection,’’ Multimedia Syst., vol. 29, no. 4, pp. 2399–2413, Jun. 2023, doi: 10.1007/s00530-023-01118-6.
[74] B. Huang, Z. Wang, J. Yang, J. Ai, Q. Zou, Q. Wang, and D. Ye, ‘‘Implicit identity driven deepfake face swapping detection,’’ presented at the IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023. [Online]. Available: https://openaccess.thecvf.com/content/CVPR2023/html/Huang_Implicit_Identity_Driven_Deepfake_Face_Swapping_Detection_CVPR_2023_paper.html
[75] S. Mundra, G. J. Aniano Porcile, S. Marvaniya, J. R. Verbus, and H. Farid, ‘‘Exposing GAN-generated profile photos from compact embeddings,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. Workshops (CVPRW), Jun. 2023, pp. 884–892, doi: 10.1109/CVPRW59228.2023.00095.
[76] G. Pang, B. Zhang, Z. Teng, Z. Qi, and J. Fan, ‘‘MRE-Net: Multi-rate excitation network for deepfake video detection,’’ IEEE Trans. Circuits Syst. Video Technol., vol. 33, no. 8, pp. 3663–3676, Aug. 2023, doi: 10.1109/TCSVT.2023.3239607.
[77] S. Dong, J. Wang, R. Ji, J. Liang, H. Fan, and Z. Ge, ‘‘Implicit identity leakage: The stumbling block to improving deepfake detection generalization,’’ presented at the IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023. [Online]. Available: https://openaccess.thecvf.com/content/CVPR2023/html/Dong_Implicit_Identity_Leakage_The_Stumbling_Block_to_Improving_Deepfake_Detection_CVPR_2023_paper.html
[78] Z. Liu, X. Qi, and P. Torr, ‘‘Global texture enhancement for fake face detection in the wild,’’ 2020, arXiv:2002.00133.
[79] T. Karras, S. Laine, and T. Aila, ‘‘A style-based generator architecture for generative adversarial networks,’’ 2018, arXiv:1812.04948.
[80] Y. Li, X. Yang, P. Sun, H. Qi, and S. Lyu, ‘‘Celeb-DF: A large-scale challenging dataset for DeepFake forensics,’’ 2019, arXiv:1909.12962.
[81] G. B. Huang, M. Ramesh, T. Berg, and E. L. Miller, ‘‘Labeled faces in the wild: A database for studying face recognition in unconstrained environments,’’ Univ. Massachusetts, Amherst, MA, USA, Tech. Rep., Oct. 2007.
[82] M. Edwards and X. Xie, ‘‘Graph based convolutional neural network,’’ 2016, arXiv:1609.08965.
[83] H. Chefer, S. Gur, and L. Wolf, ‘‘Transformer interpretability beyond attention visualization,’’ in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2021, pp. 782–791, doi: 10.1109/CVPR46437.2021.00084.
[84] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, ‘‘FaceForensics++: Learning to detect manipulated facial images,’’ 2019, arXiv:1901.08971.
[85] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, ‘‘FaceForensics: A large-scale video dataset for forgery detection in human faces,’’ 2018, arXiv:1803.09179.
[86] L. Jiang, R. Li, W. Wu, C. Qian, and C. C. Loy, ‘‘DeeperForensics-1.0: A large-scale dataset for real-world face forgery detection,’’ 2020, arXiv:2001.03024.
[87] B. Dolhansky, J. Bitton, B. Pflaum, J. Lu, R. Howes, M. Wang, and C. C. Ferrer, ‘‘The DeepFake detection challenge (DFDC) dataset,’’ 2020, arXiv:2006.07397.
[88] B. Dolhansky, R. Howes, B. Pflaum, N. Baram, and C. C. Ferrer, ‘‘The deepfake detection challenge (DFDC) preview dataset,’’ 2019, arXiv:1910.08854.
[89] M. F. Sohan, M. Solaiman, and M. A. Hasan, ‘‘A survey on deepfake video detection datasets,’’ Indonesian J. Electr. Eng. Comput. Sci., vol. 32, no. 2, p. 1168, Nov. 2023, doi: 10.11591/ijeecs.v32.i2.pp1168-1176.
[90] P. Kwon, J. You, G. Nam, S. Park, and G. Chae, ‘‘KoDF: A large-scale Korean DeepFake detection dataset,’’ in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2021, pp. 10724–10733, doi: 10.1109/ICCV48922.2021.01057.
[91] A. V. Nadimpalli and A. Rattani, ‘‘GBDF: Gender balanced DeepFake dataset towards fair DeepFake detection,’’ 2022, arXiv:2207.10246.
[92] S. Mathews, S. Trivedi, A. House, S. Povolny, and C. Fralick, ‘‘An explainable deepfake detection framework on a novel unconstrained dataset,’’ Complex Intell. Syst., vol. 9, no. 4, pp. 4425–4437, Aug. 2023, doi: 10.1007/s40747-022-00956-7.
[93] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, ‘‘Analyzing and improving the image quality of StyleGAN,’’ 2019, arXiv:1912.04958.
[94] R. K. E. Bellamy, K. Dey, M. Hind, S. C. Hoffman, S. Houde, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilovic, S. Nagar, K. N. Ramamurthy, J. Richards, D. Saha, P. Sattigeri, M. Singh, K. R. Varshney, and Y. Zhang, ‘‘AI fairness 360: An extensible toolkit for detecting and mitigating algorithmic bias,’’ IBM J. Res. Develop., vol. 63, nos. 4–5, pp. 4:1–4:15, Jul./Sep. 2019, doi: 10.1147/JRD.2019.2942287.
[95] T. Karras, M. Aittala, S. Laine, E. Härkönen, J. Hellsten, J. Lehtinen, and T. Aila, ‘‘Alias-free generative adversarial networks,’’ 2021, arXiv:2106.12423.
[96] (Jan. 22, 2024). Fake Biden Robocall Tells Voters to Skip New Hampshire Primary Election. BBC News. Accessed: Oct. 5, 2024. [Online]. Available: https://www.bbc.com/news/world-us-canada-68064247
[97] (Jan. 26, 2024). Taylor Swift Deepfakes Spark Calls in Congress for New Legislation. BBC News. Accessed: Oct. 5, 2024. [Online]. Available: https://www.bbc.com/news/technology-68110476
[98] (Apr. 26, 2024). Baltimore High School Teacher Arrested Over Deepfake Racist Audio of Principal. BBC News. Accessed: Oct. 5, 2024. [Online]. Available: https://www.bbc.com/news/world-us-canada-68907895
[99] K. Peng, A. Mathur, and A. Narayanan, ‘‘Mitigating dataset harms requires stewardship: Lessons from 1000 papers,’’ 2021, arXiv:2108.02922.
[100] N. Carlini, J. Hayes, M. Nasr, M. Jagielski, V. Sehwag, F. Tramèr, B. Balle, D. Ippolito, and E. Wallace, ‘‘Extracting training data from diffusion models,’’ 2023, arXiv:2301.13188.
[101] C. Schuhmann, R. Beaumont, R. Vencu, C. Gordon, R. Wightman, M. Cherti, T. Coombes, A. Katta, C. Mullis, M. Wortsman, P. Schramowski, S. Kundurthy, K. Crowson, L. Schmidt, R. Kaczmarczyk, and J. Jitsev, ‘‘LAION-5B: An open large-scale dataset for training next generation image-text models,’’ 2022, arXiv:2210.08402.
[102] M. Growcoot. Shocked Artist Finds Private Medical Photos in AI Training Data Set. PetaPixel. Accessed: Nov. 20, 2023. [Online]. Available: https://petapixel.com/2022/09/26/shocked-artist-finds-private-medical-photos-in-ai-training-data-set/
[103] B. Le, S. Tariq, A. Abuadbba, K. Moore, and S. Woo, ‘‘Why do deepfake detectors fail?’’ 2023, arXiv:2302.13156.
[104] M. P. Gangan, A. Kadan, and L. V L, ‘‘Exploring fairness in pre-trained visual transformer based natural and GAN generated image detection systems and understanding the impact of image compression in fairness,’’ 2023, arXiv:2310.12076.
[105] G. P. Reddy and Y. V. P. Kumar, ‘‘Explainable AI (XAI): Explained,’’ in Proc. IEEE Open Conf. Electr., Electron. Inf. Sci. (eStream), Apr. 2023, pp. 1–6, doi: 10.1109/eStream59056.2023.10134984.
[106] S. Lundberg and S.-I. Lee, ‘‘A unified approach to interpreting model predictions,’’ 2017, arXiv:1705.07874.

PETER EDWARDS received the B.Sc. degree (Hons.) in computer science from Brunel University, Uxbridge, U.K., in 2005, and the M.Sc. degree in advanced computing and digital forensics from Edinburgh Napier University, Edinburgh, U.K., in 2020. He is currently pursuing the Ph.D. degree with the School of Computer Science and Mathematics, Kingston University, Kingston, U.K. His interests include deep learning, media forensics, and data analytics.
JEAN-CHRISTOPHE NEBEL (Senior Member, IEEE) received the M.Sc.(Eng.) degree in electronics and signal processing from the Institute of Chemistry and Industrial Physics, Lyon, France, in 1992, and the Ph.D. degree in parallel programming from the University of St Etienne, France, in 1997. From 1997 to 2004, he was a Postdoctoral Research Associate/Fellow with the Computing Science Department, The University of Glasgow, U.K. Since 2004, he has been a permanent Academic with Kingston University London. Currently, he is a Full Professor of computer science with the Faculty of Engineering, Computing and the Environment, where he leads research in AI and machine learning applied to a variety of topics, including computer vision, bioinformatics, cybersecurity, renewable energy, and ecology. Prof. Nebel was awarded the A. H. Reeve Premium by the Council of the Institute of Electrical and Electronics Engineers for a journal article describing his and the co-authors' pioneering work in developing a 3D Dynamic Whole Body Measurement System, in 2004.

XING LIANG (Member, IEEE) received the Ph.D. degree in mobile satellite communications from the University of Bradford, in 2007. She is currently a Senior Lecturer with the School of Computer Science and Mathematics, Kingston University London. Over the years, she has actively contributed to multiple EU and U.K. research projects in AI and telecommunications, including H2020 ENSURESEC, H2020 EUNOMIA, EPSRC CONCORD, IST FIFTH, IST SatNEx, and the Dunhill Medical Trust funded ATDA-BSL, as well as several industry-related research projects. Her current research interests include computer vision, deep learning, generative AI, and quantum machine learning, with applications in cyber-physical system security, social media security, healthcare, the IoT, and business intelligence.