
Evo

Generative adversarial networks (GANs), autoencoders, and diffusion models are key technologies behind deepfake, which creates hyperrealistic synthetic media. The term 'deepfake' originated from a Redditor and has raised concerns regarding its misuse in misinformation and criminal activities. While deepfake technology shows great potential, there is a risk of overhype leading to disillusionment, emphasizing the need for realistic expectations in its development and application.

Uploaded by

peirissuggreewa

***Generative adversarial networks (GANs), autoencoders, and diffusion models can be considered
three of the most ingenious inventions of the ongoing artificial intelligence (AI) spring.
Without diving into technical details, one aspect of these AI architectures should be
emphasized: they can create synthetic but hyperrealistic audio, photos, and video. This
hyperrealistic synthetic audiovisual media is, in a wide sense, referred to as deepfake. An
etymological look at the word also helps in understanding it. Deepfake is a portmanteau of the
terms deep and fake: deep refers to the deep learning-based creation and editing of deepfake,
and fake denotes its synthetic nature. The term was coined after a Redditor (also named
Deepfakes) who created a subreddit for AI-generated nonconsensual pornography and publicly
shared its open source code.***

The origins of deepfake and the technology behind it can be traced back to the late 1990s. Video
Rewrite [18] (1997) was the first automated system that could create lip-synced videos, an early
example of deepfake. Another technological breakthrough was active appearance models [19]
(2001), a computer vision algorithm that matches statistical models of appearance to images.
Furthermore, the advancement of deep learning, increasing computing power, and easy access to
large volumes of data prepared the ground for the technology trigger of deepfake.

Ian Goodfellow invented GANs [20], a core technology behind deepfake, in 2014. GANs consist of
two neural networks, a generator and a discriminator, that compete against each other in a
zero-sum game. The generator manufactures fake data, and the discriminator tries to distinguish
authentic samples from fake ones. By analogy, the generator plays the counterfeiter who prints
counterfeit money, and the discriminator plays the detective who evaluates the banknotes’
authenticity. The two networks sharpen each other through the feedback loop between them [20].
GANs are not the only route, however: “Not all synthetic images are born equally and by same
manufacture methods.” [21]
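
The zero-sum game between generator and discriminator can be made concrete with a small sketch. It assumes the standard GAN value function, V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))], which the discriminator tries to maximize and the generator tries to minimize. The function names and the probability values below are hypothetical and chosen only for illustration; no neural network is trained here.

```python
import math


def gan_value(d_real, d_fake):
    """Value V(D, G) of the GAN game for one batch (assumed standard form).

    d_real: discriminator outputs D(x) on authentic samples, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def payoffs(d_real, d_fake):
    """Zero-sum payoffs: the discriminator gains exactly what the generator loses."""
    v = gan_value(d_real, d_fake)
    return {"discriminator": v, "generator": -v}


# A sharp "detective": confident on real banknotes, rejects the counterfeits.
sharp = payoffs(d_real=[0.9, 0.95], d_fake=[0.1, 0.05])

# A fooled "detective": it cannot tell the two apart and outputs 0.5 everywhere.
fooled = payoffs(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])

print(sharp)
print(fooled)
```

When the discriminator is reduced to guessing (0.5 on every sample), its payoff falls to -log 4, which is the value of the game once the generator's output is indistinguishable from real data. In actual training, both sets of probabilities come from networks that are updated in turn.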
Another neural network architecture used to generate deepfake is the variational autoencoder
(VAE), a modernized version of the traditional autoencoder. A generic VAE is well suited to
deepfake manufacture because it “generates latent vectors that follow a Gaussian unit
distribution. By doing this, it allows us to generate new images by sampling a latent vector
from the Gaussian distribution, which could then be passed to the decoder network.” [22]

The third class of models behind deepfake is diffusion models, a family of probabilistic
generative models built on a forward diffusion stage (which progressively injects noise and
destroys the data) and a reverse diffusion stage (which reverses the process to generate
samples). A famous example is OpenAI’s text-to-image model, DALL·E 3. Diffusion models are
considered the new driving force of generative models and show great potential in many tasks,
e.g., image generation, image super-resolution, image inpainting, and image-to-image
translation [23, 24].

The three aforementioned models have advantages and drawbacks relative to each other
[21, 22, 23], and each has a wide range of variations [23]. After this concise account of the
technology behind deepfake, we can turn to the Hype Cycle analysis. The first wave of venture
capital investment is a stage indicator of the technology trigger, as shown in Figure 1. This
investment in deepfake arrived three years after the invention of GANs [25].
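
Returning to the forward diffusion stage described above, the following minimal sketch shows how noise is injected step by step until the data is destroyed. It assumes a DDPM-style formulation with a linear noise schedule, where a sample at step t can be written as x_t = sqrt(a_t) * x_0 + sqrt(1 - a_t) * noise. The schedule values, function names, and the four-number "image" are illustrative assumptions, not details taken from the article. The reverse stage is omitted because it requires a trained denoising network.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances rising linearly (an assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def signal_fractions(betas):
    """Cumulative product of (1 - beta): share of the original signal left after each step."""
    fractions, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        fractions.append(running)
    return fractions


def noised_sample(x0, t, fractions, rng):
    """Jump straight to step t: x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * noise."""
    a_t = fractions[t]
    return [
        math.sqrt(a_t) * value + math.sqrt(1.0 - a_t) * rng.gauss(0.0, 1.0)
        for value in x0
    ]


rng = random.Random(0)                         # fixed seed for repeatable runs
fractions = signal_fractions(linear_betas(1000))
x0 = [1.0, -1.0, 0.5, 0.0]                     # a toy four-value "image"

early = noised_sample(x0, 9, fractions, rng)   # after 10 steps: still close to x0
late = noised_sample(x0, 999, fractions, rng)  # after 1000 steps: almost pure noise

print(fractions[9], fractions[999])
print(early)
print(late)
```

The printed signal fractions show why the forward stage "destroys" data: almost all of the original signal survives the first few steps, while practically none remains at the end of the schedule. Generation runs the process in the opposite direction, starting from noise.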

Deepfake emerged in the age of post-truth politics, which fueled concerns over the misuse of
deepfake for dis- and misinformation. Face presentation attacks, deepfake-powered child
pornography and online grooming, fabricated evidence, financial fraud, and identity theft are
other perils of deepfake. On top of the malicious uses and growing concerns, the media covered
deepfake with eye-catching headlines[7].

Figure 1. Mainstream media’s (overhyped) coverage of deepfake.

These factors resulted in a great deal of (negative) overhype around the capabilities and societal
impacts of deepfake. Before proceeding further, let’s take a step back and discuss the overhype
around nascent technologies and its potential impacts. Past AI springs created short-term
overhype that brought sizeable investment and immense enthusiasm to the AI domain.
Subsequently, the gap between short-term overhype and reality (the challenges of developing and
adopting new technologies) brought disillusionment and skepticism about the capabilities and
future of AI. A combination of many factors paved the way for AI winters; a full account is
beyond the scope of this study, but media overhype was one of them. As AI history reminds us, if
deepfake technology fails to meet the inflated expectations attached to it, the likelihood of a
“deepfake winter” increases. Consequently, setting realistic expectations for deepfake (without
forgetting its great long-term potential) is of utmost importance for anyone interested in
synthetic media, especially in the technology and innovation management domains. This raises
the question of how to set feasible expectations for an exciting nascent technology.

Deepfake manufacturing models have advanced significantly in recent years thanks to numerous
GAN, autoencoder, and diffusion model variants. A retrospective look at deepfake supports this
argument. Figure 2 visualizes GANs’ progress in face generation from 2014 to 2022. Deepfake
videos of Tom Cruise [62] and the band AllttA’s duet with a deepfake Jay-Z [63] demonstrate that
hyperrealistic deepfake production is now within the realm of possibility. Finally, OpenAI’s
text-to-video model Sora hints at the text-to-video capabilities on the horizon.

Figure 2. Face synthesis deepfakes from 2014 to 2022.
