Synopsis Report
ABSTRACT
Growing computational power has made deep learning algorithms so capable that creating
convincing synthetic videos of people, popularly called deepfakes, has become very simple.
It is easy to envision scenarios in which realistic face-swapped deepfakes are used to create
political distress, stage fake terrorism events, produce revenge porn, or blackmail people.
In this work, we describe a new deep learning-based method that can effectively distinguish
AI-generated fake videos from real videos. Our method automatically detects both replacement
and reenactment deepfakes. In effect, we use Artificial Intelligence (AI) to fight Artificial
Intelligence (AI). Our system uses a ResNeXt convolutional neural network to extract
frame-level features, and these features are then used to train a Long Short-Term Memory
(LSTM) based Recurrent Neural Network (RNN) that classifies whether a video has been
manipulated, i.e. whether it is a deepfake or a real video. To emulate real-world scenarios
and make the model perform better on real-world data, we evaluate our method on a large,
balanced dataset prepared by mixing several available datasets: FaceForensics, the Deepfake
Detection Challenge, and Celeb-DF. We also show that our system can achieve competitive
results using a very simple and robust approach.
KEYWORDS
ResNeXt convolutional neural network
Recurrent Neural Network (RNN)
Long Short-Term Memory (LSTM)
OpenCV
PyTorch
Generative Adversarial Network (GAN)
Deep Learning
Face Recognition
INTRODUCTION
In a world of ever-growing social media platforms, deepfakes are considered a major threat
posed by AI. It is easy to envision scenarios in which realistic face-swapped deepfakes are
used to create political distress, stage fake terrorism events, produce revenge porn, or
blackmail people. Examples include the nude deepfake videos of Brad Pitt and Angelina Jolie.
It is therefore very important to be able to tell a deepfake from a pristine video. We are
using AI to fight AI. Deepfakes are created using tools like FaceApp and Face Swap, which use
pre-trained neural networks such as GANs or autoencoders. Our method uses an LSTM-based
artificial neural network for sequential temporal analysis of the video frames and a
pre-trained ResNeXt CNN to extract frame-level features. The features extracted by the
ResNeXt convolutional neural network are used to train the LSTM-based recurrent neural
network, which classifies the video as deepfake or real.
To emulate real-world scenarios and make the model perform better on real-world data, we
trained our method on a large, balanced combination of available datasets: FaceForensics,
the Deepfake Detection Challenge, and Celeb-DF. To make the system ready for customers to
use, we have also developed a front-end application where the user uploads a video. The
video is processed by the model, and the output is rendered back to the user with the
classification of the video as deepfake or real and the confidence of the model.
GHRCEM-Wagholi, Pune, Department of Computer Engineering 2019-2020
Deepfake Video Detection
MOTIVATION OF THE PROJECT
The increasing sophistication of mobile camera technology and the ever-growing reach of
social media and media-sharing portals have made the creation and propagation of digital
videos more convenient than ever before. Deep learning has given rise to technologies that
would have been thought impossible only a handful of years ago. Modern generative models are
one example, capable of synthesizing hyper-realistic images, speech, music, and even video.
These models have found use in a wide variety of applications, including making the world
more accessible through text-to-speech and helping generate training data for medical
imaging. Like any transformative technology, this has created new challenges. So-called
"deepfakes" are produced by deep generative models that can manipulate video and audio
clips. Since their first appearance in late 2017, many open-source deepfake generation
methods and tools have emerged, leading to a growing number of synthesized media clips.
While many are likely intended to be humorous, others could be harmful to individuals and
society. The number of fake videos and their degree of realism have been increasing, driven
by the availability of editing tools that reduce the need for domain expertise. The spread
of deepfakes over social media platforms has become very common, leading to spamming and
the spread of wrong information. Just imagine a deepfake of our prime minister declaring
war against neighboring countries, or a deepfake of a reputed celebrity abusing fans. Such
deepfakes would be terrible and would threaten and mislead ordinary people. To overcome
such situations, deepfake detection is very important. We therefore describe a new deep
learning-based method that can effectively distinguish AI-generated fake videos (deepfake
videos) from real videos.
It’s incredibly important to develop technology that can spot fakes, so that the deep
fakes can be identified and prevented from spreading over the internet.
LITERATURE SURVEY
Face Warping Artifacts detects artifacts by comparing the generated face areas and their
surrounding regions with a dedicated Convolutional Neural Network model. The work considers
two kinds of face artifacts. The method is based on the observation that current deepfake
algorithms can only generate images of limited resolution, which must then be further
transformed to match the faces being replaced in the source video. The method does not
consider temporal analysis of the frames.
Detection by Eye Blinking describes a new method that uses eye blinking as the crucial
parameter for classifying videos as deepfake or pristine. A Long-term Recurrent
Convolutional Network (LRCN) was used for temporal analysis of the cropped frames of eye
blinking.
Deepfake generation algorithms have since become so powerful that a lack of eye blinking
cannot be the only clue for detection. Other parameters must also be considered, such as
teeth enhancement, wrinkles on faces, and wrong placement of eyebrows.
Capsule Networks to Detect Forged Images and Videos uses a capsule network to detect forged
and manipulated images and videos in different scenarios, such as replay attack detection
and computer-generated video detection. The method uses random noise in the training phase,
which is not a good option. The model performed well on their dataset, but it may fail on
real-world data because of the noise in training. Our method is proposed to be trained on
noiseless, real-world datasets.
Recurrent Neural Network (RNN) for Deepfake Detection processes the frames sequentially
with an RNN, together with an ImageNet pre-trained model. The work used the HOHA dataset,
which consists of just 600 videos. Because the dataset contains a small number of videos of
the same type, the model may not perform very well on real-world data. We will be training
our model on a large amount of real-world data.
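One way to avoid the small, single-source dataset problem noted above is to draw an equal
number of real and fake videos from each of several sources and shuffle them together. The
sketch below illustrates that idea with the standard library only. The function name
build_balanced_dataset, the (path, label) tuple layout, and the per-source sample count are
hypothetical and not taken from the project.

```python
import random


def build_balanced_dataset(sources, per_class_per_source, seed=0):
    """Mix several datasets into one with equal numbers of real and fake videos.

    sources maps a dataset name to a list of (video_path, label) tuples,
    where label is "real" or "fake" (an assumed layout).
    """
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    mixed = []
    for name, videos in sources.items():
        for label in ("real", "fake"):
            pool = [video for video in videos if video[1] == label]
            if len(pool) < per_class_per_source:
                raise ValueError(
                    f"{name} has only {len(pool)} {label} videos, "
                    f"need {per_class_per_source}"
                )
            # Sample without replacement so no video appears twice.
            mixed.extend(rng.sample(pool, per_class_per_source))
    rng.shuffle(mixed)  # interleave the sources and classes
    return mixed
```

Taking the same number of videos per class from every source keeps the final mix balanced
even when the individual datasets are skewed towards fake or real videos.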
OBJECTIVE
Convincing manipulations of digital images and videos have been demonstrated for several
decades through the use of visual effects, but recent advances in deep learning have led to
a dramatic increase in the realism of fake content and the ease with which it can be
created. This so-called AI-synthesized media is popularly referred to as deepfakes.
Creating deepfakes with AI tools is a simple task, but detecting them is a major challenge.
There are already many examples in which deepfakes have been used as a powerful way to
create political tension, stage fake terrorism events, produce revenge porn, or blackmail
people. It is therefore very important to detect deepfakes and stop them from percolating
through social media platforms. We have taken a step forward in detecting deepfakes using
an LSTM-based artificial neural network.
Many tools are available for creating deepfakes, but hardly any are available for detecting
them. Our approach to detecting deepfakes will be a great contribution to preventing their
percolation over the World Wide Web. We will provide a web-based platform where the user
can upload a video and have it classified as fake or real. The project can be scaled up
from a web-based platform to a browser plugin for automatic deepfake detection. Large
applications such as WhatsApp and Facebook could also integrate this project for easy
pre-detection of deepfakes before a video is sent to another user. The software is
described in terms of size of input, bounds on input, input validation, input dependency,
I/O state diagram, and major inputs and outputs, without regard to implementation detail.
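As a rough illustration of the input bounds and input validation mentioned above, an
uploaded video could be checked before it reaches the model. The allowed extensions, the
100 MB size limit, and the function name validate_upload below are assumptions made for
this sketch; the source does not specify them.

```python
from pathlib import Path

# Assumed limits for illustration only; the source does not state them.
ALLOWED_EXTENSIONS = {".mp4", ".avi", ".mov"}
MAX_SIZE_BYTES = 100 * 1024 * 1024  # 100 MB


def validate_upload(filename, size_bytes):
    """Return a list of problems with an upload; an empty list means it is acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or 'none'}")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_SIZE_BYTES:
        problems.append("file exceeds the size limit")
    return problems
```

Returning a list of problems rather than a single boolean lets the front end show the user
every reason an upload was rejected at once.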
MAJOR CONSTRAINTS
User: The user of the application will be able to detect whether the uploaded video is fake
or real, along with the model's confidence in the prediction.
Prediction: The user will be able to see the video playing with the output shown on the
face, along with the confidence of the model.
Easy and user-friendly user interface: Users seem to prefer a simplified process of
deepfake video detection. Hence, a straightforward and user-friendly interface is
implemented. The UI contains a browse tab to select the video for processing. It reduces
complications and at the same time enriches the user experience.
Cross-platform compatibility: With an ever-increasing target market, accessibility should
be the main priority. Cross-platform compatibility increases the reach across different
platforms. Being a server-side application, it will run on any device that has a web
browser installed.
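The confidence shown to the user alongside the prediction could be derived from the model's
raw class scores with a softmax, as in the sketch below. The function name
classify_with_confidence and the label order ("real", "fake") are assumptions for
illustration; the source does not say how the confidence is computed.

```python
import math


def classify_with_confidence(logits, labels=("real", "fake")):
    """Turn raw class scores into a predicted label and a confidence percentage."""
    # Subtract the maximum score before exponentiating for numerical stability.
    highest = max(logits)
    exps = [math.exp(score - highest) for score in logits]
    total = sum(exps)
    probabilities = [value / total for value in exps]
    # Pick the class with the highest probability.
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return labels[best], probabilities[best] * 100.0
```

For example, scores of -1.0 for "real" and 3.0 for "fake" give the label "fake" with a
confidence of roughly 98 percent.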
PARAMETERS IDENTIFIED
a. Blinking of eyes
b. Teeth enhancement
c. Larger distance between the eyes
d. Moustaches
e. Double edges on eyes, ears, nose
f. Iris segmentation
g. Wrinkles on face
h. Inconsistent head pose
i. Face angle
j. Skin tone
k. Facial Expressions
l. Lighting
m. Different Pose
n. Double chins
Hardware required:
Computer system (laptop)
Intel Core i5 processor (2.8 GHz) with 8 GB of RAM
1 TB hard disk
NVIDIA graphics card
Software required:
FUTURE SCOPE
There is always scope for enhancement in any developed system, especially when the project
is built using the latest technology and has good future scope.
The web-based platform can be extended to a browser plugin for ease of access.
Currently the algorithm detects only face deepfakes, but it can be enhanced to detect
full-body deepfakes.
REFERENCES