Locally GAN-generated Face Detection Based On An Improved Xception
Information Sciences
Article history: Received 8 September 2020; Received in revised form 7 January 2021; Accepted 2 May 2021; Available online 4 May 2021.

Keywords: Generated face; Inception block; Xception; Feature pyramid network.

Abstract: It has become a research hotspot to detect whether a face is natural or GAN-generated. However, all the existing works focus on wholly GAN-generated faces, so an improved Xception model is proposed for locally GAN-generated face detection. To the best of our knowledge, our work is the first to address this issue. The improvements over Xception are as follows: (1) four residual blocks are removed to avoid the overfitting problem as much as possible; (2) an Inception block with dilated convolutions replaces the common convolution layers in the pre-processing module of the Xception to obtain multi-scale features; (3) a feature pyramid network is utilized to obtain multi-level features for the final decision. The first locally GAN-generated face (LGGF) dataset is constructed by the pluralistic image completion method on the basis of the FFHQ dataset; it contains a total of 952,000 images with generated regions of different shapes and sizes. Experimental results demonstrate the superiority of the proposed model, which outperforms some existing models, especially for faces having small generated regions.

© 2021 Elsevier Inc. All rights reserved.
1. Introduction
In daily life, unique biological information, such as the face, fingerprint, iris, and voiceprint, is often used for identity authentication. Compared with other biological information, the face has more advantages, such as lower acquisition cost, richer information, and no need for physical contact. Therefore, face images have been widely applied in identification and authentication services. New applications, such as face payment, face retrieval, and face check-in, have appeared one after another and entered daily life in an all-round way. The "face brushing era" has arrived.
However, the emergence of Generative Adversarial Network (GAN)-based face generation technology makes "face brushing" unsafe. In the past five years, with the development of deep learning, various GAN models have been proposed, such as BEGAN [1], PGGAN [2], and StyleGAN [3]. Consequently, many tools have been published, such as
* Corresponding author.
E-mail address: [email protected] (W. Ding).
https://doi.org/10.1016/j.ins.2021.05.006
Face2Face [4], FaceSwap [5], DeepFake [6], and so on. Even a non-professional person can generate a realistic image with these tools. Although these tools enrich and facilitate people's material and spiritual life, they raise security concerns. The large number of fake faces has brought huge challenges to politics, justice, criminal investigation, reputation protection, and even social stability. For example, according to the Associated Press, a spy used a fake profile photo on a LinkedIn account under the name Katie Jones, a 30-something redhead, to attract would-be targets. She was connected to Washington political figures, including a deputy assistant secretary of state, a senior aide to a senator, and the economist Paul Winfree.
Therefore, GAN-generated face detection is well worth investigating, and many methods have been proposed in the past five years. These methods can be classified into two categories: intrinsic attribute-based [7-10] and deep learning-based [11-18]. The intrinsic attribute-based methods exploit inconsistencies of various attributes between natural images and GAN-generated images, such as global symmetry [7], brightness [8], color differences [9], compression [10], and so on. The deep learning-based methods extract the expected statistical features automatically through network learning, and thus are usually more effective than the intrinsic attribute-based methods. For generated image detection, Quan et al. [11] used two large 7 × 7 kernels in the first two layers to extract distinctive features for discriminating real from fake images. Mo et al. [12] utilized high-pass filters as pre-processing layers and then extracted features from the pre-processed input with a CNN for fake face detection. Nataraj et al. [13] adopted the color co-occurrence matrix as the network input for fake face detection. Hulzebosch et al. [14] considered several pre-processing operations, such as filtering by two high-pass filters, using the co-occurrence matrix, and transforming to the HSV color space. However, Liu et al. [15] pointed out that using handcrafted features as the input leads to a loss of raw data information. Therefore, they used the raw image as the input and proposed Gram-Net, which uses Gram matrix blocks to extract global texture features.
For fake face video detection, Afchar et al. [16] used the Inception block as the initial layers to detect DeepFake and Face2Face manipulations. In order to stack convolution outputs of different shapes and thereby increase the function space for optimizing the model, they improved the Inception block by replacing the common convolution layers with dilated convolution layers. Khodabakhsh et al. [17] utilized common image classification models to detect generated face videos, such as AlexNet, VGG19, ResNet50, Inception-v3, and Xception. Their experimental comparison showed that the Xception was the best of the five models. Rössler et al. [18] constructed a huge dataset for DeepFake and Face2Face detection and also showed that the Xception model achieved the best overall performance on their dataset among the six compared models. In addition, Chang et al. [19] and Wang et al. [20] combined data augmentation with some existing models to improve their performance.
All of the works mentioned above focus on wholly generated face images. However, there are realistic scenarios in which only a small or even tiny part of the image is generated and the rest is natural, for example, face image restoration, glasses removal, and mask removal. Since the generated regions may be very small, and small generated regions may shrink to a point or even vanish in the feature map after a deep network with multiple pooling layers, the whole-generated-face detection methods mentioned above may not be effective in the locally generated case. Therefore, how to extract effective features from such small generated regions is a critical problem.
To the best of our knowledge, this paper is the first work to address locally GAN-generated face detection. The main contributions are as follows:
(1) The first Locally GAN-Generated Face (LGGF) dataset is constructed. Randomly generated binary masks are pasted on natural images from the FFHQ dataset [3] to obtain incomplete images. Then, the incomplete images are restored by the pluralistic image completion method [21] to produce the LGGF dataset, which contains a total of 952,000 images with generated regions of different shapes and sizes.
(2) An improved model based on Xception [22] is proposed for detecting locally GAN-generated faces. The Xception is improved by removing four residual blocks to alleviate overfitting, introducing the Inception block with dilated convolution into the pre-processing module to obtain multi-scale features, and adding the Feature Pyramid Network (FPN) [23] to extract multi-level features for the final decision.
The rest of the paper is organized as follows. Section 2 reviews the relevant works and technologies. Section 3 describes the structure of the proposed model in detail. The experimental dataset construction and results analysis are presented in Section 4. Finally, Section 5 summarizes the paper.
2. Related works

In this section, firstly, the pluralistic image completion method used for constructing the locally GAN-generated dataset is introduced; then, the Xception model for whole generated face detection is reviewed; finally, some preliminaries related to the proposed model are presented, namely the Inception block with the dilated convolutions and the squeeze-and-excitation (SE) module.
2.1. Pluralistic image completion

Traditional image completion methods [24,25] perform well for background completion but are unsatisfactory at reconstructing unique content that is not present in the images [21]. So, some CNN-based methods [26,27] have been proposed. However, these methods may lead to distorted structures and blurred textures [21], especially for large missing regions. Recently, the GAN-based methods [21,28] have become more and more mature. They can generate high-level semantic content, making images more natural and realistic.
In this paper, the pluralistic method proposed by Zheng et al. [21] is utilized to construct the locally GAN-generated face
dataset because the pluralistic method not only improves the visual perception but also does not limit to the attributes of the
target. It has two parallel paths: the reconstructive path and the generative path. As for the reconstructive path, the given
ground truth of the image is utilized to get the prior distribution of the missing parts. Then, the image is rebuilt from the
prior distribution. As for the generative path, the conditional prior is coupled to the distribution obtained in the reconstruc-
tive path.
2.2. Xception model

The Xception model [22] is based on the depthwise separable convolution, which decomposes an ordinary convolution into two steps: a spatial convolution and a pointwise convolution. The spatial convolution operates on each input channel independently. Then, the pointwise convolution uses a 1 × 1 kernel to convolve point by point. This decomposition reduces both the number of parameters and the amount of computation. As shown in Fig. 1, the Xception model consists of 14 residual blocks, which contain 3 common convolution layers and 33 depthwise separable convolutions in total. All three common convolution layers are in the pre-processing module. Moreover, the Xception utilizes global average pooling with a fully connected layer in the decision module.
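As an illustration, the following sketch (our own, in TensorFlow/Keras; not the authors' released code) contrasts the parameter counts of a standard convolution and a depthwise separable convolution for a 64-channel input and 128 output channels:

```python
import tensorflow as tf
from tensorflow.keras import layers

inp = tf.keras.Input(shape=(256, 256, 64))

# Standard convolution: one 3x3 kernel per (input, output) channel pair,
# i.e. 3*3*64*128 + 128 biases = 73,856 parameters.
std = layers.Conv2D(128, 3, padding="same")(inp)

# Depthwise separable convolution: a 3x3 spatial filter per input channel,
# then a 1x1 pointwise convolution across channels,
# i.e. 3*3*64 + 64*128 + 128 biases = 8,896 parameters.
sep = layers.SeparableConv2D(128, 3, padding="same")(inp)
```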
In [18], the Xception is utilized for whole fake face detection and achieves good performance. However, although separable convolution alleviates network overfitting to some extent, the Xception model still suffers from overfitting, especially for locally generated face detection. Overfitting caused by data imbalance is a common phenomenon in classification models [29], and the Xception model also exhibits it on the locally GAN-generated face detection task. Specifically, for a pair of natural and locally GAN-generated images, far more features come from the natural regions than from the generated regions, so the data is imbalanced and the Xception model tends to learn natural features. As a result, the Xception easily falls into overfitting when detecting locally GAN-generated faces.
2.3. Inception block with the dilated convolutions and squeeze-and-excitation module
MesoNet [16] uses the Inception block as its initial layers. In addition, in order to stack convolution outputs of different shapes and thereby increase the function space for optimizing the model, the Inception block is improved by replacing the common convolution layers with dilated convolution layers. The improved Inception block contains four parallel groups of convolution layers. The first group is a 1 × 1 convolution serving as a skip connection between consecutive modules, while each of the other three groups uses dilated convolutions to obtain multi-scale information. The final output is obtained by concatenating the group outputs. In this paper, we take advantage of the Inception block with dilated convolution to extract multi-scale features from locally GAN-generated faces.
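A minimal sketch of such a block (our own TensorFlow/Keras rendering, assuming 3 × 3 kernels for the dilated branches and the (6, 10, 10, 6) filter counts reported in Section 3):

```python
import tensorflow as tf
from tensorflow.keras import layers

def dilated_inception_block(x, filters=(6, 10, 10, 6)):
    # Group 1: 1x1 convolution acting as a skip connection between modules.
    b1 = layers.Conv2D(filters[0], 1, padding="same", activation="elu")(x)
    # Groups 2-4: 3x3 convolutions with dilation rates 1, 2, and 3, giving
    # increasingly large receptive fields (multi-scale information).
    b2 = layers.Conv2D(filters[1], 3, dilation_rate=1, padding="same",
                       activation="elu")(x)
    b3 = layers.Conv2D(filters[2], 3, dilation_rate=2, padding="same",
                       activation="elu")(x)
    b4 = layers.Conv2D(filters[3], 3, dilation_rate=3, padding="same",
                       activation="elu")(x)
    # The final output is the channel-wise concatenation of the four groups.
    return layers.Concatenate()([b1, b2, b3, b4])
```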
The SE module [30] was first proposed for the ImageNet classification problem. Its motivation is to model the interdependence between feature channels explicitly. A SE module contains a global pooling layer, two fully connected layers, and an activation layer. The global pooling turns each feature map into a single value representing the weight of that channel. The obtained weights are then processed by two fully connected layers with ReLU activation and normalized to the range 0-1 by the sigmoid function. Finally, the original feature maps are multiplied by the normalized weights to complete the re-calibration.
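A sketch of the SE module (our own, following [30]; the reduction ratio r = 16 is a conventional choice, not stated in the paper):

```python
from tensorflow.keras import layers

def se_module(x, r=16):
    c = x.shape[-1]
    # Squeeze: global average pooling turns each feature map into one value.
    w = layers.GlobalAveragePooling2D()(x)
    # Excitation: two fully connected layers; the sigmoid normalizes the
    # channel weights to the range 0-1.
    w = layers.Dense(max(c // r, 1), activation="relu")(w)
    w = layers.Dense(c, activation="sigmoid")(w)
    # Re-calibration: scale each channel of the input by its learned weight.
    w = layers.Reshape((1, 1, c))(w)
    return layers.Multiply()([x, w])
```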
3. Proposed model
In this section, a model for the locally GAN-generated face detection is proposed. The model is an improved version of the
Xception model. Two improvements by removing four residual blocks and adding the SE module are considered to avoid the
overfitting problem as much as possible. The Inception block with the dilated convolution and the SE module is used to
replace each common convolution layer in the pre-processing module to obtain the multi-scale features. Moreover, the
FPN is added to achieve the multi-level features for final decision. The details are described in the following.
In this paper, the Inception block is combined with the SE module to promote the positive feature maps and suppress the negative ones for the locally generated face detection task, since the SE module can model the relationships between channels and weight them according to their potential importance. As shown in Fig. 2(a), the input is processed by the Inception block before the SE module. Here, c, h, and w denote the number of channels, rows, and columns of the feature map, respectively, and Scale denotes the multiplication operation. The SE module is also introduced into the residual block with the same purpose; this architecture is presented in Fig. 2(b).
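Reusing the two hypothetical helpers sketched in Section 2.3, the SE-Inception block of Fig. 2(a) can be composed as follows:

```python
def se_inception_block(x, filters=(6, 10, 10, 6)):
    # Inception output first, then SE re-calibration, as in Fig. 2(a).
    return se_module(dilated_inception_block(x, filters))
```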
Since very small generated regions may shrink to a point or even vanish in the feature map after multi-layer convolution and repeated pooling, the final decision may be very difficult. Therefore, the FPN [23], which has been successfully used for small object detection [31], is utilized to extract multi-level features for the final decision from the feature maps of both the information content layers and the deep semantic layers. In this paper, the FPN is combined with global average pooling and a fully connected layer. The architecture of the FPN is shown in Fig. 2(c).
As shown in Fig. 2(c), the FPN combines a bottom-up pathway, a top-down pathway, and lateral connections with an add operation to obtain feature maps at different layers. The resulting multi-level maps therefore contain both the deep semantic information and the shallow context information. Then, the feature map of each scale is connected to a global average pooling and a fully connected layer. Finally, the results of the different scales are merged to make the decision.

Fig. 2. Architecture of the SE-Inception block, SE-residual block, and FPN connecting with global average pooling.
The overall architecture of the proposed locally GAN-generated face detection model is illustrated in Fig. 3. The model can be divided into four modules: the pre-processing module, the information content module, and the deep semantic module for feature extraction, and the decision module for the final decision. All convolutions in the network are depthwise separable convolutions with 3 × 3 kernels, except for the pre-processing module, which uses the Inception block. The activation function is the ELU, which is shown to be superior to ReLU in [22].
In the pre-processing module, the SE-Inception block shown in Fig. 2(a) replaces the common convolutional layers of the original Xception model to obtain multi-scale features. The SE-Inception block increases the functional space for optimizing the pre-processing module by stacking the outputs of several convolutional layers with different kernel shapes. The pre-processing module contains two SE-Inception blocks with a common convolution layer between them. Each SE-Inception block has one common convolution layer and three dilated convolution layers; the numbers of kernels in the two blocks are (6, 10, 10, 6) and (12, 20, 20, 12), respectively, and the dilation rates are set to r = 1, 2, 3, respectively. The middle convolution layer, with 32 kernels and stride 2, is used to reduce the image resolution, as sketched below.
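Under the same assumptions as the earlier sketches (including a 3 × 3 kernel for the middle convolution, whose size the paper does not state), the pre-processing module could be assembled as:

```python
def preprocessing_module(x):
    # First SE-Inception block with (6, 10, 10, 6) kernels.
    x = se_inception_block(x, filters=(6, 10, 10, 6))
    # Common convolution with 32 kernels and stride 2 halves the resolution.
    x = layers.Conv2D(32, 3, strides=2, padding="same", activation="elu")(x)
    # Second SE-Inception block with (12, 20, 20, 12) kernels.
    return se_inception_block(x, filters=(12, 20, 20, 12))
```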
In the information content module, the inherent features of the images are learned by relatively shallow layers. The module is composed of three SE-residual pooling blocks, each consisting of an SE-residual block with a max-pooling layer. A 1 × 1 convolution with stride 2 is used as the transition convolution. The numbers of convolution kernels for the three blocks are 128, 256, and 512, respectively, as sketched below.
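Continuing the sketches above, one SE-residual pooling block could be patterned after the Xception residual block (two separable convolutions on the main path; the exact placement of the SE module is our assumption):

```python
def se_residual_pooling_block(x, filters):
    # 1x1 transition convolution with stride 2 on the shortcut path.
    shortcut = layers.Conv2D(filters, 1, strides=2, padding="same")(x)
    # Main path: two depthwise separable convolutions, SE re-calibration,
    # then max-pooling to halve the resolution.
    y = layers.SeparableConv2D(filters, 3, padding="same", activation="elu")(x)
    y = layers.SeparableConv2D(filters, 3, padding="same")(y)
    y = se_module(y)
    y = layers.MaxPooling2D(3, strides=2, padding="same")(y)
    return layers.Add()([shortcut, y])
```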
In the deep semantic module, more high-level semantic information is obtained as the network deepens. Since an excessive number of convolutional layers and kernels would lead to overfitting, the proposed model removes four residual blocks from the deep semantic module of the Xception model and also changes the number of kernels in some convolutional layers. The deep semantic module contains four SE-residual blocks, one SE-residual pooling block, and one depthwise separable convolution layer. The number of kernels is 512 for each of the four SE-residual blocks, and 728 and 1024 for the SE-residual pooling block and the depthwise separable convolution layer, respectively.
In the decision module, multi-level feature maps from three different layers are combined by the FPN for the final decision. One layer is the output of the last block of the deep semantic module; the other two are from the last two blocks of the information content module. The resolutions of the three layers are 8 × 8, 16 × 16, and 32 × 32, respectively. The details of the decision module are shown in Fig. 4. After the FPN, the multi-level features are connected to a global average pooling and a fully connected layer (Dense), where each fully connected layer has 16 neurons. Finally, the features are merged by concatenation for the final softmax classification decision, as sketched below.
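A sketch of the decision head (the lateral channel width of 256 and nearest-neighbor upsampling are our assumptions; the paper does not state them):

```python
def decision_module(c3, c4, c5, channels=256):
    # c3, c4, c5: feature maps at 32x32, 16x16, and 8x8 resolution.
    # Top-down pathway with 1x1 lateral convolutions and add connections.
    p5 = layers.Conv2D(channels, 1, padding="same")(c5)
    p4 = layers.Add()([layers.UpSampling2D(2)(p5),
                       layers.Conv2D(channels, 1, padding="same")(c4)])
    p3 = layers.Add()([layers.UpSampling2D(2)(p4),
                       layers.Conv2D(channels, 1, padding="same")(c3)])
    # Each pyramid level feeds global average pooling and a 16-neuron dense layer.
    feats = [layers.Dense(16, activation="elu")(layers.GlobalAveragePooling2D()(p))
             for p in (p3, p4, p5)]
    # Merge by concatenation and classify with a two-way softmax.
    return layers.Dense(2, activation="softmax")(layers.Concatenate()(feats))
```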
Finally, after training the network, the classification result is obtained. The loss function for the whole network is the cross-entropy

L = -\sum_i \left[ p_i \log(y_i) + (1 - p_i) \log(1 - y_i) \right]    (1)

where p_i is the ground-truth label of sample i and y_i is the predicted probability.
Fig. 4. Detailed architecture of the decision module. Numbers in the form (m × m, n) refer to the kernel size m and the number of kernels n in the convolutional layer.
4. Experiments

In this section, firstly, the construction of the LGGF dataset is described in Section 4.1. Then, the experimental setup is elaborated in Section 4.2. The experimental results are described in Section 4.3. Finally, some analysis and discussion are presented in Section 4.4.
4.1. Construction of the LGGF dataset

To the best of our knowledge, there is currently no publicly available dataset for locally GAN-generated face detection. So, the LGGF dataset is constructed by the procedure shown in Fig. 5.
First, the natural face image library FFHQ [3], which serves as a benchmark for several GANs, is used to create a partially incomplete image dataset. The FFHQ dataset is a high-quality natural face image dataset containing 70,000 images at 1024 × 1024 resolution. It contains many variations, including race, gender, age, background, and so on, and also covers a wide range of face and head accessories, for example, glasses, headbands, masks, hats, and so on. In this paper, all 70,000 images are resized to 256 × 256 in consideration of the computational complexity.
Then, binary ground-truth masks of different sizes are created with Matlab. Six sizes are considered, with ratios of the whole image from 0.5% to 5.5% in steps of 1.0%. In addition, two shapes (regular rectangular and irregular) are taken into account. Each shape-size combination contains 70,000 masks, which appear at arbitrary positions in the natural images. Combining the FFHQ dataset with these two shapes and six sizes of masks yields twelve incomplete image datasets; a sketch of the rectangular case is given below.
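As an illustration, rectangular masks with a prescribed area ratio could be generated as follows (a NumPy sketch under our own assumptions; the paper used Matlab, and the aspect-ratio range is hypothetical):

```python
import numpy as np

def random_rect_mask(h=256, w=256, ratio=0.015, rng=None):
    # Draw a rectangle whose area is `ratio` of the whole image.
    rng = rng if rng is not None else np.random.default_rng()
    area = ratio * h * w
    aspect = rng.uniform(0.5, 2.0)  # hypothetical aspect-ratio range
    rh = max(1, int(round(np.sqrt(area * aspect))))
    rw = max(1, int(round(area / rh)))
    # Place the rectangle at an arbitrary position inside the image.
    top = int(rng.integers(0, h - rh + 1))
    left = int(rng.integers(0, w - rw + 1))
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[top:top + rh, left:left + rw] = 1
    return mask
```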
Finally, the pluralistic model [21] described in Section 2.1 is utilized to restore the incomplete region of each image in the twelve incomplete image datasets. The model was trained by Zheng et al. [21] on the Celeba-HQ dataset [2]. Consequently, the LGGF dataset is obtained. It contains a total of 952,000 images with generated regions of different shapes and sizes. Some samples are shown in Fig. 6.
In addition, for the following experiments, we construct an experimental dataset that contains the 70,000 natural images of the FFHQ dataset and an equal number of generated images. Two sub-datasets are therefore created from the LGGF dataset: a regular sub-dataset and an irregular one. The regular sub-dataset consists of 70,000 randomly selected images with regular rectangular generated regions, while the irregular sub-dataset consists of 70,000 randomly selected images with irregular generated regions. The generated regions in both sub-datasets have sizes with ratios of the whole image from 0.5% to 5.5%, and the number of images is the same for each region size.

Fig. 5. Overview of the construction of the locally GAN-generated face (LGGF) dataset.

Fig. 6. Some examples from the LGGF dataset. In each group, the left image is the natural image, the middle one is the incomplete image with the mask, and the right one is the locally generated image after restoration.
4.2. Experimental setup

To evaluate the performance of the proposed locally GAN-generated face detection model, we carried out a series of experiments. In these experiments, the original FFHQ dataset and the two sub-datasets (regular and irregular) are divided into training, validation, and testing sets with the ratio 5:1:4. Moreover, in each experiment, the numbers of generated and natural samples are the same in each of the training, validation, and testing sets.
To evaluate the performance of the proposed model, three metrics are considered: Accuracy, Precision, and Recall. Let P and N be the numbers of locally generated and natural samples in the dataset, and let TP, TN, FP, and FN be the numbers of correctly detected locally generated samples, correctly detected natural samples, natural samples erroneously detected as locally generated, and locally generated samples falsely detected as natural, respectively. The three metrics are defined as
\text{Accuracy} = \frac{TP + TN}{P + N}    (2)

\text{Precision} = \frac{TP}{TP + FP}    (3)

\text{Recall} = \frac{TP}{TP + FN}    (4)
It can be seen from these equations that: (a) the higher the Accuracy, the better the overall performance; (b) the higher the Precision, the more reliable the detection of locally generated images; (c) the higher the Recall, the lower the miss rate for locally generated images. A small helper computing the three metrics is sketched below.
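A minimal helper computing the three metrics of Eqs. (2)-(4) from the four counts (our own sketch):

```python
def metrics(tp, tn, fp, fn):
    # P + N equals the total number of samples, so Eq. (2) reduces to:
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)   # Eq. (3)
    recall = tp / (tp + fn)      # Eq. (4)
    return accuracy, precision, recall
```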
All experiments are performed in Keras on a single 11 GB GeForce GTX 1080 Ti with an i7-6900K CPU and 64 GB RAM. For training, we use the Adam optimization algorithm with an initial learning rate of 0.001 and set the total number of epochs to 128. The source code and experimental datasets are available at https://github.com/imagecbj/Locally-GAN-generated-face-detection-based-on-an-improved-Xception.
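A sketch of this training configuration (Keras; `model`, `train_data`, and `val_data` are placeholders, and the use of categorical cross-entropy with the two-way softmax is our reading of Eq. (1)):

```python
import tensorflow as tf

def compile_and_train(model, train_data, val_data):
    # Adam with initial learning rate 0.001 and 128 epochs, as stated above.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model.fit(train_data, validation_data=val_data, epochs=128)
```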
4.3. Experimental results

The proposed model presented in Fig. 3 is an improved version of the Xception with three main modifications. To show the improvement brought by each modification, ablation experiments are carried out; the results are shown in Table 1. It can be observed from this table that each modification contributes to the final performance of the proposed model. Among the three modifications, M2, using the Inception block, contributes the most.

Table 1
Ablation experimental results for the three main modifications (M1: removing four residual blocks; M2: using the Inception block with the dilated convolution to replace the common convolution layer; M3: adding FPN to obtain multi-level features).

Table 2
Effectiveness comparison of different models. '-' means that the corresponding model does not converge.
Then, in order to evaluate the performance of the proposed model, it is compared with six recent advanced models: the NICG model [11], IH model [12], MesoNet model [16], Xception model [22], Cross-Net model [14], and Gram-Net model [15].

The first test evaluates the effectiveness of the proposed model. Table 2 shows the results of the different models on the regular and irregular sub-datasets in terms of Accuracy, Precision, and Recall. Notice that there are no results for the IH and NICG models because these two models cannot converge on either sub-dataset. It can be seen from Table 2 that the proposed model is superior to the other models and is therefore more suitable for the detection of locally GAN-generated faces. Due to their non-convergence, the IH and NICG models are no longer considered in the following tests.
The second test evaluates the effect of the size of the generated regions. Here, for each type (regular or irregular), there are six testing sets for the six sizes with ratios of the whole image between 0.5% and 5.5%. Each testing set contains 28,000 natural images and 28,000 generated images with a fixed generated region size. Fig. 7 shows the results for the different sizes of the generated regions. It can be found from Fig. 7 that: (a) the accuracy increases for all the compared models as the size of the generated regions increases; (b) the proposed model outperforms the other models at all sizes, especially the small ones. This is because of the use of the Inception block with the dilated convolution and the FPN, which extract multi-scale and multi-level features, respectively.
Table 3
Comparison of robustness against smoothing filtering in the testing set (FFHQ dataset + regular sub-dataset).
Table 4
Comparison of robustness against smoothing filtering in the testing set (FFHQ dataset + irregular sub-dataset).
The robustness of the proposed model is evaluated in the third test. Four smoothing filtering operations are performed on two testing sets comprising the regular and irregular sub-datasets, respectively: Gaussian filtering, mean filtering, median filtering, and bilateral filtering. Tables 3 and 4 present the results on the two testing sets. It can be seen from the two tables that, although the metric values decrease after the filtering operations, the proposed model still has considerable performance and is superior to the other four models except in the case of mean filtering.
The fourth test assesses the generalization performance of the proposed model across different image restoration methods. Here, we consider two other recent image restoration methods: DFNet [32] and Generative_inpainting [33]. They are used to restore the incomplete regions of each image in the two incomplete image sub-datasets corresponding to the testing sets of the regular and irregular sub-datasets. Each sub-dataset has 28,000 images. Then, the well-trained detection models are used to detect these restored images. These models were trained on the training set whose images were restored by the pluralistic image restoration method [21]. The results for the two restoration methods are shown in Tables 5 and 6. It can be observed from Tables 5 and 6 as well as Table 2 that: (i) the performance of each detection model worsens on the testing sets created by the other restoration methods, because the tests in Tables 5 and 6 are cross-dataset evaluations, while the test in Table 2 is an in-dataset evaluation; (ii) the proposed model still performs well and achieves the best performance among the five models in each test.
Table 5
Performance comparison of different models for the dataset created by the DFNet [32].
Table 6
Performance comparison of different models for the dataset created by the Generative_inpainting [33].
Table 7
Comparison of performance in detecting very small generated regions.
The fifth test verifies that the proposed model also has good discrimination ability on very small generated regions. Hence, we consider locally generated images having very small generated regions with ratios between 0.1% and 1.0%. The results are presented in Table 7. They show that the proposed model again achieves the best performance among the five compared models, whereas MesoNet almost fails, with a Recall of around 0.5.
Finally, we test the performance of the proposed model in some realistic scenarios, such as glasses removal, mask removal, and other occlusion removal. We create three sub-datasets corresponding to these three scenarios. For the mask-removal sub-dataset, the original face images with masks are collected from the Internet. For the glasses-removal and other-occlusion-removal sub-datasets, the original face images with glasses and other occlusions are from the natural FFHQ dataset. The occlusions in these original images are removed with Adobe Photoshop and the images are then repaired by the pluralistic image restoration method [21]. Finally, 300 pairs of testing images are obtained for each sub-dataset. Each pair contains an original face image and its corresponding repaired one.
Some samples are shown in Fig. 8. All these 900 pairs are used as testing images to evaluate the performance of each model in the realistic scenarios.

Table 8
Performance comparison for the different cases of occlusion removal.

The detection results are presented in Table 8. It can be observed from Table 8 that: (i) the performance of
each model on the three sub-datasets is better than that shown in Table 2 for the regular and irregular sub-datasets. The main reason is that glasses removal and mask removal usually require restoring larger regions than most cases in the regular and irregular sub-datasets; (ii) our model still achieves the best overall performance among the five compared models.
4.4. Analysis and discussion

The proposed model has been compared with six models previously designed for wholly GAN-generated faces in Tables 2-8 and Fig. 7. The comparison shows that the proposed model is well suited to locally GAN-generated face detection. The main reason is that it utilizes the Inception block with the dilated convolution and the FPN to extract multi-scale and multi-level features. Some models originally designed for wholly GAN-generated faces, such as the NICG model [11] and IH model [12], almost fail to reach a decision. Fig. 9 presents the loss and accuracy of these two models; it can be observed that they cannot converge on either sub-dataset. So, the development of models dedicated to locally GAN-generated face detection is necessary.
The proposed model is an improved version of Xception. Compared with the Xception, it achieves better performance with fewer parameters: it removes four residual blocks and reduces the number of convolution kernels in the last layer by 1024. The numbers of parameters for the proposed model and the Xception are 922 k and 2,506 k, respectively.
5. Conclusions
In this paper, the locally GAN-generated face detection problem is investigated, and an improved Xception model is proposed for it. The Inception block with the dilated convolution, the FPN, and the SE module are introduced into Xception to obtain multi-scale and multi-level features for the small generated face regions. The LGGF dataset is constructed on the basis of the natural FFHQ dataset and contains a total of 952,000 images with generated regions of different shapes and sizes. Experimental results show that the proposed model outperforms other models designed for wholly generated faces, especially for faces with small generated regions. Certainly, the detection of locally GAN-generated faces is only an image-level task; localization of the generated regions, a pixel-level task, is also very important and more difficult than detection. The localization task is therefore left as our future work.
CRediT authorship contribution statement

Beijing Chen: Conceptualization, Methodology, Writing - Reviewing and Editing. Xingwang Ju: Data curation, Software, Validation, Writing - Original draft preparation. Bin Xiao: Methodology, Writing - Reviewing and Editing. Weiping Ding: Investigation, Writing - Reviewing and Editing. Yuhui Zheng: Methodology, Visualization.
Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
The authors would like to express their sincere appreciation to the editor and the anonymous reviewers for their insightful comments, which greatly improved the quality of this paper.

This work was supported by the National Natural Science Foundation of China under Grants 62072251, 61976120, and 61972206, the Natural Science Foundation of Jiangsu Province under Grant BK20191445, and the PAPD fund.
References
[1] D. Berthelot, T. Schumm, L. Metz, Began: Boundary equilibrium generative adversarial networks, arXiv preprint arXiv:1703.10717, 2017.
[2] T. Karras, T. Aila, S. Laine, et al., Progressive growing of gans for improved quality, stability, and variation, arXiv preprint arXiv:1710.10196, 2017.
[3] T. Karras, S. Laine, T. Aila, A style-based generator architecture for generative adversarial networks, in: Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR2019), 2019, pp. 4401-4410.
[4] J. Thies, M. Zollhöfer, M. Stamminger, et al., Face2face: Real-time face capture and reenactment of rgb videos, in: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR2016), 2016, pp. 2387-2395.
[5] I. Korshunova, W. Shi, J. Dambre, et al., Fast face-swap using convolutional neural networks, in: Proceedings of the 2017 IEEE International Conference
on Computer Vision (ICCV2017), 2017, pp. 3677-3685.
[6] D. Güera, E. Delp, Deepfake video detection using recurrent neural networks, in: Proceedings of the 15th IEEE International Conference on Advanced
Video and Signal Based Surveillance (AVSS2018), 2018, pp. 1-6.
[7] F. Matern, C. Riess, M. Stamminger, Exploiting visual artifacts to expose deepfakes and face manipulations, in: Proceedings of the 2019 IEEE Winter
Applications of Computer Vision Workshops (WACVW2019), 2019, pp. 83-92.
[8] S. McCloskey, M. Albright, Detecting GAN-generated imagery using color cues, arXiv preprint arXiv:1812.08247, 2018.
[9] H. Li, B. Li, S. Tan, et al., Detection of deep network generated images using disparities in color components, arXiv preprint arXiv:1808.07276, 2018.
[10] F. Marra, D. Gragnaniello, D. Cozzolino, et al., Detection of GAN-generated fake images over social networks, in: Proceedings of the 2018 IEEE
Conference on Multimedia Information Processing and Retrieval (MIPR2018), 2018, pp. 384-389.
[11] W. Quan, K. Wang, D.M. Yan, et al, Distinguishing between natural and computer-generated images using convolutional neural networks, IEEE Trans.
Inf. Forensics Secur. 13 (11) (2018) 2772–2787.
[12] H. Mo, B. Chen, W. Luo, Fake faces identification via convolutional neural network, in: Proceedings of the 6th ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec2018), 2018, pp. 43-47.
[13] L. Nataraj, T.M. Mohammed, B.S. Manjunath, et al., Detecting GAN generated fake images using co-occurrence matrices, Electronic Imaging, 2019, pp.
532-1-532-7.
[14] N. Hulzebosch, S. Ibrahimi, M. Worring, Detecting CNN-generated Facial images in real-world scenarios, in: Proceedings of the 2020 IEEE/CVF
Conference on Computer Vision and Pattern Recognition Workshops (CVPR-W2020), 2020, pp. 642-643.
[15] Z. Liu, X. Qi, P.H.S. Torr, Global texture enhancement for fake face detection in the wild, in: Proceedings of the 2020 IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR2020), 2020, pp. 8060-8069.
[16] D. Afchar, V. Nozick, J. Yamagishi, et al., MesoNet: a compact facial video forgery detection network, in: Proceedings of the 2018 IEEE International
Workshop on Information Forensics and Security (WIFS2018), 2018, pp. 1-7.
[17] A. Khodabakhsh, R. Ramachandra, K. Raja, et al., Fake face detection methods: Can they be generalized?, in: Proceedings of the 2018 International Conference of the Biometrics Special Interest Group, 2018, pp. 1-6.
[18] A. Rössler, D. Cozzolino, L. Verdoliva, et al., Faceforensics++: Learning to detect manipulated facial images, arXiv preprint arXiv:1901.08971, 2019.
[19] X. Chang, J. Wu, T. Yang, et al., DeepFake face image detection based on improved VGG convolutional neural network, in: Proceedings of the 39th Chinese Control Conference (CCC2020), 2020, pp. 7252-7256.
[20] S.Y. Wang, O. Wang, R. Zhang, et al., CNN-generated images are surprisingly easy to spot... for now, in: Proceedings of the 2020 IEEE Conference on
Computer Vision and Pattern Recognition (CVPR2020), 2020, pp. 8695-8704.
[21] C. Zheng, T.J. Cham, J. Cai, Pluralistic image completion, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2019), 2019, pp. 1438-1447.
[22] F. Chollet, Xception: Deep learning with depthwise separable convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR2017), 2017, pp. 1251-1258.
[23] T.Y. Lin, P. Dollár, R. Girshick, et al., Feature pyramid networks for object detection, in: Proceedings of the 2017 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR2017), 2017, pp. 2117-2125.
[24] A. Criminisi, P. Pérez, K. Toyama, Region filling and object removal by exemplar-based image inpainting, IEEE Trans. Image Process. 13 (9) (2004) 1200-1212.
[25] J. Hays, A.A. Efros, Scene completion using millions of photographs, ACM Trans. Graphics 26 (3) (2007) 4-es.
[26] Y. Zeng, J. Fu, H. Chao, et al., Learning pyramid-context encoder network for high-quality image inpainting, in: Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR2019), 2019, pp. 1486-1494.
[27] R. Sakurai, S. Yamane, J.H. Lee, Restoring aspect ratio distortion of natural images with convolutional neural network, IEEE Trans. Ind. Inf. 15 (1) (2019)
563–571.
[28] G. Kanojia, S. Raman, MIC-GAN: multi-view assisted image completion using conditional generative adversarial networks, in: Proceedings of the 2020
National Conference on Communications (NCC2020), 2020, pp. 1-6.
[29] M.S. Santos, J.P. Soares, P.H. Abreu, et al, Cross-validation for imbalanced datasets: avoiding overoptimistic and overfitting approaches, IEEE Comput.
Intell. Mag. 13 (4) (2018) 59–76.
[30] J. Hu, L. Shen, G. Sun, Squeeze-and-excitation networks, in: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR2018), 2018, pp. 7132-7141.
[31] B. Singh, M. Najibi, L.S. Davis, SNIPER: Efficient multi-scale training, Adv. Neural Inf. Process. Syst. (2018) 9310–9320.
[32] X. Hong, P. Xiong, R. Ji, Deep fusion network for image completion, in: Proceedings of the 27th ACM International Conference on Multimedia
(MM2019), 2019, pp. 2033-2042.
[33] J. Yu, Z. Lin, J. Yang, et al., Free-form image inpainting with gated convolution, in: Proceedings of the 2019 IEEE International Conference on Computer Vision (ICCV2019), 2019, pp. 4471-4480.