

Detection of Face Morphing Attacks by Deep Learning

Clemens Seibold1, Wojciech Samek1, Anna Hilsmann1 and Peter Eisert1,2


1 Fraunhofer HHI, Einsteinufer 37, 10587 Berlin, Germany
2 Humboldt University Berlin, Unter den Linden 6, 10099 Berlin, Germany

Abstract. Identification by biometric features has become more popular in the last decade. High-quality video and fingerprint sensors have become less expensive and are nowadays standard components in many mobile devices. Thus, many devices can be unlocked via fingerprint or face verification. The state-of-the-art accuracy of biometric facial recognition systems has prompted even applications with high security standards, like border control at airports, to rely on biometric systems. While most biometric facial recognition systems perform quite accurately under a controlled environment, they can easily be tricked by morphing attacks. The concept of a morphing attack is to create a synthetic face image that contains characteristics of two different individuals and to use this image on a document or as a reference image in a database. Using this image for authentication, a biometric facial recognition system accepts both individuals. In this paper, we propose a morphing attack detection approach based on convolutional neural networks. We present an automatic morphing pipeline to generate morphing attacks, train neural networks on these data and analyze their accuracy. The accuracy of different well-known network architectures is compared, and the advantage of using pretrained networks over networks trained from scratch is studied.

Keywords: automatic face morphing - face image forgery detection - convolutional neural
networks - morphing attack

1 Introduction

Biometric facial recognition systems are nowadays present in many areas of daily life. They are used to identify people, to find pictures of the same person in a digital image collection, to make suggestions for tagging people in social media, or for verification tasks like unlocking a phone or a computer. Apart from these consumer market applications, biometric facial recognition systems have also found their way into sovereign tasks like automatic border control at airports. In particular, for these tasks the verification system has to be reliable and secure.
Even though biometric facial recognition systems achieve false rejection rates below 1% at a
false acceptance rate of 0.1% in a controlled environment [1], they can easily be tricked by a specific
attack, known as the morphing attack [2]. The concept of this attack is to create a synthetic face image that is, in the eyes of the recognition system, similar to the faces of two different individuals. Thus, both can use the same synthetic face image for authentication. This attack is usually performed by creating a face image that contains characteristics of both faces, for example by face morphing. Since such an attack has a drastic impact on the authenticity of the underlying system, its automatic detection is of utmost importance.
The detection of attacks by image manipulation has been studied for various tasks. Several
publications deal with the detection of resampling artifacts [3, 4], the use of specific filters, e.g.
median filters [5], or JPEG double compression [6, 7].

Fig. 1. Morphing attack (left, right: original images; center: morphing attack)

Besides these analyses based on signal theory, several authors proposed image forgery detection methods based on semantic image content by
analyzing reflections and shadows [8, 9]. Recently, the detection and analysis of forged face images
with the purpose of tricking a facial recognition system became interesting to many researchers, who were certainly influenced by the publication of a manual for morphing attacks by Ferrara
et al. [2]. Makrushin et al. [10] proposed an approach for automatic generation of facial morphs
and their detection based on the distribution of Benford features extracted from quantized DCT
coefficients of JPEG-compressed morphs. Raghavendra et al. [11] presented a morphing detection method based on binarized statistical image features and evaluated its accuracy on manually created morphed face images.
In this work, we analyze the practical usability of deep neural networks (DNNs) for detection of
morphed face images (morphing attacks). DNNs are state-of-the-art image classification methods: in 2012, the DNN AlexNet [12] broke the record error rate of 25.7% in the object recognition challenge (ILSVRC2012) and reduced it to 16.4%. This DNN architecture has five convolutional layers and ends with three fully connected layers. Pursuing the concept of "simple but deep", Simonyan and Zisserman [13] showed in 2015 that a simple architecture (VGG19) with only 3x3 convolutional filters, but with a depth of 16 convolutional layers, performs as well as state-of-the-art architectures, with error rates for object classification of about 7%. Szegedy et al. [14] moved away from the concept of a simple architecture and introduced GoogLeNet, a complex structure containing inception layers. An inception layer consists of several convolutional filters that operate in parallel and whose outputs are concatenated. On the one hand, GoogLeNet needs, due to this structure, fewer parameters to describe even complex features and is thus less prone to overfitting; on the other hand, the learned features are more difficult to interpret.
We focus on the accuracy analysis of the three architectures named above, since all of these
architectures have successfully been used for the task of image classification and pretrained models
are publicly available. Similar to the task of object classification, we do not want to detect low-level artifacts, e.g. resampling, median filtering or blurring artifacts, like Bayar and Stamm [15] did using a different CNN architecture for image forgery detection. Instead, we want our DNNs to decide based on semantic features like unrealistic eye shapes, specular highlights or other semantic artifacts caused by the morphing process.
Since a huge amount of data is needed to train DNNs, we designed a fully automatic morphing algorithm that takes two 2D face images, aligns the facial features such that they are at the same positions, blends the aligned images, and, in a post-processing step that fits the blended face into a warped source image, removes ghosting artifacts due to different hairstyles, ear geometry, etc. Some components, like the face alignment, are exchangeable due to the pipeline structure of our morphing algorithm. To prevent the networks from learning low-level features, like resampling artifacts, all images - original face images and morphing attacks - are preprocessed in several ways.
In Section 2, we describe our fully automatic morphing algorithm with exchangeable components, which is used to create a database of morphed images. The preprocessing steps, which are independent of the morphing process, and our database of original face images and morphing attacks are presented in Section 3. Our morphing attack detection experiments are described in Section 4. Finally, the results are presented and discussed.

2 Morphing Pipeline

In this section, we introduce our fully automatic morphing algorithm. We start by introducing the general steps of morphing algorithms and then describe our specific algorithm.

2.1 Overview of a General Morphing Pipeline

To trick a facial recognition system into matching two different persons with one synthetic reference image, the synthetic image has to contain characteristics of both faces. If the feature space of the recognition system is known, the characteristics can be combined directly in that space. Since commercial face recognition systems are often black boxes and an attacker cannot rely on the presence of a specific system, we combine the facial characteristics in image space. The usual process of face image morphing is divided into four steps:

1. Locate facial landmarks.
2. Align the images by image warping such that the detected facial landmarks are at nearly the same positions.
3. Blend the images, e.g. by additive blending with a blending factor of 0.5.
4. Replace the inner part of a warped original image by the corresponding part of the blended image. To reduce artifacts, an optimal cutting path is computed.

Some morphing algorithms cut out the inner part of the face and insert it into one original, non-warped face image. They apply an inverse morphing to the blended image and also need to handle the border between the blended and the original image.
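As a rough illustration, the four steps can be composed as in the following minimal Python sketch. The helpers detect_landmarks, warp_to_geometry and fuse_inner_face stand for the components described in Section 2.2; their interfaces are illustrative assumptions, not exact specifications.

```python
import numpy as np

def morph(img_a, img_b, alpha=0.5):
    """Sketch of the four-step morphing pipeline (placeholder helpers)."""
    # 1. Locate facial landmarks in both images.
    lm_a, lm_b = detect_landmarks(img_a), detect_landmarks(img_b)
    # 2. Warp both images to the pairwise-averaged landmark geometry.
    lm_avg = 0.5 * (lm_a + lm_b)
    warped_a = warp_to_geometry(img_a, lm_a, lm_avg)
    warped_b = warp_to_geometry(img_b, lm_b, lm_avg)
    # 3. Blend the aligned images with a blending factor of 0.5.
    blended = (alpha * warped_a.astype(np.float64)
               + (1.0 - alpha) * warped_b.astype(np.float64))
    # 4. Fuse the inner face of the blend into one warped original image.
    return fuse_inner_face(blended.astype(img_a.dtype), warped_a, lm_avg)
```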

2.2 Morphing Pipeline Implementation

Figure 2 illustrates our face image morphing pipeline. In the following, we describe our implementation of the individual steps in detail.
Fig. 2. Morphing pipeline (a, d: input images with facial landmarks; b, c: warped images, left and right parts; f: aligned and blended image; e, g: final morphs)



Landmark Detection Sixty-eight facial landmarks are located using dlib's [16] implementation of [17]. These landmarks are located around the mouth, eyes, nose and eyebrows and along the lower half of the head's silhouette, see also Figures 2a and 2d.
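A minimal sketch of this step with dlib's Python bindings is given below; the path to the 68-point predictor model file is an assumption (the model is distributed separately by dlib).

```python
import dlib
import numpy as np

# Sketch: 68-point landmark detection with dlib [16, 17].
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(image):
    """Return a (68, 2) array of (x, y) landmark positions."""
    faces = detector(image, 1)          # upsample once to find smaller faces
    shape = predictor(image, faces[0])  # assumes exactly one frontal face
    return np.array([(p.x, p.y) for p in shape.parts()], dtype=np.float64)
```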

Image Alignment We implemented two different warping methods to align the images. One is based on the transformation of a triangle mesh, and the other uses the Beier-Neely morphing algorithm [18]. In both cases, the transformation is performed such that both faces are warped to an average geometry of both faces. The exact definition of the average geometry depends on the warping method, but both methods rely on the average positions of the facial landmarks, which are calculated by pairwise averaging the landmark locations in the original images.
a. Triangle warping: We first add eight additional control points to the 68 detected facial landmark locations in each image, to be able to also morph the region outside the convex hull of the detected points. They are located at the four corners and at the midpoints of the four edges of a bounding box around the head. These eight control points are also subject to the averaging of facial landmark positions as described above. Then, the average positions are used to create a triangle mesh via Delaunay triangulation [19], see also Figure 3a. To create the piecewise affine warped version of the first source image, for each pixel in the target image we calculate the barycentric coordinates with respect to the enclosing triangle and use them to look up the color by bilinear interpolation in the source image. This corresponds to rendering the triangle mesh with the common rendering pipeline, using the average landmark positions as vertex coordinates, the landmark positions of a source image as texture coordinates and the source image as texture map. The other image is warped accordingly.
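The barycentric lookup can be sketched with numpy and scipy as follows; for brevity, the sketch uses nearest-neighbor instead of bilinear sampling, so it is an illustration of the scheme rather than the actual implementation.

```python
import numpy as np
from scipy.spatial import Delaunay

def triangle_warp(src_img, src_pts, avg_pts):
    """Piecewise affine warp of src_img from src_pts to the avg_pts geometry."""
    h, w = src_img.shape[:2]
    tri = Delaunay(avg_pts)                        # mesh on the average geometry
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.column_stack([xs.ravel(), ys.ravel()]).astype(np.float64)
    simplex = tri.find_simplex(pix)                # enclosing triangle per pixel
    valid = simplex >= 0
    # Barycentric coordinates of each valid pixel in its enclosing triangle.
    trans = tri.transform[simplex[valid]]          # per-triangle affine maps
    b2 = np.einsum('nij,nj->ni', trans[:, :2], pix[valid] - trans[:, 2])
    bary = np.column_stack([b2, 1.0 - b2.sum(axis=1)])
    # Apply the same weights to the corresponding source-image triangles.
    corners = src_pts[tri.simplices[simplex[valid]]]       # (n, 3, 2)
    src_xy = np.einsum('ni,nij->nj', bary, corners)
    sx = np.clip(np.rint(src_xy[:, 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(src_xy[:, 1]).astype(int), 0, h - 1)
    out = np.zeros_like(src_img)
    flat = out.reshape(-1, *src_img.shape[2:])
    flat[np.flatnonzero(valid)] = src_img[sy, sx]  # nearest-neighbor sampling
    return out
```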

b. Field morphing: In contrast to triangle warping, field morphing [18] requires corresponding feature line segments instead of corresponding feature points. To obtain these, we use the estimated landmark positions and connect them according to a predefined pattern, see Figure 3b. The basic idea of field morphing is to move every pixel according
to the movement of each line segment, weighted by the distance to them. Assuming there is only one line segment $Q$ in the source image with end points $Q_s$ and $Q_e$, one corresponding line segment $Q'$ in the destination image and a given position $p_w$ in the warped image, the corresponding position $p_s$ in the source image can be calculated as follows. First, the position of the point $p_w$ is calculated relative to the line in the destination image. This is done by calculating the distance $d$ from $p_w$ to the line that is defined by extending the line segment $Q'$ to infinite length. The distance $d$ is signed, such that it is positive if the normal of the line segment $Q'$ points towards $p_w$, and negative otherwise. Given the closest point $s$ on this line to $p_w$, we parameterize the position of this projection point using the start and end points of the line segment:

$$u = \frac{s - Q'_s}{\|Q'_e - Q'_s\|} \qquad (1)$$

The point $p_w$ can now be described relative to the line segment by $u$, $d$ and the parameters of the line segment. The corresponding position in the source image is defined relative to the line in the source image as

$$p_s = u \cdot \|Q_e - Q_s\| + Q_s + Q_n \cdot d, \qquad (2)$$

where $Q_n$ is the normal of the line segment $Q$. The multi-line case is straightforward and can
be done for every pixel individually: For each line segment, the motion of a pixel is calculated
as described above and weighted according to its distance to the line segment, with the weight decreasing nonlinearly with the distance. These motions and weights are added up, and finally the motion is divided by the sum of all weights at this pixel. For further details see [18].

Fig. 3. Initialization for face warping (a: triangle mesh warping; b: feature lines for the field morphing)
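The correspondence of Eqs. (1) and (2) and the distance-weighted combination can be sketched as follows; the exact weighting function is not specified above, so the falloff used here is an assumption.

```python
import numpy as np

def perp(v):
    """90-degree rotation of a 2D vector (used as segment normal)."""
    return np.array([-v[1], v[0]])

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment [a, b]."""
    t = np.clip(np.dot(p - a, b - a) / np.dot(b - a, b - a), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * (b - a)))

def field_morph_point(p_w, qs, qe, qs_d, qe_d):
    """Single-line case, Eqs. (1)-(2): map p_w back into the source image.

    (qs, qe) is the segment Q in the source image; (qs_d, qe_d) is the
    corresponding segment Q' in the destination geometry.
    """
    line_d = qe_d - qs_d
    # u: relative position of the projection of p_w along Q' (Eq. 1);
    # d: signed distance of p_w from the extended line through Q'.
    u = np.dot(p_w - qs_d, line_d) / np.dot(line_d, line_d)
    d = np.dot(p_w - qs_d, perp(line_d)) / np.linalg.norm(line_d)
    # Eq. (2): the same (u, d) expressed relative to the source segment Q.
    line_s = qe - qs
    n_s = perp(line_s) / np.linalg.norm(line_s)
    return qs + u * line_s + d * n_s

def field_morph(p_w, segments_src, segments_dst, eps=1e-3):
    """Multi-line case: distance-weighted average of per-segment motions."""
    total_w, motion = 0.0, np.zeros(2)
    for (qs, qe), (qs_d, qe_d) in zip(segments_src, segments_dst):
        p_s = field_morph_point(p_w, qs, qe, qs_d, qe_d)
        dist = point_segment_distance(p_w, qs_d, qe_d)
        w = 1.0 / (eps + dist) ** 2        # nonlinear falloff (assumed form)
        motion += w * (p_s - p_w)
        total_w += w
    return p_w + motion / total_w
```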

Image Blending Our blending is performed by additive alpha blending with a blending factor of 0.5 for both warped images. More complex blending methods, like spatially varying blending factors based on spatial frequencies or selected features, might be useful to retain facial characteristics, e.g. moles and freckles, and will be part of our future work.
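The blending step itself reduces to a few lines:

```python
import numpy as np

def blend(warped_a, warped_b, alpha=0.5):
    """Additive alpha blending of the two aligned (warped) images."""
    a = warped_a.astype(np.float64)
    b = warped_b.astype(np.float64)
    return (alpha * a + (1.0 - alpha) * b).astype(warped_a.dtype)
```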

Warped/Blended Image Fusion The warped and blended images are already usable for tricking a biometric facial recognition system, but would be rejected by any human due to the ghosting artifacts around the borders of the heads, see also Figure 2f. To get rid of these artifacts, we use the blended image only for the inner part of the synthetic face image; the rest of the image is taken from one of the warped source images. As a consequence of combining these two images, we have to deal with the border between the blended part of the face and the background. In order to hide this border from the human eye and from morphing detection systems, we calculate a subtle transition between foreground and background. Two ellipses define a transition zone between foreground and background. They are located such that the outer ellipse lies just on the chin and underneath the hairline, and the inner ellipse surrounds all detected facial landmarks. Figure 4a shows the ellipses in one blended image. The cyan point denotes the center of the ellipses and is defined relative to
the two red-marked facial landmarks on the nose. The width is defined relative to the X-distance
from the center to the positions of the red-marked facial landmarks next to the ears, and the height relative to the Y-distance from the center to the red-marked facial landmarks below the mouth and at the chin. The transition in this area is calculated separately for high and low spatial frequencies. The transition for the low frequencies is calculated using Poisson Image Editing [20], while an optimal cutting path through the transition zone is searched for the high-frequency details. For that purpose, the area defined by the two ellipses is unrolled from Cartesian to polar coordinates [21]. Thus, in the converted image, the column defines the angle and the row defines the radius, see also Figures 4a and 4b.
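One possible realization of the low-frequency transition is OpenCV's Poisson-based seamless cloning; the sketch below uses a single elliptical mask as a simplification of the two-ellipse transition zone described above, so it is an approximation of the actual procedure.

```python
import cv2
import numpy as np

def fuse_low_frequencies(blended, warped_src, center, axes):
    """Poisson-smooth insertion of the blended inner face [20].

    blended, warped_src: 8-bit BGR images of the same size;
    center, axes: integer center and semi-axes of the inner ellipse
    (a simplification of the two-ellipse zone of Figure 4a).
    """
    mask = np.zeros(blended.shape[:2], dtype=np.uint8)
    cv2.ellipse(mask, center, axes, 0, 0, 360, 255, thickness=-1)
    # Insert the masked region into the warped source image with a
    # Poisson-smooth transition at the mask border.
    return cv2.seamlessClone(blended, warped_src, mask, center, cv2.NORMAL_CLONE)
```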

Fig. 4. Estimation of the cutting edge for the high spatial frequencies (a: region for cutting edge estimation; b: unrolled region and estimated optimal cut; c: unrolled region after removing low spatial frequencies)

On these images, we calculate an optimal path using the graph cut algorithm for image segmentation. The top row and the bottom row are defined as foreground and background, respectively, and the rest as unknown. In contrast to classical segmentation approaches, we want to find transitions with a minimal color difference, as in [22]. However, in contrast to [22], we use a different objective function based on a normalized color difference instead of the sum of absolute color differences. Thereby, we encourage cuts through regions that are similar, whether volatile or smooth, in both images
and avoid artifacts along the cut. The cost function for two neighboring pixels $(x, y), (x+1, y)$ or $(x, y), (x, y+1)$ not being in the same class, in our case taken from different images, is thus defined as the weighted intensity difference between both images at these pixels:

$$C(x, y, x+1, y) = \sqrt{\frac{(I_b(x,y) - I_s(x,y))^2 + (I_b(x+1,y) - I_s(x+1,y))^2}{n_x(x,y)}} \qquad (3)$$

$$C(x, y, x, y+1) = \sqrt{\frac{(I_b(x,y) - I_s(x,y))^2 + (I_b(x,y+1) - I_s(x,y+1))^2}{n_y(x,y)}} \qquad (4)$$

where $I_b(x, y)$ is the intensity at pixel $(x, y)$ in the unrolled blended image after removing the low spatial frequencies, $I_s(x, y)$ the intensity in the corresponding unrolled background, and

$$n_x(x,y) = (I_b(x-1,y) - I_b(x+1,y))^2 + (I_b(x,y) - I_b(x+2,y))^2 + (I_s(x-1,y) - I_s(x+1,y))^2 + (I_s(x,y) - I_s(x+2,y))^2 \qquad (5)$$

$$n_y(x,y) = (I_b(x,y-1) - I_b(x,y+1))^2 + (I_b(x,y) - I_b(x,y+2))^2 + (I_s(x,y-1) - I_s(x,y+1))^2 + (I_s(x,y) - I_s(x,y+2))^2 \qquad (6)$$

are the normalization functions. Put simply, cuts through regions that differ in both images are penalized.
Figures 4b-c show the estimated cutting edge on an unrolled face image.
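The edge weights of Eqs. (3)-(6) can be computed on the unrolled images as in the following numpy sketch; a max-flow/graph-cut solver (not shown) would then consume these costs to find the optimal path.

```python
import numpy as np

def seam_costs(i_b, i_s, eps=1e-8):
    """Edge weights of Eqs. (3)-(6).

    i_b: unrolled blended image with the low frequencies removed,
    i_s: corresponding unrolled background; both 2D float arrays.
    eps avoids division by zero in flat regions (an added safeguard).
    """
    d = (i_b - i_s) ** 2                     # squared per-pixel difference

    def n_x(img):                            # horizontal contrast term, Eq. (5)
        return ((img[:, :-3] - img[:, 2:-1]) ** 2
                + (img[:, 1:-2] - img[:, 3:]) ** 2)

    def n_y(img):                            # vertical contrast term, Eq. (6)
        return ((img[:-3, :] - img[2:-1, :]) ** 2
                + (img[1:-2, :] - img[3:, :]) ** 2)

    # Eq. (3): cost of (x, y) and (x+1, y) falling on different sides.
    c_x = np.sqrt((d[:, 1:-2] + d[:, 2:-1]) / (n_x(i_b) + n_x(i_s) + eps))
    # Eq. (4): cost of (x, y) and (x, y+1) falling on different sides.
    c_y = np.sqrt((d[1:-2, :] + d[2:-1, :]) / (n_y(i_b) + n_y(i_s) + eps))
    return c_x, c_y
```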
By separating high and low spatial frequencies, we obtain a border that provides a smooth transition and does not cut through high-frequency biometric characteristics. The estimation of the transition in the low-frequency part allows a smooth and subtle adjustment of different colors or different brightness on the skin, caused by differences in both skin color and illumination. The estimation of a sharp border for the high frequencies provides a cutting path through the image along similar regions; thus, visible borders or cuts through individual biometric characteristics like freckles, moles or scars are avoided.

3 Database and Preprocessing


In this section, we describe our face image database and the preprocessing steps that are applied to both types of images, original face images and morphing attacks. The preprocessing is applied to increase the variety of the data and to prevent the neural networks from learning low-level artifacts like gradient distributions. Finally, we introduce a modified cross-validation method that accounts for the dependency between original face images and morphing attacks caused by the morphing process itself, since each attack fuses two original images into one forged synthetic image containing characteristics of both.

3.1 Database
Our face image database contains about 1250 face images of different individuals. The database can be divided into 9 groups of images, with nearly each group having similar capture conditions regarding illumination, camera type and camera properties. One group contains face images captured under varying conditions and with different cameras. All images show a face in a frontal pose. The captured person has open eyes, a closed mouth and a neutral expression, as demanded for passport images. Morphing attacks are only created using images from the same group. In this way, we acquire morphing attacks of high quality, since difficult alignments due to different focal lengths, distortions or other camera parameters are avoided.
To ensure that our automatically created morphing attacks can be used successfully to trick biometric facial recognition systems, we tested our attack on a commercial biometric facial recognition software [23]. We used the system to check whether a morphing attack and the two genuine face images used to generate it show the same person. In about 95% of cases, both people were recognized when we set the false acceptance rate of the recognition system to 0.1%, as
recommended by FRONTEX for automated border control gates [24]. Since we select the source images for a morphing attack randomly, this rate could be increased further by morphing only faces that do not differ too strongly.

3.2 Preprocessing
To increase the variety in the data, we artificially add two types of noise and two types of blur to both original and morphed images. The parameters of the artificial noise and blur were chosen randomly for each image, but with upper limits as described in Table 1.

Kind of noise/blur      Upper limit
Salt-and-pepper noise   1% of all pixels
Gaussian noise          standard deviation up to 0.05
Gaussian blur           standard deviation up to 0.5% of image width
Motion blur             random angle, length up to 1.2% of image width

Table 1. Preprocessing parameters

Besides adding more variety, this kind of preprocessing is also intended to prevent our networks from learning low-level artifacts caused by resampling or JPEG compression. In summary, our dataset contains about 9000 fake images that are randomly processed by adding salt-and-pepper noise, Gaussian noise, motion blur, Gaussian blur or none of them. The original face images are processed in the same way.
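A sketch of these degradations within the limits of Table 1 is given below; it assumes 8-bit grayscale images, and the exact parameter handling is an assumption.

```python
import cv2
import numpy as np

rng = np.random.default_rng()

def degrade(img):
    """Apply one randomly chosen degradation of Table 1 (or none)."""
    h, w = img.shape[:2]
    choice = rng.integers(0, 5)
    if choice == 0:    # salt-and-pepper noise on up to 1% of all pixels
        mask = rng.random((h, w)) < rng.uniform(0, 0.01)
        img = img.copy()
        img[mask] = rng.choice([0, 255], size=int(mask.sum()))
    elif choice == 1:  # Gaussian noise, sigma up to 0.05 of the value range
        sigma = rng.uniform(0, 0.05) * 255.0
        img = np.clip(img + rng.normal(0, sigma, img.shape), 0, 255).astype(np.uint8)
    elif choice == 2:  # Gaussian blur, sigma up to 0.5% of the image width
        img = cv2.GaussianBlur(img, (0, 0), rng.uniform(1e-3, 0.005 * w))
    elif choice == 3:  # motion blur, random angle, length up to 1.2% of width
        length = max(3, int(rng.uniform(0, 0.012 * w)))
        kernel = np.zeros((length, length))
        kernel[length // 2, :] = 1.0               # horizontal line kernel ...
        rot = cv2.getRotationMatrix2D((length / 2, length / 2),
                                      rng.uniform(0, 360), 1.0)
        kernel = cv2.warpAffine(kernel, rot, (length, length))  # ... rotated
        img = cv2.filter2D(img, -1, kernel / kernel.sum())
    return img         # choice == 4: image left unchanged
```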

Fig. 5. Morphing attacks after (left half) and before (right half) preprocessing: Gaussian blur, Gaussian noise, motion blur, salt-and-pepper noise (from left to right)

3.3 Test Set Selection


We use 4-fold cross-validation to evaluate the robustness of our trained neural networks. Since the samples in our data are not independent, because a morphing attack is created by averaging two samples (original face images), the separation of the datasets for the 4-fold cross-validation has to take this dependency into account. Hence, we separate the dataset of morphed and original face images slightly differently from classical 4-fold cross-validation. The test and training sets are created as follows (see the sketch below):

1. We separate the individuals into four disjoint sets I_1, I_2, I_3, I_4 of the same size.
2. We create the test sets T_1, T_2, T_3, T_4, with T_i containing all images showing individuals in I_i and all morphing attacks based on individuals in I_i.
3. A training set T̄_i contains all images that are not in T_i.

By separating the images that way, some morphing attacks will appear in two test sets, but we remove correlations between test and training data.
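A sketch of this dependency-aware split follows; the mapping from images to the individuals they show (one ID for genuine images, two for morphs) is an assumed bookkeeping structure.

```python
import numpy as np

def make_folds(image_ids, all_individuals, k=4, seed=0):
    """image_ids: dict mapping an image to the IDs of the shown individuals."""
    rng = np.random.default_rng(seed)
    individuals = rng.permutation(list(all_individuals))
    groups = np.array_split(individuals, k)        # the sets I_1, ..., I_k
    folds = []
    for i in range(k):
        test_ids = set(groups[i])
        # T_i: all images showing, or morphed from, individuals in I_i.
        test = [img for img, ids in image_ids.items()
                if any(p in test_ids for p in ids)]
        # Training set: everything with no overlap with I_i.
        train = [img for img, ids in image_ids.items()
                 if not any(p in test_ids for p in ids)]
        folds.append((train, test))
    return folds
```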

4 Experiments
In this section, we describe our experimental setup, in particular the preprocessing of the data to fit the requirements of the networks, e.g. the input size; outline the network architectures we use for image forgery detection; and present the accuracy of the networks for this task. For each network architecture, we analyze the morphing detection accuracy both for networks trained from scratch and for training started from pretrained models, which were trained for the task of object classification on the ILSVRC dataset.

4.1 Network Architectures


We analyze three different deep convolutional network architectures, AlexNet, GoogLeNet and VGG19, which are all well known from object classification tasks. All three networks have nearly the same input image size: 224x224 pixels for the VGG19 and GoogLeNet architectures and 227x227 pixels for AlexNet. While AlexNet was the first DNN that revolutionized the area of object classification and its design focuses on performance on two graphics cards, the VGG19 architecture was designed to be simple but powerful: all of its convolutional layers have kernels of size 3x3 and exactly one predecessor and one successor layer. In contrast to the AlexNet and VGG19 architectures, GoogLeNet has a quite complex structure. Its key components are inception modules [14]. These modules consist of multiple differently sized convolution filters that process the same input and whose results are concatenated. Using this type of module, GoogLeNet achieves a similar accuracy on object classification tasks as VGG19, but needs less than one-tenth of the parameters VGG19 needs.

4.2 Preprocessing on Network Input


To reduce unnecessary variety that can be removed in a trivial preprocessing step, we rotate each image in the image plane such that the eyes are at the same height. All networks were fed with a facial image that contains only the inner part of the face, i.e. the region between the eyebrows and the mouth and between the outer parts of the eyes. Since neural networks need images of fixed size as input, the images are rescaled such that this region fits the input size of the network. In order to get more variety into the data, the region extracted from the rotated and scaled image is randomly shifted by up to 2 pixels in any direction. This process does not harm the correctness of the data, since the features in the face are not aligned anyway: the position and size of the eyebrows, nose and pupils vary from person to person, and moreover the detected borders of the region are subject to small inaccuracies.
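This preparation can be sketched as follows, assuming the 68-point dlib landmark convention introduced in Section 2.2; the exact crop margins and the boundary handling are assumptions.

```python
import cv2
import numpy as np

def prepare_input(img, landmarks, out_size=224, rng=np.random.default_rng()):
    """Rotate to level the eyes, crop the inner face, rescale and jitter."""
    left_eye = landmarks[36:42].mean(axis=0)     # 68-point convention
    right_eye = landmarks[42:48].mean(axis=0)
    # Rotate in the image plane such that both eyes are at the same height.
    dx, dy = right_eye - left_eye
    angle = np.degrees(np.arctan2(dy, dx))
    cx, cy = (left_eye + right_eye) / 2.0
    rot = cv2.getRotationMatrix2D((float(cx), float(cy)), angle, 1.0)
    img = cv2.warpAffine(img, rot, (img.shape[1], img.shape[0]))
    lm = landmarks @ rot[:, :2].T + rot[:, 2]    # rotate the landmarks as well
    # Inner face region: between the outer eye corners (36, 45) and
    # between the eyebrows (17-26) and the mouth (48-67).
    x0, x1 = int(lm[36, 0]), int(lm[45, 0])
    y0, y1 = int(lm[17:27, 1].min()), int(lm[48:68, 1].max())
    # Random shift of up to 2 pixels in any direction.
    sx, sy = rng.integers(-2, 3, size=2)
    crop = img[y0 + sy:y1 + sy, x0 + sx:x1 + sx]
    return cv2.resize(crop, (out_size, out_size))
```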

4.3 Training

We trained all three network architectures from scratch as well as starting from pretrained models, which were trained for the task of object classification on the ILSVRC dataset, using the deep learning framework Caffe [25]. The average color that is recommended to be subtracted before feeding the network was set in all experiments to the average color of the ILSVRC dataset, since the pretrained networks use this average color - except for AlexNet, which used an average image with the same average color. The training samples for one epoch were shuffled, and some original images were duplicated such that the probability of selecting a morphing attack equals the probability of selecting an original face image (see the sketch below). The networks that were trained from scratch were initialized randomly, except for the VGG19, which was initialized using the random initialization procedure of [26]. In both cases, pretrained and trained from scratch, the training was terminated after the loss had not improved for several epochs.
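The class-balanced epoch construction can be sketched as follows:

```python
import numpy as np

def balanced_epoch(originals, morphs, rng=np.random.default_rng()):
    """Duplicate originals so that both classes are drawn equally often."""
    padded = list(originals)
    n_extra = len(morphs) - len(originals)
    if n_extra > 0:   # duplicate randomly chosen original face images
        padded += list(rng.choice(originals, size=n_extra, replace=True))
    epoch = [(img, 0) for img in padded] + [(img, 1) for img in morphs]
    rng.shuffle(epoch)   # shuffle the (image, label) pairs of the epoch
    return epoch
```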

4.4 Results

The training of all networks converged, and the average loss on images of the training set was below $5 \cdot 10^{-5}$ for all networks. The accuracy in terms of False Acceptance Rate (FAR) and False Rejection Rate (FRR) is shown in Table 2. The FAR is defined as the fraction of morphing attacks classified as genuine images, and the FRR as the fraction of genuine images classified as morphing attacks. The FRR of the pretrained VGG19 is only about a third of the FRR of the pretrained AlexNet and 2.1 percentage points lower than the FRR of the pretrained GoogLeNet. All network architectures performed better when the training was started from pretrained networks instead of learning all weights from scratch. In terms of FRR, GoogLeNet and VGG19 outperformed AlexNet in both cases, while the FAR is nearly the same for all architectures, but depends on the initialization of the weights.

             AlexNet       GoogLeNet     VGG19
             FRR    FAR    FRR    FAR    FRR    FAR
From scratch 16.2%  1.9%   10.0%  1.8%   10.9%  2.2%
Pretrained   11.4%  0.9%   5.6%   1.2%   3.5%   0.8%

Table 2. Accuracy of the trained networks in terms of False Rejection Rate (FRR) and False Acceptance Rate (FAR)
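As a worked example of these definitions, FAR and FRR can be computed from binary labels and predictions (1 = morphing attack, 0 = genuine image) as follows:

```python
import numpy as np

def far_frr(labels, predictions):
    """FAR: attacks accepted as genuine; FRR: genuine rejected as attacks."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    far = np.mean(predictions[labels == 1] == 0)
    frr = np.mean(predictions[labels == 0] == 1)
    return far, frr
```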

5 Conclusion

In this paper, we proposed a morphing attack detection method based on deep convolutional neural networks. A fully automatic face image morphing pipeline with exchangeable components was presented and used to create training and test samples for the networks. Instead of detecting classical traces of tampering, e.g. caused by resampling or JPEG double compression, and their anti-forensic methods [27–29], we focused on semantic artifacts. To avoid learning a detector for these classical tampering traces, all images were preprocessed by scaling, rotating and cropping before being fed to the networks. In addition, we added different kinds of noise and blur to the training and test data.
We trained three different convolutional neural network architectures, both from scratch and starting from pretrained networks. The FRRs of our trained networks range between 3.5% and 16.2%, and the FARs between 0.8% and 2.2%. The pretrained VGG19 achieved the best result for both rates, with an FRR of 3.5% and an FAR of 0.8%. The pretrained networks outperformed the networks trained from scratch for every architecture. This suggests that the features learned for object classification are also useful for the detection of morphing attacks.
In future work, we plan to analyze the decisions made by our networks. In particular, we plan to
study the regions that contribute to the decision of a network and analyze the differences between
different architectures and pretrained networks using the LRP toolbox [30].

Acknowledgment

The work in this paper has been funded in part by the German Federal Ministry of Education and
Research (BMBF) through the Research Program ANANAS under Contract No. FKZ: 16KIS0511.

References

1. Spreeuwers, L.J., Hendrikse, A.J., Gerritsen, K.J.: Evaluation of automatic face recognition for automatic border control on actual data recorded of travellers at Schiphol Airport. In: Proceedings of the International Conference of the Biometrics Special Interest Group (BIOSIG). (Sept 2012) 1–6
2. Ferrara, M., Franco, A., Maltoni, D.: The magic passport. In: IEEE International Joint Conference on
Biometrics. (Sept 2014) 1–7
3. Popescu, A.C., Farid, H.: Exposing digital forgeries by detecting traces of resampling. IEEE Transactions on Signal Processing 53(2) (Feb 2005) 758–767
4. Kirchner, M., Gloe, T.: On resampling detection in re-compressed images. In: 2009 First IEEE International Workshop on Information Forensics and Security (WIFS). (Dec 2009) 21–25
5. Kirchner, M., Fridrich, J.: On detection of median filtering in digital images. Proc. SPIE 7541 (2010)
754110–754110–12
6. Lukáš, J., Fridrich, J.: Estimation of primary quantization matrix in double compressed JPEG images. In: Proc. of DFRWS. (2003)
7. Farid, H.: Exposing digital forgeries from JPEG ghosts. IEEE Transactions on Information Forensics and Security 4(1) (March 2009) 154–160
8. Johnson, M.K., Farid, H.: Exposing digital forgeries through specular highlights on the eye. In:
Information Hiding. Volume 4567 of Lecture Notes in Computer Science. (2008) 311–325
9. Kee, E., O'Brien, J.F., Farid, H.: Exposing photo manipulation from shading and shadows. ACM Trans. Graph. 33(5) (September 2014) 165:1–165:21
10. Makrushin, A., Neubert, T., Dittmann, J.: Automatic generation and detection of visually faultless
facial morphs. In: Proceedings of the 12th International Joint Conference on Computer Vision, Imaging
and Computer Graphics Theory and Applications (VISIGRAPP 2017). (2017) 39–50
11. Raghavendra, R., Raja, K.B., Busch, C.: Detecting morphed face images. In: 8th IEEE International
Conference on Biometrics Theory, Applications and Systems, BTAS 2016, Niagara Falls, NY, USA,
September 6-9, 2016. (2016) 1–7
12. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural
networks. In: Advances in Neural Information Processing Systems 25. Curran Associates, Inc. (2012)
1097–1105
13. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR
abs/1409.1556 (2014)
14. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Computer Vision and Pattern Recognition (CVPR). (2015)
15. Bayar, B., Stamm, M.C.: A deep learning approach to universal image manipulation detection using
a new convolutional layer. In: Proceedings of the 4th ACM Workshop on Information Hiding and
Multimedia Security, New York, NY, USA, ACM (2016) 5–10
16. King, D.E.: Dlib-ml: A machine learning toolkit. J. Mach. Learn. Res. 10 (December 2009) 1755–1758
17. Kazemi, V., Sullivan, J.: One millisecond face alignment with an ensemble of regression trees. In:
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. CVPR ’14,
Washington, DC, USA, IEEE Computer Society (2014) 1867–1874
18. Beier, T., Neely, S.: Feature-based image metamorphosis. In: Proceedings of the 19th Annual Conference
on Computer Graphics and Interactive Techniques. SIGGRAPH ’92, New York, NY, USA, ACM (1992)
35–42
19. Delaunay, B.: Sur la sphère vide. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskikh i Estestvennykh Nauk 7 (1934) 793–800
20. Pérez, P., Gangnet, M., Blake, A.: Poisson image editing. In: ACM SIGGRAPH 2003 Papers. SIGGRAPH '03, New York, NY, USA, ACM (2003) 313–318
21. Prestele, B., Schneider, D.C., Eisert, P.: System for the automated segmentation of heads from arbitrary
background. In: ICIP, IEEE (2011) 3257–3260
22. Paier, W., Kettern, M., Hilsmann, A., Eisert, P.: Video-based facial re-animation. In: Proceedings
of the 12th European Conference on Visual Media Production, London, United Kingdom, November
24-25, 2015. (2015) 4:1–4:10
23. Neurotechnology Inc.: VeriLook 9.0/MegaMatcher 9.0 face identification technology. http://www.neurotechnology.com/ (2017)
24. FRONTEX - Research and Development Unit: Best practice technical guidelines for automated border control (ABC) systems - v2.0 (2012)
25. Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.:
Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)
26. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2010). (2010)
27. Stamm, M.C., Liu, K.J.R.: Anti-forensics of digital image compression. IEEE Transactions on Information Forensics and Security 6(3) (Sept 2011) 1050–1065
28. Yu, J., Zhan, Y., Yang, J., Kang, X.: A multi-purpose image counter-anti-forensic method using convolutional neural networks. Springer International Publishing, Cham (2017) 3–15
29. Kirchner, M., Böhme, R.: Hiding traces of resampling in digital images. IEEE Transactions on Information Forensics and Security 3(4) (Dec 2008) 582–592
30. Lapuschkin, S., Binder, A., Montavon, G., Müller, K.R., Samek, W.: Analyzing classifiers: Fisher
vectors and deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition (CVPR). (2016) 2912–2920
