Face Liveness Detection: Department of ECE Cmrcet
CHAPTER I
INTRODUCTION
Introduction:
The general public has an immense need for security measures against spoof attacks. Biometrics
is the fastest growing segment of the security industry. Familiar identification techniques
include facial recognition, fingerprint recognition, handwriting verification, hand geometry,
and retinal and iris scanning. Among these, face recognition has developed rapidly in recent
years, and it is more direct, user friendly and convenient than the other methods. It has
therefore been applied to various security systems. In general, however, face recognition
algorithms cannot differentiate a 'live' face from a 'not live' face, which is a major security
issue. Face recognition systems are easy to spoof with facial pictures such as portrait
photographs. To guard against such spoofing, a secure system needs liveness detection.
Liveness detection has been a very active research topic in the fingerprint recognition and
iris recognition communities in recent years. In face recognition, however, approaches to this
problem are still very limited. Liveness detection is the act of dividing the feature space
into live and non-living classes. Imposters will try to introduce a large number of spoofed
biometrics into the system, and liveness detection improves the performance of a biometric
system against them. It is an important and challenging issue which determines the
trustworthiness of biometric system security against spoofing. In face recognition, the usual
attack methods may be classified into several categories according to the verification proof
presented to the face verification system: a stolen photo, stolen face photos, recorded video,
3D face models with the ability to blink and move the lips, 3D face models with various
expressions, and so on. The anti-spoofing problem should be solved well before face
recognition systems can be widely applied in daily life.
CHAPTER II
LITERATURE REVIEW
There are many approaches implemented in Face Liveness Detection. In this section, some of
the most interesting liveness detection methods are presented.
The first approach was proposed by Gahyun Kim et al. [1]. Its purpose is to differentiate
between a live face and a fake face (2-D paper mask) in terms of shape and level of detail.
The authors proposed a single-image fake face detection method based on frequency and texture
analyses. For the frequency analysis they used a power-spectrum-based method, which exploits
both the low-frequency information and the information residing in the high-frequency regions.
For the texture analysis they used a description method based on the Local Binary Pattern
(LBP). The authors gave two reasons for using frequency information. First, the presence or
absence of a 3-D shape leads to a difference in the low-frequency regions, which are related
to the illumination component generated by the overall shape of a face. Second, the difference
in detail between live faces and masks causes a discrepancy in the high-frequency information.
Texture information is used because images taken of 2-D objects (especially their illumination
components) tend to lose texture information compared with images taken of 3-D objects. Three
kinds of feature extraction are implemented: frequency-based, texture-based and fusion-based.
To extract the frequency information, the authors first transformed the facial image into the
frequency domain with the 2-D discrete Fourier transform. The transformed result is then
divided into several groups of concentric rings such that each ring represents a corresponding
region in the frequency band. Finally, a 1-D feature vector is obtained by combining the
average energy values of all the concentric rings. For texture-based feature extraction, they
used the Local Binary Pattern (LBP), one of the most popular techniques for describing the
texture information of images. For fusion-based feature extraction, the authors used Support
Vector Machine (SVM) classifiers to learn liveness detectors from the feature vectors
generated by the power-spectrum-based and LBP-based methods. The fusion-based method forms its
feature vector by combining the decision values of the SVM classifier trained on
power-spectrum-based feature vectors and the SVM classifier trained on LBP-based feature
vectors. The authors used two databases for their experiments: the BERC Webcam Database and
the BERC ATM Database. All the images in the webcam database were captured under three
different illumination conditions, and the fake (non-live) faces were captured from printed
paper, magazine and caricature images. The experimental results showed that the LBP-based
method gives more promising results than the frequency-based method when images are captured
from prints and caricatures. Overall, the fusion-based method showed the best result, with an
error rate of 4.42%, compared with 5.43% for the frequency-based method and 12.46% for the
LBP-based method.
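The ring-based frequency feature described above can be sketched in Python as follows. This is only an illustrative sketch and not the authors' implementation: the function name ring_energy_features, the number of rings (8) and the equal-width ring spacing are assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def ring_energy_features(gray, n_rings=8):
    """Average power-spectrum energy in concentric frequency rings.

    gray    : 2-D array holding a grayscale face image
    n_rings : number of concentric rings (assumed value, not from the paper)
    """
    gray = np.asarray(gray, dtype=float)
    # 2-D DFT, shifted so that the zero frequency sits at the array centre.
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    power = np.abs(spectrum) ** 2

    # Radial distance of every frequency bin from the centre.
    rows, cols = gray.shape
    y, x = np.indices((rows, cols))
    radius = np.hypot(y - rows // 2, x - cols // 2)

    # Split the radius range into equal-width rings and average each one.
    edges = np.linspace(0.0, radius.max(), n_rings + 1)
    features = np.zeros(n_rings)
    for i in range(n_rings):
        if i == n_rings - 1:
            mask = (radius >= edges[i]) & (radius <= edges[i + 1])
        else:
            mask = (radius >= edges[i]) & (radius < edges[i + 1])
        if mask.any():
            features[i] = power[mask].mean()
    return features
```

The resulting 1-D vector has one entry per ring, ordered from low to high frequency, and could be passed to any classifier in place of the raw image.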
Face liveness detection using variable focusing was implemented by Sooyeon Kim et al. [3].
The key idea is to use the variation of pixel values between two images taken sequentially at
different focus settings, which is an ordinary camera function. Assuming that there is no
large movement, the authors looked for the difference in focus values between real and fake
faces when two sequential images (in focus and out of focus) are collected from each subject.
For real faces, the focused regions are clear and the others are blurred because of depth
information. In contrast, there is little difference between images of a printed copy of a
face taken at different focus settings, because the print is not solid. The basic constraint
of this method is that it relies on the Depth of Field (DoF), which determines the range of
focus variation at pixels of the sequentially taken images. The DoF is the range between the
nearest and farthest objects that appear in focus at a given focus setting. To increase
liveness detection performance, the authors increased the out-of-focus effect, for which the
DoF should be narrow. In this method, the Sum Modified Laplacian (SML) is used to measure
focus values. The SML represents the degree of focusing in an image, and its values are
obtained with a transformed second-order differential filter.
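As a rough illustration of this focus measure, the sketch below computes a Sum Modified Laplacian map with NumPy. It follows the commonly used definition of the modified Laplacian (absolute second differences along rows and columns, summed over a small window). The step size, window size and threshold are assumed values, and the function names are hypothetical rather than taken from the paper.

```python
import numpy as np


def modified_laplacian(gray, step=1):
    """|2I - left - right| + |2I - up - down| at every pixel."""
    gray = np.asarray(gray, dtype=float)
    padded = np.pad(gray, step, mode="edge")  # replicate border pixels
    centre = padded[step:-step, step:-step]
    left = padded[step:-step, :-2 * step]
    right = padded[step:-step, 2 * step:]
    up = padded[:-2 * step, step:-step]
    down = padded[2 * step:, step:-step]
    return np.abs(2 * centre - left - right) + np.abs(2 * centre - up - down)


def sum_modified_laplacian(gray, window=1, threshold=0.0):
    """Sum the modified Laplacian over a (2*window+1)^2 neighbourhood."""
    ml = modified_laplacian(gray)
    ml = np.where(ml >= threshold, ml, 0.0)  # ignore weak responses
    padded = np.pad(ml, window, mode="constant")
    rows, cols = ml.shape
    sml = np.zeros_like(ml)
    size = 2 * window + 1
    for dy in range(size):
        for dx in range(size):
            sml += padded[dy:dy + rows, dx:dx + cols]
    return sml
```

A sharply focused region gives large SML values, while a defocused or flat region gives values close to zero.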
In the first step, two sequential pictures are taken by focusing the camera on facial
components. One is focused on the nose and the other on the ears. The nose is the closest to
the camera lens, while the ears are the farthest. The depth gap between them is sufficient to
express a 3D effect. In the second step, the SMLs of both pictures are calculated in order to
judge the degree of focusing. The third step is to get the difference of the SMLs. For
one-dimensional analysis, the sums of the differences of SMLs (DoS) in each column are
calculated. The authors found that the sums of DoS of real faces consistently show similar
patterns, whereas those of fake faces do not. The differences in the patterns between real and
fake faces are used as features to detect face liveness. For testing, the authors considered
the False Acceptance Rate (FAR) and the False Rejection Rate (FRR). FAR is the proportion of
fake images misclassified as real, and FRR is the proportion of real images misclassified as
fake. The experimental results showed that when the Depth of Field (DoF) is very small, FAR is
2.86% and FRR is 0.00%, but when the DoF is large, the average FAR and FRR increase. The
results thus showed that this method depends crucially on the DoF, and that for better results
it is very important to make the DoF small.
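The two error rates defined above can be computed directly from labelled test results. The following minimal sketch assumes that each sample carries a boolean ground-truth label and a boolean prediction, with True meaning "real face"; the function name far_frr is hypothetical.

```python
def far_frr(labels, predictions):
    """Return (FAR, FRR) for boolean labels/predictions, True = real face."""
    pairs = list(zip(labels, predictions))
    fake_total = sum(1 for y, _ in pairs if not y)
    real_total = sum(1 for y, _ in pairs if y)
    # Fake samples that were accepted as real.
    false_accepts = sum(1 for y, p in pairs if not y and p)
    # Real samples that were rejected as fake.
    false_rejects = sum(1 for y, p in pairs if y and not p)
    far = false_accepts / fake_total if fake_total else 0.0
    frr = false_rejects / real_total if real_total else 0.0
    return far, frr
```

For example, with four fake samples of which one is accepted, and two real samples of which one is rejected, the function returns a FAR of 0.25 and an FRR of 0.5.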
A component-based face coding approach for liveness detection was employed by Jianwei Yang
et al. [9]. The authors proposed a method which consists of four steps:
(1) Locating the components of the face
(2) Coding the low-level features respectively for all the components
(3) Deriving the high-level face representation by pooling the codes with weights derived
from the Fisher criterion
(4) Concatenating the histograms from all components and feeding them into a classifier for
identification.
The approach builds on the following differences between genuine faces and fake ones:
(1) Faces are blurred because of the limited resolution of photos or screens and the defocus
of the camera
(2) Face appearance varies more or less because of the reflectance change caused by the Gamma
Correction of the camera
(3) Face appearance also changes because of abnormal shading on the surfaces of photos and
screens.
At first, the authors expanded the detected face to obtain the holistic face (H-Face). The
H-Face is then divided into six components (parts): contour region, facial region, left eye
region, right eye region, mouth region and nose region. Moreover, the contour region and the
facial region are each further divided into 2 × 2 grids. For all twelve components, dense
low-level features (e.g., LBP, LPQ, HOG, etc.) are extracted. Given the densely extracted
local features, a component-based coding is performed with an offline trained codebook to
obtain local codes. The codes are then concatenated into a high-level descriptor with weights
derived from Fisher criterion analysis. The Fisher ratio is used to describe the difference in
micro textures between genuine faces and fake faces. Finally, the authors feed the features
into a support vector machine (SVM) classifier.
For experimentation, the authors have used three different kinds of databases: NUAA
Database, CASIA Database and Print-Attack Database. The authors showed that the proposed
approach achieved better performance for all the databases.
A technique that combines standard techniques in 2D face biometrics was introduced by
Kollreider et al. They examined the problem using real-time techniques and applied them to
real-life spoofing scenarios in an indoor environment. First, the algorithm searches for
faces; if a face is detected, a timer is started to define the period for collecting evidence.
Evidence is then collected for the liveness of the face: 3D properties, eye blinking or mouth
movements are analyzed in non-interactive mode. If no such evidence is found, responses are
requested from the user and checked at random. After the time period expires, the liveness of
the face is verified. For experimentation, a low-cost webcam that delivered 320x240 pixel
frames at 25 fps was employed, and computation was done on a standard laptop. The authors
suggested that the proposed method is efficient enough for public use.
A method based on the optical flow field was introduced by Bao et al. It analyzes the
differences and properties of optical flow generated from 3D objects and 2D planes. The motion
of the optical flow field is a combination of four basic movement types: translation,
rotation, moving and swing. The authors found that the first three types generate quite
similar optical flow fields for both 2D and 3D objects, and that the fourth type creates the
actual differences in the optical flow field. Their approach is based on the idea that the
optical flow field for 2D objects can be represented as a projection transformation. The
optical flow allows a reference field to be deduced, which in turn makes it possible to
determine whether the test region is planar or not. For that, the difference between the
optical flow fields is calculated. To decide whether a face is real, this difference is
compared against a threshold. The experiment was conducted on three groups of sample data. The
first group contained 100 printed face pictures that were translated and randomly rotated; the
second group contained 100 pictures from group 1 that were folded and curled before the test;
the third group consisted of faces of real people (10 people, 10 times each) making gestures
such as swinging and shaking. The authors conducted the experiment for 10 seconds. The camera
had a sampling rate of 30 frames per second, and the calculation was done every 10 frames.
A combination of face parts detection and optical flow field estimation for face liveness
detection was introduced by Kollreider et al. [6]. This approach is able to differentiate
between motion of points and motion of lines. The authors suggested a method which analyzes
the trajectories of single parts of a live face. The information obtained can be used to
decide whether a printed image was used. The approach uses a model-based Gabor decomposition
and an SVM for detection of face parts. The basic idea rests on the assumption that a 3D face
generates a 2D motion which is higher at central face regions than at outer face regions such
as the ears. Parts which are farther away therefore move differently from parts which are
nearer to the camera, whereas a photograph generates a constant motion across the different
face regions. With the positions of the face parts and their velocities, it is possible to
compare how fast they move in relation to each other [17]. This information is used to
differentiate a live face from a photograph. The database used contained 100 videos from the
Head Rotation Shot subset (DVD002 media) of the XM2VTS database. All data were downsized to
300x240 pixels. Videos were cut (3 to 5 frames) and used for live and non-live sequences.
Each person's last frame was taken and translated horizontally and vertically to get two
non-live sequences per
person. Therefore, 200 live and 200 non-live sequences were examined. Most of the live
sequences achieved a score of 0.75 out of 1, whereas the non-live sequences achieved a score
of less than 0.5. It was also noticed that glasses and moustaches lowered the score, as they
were close to the camera. The authors mentioned that the system would be error free if only
sequences containing horizontal movements were used. By treating a liveness score greater than
0.5 as alive, the proposed system separates the 400 test sequences with an error rate of 0.75%.
In the method of Tan et al. [11], the standard sparse logistic regression classifier was
extended both nonlinearly and spatially for classification, to improve its generalization
capability under high dimensionality and small sample sizes. The authors found that the
nonlinear sparse logistic regression significantly improves the anti-photo-spoof performance,
while the spatial extension leads to a sparse low-rank bilinear logistic regression model. To
evaluate their method, the authors collected a publicly available large photograph-imposter
database containing over 50K photo images from 15 subjects. Preliminary experiments on this
database show that the proposed method gives good detection performance, with the advantages
of real-time testing, non-intrusiveness and no requirement for extra hardware.
Although Tan et al. presented very effective results in their work [11], they overlooked the
problem of bad illumination conditions. Peixoto et al. [12] extended that work to deal with
images under bad illumination, for spoof attempts coming either from a laptop display or from
high-quality printed images. The key observation is that the brightness of an LCD screen
affects the captured image in such a way that the high-frequency regions become prone to a
“blurring” effect, because pixels with higher values brighten their neighbourhood. As a
result, fake images show fewer borders than real face images.
The authors detect whether an image is a spoof by exploring this information. First, they
analyze the image with a Difference of Gaussian (DoG) filter, which uses two Gaussian filters
with different standard deviations as limits. The idea is to keep the high-middle frequencies,
so that borders are detected while noise is removed. However, DoG filtering does not detect
the borders properly under bad illumination conditions. For the classification stage, the
authors used a Sparse Logistic Regression Model similar to the model in Tan et al. [11]. To
minimize the effects of bad illumination, the image was pre-processed to homogenize it, so
that the illumination changes become more controlled. For this the authors used
contrast-limited adaptive histogram equalization (CLAHE), which operates on small regions of
the image called tiles. Experimental results were reported for the NUAA Imposter Database of
Tan et al. and for the extension for bad illumination proposed by Peixoto et al.
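The Difference of Gaussian filtering step can be illustrated with the short NumPy sketch below. It is not the implementation used by Tan et al. or Peixoto et al.: the two standard deviations (0.5 and 1.0), the kernel radius of about three standard deviations and the edge-replicating border handling are all assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=float)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(gray, sigma):
    """Separable Gaussian blur that keeps the input shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(np.asarray(gray, dtype=float), radius, mode="edge")
    # Blur along rows first, then along columns.
    rows_done = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows_done, kernel, mode="valid")


def difference_of_gaussians(gray, sigma_low=0.5, sigma_high=1.0):
    """Band-pass response: narrow blur minus wide blur (assumed sigmas)."""
    return gaussian_blur(gray, sigma_low) - gaussian_blur(gray, sigma_high)
```

Flat regions give a response close to zero, while borders produce a clear non-zero response, which is the property the spoof detector relies on.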
CHAPTER III
PROPOSED METHOD
3. Proposed system:
The proposed liveness detection system is based on color texture and image distortion
analysis. The details of the proposed system are as follows.
A. Colour Spaces
RGB is the most used colour space for sensing, representing and displaying colour images.
However, its application in image analysis is quite limited due to the high correlation
between the three colour components (red, green and blue) and the imperfect separation of
the luminance and chrominance information. On the other hand, the different colour channels
can be more discriminative for detecting recapturing artefacts, i.e. providing higher contrast
for different visual cues from natural skin tones.
In this work, we considered two other colour spaces, HSV and YCbCr, to explore the colour
texture information in addition to RGB. Both of these colour spaces are based on the
separation of the luminance and the chrominance components. In the HSV colour space, the hue
and saturation dimensions define the chrominance of the image while the value dimension
corresponds to the luminance. The YCbCr space separates the RGB components into
luminance (Y), chrominance blue (Cb) and chrominance red (Cr).
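The two colour-space conversions can be sketched for a single 8-bit pixel as follows. The HSV conversion uses the colorsys module from the Python standard library. For YCbCr, the text does not say which variant is used, so the full-range ITU-R BT.601 coefficients (as used in JPEG) are assumed here; the function names are hypothetical.

```python
import colorsys


def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 conversion for 8-bit RGB values (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def rgb_to_hsv(r, g, b):
    """Hue, saturation and value, each scaled to the range 0..1."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
```

For a neutral grey pixel the two chrominance channels Cb and Cr sit at their midpoint of 128 and the saturation is zero, which shows how both spaces separate luminance from chrominance.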
B. Texture descriptors:
In principle, texture descriptors originally designed for gray-scale images can be applied to
colour images by combining the features extracted from different colour channels. In the
present study, the colour texture of the face images is analysed using five descriptors: Local
Binary Patterns (LBP), Co-occurrence of Adjacent Local Binary Patterns (CoALBP), Local
Phase Quantization (LPQ), Binarized Statistical Image Features (BSIF) and Scale-Invariant
Descriptor (SID), which have been shown to be very promising features in prior studies on
gray-scale texture based face anti-spoofing. Detailed descriptions of each of these features
are presented in the following.
Local Binary Patterns (LBP): The LBP descriptor proposed by Ojala et al. [50] is a highly
discriminative gray-scale texture descriptor. For each pixel in an image, a binary code is
computed by thresholding a circularly symmetric neighbourhood with the value of the central
pixel.
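The thresholding step can be illustrated with the basic 8-neighbour LBP on a 3x3 patch, which approximates the circular neighbourhood of radius 1. This is a minimal sketch in plain Python: the bit ordering (clockwise from the top-left corner) is one common convention and is an assumption, and the function names are hypothetical.

```python
def lbp_code(patch):
    """LBP code of the centre pixel of a 3x3 patch (list of three rows)."""
    centre = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    rows, cols = len(image), len(image[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

For colour texture analysis, such a histogram would be computed separately for each colour channel and the histograms concatenated.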
Specular reflection component image has been widely used for specular reflection removal
and face illumination normalization. In this paper, we separate the specular reflection
component Is from an input face image or video frame utilizing an iterative method (with 6
iterations) proposed in, which assumes that the illumination is
Given that most of the face images (in the Idiap, CASIA, and MSU databases) are captured
indoors under relatively controlled illumination, these three assumptions are reasonable.
After calculating the specular reflection component image Is, we represent the specularity
intensity distribution with three dimensional features: i) specular pixel percentage r, ii) mean
intensity of specular pixels μ, and iii) variance of specular pixel intensities σ.
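Once the specular component image is available, the three features can be computed as in the sketch below. The separation of the specular component itself is not shown. The rule for deciding which pixels count as specular is not given in this section, so a simple intensity threshold is assumed here; the function name and the threshold parameter are hypothetical.

```python
from statistics import mean, pvariance


def specular_features(specular_pixels, threshold=0.0):
    """Return (r, mu, sigma) for a flat list of specular-component intensities.

    A pixel is treated as specular when its intensity exceeds `threshold`
    (assumed rule, not taken from the source).
    """
    bright = [v for v in specular_pixels if v > threshold]
    if not bright:
        return 0.0, 0.0, 0.0
    ratio = len(bright) / len(specular_pixels)  # specular pixel percentage r
    return ratio, mean(bright), pvariance(bright)
```

For the intensities 0, 0, 2 and 4 with the default threshold, half of the pixels are specular, their mean intensity is 3 and their variance is 1.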
D. Blurriness Features:
For short distance spoof attacks, spoof faces are often defocused in mobile phone cameras.
The reason is that the spoofing media (printed paper, tablet screen, and mobile phone
screen) usually have limited size, and attackers have to place them close to the camera in
order to conceal the boundaries of the attack medium. As a result, spoof faces tend to be
defocused, and the image blur due to defocus can be used as another cue for anti-spoofing.
We utilize two types of blurriness features (denoted as b1 and b2) proposed in earlier work.
For b1, blurriness is measured based on the difference between the original input image and
its blurred version. The larger the difference, the lower the blurriness in the original
image. For b2, blurriness is measured based on the average edge width in the input image.
Both methods output a no-reference (without a clear image as reference) blurriness score
between 0 and 1, but emphasize different measures of blurriness.
Recaptured face images tend to show a different color distribution compared to colors in the
genuine face images. This is caused by the imperfect color reproduction property of printing
and display media. This chromatic degradation was explored in [35] for detecting recaptured
images, but its effectiveness in spoof face detection is unknown. Since the absolute color
distribution is dependent on illumination and camera variations, we propose to devise
invariant features to detect abnormal chromaticity in spoof faces. That is, we first convert the
normalized facial image from the RGB space into the HSV (Hue, Saturation, and Value)
space and then compute the mean, deviation, and skewness of each channel as a chromatic
feature. Since these three features are equivalent to the three statistical moments in each
channel, they are also referred to as chromatic moment features. Besides these three features,
the percentages of pixels in the minimal and maximal histogram bins of each channel are
used as two additional features.
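The chromatic moment features can be sketched as follows for a list of 8-bit RGB pixels. Several details are assumptions made for this example: skewness is taken as the third standardised moment, the histogram uses 32 bins per channel, and the "minimal and maximal histogram bins" are read as the least and most populated bins. The function names are hypothetical.

```python
import colorsys
import math


def channel_features(values, bins=32):
    """Return [mean, deviation, skewness, min-bin share, max-bin share]."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    # Third standardised moment (assumed definition of skewness);
    # a (near-)constant channel is given zero skewness.
    if sigma < 1e-12:
        skew = 0.0
    else:
        skew = sum((v - mu) ** 3 for v in values) / (n * sigma ** 3)
    # Histogram over [0, 1]; a value of exactly 1.0 goes into the last bin.
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    return [mu, sigma, skew, min(hist) / n, max(hist) / n]


def chromatic_moment_features(rgb_pixels, bins=32):
    """15-dimensional feature: five values for each of H, S and V."""
    hsv = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
           for r, g, b in rgb_pixels]
    features = []
    for channel in zip(*hsv):  # hue, saturation, value in turn
        features.extend(channel_features(channel, bins))
    return features
```

With three moments and two histogram percentages per channel, the three HSV channels give 15 values in total.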
Another important difference between genuine and spoof faces is the color diversity. In
particular, genuine faces tend to have richer colors. This diversity tends to fade out in spoof
faces due to the color reproduction loss during image/video recapture. In this paper, we
follow a previously proposed method to measure the image color diversity. First, color
quantization is performed on the normalized face image. Two measurements are then pooled from
the color distribution: i) the histogram bin counts of the top 100 most frequently appearing
colors, and ii) the number of distinct colors appearing in the normalized face image. The
dimensionality of the color diversity feature vector is 101.
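The two measurements can be sketched with the standard library as shown below. The quantization scheme is not specified in this section, so a uniform quantization to 32 levels per channel is assumed; the function name and the zero-padding of the top-100 list for images with fewer than 100 distinct colors are also assumptions.

```python
from collections import Counter


def color_diversity_features(rgb_pixels, levels=32, top_k=100):
    """Return top_k bin counts followed by the number of distinct colors."""
    step = 256 // levels
    # Uniform quantization of each 8-bit channel (assumed scheme).
    quantized = [(r // step, g // step, b // step) for r, g, b in rgb_pixels]
    counts = Counter(quantized)
    top = [count for _, count in counts.most_common(top_k)]
    top += [0] * (top_k - len(top))  # pad when fewer than top_k colors exist
    return top + [len(counts)]
```

With the default top_k of 100, the returned vector has 101 entries, matching the dimensionality stated above.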
The above four types of feature (specular reflection, blurriness, chromatic moment, and color
diversity) are finally concatenated together, resulting in an IDA feature vector with 121
dimensions. Although the IDA feature vector is extracted from the facial region, it contains
only image distortion information, and not any characterization of facial appearance.
Therefore, we expect that the IDA feature can alleviate the problem of training bias
encountered in the commonly used texture features.
CHAPTER IV
RESULT
4.1 Result:
This paper introduced a new texture descriptor, the Dynamic Local Ternary Pattern (DLTP), for
face liveness detection. Following Weber's law, the threshold value in DLTP is set dynamically
instead of manually. DLTP was compared with the Local Ternary Pattern (LTP), and the two
techniques were systematically examined in relation to variation of their threshold values.
For benchmarking, the performance evaluation was carried out on publicly available face spoof
databases (NUAA, Replay-Attack and CASIA) and on our self-collected UPM face spoof database.
The best threshold value of LTP was used when comparing it with DLTP for face spoof attacks.
The comparative analysis shows that DLTP outperformed LTP and other state-of-the-art
approaches for face pattern analysis in face liveness detection. The dynamic threshold in DLTP
was found to be more robust to noise in the central pixel value, and more invariant to
illumination transformation and texture variations, than LTP and other texture descriptors.
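The difference between a fixed and a dynamic threshold can be illustrated with the sketch below. The ternary coding follows the usual LTP definition, split into an upper and a lower binary pattern. The exact DLTP threshold formula is not given in this section, so weber_threshold is a hypothetical stand-in that simply makes the threshold proportional to the central pixel value, in the spirit of Weber's law; the constant k = 0.1 is an assumed value.

```python
def ltp_codes(patch, threshold):
    """Upper and lower LTP codes of the centre pixel of a 3x3 patch."""
    centre = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    upper = lower = 0
    for bit, (r, c) in enumerate(offsets):
        value = patch[r][c]
        if value >= centre + threshold:    # ternary value +1
            upper |= 1 << bit
        elif value <= centre - threshold:  # ternary value -1
            lower |= 1 << bit
        # otherwise the ternary value is 0 and neither bit is set
    return upper, lower


def weber_threshold(centre, k=0.1):
    """Hypothetical dynamic threshold proportional to the centre intensity."""
    return k * centre
```

With a fixed threshold the same tolerance is applied to dark and bright regions alike, whereas the proportional threshold widens the tolerance as the central pixel gets brighter.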
CHAPTER V
CONCLUSION
This work provided an overview of different approaches to face liveness detection. It
presented a categorization based on the type of technique and the type of liveness indicator
or clue used, which helps in understanding different spoof attack scenarios and their relation
to the developed solutions. A review of the most interesting approaches for liveness detection
was presented. The most common problems observed in many liveness detection techniques are the
effects of illumination change and of amplified noise on images, which damages the texture
information. For liveness detection methods based on blinking and eye movement, eyeglasses,
which cause reflections, must be considered in the future development of liveness detection
solutions. Furthermore, the datasets, which play an important role in the performance of
liveness detection solutions, must be informative and diverse enough to mimic the expected
application scenarios. Non-interactive video sequences must include interactive sequences
where the users perform certain tasks. Future attack datasets must consider attacks such as 3D
sculpted faces and improved texture information. Our main aim is to give a clear pathway for
the future development of more secure, user-friendly and efficient approaches for face
liveness detection.
CHAPTER VI
REFERENCES
1. Chingovska, I.; Nesli, E.; André, A.; Sébastien, M. Face Recognition Systems under
Spoofing Attacks. In Face Recognition across the Imaging Spectrum; Bourlai, T., Ed.;
Springer: Berlin/Heidelberg, Germany, 2016; pp. 165–194.
2. Parveen, S.; Ahmad, S.M.S.; Hanafi, M.; Azizun, W.A.W. Face anti-spoofing methods.
Curr. Sci. 2015, 108, 1491–1500.
3. Yi, D.; Lei, Z.; Zhang, Z.; Li, S.Z. Face anti-spoofing: Multi-spectral approach. In
Handbook of Biometric Anti-Spoofing; Marcel, S., Nixon, M.S., Li, S.Z., Eds.; Springer:
London, UK, 2014; pp. 83–102.