3D Face Recognition Based On Deep Learning

Abstract - At present, face recognition is mostly based on two-dimensional face databases. Although the accuracy of face recognition is high, the influence of lighting, pose, and expression on face recognition still exists. 3D face data contains not only 2D face information but also 3D face depth information, and reasonable use of the depth information of a 3D face can effectively reduce the impact of illumination, pose, and expression on face recognition. Therefore, this paper proposes a 3D face recognition method based on deep learning. Firstly, the face image is preprocessed. Secondly, the output of the convolutional neural network is taken as an abstract feature of the image; we then fuse the abstract features of the 2D face image and the face depth map, and use the merged result as the input of the fully connected layer. Finally, the output of the fully connected layer is treated as the input of the classifier. The experimental data are two-dimensional face images and face depth maps converted from 3D point cloud data. The experimental results show that the recognition algorithm is strongly robust to influence factors such as lighting, pose, and expression.

Index Terms - Deep Learning, 3D Face Recognition, Convolutional Neural Network, Feature Fusion.

I. INTRODUCTION

A. Face Recognition

The concept of face recognition originated in the 1960s; it is an important biometric recognition technology. The face is highly non-rigid and has the advantages of directness, convenience, and good interactivity. In recent years, face recognition has been the most popular topic in the field of biometric recognition, and it is also one of the most popular research directions in artificial intelligence and computer vision.

Bledsoe and Chan were the first two people to start research on face recognition [1]. They proposed a method characterized by face structure and ratios, which classifies faces by comparing the distances between these features. After that, many scholars researched face recognition, but most of the early research methods were based on template matching. Because research on face recognition was basically limited to ideal conditions (single training samples, no expression changes in the training samples, simple backgrounds, and so on), it could not be applied to real scenes [2] [3].

With the rapid development of computer and hardware technology, face recognition has made great progress. From the 1990s to the beginning of the 21st century, face recognition technology based on feature extraction greatly advanced the development of face recognition; examples are the Eigenface method and LDA (Linear Discriminant Analysis), both of which use mathematical methods to find a better representation of the face. The most classic is the LBP (Local Binary Pattern) feature proposed by Ahonen, Hadid, et al. LBP is a feature obtained by filtering a local window and encoding the differences between the peripheral and central pixel values. This feature can express the face very well, and for a long time it achieved the best recognition results in the face recognition field [4].

At present, the first step of most face recognition methods is to extract features from the images, such as multi-scale Gabor wavelet features [5] and Haar features [6]. The extracted features are then used to calculate the similarity between the test face image and the feature face. Finally, face recognition is performed based on the similarity results. In summary, the main process of classic face recognition technology is shown in Figure 1.

Fig. 1 Basic process of face recognition

In recent years, with the rise of deep learning and artificial intelligence, deep learning methods have also been widely used in the field of face recognition and have become one of the main methods in this field. A series of face recognition methods based on deep learning, such as Facebook's DeepFace and Face++'s Pyramid CNN, have achieved very good experimental results on the LFW face database, which proves the effectiveness of deep learning in face recognition tasks [7] [8].

At present, both traditional methods and deep learning methods achieve good recognition accuracy, but the robustness of face recognition based on two-dimensional face data to factors such as illumination, pose, and expression still needs improvement.
Fig. 5 Face recognition convolutional neural network model structure

As shown in Figure 5 above, this paper uses a 9-layer convolutional neural network model. There are three convolutional layers, three pooling layers, and three fully connected layers. The specific description is shown in Table I.

TABLE I
FACE RECOGNITION MODEL LAYER STRUCTURE DESCRIPTION

Layer name | Input specification | Output specification | Description
Conv1  | 112×92×3  | 112×92×32 | 32 3×3 convolution kernels with a step size of 1, padding='SAME'
Pool1  | 112×92×32 | 56×46×32  | Max-Pooling
Conv2  | 56×46×32  | 56×46×64  | 64 3×3 convolution kernels with a step size of 1, padding='SAME'
Pool2  | 56×46×64  | 28×23×64  | Max-Pooling
Conv3  | 28×23×64  | 28×23×128 | 128 3×3 convolution kernels with a step size of 1, padding='SAME'
Pool3  | 28×23×128 | 14×12×128 | Max-Pooling
Fc1    | 14×12×128 | 1×21504   | Fully connected layer
Fc2    | 1×21504   | 1×256     | Fully connected layer
Output | 1×256     | 10        | Output layer
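The paper does not provide source code, but Table I is concrete enough to sketch the network. Below is a minimal sketch in Keras, assuming TensorFlow as the framework; the ReLU activations are also our assumption (the paper does not name an activation function), and Fc1 is treated as a flattening step, since 14×12×128 = 21504 means it only reshapes the feature maps.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes=10):
    """Sketch of the 9-layer CNN described in Table I (framework and
    activations are our assumptions, not given in the paper)."""
    model = models.Sequential([
        layers.Input(shape=(112, 92, 3)),                   # 2D face image
        layers.Conv2D(32, 3, strides=1, padding='same',
                      activation='relu'),                   # Conv1 -> 112x92x32
        layers.MaxPooling2D(2, strides=2, padding='same'),  # Pool1 -> 56x46x32
        layers.Conv2D(64, 3, strides=1, padding='same',
                      activation='relu'),                   # Conv2 -> 56x46x64
        layers.MaxPooling2D(2, strides=2, padding='same'),  # Pool2 -> 28x23x64
        layers.Conv2D(128, 3, strides=1, padding='same',
                      activation='relu'),                   # Conv3 -> 28x23x128
        layers.MaxPooling2D(2, strides=2, padding='same'),  # Pool3 -> 14x12x128
        layers.Flatten(),                                   # Fc1: 14*12*128 = 21504
        layers.Dense(256, activation='relu'),               # Fc2 -> 1x256
        layers.Dense(num_classes, activation='softmax'),    # Output -> 10 classes
    ])
    return model
```

Calling build_model().summary() reproduces the input/output shapes listed in Table I.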
B. Convolutional Neural Network Basic Structure

1) Convolution Layer
The convolutional layer, which can also be called the feature extraction layer, is the core of a CNN and one of the important features that distinguish CNNs from traditional neural networks. The convolutional layer is obtained by convolving the input image of the previous layer with a plurality of convolution kernels of a specific size (the kernels have the same size but different values), producing a plurality of feature activation maps.

During the convolution process, the convolution kernel is slid in the horizontal and vertical directions along the coordinates of the input image by a set stride, and each shift of the kernel produces a new output value. Figure 6 shows the convolution operation with a stride of 1 and a convolution kernel size of 2×2.

Fig. 6 Convolution process diagram

Convolution is a commonly used linear filtering method in image processing, which can achieve filtering effects such as image denoising and image sharpening. The theoretical (continuous) convolution operation is

$h(x) = \int f(\tau)\,\omega(x - \tau)\,d\tau$    (1)

where f(τ) is the input data, ω is the kernel function, and h(x) is the output.
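In the network, the operation of Eq. (1) is applied in discrete two-dimensional form: the kernel is placed at every valid position and the elementwise products are summed (deep learning frameworks usually compute this cross-correlation variant, i.e., Eq. (1) without reflecting ω). A minimal NumPy sketch of the stride-1, 2×2-kernel case of Figure 6, using example values of our own choosing:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` with the given stride (no padding);
    each placement yields one value of the output feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # elementwise product, then sum
    return out

image  = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]], dtype=float)
kernel = np.array([[1, 0],
                   [0, 1]], dtype=float)        # 2x2 kernel, stride 1, as in Fig. 6
print(conv2d(image, kernel))                    # [[ 6.  8.] [12. 14.]]
```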
2) Pooling Layer
In a CNN (Convolutional Neural Network), the pooling layer is also called the downsampling layer. It aggregates the features at different locations of the feature maps extracted by the convolutional layer, gradually building the feature representation of the image from low level to high level. In a convolutional neural network model, the pooling layer is usually placed between successive convolutional layers, because after the pooling operation the size of the image is significantly reduced, which reduces the number of training parameters and the amount of calculation, and the dimensionality reduction helps prevent over-fitting.

Pooling operations are performed with a pooling kernel. In general, there are two pooling styles: the first slides the pooling kernel and selects the maximum pixel value of the pooling area; this is called maximum pooling, as shown in Figure 7 below. The other calculates the average pixel value of the pooling area; this is called average pooling, as shown in Figure 8 below.
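Both styles can be sketched with the same sliding-window loop, differing only in the reduction applied to each window; the 4×4 input below is our own illustrative example:

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode='max'):
    """Downsample `feature_map` with a `size` x `size` pooling kernel."""
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    reduce_fn = np.max if mode == 'max' else np.mean
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = reduce_fn(window)       # maximum pooling or average pooling
    return out

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 8, 3],
                 [1, 4, 5, 9]], dtype=float)
print(pool2d(fmap, mode='max'))                 # [[6. 4.] [7. 9.]]
print(pool2d(fmap, mode='avg'))                 # [[3.75 2.25] [3.5  6.25]]
```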
The process of image feature extraction produces two kinds of errors: one is the estimated mean shift caused by the convolutional layer parameters; the other is the increased variance of the estimate caused by the limited neighborhood size. In general, maximum pooling can reduce the first type of error, and its result retains more texture information; average pooling can reduce the second type of error, and its result retains more of the overall background information.
Fig. 7 Maximum pooling process diagram

Fig. 8 Average pooling process diagram

B. Two-dimensional data experiment results
This group of experiments uses the two-dimensional color face images in the above database as experimental data; the purpose is to provide comparative data for the subsequent 3D image experiments and the experiments on fused 3D and 2D images.

In this group of experiments, the data was divided into 5 groups; in each run, four of the groups were used as training data and the remaining group was used as test data, and looping over all groups constitutes 5-fold cross-validation.
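As a sketch of this protocol, the loop below runs 5-fold cross-validation with scikit-learn's KFold; the dataset size, optimizer, loss, and epoch count are placeholder assumptions of ours, and build_model is the Table I sketch given earlier.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical arrays: `images` holds the face images, `labels` the identities.
images = np.random.rand(90, 112, 92, 3)         # placeholder data, shapes from Table I
labels = np.random.randint(0, 10, size=90)      # placeholder identity labels

accuracies = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kfold.split(images):
    model = build_model(num_classes=10)         # the Table I sketch defined earlier
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(images[train_idx], labels[train_idx], epochs=10, verbose=0)
    _, acc = model.evaluate(images[test_idx], labels[test_idx], verbose=0)
    accuracies.append(acc)                      # one accuracy per held-out group

print('mean accuracy:', np.mean(accuracies))    # cf. the averages reported below
```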
The results of the 5-fold cross-validation experiments based on the recognition of 2D face images are shown in Table II below.

TABLE II
CROSS-VALIDATION RESULTS BASED ON 2D FACE IMAGES

Training batches | Testing batch | Accuracy
1,2,3,4 | 5 | 0.8889
1,2,3,5 | 4 | 0.8333
1,2,4,5 | 3 | 0.8333
1,3,4,5 | 2 | 0.9444
2,3,4,5 | 1 | 0.9444
D. Two-dimensional and three-dimensional data fusion experiment results
In this group of experiments, the two-dimensional face color maps and the face depth maps are merged; the merged data corresponding to the above two sets of experiments is divided into five groups for cross-validation.
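The paper does not spell out the exact merge operation. One plausible reading, sketched below under that assumption, is channel-wise concatenation of the color image with a normalized depth map, producing a 4-channel input (the input layer of the Table I sketch would then use 4 channels instead of 3):

```python
import numpy as np

def fuse(color_image, depth_map):
    """Stack the 2D color image and the face depth map along the channel
    axis, giving a 4-channel input (RGB + depth). This fusion strategy is
    our reading of the paper; the exact merge operation is not specified."""
    depth = depth_map.astype(np.float32)
    depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)  # scale to [0,1]
    return np.concatenate([color_image.astype(np.float32) / 255.0,
                           depth[..., np.newaxis]], axis=-1)

color = np.zeros((112, 92, 3), dtype=np.uint8)  # placeholder 2D face color image
depth = np.random.rand(112, 92)                 # placeholder depth map from point cloud
fused = fuse(color, depth)
print(fused.shape)                              # (112, 92, 4) -> network input
```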
The cross-validation experimental results based on the recognition of the fused two-dimensional face color map and face depth map data are shown in Table IV below.

TABLE IV
CROSS-VALIDATION RESULTS BASED ON 2D FACE DATA AND FACE DEPTH MAP FUSION DATA

Training batches | Testing batch | Accuracy
1,2,3,4 | 5 | 0.9444
1,2,3,5 | 4 | 0.8333
1,2,4,5 | 3 | 0.8333
1,3,4,5 | 2 | 0.9999
2,3,4,5 | 1 | 0.8889

It can be concluded from Table IV above that the average accuracy when using the fusion of two-dimensional face image data and face depth map data as experimental data is 0.9021.

E. Robust experiment against lighting influence factors
This group of experiments is based on the three sets of models trained in the above experiments. Additional selected face images with consistent light intensity are used as the test set, and the robustness of the three models is verified by comparing their accuracy on the specified test set.

In this group of experiments, two small experiments were designed. The first selected images with weak illumination intensity as the test set; the second selected images with strong illumination intensity as the test set. Tables V and VI show the robustness of the three trained models to illumination.
TABLE V
EXPERIMENTAL RESULTS ON THE WEAK LIGHT INTENSITY TEST SET

Models | Accuracy
2D face image training model | 0.9111
Face depth map training model | 0.9222
Training model on fused 2D face image and face depth image data | 0.9355

TABLE VI
EXPERIMENTAL RESULTS ON THE STRONG LIGHT INTENSITY TEST SET

Models | Accuracy
2D face image training model | 0.9333
Face depth map training model | 0.9222
Training model on fused 2D face image and face depth image data | 0.9444

V. CONCLUSION
This paper proposes a three-dimensional face recognition method based on the fusion of two-dimensional and three-dimensional data. Firstly, the two-dimensional face image is combined with the face depth map, which represents the three-dimensional information. Based on the fused data, five sets of experiments were designed for the two-dimensional and three-dimensional data. The first three experiments were face recognition on two-dimensional face data, on three-dimensional face data, and on the fusion of two-dimensional and three-dimensional face data. The latter two experiments used the three recognition network models trained in the above three sets of experiments for a robustness analysis against light intensity. Comparing the first three sets of experiments with the latter two, the results show that, under the same experimental conditions, the recognition accuracy of the new 3D face data formed by fusing 2D face data and 3D face data, and its robustness to illumination intensity, are better than those of the 2D face data or the 3D face data alone.

ACKNOWLEDGMENTS
The authors are grateful to the anonymous reviewers who made constructive comments. This work is supported by the National Natural Science Foundation of China (no. 61203302 and no. 61403277) and the Natural Science Foundation of Tianjin of China (16JCYBJC15400).

REFERENCES
[1] Chan. H, Bledsoe. W, "A Man-Machine Facial Recognition System: Some Preliminary Results", TR, 1965.
[2] Weihong. Wang, Jiexiao. Yang, Jianwei. Xiao, Sheng. Li, Dixin. Zhou, "Face recognition based on deep learning", Lecture Notes in Computer Science, Vol.8944, pp:812-820, 2015.
[3] Qian-Yu. Li, Jian-Guo. Jiang, Mei-Bin. Qi, "Face Recognition Algorithm Based on Improved Deep Networks", Tien Tzu Hsueh Pao/Acta Electronica Sinica, Vol.45, No.3, pp:619-625, March 1, 2017.
[4] Jianzheng. Liu, Chunlin. Fang, Wu. Chao, "A Fusion Face Recognition Approach Based on 7-Layer Deep Learning Neural Network", Journal of Electrical and Computer Engineering, Vol.2016, 2016.
[5] Wiskott. L, Fellous. J M, Krüger. N, et al., "Face recognition by elastic bunch graph matching", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.19, No.7, pp:775-779, 1997.
[6] R. Lienhart, J. Maydt, "An extended set of Haar-like features for rapid object detection", International Conference on Image Processing Proceedings, IEEE, Vol.1, pp:I-900-I-903, 2002.
[7] Weitao. Wan, Jiansheng. Chen, "Occlusion robust face recognition based on mask learning", Proceedings - International Conference on Image Processing, ICIP, Vol.2017-September, pp:3795-3799, February 20, 2018.
[8] Gaili. Yue, Lei. Lu, "Face Recognition Based on Histogram Equalization and Convolution Neural Network", Proceedings - 2018 10th International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2018, Vol.1, pp:336-339, November 9, 2018.
[9] Shijie. Qiao, Jie. Ma, "A Face Recognition System Based on Convolution Neural Network", Proceedings 2018 Chinese Automation Congress, CAC 2018, pp:1923-1927, January 22, 2019.
[10] Jian. Zhang, Zhenjie. Hou, Zhuoran. Wu, Yongkang. Chen, Weikang. Li, "Research of 3D face recognition algorithm based on deep learning stacked denoising autoencoder theory", Proceedings of 2016 8th IEEE International Conference on Communication Software and Networks, ICCSN 2016, pp:663-667, October 7, 2016.
[11] Al-Waisy. Alaa S, Qahwaji. Rami, Ipson. Stanley, Al-Fahdawi. Shumoos, "A multimodal deep learning framework using local feature representations for face recognition", Machine Vision and Applications, Vol.29, No.1, pp:35-54, January 1, 2018.
[12] Coskun. Musab, Ucar. Aysegul, Yildirim. Ozal, Demir. Yakup, "Face recognition based on convolutional neural network", Proceedings of the International Conference on Modern Electrical and Energy Systems, MEES 2017, Vol.2018-January, pp:376-379, January 5, 2018.
[13] Chen. Li, Wei. Wei, Jingzhong. Wang, Wanbing. Tang, Shuai. Zhao, "Face recognition based on deep belief network combined with center-symmetric local binary pattern", Lecture Notes in Electrical Engineering, Vol.393, pp:277-283, 2016.
[14] Li-Hua. Guo, Xin-Ya. Niu, Jun. Ma, Yan-Neng. Liu, "Research of face recognition algorithm using the deep tiled convolutional neural networks and Map-Reduce method", Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice, Vol.34, pp:283-286, June 1, 2014.
[15] Viola. P, Jones. M J, "Robust Real-Time Face Detection", International Journal of Computer Vision, Vol.57, No.2, pp:137-154, 2004.
[16] Texas 3DFR. http://live.ece.utexas.edu/research/texas3dfr