Journal of Intelligent & Fuzzy Systems xx (20xx) x–xx 1

DOI:10.3233/JIFS-211114
IOS Press

An efficient fuzzy deep learning approach to recognize 2D faces using FADF and ResNet-164 architecture

K. Seethalakshmi(a,∗), S. Valli(b), T. Veeramakali(c), K.V. Kanimozhi(d), S. Hemalatha(e) and M. Sambath(f)

(a) Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, Tamil Nadu, India
(b) Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai, Tamil Nadu, India
(c) Department of Data Science and Business Systems, School of Computing, SRM Institute of Science and Technology, Kattangulathur, Tamil Nadu, India
(d) Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, India
(e) Department of Computer Science and Engineering, Panimalar Institute of Technology, Chennai, Tamil Nadu, India
(f) Department of Computer Science and Engineering, Hindustan Institute of Technology and Science, Chennai, Tamil Nadu, India



Abstract. Deep learning combined with fuzzy logic is highly modular and more accurate. An adaptive Fuzzy Anisotropy Diffusion Filter (FADF) is used to remove noise from the image while preserving edges and lines and improving the smoothing effect. It detects edge and noise information through pre-edge detection using fuzzy contrast enhancement, post-edge detection using a fuzzy morphological gradient filter, and a noise detection technique. The Convolution Neural Network (CNN) ResNet-164 architecture is used for automatic feature extraction. The resultant feature vectors are classified using ANFIS deep learning. The top-1 error rate is reduced from 21.43% to 18.8%. The top-5 error rate is reduced to 2.68%. The proposed work achieves a high accuracy rate with low computation cost. A recognition rate of 99.18% and an accuracy of 98.24% are achieved on standard datasets. The proposed work gives better results than the existing techniques on the FACES 94, FERET, Yale-B, CMU-PIE and JAFFE datasets and other state-of-the-art datasets.

Keywords: Fuzzy anisotropy diffusion, edge detection, contrast enhancement, CNN (ResNet), feature extraction, ANFIS deep learning

∗ Corresponding author. K. Seethalakshmi, Assistant Professor, Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, Tamil Nadu, India. E-mail: seethalakshmik@veltech.edu.in.

1. Introduction

Computer vision, multimedia, and pattern recognition focus on face recognition and fingerprint recognition for reducing malpractice and illegal activities through surveillance cameras and computer-interfaced security systems. Identification and authentication have been widely implemented in security systems. Face recognition is commonly used since it identifies the person exactly. Still, occluded images, pose, expression variation, illumination variation, and controlled and uncontrolled images pose challenges [1].

Image processing identifies expressive information from an image by preprocessing, extracting features

ISSN 1064-1246/$35.00 © 2022 – IOS Press. All rights reserved.


2 K. Seethalakshmi et al. / An efficient fuzzy deep learning approach to recognize 2D faces

and classifying them. Preprocessing enhances the contrast of the image, eliminates noise through filters, and preserves edge and line information. A number of noise removal methods are available to deal with Gaussian noise, salt-and-pepper noise and Poisson noise. The Wiener filter, median filter, and linear filter are used for removing noise.

A convolution neural network (CNN) overcomes the drawback of the artificial neural network (ANN), which is computationally expensive. A CNN [2] automatically extracts features from the image while reducing the number of parameters required to set up the model. It allows image-specific features to be encoded into the architecture, making it appropriate for image-focused tasks. Common layers such as hidden layers, pooling layers, and convolution layers are stacked as a neural network to form a CNN. Stacking multiple hidden layers on each other is called deep learning. Existing CNN architectures include Vgg-vd16 [4], Vgg-vd19 [5], ResNet-101 [6] and AlexNet [3]. One of these architectures is selected based on the system performance and the error rate. In this work the ResNet-164 architecture is used to evaluate the accuracy of the system.

Feature extraction and dimensionality reduction for face recognition have been attempted using LBP [7], RLBP [9], texture (GLCM) [13], local directional ternary pattern [14], sparse manifold subspace learning [8], PCA [10], LDA [11] and ADA [12]. Classification has been achieved using machine learning methods such as K-NN [15], SVM [16], and neural networks [17] to identify the correct individual among the testing images. Deep learning is an extension of neural networks from machine learning.

The proposed work removes noise using fuzzy anisotropy diffusion and preserves edges by despeckling the noise. The preprocessed image is given as input to the convolution neural network ResNet-164 architecture to extract the feature vector. Finally, fuzzy min-max hyperbox deep learning is used for classification. The related works are presented next.

2. Related works

A deep convolution neural network, HyperFace, addresses four challenging tasks: face detection, landmark localization, pose estimation and gender recognition [1]. HyperFace has two variants, HyperFace-ResNet and Fast-HyperFace, which aim at state-of-the-art performance and, by using a fast face detector, at increasing the speed of the algorithm. They capture both local and global information, and the experimental results are significantly better than those of competing algorithms. Results on the AFLW face dataset and the Pascal face dataset are compared.

A CNN has convolutional layers, pooling layers and fully connected layers [18]. CNN is a type of deep learning model that learns very fast. A weighted mixture deep neural network automatically extracts features for facial expression recognition [19]. The authors used the CK+ [35], JAFFE [37] and Oulu-CASIA [36] datasets and obtained recognition accuracies of 0.970, 0.922, and 0.923, respectively.

The CNN architecture in [3] has five convolution layers, some of which are followed by max-pooling layers, and three fully connected layers followed by a softmax. GPUs and non-saturating neurons are used to improve the performance of the system. Dropout is adopted in the fully connected layers to reduce overfitting. The ImageNet LSVRC dataset is used in the implementation. They achieved a top-5 error rate of 15.3%, compared to 26.2% for the second-best entry.

The authors of [4] address two issues: first, how a large-scale dataset can be assembled by a combination of automation and a human in the loop, and second, the complexity of deep neural networks. They also discuss data purity, performance and time complexity. The standard LFW and YTF face benchmark datasets are used in the implementation. They achieved an accuracy of 98.95% on LFW and 97.3% on YTF.

A very deep neural network architecture using very small 3 × 3 convolution filters is proposed in [5] to reduce the top-1 and top-5 error rates. This ConvNet architecture achieved a top-1 error rate of 25.5% and a top-5 error rate of 8.0%. The ImageNet dataset is used in the implementation, and the results are better than the existing state-of-the-art results.

The deep residual learning (ResNet) architecture is introduced in [6] to allow training of networks deeper than the existing ones. On the ImageNet dataset a network with 152 layers is used, which is 8× deeper than VGG nets. It uses 3 × 3 convolutions with 512 filters, followed by average pooling and a 1000-way fully connected layer. An error rate of 3.57% was obtained, which is the best when compared to the existing results.

A fuzzy deep neural network with sparse autoencoder (FDNNSA) is proposed for understanding human intention based on emotions and information such as age, gender, and region, in which the fuzzy
C-means (FCM) is used to cluster the input data and FDNNSA is used to detect the intention of the human [38]. In [39], the feature extraction ability is improved by a fuzzy technique that removes redundancy through a fuzzy restricted Boltzmann machine (F3RBM), and the resulting features are passed to an SVM, which attains fast and high-precision automatic classification of dissimilar samples.

The proposed system enhances the contrast and preserves the edges, thereby improving image quality. The performance and accuracy of the system are increased by deep learning. The next section gives the block diagram of the proposed system.

3. Proposed work

The proposed system has three stages. Preprocessing is done through FADF to improve the quality of the image. Features are extracted using the CNN ResNet-164 architecture, and a deep fuzzy classifier is used for training and classifying the images to provide better accuracy than the existing approaches. Figure 1 illustrates the overall block diagram of the proposed system.

Fig. 1. Block diagram of the proposed system.

3.1. Fuzzy anisotropy diffusion filter

FADF removes the noise from the input image without removing significant parts of the image content. Edges, lines or other details important for understanding the image are preserved. FADF is an extension of the anisotropic diffusion filter (ADF) and is also called an extended PM model, after the model introduced by Perona and Malik in 1990 [34], which is a non-linear diffusion filter used in spatial regularization approaches. In ADF, edges are preserved based on a single constant gradient parameter: if the gradient value at a pixel is greater than this parameter, the edge is preserved. However, when edge and noise overlap, this condition does not work efficiently for the given image. Hence FADF is proposed for preserving edges and despeckling the noise.

FADF is performed in three steps: image preprocessing, fuzzy inference system and diffusion iteration. Image preprocessing is divided into edge detection and noise removal. Edge detection is a two-step process: pre-edge detection is done through a fuzzy contrast enhancement technique, and post-edge detection is done by a morphological gradient operation to improve the performance of the edge detection technique. Noise detection is done using an adaptive median filter. The membership values obtained from edge detection and noise removal are used in the fuzzy inference system.

3.1.1. Fuzzy preprocessing

A. Pre-edge detection using fuzzy contrast equalization

Fuzzy contrast equalization is used to improve the brightness of the input image based on the intensity values low, medium and high, which serve as the membership functions. A, B and C are the fuzzy rules applied to improve the contrast of the input image. First the fuzzy limits are set based on the membership functions, then the fuzzy rules are applied, and finally defuzzification converts the linguistic data into the required crisp data. Figure 2 shows the contrast-enhanced image after applying the following fuzzy rules for the membership functions:

A = IF "the intensity value is LOW" THEN "the output is LOW"
B = IF "the intensity value is MEDIUM" THEN "the output is MEDIUM"
C = IF "the intensity value is HIGH" THEN "the output is HIGH"

Fig. 2. Contrast enhancement using fuzzy inference system.
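As an illustration only, rules A, B and C can be sketched in Python with NumPy. The Gaussian membership functions, their centres and width, and the weighted-average defuzzification are assumptions made for this sketch; the paper does not specify them, and the function names are hypothetical.

```python
import numpy as np

# Assumed membership centres and output singletons on a [0, 1] intensity scale.
CENTRES = {"LOW": 0.0, "MEDIUM": 0.5, "HIGH": 1.0}
SIGMA = 0.2  # assumed width of each Gaussian membership function


def membership(intensity, centre, sigma=SIGMA):
    """Gaussian membership of an intensity in one fuzzy set."""
    return np.exp(-0.5 * ((intensity - centre) / sigma) ** 2)


def fuzzy_contrast(image):
    """Apply rules A, B and C and defuzzify with a weighted average."""
    image = np.asarray(image, dtype=float)
    # Fuzzification: degree to which each pixel is LOW, MEDIUM or HIGH.
    degrees = [membership(image, centre) for centre in CENTRES.values()]
    # Rules A, B, C map each input set to the output singleton of the same name.
    outputs = list(CENTRES.values())
    numerator = sum(d * o for d, o in zip(degrees, outputs))
    denominator = sum(degrees)
    # Defuzzification: one crisp output intensity per pixel.
    return numerator / denominator


pixels = np.array([0.1, 0.5, 0.9])
enhanced = fuzzy_contrast(pixels)
```

With these assumed parameters a dark pixel is pushed darker and a bright pixel brighter, while a mid-grey pixel is left unchanged, which is the qualitative behaviour expected from a contrast enhancement step.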


Next, the resultant output image is given as input to edge detection using fuzzy logic.

$$ F = \sum_{m=1}^{M} \sum_{n=1}^{N} \frac{\mu_{mn}}{g_{mn}}, \qquad \mu_{mn} \in [0, 1] \tag{1} $$

In Equation (1), $M \times N$ is the number of pixels in the image, $g$ is the gray level based on the brightness, $g_{mn}$ is the intensity value of the $(m, n)$th pixel, and $\mu_{mn}$ is the membership value used to enhance the brightness of the image.

B. Edge detection using fuzzy logic

After fuzzy contrast equalization, edge detection is done by initializing the fuzzy membership functions Ix, Iy and Iout with the following fuzzy rules. The edge-detected image is then given as input to the post-edge detection step, which uses the morphological gradient operation. Figure 3 shows the detected image after applying the fuzzy rules. Figure 4 illustrates the fuzzy system rules to detect the edge feature.

Rule1 'If Ix is zero and Iy is zero then Iout is white';
Rule2 'If Ix is not zero or Iy is not zero then Iout is black';
Rule3 'If Ix is not zero or Iy is zero then Iout is black';
Rule4 'If Ix is zero or Iy is not zero then Iout is black';

Fig. 3. Edge detection using fuzzy rules.

$$ F_{Edge} = \frac{\sum_{l=1}^{M} \bar{y}^{\,l} \prod_{i=1}^{n} \mu_{k_i^l}(\alpha_i)}{\sum_{l=1}^{M} \prod_{i=1}^{n} \mu_{k_i^l}(\alpha_i)} \tag{2} $$

In Equation (2), $F_{Edge}$ is used for the final classification of a pixel as an edge pixel, where $\alpha_i$ are the fuzzy sets associated with the antecedent part of the fuzzy rule base, $\bar{y}^{\,l}$ is the output class center and $M$ is the number of fuzzy rules being considered.

C. Post-edge detection using fuzzy morphological gradient

The fuzzy morphological gradient is used to smooth the detected edge. The morphological closing operation, a dilation followed by an erosion, smooths the edges over several iterations. The iterations are carried out for different dilation depths. Morphological smoothing is done by an opening operation followed by a closing operation, which removes the dark and bright noise artifacts. Here dark and bright are the membership functions. Figure 4 illustrates the image obtained after performing the fuzzy morphological gradient operation. The fuzzy rules X and Y are as follows:

X = IF "the artifact is DARK" THEN "remove artifact DARK"
Y = IF "the artifact is BRIGHT" THEN "remove artifact BRIGHT"

Fig. 4. Result obtained using morphological operation.

D. Noise removal

The noise region is removed using the fuzzy inference system to compute the diffusion coefficient, which is stronger in noisy regions than in smooth regions. The degree of noise in a region is calculated from the standard deviation between each noisy pixel intensity and the local mean of its neighborhood.

3.1.2. Fuzzy inference system

The obtained edge and noise information is given as input to the fuzzy inference system, which maps it to the output diffusion coefficient after applying fuzzy logic. The logic is as follows: if edge information is present, there is no need to smooth. If no edge information is present, the region is taken to contain noise and therefore requires smoothing. Finally, defuzzification is done to get a single value as output from the FIS. Edge and noise are membership functions on the interval between 0 and 1, and Fx and Fy are the fuzzy rules, as follows:

Fx = IF edge = 1 THEN noise = 0 "No need to apply smooth"
Fy = IF edge = 0 THEN noise = 1 "Apply smooth"
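The effect of rules Fx and Fy on smoothing can be illustrated with a simplified diffusion update in Python with NumPy. This is a sketch under stated assumptions and not the authors' FADF: the exponential edge membership, the parameter `kappa`, the step size and the choice of diffusion coefficient as one minus the edge membership are all hypothetical, and the noise membership from the adaptive median filter is omitted.

```python
import numpy as np


def edge_membership(image, kappa=0.1):
    """Assumed edge membership in [0, 1]: near 1 where the gradient is large."""
    grad_rows, grad_cols = np.gradient(image)
    magnitude = np.hypot(grad_rows, grad_cols)
    return 1.0 - np.exp(-((magnitude / kappa) ** 2))


def fuzzy_diffusion(image, iterations=10, step=0.2, kappa=0.1):
    """Iterate a diffusion update whose strength follows rules Fx and Fy."""
    result = np.asarray(image, dtype=float).copy()
    for _ in range(iterations):
        # Fx: edge -> 1 gives coefficient -> 0 (no smoothing).
        # Fy: edge -> 0 gives coefficient -> 1 (smoothing applied).
        coefficient = 1.0 - edge_membership(result, kappa)
        padded = np.pad(result, 1, mode="edge")
        neighbour_sum = (
            padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:]
        )
        # Discrete Laplacian over the four nearest neighbours.
        laplacian = neighbour_sum - 4.0 * result
        result = result + step * coefficient * laplacian
    return result


rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0  # vertical step edge
noisy = clean + 0.02 * rng.standard_normal(clean.shape)
smoothed = fuzzy_diffusion(noisy)
```

In this toy example the flat regions on either side of the step are smoothed, while pixels on the step itself have a coefficient close to zero and are left almost unchanged. A step size of at most 0.25 keeps each update a weighted average of a pixel and its four neighbours.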

3.1.3. Diffusion coefficient iteration

The defuzzified output is the fuzzy coefficient used to control the diffusion coefficient during the iterations. The gradient and the degree of edge and noise are used to control the speed of the iteration. When the number of iterations is small, information in the image is lost, whereas more iterations give more information about the image.

3.2. Convolution neural network (CNN)

The obtained edge and contrast features are given as input to the CNN architecture, consisting of residual learning layers with ReLU activation, convolution layers, pooling layers, and fully connected layers. Residual learning is an extension of the plain VGG-34 network [21]. The ResNet gains accuracy from a significantly increased depth, which is greater than that of the existing models.

3.2.1. ResNet-164 model architecture

When the number of layers is increased, degradation occurs, and, as reported in [6], this degradation is not caused by overfitting. Hence the deep residual network addresses the degradation problem by using shortcut connections. In this approach the identity mapping is used as the shortcut between layers. Figure 5 illustrates the building block of the residual network layer, where X is the identity mapping used as a shortcut between two layers. It is easy to compute and gives a better error rate than the existing VGG-34 network.

Fig. 5. Building block of Residual Learning.

The only difference between VGG-34 and residual learning is the addition of a shortcut between two layers. The role of the shortcut in a ResNet layer depends on the dimensionality of the input and output. If the dimensionality is equal at the source and destination, the shortcut is an identity mapping. When the dimensionality increases, the shortcut can be taken in two ways. First, the identity operation is performed along with zero padding to increase the dimension, so that no extra parameter is added. Second, a projection shortcut is performed to match the dimensions. For both shortcuts a stride of 2 is used.

In Fig. 5, F(x) + x is realized by a feed-forward neural network with a shortcut connection. In this work the shortcut performs the identity mapping. The building block is defined using Equation (3).

$$ y = F(x, \{W_i\}) + x \tag{3} $$

where x and y are the input and output vectors of the layer. The function $F(x, \{W_i\})$ represents the residual mapping to be learned. The dimensions of F and x must be equal in a residual network layer. When there is a mismatch in dimension, the shortcut performs the linear projection $W_s$ in Equation (4), which matches the dimensions of the input and output.

$$ y = F(x, \{W_i\}) + W_s x \tag{4} $$

Figure 6 illustrates the ResNet architecture for different input images, which consists of 19 parameter layers with a stride of 2. For each layer a shortcut operation is performed to match the dimensions. This reduces degradation. Each shortcut arrow spans two weight layers with one ReLU. This shortcut implementation does not incur additional cost or time. When the number of layers is increased from ResNet-152 to ResNet-164, better accuracy is achieved than with the existing model. The error rate is stable from top-3 to top-5. Figure 7 illustrates the structure of the ResNet-164 layer architecture.

Fig. 6. ResNet architecture for different input images with 19 parameter layers and shortcut for each layer.

3.4. ANFIS deep classifier

An adaptive neuro-fuzzy inference system (ANFIS) is a type of artificial neural network that is based on the Takagi–Sugeno fuzzy inference system. Since it combines neural network and fuzzy logic principles, it can capture the advantages of both in a single framework. It consists of a set of fuzzy IF–THEN rules. Hence, ANFIS is used to classify the features obtained from the ResNet-164 architecture.
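To make the Takagi–Sugeno structure concrete, a minimal forward pass of a first-order Sugeno system, the inference scheme on which ANFIS is based, can be sketched in Python with NumPy. The Gaussian membership functions, the product rule strength, the parameter shapes and the scalar output are assumptions of this sketch; the paper does not describe the rule base, the number of rules or the training procedure of its classifier.

```python
import numpy as np


def gaussian_mf(x, centre, sigma):
    """Gaussian membership of each input feature in each rule's fuzzy set."""
    return np.exp(-0.5 * ((x - centre) / sigma) ** 2)


def sugeno_forward(x, centres, sigmas, consequents):
    """Forward pass of a first-order Takagi-Sugeno system.

    x:           (n_features,) input feature vector
    centres:     (n_rules, n_features) membership centres (hypothetical)
    sigmas:      (n_rules, n_features) membership widths (hypothetical)
    consequents: (n_rules, n_features + 1) linear coefficients plus bias
    """
    x = np.asarray(x, dtype=float)
    memberships = gaussian_mf(x, centres, sigmas)        # fuzzification
    firing = memberships.prod(axis=1)                    # IF-part strength per rule
    weights = firing / firing.sum()                      # normalised strengths
    rule_outputs = consequents[:, :-1] @ x + consequents[:, -1]  # THEN-part
    return float(weights @ rule_outputs)                 # weighted sum


# Two hypothetical rules on one feature: "x is near 0 -> 0" and "x is near 1 -> 1".
centres = np.array([[0.0], [1.0]])
sigmas = np.array([[0.5], [0.5]])
consequents = np.array([[0.0, 0.0], [0.0, 1.0]])
score = sugeno_forward([0.5], centres, sigmas, consequents)
```

In an ANFIS the membership and consequent parameters would be learned from data, for example from the ResNet feature vectors, and a multi-class classifier would need one output per class or a thresholding step; neither is shown here.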
Fig. 7. ResNet-164 Architecture.
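The residual building block of Equations (3) and (4) can be sketched in Python with NumPy. For brevity this sketch uses fully connected weight matrices instead of convolutions, takes F as two weight layers with one ReLU between them, and omits any activation after the addition so that it matches Equation (3) as written. The function and parameter names are hypothetical.

```python
import numpy as np


def relu(z):
    """Rectified linear unit."""
    return np.maximum(z, 0.0)


def residual_block(x, w1, w2, w_s=None):
    """Compute y = F(x, {W_i}) + x, or y = F(x, {W_i}) + W_s x if w_s is given."""
    x = np.asarray(x, dtype=float)
    # Residual mapping F: two weight layers with one ReLU in between.
    residual = w2 @ relu(w1 @ x)
    # Identity shortcut (Equation 3) or linear projection shortcut (Equation 4).
    shortcut = x if w_s is None else w_s @ x
    return residual + shortcut


x = np.array([1.0, -2.0, 0.5, 3.0])

# Equal dimensions: identity shortcut.
w1 = np.zeros((4, 4))
w2 = np.zeros((4, 4))
y_identity = residual_block(x, w1, w2)

# Output dimension 6 differs from input dimension 4: projection shortcut.
w1_p = np.zeros((5, 4))
w2_p = np.zeros((6, 5))
w_s = np.ones((6, 4))
y_projected = residual_block(x, w1_p, w2_p, w_s)
```

With all residual weights set to zero the block reduces to its shortcut, which shows why an identity shortcut lets added layers start from an identity mapping rather than having to learn one.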

The ANFIS deep classifier produces better results than existing machine learning algorithms such as SVM and K-NN. The next section discusses the results and compares them with existing works.

4. Experimentation and result

The results are compared based on accuracy, recognition rate and error rate.

4.1. Results based on accuracy

The proposed DCNN ResNet-164 architecture model provides a better accuracy rate than the existing approaches [23, 27]. The authors of [27] achieved accuracy rates of 90.58%, 90.02% and 90.58% on the Yale-B dataset with illuminated images, JAFFE with pose variation images and CMU-PIE with expression variation images, respectively. This work raises the accuracy to 95.62%, 97.51% and 97.23%. Biao Yang et al. [23] achieved accuracies of 94.98% and 90.86% on the CK+ and JAFFE datasets using CNN-VGG. Our approach achieved accuracies of 98.24% and 97.51% on CK+ and JAFFE. Table 1 also gives the accuracy obtained by the CNN [23] and gACNN [30] deep learning models. The accuracy of the proposed system is calculated using Equation (5).

$$ \text{accuracy} = \frac{tp + tn}{tp + fp + fn + tn} \tag{5} $$

Table 1
Accuracy Rate Obtained from Different Deep Learning Models

Methods                                        Accuracy (%)
                                               Yale-B   CK+     JAFFE   CMU-PIE
CNN [23]                                       –        93.12   88.92   –
p-CNN [27]                                     90.58    –       90.02   90.58
CNN-VGG [23]                                   –        94.98   90.86   –
gACNN [30]                                     –        81.07   –       –
Proposed Fuzzy Deep ResNet-164 architecture    95.62    98.24   97.51   97.23
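As a worked example of Equation (5), and of the precision and recall measures given in Equations (8) and (9), the following Python sketch computes the three quantities from confusion-matrix counts. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from the paper.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (5): correctly classified images over all images."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Equation (8): true positives over all predicted positives."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Equation (9): true positives over all actual positives."""
    return tp / (tp + fn)


# Hypothetical counts for illustration only.
tp, tn, fp, fn = 90, 85, 15, 10

acc = accuracy(tp, tn, fp, fn)   # (90 + 85) / 200
prec = precision(tp, fp)         # 90 / 105
rec = recall(tp, fn)             # 90 / 100
```

Multiplying each value by 100 gives the percentage form used in the tables.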
4.2. Comparison of error rate

The top-1 error rate is the fraction of images for which the target class is not the first prediction. The top-5 error rate is the fraction of images for which the target class is not among the first five predictions. Table 2 compares the top-1 and top-5 error rates. The ResNet model is tested and validated according to the error rate obtained. K. He et al. [6] achieved a 21.43% top-1 error rate and a 5.71% top-5 error rate. This approach reduced the top-1 error rate to 18.8% and the top-5 error rate to 3.03% for the training dataset. Increasing the number of layers along with the shortcut parameter reduced the top-5 error rate to 2.68%. Equation (6) is used for calculating the error rate.

$$ \text{Error rate} = \frac{\text{Total number of incorrectly identified images}}{\text{Total number of images}} \tag{6} $$

Table 2
Error Rate (%) from Top-1 to Top-5 for Validation

Model                  Top-1 Error   Top-5 Error
VGG-16 [24]            28.07         9.33
Google Net [25]        –             9.15
PReLu-Net [26]         24.27         7.38
ResNet-152 [6]         21.43         5.71
Proposed ResNet-164    18.8          3.03

4.3. Results based on recognition rate

The authors of [22] used a VGG classifier, which is compared with the proposed DCNN ResNet-164 architecture model on the Extended Yale-B and CMU-PIE datasets. In the proposed work, the preprocessing model, along with the ResNet, improves the recognition rate of the system. The anisotropy diffusion filter is used first to remove noise from the image and to give better quality by preserving its edges. The preprocessed image gives better performance than the original image. Equation (7) is used for calculating the recognition rate.

$$ RR = \frac{\text{Total number of images correctly identified}}{\text{Total number of images}} \times 100 \tag{7} $$

The existing VGG approach [22] on the original image gave recognition rates of 65.69% and 95.91% on the Yale-B and CMU-PIE datasets. Our approach increased the recognition rate to 95.62% and 97.23% for the original image. The preprocessed image resulted in 85.86% and 96.98% recognition rates using the VGG classifier [22]. Our approach raised these to 96.35% and 99.18%.

Table 3
Recognition Rate Based On Classifier and Preprocessing

Methods                                      Recognition Rate (%)
                                             Extended Yale-B   CMU-PIE
VGG Classifier + Original Image [22]         65.69             95.91
VGG Classifier + Preprocessed Image [22]     85.86             96.98
In-Net + CNN Classifier [22]                 91.82             98.94
ResNet Classifier + Original Image           95.62             97.23
ResNet Classifier + Filtered Image           96.35             99.18

4.4. Validation based on standard dataset

Table 4 gives the classification accuracy obtained by several existing methods. In this work the training job is run for around 450 iterations on a Tesla V100 GPU with 16 GB of memory. A minimum of 8 GB of memory is sufficient for training. CNN is used for feature extraction and produces an accuracy of 98.05% on the Face 94 dataset, 94.56% on the Face 95 dataset, 95.23% on the Face 96 dataset and 99.26% on the Grimace dataset, which is higher than the method proposed in [28] and the other approaches [31–33].

Table 4
Accuracy Based on Standard Dataset

Method             Accuracy (%)
                   Face 94   Face 95   Face 96   Grimace
PCA [31]           72.10     69.87     70.95     74.79
LDA [32]           79.39     76.61     78.34     81.93
LBP [33]           85.93     80.47     84.14     86.45
DL + LBP [28]      93.6      90.6      91.6      96.6
Proposed Method    98.05     94.56     95.23     99.26

4.5. Confusion matrix based on fuzzy deep learning method

Based on the different datasets used in this experiment, the following confusion matrix is constructed to analyse the performance of the fuzzy deep learning classification. The calculation of the confusion matrix depends upon the precision, recall and accuracy
of the recognition rate, which relate the actual outcome to the expected outcome. Accuracy, as given by Equation (5), is how close the measured value is to the actual value. Precision, given by Equation (8), is the fraction of images identified as correct that are actually correct. Recall, given by Equation (9), is the fraction of actually correct images that are identified as correct. True positive tp implies that the correct face image is identified as the correct image. True negative tn implies that the incorrect face image is identified as the incorrect image. False positive fp implies that the incorrect face image is identified as the correct image, and false negative fn implies that the correct face image is identified as the incorrect image. Figure 8 depicts the confusion matrix for the fuzzy deep learning approach.

Fig. 8. Confusion matrix based on fuzzy deep learning.

$$ \text{precision} = \frac{tp}{tp + fp} \tag{8} $$

$$ \text{recall} = \frac{tp}{tp + fn} \tag{9} $$

5. Data set

The FERET dataset consists of 856 faces and contains 2413 facial images under different poses. The extended Yale-B dataset has 16128 images with 9 poses and 64 illumination conditions of 28 individuals. In this work, the extended Yale-B dataset is used. For training, 252 images of 9 poses and the maximum illuminated images under 45º illumination conditions are used, resulting in 1260 images. The other images are used for testing. The JAFFE dataset is also used: 251 images with 7 different expression variations from the JAFFE dataset are used for experimentation. The seven expression variations are happiness, sadness, fear, disgust, anger, contempt and surprise. 7500 images from the CMU-PIE dataset were used, along with 327 images representing expression variation from the CK+ dataset. In addition, 2660 images from the Face 94, 95, 96 and Grimace datasets are used in the result comparison.

6. Conclusion and future work

A fuzzy deep convolution neural network based face recognition system is presented in this work. The fuzzy-based anisotropy diffusion filter removes noise to give better clarity to the image in the preprocessing stage. The proposed model improves the accuracy rate without any loss of information. Compared to the existing techniques, the proposed work performs better with respect to recognition rate, accuracy, and top-1 and top-5 error rates. The obtained features are classified using a deep ANFIS classifier, whose results are better than those of the existing machine learning approaches. A huge dataset is trained and tested using the deep learning approach, which gives better results than the existing works. As future work, better preprocessing and segmentation algorithms are needed for occluded and partially occluded images. New features can be formulated. 3D images can also be trained and tested in future. A fuzzy recurrent neural network (FRNN) will be used to reduce time and space by implementing a long short-term memory (LSTM) architecture.

References

[1] R. Ranjan and R. Chellappa, HyperFace: A Deep Multi-task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 41(1) (2017), 121–135.
[2] T. Zhang, W. Zheng, Z. Cui, Y. Zong, J. Yan and K. Yan, Deep Neural Network-Driven Feature Learning Method for Multi-view Facial Expression Recognition, IEEE Transactions on Multimedia 18(12) (2016), 2528–2536.
[3] A. Krizhevsky, I. Sutskever and G.E. Hinton, ImageNet classification with deep convolutional neural networks, Proceeding in International Conference on Advance in Neural Information Processing System (2012), 1097–1105.
[4] O.M. Parkhi, A. Vedaldi and A. Zisserman, Deep face recognition, British Machine Vision Conference (2015), 1–12.
[5] K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition. (2014), 1409–1556.
[6] K. He, X. Zhang, S. Ren and J. Sun, Deep residual learning for image recognition, IEEE Conference on Computer Vision Pattern Recognition (2016), 770–778.
[7] J.-Y. Jung, S.-W. Kim, C.-H. Yoo, W.-J. Park and S.-J. Ko, LBP-Ferns-Based Feature Extraction for Robust Facial
Recognition, IEEE Transactions on Consumer Electronics 62(4) (2016), 446–453.
[8] M. Shao, M. Ma and Y. Fu, Sparse Manifold Subspace Learning, Springer, Low-Rank and Sparse Modeling for Visual Analysis (2014), 117–132.
[9] W. Deng, J. Hu and J. Guo, Compressive Binary Patterns: Designing a Robust Binary Face Descriptor with Random-Field Eigenfilters, IEEE Transactions on Pattern Analysis and Machine Intelligence 41(3) (2019), 758–767.
[10] S.X. Wu, H.-T. Wai, L. Li and A. Scaglione, A Review of Distributed Algorithms for Principal Component Analysis, Proceedings of the IEEE 106(8) (2018), 1321–1340.
[11] H. Zhao, Z. Wang and F. Nie, A New Formulation of Linear Discriminant Analysis for Robust Dimensionality Reduction, IEEE Transactions on Knowledge and Data Engineering 31(4) (2018), 629–640.
[12] T. Luo, F. Nie and D. Yi, Dimension Reduction for Non-Gaussian Data by Adaptive Discriminative Analysis, IEEE Transactions on Cybernetics 49(3) (2019), 933–946.
[13] B. Xiao, K. Wang, X. Bi, W. Li and J. Han, 2D-LBP: An Enhanced Local Binary Feature for Texture Image Classification, IEEE Transactions on Circuits and Systems for Video Technology 29(9) (2018), 2796–2808.
[14] B. Ryu, A.R. Rivera, J. Kim and O. Chae, Local Directional Ternary Pattern for Facial Expression Recognition, IEEE Transactions on Image Processing 26(12) (2017), 6006–6018.
[15] Q. Liu and C. Liu, A Novel Locally Linear KNN Method with Applications to Visual Recognition, IEEE Transactions on Neural Networks and Learning Systems 28(9) (2017), 2010–2020.
[16] S. Wang, B. Pan, H. Chen and Q. Ji, Thermal Augmented Expression Recognition, IEEE Transactions on Cybernetics 48(7) (2018), 2203–2214.
[17] T.-H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng and Y. Ma, PCANet: A Simple Deep Learning Baseline for Image Classification? IEEE Transactions on Image Processing 24(12) (2015), 5017–5032.
[18] Y. Liu, X. Yuan, X. Gong, Z. Xie and F. Fang and Z. Luo, Conditional convolution neural network enhanced random forest for facial expression Recognition, Elsevier, Pattern
Going deeper with convolutions, IEEE Conference on Computer Vision and Pattern Recognition (2015), 1–9.
[26] K. He, X. Zhang, S. Ren and J. Sun, Delving deep into rectifiers: Surpassing human-level performance on imagenet classification, IEEE International Conference on Computer Vision (2016), 1026–1034.
[27] X. Yin and X. Liu, Multi-Task Convolutional Neural Network for Pose-Invariant Face Recognition, IEEE Transactions on Image Processing 27(2) (2018), 964–975.
[28] A. Vinay, A. Gupta, A. Bharadwaj, A. Srinivasan, K. Alasubramanya Murthy and S. Natarajan, Deep Learning on Binary Patterns for Face Recognition, International Conference on Computational Intelligence and Data Science, Elsevier, 132 (2018), 76–83.
[29] A.R. Rivera, J.R. Castillo and O.O. Chae, Local directional number pattern for face analysis: Face and expression recognition, IEEE Transaction on Image Processing 22(5) (2013), 1740–1752.
[30] Y. Li, J. Zeng, S. Shan and X. Chen, Occlusion Aware Facial Expression Recognition Using CNN with Attention Mechanism, IEEE Transactions on Image Processing 28(5) (2019), 2439–2450.
[31] J. Yang, D. Zhang, F. Alejandro Frangi and J.-yu. Yang, Two-dimensional PCA: a new approach to appearance-based face representation and recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 26(1) (2004), 131–137.
[32] L.-F. Chen, H.-Y.M. Liao, M.-T. Ko, J.-C. Lin and G.-J. Yu, A new LDA-based face recognition system which can solve the small sample size problem, Pattern Recognition 33(10) (2000), 1713–1726.
[33] Z. Guo, L. Zhang and D. Zhang, A completed modeling of local binary pattern operator for texture classification, IEEE Transactions on Image Processing 19(6) (2010), 1657–1663.
[34] P. Perona and J. Malik, Scale-Space and Edge Detection Using Anisotropic Diffusion, IEEE Transactions on Pattern Analysis and Machine Intelligence 12(7) (1990), 629–639.
[35] P. Lucey, J.F. Cohn, T. Kanade, J. Saragih, Z. Ambadar and I. Matthews, The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-
R

Recognition. 84 (2018), 251–261. specified expression, IEEE Computer Society Conference


[19] B. Yang, J. Cao, R. Ni and Y. Zhang, Facial Expression on Computer Vision and Pattern Recognition - Workshops
Recognition using Weighted Mixture Deep Neural Network (2010), 94–101. Retrieved from https://www.kaggle.com/
R

Based on Double-Channel Facial Images, IEEE Access 6 shawon10/ckplus.


(2018), 4630–4640. [36] G. Zhao, X. Huang, M. Taini, S.Z. Li and M. Pietikäinen,
[20] V. Kazemi and J. Sullivan, One Millisecond Face Alignment Facial expression recognition from near-infrared videos,
O

with an Ensemble of Regression Trees, IEEE Confer- Image and Vision Computing 29(9) (2011), 607–619.
ence on Computer Vision and Pattern Recognition (2014), Retrieved from https://paperswithcode.com/dataset/oulu-
1867–1874. casia.
C

[21] K. Simonyan and A. Zisserman, Very deep convolutional [37] M. Lyons, S. Akamatsu, M. Kamachi and J. Gyoba, Cod-
networks for large-scale image Recognition, International ing facial expressions with Gabor wavelets, Proceedings
Conference on Learning Representations 6 (2015), 1–14. Third IEEE International Conference on Automatic Face
[22] O.M. Parkhi, A. Vedaldi and A. Zisserman, Deep face recog- and Gesture Recognition (1998), 200–205. Retrieved from
nition, British Machine Vision Conference 1 (2015), 6. https://paperswithcode.com/dataset/jaffe.
[23] B. Yang, J. Cao, R. Ni and Y. Zhang, Facial Expression [38] L. Chen, W. Su, M. Wu, W. Pedrycz and K. Hirota, A
Recognition Using Weighted Mixture Deep Neural Network Fuzzy Deep Neural Network With Sparse Autoencoder for
Based on Double-Channel Facial Images, IEEE Access 6 Emotional Intention Understanding in Human–Robot Inter-
(2018), 4630–4640. action, IEEE Transactions on Fuzzy Systems 28(7) (2020),
[24] K. Simonyan and A. Zisserman, Very deep convolutional 1252–1264.
networks for large-scale image recognition, International [39] X. Lu, L. Meng, C. Chen and P. Wang, Fuzzy Remov-
Conference on Learning Representations 6 (2015), 1–14. ing Redundancy Restricted Boltzmann Machine: Improving
[25] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Learning Speed and Classification Accuracy, IEEE Trans-
Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, actions on Fuzzy Systems 28(10) (2020), 2495–2509.
