A Classification of Arab Ethnicity Based On Face Image Using Deep Learning Approach

This document summarizes a research paper that aims to classify Arab ethnicity based on face images using deep learning approaches. Specifically, it creates an Arab dataset with labels for three ethnic groups (Gulf Cooperation Council countries, Levant, and Egyptian) and uses both supervised and unsupervised deep learning methods for classification. For supervised learning, a convolutional neural network achieved an accuracy of 56.97% on the Arab dataset. For unsupervised learning (deep clustering), three methods were tested and achieved accuracies ranging from 32% to 59% on different datasets, demonstrating deep clustering's potential for ethnicity classification.


Received March 3, 2021, accepted March 19, 2021, date of publication March 26, 2021, date of current version April 7, 2021.
Digital Object Identifier 10.1109/ACCESS.2021.3069022

A Classification of Arab Ethnicity Based on Face Image Using Deep Learning Approach
NORAH A. AL-HUMAIDAN AND MASTER PRINCE
Department of Computer Science, Qassim University, Mulaydha 51452, Saudi Arabia
Corresponding author: Norah A. Al-Humaidan ([email protected])
This work was supported by the Qassim University, Saudi Arabia, to complete Master Thesis under the course M.S. in computer science.

ABSTRACT The human face and facial features have attracted considerable attention from researchers and have recently become one of the most popular research topics. Features and information extracted from a person are known as soft biometrics. They have been used to improve recognition performance and to enhance search engines for face images, and can be applied in fields such as law enforcement, surveillance video, advertisement, and social media profiling. Reviewing relevant studies in the field, we noted a lack of coverage of the Arab world and the absence of an Arab dataset. Therefore, our aim in this paper is to create an Arab dataset with proper labeling of Arab sub-ethnic groups, then classify these labels using deep learning approaches. The Arab image dataset that was created consists of three labels: Gulf Cooperation Council countries (GCC), the Levant, and Egyptian. Two types of learning were used to solve the problem. The first type is supervised deep learning (classification); a pre-trained Convolutional Neural Network (CNN) model was used, as CNN models have achieved state-of-the-art results in computer vision classification problems. The second type is unsupervised deep learning (deep clustering). The aim of using unsupervised learning is to explore the ability of such models to classify ethnicities. To our knowledge, this is the first time deep clustering has been used for ethnicity classification problems. For this, three methods were chosen. The best result of training a pre-trained CNN on the full Arab dataset and then evaluating on a different dataset was 56.97%, and 52.12% when the Arab dataset labels were balanced. The deep clustering methods, applied to different datasets, showed an ACC from 32% to 59%, and NMI and ARI results from zero to 0.2714 and 0.2543, respectively.

INDEX TERMS Arab, convolutional neural network (CNN), deep learning, deep clustering, ethnicity.

The associate editor coordinating the review of this manuscript and approving it for publication was Larbi Boubchir.

I. INTRODUCTION
Research on human face ethnicity and gender recognition was initially conducted by psychologists from the perspective of cognitive science [1].
In Computer Vision, the human face and facial features have attracted considerable attention from researchers and have recently become one of the most popular topics [2], [3]. Features and information extracted from a person, such as age, gender, and ethnicity, are known as soft biometrics. Other soft biometric examples are hair color, eye color, height, width, scars, marks, the shape of the nose and mouth, etc. Soft biometrics cannot establish a person’s identity by themselves. However, they have been used to improve recognition performance by combining them with hard biometrics [4]. Veropoulos et al. [5] used soft biometric classification as a filtering step to limit the search space in databases for user identification systems. Kumar et al. [6] also used soft biometrics to enhance the search engine for face images. Besides, in the Human-Computer Interface (HCI) field, the computer can provide speech recognition or offer options to the user based on soft biometrics [7], [8]. In surveillance videos, suspects can be located based on soft biometrics [8]. Niinuma et al. [9] used them for continuous user authentication. They can also be used in law enforcement, advertisement, and social media profiling [3], [10].
As mentioned before, ethnicity is a soft biometric. It is defined as: “the fact or state of belonging to a social group that has a common national or cultural tradition” [11]. It is multifaceted and keeps changing over time based on cultural or geographical factors. Therefore, it is used for people who share the same race, language, nationality, or religion, etc. [12], [13]. In Computer Vision, ethnicity classification using face images has been getting a lot of attention and addresses two main topics: basic races such as Black, White,

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 9, 2021 50755
N. A. AL-Humaidan, M. Prince: Classification of Arab Ethnicity Based on Face Image Using Deep Learning Approach

Asian, etc. [3], [14]. The second type focuses on smaller ethnic groups or sub-ethnic groups. These groups can be people of the same nationality; for example, [15] proposed a Myanmar / non-Myanmar classification method, [16] focused on East Asian countries (Vietnam, Burma, Thailand, China, Korea, Japan, Indonesia, and Malaysia), while [10] classified Bangladeshi, Chinese and Indian people. Besides, sub-ethnic groups can define smaller groups in the same country; for example, [17] performed classification on eight ethnic groups from China.
In this research, our focus is on Arab ethnicity. Arab refers to people who live in the Arab world, which consists of 22 countries from North Africa and Western Asia that have Arabic as their official language [18]. Our contribution is to provide an Arab face dataset that consists of three labels (Gulf Cooperation Council countries, Levant, and Egyptians) and to perform supervised classification on it. Lastly, we test some deep clustering methods on our dataset and a benchmark dataset. To the best of our knowledge, this is the first time that deep clustering is performed to solve the ethnicity classification problem.
The rest of the paper is divided into 5 sections. The second section reviews the literature. The methodology comes after that, and in section four the experimental results are discussed. Finally, the entire work is concluded and further developments are suggested.

II. RELATED WORK
The interest in CNN reached the field of ethnicity classification. In this section, we will summarize some studies that used CNN for ethnicity classification.
Heng et al. [10] hybridized the output of a pre-trained CNN classifier (VGG-16), which was trained on ImageNet, with an image ranking engine and trained a Support Vector Machine (SVM) using the hybrid features. The approach was evaluated on a new dataset which contains faces of Bangladeshi, Chinese and Indian people. With an accuracy of 95.2%, the result showed improvement when compared to Faster R-CNN and Wang’s method.
Narang and Bourlai [19] investigated the problems of distance, night time, and uncontrolled conditions of face images. They used NIR images taken from 30, 60, 90, and 120 meters at night and visible images at a distance of 1.5 meters from the Long Distance WVU Database; they used the Long Distance Heterogeneous Face (LDHF) database as well. They used a CNN (VGG architecture) to classify the gender and ethnicity of Asians and Caucasians under these environments, and the result of 78.98% showed improvement over previous results.
Wang et al. [20] proposed a deep CNN classification model that consists of three convolutional and pooling layers and ends with two fully-connected layers. The model was applied to several datasets in addition to two self-collected datasets. Some of the datasets were used only for testing, to evaluate the classification of images that are not from the dataset used for training. The classification was done separately on White and Black, Chinese and non-Chinese, and finally on Han, Uyghurs, and Non-Chinese. The results were 100% vs 99.4%, 99.8% vs 99.9%, and 99.4% vs 99.5% vs 99.9%, respectively. They were compared to previous works and showed good improvements.
Anwar and Islam [14] used a CNN called VGG-face, which was pre-trained on a large face dataset of 2.6 million images, to extract features; an SVM with a linear kernel was then used as a classifier. It was trained on three classes (Asian, African-American, and Caucasian) from ten different datasets: Computer Vision Lab (CVL), Chicago Face Database (CFD), FERET, Multi-racial mega resolution (MR2) face database, UT Dallas face database, Psychological Image Collection at Stirling (PICS) Aberdeen, Japanese Female Facial Expression (JAFFE), CAS-PEAL-R1, Montreal Set of Facial Displays of Emotion Database (MSFDE) and Chinese University of Hong Kong face database (CUFC). The average classification accuracy over all databases is 98.28%, 99.66%, and 99.05% for Asian, African-American, and Caucasian, respectively.
Masood et al. [3] also used a pre-trained VGGNet, a 16-layer architecture, in their proposed work. They attempted to classify the Mongolian, Caucasian, and Negro ethnicities using an ANN and a CNN. The ANN was applied after calculating geometric features, calculating the normalized forehead area, and extracting skin color. The FERET database was used in the experiments. The CNN achieved superior results with 98.6%, while the ANN, with 82.4%, was good compared to other works.
Srinivas et al. [16] presented a new dataset called WEAFD, which consists of constrained and unconstrained images of people from East Asian countries. Also, a CNN model that has three convolutional layers and several fully connected layers was applied to WEAFD to classify age, gender, and ethnicity. They presented two networks: one with full-face images and the second with the face divided into regions. Age and gender results were better in the first network, while ethnicity was better in the second one. However, the results for ethnicity (24.06%, 33.33%) and age (38.04%, 36.43%) were low in both networks compared to gender (88.02%, 84.70%). They explain that the low results could be due to the lack of training data for age and ethnicity. The quality of the labels could be another reason, as they mentioned that a human is more likely to make mistakes while labeling age and ethnicity data than gender.
In [21], the structure of Gudi’s CNN is as follows: a convolutional layer, local contrast normalization and a max-pooling layer, another two convolutional layers, and after that a fully connected layer. For the preprocessing step, they used Global Contrast Normalization on the VicarVision dataset. The classification accuracy is 92.24%. The model performed well on Caucasian and East Asian, which have a higher number of images (thousands). However, it did not perform well on the rest of the labels. The average precision for all labels was 61.52%. In this case, average precision was a better indicator of how well the model performed.
Chen et al. [22] used four different algorithms, namely k-Nearest Neighbor (kNN), SVM, Two-Layer Neural Network, and CNN, to classify Korean, Japanese, and Chinese


with and without identified gender. The CNN architecture consists of two convolutional layers followed by one fully-connected layer and a dropout layer. The dataset used for the experiments is self-collected. The CNN was the best, with 89.2% accuracy for 3 classes and 83.5% for 6 classes, which include gender as well. Despite the good accuracy, the CNN reached only 61.3% accuracy on other new images, which means overfitting occurred.
Table 1 summarizes all the studies. However, Arab world ethnicity has not been considered in the literature yet. One reason could be the lack of a dataset. Hence, a dataset that contains images of people from different parts of the Arab world, labeled accordingly, is needed. In this study, we aim to create a dataset that contains Arab images with labels. However, labeling people according to their countries could make classification hard due to the closeness of a lot of countries. Therefore, we have decided to classify according to a wider range (regions) such as Gulf Cooperation Council countries (GCC), which contains Bahrain, Kuwait, Oman, Qatar, Saudi Arabia and the United Arab Emirates. Another known region is Al-Sham (the Levant), which consists of four countries: Syria, Palestine, Jordan, and Lebanon. The last label is Egypt, which has the largest population in the Arab world with over 100 million inhabitants [23]. It has a larger population than both the GCC and the Levant; therefore, we decided to give it its own label. The map in Figure 1 illustrates the distribution of countries for each label.

TABLE 1. Summary of ethnicity classification studies in the literature.

FIGURE 1. Map of Arab countries that are included in the Arab dataset and their corresponding labels.

All related studies mentioned previously are performed under supervised learning. Labeling data to perform supervised learning methods takes a lot of time and effort. Therefore, there is a need to develop unsupervised learning methods to deal with unlabeled data. Clustering is one of the most popular unsupervised methods. It means grouping data


that are more similar to each other to form a cluster [24]. Chen et al. [22] attempted to apply unsupervised learning to ethnicity classification by using k-means clustering. They did not report the details or results of the experiment; they only mentioned that the algorithm failed to cluster labels due to background noise and similarity between labels.
To our knowledge, deep clustering has not been used to solve ethnicity classification problems before. Deep clustering became an interesting field to researchers after Deep Embedded Clustering (DEC) [25] was proposed, according to Min et al. [26]. In this study, deep clustering methods will be applied to the labeled datasets to assess the effectiveness of the methods.
The methods that will be used are DEC, the blueprint of many deep clustering methods, and Improved Deep Embedded Clustering (IDEC) [27], which differs slightly from DEC by keeping the decoder after the pre-training phase. It showed improvement in results over DEC [27]. The last method is Dynamic Autoencoder (DynAE) [28]. DynAE has a structure somewhat similar to the other two; however, its main contribution is the dynamic loss function. It achieved state-of-the-art results in image clustering on three benchmark datasets (MNIST-full, MNIST-test, and USPS) according to [29].

III. METHODOLOGY
This work aims to classify sub-ethnic groups of Arabs. To accomplish this, an Arab dataset is introduced by collecting images from the internet that belong to specific subjects. Then some pre-processing and data cleaning tasks are performed on the collected images. Once the dataset is ready, supervised and unsupervised models are applied to solve the ethnicity classification problem. CNN was chosen for supervised learning due to its strong performance in solving ethnicity classification problems in [2], [10], [14], [20]. In unsupervised learning, three deep clustering methods will be used. To evaluate the models, our Arab dataset as well as other datasets are used. Several metrics are used to report evaluation results. The summary of the methodology is shown in Figure 2.

FIGURE 2. Methodology summary.

A. DATASETS
Our dataset provides labeled images from the Arab world. We decided to choose three labels as explained in the related work section. The process to create the Arab dataset is as follows:
1. Collect names of subjects. Subjects are public figures with accessible background information, such as Actors, Actresses, Singers, Football players, Social media influencers, Writers, Announcers, Journalists, Businessmen / Businesswomen, Ministers, Producers, Filmmakers, Artists, etc.
2. Check their nationalities and origins, if available. They must belong to only one of the labels and no others:
◦ Saudi, Kuwaiti, Qatari, Emirati, Omani, Bahraini are called: GCC.
◦ Syrian, Palestinian, Jordanian, Lebanese are called: The Levant.
◦ Egyptian
3. Download images from Google search using a modified Python script from [30]. We download 10 images for each subject.
4. Use a face detector to detect faces in the images of each subject, as described in detail in the preprocessing section.
5. Cleaning: Remove images unrelated to the subjects (images of objects or other people), duplicated images, and images that were mistaken by the detector for faces.
Table 2 illustrates the number of subjects and images for each label in the Arab dataset. A subject is a unique person; that means the dataset may have more than one image for one subject. As shown in Figure 3, the dataset is unbalanced. GCC has 70% of the subjects. In addition to the Arab dataset, a private Arab dataset, collected from non-public figures, will be used to evaluate the models. The private dataset consists of 88 Egyptian subjects, 104 GCC subjects, and 91 from the Levant.

TABLE 2. Arab dataset information.

FIGURE 3. The chart illustrates the dataset’s labels and percentage of gender for each label.

In addition to the Arab dataset, we will introduce other datasets that will be used in our experiments:


Racial Faces in-the-Wild (RFW) [31], [32], [33] was collected from MS-Celeb-1M and has four labels, each of which contains 10K images of 3K subjects [31]. RFW is similar to the Arab dataset in terms of data source (internet). Moreover, it identifies subjects individually. Therefore, it is suitable to be combined with the Arab dataset to classify four labels (Arab, Asian, Black, White) without concern about subjects overlapping between the train and test sets. Figure 4 shows samples from the Arab dataset and RFW.

FIGURE 4. Samples of all datasets labels.

BUPT-Transferface has 50K images of African, Asian and Indian, and over 460K images and 10K subjects of White.
FERET dataset [33], [34] is a well-known benchmark dataset of facial images used to report and compare results of different methods. It contains high-quality images of individual subjects with different poses, expressions, and lighting. Furthermore, it provides gender, age, and ethnicity information about subjects.
Lastly, the UTK dataset [35] contains images collected from the internet and provides their age, gender, and ethnicity. However, the dataset does not provide information about the individual identity of subjects. Therefore, we cannot be sure whether overlapping occurs between the train and test sets if it is used to train a classification model.

B. PREPROCESSING
Dlib’s pre-trained face detector, based on a modification of the standard Histogram of Oriented Gradients + Linear SVM method for object detection from [36], was used to detect faces in all images. Then, the detected face is cropped and resized to 224 × 224. The size was chosen to match the input size of the pre-trained CNN. The steps are illustrated in Figure 5.

FIGURE 5. An illustration of pre-processing steps.

After that, image cleaning is performed. Duplicated images, results unrelated to the subject (due to Google search errors or other faces in the same image as the subject), and detection errors are removed.

C. DATA AUGMENTATION
Data augmentation (DA) is a way to reduce overfitting [37] by applying certain methods to images during the training process. Different sets of data augmentation methods will be used in different experiments to find those suitable for the Arab dataset. The methods that will be used in our experiments are: flip horizontally and/or vertically (FL), multiply all pixels by random values to make them brighter or darker (ML), increase or decrease hue and saturation by random values (HS), rescaling, blurring images, adjusting image contrast, dropout (setting some pixels to zero), and converting to grayscale.

D. CLASSIFICATION MODEL
In this section, we present the classification models that will be used to solve the ethnicity classification problem. Two types of learning are going to be used separately to solve the problem: supervised learning and unsupervised learning.

1) SUPERVISED LEARNING
After pre-processing, a CNN model will be trained on the Arab dataset. Convolutional layers in a CNN work as feature extractors, so there is no need for a separate feature extraction step [38]. The model we use is a pre-trained CNN model. Usually, pre-trained models are trained on millions of images and then fine-tuned on small datasets (thousands in our case) [39], [40]. Pre-trained models have been shown to improve results and outperform CNNs trained from scratch [39], [40].
The model architecture to be used is the 50-layer ResNet (ResNet-50) created by He et al. [41]. They were motivated by the degradation problem, which can be explained as follows: when the


network depth increases, accuracy gets saturated and then degrades rapidly after the saturation region. This degradation is not caused by overfitting, and the fact that adding more layers to a deep model leads to higher training error was unexpected since, theoretically, the network was supposed to perform better while going deeper [41].
Shortcut connections between blocks differentiate ResNet from other models [41]. Two types of shortcuts are used in ResNet-50: identity shortcuts are used when the input and output have the same dimensions, while projection shortcuts are used to match dimensions [41]. Figure 6 shows more details about the ResNet-50 architecture. Downsampling is performed between blocks with a stride of 2 [41].
ResNet50 was trained by Cao et al. [42] on the VGGFace2 dataset, which contains 3.31 million images of 9131 subjects.
We will use the pre-trained ResNet50 model. However, the last layer of the model is going to be replaced with a new fully-connected layer that has an output of three classes. Then we start training with the categorical cross-entropy loss function (L_n) for the nth training sample, which is given by the equation:
\[ L_n = -\sum_{i=1}^{3} x_i \log_2(s_i) \tag{1} \]
where x_i is the truth label (0 or 1) of class i and s_i (between 0 and 1) is the predicted probability that the sample belongs to class i. The categorical cross-entropy loss function, sometimes called the softmax loss function, was used to train the pre-trained ResNet50 model [42]. It is a common and popular choice for multiclass classification problems [43]. The model aims to minimize the loss function to improve performance.
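As a worked illustration of Equation (1), the following minimal Python sketch computes the loss for a single sample with a one-hot label. The function name, the `eps` guard, and the example probabilities are assumptions made for illustration and are not taken from the authors' code. The base-2 logarithm follows the equation as printed; many deep learning libraries use the natural logarithm instead, which only rescales the loss by a constant.

```python
import math


def categorical_cross_entropy(x, s, eps=1e-12):
    """Loss of Equation (1) for one training sample.

    x:   one-hot truth label, e.g. [1, 0, 0]
    s:   predicted class probabilities (expected to sum to 1)
    eps: small constant (an assumption of this sketch) that avoids log(0)
    """
    if len(x) != len(s):
        raise ValueError("x and s must have the same length")
    return -sum(xi * math.log2(max(si, eps)) for xi, si in zip(x, s))


# Hypothetical three-class example (GCC, Levant, Egyptian); the true class is GCC.
confident = categorical_cross_entropy([1, 0, 0], [0.5, 0.25, 0.25])
unsure = categorical_cross_entropy([1, 0, 0], [0.25, 0.5, 0.25])
```

Only the probability assigned to the true class contributes to the sum, so the loss grows as that probability shrinks: `unsure` is larger than `confident`.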
We tune the model with different hyperparameters. The total number of experiments is over 60. The hyperparameters were:
• Learning rate (LR): a constant LR of 0.01, 0.001, or 0.0001, and an automatic LR schedule that changes the rate at certain epochs.
• Optimizer: SGD and Adam.
• Freezing layers (blocks): there are 4 blocks in ResNet-50, as shown in Figure 6. We switched which blocks were frozen in different experiments and sometimes made the whole network trainable. Freezing a layer in a pre-trained CNN means that the layer will not learn from the training process and only uses what it learned from previous training.
• Data augmentation: different sets of methods, or none at all in some experiments.
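The search space above can be sketched as a small configuration grid. This is only an illustration under stated assumptions: the dictionary keys, the encoding of frozen blocks as a count of leading blocks, and the two augmentation sets are hypothetical, and the grid is not the authors' actual list of experiments. The `step_lr` helper reads "a factor of 0.1 every 5 epochs" (as reported in the results section) as a multiplication of the rate, which is an interpretation.

```python
from itertools import product


def step_lr(epoch, base_lr=0.01, factor=0.1, every=5):
    """Scheduled LR: multiply base_lr by `factor` once per `every` epochs.

    Treating the factor as a multiplier is an assumption of this sketch.
    """
    return base_lr * factor ** (epoch // every)


# Hypothetical search space assembled from the list above.
learning_rates = [0.01, 0.001, 0.0001, "scheduled"]
optimizers = ["SGD", "Adam"]
frozen_blocks = [0, 1, 2, 3]  # number of leading ResNet-50 blocks kept frozen
augmentation = [None, ("FL", "HS", "rescale")]  # illustrative sets only

grid = [
    {"lr": lr, "optimizer": opt, "frozen_blocks": fb, "augmentation": da,
     "epochs": 30, "batch_size": 64}
    for lr, opt, fb, da in product(learning_rates, optimizers,
                                   frozen_blocks, augmentation)
]
```

Each dictionary in `grid` describes one training run; a configuration whose `lr` is `"scheduled"` would call `step_lr(epoch)` at the start of every epoch instead of using a constant rate.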
The number of epochs is 30 and the batch size is 64 for all experiments. We test all models on a private dataset and report the top five accuracies in the results and discussion section.

2) DEEP CLUSTERING MODELS (UNSUPERVISED LEARNING)


The general idea of deep clustering consists of two stages: pre-training an autoencoder, which allows the network to learn features that are used to initialize the cluster centers [44], and fine-tuning, where clustering and feature learning are jointly performed [44]. The methods we will use for clustering are, as mentioned before, DEC [25], IDEC [27], and DynAE [28].
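To make the fine-tuning stage more concrete, here is a minimal sketch of the clustering objective as it is described in the DEC paper [25]: a Student's t soft assignment of embeddings to cluster centers, a sharpened target distribution, and a KL divergence between the two. The function names and the toy embeddings are assumptions for illustration; this is plain Python on hand-made 2-D points, not the convolutional implementation used in this work, and the pre-training stage is omitted.

```python
import math


def soft_assign(z, centers, alpha=1.0):
    """Student's t similarity q_ij between embeddings z_i and centers mu_j."""
    q = []
    for zi in z:
        row = []
        for mu in centers:
            dist2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalize over clusters
    return q


def target_distribution(q):
    """Sharpened targets p_ij proportional to q_ij^2 / sum_i q_ij."""
    freq = [sum(col) for col in zip(*q)]  # soft cluster frequencies
    p = []
    for row in q:
        w = [v * v / f for v, f in zip(row, freq)]
        total = sum(w)
        p.append([v / total for v in w])
    return p


def kl_clustering_loss(p, q):
    """KL(P || Q), the quantity minimized while fine-tuning the encoder."""
    return sum(pij * math.log(pij / qij)
               for pr, qr in zip(p, q)
               for pij, qij in zip(pr, qr) if pij > 0)


# Toy 2-D embeddings and two cluster centers (made-up values).
z = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [2.9, 3.2]]
centers = [[0.0, 0.0], [3.0, 3.0]]
q = soft_assign(z, centers)
p = target_distribution(q)
loss = kl_clustering_loss(p, q)
```

The target distribution `p` puts more mass on each point's most likely cluster than `q` does, so minimizing the KL term pulls embeddings toward their nearest center.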
The first two methods were implemented with a convolutional network that was introduced in [44]. The difference between DEC and IDEC is that DEC discards the decoder after pre-training and fine-tunes the encoder with the clustering loss, while IDEC keeps the decoder. Figures 7 and 8 illustrate the architecture of each method.

FIGURE 6. An illustration of ResNet-50 layers architecture.


FIGURE 7. DEC architecture [25].

FIGURE 8. IDEC architecture [27].

FIGURE 9. DynAE architecture [28].

In DynAE [28], the trade-off between clustering and reconstruction is overcome by using a dynamic loss function. Figure 9 shows the general architecture of DynAE.
For all methods, the number of clusters is prior knowledge given before the start of clustering.

E. EVALUATION METRICS
For the supervised model (classification), we evaluate using the accuracy metric:
\[ \mathrm{accuracy} = \frac{M}{N} \tag{2} \]
In Equation 2, M is the number of correctly classified samples and N is the number of all samples. Besides, we use metrics that are widely used to evaluate deep clustering methods [26]. The first one is unsupervised clustering accuracy (ACC):
\[ \mathrm{ACC} = \max_{m} \frac{\sum_{i=1}^{n} \mathbf{1}\{k_i = m(r_i)\}}{n} \tag{3} \]
In Equation 3, k_i is the ground-truth label, r_i is the clustering algorithm result, and m ranges over all possible one-to-one mappings between clusters and true labels. The metric takes the cluster results from the clustering algorithm and the ground-truth labels and then finds the best matching between them, which can be computed by the Hungarian algorithm [45]. The second metric is Normalized Mutual Information (NMI):
\[ \mathrm{NMI}(k, r) = \frac{M(k, r)}{\frac{1}{2}\left[H(k) + H(r)\right]} \tag{4} \]
In Equation 4, M is the mutual information metric, H is entropy, k is the ground-truth label and r is the clustering result. Mutual information measures the mutual dependence of two groups, which are the ground truth and the clustering results. NMI is a normalized version of it, and permutations do not affect its results [46]. When NMI equals 0, the two are independent; when it equals 1, the two are identical.
The last metric is the Adjusted Rand Index (ARI), which is the chance-corrected version of the Rand Index (RI). RI focuses on pairwise agreement. For each possible pair, it evaluates how similarly the two clusterings treat them [47]. RI is calculated by:
\[ \mathrm{RI} = \frac{a + b}{a + b + c + d} \tag{5} \]
where a and b count the pairs on which the ground truth and the clustering results agree, and c and d count the disagreements, where a pair is put together on one side but separated on the other [47]. ARI is calculated using Equation 5 by:
\[ \mathrm{ARI} = \frac{\mathrm{RI} - E(\mathrm{RI})}{\max(\mathrm{RI}) - E(\mathrm{RI})} \tag{6} \]

IV. RESULTS AND DISCUSSION
Experiments were done using Google Colab and a Deep Learning AMI (Ubuntu 18.04) Version 28.1 on a g3s.xlarge instance from Amazon Web Services (AWS).

A. CLASSIFICATION RESULTS
Arab dataset subjects were divided into an 80% training set and a 20% validation set without subject overlap, i.e. images of subjects used in the training set are not used in the validation set. 60 experiments were done on the Arab dataset using the pre-trained ResNet50 model to tune hyperparameters; please refer to the supervised learning section for more details about the hyperparameters. After that, all models were evaluated on a different dataset to determine the best model. Accuracy was calculated according to Equation 2. Table 4 presents only the top-5 accuracy results that were obtained by testing on the private dataset; the last two have equal results. As we can see in Table 3, all top-5 results used SGD as the optimizer. 5 out of 6 used data augmentation but with different sets of methods; 3 had no frozen layers, while one had the first block frozen, another had the first two blocks frozen, and the last had the first three blocks frozen. 5 out of 6 used the same learning rate method, which starts from 0.01 and changes exponentially by a factor of 0.1 every 5 epochs. The last model used a learning rate that starts from 0.01, changing by a factor


of 0.1 at epochs 5, 10, and 20. The best model was optimized with SGD, had its first block frozen, used FL, HS, and rescaling as DA, and used an LR of 0.01 that changes exponentially by a factor of 0.1 every 5 epochs.

TABLE 3. Hyperparameters of the five models trained on the Arab dataset and evaluated on a different dataset (our private dataset) that achieved the highest accuracy.

TABLE 4. Top 5 highest accuracies on a private dataset and Arab dataset.

Results of all models tested on the Arab dataset and the private dataset are presented in Table 4. The accuracies when testing on the Arab dataset were between 0.72 and 0.76. However, accuracy drops to about 0.56 when testing on a different dataset. The best accuracy was 0.5697 by model-1, with model-2 close behind at 0.5606.
We will look deeper into the model-1 prediction results (exp1). The confusion matrix of model-1 evaluated on the private Arab dataset is shown in Figure 10. As we can see, 75% of GCC images were predicted correctly. The Levant and Egyptian labels have 43% and 48% of images predicted correctly, respectively. We also noticed that over 30% of Levant and Egyptian images were predicted as GCC. We were concerned that GCC dominating the dataset with 70% had caused the model to be biased toward GCC.

FIGURE 10. Normalized confusion matrix of exp1, model-1 evaluated on private dataset after training and validating in Arab dataset.

To address this concern, we did another experiment (exp2) with a modified Arab dataset (Arab balanced dataset). The modified dataset has a similar number of subjects and images for each label. The hyperparameters used in this experiment are the same as for model-1. The accuracy on the Arab balanced dataset was 0.5349. When evaluated on the private Arab dataset, the accuracy was 0.5212. In the confusion matrix in Figure 11, GCC again had the highest share of correct predictions at 65%, lower than the model-1 result by 10%. Levant and Egyptian have 42% and 45%, respectively. This experiment shows that the model can identify GCC better than the others even with a similar number of subjects/images. 31% of Levant were predicted as Egyptian, which is greater by 9% than the model-1 results. We can see that in both exp1 and exp2 the models struggle in classification, especially in classifying Levant and Egyptian.

FIGURE 11. Normalized confusion matrix of exp2, model evaluated on private dataset after training and validating in Arab balanced dataset.

The third experiment (exp3) had four labels: three labels from the RFW dataset (Black, Asian and White) and one Arab label that combines the Arab dataset labels. The dataset is divided into 80% of subjects for the train set and 20% for the test set. The hyperparameters were the same as for model-1. Testing on the same dataset results in a high accuracy of 0.9663. We also ran two tests on two different datasets (BUPT-Transferface, UTK), each combined with the Arab private dataset as one label. The two tests achieved 0.9675 and 0.6995, respectively. Figures 12 and 13 show the confusion matrices of both tests. 88% of the Arab label was predicted correctly in the two tests, and 9% of Arabs were wrongly predicted as White. As for other labels, there was a wide

50762 VOLUME 9, 2021


N. A. AL-Humaidan, M. Prince: Classification of Arab Ethnicity Based on Face Image Using Deep Learning Approach

ACC (Equation 3) measures how many individuals clustered


correctly. NMI (Equation 4) focuses on partitioning and
distribution of ground truth and clusters. ARI (Equation 6)
considers counting all pairs that are assigned to the same or
different clusters in predicted and ground truth. We did some
experiments with balanced and unbalanced datasets because
according to [48], the cluster size could affect the results.
Table 5 shows ACC, NMI, and ARI for each method with
different datasets. All experiments have three labels except
the last two experiments, one had four labels which is a
combination of RFW dataset (Black, White Asian) and Arab
label from Arab dataset. And the last experiment had five
classes, the Indian class from RFW is added.
FIGURE 12. (test-1) Normalized confusion matrix of exp3, model
The best ACC was 0.5955 in FERET by DynAE. In
evaluated on BUPT-TRANSFERFACE + private Arab (one label called Arab) Figure 14, most images are clustered in White, when we look
datasets after training and validating in RFW + Arab (one label called at the statistics of the FERET dataset (Asian: 952, Black: 257,
Arab) datasets.
White: 2883), the White class consists of 70% of total images
which affect the ACC.

FIGURE 13. (test-2) Normalized confusion matrix of exp3, model


evaluated on UTK + private Arab (one label called Arab) datasets after
training and validating in RFW + Arab (one label called Arab) datasets. FIGURE 14. Normalized confusion matrix of DynAE clusters on FERET
dataset.

gap between Black and White results in test-1 (BUPT-


TRANSFERFACE dataset) and test-2 (UTK dataset). Test-1 Worst ACC was 0.3206 in RFW (4 labels) + Arab by
has almost all labels predicted correctly while in test-2, 30% DynAE too. Figure 15 shows that the correct prediction of
of Black and 36% of White were predicted as Arab. all labels is low, with Black having ACC of 44% being the
Through these experiments, we noticed that the model can highest, while the rest were from 36% to 22%.
successfully identify Arabs up to 88% when put with others. NMI and ARI consider the unmatched parts of clusters,
Even though around 30% of Black and White were mistaken the distribution of images, and pairing [48]. Their best results
as Arab in test-2. However, the model does not give good were 0.2714 and 0.2543 respectively, by DEC applied to
classification performance if we classify Arab labels together, RFW (3 labels) + Arab, while there are several lows, most
it probably because the similarity between Arab classes is notable in all experiments on the FERET dataset. DynAE
higher. with FERET dataset achieved NMI of 0.0012 and ARI of
−0.0008. Figure 14 shows that images of each label were
B. DEEP CLUSTERING RESULTS distributed throughout the clusters by the same percentage,
Experiments were done using DEC, IDEC, and DynAE. The which means there is no specific relation between cluster
size of images used is 60×60 for all datasets. The parameters items.
used are the same as the implementation in their respective The second one is Figure 16 which an NMI of 0.0740 and
papers for all three methods. Adam was the optimizer for an ARI of 0.0565. Around half of GCC is in one cluster while
DEC and IDEC and pre-training phase in DynAE while Egyptian and Levant were similarly distributed.
SGD was used for the clustering phase. In DEC and IDEC Figure 17 has better results than the previous ones with an
CNN was used while in DynAE it was a fully connected NMI of 0.1902 and ARI of 0.1938. Black is dominating one
network. Three metrics were used to evaluate experiments: cluster. While White and Asian are similarly distributed in

VOLUME 9, 2021 50763


N. A. AL-Humaidan, M. Prince: Classification of Arab Ethnicity Based on Face Image Using Deep Learning Approach

TABLE 5. ACC, NMI, and ARI results of deep clustering methods for each dataset.

FIGURE 15. Normalized confusion matrix of DynAE clusters on RFW(4 FIGURE 17. Normalized confusion matrix of IDEC clusters on RFW dataset.
labels) + Arab dataset.

FIGURE 18. Normalized confusion matrix of DEC clusters on a


FIGURE 16. Normalized confusion matrix of DEC clusters on Arab dataset. combination of RFW dataset and Arab dataset. (4 labels).

the other two clusters. However, the correct clusters here are showed a small improvement. However, they are still low.
higher than the previous one. Ground-truth and clusters are nearly independent and not
Figure 18 has the best results for NMI and ARI. Arab similar.
and Black are both dominating one cluster for each. Asian Another experiment was done on the RFW dataset with
and White are similarly distributed, with half of Asians been three labels (Black, White, Asian). Results of NMI and ARI
clustered correctly. are much better than previous experiments. THE best NMI
Experiments on the FERET dataset and the balanced ver- was 0. 2071 by DynAE, while the lowest was 0. 1897 by DEC.
sion of it have similar results. Even though the ACC is and best ARI was 0.1938 by IDEC, while worst 0.1854 by
between 40% and 59%, NMI and ARI are between 0 and 0.02. DynAE. ACC was near 53% for all methods.
These results tell that partitions are random, that ground truth The last two experiments were done on RFW + Arab, one
and clusters are independent, and the model is uncertain about has four labels: Black, White and Asian from RFW and Arab,
the clusters. while the other has Indian as an addition. DynAE performed
Experiments on the Arab dataset and the balanced version the worst in terms of all metrics in both experiments. DEC and
results are also similar. ACC is between 37% and 47%. NMI IDEC for the first experiment had ACC of 52% for both, NMI
and ARI are slightly better here, NMI achieved from 0.03 to of 0.2714 and 0.2682 which is the best of all experiments,
0.07 while ARI from 0.01 to 0.08. Only one experiment, and ARI of 0.2543 and 0.2459 respectively. As for the last
DynAE on Arab balanced dataset, performed worse than the experiment, DEC and IDEC had ACC of 44% and 42%, NMI
rest. It had NMI and ARI near zero. The rest of the results of 0.2451 and 0.2366, ARI of 0.2024 and 0.1818 respectively.


We can see a similarity in the confusion matrices in Figures 18 and 19: 78% and 71% of Black were grouped, respectively. Almost half of Asians and White were grouped in one cluster in both experiments, while Arabs were divided into two clusters in Figure 18, one of which also contains around 20% of Asians and White. In Figure 19, Arab and Indian are separated into three clusters.

FIGURE 19. Normalized confusion matrix of DEC clusters on a combination of RFW dataset and Arab dataset (5 labels).

Based on the discussion above, no consistency has been seen in the performance of any of the methods considered for the experiment as far as deep clustering is concerned. Moreover, the low NMI and ARI scores confirm low intra-cluster and high inter-cluster similarity. So it can be said that facial features are very similar across the borderline of different ethnic groups, and this can be one of the reasons for the poorer performance of clustering. In support of this conclusion, it is noticeable that all the models, supervised and unsupervised, provide the best accuracy with the RFW dataset, and the NMI and ARI scores are higher for this dataset as well.

The conclusion can be drawn based on experiments. Table 6 shows a comparison of the accuracy of supervised and unsupervised learning models on two datasets: Arab and RFW (3 labels) + Arab (1 label). Supervised learning achieves better results, whereas the unsupervised methods that were used could not yet match the performance level of the supervised learning model.

TABLE 6. Accuracy of supervised learning model tested on the same dataset, and average ACC of all three unsupervised learning methods.

V. CONCLUSION AND FUTURE WORK
In this study, we investigate the ability of a CNN model to classify sub-ethnic groups of Arabs. First, we created an Arab dataset with three labels chosen according to countries' distribution into regions. Then a pre-trained ResNet50 model was used to classify the Arab dataset. Over 60 experiments were done to fine-tune hyperparameters, as explained in more detail in the supervised learning section. After that, the models were evaluated on a different dataset, and the best accuracy result was 0.5697. Another experiment was done after balancing the number of subjects in each class. The accuracy after evaluation on a different dataset was 0.5212. In both experiments, the model struggles to distinguish between labels, which can be due to the strong similarity between them.

A third experiment was done to classify Arabs as a whole and the other three ethnicities (Black, White, Asian) from the RFW dataset. The model was evaluated two times with two datasets (BUPT-TRANSFERFACE, UTK), each combined with our private Arab dataset. The results were 0.9675 and 0.6995, respectively.

For the deep clustering experiments, ACC results were between 32% and 59%. However, NMI and ARI vary according to each dataset and method. They were near zero in the FERET dataset experiments, and the best results came from the experiments on a combination of three labels from RFW and one Arab label, with an NMI of 0.2714 and an ARI of 0.2543 by DEC. In the future, we would like to investigate more methods regarding ethnicity classification.

This study has some limitations. First, our Arab dataset does not cover all countries of the Arab world. The limited time and lack of knowledge about public figures in other countries made it hard to collect a proper number of subjects. Moreover, the Arab dataset is unbalanced, with GCC having 2/3 of subjects. We recommend that future work increase the number of subjects for the other labels and cover other countries if possible. Regarding age, the Arab dataset does not include people under 17, so we are not sure if the same results apply to them. Also, we resized images to a small size (60 × 60) for the deep clustering methods, due to limited memory. We are concerned about how the size could affect the quality of the performance.

LINK OF THE DATASET
https://www.dropbox.com/sh/j4kjs9z9qnkewad/AABixLKWaME-3YiCfqKOdmlSa?dl=0

ACKNOWLEDGMENT
Portions of the research in this article use the FERET database of facial images collected under the FERET program, sponsored by the DOD Counterdrug Technology Development Program Office.

REFERENCES
[1] C. Yu, Y. Fang, and Y. Li, “Multi-task learning for face ethnicity and gender recognition,” in Proc. Chin. Conf. Biometric Recognit., 2014, pp. 136–144.
[2] H. Ding, D. Huang, Y. Wang, and L. Chen, “Facial ethnicity classification based on boosted local texture and shape descriptions,” in Proc. 10th IEEE Int. Conf. Workshops Autom. Face Gesture Recognit. (FG), Apr. 2013, pp. 1–6.
[3] S. Masood, S. Gupta, A. Wajid, S. Gupta, and M. Ahmed, “Prediction of human ethnicity from facial images using neural networks,” in Advances in Intelligent Systems and Computing. Singapore: Springer, 2018, pp. 217–226.
[4] A. K. Jain, S. C. Dass, and K. Nandakumar, “Soft biometric traits for personal recognition systems,” in Biometric Authentication. Berlin, Germany: Springer, 2004, pp. 731–738.
[5] K. Veropoulos, G. Bebis, and M. Webster, “Investigating the impact of face categorization on recognition performance,” in Advances in Visual Computing. Berlin, Germany: Springer, 2005, pp. 207–218.


[6] N. Kumar, P. Belhumeur, and S. Nayar, “FaceTracer: A search engine for large collections of images with faces,” in Computer Vision – ECCV. Berlin, Germany: Springer, 2008, pp. 340–353.
[7] Y. Hu, Y. Fu, U. Tariq, and T. S. Huang, “Subjective experiments on gender and ethnicity recognition from different face representations,” in Advances in Multimedia Modeling. Berlin, Germany: Springer, 2010, pp. 66–75.
[8] A. Dantcheva, P. Elia, and A. Ross, “What else does your biometric data reveal? A survey on soft biometrics,” IEEE Trans. Inf. Forensics Security, vol. 11, no. 3, pp. 441–467, Mar. 2016.
[9] K. Niinuma, U. Park, and A. K. Jain, “Soft biometric traits for continuous user authentication,” IEEE Trans. Inf. Forensics Security, vol. 5, no. 4, pp. 771–780, Dec. 2010.
[10] Z. Heng, M. Dipu, and K.-H. Yap, “Hybrid supervised deep learning for ethnicity classification using face images,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2018, pp. 1–5.
[11] E. J. Jewell and F. R. Abate, The New Oxford American Dictionary. London, U.K.: Oxford Univ. Press, 2001.
[12] D. J. Da Silva Santos, N. B. Palomares, D. Normando, C. Cardoso, and A. Quintão, “Race versus ethnicity: Differing for better application,” Dental Press J. Orthodontics, vol. 15, no. 3, pp. 4–121, 2010.
[13] G. Muhammad, M. Hussain, F. Alenezy, G. Bebis, A. M. Mirza, and H. Aboalsamh, “Race classification from face images using local descriptors,” Int. J. Artif. Intell. Tools, vol. 21, no. 5, Oct. 2012, Art. no. 1250019.
[14] I. Anwar and N. Ul Islam, “Learned features are better for ethnicity classification,” Cybern. Inf. Technol., vol. 17, no. 3, pp. 152–164, Sep. 2017.
[15] H. H. K. Tin and M. M. Sein, “Race identification from face images,” in Proc. Int. Conf. Adv. Comput. Eng. (ACE), 2011, pp. 1–4.
[16] N. Srinivas, H. Atwal, D. C. Rose, G. Mahalingam, K. Ricanek, and D. S. Bolme, “Age, gender, and fine-grained ethnicity prediction using convolutional neural networks for the east asian face dataset,” in Proc. 12th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), May 2017, pp. 953–960.
[17] C. Wang, Q. Zhang, X. Duan, and J. Gan, “Multi-ethnical Chinese facial characterization and analysis,” Multimedia Tools Appl., vol. 77, no. 23, pp. 30311–30329, Dec. 2018.
[18] Arab Countries 2019. Accessed: May 8, 2019. [Online]. Available: http://worldpopulationreview.com/countries/arab-countries/
[19] N. Narang and T. Bourlai, “Gender and ethnicity classification using deep learning in heterogeneous face recognition,” in Proc. Int. Conf. Biometrics (ICB), Jun. 2016, pp. 1–8.
[20] W. Wang, F. He, and Q. Zhao, “Facial ethnicity classification with deep convolutional neural networks,” in Biometric Recognition. Cham, Switzerland: Springer, 2016, pp. 176–185.
[21] A. Gudi, “Recognizing semantic features in faces using deep learning,” 2016, arXiv:1512.00743. [Online]. Available: https://arxiv.org/abs/1512.00743
[22] H. Chen, Y. Deng, and S. Zhang, “Where am i from?-east asian ethnicity classification from facial recognition,” Project Study Stanford Univ., San Francisco, CA, USA, Tech. Rep., 2016.
[23] Gov.eg. CAPMAS. Accessed: Mar. 26, 2021. [Online]. Available: https://www.capmas.gov.eg/Pages/populationClock.aspx
[24] N. Grira, M. Crucianu, and N. Boujemaa, “Unsupervised and semi-supervised clustering: A brief survey,” in A Review of Machine Learning Techniques for Processing Multimedia Content, Report of the MUSCLE European Network of Excellence (6th Framework Programme). Citeseer, 2004, pp. 1–12.
[25] J. Xie, R. Girshick, and A. Farhadi, “Unsupervised deep embedding for clustering analysis,” in Proc. 33rd Int. Conf. Mach. Learn. (ICML), vol. 1, 2016, pp. 740–749.
[26] E. Min, X. Guo, Q. Liu, G. Zhang, J. Cui, and J. Long, “A survey of clustering with deep learning: From the perspective of network architecture,” IEEE Access, vol. 6, pp. 39501–39514, 2018.
[27] X. Guo, L. Gao, X. Liu, and J. Yin, “Improved deep embedded clustering with local structure preservation,” in Proc. Int. Jt. Conf. Artif. Intell. (IJCAI), 2017, pp. 1753–1759.
[28] N. Mrabah, N. M. Khan, R. Ksantini, and Z. Lachiri, “Deep clustering with a dynamic autoencoder: From reconstruction towards centroids construction,” Neural Netw., vol. 130, pp. 206–228, Oct. 2020.
[29] Paperswithcode.com. Papers With Code – Deep Clustering With a Dynamic Autoencoder: From Reconstruction Towards Centroids Construction. Accessed: Mar. 26, 2021. [Online]. Available: https://paperswithcode.com/paper/deep-clustering-with-a-dynamic-autoencoder
[30] J. Dobies. Simple_Image_Download. Accessed: Mar. 26, 2021. [Online]. Available: https://github.com/RiddlerQ/simple_image_download
[31] M. Wang, W. Deng, J. Hu, X. Tao, and Y. Huang, “Racial faces in the wild: Reducing racial bias by information maximization adaptation network,” in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2019, pp. 692–702.
[32] M. Wang and W. Deng, “Mitigating bias in face recognition using skewness-aware reinforcement learning,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2020, pp. 9322–9331.
[33] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, “The FERET database and evaluation procedure for face-recognition algorithms,” Image Vis. Comput., vol. 16, no. 5, pp. 295–306, Apr. 1998.
[34] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, 2000.
[35] Github.io. UTKFace. Accessed: Mar. 26, 2021. [Online]. Available: https://susanqq.github.io/UTKFace/
[36] N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jun. 2010, pp. 886–893.
[37] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2012, pp. 1097–1105.
[38] W. Rawat and Z. Wang, “Deep convolutional neural networks for image classification: A comprehensive review,” Neural Comput., vol. 29, no. 9, pp. 2352–2449, Sep. 2017.
[39] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2014, pp. 580–587.
[40] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman, “Return of the devil in the details: Delving deep into convolutional nets,” in Proc. Brit. Mach. Vis. Conf., 2014, pp. 1–12.
[41] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 770–778.
[42] Q. Cao, L. Shen, W. Xie, O. M. Parkhi, and A. Zisserman, “VGGFace2: A dataset for recognising faces across pose and age,” in Proc. 13th IEEE Int. Conf. Autom. Face Gesture Recognit. (FG), May 2018, pp. 67–74.
[43] B. Barz and J. Denzler, “Deep learning on small datasets without pre-training using cosine loss,” in Proc. IEEE Winter Conf. Appl. Comput. Vis. (WACV), Mar. 2020, pp. 1360–1369.
[44] X. Guo, E. Zhu, X. Liu, and J. Yin, “Deep embedded clustering with data augmentation,” in Proc. 10th Asian Conf. Mach. Learn., vol. 95, 2018, pp. 550–565.
[45] H. W. Kuhn, “The hungarian method for the assignment problem,” Nav. Res. Logistics Quart., vol. 2, nos. 1–2, pp. 83–97, Mar. 1955.
[46] Scikit-learn.org. 2.3. Clustering – Scikit-Learn 0.24.1 Documentation. Accessed: Mar. 26, 2021. [Online]. Available: https://scikit-learn.org/stable/modules/clustering.html
[47] V. Labatut, “Generalised measures for the evaluation of community detection methods,” Int. J. Social Netw. Mining, vol. 2, no. 1, pp. 44–63, 2015.
[48] M. Rezaei and P. Franti, “Set matching measures for external cluster validity,” IEEE Trans. Knowl. Data Eng., vol. 28, no. 8, pp. 2173–2186, Aug. 2016.

NORAH A. AL-HUMAIDAN received the B.S. degree in computer science from Qassim University, Saudi Arabia, in 2016, where she is currently pursuing the master's degree with the Department of Computer Science.

MASTER PRINCE received the B.S. degree in computer science from Patna University, India, in 1996, the M.S. degree in computer science from Indira Gandhi National Open University, New Delhi, India, in 2004, and the Ph.D. degree in computer science from Pune University, India, in 2008. Since 2009, he has been working as an Assistant Professor with the Department of Computer Science, Qassim University, Saudi Arabia. His research interests include computer vision and machine learning. He received the Best Ph.D. Thesis Dissertation of the Year 2009 Award of Pune University.
