HANOI UNIVERSITY OF SCIENCE AND TECHNOLOGY
TECHNICAL REPORT
REAL-TIME SMILE DETECTION
USING DEEP LEARNING
Trần Đức Huy
[email protected]
Field of study: Vietnam–Japan Information Technology
Specialization: Information Technology
Supervisor: Dương Thanh Tùng
Supervisor's signature
Course: Technical Writing & Presentation (IT2030)
School: Information Technology
HANOI, 12/2023
Abstract: Real-time smile detection from facial images is useful in many real-world
applications such as automatic photo capturing in mobile phone cameras and interactive
distance learning. In this paper, we study different architectures of deep object detection
networks for solving the real-time smile detection problem. We then propose a
combination of a lightweight convolutional neural network architecture (BKNet) with
an efficient object detection framework (RetinaNet). The evaluation on two
datasets (GENKI-4K, UCF Selfie) with a mid-range hardware device (GTX TITAN
Black) shows that our proposed method improves both the accuracy and the
inference time of the original RetinaNet, reaching real-time performance. In
comparison with the state-of-the-art object detection framework YOLO, our method
has a higher inference time, but still reaches real-time performance and obtains higher
smile detection accuracy on both experimented datasets.
Keywords: Deep Learning; Convolutional Neural Network; Real-Time Smile
Detection.
Contents
1. INTRODUCTION........................................................................................4
2. RELATED WORK.......................................................................................5
2.1 Handcrafted feature-based methods............................................................5
2.2 CNN-based methods...................................................................................5
3. METHODOLOGY....................................................................................6
3.1 BKNet..........................................................................................................6
3.2 RetinaNet.....................................................................................................7
3.3 Our proposed method..................................................................................8
4. EXPERIMENTS........................................................................................8
4.1 Dataset.........................................................................................................8
4.2 Experimental setup......................................................................................9
4.3 Results and discussions..............................................................................10
5. CONCLUSION AND PERSPECTIVES................................................11
References...........................................................................................12
List of figures and tables
Figure 1: BKNet architecture [6]............................................................................6
Figure 2: RetinaNet architecture [14].....................................................................7
Figure 3: Pipeline of our smile detection method...................................................8
Figure 4: Sample detection results of our method on GENKI-4K (the first two rows)
and UCF Selfie (the last two rows)..............................................................11
Table 1: Accuracy and inference time of studied smile detection methods...........10
1. INTRODUCTION
A smile is one of the most important facial expressions for conveying human
feelings of happiness, satisfaction or pleasure [3]. Automatic smile detection from
facial images is widely used in many real-world applications such as product
rating, automatic photo capturing, distance learning, patient monitoring, etc.
Although it has gained much attention from the scientific community, real-time smile
detection from facial images captured in unconstrained real-world conditions of
lighting and background is still a challenging problem. This is because smile
detection involves both locating human faces in an image and classifying whether
each face is smiling, two tasks that often require heavy computation and may take
too long to reach real-time performance (below 40 ms, equivalent to 25 frames per
second).
There exist many methods to detect smiles in facial images [5, 7, 8, 11, 16,
23]. These methods are generally divided into two categories: (1) traditional
handcrafted feature-based methods and (2) deep learning-based methods. In the
first case, sliding windows are used to generate sub-regions of the image, and
handcrafted features are then extracted from these sub-regions. Finally, classifiers
such as Support Vector Machines (SVM) or Extreme Learning Machines (ELM) are
applied to each sub-region to classify it as smile or non-smile. In the second case,
Convolutional Neural Networks (CNNs) are usually used to automatically propose
sub-regions from the input image, extract features and categorize them as smile or
non-smile. Nowadays, CNN-based methods are much more popular than
handcrafted feature-based methods due to their superior accuracy.
To speed up smile detection, several methods have been proposed in the
literature, such as YOLO [17], Faster R-CNN [18], RetinaNet [14] and MobileNets [10].
Among these, YOLO and MobileNets are considered the state-of-the-art real-time
object detectors in terms of speed; for example, YOLO can process up to 25 frames
per second. In contrast, RetinaNet, as claimed by its authors [14], obtains better
accuracy than YOLO but runs slower.
In this paper, our work is motivated by this limitation of RetinaNet: our goal is
to improve its inference time for real-time smile detection. To reach this goal, we
study RetinaNet combined with several CNN architectures, namely BKNet [6],
ResNet [9], VGG [20] and MobileNets [10]. We then propose a new method that
combines RetinaNet and BKNet for real-time smile detection from facial images.
We will show that our method reaches real-time smile detection performance while
maintaining good accuracy.
The rest of the paper is organized as follows: we briefly review state-of-the-art
methods in Section 2. In Section 3, we explain in detail the methods we use for
real-time smile detection. Our experimental results are presented in Section 4.
Finally, Section 5 presents our conclusions and perspectives.
2. RELATED WORK
2.1 Handcrafted feature-based methods
Handcrafted feature-based methods detect smiles based on extracted facial
geometry information. The detected face is then fed to a classifier that categorizes it
as smiling or not. Shinohara et al., 2004 [19] used Fisher Weight
Map (FWM) and Higher-Order Local Auto-Correlation (HLAC) for face
representation. The authors in [12] extracted the lips and cheeks from the human face
by using a 6-dimensional feature vector.
Freire et al., 2009 [7] used Local Binary Patterns (LBP) as the main descriptor
and SVM as the classifier to detect smiles; the method achieves 90% smile detection
accuracy. Other authors [4, 15] introduced the use of Histograms of Oriented
Gradients (HOG) to extract features from facial images for face recognition and smile
detection. Recent work by Huang et al., 2018 [11] optimizes face detection
by adding a pre-processing step for skin color detection, edge detection and face
estimation; the candidates are then passed to a traditional classifier using HOG and
SVM. Although this method can achieve 120 fps on full HD images, its true positive
rate is only 64.0%.
Gao et al., 2016 [8] proposed a semi-automated method which combines multiple
features (Gradient Self-Similarity (GSS), HOG, raw pixels) and multiple classifiers
(AdaBoost, Linear ELM). The evaluation showed that the proposed method achieves
up to 94.61% smile detection accuracy. An et al., 2015 [1]
proposed a smile detector based on ELM. The proposed ELM classifier was trained
with different feature descriptors such as LBP, Local Phase Quantization (LPQ) and
HOG, and compared with other benchmark classifiers such as Linear Discriminant
Analysis (LDA) and SVM. The evaluation on the GENKI-4K dataset [24] showed that
the ELM-based smile detector takes 90 ms for prediction, compared to 70 ms for the
LDA classifier and 3600 ms for the SVM classifier.
2.2 CNN-based methods
CNN-based methods use Convolutional Neural Networks (CNNs) to automatically
learn high-level image features and classify them as smile or non-smile. Zhang et
al., 2015 [26] used recognition and verification signals as supervision to learn
expression features for smile detection; the results showed that their method reduces
the error rate on the GENKI-4K dataset by 21%. A deep convolutional network
called Smile-CNN [2] was proposed for smile detection, with accuracies of 92.4%
and 91.8% for SVM and AdaBoost classifiers, respectively.
Zhang et al., 2018 [25] proposed acceleration methods using heatmaps (for the global
face and facial parts) in a cascaded CNN model to speed up inference time. This
method achieves 0.499 s per detected face on a GeForce GTX TITAN Black. In
another work, Nguyen et al., 2018 [16] introduced the use of Faster R-CNN to speed
up the inference time of the smile detection process. Although the method achieves up
to 50% faster inference than the original Faster R-CNN with an acceptable accuracy
of 84.5%, it has not yet reached real-time smile detection performance.
Dinh et al., 2017 [6] proposed a lightweight CNN architecture named BKNet to
detect smiles. The evaluation showed that the proposed CNN architecture achieves high
smile detection accuracy (95.08%) compared to state-of-the-art methods. Redmon and
Farhadi, 2017 [17] introduced a high-speed real-time object detection method named
YOLO. The method takes the entire image in a single pass and predicts the
bounding box coordinates and class probabilities for those boxes. The biggest
advantage of YOLO is its speed: it can process up to 25 frames per second.
YOLO is currently a state-of-the-art approach for real-time object detection.
3. METHODOLOGY
3.1 BKNet
Dinh et al., 2017 [6] proposed a deep CNN architecture based on the VGG network
[20] named BKNet (see Fig. 1). BKNet is constructed from four stacked convolutional
blocks. The first three blocks consist of two convolutional layers with 32, 64 and 128
filters, respectively. The last convolutional block includes three convolutional layers
instead of two like the previous blocks. Each convolutional layer has a 3×3 window
and is followed by a ReLU (Rectified Linear Unit) activation function. Each
convolutional block is followed by a 2×2 max pooling layer. At the end of the network,
two fully connected layers with 256 neurons each are used, each followed by a ReLU
activation function. The output is then passed through a final fully connected layer
with 2 neurons and a softmax activation function.
Input
Conv 3×3/s1, 32, BatchNorm, ReLU (×2); Max Pool 2×2/s2
Conv 3×3/s1, 64, BatchNorm, ReLU (×2); Max Pool 2×2/s2
Conv 3×3/s1, 128, BatchNorm, ReLU (×2); Max Pool 2×2/s2
Conv 3×3/s1, 256, BatchNorm, ReLU (×3); Max Pool 2×2/s2
Fully Connected, 256 neurons, BatchNorm, ReLU (×2)
Fully Connected, 2 neurons, softmax
Figure 1: BKNet architecture [6]
By using half the number of convolutional layers and fewer blocks than the original
VGG architecture [20], BKNet becomes a lightweight yet powerful network structure for
smile detection. In addition, a batch normalization layer is added after each convolutional
layer and fully connected layer to normalize the data. This allows a higher learning rate
to be used and reduces the effect of large errors during the training process.
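To make the architecture concrete, the following is a minimal TensorFlow/Keras sketch of a BKNet-like network following the layer listing in Fig. 1. The input resolution (64×64 RGB) is an assumption for illustration, since it is not restated here; the block structure, filter counts and fully connected layers follow the figure.

# Minimal sketch of a BKNet-like classifier (assumed 64x64 RGB input).
from tensorflow.keras import layers, models

def conv_block(x, filters, n_convs):
    # n_convs stacked 3x3/stride-1 convolutions, each with BatchNorm + ReLU,
    # followed by a single 2x2/stride-2 max pooling layer.
    for _ in range(n_convs):
        x = layers.Conv2D(filters, 3, strides=1, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    return layers.MaxPooling2D(pool_size=2, strides=2)(x)

def build_bknet(input_shape=(64, 64, 3)):
    inputs = layers.Input(shape=input_shape)
    x = conv_block(inputs, 32, 2)   # block 1: two 3x3 convs, 32 filters
    x = conv_block(x, 64, 2)        # block 2: two 3x3 convs, 64 filters
    x = conv_block(x, 128, 2)       # block 3: two 3x3 convs, 128 filters
    x = conv_block(x, 256, 3)       # block 4: three 3x3 convs, 256 filters
    x = layers.Flatten()(x)
    for _ in range(2):              # two fully connected layers, 256 neurons each
        x = layers.Dense(256)(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    outputs = layers.Dense(2, activation="softmax")(x)  # smile / non-smile
    return models.Model(inputs, outputs, name="bknet")

model = build_bknet()
model.summary()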
3.2 RetinaNet
Lin et al., 2018 [14] proposed a convolutional neural network named RetinaNet
for high-speed object detection. Fig. 2 presents the architecture of RetinaNet. As shown
in the figure, RetinaNet contains a backbone network, the Feature Pyramid Network
(FPN), and two sub-networks: the Classification Subnet and the Box Regression
Subnet.
Feature Pyramid Network (FPN). FPN acts as the backbone network of RetinaNet.
It provides rich, multi-scale image features by combining low-resolution and high-
resolution features via a top-down pathway and lateral connections. Each level of FPN
encodes a different kind of information at a different scale. At each FPN level, several
anchors move over the FPN feature maps. An anchor is a rectangle of a given size and
aspect ratio representing the position of a potential object. These anchors are resized
according to the scale of the FPN levels and replicated at all possible positions in the
feature maps. This allows FPN to detect objects at different scales.

Figure 2: RetinaNet architecture [14]
Classification Subnet. Each FPN level is fed into two sub-networks in order to
fully exploit the different kinds of information held at each level. The first sub-
network, the classification subnet, predicts the probability of an object being present
at each spatial position, for each anchor and each object class. It is a simple Fully
Convolutional Network (FCN) attached to each FPN level, taking the feature map
of that pyramid level as input.
Box Regression Subnet. A second small FCN is attached to each pyramid level to
regress the offset from each anchor to a nearby ground-truth object, if one exists. The
architecture of the box regression subnet is similar to that of the classification subnet
except that it ends with 4 linear outputs per anchor at each spatial location.
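As an illustration, the sketch below builds these two heads for a single FPN level in Keras: four 3×3 convolutions with ReLU, then one convolution producing per-anchor outputs (sigmoid class scores or 4 linear box offsets). The filter width of 256 and the 9 anchors per location follow the defaults reported in [14] and are assumptions here; in RetinaNet the same heads are applied to every FPN level with shared weights.

# Sketch of the two RetinaNet heads for one FPN level (assumed 256 filters,
# 9 anchors per location, as in [14]).
from tensorflow.keras import layers, models

def make_head(outputs_per_anchor, num_anchors=9, activation=None):
    # Four 3x3 conv + ReLU layers, then one 3x3 conv producing
    # num_anchors * outputs_per_anchor channels at each spatial location.
    head = models.Sequential()
    for _ in range(4):
        head.add(layers.Conv2D(256, 3, padding="same", activation="relu"))
    head.add(layers.Conv2D(num_anchors * outputs_per_anchor, 3,
                           padding="same", activation=activation))
    return head

# Classification subnet: smile / non-smile probability per anchor (sigmoid).
cls_subnet = make_head(outputs_per_anchor=2, activation="sigmoid")
# Box regression subnet: 4 offsets (dx, dy, dw, dh) per anchor (linear).
box_subnet = make_head(outputs_per_anchor=4, activation=None)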
3.3 Our proposed method
Figure 3: Pipeline of our smile detection method
Fig. 3 presents the pipeline of our proposed smile detection method. The input
image is annotated with the coordinates of each human face and fed into RetinaNet
with a CNN as the feature extractor. The learned representations are then used as input
to two subnetworks: (1) the bounding box (bbox) regression subnet and (2) the
classification subnet. Subnet (1) performs bounding box regression, producing human
face locations; its output is the offset from each anchor box to a nearby human face,
if one exists. Subnet (2) classifies background versus human face, predicting the
probability of a face being present as smile or non-smile at each position for each
anchor.
By analyzing the architecture of RetinaNet, we observe that its backbone
network (FPN) is built on top of the ResNet architecture [9], a complex CNN
composed of many convolution modules, each with many convolutional layers. The
idea behind our method is therefore to find a simpler network to replace ResNet,
improving the time performance of RetinaNet while keeping good accuracy for
real-time smile detection. After a literature study, we selected two networks (BKNet
and VGG) to implement this idea. We chose VGG [20] because it is a state-of-the-art
network in terms of classification accuracy, and BKNet because its authors [6] showed
it to be a powerful and lightweight network structure for smile feature extraction and
classification. The experiments in the next section show that BKNet is the best choice
for this purpose.
4. EXPERIMENTS
4.1 Dataset
GENKI-4K [21]. This dataset contains facial images captured in various real-life
conditions, contexts, and backgrounds. All the images are manually labeled into 2
classes: smile and non-smile. There are in total 4,000 images in the dataset, of which
2,162 are labeled as smile and 1,838 as non-smile. Since this dataset is gathered from
various real-life scenarios, it contains several unclear images in which it is very
difficult to identify a smile. We therefore pre-process the dataset by manually
eliminating the unclear images. After this removal, we have a total of 3,699 images,
of which 2,008 are labeled as smile and 1,691 as non-smile.
UCF Selfie [13]. This dataset includes 46,836 selfie images annotated with 36
different attributes such as facial gestures (smiling, frowning, mouth open), face
shapes (oval, round, heart) and lighting conditions (harsh, dim). There are in total
12,207 smile and 34,629 non-smile images. Similarly to the GENKI-4K dataset, we
manually eliminate images in which smile and non-smile are ambiguous. After this
elimination, we select 5,034 smile images and the same number of non-smile images
to train our model on a balanced dataset.
Human face annotation. As mentioned above, RetinaNet uses several anchors
moving over the FPN feature maps. To train the network, the coordinates of the faces,
the potential objects that the anchors must match, need to be provided. For this, we
pre-process the data by adding human face annotations to the experimental images.
There are many methods to generate these annotations; in this work, we apply the
Haar-cascade based face detection method proposed in [22]. In this method, image
features are extracted from the input using different rectangular filters, and an
AdaBoost classifier is then used to select only the features that contribute most to an
accurate face classifier.
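A minimal OpenCV sketch of this annotation step is shown below; the cascade file and the detection parameters are illustrative choices, not the exact configuration used in our experiments.

# Sketch of Haar-cascade face annotation [22] with OpenCV (illustrative
# parameters; the bundled frontal-face cascade file is assumed).
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def annotate_faces(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    # detectMultiScale returns one (x, y, w, h) rectangle per detected face.
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                          minNeighbors=5, minSize=(30, 30))
    # Convert to (x_min, y_min, x_max, y_max) box annotations.
    return [(int(x), int(y), int(x + w), int(y + h)) for (x, y, w, h) in faces]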
4.2 Experimental setup
Hardware and software configuration. We perform our experiments on an HP
server with a 6-core 2.40 GHz Xeon E5-2620 v3 processor and 32 GB of RAM. In order
to measure the detection and inference time of our models, we run them on a mid-range
hardware configuration, the NVIDIA GTX TITAN Black (a graphics card with
performance roughly equivalent to a GTX 1060, with 2880 CUDA cores running at
889 MHz and 6 GB of memory at 7 GHz). The experiments are performed with
TensorFlow and Python 2.7 running on Debian 9.
Model configuration. With RetinaNet, we build the backbone network (FPN)
on top of three networks: VGG16, ResNet50 and BKNet. For the classification subnet,
we apply four 3×3 convolutional layers, each followed by a ReLU activation function,
then apply a sigmoid activation function to the outputs and use focal loss as the loss
function. For the bounding box regression subnet, smooth L1 loss with sigma = 3 is
used as the loss function.
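For reference, the sketch below shows one common formulation of these two losses; the focal loss parameters alpha = 0.25 and gamma = 2 follow the defaults of [14] and are assumptions here, since they are not reported above.

# Sketch of focal loss (classification subnet) and smooth L1 with sigma = 3
# (box regression subnet). alpha/gamma defaults follow [14] (assumption).
import tensorflow as tf

def focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0):
    # y_true, y_pred: tensors of the same shape with values in [0, 1].
    p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
    alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
    ce = -tf.math.log(tf.clip_by_value(p_t, 1e-7, 1.0))
    # Down-weight easy examples by (1 - p_t)^gamma.
    return tf.reduce_sum(alpha_t * tf.pow(1.0 - p_t, gamma) * ce)

def smooth_l1_loss(y_true, y_pred, sigma=3.0):
    # Quadratic below 1/sigma^2, linear above, as used for box regression.
    diff = tf.abs(y_true - y_pred)
    cutoff = 1.0 / (sigma ** 2)
    loss = tf.where(diff < cutoff,
                    0.5 * (sigma * diff) ** 2,
                    diff - 0.5 * cutoff)
    return tf.reduce_sum(loss)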
Training and inference. In the training process, we do not use pre-trained weights
for the backbone network: we evaluate the accuracy and inference time of all models
by training them from scratch. To train the models, we set the prior probability of all
anchor boxes, used to initialize the bias of the final convolutional layer of the
classification subnet, to 0.01. The weight decay is 0.0001 and the initial learning rate
is 10^-5 with the Adam optimizer, for 10k iterations over 50 epochs. The inference
step consists of forwarding an image through the network. We decode box predictions
from the 1k top-scoring predictions per FPN level after applying a detection score
threshold of 0.05. The top predictions from all FPN levels are then merged with a
threshold of 0.5 to yield the final detections.
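The following TensorFlow sketch illustrates this decoding step under the stated settings, interpreting the 0.5 merging threshold as the non-maximum suppression (NMS) IoU threshold, as in [14]; the cap of 300 final detections is an illustrative assumption.

# Sketch of inference-time decoding: per-level 0.05 score threshold and
# top-1000 selection, then cross-level merging with NMS at IoU 0.5.
import tensorflow as tf

def decode_detections(per_level_boxes, per_level_scores,
                      score_thresh=0.05, top_k=1000, nms_iou=0.5):
    all_boxes, all_scores = [], []
    for boxes, scores in zip(per_level_boxes, per_level_scores):
        keep = tf.where(scores > score_thresh)[:, 0]
        boxes, scores = tf.gather(boxes, keep), tf.gather(scores, keep)
        k = tf.minimum(top_k, tf.shape(scores)[0])
        top = tf.math.top_k(scores, k=k).indices
        all_boxes.append(tf.gather(boxes, top))
        all_scores.append(tf.gather(scores, top))
    boxes = tf.concat(all_boxes, axis=0)
    scores = tf.concat(all_scores, axis=0)
    selected = tf.image.non_max_suppression(boxes, scores,
                                            max_output_size=300,
                                            iou_threshold=nms_iou)
    return tf.gather(boxes, selected), tf.gather(scores, selected)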
Accuracy measurement. We define accuracy to measure the correctness of the
classification process as follows:

    Accuracy = (number of correctly detected test images / total number of test images) × 100%

In our experiment, we consider an image "correctly detected" based only on the
ground-truth label (smile or non-smile), because we found that the network
generally detects face boundaries correctly.
Inference time measurement. Inference time measurement starts after the model
is fully loaded. A test image is then fed into the network as input, and we stop the
measurement once the network produces a probability output for smile/non-smile
classification. We use the time() function from the time module of the Python
standard library, which provides timing precision of up to 1 microsecond on Linux
platforms.
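A minimal sketch of this measurement is shown below; model and test_image are placeholders for the loaded network and an input image.

# Sketch of the timing protocol: the model is already loaded when the
# timer starts, and timing stops once the smile/non-smile output is produced.
import time

start = time.time()                              # model already loaded here
predictions = model.predict(test_image[None, ...])
elapsed_ms = (time.time() - start) * 1000.0
print("inference time: %.1f ms" % elapsed_ms)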
4.3 Results and discussions
Table 1 shows the accuracy and inference time of the studied smile detection
methods on test data, after training the models for 50 epochs on both the GENKI-4K
and UCF Selfie datasets with a mid-range hardware configuration (GTX TITAN
Black). Bold values indicate our proposed method. We chose to experiment on a
mid-range device since this is representative of a larger proportion of users than
high-end GPUs.
It can be seen from the table that the combinations of RetinaNet with MobileNets
and with BKNet are able to reach real-time performance on both datasets, while the
combination with VGG16, although highly accurate, does not reach real-time smile
detection performance. We do not compare our result with the method of Dinh et al.,
2017 [6] (95.08% accuracy on GENKI-4K) since BKNet was proposed only for the
classification problem of smile versus non-smile images, while our method addresses
the detection problem, which also includes a bounding box around the smiling or
non-smiling face.

Table 1: Accuracy and inference time of studied smile detection methods

Figure 4: Sample detection results of our method on GENKI-4K (the first two rows) and
UCF Selfie (the last two rows)
It is noticeable that although YOLO obtains the fastest performance on both datasets
(31 ms on GENKI-4K and 35 ms on UCF Selfie), it has lower accuracy than our
method combining RetinaNet and BKNet (93.17% vs. 94.04% on GENKI-4K and
94.23% vs. 95.19% on UCF Selfie). This highlights the advantage of our proposed
method as a high-quality real-time smile detector with better accuracy than YOLO on
both tested datasets.
Fig. 4 shows some qualitative results of our proposed method on the GENKI-4K
dataset (the first two rows) and the UCF Selfie dataset (the last two rows). Each
rectangle marks the detected region in an image, together with its corresponding score.
The figure shows that our method is able to accurately detect smile or non-smile in
images of various face shapes (oval, round, heart) under different lighting conditions
(harsh, dim).
5. CONCLUSION AND PERSPECTIVES
In this paper, we studied several combinations of convolutional neural networks
(VGG16, MobileNets, BKNet) as feature extractors within the high-speed object
detection framework RetinaNet for real-time smile detection in facial images. The
experiments on two datasets (GENKI-4K, UCF Selfie) with a mid-range hardware
device (GTX TITAN Black) show that our proposed combination of RetinaNet with
BKNet improves both the accuracy and the inference time of the original RetinaNet,
reaching real-time smile detection performance.
Compared with the state-of-the-art object detection framework YOLO, our
method has a higher inference time, but still reaches real-time performance and
achieves better smile detection accuracy. Several research directions can be pursued
to continue this work. First, more data can be used to train the model to obtain better
detection accuracy. Second, the number of blocks and layers of the networks
(RetinaNet and BKNet) can be further fine-tuned to obtain better accuracy and faster
performance. Finally, more experiments could be performed on additional devices to
examine the scalability of our method.
REFERENCES
[1] L. An, S. Yang, and B. Bhanu, “Efficient smile detection by extreme learning
machine,” Neurocomputing, vol. 149, pp. 354–363, 2015.
[2] J. Chen, Q. Ou, Z. Chi, and H. Fu, “Smile detection in the wild with deep
convolutional neural networks,” Machine Vision and Applications, vol. 28, no. 1-2, pp.
173–183, 2017.
[3] F. De la Torre and J. F. Cohn, “Facial expression analysis,” in Visual Analysis of
Humans. Springer, 2011, pp. 377–409.
[4] O. Déniz, G. Bueno, J. Salido, and F. De la Torre, “Face recognition using
histograms of oriented gradients,” Pattern Recognition Letters, vol. 32, no. 12, pp.
1598–1603, 2011.
[5] O. Déniz, M. Castrillon, J. Lorenzo, L. Anton, and G. Bueno, “Smile detection for
user interfaces,” in International Symposium on Visual Computing. Springer, 2008, pp.
602–611.
[6] V. S. Dinh, T. B. C. Le, and P. T. Do, “Facial smile detection using convolutional
neural networks,” in Knowledge and Systems Engineering (KSE), 2017 9th
International Conference on. Hue, Viet Nam, 2017, pp. 136–141.
[7] D. Freire, M. C. Santana, and O. Déniz-Suárez, “Smile detection using local
binary patterns and support vector machines.” in VISAPP (1), 2009, pp. 398–401.
[8] Y. Gao, H. Liu, P. Wu, and C. Wang, “A new descriptor of gradients self-similarity
for smile detection in unconstrained scenarios,” Neurocomputing, vol. 174, pp. 1077–
1086, 2016.
[9] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,”
in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las
Vegas, NV, USA, June 2016, pp. 770–778.
[10] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M.
Andreetto, and H. Adam, “Mobilenets: Efficient convolutional neural networks for
mobile vision applications,” arXiv preprint arXiv:1704.04861, 2017.
[11] D.-Y. Huang, C.-H. Chen, T.-Y. Chen, J.-H. Wu, and C.-C. Ko, “Real-time face
detection using a moving camera,” in 2018 32nd International Conference on
Advanced Information Networking and Applications Workshops (WAINA). Krakow,
Poland, May 16–18, 2018, pp. 609–614.
[12] A. Ito, X. Wang, M. Suzuki, and S. Makino, “Smile and laughter recognition using
speech processing and face recognition from conversation video,” in 2005
International Conference on Cyberworlds (CW’05). Singapore, Singapore, Nov. 23-25,
2005, 8 pp.
[13] M. M. Kalayeh, M. Seifu, W. LaLanne, and M. Shah, “How to take a good selfie?”
in Proceedings of the 23rd ACM International Conference on Multimedia, ser. MM
’15. New York, NY, USA: ACM, 2015, pp. 923–926. [Online]. Available:
http://doi.acm.org/10.1145/2733373.2806365
[14] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object
detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
[15] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,”
International Journal of Computer Vision, vol. 60, no. 2, pp. 91–110, 2004.
[16] C. C. Nguyen, G. S. Tran, T. P. Nghiem, N. Q. Doan, D. Gratadour, J. C. Burie, and
C. M. Luong, “Towards real-time smile detection based on faster region convolutional
neural network,” in Multimedia Analysis and Pattern Recognition (MAPR), 2018 1st
International Conference on. Ho Chi Minh City, Viet Nam, 2018, pp. 1–6.
[17] J. Redmon and A. Farhadi, “YOLO9000: better, faster, stronger,” in 2017 IEEE
Conference on Computer Vision and Pattern Recognition, CVPR, Honolulu, HI, USA,
July 21–26, 2017, pp. 6517–6525. [Online]. Available:
https://doi.org/10.1109/CVPR.2017.690
[18] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object
detection with region proposal networks,” in Advances in Neural Information
Processing Systems, 2015, pp. 91–99.
[19] Y. Shinohara and N. Otsu, “Facial expression recognition using fisher weight
maps,” in Sixth IEEE International Conference on Automatic Face and Gesture
Recognition, 2004. Proceedings. Seoul, South Korea, May 19, 2004, pp. 499–504.
[20] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale
image recognition,” 2015. [Online]. Available: https://arxiv.org/abs/1409.1556
[21] http://mplab.ucsd.edu, “The MPLab GENKI Database, GENKI-4K Subset.”
[22] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple
features,” in Computer Vision and Pattern Recognition, 2001 IEEE Computer Society
Conference on, CVPR, vol. 1. IEEE, 2001, pp. I–I.
[23] P. Viola and M. J. Jones, “Robust real-time face detection,” International Journal
of Computer Vision, vol. 57, no. 2, pp. 137–154, 2004.
[24] J. Whitehill, G. Littlewort, I. Fasel, M. Bartlett, and J. Movellan, “Toward practical
smile detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.
31, no. 11, pp. 2106–2111, 2009.
[25] H. Zhang, X. Wang, J. Zhu, and C.-C. J. Kuo, “Accelerating proposal generation
network for fast face detection on mobile devices,” in 2018 25th IEEE International
Conference on Image Processing (ICIP). Athens, Greece, Oct. 7–10, 2018, pp. 326–
330.
[26] K. Zhang, Y. Huang, H. Wu, and L. Wang, “Facial smile detection based on deep
learning features,” in 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR).
Kuala Lumpur, Malaysia, Nov. 3–6, 2015, pp. 534–538.