Long Distance Face Recognition For Enhanced Performance of Internet of Things Service Interface
1. Introduction
Until now, the Internet has been used by humans as a space to share information, with humans acting as both producers and consumers of that information. In the future, not only information produced by humans but also everyday things will be connected to the Internet, which will evolve into an Internet of Things that shares the information of things. Currently, industry, academia and governments around the world are developing technologies and services for intelligent networks of things in various forms, such as Machine to Machine (M2M) or the Internet of Things (IoT) [1]. Through IoT, humans communicate with objects and services, and objects and services communicate with each other. As such, IoT interconnects humans, objects, services and the ambient environment. It includes traditional IoT services such as Smart Home/Security/Entertainment, Logistics/Distribution/Material Management/Security Management, and Transportation/Ambulance/Defense, as well as various IT convergence services such as object recognition through location, motion and sensing information, and situational awareness [2]. For example, when viewed from the human and service perspectives, a
By extending the traditional concept of the Internet, IoT is a next-generation Internet paradigm that encompasses object-to-object and human-to-object networks, in which various ambient objects participate in the Internet [8], [9], [10]. The definition of IoT can generally be divided into an Internet-based definition, a semantic-based definition and an object-based definition. Firstly, the Internet-based IoT definition, such as that of the International Telecommunication Union (ITU), focuses on constructing a network able to connect any object, anywhere, for anyone [11], [12].
Currently, the world is changing such that Internet-based IoT connects many surrounding objects, including the mobile Internet, Radio Frequency Identification (RFID) and sensor networks, and these objects communicate with each other autonomously [13]. Secondly, the semantic-based definition approaches IoT from the point of view of how to express, store, search and systematize the many objects that will be included in IoT and the information produced by these objects [14], [15], [16]. Lastly, the RFID international standards organization Global Standard 1 (GS1)/EPCglobal first defined IoT based on objects having a unique identifier, the Electronic Product Code (EPC). This made object recognition and global location tracking possible by attaching an RFID tag with an EPC to objects, reading these codes in real time through RFID readers installed all over the world, and storing and managing that information in a distributed IoT infrastructure [17]. Based on this, it is possible to monitor and manage object information, which is part of IoT, in real time and to provide various IoT services through a standardized interface. Recently, moving beyond simple identification, studies are underway to provide diverse and intelligent IoT services through the development of advanced interfaces including situation recognition and human recognition [3], [18], [19].
Face recognition technology has been examined in various studies, ranging from still image-based face recognition in controlled environments to video image-based face recognition in crowded environments [20], [21], [22]. In this paper, we use LDA, a feature extraction method based on basis vectors. To represent two-dimensional face images, face shape and texture information are vectorized. For face shape information, geometric features such as the distances and ratios between face elements such as the eyes, nose and mouth are used. Texture information is the brightness information of the face area itself: by arranging the brightness values of the two-dimensional face image in order, features are extracted as a one-dimensional vector. The feature extraction process in face recognition is to find the basis vectors of a linear transformation. LDA finds the basis vectors that reduce the scatter within each class and increase the distance between the class means [23], [24]. LDA obtains the feature vectors used for face recognition by projecting the face images onto these basis vectors.
Table 1 briefly shows the training process of the LDA technique, and Table 2 briefly shows the recognition process. Here, the most similar training image is returned as the recognition result by measuring the similarity between the feature vector of the acquired recognition image and the feature vectors of the training images.
Table 1. Training process of the LDA technique

2. Definition of the scatter matrix of each class:
   $S_i = \sum_{x \in X_i} (x - mean_i)(x - mean_i)^T$, where $mean_i = \frac{1}{P_i} \sum_{x \in X_i} x$

3. Definition of the within-class scatter matrix $S_W$:
   $S_W = \sum_{i=1}^{C} S_i$

4. Definition of the between-class scatter matrix $S_B$:
   $S_B = \sum_{i=1}^{C} n_i (mean_i - mean)(mean_i - mean)^T$, where $mean = \frac{1}{P} \sum_{i=1}^{P} x_i$

5. Definition of the matrix that maximizes the ratio of $S_B$ to $S_W$:
   $W_{opt} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|} = [w_1, w_2, \ldots, w_m]$, with $S_B w_i = \lambda_i S_W w_i$, $i = 1, 2, \ldots, m$

   where $C$ is the number of classes and $n_i$ is the number of images per class.

Table 2. Recognition process of the LDA technique

2. Mean subtraction for the recognition image vectors:
   $\tilde{y}_i = y_i - mean$, where $mean = \frac{1}{P} \sum_{i=1}^{P} y_i$

3. Definition of the feature vector for the recognition image using $W_{opt}$:
   $\hat{y}_i = W_{opt}^T \, \tilde{y}_i$
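To make the steps in Tables 1 and 2 concrete, the following Python sketch computes the within-class and between-class scatter matrices and the basis $W_{opt}$ from the generalized eigenproblem. NumPy/SciPy, the function name and the array shapes are our assumptions, not prescribed by the paper; the input is assumed to be already dimension-reduced (e.g. by the PCA step described later) so that $S_W$ is non-singular.

```python
import numpy as np
from scipy.linalg import eigh

def lda_train(X, labels, m):
    """Sketch of the LDA training steps in Table 1.

    X      : (N, k) array; each row is a face vector (assumed already
             PCA-reduced so that S_W is non-singular)
    labels : (N,) class label for each image
    m      : number of basis vectors to keep (at most C - 1)
    Returns W_opt with shape (k, m).
    """
    classes = np.unique(labels)
    k = X.shape[1]
    mean = X.mean(axis=0)                      # global mean vector

    S_W = np.zeros((k, k))                     # within-class scatter
    S_B = np.zeros((k, k))                     # between-class scatter
    for c in classes:
        X_c = X[labels == c]
        diff = X_c - X_c.mean(axis=0)
        S_W += diff.T @ diff                   # sum of per-class scatters S_i
        d = (X_c.mean(axis=0) - mean).reshape(-1, 1)
        S_B += X_c.shape[0] * (d @ d.T)        # n_i (mean_i - mean)(mean_i - mean)^T

    # Generalized eigenproblem S_B w = lambda S_W w; keep the m largest eigenvalues.
    eigvals, eigvecs = eigh(S_B, S_W)
    order = np.argsort(eigvals)[::-1][:m]
    return eigvecs[:, order]                   # columns are w_1, ..., w_m
```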
For long distance face recognition, the size of the extracted face image varies with the distance between the camera and the subject, so the face images to be verified must be normalized to the size of the training images. Therefore, interpolation is used to adjust the image size [25]. Nearest neighbor interpolation is the simplest method: each output pixel takes the value of the nearest pixel in the original image. Bilinear interpolation produces each interpolated pixel from its four adjacent pixels: the interpolated pixel is the sum of the four pixels multiplied by weights, where the weights are determined linearly and are inversely proportional to the distance from each adjacent pixel. Figure 1 shows bilinear interpolation built from one-dimensional linear interpolation: to find the interpolated pixel I, bilinear interpolation is performed using the values of the four adjacent pixels (A, B, C and D). Bilinear interpolation produces a better image than nearest neighbor interpolation, but it increases the computational complexity and the resulting edges are not as sharp.
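As an illustration of the bilinear scheme just described, the sketch below (the function name and loop structure are ours, not the paper's) computes each output pixel as the distance-weighted combination of its four neighbors A, B, C and D.

```python
import numpy as np

def bilinear_resize(img, out_h, out_w):
    """Resize a grayscale image with bilinear interpolation.

    Each output pixel I is a weighted sum of the four neighboring input
    pixels A, B, C, D; the weights are inversely proportional to the
    distance from each neighbor.
    """
    in_h, in_w = img.shape
    ys = np.linspace(0, in_h - 1, out_h)   # output row coordinates in input space
    xs = np.linspace(0, in_w - 1, out_w)   # output column coordinates in input space
    out = np.empty((out_h, out_w), dtype=np.float64)
    for i, y in enumerate(ys):
        y0 = int(np.floor(y)); y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        for j, x in enumerate(xs):
            x0 = int(np.floor(x)); x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            A, B = img[y0, x0], img[y0, x1]
            C, D = img[y1, x0], img[y1, x1]
            out[i, j] = ((1 - wy) * ((1 - wx) * A + wx * B)
                         + wy * ((1 - wx) * C + wx * D))
    return out
```

Nearest neighbor interpolation would instead simply copy the single closest input pixel, which is cheaper but produces blockier results.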
Figure 3 is the flowchart of the proposed LDA-based long distance face recognition. Figure 3(a) shows the overall flow and Figure 3(b) presents the normalization process applied to the input face images. The overall flow is the same as in existing face recognition algorithms; however, the proposed algorithm differs in that it uses face images captured at distances of 1m to 5m as training images and adds a distance-based normalization process for the face images.
(a) The overall flow of the proposed system (b) The normalization process of face images
Fig. 3. Long distance face recognition flowchart using LDA
The training process using face images captured at a distance is as follows. When face images at distances of 1m to 5m are entered, the average face vector of the normalized face images is calculated through the normalization process. After computing the difference between each face image and the average face vector, the covariance matrix is found. From this covariance matrix the eigenvectors and eigenvalues are computed, and finally Wpca is generated. Wpca, generated through PCA, is then further optimized by LDA to find Wlda, the projection for which the ratio of between-class scatter to within-class scatter is maximal. The test process using face images at distances of 1m to 5m is as follows. When a face image at a distance of 1m to 5m is entered, it is normalized through the normalization process. From the normalized face image, a feature vector is extracted by subtracting the average face vector and projecting onto Wlda. Finally, the feature vector of the test image is compared with those of the training images, and the face image with the most similar value is selected as the classification result.
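A compact sketch of this training/test flow might look as follows; the PCA implementation (SVD is used in place of an explicit covariance eigendecomposition, which yields the same principal components), the function names, and the reuse of lda_train from the earlier sketch are assumptions consistent with the description, not the authors' code.

```python
import numpy as np

def pca_basis(X, k):
    """W_pca: top-k principal components of the mean-subtracted training set."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T                       # (d,), (d, k)

def train(X, labels, k_pca, m_lda):
    """Training: normalize -> PCA -> LDA, as in Figure 3(a)."""
    mean, W_pca = pca_basis(X, k_pca)
    X_red = (X - mean) @ W_pca                  # project into the PCA subspace
    W_lda = lda_train(X_red, labels, m_lda)     # lda_train from the earlier sketch
    W = W_pca @ W_lda                           # combined projection (d, m)
    feats = (X - mean) @ W                      # gallery feature vectors
    return mean, W, feats

def classify(x, mean, W, feats, labels):
    """1:N search: return the label of the most similar training feature (L2)."""
    f = (x - mean) @ W
    dists = np.linalg.norm(feats - f, axis=1)
    return labels[np.argmin(dists)]
```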
The normalization of face images by distance is as follows. Once a face image for training is entered, the size of the input face image is checked. If the size is 50×50, equalization is performed directly; if the size is smaller than 50×50, equalization is performed after enlarging the image to 50×50 through interpolation. All entered face images are thereby normalized to a 50×50 image size. Figure 4 shows the original images at increasing distances, and Figure 5 shows the original face images extracted from person 1 as the distance changes from 1m to 5m. The sizes of the extracted face images are 50×50, 30×30, 20×20, 16×16 and 12×12 at 1m to 5m, respectively. The face images extracted at each distance are normalized by the four kinds of interpolation, as shown in Figure 5.
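A minimal sketch of this distance-based normalization, assuming OpenCV for resizing and plain histogram equalization (the paper only says "equalization", so this choice, the constant and the function name are assumptions), could be:

```python
import cv2

TARGET = 50  # normalized face size used in the paper (50x50)

def normalize_face(face_gray, interpolation=cv2.INTER_LINEAR):
    """Enlarge faces smaller than 50x50 via interpolation, then equalize.

    interpolation can be cv2.INTER_NEAREST, cv2.INTER_LINEAR (bilinear),
    cv2.INTER_CUBIC (bicubic convolution) or cv2.INTER_LANCZOS4; note that
    OpenCV's Lanczos uses an 8x8 window (a = 4), so it only approximates
    the Lanczos3 interpolation used in the paper.
    """
    h, w = face_gray.shape
    if (h, w) != (TARGET, TARGET):
        face_gray = cv2.resize(face_gray, (TARGET, TARGET),
                               interpolation=interpolation)
    return cv2.equalizeHist(face_gray)
```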
Fig. 4. Examples of original images at increasing distances
4. Experimental Results
The face recognition experiments use the ETRI face DB. As shown in Table 3, the ETRI face DB contains 500 face images per person (100 images at each distance from 1m to 5m) from 10 different people, acquired in various lighting environments and at different distances [27]. In this paper, face images extracted at 1m to 2m were considered short distance and face images extracted at 3m to 5m were considered long distance. Face recognition here is a 1:N search rather than 1:1 authentication: an input image is classified by finding the most similar (rank-1) face image among the face images stored in the database. In addition, the experiment was carried out under the assumption that every face is extracted from the input images regardless of distance; tilting or rotation of the face was not considered.
This experiment was carried out under the conditions in Table 4 in order to find appropriate interpolation methods for the proposed algorithm. LDA was used as the face recognition method and Euclidean distance was used as the similarity measure. For normalization of the face image size at distances of 1m to 5m, nearest neighbor, bilinear, bicubic convolution and Lanczos3 interpolation were used [28].
Figure 6 shows the LDA-based face recognition rates using face images normalized by distance through interpolation. In the experimental condition of CASE 1 in Table 4, 20 face images captured at 1m were used as training images per person, and 80 face images at distances of 1m to 5m were used as verification images. As a result, at short distance the best performance, 85.6%, was obtained with Lanczos3 interpolation; at long distance, bicubic convolution and Lanczos3 showed similar performance at 44% and 44.1%, respectively. Figure 7 shows the LDA-based face recognition rates for the condition of CASE 2 in Table 4, where a total of 20 face images per person (4 images at each distance from 1m to 5m) were used as training images and 80 face images at distances of 1m to 5m were used as test images. As a result, the best face recognition performance, 92.9%, was obtained with Lanczos3 interpolation; at long distance, the best performance, 75.0%, was obtained with bilinear interpolation. Figure 8 shows the LDA-based face recognition rates for the condition of CASE 3 in Table 4, where a total of 50 face images per person (10 images at each distance from 1m to 5m) were used as training images and 80 face images at distances of 1m to 5m were used as test images. As a result, the best face recognition performance, 93.8%, was obtained with bilinear interpolation; at long distance, the best performance, 78.54%, was obtained with nearest neighbor interpolation.
In summary, when short distance face images are used for training, Lanczos3 is the better image normalization method for LDA-based face recognition; however, when face images at distances of 1m to 5m are used for training, the best face recognition performance was obtained with bilinear interpolation. Comparing the CASE 1 and CASE 2 results confirms that using face images at 1m to 5m distances for training gives better performance than using only short distance images. In addition, the CASE 2 and CASE 3 results show that recognition performance improves as the number of training images per person increases. CASE 3 performs better than CASE 2, but 50 training images per person is not practical for general face recognition, so in this paper the CASE 2 condition was adopted.
This experiment demonstrates the effect of the training image configuration on the face recognition rate and the advantage of LDA-based face recognition when face images captured at a distance are used for training. Figure 9 shows the effect of the training image configuration on LDA-based face recognition. Lanczos3 interpolation was used in CASE 1, and bilinear interpolation was used in CASE 2 and CASE 3, for normalization of the face image size; L2 was used as the similarity measure. As a result, when a single distance was used for the CASE 2 training images, the performance was 85.8% at short distance and 44.0% at long distance. When face images at distances of 1m to 5m were used for training, the performance improved to 91.9% at short distance and 75.0% at long distance. Consequently, for the same number of training images, the face recognition rate improves when multi-distance face images are used rather than single distance face images.
This experiment identifies the similarity measure appropriate for long distance face recognition when face images at distances of 1m to 5m are used for training. The training images were configured as in CASE 2, LDA was used as the face recognizer, and bilinear interpolation was used as the image normalization method. For the similarity measure, Manhattan distance (L1), Euclidean distance (L2), cosine similarity (Cos) and Mahalanobis distance (Mah) were used [29]. Figure 10 shows the LDA-based face recognition rate for each similarity measure. At short distance, L2 showed the best performance at 91.9%. At long distance, L1 and L2 showed similar performance at 75.1% and 75%, respectively. The overall average face recognition rates over 1m to 5m were 81.6%, 81.8%, 76.0% and 80.7% for L1, L2, Cos and Mah, respectively, with L2 performing best. Consequently, in LDA-based long distance face recognition with multi-distance training images, the best face recognition performance is obtained with the Euclidean distance (L2) similarity measure.
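For reference, the four similarity measures compared here can be written as follows; the function names are illustrative, and the covariance estimate for the Mahalanobis distance is an assumption, since the paper does not specify how it is obtained.

```python
import numpy as np

def manhattan(a, b):            # L1
    return np.sum(np.abs(a - b))

def euclidean(a, b):            # L2
    return np.linalg.norm(a - b)

def cosine_distance(a, b):      # Cos, written as a distance: 1 - cosine similarity
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def mahalanobis(a, b, cov_inv):
    # Mah; cov_inv is the inverse covariance of the training feature
    # vectors (how it is estimated is an assumption)
    d = a - b
    return float(np.sqrt(d @ cov_inv @ d))

# Example (hypothetical names): match a probe feature against gallery features
# with the measure that performed best in the paper (L2).
# best = np.argmin([euclidean(probe, g) for g in gallery_feats])
```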
5. Conclusion
Around the world, IoT is regarded as a means of securing national competitiveness, and efficient interfaces are being developed through technology development and policy-driven IoT dissemination. This paper proposes a long distance face recognition algorithm that is applicable as a USN or M2M service-based technology. Face recognition that uses existing single distance face images as training images has the disadvantage that the recognition rate decreases as the distance between the surveillance camera and the user increases. In this paper, an LDA-based long distance face recognition algorithm suited to the surveillance camera environment is proposed. The proposed algorithm uses face images captured at a distance for training, normalizes the resulting low resolution images using bilinear interpolation, and uses the Euclidean distance as the similarity measure. A major result of the experiments is that the proposed algorithm improved the face recognition rate by 6.1% at short distance and by 31.0% at long distance compared with LDA-based face recognition using only existing short distance face images.
In the future, the proposed algorithm will be developed into a miniaturized, low-power structure suitable for an object communication service environment. Additionally, technologies that can effectively protect the personal information used for human recognition on a mobile robot or terminal device will be developed.
Acknowledgments. This research was supported by a Basic Science Research Program through
the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science
and Technology (2011-0023147).
References
1. ITU Strategy and Policy Unit: ITU Internet Reports 2005: The Internet of Things. International Telecommunication Union, Geneva. (2005)
2. Gonzalez-Miranda, S., Alcarria, R., Robles, T., Morales, A., Gonzalez, I., Montcada, E: An
IoT-leveraged Information System for Future Shopping Environments. IT Convergence
Practice, Vol. 1, No. 3, 49–65. (2013)
3. Sundmaeker, H., Guillemin, P., Friess, P., Woelfflé, S.: Vision and Challenges for Realising
the Internet of Things. CERP-IoT–Cluster of European Research Projects on the Internet of
Things, 1–230. (2010)
4. Perera, C., Zaslavsky, A., Christen, P., Georgakopoulos, D.: Context Aware Computing for
the Internet of Things: a survey. IEEE Communications Surveys & Tutorials, Vol. 16, No. 1,
414–455. (2014)
5. Moon, H.M., Pan, S.B.: A New Human Identification Method for Intelligent Video
Surveillance System. In Proceedings of 19th International Conference on Computer
Communication and Networks, Zurich, Switzerland, 1–6. (2010)
6. Tsai, H.C., Wang, W.C., Wang, J.C., Wang, J.F.: Long Distance Person Identification using
Height Measurement and Face Recognition. In Proceedings of 2009 IEEE Region 10
Conference, Singapore, 1–4. (2009)
7. Yao, Y., Abidi, B., Kalka, N.D., Schmid, N., Abidi, M.: High Magnification and Long
Distance Face Recognition: Database Acquisition, Evaluation, and Enhancement. In
Proceeding of 2006 Biometrics Symposium: Special Session on Research at the Biometric
Consortium Conference, Baltimore, Maryland, 1–6. (2006)
8. Atzori, L., Iera, A., Morabito, G.: The Internet of Things: A Survey. Computer Networks,
Vol. 54, No. 15, 2787–2805. (2010)
9. Gershenfeld, N., Krikorian, R., Cohen, D.: The Internet of Things. Scientific American, Vol.
291. No. 4, 76–81. (2004)
10. Ashton, K.: That ‘Internet of things’ Thing, RFID Journal, Vol. 22, 1–6. (2009)
11. ITU.: http://www.itu.int/en/ITU-T/techwatch/Pages/internetofthings.aspx
12. IPSO.: http://ipso-alliance.org/about
13. Weinstein, R.: RFID: A Technical Overview and Its Application to the Enterprise. IT
Professional, Vol. 7, No. 3, 27–33. (2005)
14. Toma, I., Simperl, E., Hench, G.: A Joint Roadmap for Semantic Technologies and the
Internet of Things. In Proceeding of 3rd STI Roadmapping Workshop, Helsinki, Greece.
(2009)
15. Katasonov, A., Kaykova, O., Khriyenko, O., Nikitin, S., Terziyan, V.: Smart Semantic
Middleware for the Internet of Things. In Proceedings of the Fifth International Conference
on Informatics in Control, Automation and Robotics, Robotics and Automation, Madeira,
Portugal. (2008)
16. Song, Z., Cárdenas, A.A., Masuoka, R.: Semantic Middleware for The Internet of Things.
Internet of Things, 1–8. (2010)
17. San Jose, J.I., de Dios, J.J., Zangroniz, R., Pastor, J.M.: WebServices Integration on An
RFID-based Tracking System for Urban Transportation Monitoring. IT Convergence
Practice, Vol. 1, No. 4, 1–23. (2013)
18. Miorandi, D., Sicari, S., Pellegrini, F.D., Chlamtachm, I.: Internet of Things: Vision,
Applications and Research Challenges. Ad Hoc Networks, Vol. 10, No. 7, 1497–1516.
(2012)
19. Domingo, M.C.: An Overview of the Internet of Things for People with Disabilities. Journal
of Network and Computer Applications, Vol. 35, No. 2, 584–596. (2012)
20. Wiskott, L., Fellous, J.M., Krüger, N., von der Malsburg, C.: Face Recognition by Elastic
Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 19, No. 7, 775–779. (1997)
21. Chellappa, R., Wilson, C.L., Sirohey, S.: Human and Machine Recognition of Faces: A
Survey. Proceedings of the IEEE, Vol. 83, No. 5, 705–741. (1995)
22. Zhao, W., Chellappa, R., Phillips, P.J., Rosenfeld, A.: Face Recognition: a Literature Survey.
ACM Computing Surveys, Vol. 35, 399–458. (2003)
23. Turk, M., Pentland, A.: Eigenfaces for Recognition. Journal of Cognitive Neuroscience, Vol.
3, No. 1, 71–86. (1991)
24. Belhumeur, P., Hespanha, J., Kriegman, D.: Eigenfaces vs. Fisherfaces: Recognition using
Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine
Intelligence, Vol. 19, No. 7, 711–720. (1997)
25. Parker, J.A. , Kenyon, R.V., Troxel, D.E.: Comparison of Interpolating Methods for Image
Resampling. IEEE Transactions on Medical Imaging, Vol. 2, No. 1, 31–39. (1983)
26. Keys, R.G.: Cubic Convolution Interpolation for Digital Image Processing. IEEE
Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-29, No. 6, 1153–1160.
(1981)
27. Kim, D.H., Lee, J.Y., Yoon, H.S., Cha, E.Y.: A Non-Cooperative User Authentication
System in Robot Environments. IEEE Transactions on Consumer Electronics, Vol. 53, No.2,
804–810. (2007)
28. Duchon, C.E.: Lanczos Filtering in One and Two Dimensions. Journal of Applied
Meteorology, Vol. 18, No. 8, 1016–1022. (1979)
29. Duda, R.O., Hart, P.E., Stork, D.G.: Pattern Classification, John Wiley & Sons, USA. (2004)
Hae-Min Moon received the B.S. degree in Control, Instrumentation, and Robot
Engineering in 2009 from Chosun University, Gwangju, Korea. He received the M.S.
degree in Information and Communication Engineering in 2010 from Chosun
University, Gwangju, Korea. He is currently working toward the Ph.D. degree. His
research interests include image interpolation, video surveillance, and video
compression.
Sung Bum Pan is the corresponding author of this paper. He received the B.S., M.S.,
and Ph.D. degrees in Electronics Engineering from Sogang University, Korea, in 1991,
1995, and 1999, respectively. He was a team leader at Biometric Technology Research
Team of ETRI from 1999 to 2005. He is now a professor at Chosun University. His
current research interests are biometrics, security, and VLSI architectures for real-time
image processing.